Archive for the 'software' Category

Maintainability Pitfalls in PHP

Tuesday, January 8th, 2008

Tim Bray makes this prediction about PHP for 2008:

PHP will remain popular but its growth will slow, as people get nervous about its maintainability and security stories.

I share Tim’s love/hate relationship with PHP. It’s definitely a powerful and easy language. But,

… speaking as an actual computer programmer, I really dislike PHP. I find it ugly and un-modular and there’s something about it that encourages people to write horrible code. We’re talking serious maintainability pain.

I’m seeing this right now in some code I’ve recently taken over. The previous programmer was quite skilled and did a great job, but it’s clear there are some areas he had to write quickly and hack together. The flip side of PHP’s ease of use is that sloppiness accumulates very quickly when you’re doing things in a hurry. To some extent, that’s an unavoidable aspect of a growing codebase. But there are also specific things about PHP itself that foster disorganization and unmaintainability:

* The lack of namespaces. This makes it hard to quickly locate a function or class definition. Classes can be used as namespaces, but that’s a hack, and leads to ugly un-OOPish uses of classes. PHP could really benefit from packages or modules.

* While PHP5 has vastly improved its object functionality, it often feels like the developer culture remains mired in a function-oriented paradigm. PHP’s relative ease of use and wide availability on commodity webhosting has produced a huge pool of developers whose skills are pretty wide-ranging. The low end of that tends towards hacky, function-oriented code that simply “gets the job done.” I’d like to see more thoughtful discussion on PHP sites and forums about object design and philosophy, about when to use functions and classes, and about how to mix them up harmoniously.

* Having a library of thousands of built-in functions in a global namespace with little rhyme or reason to their naming doesn’t exactly provide a great model of maintainability.

* extract() should die. Die, die, die.

* There’s not much agreement about OOP performance: some insist that heavy use of certain OOP features slows PHP down a lot, so you should avoid them whenever possible. That’s not only plain dumb, it also leads to deliberately confusing and half-assed uses of OOP in the name of better performance.

Maintainability is a matter of discipline, since you can write sloppy code in any language. That aside, PHP does make it extra hard to keep things orderly. I think CakePHP is a step in the right direction, though if you’re going to use a strict MVC architecture, you might as well dump PHP and just go with Ruby on Rails or Python.

Amateur thoughts and ambitions

Monday, December 31st, 2007

One of the better things I’ve stumbled across this past year is Larry Lessig’s talk, How creativity is being strangled by the law.

The piece makes his usual argument that copyright law stifles innovation in the age of new media. Most striking to me, though, was the part where he uses the phrase “amateur culture.” He explains, “…I don’t mean amateurish culture, I mean culture where people produce for the love of what they’re doing and not for the money.” He uses the term to describe the activity of “kids” (?) creating their own remixes from existing media.

I can remember another amateur culture that’s now largely disappeared. Back in my teens, modem-based bulletin board systems (BBSes) fostered a rich “read-write” culture for amateur programmers. Most of us did not work in technology; after all, the commercial Internet hadn’t been born yet, so the computing industry was much smaller and more obscure. A career as a programmer seemed like a mysterious and rarefied thing to me back then. The coders you met on BBSes were often people who simply liked to do programming in their spare time.

These systems allowed us to circulate public domain source code for fun games and useful applications written in BASIC, Pascal, C, even assembler. We hacked on existing code to get it to do what we wanted, trying to figure out ways to push the limits of our little 8086 processors and 640K of RAM. We mingled regardless of our level of knowledge, beginners and experts alike. We had friendly user meetings in diners in Brooklyn and Manhattan (I lived in NY at the time), where we chatted about home-grown upgrades and discussed how to link up to the nation-wide discussion networks that existed then.

It was amateur culture at its best: lots of exchange, circulation, and cooperation happened all the time. But it was definitely not amateurish. Many were extremely capable and knowledgeable coders.

Today, there are still people who code just because they enjoy it, but the amateur culture and its community hardly exist anymore. Beginners on web forums are more interested in what they need to know in order to land a job, rather than in coding itself. Even open source projects tend to be dominated by career professionals; read any public mailing list and you’ll see how unhelpful they often are to amateurs who want to get involved. One reason I like python is that the project makes a genuine effort to connect to the sensibilities of amateurs. But even its forums are littered with snarky individuals.

All of this is largely due, I think, to the ideology of professionalism, which convinces us that having a stable career is the pinnacle of achievement. It damagingly equates amateurs with dilettantes. That’s why one of the first things we ask in this country when meeting a stranger is, “So what do you do?” By which we really mean, “Tell me what you do for a living so I can know who you are and whether you’re worth talking to.”

In 2008, I resolve to be more wary of this ideology and its negative effects. I want to embrace being an amateur in the various things that I do. I want to think less about careers and focus more on how to best spend my time doing what’s important to me. And I want to find more amateurs to hang out with as well.

Software is an Art

Saturday, December 22nd, 2007

Today a blogger named Damon Poole wrote a short post titled, “Designing Software is the same as Predicting the Future.” It resonates with my post from a while back on whether “software engineering” is the right metaphor for writing code.

The essential problem of coding is to deal with the unknown as best you can. Software is made to solve a problem, but the more unique the problem, the more difficult it is to draw upon existing knowledge to create good solutions. Unknowns force you to make guesses. Educated guesses, hopefully, but guesses nonetheless.

This is why I’m in the camp of those who believe that creating software is an art. It’s an endeavor that wrestles with the unknown. This artistry is highest when you find yourself asking, “How do I do X?” and there don’t seem to be any pre-packaged answers you can look up in a textbook or simply google.

Paradoxically, once the software is written and refined, the unknowns are removed from the picture. Art largely disappears once the pure functionalism of operational software emerges. I think this is why many good programmers have short attention spans, get bored, and tend to jump from project to project. They crave the excitement and gratification of facing the unknown. But this is always ultimately ephemeral.

HTTP + XML do not a RESTful interface make

Monday, December 17th, 2007

I’ve been stumbling across criticisms of the un-RESTful design of Amazon’s new SimpleDB service. Worth reading in particular is a piece by someone named Subbu Allamaraju, who seems both smart and accomplished. He did a quick rewrite of the API in his post, A RESTful version of Amazon’s SimpleDB. It’s a great example of how clean URLs can be when a bit of thought is put into them.

And people should also read it to clear up a popular misunderstanding about REST. I’ve already given it away in my title: HTTP + XML do not a RESTful interface make.

As Roy Fielding’s dissertation chapter lays it out, a REST architecture should follow the abstraction of “resources” from the “connectors” that perform operations on them. The HTTP protocol happens to be able to do this nicely: URLs refer to resources and the GET/POST/PUT/DELETE methods manipulate them. However, this isn’t inherent or automatic: you have to use its vocabulary properly. SimpleDB is a perfect example: it violates the principle that resource identification and operations should be separate. The API embeds operations in URLs.

So yes, it uses HTTP and XML. But no, those things alone don’t make it truly RESTful.
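
To make the distinction concrete, here’s a small Python sketch of the two URL styles. The host, paths, and parameter names are all made up for illustration; they aren’t copied from Amazon’s documentation or from Subbu’s rewrite. The only point is where the operation lives: in the query string, or in the HTTP method.

```python
from urllib.parse import urlencode

BASE = "https://db.example.com"  # hypothetical host, not a real endpoint


def rpc_style(action, domain, item):
    """Operation is embedded in the URL; every call goes out as a GET."""
    # Parameter names here are illustrative assumptions.
    query = urlencode({"Action": action, "DomainName": domain, "ItemName": item})
    return ("GET", f"{BASE}/?{query}")


def rest_style(method, domain, item):
    """URL names only the resource; the HTTP method carries the operation."""
    return (method, f"{BASE}/domains/{domain}/items/{item}")


print(rpc_style("DeleteAttributes", "books", "item42"))
print(rest_style("DELETE", "books", "item42"))
```

In the first style, a crawler or cache that follows the GET link deletes data. In the second, the same URL can be safely fetched with GET and only a deliberate DELETE changes anything.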

REST is certainly a huge step forward in enforcing cleaner abstractions, though in Amazon’s defense, it’s obvious why they chose to design their API the way they did. The URLs for SimpleDB have the same structure as those in ECS; consistency was probably the goal. So yes, it’s a conservative move that forward-thinking coders are turning their noses up at, but it’s also one for which existing developers will probably be grateful. Rant all you want about the evils of non-idempotent GET requests… for Amazon’s customers, the old API style feels familiar, and means one less new thing to get used to, or to learn.

Figuring out the mystery of T-Zones(tm)

Thursday, December 13th, 2007

I’m convinced that cellular phone companies keep dark basement cubicles full of evil marketing gnomes whose sole job is dreaming up ways to confuse and frustrate customers.

I thought it’d be cool to use my phone for browsing those newfangled web pages designed for mobile, and to be able to check my email. Having no idea how to go about this, I dug into the incredibly twisted world of marketing lingo, misdirection, and bad technological practices of T-Mobile.

Part 1: T-Zones(tm) Means Fun!

First, I went to the T-Mobile website and looked at their Services & Accessories page. Something called T-Zones(tm) is featured in a large box. What is it?

“The place to buy services—on your phone or on the Web. Check out all the fun and useful things available at t-zones.”

Okay. It has an icon that says “Web & Apps,” which sounds promising. Except… in a smaller box separate from T-Zones(tm), there’s also a list of “Other Services” that includes “Internet & E-mail Services.” What’s the difference? Well, obviously, T-Zones(tm) is “fun and useful” (trademarks will do that sometimes) whereas the other service isn’t.

Fine, I’ll bite. What does T-Zones(tm) Web & Apps give me?

E-mail on the go
Keep in touch while you’re out of the house. E-mail service lets you use your phone or other T-Mobile device to access personal e-mail from a variety of common providers so you’re never out of the loop.
[...]
Choose when and where you Web
Just $5.99 per month gives you unlimited access to mobile Web destinations and great capabilities—like reading, editing, and sending e-mail directly from your phone.

Sounds good, aside from the minor fact that web is not a verb. But how is it different from Internet & E-Mail Services? Well, the latter seems to be only for BlackBerry, Sidekick, and Smartphones. I consider my phone to be pretty snazzy, but I don’t know if it’s a Smartphone. How can I tell?

Luckily, there’s an FAQ on the side that tries to help out with this. Sort of.

Do I need a special phone to get Internet access?
Most phones can display text-only Internet by using T-MobileWeb service. To find a phone with a full Web browser, go to the Phones page and check the T-Mobile Internet check box.

Holy crap, that’s confusing. Checking off the T-Mobile Internet box doesn’t really show me Internet-capable phones, it shows phones with a FULL Web browser. “Most” phones are already capable of text-only Internet. Thanks for making that crystal clear!

Part 2: Wrestling with the phone

It would appear, then, that T-Zones(tm) is the right choice for my Nokia 6086. Which, incidentally, is NOT a Smartphone, after all. I do want “unlimited access to mobile Web destinations.” And if T-Zones(tm) makes it fun and useful, so much the better.

Frankly, the T-Zones(tm) menu item has always scared me. I didn’t know what it did, and it terrified me to think that I might click something by accident and discover a $1000 charge on my next bill. But this time, I clicked it. It brought up a little screen that sort of resembled a web page. I could navigate, just like using a regular web browser on a PC, to a screen that let me turn on T-Zones(tm) service for $5.99/month.

I suddenly understood. The T-Zones(tm) item is a simple mostly-text web browser. That’s all.

Which suits me fine. I can get the basic information I want, and I don’t really need the bells and whistles of a fancier phone with better browsing capabilities. On the other hand, I want to make the most of what the device can do. Googling around, I found that there’s a better browser, Opera Mini, that’s available for this phone. Why not try it out?

I transferred it successfully, but when I ran it, it showed the message: “Application access set to not allowed.”

It reminds me of that old passive voice trick, “War has been declared.” By whom, damnit?! Just who is it exactly who won’t allow this application to run? I want some answers.

Part 3: The mystery of T-Zones(tm) solved

Under the Option menu for the Opera Mini application, there’s an item called “App. access.” It’s what you’d use to grant permission to use the network. That’s a pretty good idea, since you don’t want applications secretly doing undesirable things on your phone. They’re written in Java, using a mobile API that has a strong security model built-in.

But on my phone, “App. access” is greyed out, disabled.

More Googling revealed that so-called unbranded Nokia 6086 phones let you adjust this setting, and Opera Mini works on them. But T-Mobile’s phones have that option deliberately disabled. Other users have reported this same problem with other applications that use the network as well.

So I can’t give Opera Mini permission to use the network, because the phone’s software has been crippled (a sadly un-P.C. word choice that’s stuck). By who? T-Mobile, that’s who.

What does this tell us about T-Zones(tm)? From what I can tell, and I might be wrong, both T-Zones(tm) and the more expensive Total Internet plan give you unlimited web access. The phones make the difference: lower-end consumers are forced to use the stock browser on crippled phones, while the more expensive service and application options are offered to users with high-end phones. This, in spite of the fact that your humble little phone might very well be capable of running applications that access the web. Definitely not very fun or useful.

It’d be like selling 2 models of Ferrari with the exact same engine, but one is capped at 50mph in the systems software. It’s capable of going faster, but it’s limited for no reason other than to encourage you to buy the faster, more expensive one. Which also happens to have a sunroof.

It’s fair to pay for a service, like network usage. It’s fair to pay for a device. But it’s bad business and bad technology to artificially disable goods simply to differentiate a product line. I don’t know how other T-Mobile customers feel, or if most of them even know about this aspect of their business, but to me, it’s downright insulting.

Small Victories

Sunday, November 4th, 2007

It always feels very rewarding when my coding-related problems have happy outcomes. A few small but cool things that have happened recently:

A while ago, I discovered an issue with the way the cherrypy web server resolves “localhost” to an IPv6 address on some operating systems. The cherrypy folks actually listened to my suggestion and made a helpful change.

It looks like one person has already benefited from the tiny patch I created for a feedparser issue last week.

When I couldn’t get SQLAlchemy’s “pool_recycle” option to properly close and re-open inactive connections, the estimable Michael Bayer took a minute out of his busy life to explain what I had missed, for which I was extremely grateful. (One of these days, I need to write a post about how truly amazing SQLAlchemy is.)
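
For anyone curious about the “localhost” item above, here’s a rough Python sketch of the general problem. It is not the change the cherrypy folks made; the helper name and the fake resolver output are assumptions for illustration. The idea is simply that a resolver may list the IPv6 loopback first, and a server that blindly takes the first result ends up listening somewhere an IPv4-only client can’t reach.

```python
import socket


def pick_ipv4_first(addrinfos):
    """Return the first IPv4 address from getaddrinfo-style results,
    falling back to the first address of any family."""
    for family, _type, _proto, _canon, sockaddr in addrinfos:
        if family == socket.AF_INET:
            return sockaddr[0]
    return addrinfos[0][4][0]


# Fake resolver output with the IPv6 loopback listed first,
# as can happen on some operating systems.
fake = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("::1", 8080, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("127.0.0.1", 8080)),
]
print(pick_ipv4_first(fake))  # 127.0.0.1

# On a real machine you would pass live results instead:
# pick_ipv4_first(socket.getaddrinfo("localhost", 8080, type=socket.SOCK_STREAM))
```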

Open source projects create the opportunities for good things to happen. Most of the time. (Or maybe it’s just the python projects.)

In praise of feedparser

Sunday, October 28th, 2007

I discovered an issue this morning in the excellent feedparser module.

feedparser (aka Universal Feed Parser) has a reputation in the python community for being an incredible piece of code. With good reason: it understands a mind-boggling array of feed formats and versions, and it’s been put through its paces with a suite of 3262 unit tests. Mark Pilgrim’s terrific work has saved me (and many others, no doubt) months of toil and sweat.

The issue has to do with encountering multiple “title” tags defined in different XML namespaces. A dc:title or media:title tag, if it is encountered anywhere after a regular RSS title tag, will overwrite it. Several other people have actually already documented this in the bug tracking system.
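
Here’s a toy reproduction of that behavior using only the standard library. This isn’t feedparser’s code or the patch itself; the function names and the sample item are invented for illustration. It just shows how a “last title wins” handler goes wrong, and one plausible way to prefer the plain title.

```python
import xml.etree.ElementTree as ET

ITEM = """<item xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Plain RSS title</title>
  <dc:title>Dublin Core title</dc:title>
</item>"""


def local_name(tag):
    """Strip ElementTree's '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def naive_title(item):
    """Last tag wins: any *:title overwrites what came before."""
    title = None
    for child in item:
        if local_name(child.tag) == "title":
            title = child.text
    return title


def careful_title(item):
    """Prefer the un-namespaced title; use a namespaced one only as a fallback."""
    fallback = None
    for child in item:
        if child.tag == "title":
            return child.text
        if local_name(child.tag) == "title" and fallback is None:
            fallback = child.text
    return fallback


item = ET.fromstring(ITEM)
print(naive_title(item))    # Dublin Core title
print(careful_title(item))  # Plain RSS title
```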

It’s unclear how much work is actively being done with feedparser these days, so I reluctantly dived into the module to try to see what was going on. A little over an hour later, with almost no pain, I found myself with a patch that passed the test suite. I’d love to say it was due to my incredible coding prowess, but really, the code is just amazingly clean and easy to understand. That’s how fixing bugs should be.

The patch is here if anyone wants it.

The Right Metaphor

Thursday, October 11th, 2007

If you haven’t caught Kyle Wilson’s recent piece, “Software is Hard,” I highly recommend it. The essay moves elegantly from book review, to musings on knowing when the code is “done,” to issues of measuring quality, to the ever-present problems of lateness and going over budget, to the potential inadequacy of “engineering” as the metaphor for writing software.

It’s the last topic that’s the most fascinating to me. Kyle points out that new software is written only in response to new problems (otherwise, you’d just use existing software). As such, new code ventures into the unknown, where you can, at best, only guess at the challenges you’ll encounter. We always try our best to assess what we’ll face, but by their very nature, these are imperfect assessments. As Kyle puts it, “The only way to avoid that is to have your design go all the way down to specifying individual lines of code, in which case you aren’t designing at all, you’re just programming.”

Which is not to say, of course, we should simply give up engineering. Without some sort of plan for design and advance assessment, we’d be utterly lost. Businesses couldn’t function and programmers couldn’t make a living. For better or worse, the smooth functioning of our society is founded on the arrogance of making accurate predictions, not just about business and software, but about everything from politics and law, to human behavior and psychology, to weather. Such hubris…

No surprise, then, that even real-world traditional engineering often fails to be predictable. Kyle mentions the Oakland Bay Bridge as a project that’s hugely over time and budget. Just yesterday, Boeing announced its much-anticipated Dreamliner would be six months late.

So maybe software engineering IS the right term after all.

A Lesson in Software Development

Monday, September 24th, 2007

Slashdot reported on an opinion piece by Derek Sivers entitled, “7 reasons I switched back to PHP after 2 years on Rails.” It’s sparked heated discussion about languages, but if you read the article carefully, it’s not really anti-Ruby or pro-PHP, though Derek does imply serious shortcomings in Rails. He writes, “at every step, it seemed our needs clashed with Rails’ preferences.”

The problem with the short piece is that he’s not very specific about what these “preferences” are. The main criticisms seem to be that Rails is too complicated, too slow, and doesn’t allow direct SQL. Complexity and performance are often the cost of using a large web framework; these don’t seem like Rails problems, but issues you’d have to deal with when using any big framework. (In fact, there’s an apples-and-oranges dimension here: Ruby on Rails is a full web stack and architecture, whereas PHP is a scripting language with built-in access to tons of different libraries. But we’ll put that aside for now.) As for direct SQL, a lot of folks have pointed out that Derek is just plain mistaken about not being able to do what he wanted.

The real reasons he switched back actually have little to do with PHP per se: read the piece carefully and you’ll see he’s much more comfortable thinking in terms of libraries than frameworks. Also, he needed to integrate tightly with an existing codebase. Those two things are the real reasons why switching back to PHP worked for him. They don’t have anything to do with either the strengths or failures of Ruby on Rails itself.

Derek doesn’t quite do justice to his main point, which is actually about software development, not language features: don’t expect a much-touted language or tool to work magic for you. Of course, this should apply as much to PHP as to Rails! I’m not trying to defend Rails, which I know zilch about. There’s a bigger lesson here: the strengths and weaknesses of languages have less to do with success than the overall environment, the functional and integration requirements, and the coders’ facility with a toolset.