Category Archives: php

Taking A Fresh Look at PHP

I’ve recently started working on a PHP/Laravel project.

PHP isn’t new to me. Many years ago, I wrote a very simple online catalog and shopping cart, from scratch, for a friend who had his own business as a rare book dealer. He used it with much success for several years. I’d also done a bit of hacking on some Drupal plugins.

Coming back to PHP now, I’m finding myself in a world MUCH different than the one I’d left.

First off, let’s admit that PHP comes with a lot of baggage. For a long time, “real” programmers shunned PHP because it was born as a language cobbled together to do simple web development but not much more. Its ease of use, combined with the fact that it was easy to deploy on commodity web hosting, meant you could find PHP talent for relatively cheap to build your applications. The stereotype was that PHP developers relied on a lot of patchy copy-and-paste solutions to build shoddy and insecure websites.

A LOT has happened since then. Here’s what I’ve encountered so far, diving back into PHP:

Object-orientation: PHP has had objects for a long time, but more recent features like namespaces, traits, and class autoloading have made newer PHP projects very strongly object-oriented. You can even find books on design patterns for PHP.

To me, this is the single most important positive change to the PHP world. The culture has changed from an ad hoc procedural mindset to more sophisticated thinking about coding for large-scale architectures.

Frameworks: Several major MVC frameworks exist, many of them drawing inspiration from Rails.

Performance: As of 5.5, PHP has a built-in opcode cache, making it much more performant. An alternative to core PHP is the HHVM project, backed by Facebook, which is a high-performance PHP implementation. HHVM has had a “rising tide” effect: the forthcoming PHP7 is supposed to be as fast as HHVM. So whatever you use, you can expect good performance at scale.

Tooling: There is sophisticated tooling like composer and a vibrant ecosystem of packages. While you can still deploy PHP applications the old way, using Apache and mod_php, there is a mature FastCGI Process Manager (PHP-FPM) engine that isolates PHP processes from the web server. PHP-FPM allows Apache/nginx/whatever web server to handle static content while a pool of processes handles PHP requests. This results in much more efficient memory usage and better performance.

Success: Many respectable, high-profile products have been built using PHP: WordPress, Drupal, and Facebook, just to name a few.

But all this is just to state a bunch of known facts. To me, the biggest suprise has been in the EXPERIENCE of beginning to write code again in PHP and using Laravel: what does that FEEL like?

In a word, it feels like Java, minus the strong typing. This is an entirely good thing in my opinion, despite criticisms that PHP technologies have become too complex and overdesigned.

The biggest paradigm difference between PHP and other popular web application back-ends is that nothing remains loaded in memory between requests. It’s true that opcode caching means PHP doesn’t have to re-compile PHP source code files to opcodes every time, which speeds things up greatly, but the opcodes still need to be executed for each request, from bootstrapping the application to finishing the HTTP response. In practice, this doesn’t actually matter, but it’s such a significant under-the-hood difference from, say, Django and Rails, that I find myself thinking about it from time to time.

It’s reassuring that when I scour the interwebs researching something PHP-related, I’m finding a lot of informed technical discussions and smart people who have come to PHP from other languages and backgrounds. It bodes well for the strengths and the future of the technology.

comment_notify for Drupal 7

I’ve been doing a little work with Drupal for a personal project. Since I can code and have the time to work with it, I chose to go with the latest and greatest version, instead of the older stable release. Poking around in the innards of Drupal has been an interesting experience, to say the least.

I wish Drupal’s repositories were hosted on github. Sometimes module maintainers can’t keep up with all the patches submitted (I don’t blame them, it’s a ton of work), so fixes are very slow to make it into the source base. github would allow people to fork and send their patches back upstream, instead of having to hunt around on the Drupal bug reporting system.

I’ve been eager to get the comment_notify module working better for Drupal 7. I found fixes for some known issues, and also created, tested, and submitted patches of my own. I decided to make available my own fork of the module on github, for those who want to grab a patched version for their own use:

https://github.com/codeforkjeff/comment_notify

It’s meant to be a temporary solution until the fixes make it into the official module, but it’s better than nothing!

In and Out

I’ve taken a hiatus from programming work the last few months in order to do some other things, both paying and non-paying. I did some teaching, spent a lot of time at the bicycle co-op, published some articles in a local periodical, and started a few side projects. It has felt very gratifying and healthy.

My erratic life nowadays makes me think back to the starry-eyed aspirations to be a professional programmer that I had, say, ten years ago. I have a much more casual relationship to computers these days. Gone is the anxiety of having to prove, both to myself and to the world (including employers), that I am skilled and capable. I don’t care much anymore for the company of obsessive coders who enjoy religious debates about languages, tools, and best practices. They don’t seem to realize that in their pursuit of mastery and professionalism, technology is actually controlling them, and not the other way around.

Technology has become truly “technical” for me, in the fundamental sense of being a means to an end. I’ve been writing and tweaking Javascript and PHP recently. I don’t like them much, but they are decent tools for the specific tasks I am trying to accomplish. And so it goes. Getting back into code has been a Zen-like experience: when I am doing it, I am fully engaged in all the particularities. But when I’m finished with it, that’s it. Programming, and software more generally, is instrumental. I dive into it; then I come out. And there is nothing more to it than that.

It is silly to fetishize a hammer when you should be focusing on building the house, but that’s exactly what a lot of the culture of coding often feels like on blogs, message forums, mailing lists, conference agendas, etc. How silly and ridiculous. One should, of course, be thoughtful about tools, when and how to best use them, and how they can be improved. I do appreciate and value that. But at the end of the day, code is simply code. There is just so much more to life. Remembering that helps us keep sight of the things that code is for in the first place.

Lessons Learned

After about 5 months, I’ve decided that it’s time to move on to another gig. I’ve learned a few things, and I’m posting them here in the hopes that the lessons might be helpful to other programmers and techies-at-large.

Working in a small business as the sole do-it-all technology person has its unique challenges. It can be very fulfilling to be the sole expert and “enabler,” if that turns you on. But the flip side is that management might not really understand or care that much about their technology. Is there a reasonable budget for what they’re trying to accomplish? Do they understand, at a high level, your projects and how they contribute to the mission? Are technology projects considered a burdensome mystery or something valuable and embraced by the company? Question the reasons why there’s only one tech guy/girl and whether that seems right.

Another thing to assess is whether you can deal with taking over the existing codebase. I’ve taken over other code before with success, retaining what was good and doing clean up as necessary. At this past gig, things looked reasonably tidy at a first glance, but as time progressed, I realized a ton of abstractions weren’t in place, and those that did exist didn’t make sense. Some refactoring might have been interesting to do, but this endeavor wasn’t valued when I proposed it as a project.

Lastly, I think it’s important to be wary of promises about the future. Even with the best of intentions, things change quickly at small businesses. The projects I was initially excited about got perpetually deferred for various reasons, and I found myself preoccupied with doing maintenance code fixes, making cosmetic tweaks, performing server administration, and providing support for third party software (which I really don’t like to do). The company needed these things done, so I did them with as much cheer as I could muster, hoping we’d eventually get to a place where some solid new development could occur (and I could sneak in some refactoring)—that’s what floats my boat. But it became to clear to me that wasn’t going to happen anytime soon.

So that’s that. It’s a shame it didn’t work out, especially since I actually liked everyone I worked with. At least it’s an amicable departure, and I hope to be involved in hiring a replacement who might be a better fit for their current needs than I am.

The new gig? Java. Been catching up on it, since it’s been a few years. Oh, it feels so nice to have package namespaces, real data types, full-featured APIs, and real object-orientedness again. Like coming home.

A Quick Observation

For some potential upcoming work, I’ve been catching up on the changes made to Java over the last few years, and exploring the popular frameworks and libraries now in use.

Folks on reddit.com harshly criticize the bloat, unnecessary complexity, and huge runtime requirements for Java. They have their points. But I have to say, having worked on perl and PHP lately, where good code organization is the exception and not the norm, looking at Java again is a very welcome change.

The APIs for stuff like Servlets, Faces, EJBs, and Hibernate may be difficult to learn and remember, but at the very least, I find I always know where to look for something, and it’s usually where I expect to find it. In my book, over-abstraction is the lesser evil compared to not enough.

EAcceleratorCacheFunction = Cache_Lite_Function + EAccelerator

It’s pretty much all in the title. In a nutshell, EAcceleratorCacheFunction is a “memoizing” cache class for PHP that uses shared memory for storage. It is mostly compatible with Cache_Lite and Cache_Lite_Function.

Just like Cache_Lite_Function, it supports per-cache-object lifetime values, instead of specifying the lifetime of an item at the time you store it. This lets you dynamically change the lifetime of the cache. For example, if system load goes up and you don’t mind serving sightly older content instead of regenerating it:

$load = sys_getloadavg();
// use 5 min avg (ignore momentary spikes)
if($load[1] >= 6) {
    $lifetime = 900; # 15 min
} elseif($load[1] >= 3) {
    $lifetime = 600; # 10 min
} else {
    $lifetime = 300; # 5 min
}
$cache = new EAcceleratorCacheFunction(array('lifeTime' => $lifetime));
$cache->call('make_page');

I wrote EAcceleratorCacheFunction as a drop-in replacement for Cache_Lite_Function. On a virtual private server, doing cache reads/writes from memory instead of disk has made a noticeable difference in performance; it helps tremendously that the database has to contend with less disk I/O.

Two Styles of Caching (PHP’s Cache_Lite vs memcached)

Since the recent slashdotting of our website (we held up okay, but there’s always room for improvement), I’ve been investigating the possibility of moving from Cache_Lite (actually, Cache_Lite_Function) to memcached in our PHP code.

Much discussion comparing these solutions focuses on raw performance in benchmarks. In the real world, though, not all things outside the benchmark are equal. On a VPS, disk I/O times are notorious for being highly variable. This makes memcached all the more attractive. Yes, memory is faster than disk in almost every environment, but also, avoiding disk access conserves a precious resource so fewer processes must block for it.

A public mailing list post by one Brian Moon points this out exactly:

If you rolled your own caching system on the local filesystem, benchmarks would show that it is faster. However, what you do not see in benchmarks is what happens to your FS under load. Your kernel has to use a lot of resources to do all that file IO. […]

So, enter memcached. It scales much better than a file based cache. Sure, its slower. I have even seen some tests where its slower than the database. But, tests are not the real world. In the real world, memcached does a great job.

Okay, great. memcached is better when you take into account overall resources. But there’s a very useful Cache_Lite_Function feature that memcached doesn’t seem to have.

When you initialize a Cache_Lite_Function object, you set a “lifeTime” parameter, then use the call() method to wrap your regular function calls. If the output of the function hasn’t been cached within that time period, the call gets made and its results replaced in the cache with a new timestamp.

The cool thing about it is that you can create different cache objects pointing to the same directory store without a problem. Pages can increase and decrease the lifetime of the cache dynamically as load changes, so you can serve slightly older data from cache if necessary, keeping the site responsive while saving database queries. On a site where content changes relatively infrequently, this is a great feature to have: serve it fresh when load is low, serve from cache when load is high.

memcached, on the other hand, requires that you specify an expiration time at the time you place data in the cache. A retrieval call doesn’t let you specify a time period, so you can’t do the above. If data has expired, it’s expired.

It’d be interesting to hack Cache_Lite_Function to use memcached as its store, so you could get the best of both worlds. It would involve storing things in memcached with no expiration, tacking on a timestamp in the data, and doing the checking manually. But it might work.

Maintainability Pitfalls in PHP

Tim Bray makes this prediction about PHP for 2008:

PHP will remain popular but its growth will slow, as people get nervous about its maintainability and security stories.

I share Tim’s love/hate relationship with PHP. It’s definitely a powerful and easy language. But,

… speaking as an actual computer programmer, I really dislike PHP. I find it ugly and un-modular and there’s something about it that encourages people to write horrible code. We’re talking serious maintainability pain.

I’m seeing this right now in some code I’ve recently taken over. The previous programmer was quite skilled and did a great job, but it’s clear there are some areas he had to write quickly and hack together. The flip side of PHP’s ease of use is that sloppiness accumulates very quickly when you’re doing things in a hurry. To some extent, that’s an unavoidable aspect of a growing codebase. But there’s also specific things about PHP itself that foster disorganization and unmaintainability:

* The lack of namespaces. This makes it hard to quickly locate a function or class definition. Classes can be used as namespaces, but that’s a hack, and leads to ugly un-OOPish uses of classes. PHP could really benefit from packages or modules.

* While PHP5 has vastly improved its object functionality, it often feels like the developer culture remains mired in a function-oriented paradigm. PHP’s relative ease of use and wide availability on commodity webhosting has produced a huge pool of developers whose skills are pretty wide-ranging. The low end of that tends towards hacky, function-oriented code that simply “gets the job done.” I’d like to see more thoughtful discussion on PHP sites and forums about object design and philosophy, about when to use functions and classes, and about how to mix them up harmoniously.

* Having a library of thousands of built-in functions in a global namespace with little rhyme or reason to their naming doesn’t exactly provide a great model of maintainability.

* extract() should die. Die, die, die.

* There’s not much agreement about OOP performance: some insist that heavy usage of some OOP features slows PHP down a lot, so you should avoid them whenever possible. Which not only is plain dumb but leads to deliberately confusing and half-assed uses of OOP in the name of better performance.

Maintainability is a matter of discipline, since you can write sloppy code in any language. That aside, PHP does make it extra hard to keep things orderly. I think CakePHP is a step in the right direction, though if you’re going to use a strict MVC architecture, you might as well dump PHP and just go with Ruby on Rails or Python.