perl – codefork.com

The Myth of Artisanal Programming

Paul Chiusano, the author of the excellent Functional Programming in Scala from Manning (one of the few tech publishers I buy from; worth every penny), recently wrote a blog post titled, “The advantages of static typing, simply stated”.

Lately all I seem to do is rant to people about this exact topic. Paul’s post is way more succinct than anything I can write, so go over there and read it.

While he takes pains to give a balanced treatment of static vs dynamic type systems, it seems much more cut and dried to me. Dynamic languages are easier and faster for development when you’re getting started on a project, and that’s great if the project never gets very big. But they scale very poorly, for all the reasons he describes. Recently, I had the daunting task of reading almost 10k lines of Perl code (pretty good Perl, in my opinion). It was hard to make sense of, and hard to figure out how to modify and extend, whereas the MUCH larger Java codebase (over 100k lines, if I recall) that I worked with years ago felt very manageable.

My own history as a programmer matches Paul’s very closely. I started with Java, which was annoying but not a bad language by any means. Then Python came along and seemed like a liberation from Java’s rigidity and verbosity. But Python, Ruby and others are showing their weaknesses, and it’s no mystery why people are turning to the newer generation of statically typed languages like Scala, Haskell, Go, etc.

People who haven’t been around as long don’t necessarily have this perspective.

In retrospect, it’s interesting to me how we programmers “got sold” on dynamic languages, from a cultural perspective. You might recall that a big selling point was using simple text editors rather than IDEs, and there was this sense that writing code this way made you closer to the software somehow. Java was corporate, while Python was hand-crafted. There was a vague implicit notion of “artisanal” programming in these circles.

The upshot, of course, is that every time you read a chunk of code or call a function or method, your brain has to do a lot of the work that a statically typed language would enforce and verify for you. In a dynamic language, you won’t know what happens until the code runs. In large measure, the quality of software hinges on how much you can tell, a priori, about code before it runs at all. In a dynamic world, anything can happen, and often does.

This is a nightmare, pure and simple. Much of the strong focus on writing automated tests exists basically to make up for the lack of static typing.
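To make that concrete, here is a minimal Python sketch of the problem. The function name and the data are made up for illustration; the point is only that nothing flags the bad call until that exact line executes.

```python
# Hypothetical example: total_price and its inputs are invented for illustration.

def total_price(items):
    """Sum the 'price' field of each item.

    Nothing in the signature says what 'items' must contain, so the
    reader (and the caller) has to work that out by reading the body.
    """
    total = 0
    for item in items:
        total += item["price"]
    return total


# Fine: every price is a number.
print(total_price([{"price": 3}, {"price": 4}]))  # prints 7

# Looks just as fine on the page, but one price is a string (say, it
# came straight from a form field). The mistake only surfaces when
# this call actually runs.
try:
    total_price([{"price": 3}, {"price": "4"}])
except TypeError as err:
    print("only found at runtime:", err)
```

In a statically typed language, the second call would be rejected before the program ever ran. Here, the only way to catch it ahead of time is to write a test that happens to feed in the bad data.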

True artisanship lies in design: namely, thinking hard about the data structures and code organization you’re committing to. It’s not about being able to take liberties that can result in things that make no sense to the machine and that can cause errors at runtime that could have been caught beforehand.

Installing DBI on Leopard’s perl 5.8.8

I needed to get my perl installation updated to do some development locally. As usual, perl was a pain in the ass. Long story short: Install Xcode 3.0, copy the “reentr.inc” file from the 5.8.8 source distribution, and DBI should install.

Below is the long-winded log of my woes, offered in the hopes it might help someone.

First, DBI 1.604 wouldn’t install via cpan, so I tried installing it by hand. But I just got the same error when running “make”:

No rule to make target `/System/Library/Perl/5.8.8/darwin-thread-multi-2level/CORE/config.h', needed by `Makefile'.

I found this blog post, “Leopard Perl 5.8.8 installation throws errors when compiling (makefile)”, mentioning the exact message, which recommended copying the CORE directory from 5.8.6 into the above location instead. I tried that, but then I got this error instead:

DBI.xs: In function ‘dbi_profile’:
DBI.xs:2398: warning: implicit declaration of function ‘GvSVn’
DBI.xs:2398: error: invalid lvalue in assignment
DBI.xs: In function ‘dbi_profile’:
DBI.xs:2398: warning: implicit declaration of function ‘GvSVn’
DBI.xs:2398: error: invalid lvalue in assignment
DBI.xs: In function ‘XS_DBI_dispatch’:
DBI.xs:2970: warning: assignment makes pointer from integer without a cast
DBI.xs:2972: error: invalid lvalue in assignment
DBI.xs:2985: error: invalid type argument of ‘->’
DBI.xs:2989: error: invalid lvalue in assignment
DBI.xs:3293: warning: unused variable ‘Perl___notused’
DBI.xs: In function ‘XS_DBI_dispatch’:
DBI.xs:2970: warning: assignment makes pointer from integer without a cast
DBI.xs:2972: error: invalid lvalue in assignment
DBI.xs:2985: error: invalid type argument of ‘->’
DBI.xs:2989: error: invalid lvalue in assignment

Googling these error messages turned up surprisingly little. On another blog, in a post titled “Mac OS 10.5: Leopard” that mentioned difficulties with DBI, commenters suggested various solutions, but none of them worked for me. The blog author got an older version of DBI to install, but I couldn’t get that to work either.

I discovered that Xcode 3.0 (developer tools for Leopard) contains the 5.8.8 files that belong in that CORE directory. This seemed like a better option than copying the probably outdated 5.8.6 files. You can get the gigantic Xcode disk image, a whopping 1.1 gigabytes, from the Apple Developer Connection site. Registration is required through the “Member” link, and once you’re in, go to Downloads and search for Xcode.

Before installing Xcode, I cleared out the hosed CORE directory I’d been mucking with. “DeveloperTools.pkg” is what contains the perl headers, so you can probably get away with just installing that (double-click it), instead of the entire XcodeTools.pkg. It did the trick: the compiler was now finding the “GvSVn” symbol it couldn’t before. But now I got this message during make:

In file included from DBIXS.h:19,
                 from Perl.xs:6:
/System/Library/Perl/5.8.8/darwin-thread-multi-2level/CORE/perl.h:3993:22: error: reentr.inc: No such file or directory
In file included from DBIXS.h:19,
                 from Perl.xs:6:
/System/Library/Perl/5.8.8/darwin-thread-multi-2level/CORE/perl.h:3993:22: error: reentr.inc: No such file or directory
lipo: can't open input file: /var/tmp//ccQ8vbDU.out (No such file or directory)
make: *** [Perl.o] Error 1

In desperation, I downloaded the perl 5.8.8 source distribution tarball, and simply copied reentr.inc into the CORE directory. Voila! Make went to completion and I could install the module. From there, I went back into cpan to install DBD::mysql without any problems (you need mysql installed in the default location, /usr/local/mysql, of course).

The oxymoron of perl best practices

I’ve been browsing Damian Conway’s excellent book, Perl Best Practices. His description of “best practices” in Chapter 1 is quite good, bestowing meaning upon an otherwise empty buzzword for marketing departments:

Rules, conventions, standards, and practices help programmers communicate and coordinate with one another. They provide a uniform and predictable framework for thinking about problems, and a common language for expressing solutions. This is especially critical in Perl, where the language itself is deliberately designed to offer many ways to accomplish the same task, and consequently supports many incompatible dialects in which to express any solution.

Simply put, best practices are the recognizably tried-and-true. The key word is recognizable: others should be able to identify the problem being solved as well as the specific type of solution. In a way, it’s redundant to speak of best practices when talking about programming: unless no one will ever see your code, the foremost consideration of the programmer should be, Can other people understand what I’ve done easily and quickly, and also make changes and improvements easily and quickly?

That question is really at the heart of coding as a craft of logical elegance, I feel. Giving it due care and consideration is what distinguishes artisans from those who just “get it done” (they always seem to run into perpetual headaches down the line).

I find the philosophy of perl to be highly antithetical to the sort of shared understanding and communication described above. The motto of “there’s more than one way to do it” is self-serving and individualistic, to the detriment of maintainability. Perl seems to foster stubbornness and isolation: I’ll do things MY preferred way, and you do it YOURS. The codebase quickly devolves into baffling inconsistency. Of course, skilled perl programmers pride themselves on being able to know all the different ways of doing something, but at that point, the strength of the language isn’t the issue anymore; the ego of the coder is.

Anyway, back to Conway’s book. What’s interesting, but perhaps not surprising, is the number of “don’t do X” recommendations it makes. Leaving off parentheses on subroutine calls? Convenient, but ugly and confusing when a bunch of these are combined, so don’t do it. Tricks with dereferencing syntax? Save everyone headaches and stick to arrows. Pseudohashes? Stay away from them. Indirect object syntax? Potentially conflicts with subroutine names, so don’t use it.

Take away all the crazy tricks that perl devotees love, and what’s left? The only truly great thing about perl is its regular expression integration and text processing. It’s amazing stuff. But beyond that, perl doesn’t particularly excel at much.

Which is not to say large complex projects in perl are impossible. The University of Washington’s Catalyst tools use a custom web framework written in perl, which they’ve open sourced. I only cite that particular example because I’ve used some of the tools and I know they’re nicely done. So sure, it’s possible, but it takes a lot of deliberate restraint and discipline, more so than with other languages and tools that are oriented towards helping you get your job done.

Seems to me that if you must willfully refrain from using what a language offers, and constantly fight the temptation to write mangled code that costs time, money, and frustration for others to decipher, perl isn’t a particularly strong choice. In the case of perl, best practices aren’t a way to get the most out of the language; they’re a stopgap measure to stave off its natural tendency towards chaos.

A Quick Observation

For some potential upcoming work, I’ve been catching up on the changes made to Java over the last few years, and exploring the popular frameworks and libraries now in use.

Folks on reddit.com harshly criticize the bloat, unnecessary complexity, and huge runtime requirements for Java. They have their points. But I have to say, having worked on perl and PHP lately, where good code organization is the exception and not the norm, looking at Java again is a very welcome change.

The APIs for stuff like Servlets, Faces, EJBs, and Hibernate may be difficult to learn and remember, but at the very least, I find I always know where to look for something, and it’s usually where I expect to find it. In my book, over-abstraction is the lesser evil compared to not enough.

The Virtues of Simplicity

In this age of bloated software, it’s hard to find solutions that do what you need without a ton of unnecessary complexity. Software that’s more complex than the problem at hand is costly to learn, maintain, and troubleshoot. For companies and individuals with limited resources, that’s a real challenge. Glenn (my current client) and I wanted a solution to replace some slow-performing CGIs, but mod_perl was too much. Plus the idea of embedding a perl interpreter in Apache seemed scary.

I looked for perl web application server options that could be proxied through Apache. That architecture would give us the performance benefits of code preloaded into memory, persistent database connections, and precompiled templates. The only thing I found was OpenInteract, but its module list is quite large, and the project hasn’t been updated in a while. I didn’t want to have to dig around a ton of foreign code if I needed to squash a bug. So I decided to write my own.

The result is a neat little thing I call “perlserver,” a heavily modified version of a piece of public domain code for a preforking HTTP server. I deliberately kept it simple but gave it all the needed features: URL-to-handler mapping, safeguards against memory leaks, maximum configurability. It’s only 450 lines of code and is tightly integrated with the LWP modules. To convert the existing set of perl CGIs, the code was simply wrapped in packages that conform to a simple API understood by perlserver. Existing URLs were proxied to perlserver by clever RewriteRules in Apache.
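For readers who haven’t seen the pattern, here is a rough sketch of what “URL-to-handler mapping” with a simple handler API can look like. It is written in Python purely for illustration; perlserver itself is perl, and every name below (the handler signature, the route table, the URLs) is an assumption, not perlserver’s actual interface. It leaves out the preforking and HTTP parts entirely.

```python
# Hypothetical sketch: none of these names come from perlserver.
from typing import Callable, Dict, Tuple

# The assumed "simple API": a handler takes the request path and query
# string and returns (status code, response body).
Handler = Callable[[str, str], Tuple[int, str]]


def show_report(path: str, query: str) -> Tuple[int, str]:
    # Stands in for the body of an old CGI script, wrapped as a handler.
    return 200, f"report for {query or 'everyone'}"


def not_found(path: str, query: str) -> Tuple[int, str]:
    return 404, f"no handler for {path}"


# URL-to-handler mapping, built once at startup so every request reuses
# code that is already loaded in memory.
ROUTES: Dict[str, Handler] = {
    "/cgi-bin/report.cgi": show_report,
}


def dispatch(url: str) -> Tuple[int, str]:
    """Split off the query string and call whichever handler owns the path."""
    path, _, query = url.partition("?")
    handler = ROUTES.get(path, not_found)
    return handler(path, query)


print(dispatch("/cgi-bin/report.cgi?user=alice"))
print(dispatch("/cgi-bin/missing.cgi"))
```

Keeping the old URLs as the keys of the table is what lets the front-end web server proxy existing links unchanged.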

Before, a CGI that built a complex page typically took 900 – 1500ms. Now, the same page served from perlserver by proxy takes around 300 – 400ms.

I’m thinking about releasing the code as open source. It’s not fancy, but that’s why it’s great: it’s a good option for sites that want better perl performance without the tedious complexity of other existing solutions.