Category Archives: software

Making Emacs an IDE

It’s that time when bloggers wax introspective about the past year. For me, the major personal revelation in 2011 was rediscovering something very old and putting it to new use. It was the year of the Emacs IDE.

I’ve been using Emacs, on and off, for close to a decade now. What’s changed is that, in the past few months, I’ve been writing extensions for it. It started with a simple desire to better navigate files in a complex directory hierarchy that followed specific and somewhat convoluted conventions. At first, learning Emacs Lisp was simply a means to an end, but I ended up liking it so much that I started exploring Common Lisp (and more recently, Clojure, since I’ve worked with Java in the past).

What started as a small task has become a larger project of turning Emacs into an IDE.

To understand this, one needs to know some context about the system I work with. We developers edit tons of XML files and files in other text formats, which all drive our proprietary web application product. We have many command line tools that manipulate these files in various ways; the system was originally designed by folks who followed the UNIX philosophy of building orthogonal tools and chaining them together.

There are pros and cons to this system; for reasons I won’t get into, I don’t love it, but it’s what we work with right now. When I started the job, the vast majority of the developers used screen, vi, and the shell prompt. Typical workflows that involved working with only a few files could be extremely hard to keep track of, and usually required a lot of copying and pasting between screen and/or ssh sessions. Few people seemed to mind, but I found the workflow to contain too much extraneous cognitive load, and the state of the tools made development very prone to error.

Gradually, I’ve been integrating our tools into Emacs. Sometimes that simply means binding a key combination to running a diagnostics program and storing the output in a buffer. Sometimes it means accumulating that output for history’s sake. Sometimes it means parsing program output, processing it in Emacs Lisp, and inserting a result into the current buffer. Sometimes it means running external programs, even GUI applications, and tweaking them a bit to tell Emacs to open specific files you want to look at.

The productivity gains have been amazing. This is no reason to brag: managing several screen sessions with different vi and shell instances wasn’t exactly hard to improve upon. But Emacs made it fairly painless. Emacs Lisp has proved to be wonderful “glue” for integrating existing tools in our environment.

Writing tools that enable you to do other job tasks better is a really interesting experience; I’ve never done it to such an extensive degree. So far, one other person in my group is using this Emacs IDE, and she has been happy with how much it facilitates her work. Others who swing by my desk for something often watch me work for a moment, and ask, “How did you do that?! That’s not vi, is it?”

Getting more people to switch over means convincing them that the steep learning curve of Emacs is worth the gains of the integration I’ve done. I’m not sure how much that will happen, since a big part of it is cultural. But if there aren’t any more converts, I don’t really care. The best thing about this ongoing project is that I am the end user. The software I wrote is really for myself. It is something I use intensively every single day. And that makes it all the more gratifying.

Learning Lisp

For work projects, I often have to jump around a lot of XML files referenced in one another by name. The files follow a strict naming convention, but it’s one that looks like gibberish to human eyes. Typing them is difficult and prone to error; plus there are so many files in a given project that navigating a visual tree of the directory isn’t much easier.

So I wrote a small library for Emacs to find and open a file in the directory tree, if the cursor is over a string of text that matches a valid filename in the convention. I’m embarrassed to admit this took me two full days (a few other functions were in the library too). The only previous exposure to Emacs Lisp I had was tweaking my .emacs file, so I had to learn the language as well as hunt down the right functions to call for what I wanted. The work is paying off: it feels ten times easier to navigate project files now.
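The lookup itself is easy to sketch outside Emacs. The Python below only illustrates the idea and is not the Emacs Lisp library described above: the naming pattern, the function names, and the assumption that file names are unique within the tree are all made up for the example, since the real convention isn’t shown here.

```python
import os
import re

# Hypothetical naming convention: two lowercase letters, four digits, ".xml"
# (e.g. "ab1234.xml"). The real convention is not given in the post.
NAME_PATTERN = re.compile(r"[a-z]{2}\d{4}\.xml")


def name_at(text, pos):
    """Return the convention-matching file name that spans index pos, or None."""
    for match in NAME_PATTERN.finditer(text):
        if match.start() <= pos < match.end():
            return match.group(0)
    return None


def find_in_tree(root, filename):
    """Walk root and return the first path whose base name equals filename."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    return None


def file_for_cursor(root, text, pos):
    """Combine both steps: the name under the cursor, then its path in the tree."""
    name = name_at(text, pos)
    return find_in_tree(root, name) if name else None
```

In an editor, `text` would be the current line and `pos` the cursor column; the returned path is what gets opened.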

I experimented with trying to do this in Komodo Edit, a popular editor among coworkers, but it involved learning Mozilla’s XUL and Komodo’s own API, as well as writing JavaScript (yuck), so I abandoned that effort pretty quickly.

I didn’t think I would like Lisp, but the experience has been pretty fascinating, and I’m now making my way through the Practical Common Lisp book by Peter Seibel. It’s interesting learning a language that’s half a century old (!), and that’s influenced so many contemporary languages. One might think that there’s nothing there worth learning or revisiting, but that is so wrong. In particular, I’m trying to wrap my head around the power of Lisp macros and the way they allow you to create new syntactic abstractions. The idea of extending the very language itself, rather than just adding new functions, is mind-boggling, to say the least. And, from what I understand, it remains fairly unique to Lisp in spite of the flood of new languages in the past few decades.

It’s disheartening not being able to find much info about who actually uses Lisp anymore, aside from hackers building modules for Emacs. Paul Graham has a cool essay, “Beating the Averages”, about using Lisp to build online store software that Yahoo eventually acquired. And ITA Software, which makes an airfare search engine that powers much of the travel industry, uses Lisp. But aside from these bits of info, there isn’t much out there in the way of Lisp “success stories.”

Tradesmen-Programmers

It takes quite a bit of effort to decipher my work history. I’ve worked on enough diverse software projects that I’m not a “junior” programmer; but because of the time I spent in academia doing non-tech related studies, I don’t have a substantial enough career history to justify a “senior” status. Recruiters and prospective employers usually lack the patience to make sense of it all. Though my core software development skills are solid, I’m regarded as a misfit.

Three months ago, I landed a new full-time job. The company is full of highly intelligent oddballs with convoluted, non-linear professional histories—so I feel like I fit right in. I’m liking it a lot so far, and I feel lucky to have work in such an awful economy.

It’s precisely because of my weird background that I often forget exactly how much I know, and what I’m capable of. It occurred to me recently that I think of programmers as tradesmen. Most recent college grads with CS degrees are poorly prepared for real world software development. Genuine knowledge comes from hands-on experiences as “apprentices” to seasoned programmers. If you are lucky to work with good people, what you learn is not simply a particular programming language or technology, but a paradigm of core principles that will stay with you for life.

I’ve been lucky to work under some great “masters” of the trade. More than a decade after my first “real” programming job, for example, I still use the fundamentals I learned from coworkers on that project: proper object-oriented design and modeling, what abstract data types really are and how they’re useful, the extreme importance of extensibility and maintainability in an “agile” world. These things have become such a part of me that I take them for granted.

When I meet or talk to other programmers, I listen carefully to how they talk about things, what concerns they raise, how they approach problems. These are what distinguish the tradesmen-programmers from those whose aim is just to “get it to work,” and who inevitably run into all sorts of problems because of that mindset. Tradesmen-programmers have to balance the practical and theoretical aspects of software design in order to create something of quality, in all measures of that word. And this is something that can only be learned through dedication, experience, and open-mindedness, not a degree, certificate, or book.

On Programmer Insecurity: Is it Personality or the Market?

Here’s a wonderful blog post by Ben Collins-Sussman, “Programmer Insecurity”, to which Jesse Noller has responded with “Programmer Insecurity and Mea Culpa”. (I don’t know either of these folks; I just follow their blogs in my RSS reader.) Ben talks about the need for more transparency, communication, and iterative growth in a programmer’s development:

Be transparent. Share your work constantly. Solicit feedback. Appreciate critiques. Let other people point out your mistakes. You are not your code. Do not be afraid of day-to-day failures — learn from them. (As they say at Google, “don’t run from failure — fail often, fail quickly, and learn.”) Cherish your history, both the successes and mistakes. All of these behaviors are the way to get better at programming. If you don’t follow them, you’re cheating your own personal development.

At the moment, I’m lucky to have fairly down-to-earth colleagues who generally foster these principles, but overall, this sort of perspective is sadly all too rare.

I don’t think it’s purely a matter of personality peculiar to programmers, or as Ben suggests, just “human nature” to fear embarrassment. I mean, sure, to an extent… but the fear is also fostered by a competitive labor market that values personal marketing over personal growth.

That’s why there are so many “best practices” blogs, vanity websites boasting of track records, and heated religious arguments about almost anything pertaining to code. The market has created a culture of showing off. And if you can demonstrate you are more “perfect” than the next guy or gal, you’ll impress the interviewer and land the job or the gig. One might argue, rightfully, that these are not great places to work. But places like Google where there is a generous philosophy of employee growth are probably the exception rather than the rule.

I can remember a time when things were different.

Making It Happen

A while ago, I had an idea for a cool website. It involved grouping blog posts under “debate” questions. There was a simple mechanism for auto-detection if you linked a blog post to a debate page. The point was to be able to group postings with more semantic richness than simple tags or categories.

It fell by the wayside, so I took it down after a few months. Recently I’ve been seeing sites pop up based on similar ideas. AllVoices is one, and Debategraph.org is another. It’s nice to see the idea of information richness continuing to develop in interesting ways.

And it makes me wish I had stuck with my site idea, though to be honest, it wasn’t realistically feasible. The hardest part wasn’t coding the functionality, which only took 2-3 weeks of the large pool of free time I had back then. (Side goals were to get a working knowledge of CherryPy and SQLAlchemy, so at least those were accomplished!) No, the real difficulty was “selling” it to users: publicizing the site, making it visually attractive and user-friendly, and getting people to use it in their own blogs. I didn’t have the skills or resources to make those things happen.

There’s an adage that says success on the web largely depends upon execution, not the concept. That’s so true. I feel like I’ve known so many smart, talented technology people who excel at what they do but haven’t been able to pull off their interesting side projects. I think it’s because we often underestimate the non-technical challenges in getting a website off the ground. In many ways, those are more important to do well than solving the technological problems.

The oxymoron of perl best practices

I’ve been browsing Damian Conway’s excellent book, Perl Best Practices. His description of “best practices” in Chapter 1 is quite good, bestowing meaning upon an otherwise empty buzzword for marketing departments:

Rules, conventions, standards, and practices help programmers communicate and coordinate with one another. They provide a uniform and predictable framework for thinking about problems, and a common language for expressing solutions. This is especially critical in Perl, where the language itself is deliberately designed to offer many ways to accomplish the same task, and consequently supports many incompatible dialects in which to express any solution.

Simply put, best practices are the recognizably tried-and-true. The key word is recognizable: others should be able to identify the problem being solved as well as the specific type of solution. In a way, it’s redundant to speak of best practices when talking about programming: unless no one will ever see your code, the foremost consideration of the programmer should be, Can other people understand what I’ve done easily and quickly, and also make changes and improvements easily and quickly?

That question is really at the heart of coding as a craft of logical elegance, I feel. Giving it due care and consideration is what distinguishes artisans from those who just “get it done” (they always seem to run into perpetual headaches down the line).

I find the philosophy of perl to be highly antithetical to the sort of shared understanding and communication described above. The motto of “there’s more than one way to do it” is self-serving and individualistic, to the detriment of maintainability. Perl seems to foster stubbornness and isolation: I’ll do things MY preferred way, and you do it YOURS. The codebase quickly devolves into baffling inconsistency. Of course, skilled perl programmers pride themselves on being able to know all the different ways of doing something, but at that point, the strength of the language isn’t the issue anymore; the ego of the coder is.

Anyway, back to Conway’s book. What’s interesting, but perhaps not surprising, is the number of “don’t do X” recommendations it makes. Leaving off parentheses on subroutine calls? Convenient, but ugly and confusing when a bunch of these are combined, so don’t do it. Tricks with dereferencing syntax? Save everyone headaches and stick to arrows. Pseudohashes? Stay away from them. Indirect object syntax? Potentially conflicts with subroutine names, so don’t use it.

Take away all the crazy tricks that perl devotees love, and what’s left? The only truly great thing about perl is its regular expression integration and text processing. It’s amazing stuff. But otherwise, perl doesn’t particularly excel at much else.

Which is not to say large complex projects in perl are impossible. The University of Washington’s Catalyst tools use a custom web framework written in perl, which they’ve open sourced. I only cite that particular example because I’ve used some of the tools and I know they’re nicely done. So sure, it’s possible, but it takes a lot of deliberate restraint and discipline, more so than with other languages and tools that are oriented towards helping you get your job done.

Seems to me that if you must willfully refrain from using what a language offers, and constantly fight the temptation to write mangled code that costs time, money, and frustration for others to decipher, perl isn’t a particularly strong choice. In the case of perl, best practices aren’t a way to get the most out of the language; they’re a stopgap measure to stave off its natural tendency towards chaos.

Lessons Learned

After about 5 months, I’ve decided that it’s time to move on to another gig. I’ve learned a few things, and I’m posting them here in the hopes that the lessons might be helpful to other programmers and techies-at-large.

Working in a small business as the sole do-it-all technology person has its unique challenges. It can be very fulfilling to be the sole expert and “enabler,” if that turns you on. But the flip side is that management might not really understand or care that much about their technology. Is there a reasonable budget for what they’re trying to accomplish? Do they understand, at a high level, your projects and how they contribute to the mission? Are technology projects considered a burdensome mystery or something valuable and embraced by the company? Question the reasons why there’s only one tech guy/girl and whether that seems right.

Another thing to assess is whether you can deal with taking over the existing codebase. I’ve taken over other code before with success, retaining what was good and doing clean up as necessary. At this past gig, things looked reasonably tidy at a first glance, but as time progressed, I realized a ton of abstractions weren’t in place, and those that did exist didn’t make sense. Some refactoring might have been interesting to do, but this endeavor wasn’t valued when I proposed it as a project.

Lastly, I think it’s important to be wary of promises about the future. Even with the best of intentions, things change quickly at small businesses. The projects I was initially excited about got perpetually deferred for various reasons, and I found myself preoccupied with doing maintenance code fixes, making cosmetic tweaks, performing server administration, and providing support for third-party software (which I really don’t like to do). The company needed these things done, so I did them with as much cheer as I could muster, hoping we’d eventually get to a place where some solid new development could occur (and I could sneak in some refactoring); that’s what floats my boat. But it became clear to me that wasn’t going to happen anytime soon.

So that’s that. It’s a shame it didn’t work out, especially since I actually liked everyone I worked with. At least it’s an amicable departure, and I hope to be involved in hiring a replacement who might be a better fit for their current needs than I am.

The new gig? Java. Been catching up on it, since it’s been a few years. Oh, it feels so nice to have package namespaces, real data types, full-featured APIs, and real object-orientedness again. Like coming home.

“It Works”

This blog post, “The Worst Thing You Can Say About Software Is That It Works,” written by one Kenny Tilton, is pretty hilarious. This is the most beautiful thing I’ve read in a while:

if a pile of code does not work it is not software, we’ll talk about its merit when it works, OK? Therefore to say software works is to say nothing. Therefore anything substantive one can say about software is better than to say it works.

Reading this triggered flashbacks and PTSD. I’d mentioned to a manager recently that I wanted some time to do some badly needed refactoring. My explanation of why was met with a pause, then, “Let me get this straight. You want time to take something that already works, reorganize it, possibly break things, and we wouldn’t have anything new to even show for it?”

That last part was wrong–the value added comes from maintainability and extensibility, but I couldn’t get him to really grasp those ideas. He’s not a technology person. For all he knew, maybe this was an elaborate ruse on my part to be left undisturbed while I surfed porn at my desk for a few weeks.

I work in a very small shop with all non-technology people, so this sort of thing happens a lot. It’s frustrating. It’s sort of nice to know I’m not alone in encountering this mindset. But man… if even the fellow programmer in Kenny’s story doesn’t get it, I’m not sure there’s much hope for the rest of the world.

EAcceleratorCacheFunction = Cache_Lite_Function + EAccelerator

It’s pretty much all in the title. In a nutshell, EAcceleratorCacheFunction is a “memoizing” cache class for PHP that uses shared memory for storage. It is mostly compatible with Cache_Lite and Cache_Lite_Function.

Just like Cache_Lite_Function, it supports per-cache-object lifetime values, instead of specifying the lifetime of an item at the time you store it. This lets you dynamically change the lifetime of the cache. For example, if system load goes up and you don’t mind serving slightly older content instead of regenerating it:

$load = sys_getloadavg();
// $load[1] is the 5-minute average; using it ignores momentary spikes
if ($load[1] >= 6) {
    $lifetime = 900; // 15 min
} elseif ($load[1] >= 3) {
    $lifetime = 600; // 10 min
} else {
    $lifetime = 300; // 5 min
}
$cache = new EAcceleratorCacheFunction(array('lifeTime' => $lifetime));
$cache->call('make_page');

I wrote EAcceleratorCacheFunction as a drop-in replacement for Cache_Lite_Function. On a virtual private server, doing cache reads/writes from memory instead of disk has made a noticeable difference in performance; it helps tremendously that the database has to contend with less disk I/O.

Two Styles of Caching (PHP’s Cache_Lite vs memcached)

Since the recent slashdotting of our website (we held up okay, but there’s always room for improvement), I’ve been investigating the possibility of moving from Cache_Lite (actually, Cache_Lite_Function) to memcached in our PHP code.

Much discussion comparing these solutions focuses on raw performance in benchmarks. In the real world, though, not all things outside the benchmark are equal. On a VPS, disk I/O times are notorious for being highly variable. This makes memcached all the more attractive. Yes, memory is faster than disk in almost every environment, but also, avoiding disk access conserves a precious resource so fewer processes must block for it.

A public mailing list post by one Brian Moon points this out exactly:

If you rolled your own caching system on the local filesystem, benchmarks would show that it is faster. However, what you do not see in benchmarks is what happens to your FS under load. Your kernel has to use a lot of resources to do all that file IO. [...]

So, enter memcached. It scales much better than a file based cache. Sure, its slower. I have even seen some tests where its slower than the database. But, tests are not the real world. In the real world, memcached does a great job.

Okay, great. memcached is better when you take into account overall resources. But there’s a very useful Cache_Lite_Function feature that memcached doesn’t seem to have.

When you initialize a Cache_Lite_Function object, you set a “lifeTime” parameter, then use the call() method to wrap your regular function calls. If the output of the function hasn’t been cached within that time period, the call gets made and its results are stored in the cache with a new timestamp.

The cool thing about it is that you can create different cache objects pointing to the same directory store without a problem. Pages can increase and decrease the lifetime of the cache dynamically as load changes, so you can serve slightly older data from cache if necessary, keeping the site responsive while saving database queries. On a site where content changes relatively infrequently, this is a great feature to have: serve it fresh when load is low, serve from cache when load is high.

memcached, on the other hand, requires that you specify an expiration time at the time you place data in the cache. A retrieval call doesn’t let you specify a time period, so you can’t do the above. If data has expired, it’s expired.

It’d be interesting to hack Cache_Lite_Function to use memcached as its store, so you could get the best of both worlds. It would involve storing things in memcached with no expiration, tacking on a timestamp in the data, and doing the checking manually. But it might work.
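The idea can be sketched roughly as follows. This is a language-neutral illustration in Python rather than a patch to Cache_Lite_Function: the class name, the `clock` parameter, and the plain dict standing in for a memcached client are all assumptions made for the example. A real memcached client would also need string keys and serialized values.

```python
import time


class ReadTimeLifetimeCache:
    """Memoizing wrapper whose lifetime is checked on read, not fixed on write."""

    def __init__(self, store, lifetime, clock=time.time):
        self.store = store        # shared backend; a dict stands in for memcached here
        self.lifetime = lifetime  # seconds; may be changed between calls
        self.clock = clock        # injectable so the sketch can be tested without sleeping

    def call(self, func, *args):
        key = (func.__name__, args)
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None:
            stored_at, value = entry
            if now - stored_at < self.lifetime:
                return value  # still fresh under this object's current lifetime
        value = func(*args)
        # Stored with no expiration; the timestamp travels with the data.
        self.store[key] = (now, value)
        return value
```

Two objects sharing one store can then apply different lifetimes to the same entry, which is the behavior memcached’s write-time expiration rules out. One caveat: since nothing is stored with an expiration, stale entries would only leave memcached when it evicts them to make room.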