Buffer Menu Sorting in Emacs 24

Upgrading to Emacs 24 a few months ago was mostly seamless. I encountered only one issue that was difficult to fix, and it lingered until today, when I finally got annoyed enough to devote some time to researching and fixing it.

I like having the Buffer Menu configured to always be sorted by filename. In Emacs 23, you could simply set the Buffer-menu-sort-column variable and you were done. That variable isn’t recognized in Emacs 24, despite claims to the contrary in various web sources.

The code in buff-menu.el was rewritten in 24 to use tabulated-list for its display mechanics. There is a Buffer-menu-sort function, which is an alias for tabulated-list-sort. Once you are in the *Buffer List* buffer, you can call Buffer-menu-sort interactively with a prefix argument specifying the column: type ‘M-5 M-x Buffer-menu-sort’ to sort by filename. This works very nicely. Doing it a second time reverses the sort order.

So the challenge is to automate this: call Buffer-menu-sort automatically, exactly once, when the *Buffer List* buffer is first created. The mode will keep its entries sorted going forward.

I tried doing this by advising the list-buffers function, and having a buffer-local variable keep track of whether I had already called Buffer-menu-sort once before. This didn’t work: something seemed to be resetting my buffer-local variable, but I couldn’t figure out what.

The solution I ended up with is below. The advice function checks a variable called jc-buffer-menu which stores the current Buffer Menu buffer, and calls Buffer-menu-sort as needed. Put the code in your .emacs file, and you’re done.
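Something along these lines (a reconstruction of the approach described; the advice name is my own invention):

    (defvar jc-buffer-menu nil
      "The *Buffer List* buffer that has already been sorted, if any.")

    (defadvice list-buffers (after jc-sort-buffer-menu activate)
      ;; If a fresh *Buffer List* buffer was just created, sort it by
      ;; the File column once; tabulated-list keeps it sorted afterwards.
      (let ((menu (get-buffer "*Buffer List*")))
        (unless (eq menu jc-buffer-menu)
          (setq jc-buffer-menu menu)
          (with-current-buffer menu
            (Buffer-menu-sort 5)))))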

Whew!

string-match and dealing with state

(or, How I Spent a Few Hours Wrestling with Emacs)

I encountered some strange behavior and errors when calling replace-regexp-in-string in some Emacs code today.

My code was passing a function as the REP argument to replace-regexp-in-string. Here’s a similar, simplified example:
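(Reconstructed from the description; the original snippet didn’t survive, but it went something like this.)

    (defun mask-ssn (ssn)
      "Mask all but the last four digits of SSN."
      (concat "XXX-XX-" (substring ssn -4)))

    (replace-regexp-in-string "[0-9]\\{3\\}-[0-9]\\{2\\}-[0-9]\\{4\\}"
                              #'mask-ssn
                              "123-45-6789 and 987-65-4321")
    ;; => "XXX-XX-6789 and XXX-XX-4321"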

Fine, that’s what we expected. Now let’s change how mask-ssn works. It should still do the same thing, but does it?
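Here’s the new mask-ssn, which now pulls out the last four digits with split-string (again a reconstruction):

    (defun mask-ssn (ssn)
      "Mask all but the last four digits of SSN."
      (let ((parts (split-string ssn "-")))
        (concat "XXX-XX-" (nth 2 parts))))

    (replace-regexp-in-string "[0-9]\\{3\\}-[0-9]\\{2\\}-[0-9]\\{4\\}"
                              #'mask-ssn
                              "123-45-6789 and 987-65-4321")
    ;; => mangled output: replacements land in the wrong places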

Whoa, that’s not right at all!

What happened? Well, there’s a while loop in replace-regexp-in-string that calls string-match, then calls the REP function, and then uses the result in a call to replace-match. replace-match operates on whatever match data is current, so when mask-ssn calls string-match itself (via split-string), it clobbers that match data and derails the upcoming replace-match in the loop.
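You can watch the clobbering happen in isolation (a quick scratch-buffer demo of my own):

    (string-match "[0-9]+" "abc 123")
    (match-beginning 0)        ;; => 4, where the digits start
    (split-string "9-9" "-")   ;; calls string-match internally
    (match-beginning 0)        ;; no longer 4; it now reflects
                               ;; split-string's internal search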

Figuring this out took a while, as my actual function was way more complicated than mask-ssn. I didn’t even suspect string-match to be the problem initially, so I went through the process of taking everything out and adding code back in line by line, chunk by chunk, call by call, until I pinned it down. In addition to weird string replacements, I also sometimes got an “args-out-of-range” error, depending on the size of the strings and the various functions using string-match that I experimented with calling.

There IS a solution, happily: save-match-data, which is better documented here than in the Emacs help. In a nutshell, the macro saves the current match state, evaluates the forms inside the body, and then restores the state afterwards.
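Applied to our example, the fix is a one-line change (reconstructed like the rest):

    (defun mask-ssn (ssn)
      "Mask all but the last four digits of SSN."
      (save-match-data  ; protect the caller's match data
        (let ((parts (split-string ssn "-")))
          (concat "XXX-XX-" (nth 2 parts)))))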

That does the trick.

I discovered this only after going through the trouble of building a replacement for replace-regexp-in-string.
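The original my-replace isn’t preserved here; the idea, roughly sketched, is to do all the match-data reads up front and only then call REP:

    (defun my-replace (regexp rep string)
      "Replace matches for REGEXP in STRING using the function REP."
      ;; Note: assumes REGEXP never matches the empty string.
      (let ((start 0)
            (result ""))
        (while (string-match regexp string start)
          (let ((mb (match-beginning 0))
                (me (match-end 0)))
            ;; Everything that needs the match data happens before REP runs.
            (setq result (concat result
                                 (substring string start mb)
                                 (funcall rep (substring string mb me))))
            (setq start me)))
        (concat result (substring string start))))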

I felt pretty happy with myself after writing that. Since it doesn’t call any match functions after calling REP, all is well.

But then I thought about it some more. my-replace ITSELF changes match data, so any callers will suffer from the same problem. D’OH. So we still need to wrap our code in save-match-data, if we’re going to be responsible.
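So, one more wrapper around the same sketch:

    (defun my-replace (regexp rep string)
      "Replace matches for REGEXP in STRING using the function REP."
      (save-match-data  ; now the caller's match data survives, too
        (let ((start 0)
              (result ""))
          (while (string-match regexp string start)
            (let ((mb (match-beginning 0))
                  (me (match-end 0)))
              (setq result (concat result
                                   (substring string start mb)
                                   (funcall rep (substring string mb me))))
              (setq start me)))
          (concat result (substring string start)))))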

I think save-match-data illustrates an interesting “pattern” of sorts. It deals with a problem of state, and the state is unavoidable: Emacs has to keep track of the most recent text search somewhere. Since we HAVE to store this state, we can’t handle it in a clean, purely functional way.

save-match-data handles this by saving the state of the string match in progress, probably pushing it on a stack somewhere (I didn’t dive too deep), letting the wrapped code manipulate the world, and then popping the original state off the stack to restore it afterwards. From the outside, it seems as if no state has changed between the point at which you enter save-match-data and the point at which you exit. This allows nested code, any number of call levels deep, to do what it needs without interfering with other levels, as long as the relevant chunks of code are wrapped in the macro.
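A tiny demonstration of that outside view (my own scratch-buffer test):

    (string-match "bar" "foo bar")      ;; outer search sets match data
    (save-match-data
      (string-match "o+" "foo"))        ;; inner search clobbers freely
    (match-string 0 "foo bar")          ;; => "bar"; the outer data survived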

Pretty cool.

Composition and Inheritance

For as long as I can remember, writing any kind of non-trivial software meant you needed to use object-oriented programming. It was a no-brainer. So I learned all the fundamentals of OOP, and design patterns as well, since one couldn’t get very far in Java without knowing the most common patterns.

I think taking OOP for granted as the only natural way to manage complexity is why learning Lisp is so mind-blowing for programmers like myself. Take, for example, polymorphism. I didn’t know there was anything besides subtype polymorphism (and I didn’t know it was called that; I knew it only as polymorphism, plain and simple). The ability of Lisp to do multiple dispatch was incredibly eye-opening.
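For instance, in Common Lisp’s CLOS, a method can specialize on every argument at once. (This toy example is mine, not from any particular source.)

    (defclass ship () ())
    (defclass asteroid () ())

    (defgeneric collide (a b))

    ;; The method is chosen by the classes of BOTH arguments;
    ;; Java-style single dispatch can only consider the receiver.
    (defmethod collide ((a ship) (b asteroid))
      (format t "ship scrapes asteroid~%"))

    (defmethod collide ((a asteroid) (b ship))
      (format t "asteroid dents ship~%"))

    ;; (collide (make-instance 'asteroid) (make-instance 'ship))
    ;; prints: asteroid dents ship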

I think this is the sort of thing people mean when they go on about how Lisp has broadened their horizons and deepened their understanding of concepts.

To mention another example, in a bit more detail: as an experiment in some recent Python code for work, I’ve been using fewer classes/objects and more functions. (It’s debatable just how much actual “functional programming” mileage you can get out of Python, but I’ll put that aside for now.) In such an approach, you inevitably end up using a lot of composition, rather than object inheritance, to build higher-level abstractions. And that’s been working out very well so far.

Composition is a powerful thing because you control the granularity of code reuse. With a carefully constructed library of functions, you can call the functions at whatever level of abstraction you need, and even mix and match. That’s much harder to do with object inheritance, where classes force you into an all-or-nothing package deal: if you want only some of the functionality, you have to instantiate the whole object anyway. And if you want to selectively override functionality, you have to subclass, which effectively ties the parent class to its subtree, making it harder to modify in the future.
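In Lisp terms (my own toy illustration; the actual Python from work isn’t shown):

    (defun compose (&rest fns)
      "Return a function that applies FNS right to left."
      (lambda (x)
        (reduce (lambda (acc f) (funcall f acc))
                (reverse fns)
                :initial-value x)))

    (defun strip (s) (string-trim " " s))
    (defun shout (s) (string-upcase s))

    ;; Callers pick the pieces at whatever granularity they need;
    ;; no class hierarchy makes that decision for them.
    (funcall #'strip "  hello  ")                    ;; => "hello"
    (funcall (compose #'shout #'strip) "  hello  ")  ;; => "HELLO"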

I’ve been thinking lately that objects are useful mostly to facilitate data abstraction: there’s no reason not to group together related accessors and mutators. But when I consider more complex bundles of functionality, I think twice before creating a class hierarchy and see if I can do it using functional composition instead.

Exercise Mania

My ex-girlfriend’s mom started doing a jazzercise class in the 80s and hasn’t stopped since. It’s a weird thing.

Programming exercises can be just as addictive as some forms of well-marketed physical exercise. I have to actively resist writing solutions in Lisp to every silly problem I stumble upon online. But this morning, I came across this simple one (a blog post from 2007, but that happens on Hacker News) and, for some reason, couldn’t pass it up:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.
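For the record, a plain, unclever version in Common Lisp goes something like this:

    (defun fizzbuzz ()
      ;; Check 15 first, since multiples of 15 also match 3 and 5.
      (loop for n from 1 to 100
            do (format t "~a~%"
                       (cond ((zerop (mod n 15)) "FizzBuzz")
                             ((zerop (mod n 5))  "Buzz")
                             ((zerop (mod n 3))  "Fizz")
                             (t n)))))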

It’s so silly that I guess some folks, like the blogger linked above, have turned it into a problem of “what’s the most creative/strange way I can think of to do this?” Which is interesting and fun in its own right.

What I don’t understand is the attitude expressed in the original fizzbuzz blog post, “Using FizzBuzz to Find Developers who Grok Coding”. Does it really matter if it takes someone 5 or 20 minutes to write a solution? (FWIW, I took about 10 mins). A shorter time with such a goofy example means someone “groks” coding more than another? Ridiculous.

These types of puzzles are amusing precisely because a lot of real-world development (which is what the vast majority of people who code for a living do) is unfortunately pretty rote, and doesn’t require a lot of algorithmic thinking. Code monkeys learn to use libraries and frameworks, and spend a lot of time trying to use them correctly and cleanly, in order to implement straightforward requirements. So these exercises are a shift in mindset. In a tiny way, they put you back on the path of engineering. It’s why people find them interesting to do. It’s why I’m reading the Structure and Interpretation of Computer Programs.

Taking a few more minutes might mean you’re not accustomed to solving such problems every minute of your working life. But that’s not at all the same as gauging whether someone understands coding or not. That kind of measurement is naive at best.

An SICP exercise in implementing cons, car, and cdr

I started working my way through the Structure and Interpretation of Computer Programs this week.

Contemporary “intro to programming concepts” texts tend to focus on real-world examples in areas like business and web-based applications. In contrast, SICP involves a lot of math problems. I’m liking this about the book. Math is particularly good for illustrating the difference between declarative and imperative knowledge, between “describing properties of things and describing how to do things.” You may know, formally, what a square root is, but coming up with a procedure for calculating the square root of a number is another matter. Bridging these two kinds of knowledge is the essence of what computer programming is about.

The second chapter, “Building Abstractions with Data,” starts by discussing different ways to implement an abstraction for rational numbers. The text walks through using a Lisp cons cell to store a numerator and denominator pair, and using car and cdr as selectors. Then it poses this exercise, which I found particularly intriguing:

Exercise 2.5. Show that we can represent pairs of nonnegative integers using only numbers and arithmetic operations if we represent the pair a and b as the integer that is the product 2^a · 3^b. Give the corresponding definitions of the procedures cons, car, and cdr.

That’s a really clever way to store a pair! But perhaps I only think so because I’m not a math person.

At any rate, I wrote a solution in Common Lisp, with some helper lambdas to calculate log and pow arithmetically.
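The code went something like this; here I use named helpers rather than lambdas, with the built-in expt as pow and a recursive factor count standing in for the discrete log:

    (defun my-cons (a b)
      "Encode the pair (A . B) as the single integer 2^a * 3^b."
      (* (expt 2 a) (expt 3 b)))

    (defun count-factor (n d)
      "Count how many times D evenly divides N (a discrete log of sorts)."
      (if (zerop (mod n d))
          (1+ (count-factor (/ n d) d))
          0))

    (defun my-car (p) (count-factor p 2))  ; recover a
    (defun my-cdr (p) (count-factor p 3))  ; recover b

    ;; (my-car (my-cons 4 7))  => 4
    ;; (my-cdr (my-cons 4 7))  => 7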

I’m finding that thinking about math problems and learning Lisp are helping me think more algorithmically about solving problems in the code I write for work. Maybe I’m starting to experience what Eric S. Raymond famously wrote about Lisp: that it “will make you a better programmer for the rest of your days, even if you never actually use Lisp itself a lot.” I’m holding out hope for that last part, though…