string-match and dealing with state

(or, How I Spent a Few Hours Wrestling with Emacs)

I encountered some strange behavior and errors when calling replace-regexp-in-string in some Emacs code today.

My code was passing a function as the REP argument to replace-regexp-in-string. Here’s a similar, simplified example:
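A minimal sketch of that kind of call, assuming a mask-ssn that keeps only the last four digits. The regexp and the sample text are made up for illustration:

```elisp
;; Illustrative sketch: the regexp and the sample text are assumptions.
(defun mask-ssn (ssn)
  "Return SSN with everything but the last four digits masked."
  (concat "XXX-XX-" (substring ssn -4)))

;; REP is a function here; it receives each matched substring.
;; FIXEDCASE and LITERAL are both t so the result is inserted verbatim.
(replace-regexp-in-string "[0-9]\\{3\\}-[0-9]\\{2\\}-[0-9]\\{4\\}"
                          #'mask-ssn
                          "SSNs: 123-45-6789 and 987-65-4321"
                          t t)
;; => "SSNs: XXX-XX-6789 and XXX-XX-4321"
```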

Fine, that’s what we expected. Now let’s change how mask-ssn works. It should still do the same thing, but does it?
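One way such a rewrite might look, assuming it pulls out the last group with split-string rather than substring. Called directly, it returns the same thing as the simpler version. What it does as the REP argument depends on how replace-regexp-in-string handles match data in your Emacs version, so no result is shown for that call:

```elisp
;; Illustrative sketch: the regexp and the sample text are assumptions.
(defun mask-ssn (ssn)
  "Return SSN masked, keeping only the last dash-separated group."
  ;; split-string calls string-match internally, so it changes match data.
  (concat "XXX-XX-" (nth 2 (split-string ssn "-"))))

(mask-ssn "123-45-6789")
;; => "XXX-XX-6789"

;; The same kind of call, now with the split-string version as REP.
(replace-regexp-in-string "[0-9]\\{3\\}-[0-9]\\{2\\}-[0-9]\\{4\\}"
                          #'mask-ssn
                          "SSNs: 123-45-6789 and 987-65-4321"
                          t t)
```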

Whoa, that’s not right at all!

What happened? Well, there’s a while loop in replace-regexp-in-string that calls string-match, then calls the REP function, and then passes the result to replace-match, which relies on the match data that string-match left behind. Since mask-ssn also calls string-match (via split-string), it overwrites that match data before replace-match in the loop gets to use it.

Figuring this out took a while, as my actual function was way more complicated than mask-ssn. I didn’t even suspect string-match to be the problem initially, so I went through the process of taking everything out and adding code back in line by line, chunk by chunk, call by call, until I pinned it down. In addition to weird string replacements, I also sometimes got an “args-out-of-range” error, depending on the size of the strings and the various functions using string-match that I experimented with calling.

There IS a solution, happily: save-match-data, which is better documented here than in the Emacs help. In a nutshell, the macro saves the current match state, evaluates the forms inside the body, and then restores the state afterwards.
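Applied to a split-string based mask-ssn, the fix might look like the sketch below. The function body, the regexp, and the sample text are illustrative assumptions:

```elisp
;; Illustrative sketch: the regexp and the sample text are assumptions.
(defun mask-ssn (ssn)
  "Return SSN masked, keeping only the last dash-separated group."
  ;; Restore the caller's match data once split-string is done with it.
  (save-match-data
    (concat "XXX-XX-" (nth 2 (split-string ssn "-")))))

(replace-regexp-in-string "[0-9]\\{3\\}-[0-9]\\{2\\}-[0-9]\\{4\\}"
                          #'mask-ssn
                          "SSNs: 123-45-6789 and 987-65-4321"
                          t t)
;; => "SSNs: XXX-XX-6789 and XXX-XX-4321"
```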

That does the trick.

I discovered this only after going through the trouble of building a replacement to replace-regexp-in-string.
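A rough sketch of what such a replacement could look like. This is not the original code, and it handles only the simple case of a regexp, a REP function, and a string. The idea is to copy the match positions into local variables before calling REP and never consult the match data again in that iteration:

```elisp
;; Hypothetical reconstruction, not the original my-replace.
(defun my-replace (regexp rep string)
  "Return STRING with each match of REGEXP replaced by (funcall REP match)."
  (let ((len (length string))
        (start 0)
        (pieces '()))
    (while (and (< start len)
                (string-match regexp string start))
      ;; Capture the positions now; REP is free to clobber match data.
      (let* ((mb (match-beginning 0))
             (me (match-end 0))
             ;; On an empty match, step one character so the loop advances.
             (next (if (= mb me) (min len (1+ mb)) me)))
        (push (substring string start mb) pieces)           ; unmatched prefix
        (push (funcall rep (substring string mb me)) pieces) ; replacement
        (push (substring string me next) pieces)            ; skipped char, if any
        (setq start next)))
    (push (substring string start) pieces)                  ; unmatched tail
    (apply #'concat (nreverse pieces))))

;; A REP function that changes match data without restoring it.
(defun mask-ssn (ssn)
  "Return SSN masked, keeping only the last dash-separated group."
  (concat "XXX-XX-" (nth 2 (split-string ssn "-"))))

(my-replace "[0-9]\\{3\\}-[0-9]\\{2\\}-[0-9]\\{4\\}"
            #'mask-ssn
            "SSNs: 123-45-6789 and 987-65-4321")
;; => "SSNs: XXX-XX-6789 and XXX-XX-4321"
```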

I felt pretty happy with myself after writing that. Since it doesn’t call any match functions after calling REP, all is well.

But then I thought about it some more. my-replace ITSELF is changing match data, so any callers will suffer from the same problem. D’OH. So we still need to wrap our code in save-match-data, if we’re going to be responsible.

I think save-match-data illustrates an interesting “pattern” of sorts. It deals with a problem of state, which is unavoidable, as we’re concerned with needing to track searches of text buffers. Since we HAVE to store this state, we can’t do it in a clean, purely functional way.

save-match-data handles this by saving the state of the string match in progress, probably pushing it on a stack somewhere (I didn’t dive too deep), letting the wrapped code manipulate the world, and then popping the original state off the stack to restore it afterwards. From the outside, it seems as if no state has changed, between the point at which you enter save-match-data and the point at which you exit. This allows for nested code, any number of call levels deep, to do what it needs, without interfering with other levels, as long as the relevant chunks of code are wrapped in the macro.
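To make that concrete without leaning on the real definition, here is a hand-rolled stand-in built only from match-data, set-match-data, and unwind-protect. The name my-save-match-data is made up, and this is a sketch of the idea rather than the macro Emacs ships. Notice that it needs no explicit stack: the saved state lives in a let binding, so nesting falls out of the ordinary call stack.

```elisp
;; Hypothetical stand-in for save-match-data, for illustration only.
(defmacro my-save-match-data (&rest body)
  "Evaluate BODY, then restore the match data that was current on entry."
  (declare (indent 0) (debug t))
  (let ((saved (make-symbol "saved")))
    `(let ((,saved (match-data)))
       ;; unwind-protect restores the state even if BODY signals an error.
       (unwind-protect
           (progn ,@body)
         (set-match-data ,saved)))))

(let ((text "id: 123-45-6789"))
  (string-match "[0-9]+" text)        ; matches "123" at positions 4..7
  (my-save-match-data
    (string-match "-" "a-b"))         ; changes match data inside the body
  (list (match-beginning 0) (match-end 0)))
;; => (4 7)
```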

Pretty cool.

On Reading Blogs and News Sites About Coding

My newsreader currently contains 60 feeds from blogs and news sites that cover programming and technology. There are some weeks where I refresh these feeds several times a day, combing through them carefully. And there are periods when I ignore them completely because I’m occupied with other things in life.

I’ve discovered a pattern. When I scour the feeds frequently, the information I learn doesn’t tend to help me all that much with my work or non-work projects. Once in a while I will learn about a useful library, or some programming language feature I didn’t know about before. But mostly, what happens is that I end up feeling anxious, like I am not productive enough. Just as Facebook connects us all while making us lonelier, discussions about coding on the Interwebs make me a bit smarter while also paralyzing me to the point of not being able to use that knowledge.

For all that I complain about technology (and I complain a lot), I do love coding. Becoming a more seasoned (I do not say “good”) programmer, for me, means learning the different ways of thinking embedded in various languages and tools: that is, understanding what makes something powerful and expressive, and what, in that same thing, makes it limiting. You create worlds when you write code, and these are fun and interesting to inhabit, to understand, to make grow.

Reading blogs and news sites usually doesn’t make me love coding more, but less. Egomaniacal jerks abound. There is so much fluff in the latest trends. Even halfway decent web content is usually driven by shameless self-promotion. GitHub, as revolutionary as it is, sometimes feels like the Facebook of coding culture, a way to show off rather than genuinely collaborate. And don’t even get me started on ridiculous sites like Hacker News, where Silicon Valley is the promised land and every founder of a silly startup is Jesus. (Although if we could crucify them…)

All of this fuels a culture where people focus on one-upping one another, where the pace of change is so fast you are always playing catch up, where hyper-productivity overshadows questions about what that productivity is for…

These days I skim my tech feeds no more than once a week. My time is just better spent in other ways, like actually solving problems in the code I’m working on, instead of reading about the trendiest or most marketable ways to do it. There’s good knowledge out there, but it’s few and far between, and everything else damages rather than nurtures The Love.

Upgrading to Ubuntu 12.04

I finally took the plunge and upgraded to Ubuntu 12.04 from 11.10. It was surprisingly painless, taking about an hour and a half on a Core i5 laptop.

In many years of using various flavors of Linux (Slackware, Red Hat, Debian, now Ubuntu), this is the first time I’ve used an automated upgrade program! In the past, I always made a backup of my data, did a fresh install, and restored the data. I never trusted upgrades to get everything exactly right. The hassle of a new installation always seemed like less trouble to me than having to figure out any strange system quirks resulting from a slightly imperfect upgrade process.

The only hair-raising moment was when the upgrade program seemed to stall during the “Installing the Upgrades” stage, with the progress bar showing “Configuring debconf.” If you’re not noticing any CPU or disk activity for a while, click on the Terminal toggle to show what dpkg is doing. In my case, it was prompting for input there instead of popping up a graphical window, because gtk was temporarily hosed at that point during the upgrade.

But it seems to have done the job, and everything’s working well. Nice job, Canonical. I’m still undecided about Ubuntu past 12.04 because of the serious data privacy issues I’ve mentioned before. It’s a shame, really–despite the disappointing direction that Ubuntu is taking, it’s done so much right to improve desktop Linux.

Ubuntu Woes with the Samsung Galaxy S3

Last week I purchased a Samsung Galaxy S3. It’s a beautiful phone, and I’ve been happy with it so far. Getting it to work with Ubuntu, so I can transfer or sync my music files with it, has been a huge headache, however.

The short of it is this: version 1.1.0 of libmtp, which is what’s packaged with Ubuntu 11.10, doesn’t seem to work with the Galaxy S3. (MTP is the protocol the phone uses to transfer files to and from a PC. Unlike the previous Galaxy devices, the S3 doesn’t mount as a normal USB drive.) The phone just doesn’t get recognized. After reading this bug report, I downloaded the source for the latest version, 1.1.5, and compiled it by hand. (Note: you’ll need to install a -dev package for libusb via apt-get.) That was partially successful: the gmtp program could connect to the phone and show files and directories, but Banshee (2.2.1) now crashed on startup. I was hoping to use Banshee, since it’s a nice iTunes-like music management application that I’d already been using regularly. I could try the latest Banshee (2.6) by compiling that by hand too, but that feels like a bigger ordeal than I’d like to deal with right now.

The easiest solution, of course, is to upgrade to a newer version of Ubuntu with newer versions of all the above software. But 12.04 ships with libmtp 1.1.3, and 12.10 ships with 1.1.4, and I have no idea whether these are recent enough to work.

I’ve been putting off an upgrade because I’m not even sure I want to stick with Ubuntu at all, given the recent issues with data privacy in 12.10.

So it looks like I’m out of luck, in terms of using Banshee to sync music on my current OS installation. I’ve resorted to installing an FTP server on the S3, and copying music that way. It’s awkward and annoying, but it will have to do for now. Perhaps I will write a quick script to better facilitate music sync’ing over FTP…

(NOTE: This blog post was reconstructed after my super-light traffic WordPress database got mysteriously corrupted this afternoon. Thank you, MySQL. This has not been the greatest of technology days.)

Looking at Go

In the latest stage of my exploration/deepening of programming knowledge, I’ve been looking at Go.

There’s got to be something that piques my intellectual curiosity or solves a specific problem for me to want to learn a new language. Not much about the latest “hot” languages like Ruby, Scala, and Erlang appeals to me, so I haven’t bothered with them. In real world work, I like Python as a general purpose language, and I like Java (seriously!) for large projects that need the strong tooling and frameworks available for it. Lisp and Clojure have provided useful perspective and food for thought, but in practice, they haven’t found a place in the real world software I write. Everything else I tolerate only because I have to (I’m looking at you, JavaScript).

Go is extremely intriguing. It strikes me as combining some of the best things about Python and Java. It would be great not to have to choose! I like the simple syntax (not as simple as Python, alas!), the static typing, the fact that it’s compiled, and the general philosophy of favoring composition over inheritance, an idea I’ve come to support more and more. In a world currently dominated by highly dynamic, interpreted languages with very loose type systems and a hierarchical object-oriented paradigm, Go really stands out! Following the trend of languages like Clojure, Go has concurrency features that take strong advantage of multicore computing, except that its concurrency mechanisms seem much simpler. I’ve started to look at code samples and play with it a bit, and I really like what I see so far.

There are actually a lot of negative discussions of Go on the web, but most of them are about the language in its messy pre-1.0 state. The March 1.0 release has supposedly tightened up a lot of things, and of course, performance will only get better, now that the fundamental semantics and features are solidly in place. This is an exciting time for what feels like the next evolutionary step in programming languages.