In praise of feedparser

I discovered an issue this morning in the excellent feedparser module.

feedparser (aka Universal Feed Parser) has a reputation in the python community for being an incredible piece of code. With good reason: it understands a mind-boggling array of feed formats and versions, and it’s been put through the paces with a suite of 3262 unit tests. Mark Pilgrim’s terrific work has saved me (and many others, no doubt) months of toil and sweat.

The issue has to do with encountering multiple “title” tags defined in different XML namespaces. A dc:title or media:title tag, if it is encountered anywhere after a regular RSS title tag, will overwrite it. Several other people have actually already documented this in the bug tracking system.

It’s unclear how much work is actively being done with feedparser these days, so I reluctantly dived into the module to try to see what was going on. A little over an hour later, with almost no pain, I found myself with a patch that passed the test suite. I’d love to say it was due to my incredible coding prowess, but really, the code is just amazingly clean and easy to understand. That’s how fixing bugs should be.

The patch is here if anyone wants it.

Leave a Reply

Your email address will not be published. Required fields are marked *