Category Archives: work

Annoyances in Xubuntu 16.04 LTS

This week, I installed Xubuntu on a new work computer. I’d previously sworn off Ubuntu, but I admit, I’m crawling back now… the reality is that Ubuntu has smoothed out many of the rough edges that I’m simply not willing to deal with at work. Sigh.

Even as generally polished as Xubuntu is, I did encounter a few hiccups.

1) To adjust settings for the screen locking software, light-locker, I needed to make sure the light-locker-settings package was installed. Nothing happened when I selected “Light Locker Settings” from the Whisker Menu, though, because the program was crashing. Running “light-locker-settings” from a terminal showed some Python error messages.

Python was trying to import a module from python-gobject, which wasn’t installed and, for some reason, wasn’t listed as a dependency of light-locker-settings.

After that error went away, I got another one about a missing function. The fix was to manually patch two lines in a Python file, as described in this bug report. [NOTE: This has been fixed as of 7/20/2016, in version 1.5.0-0ubuntu1.1 of light-locker-settings]

2) Another light-locker quirk: the mouse pointer becomes invisible when I lock the screen by hitting Ctrl-Alt-Del and then unlock it. To make it visible again, hit Ctrl-Alt-F1 to switch to a text console and then Ctrl-Alt-F7 to return to Xfce.

3) The “Greybird” theme is notorious for making it VERY difficult to resize windows by dragging the handles that appear when you mouse-over the window edges and bottom corners. The pointer has to be EXACTLY on an edge or corner; it won’t display the resize handle if you’re slightly off.

For reasons I don’t understand, the devs seem intent on not changing this. But enough users have complained that the Xubuntu blog even has a post about alternative ways to resize windows. The disregard for user experience here is simply mind-blowing.

I’ve grudgingly started using the Alt and right-click drag combo to resize windows.

Addendum:

4) Intermittent DNS problems: hostnames on our internal domain weren’t always resolving. This seems like a common problem on Ubuntu caused by dnsmasq. The solution is to disable it by commenting out the line “dns=dnsmasq” in /etc/NetworkManager/NetworkManager.conf and rebooting.
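For reference, after the change the relevant section of /etc/NetworkManager/NetworkManager.conf looks something like this (surrounding lines vary by release; only the commented-out dns line matters):

```
[main]
plugins=ifupdown,keyfile
#dns=dnsmasq
```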

Taking A Fresh Look at PHP

I’ve recently started working on a PHP/Laravel project.

PHP isn’t new to me. Many years ago, I wrote a very simple online catalog and shopping cart, from scratch, for a friend who had his own business as a rare book dealer. He used it with much success for several years. I’d also done a bit of hacking on some Drupal plugins.

Coming back to PHP now, I’m finding myself in a world MUCH different than the one I’d left.

First off, let’s admit that PHP comes with a lot of baggage. For a long time, “real” programmers shunned PHP because it was born as a language cobbled together to do simple web development but not much more. Its ease of use, combined with the fact that it was easy to deploy on commodity web hosting, meant you could find PHP talent for relatively cheap to build your applications. The stereotype was that PHP developers relied on a lot of patchy copy-and-paste solutions to build shoddy and insecure websites.

A LOT has happened since then. Here’s what I’ve encountered so far, diving back into PHP:

Object-orientation: PHP has had objects for a long time, but more recent features like namespaces, traits, and class autoloading have made newer PHP projects very strongly object-oriented. You can even find books on design patterns for PHP.

To me, this is the single most important positive change to the PHP world. The culture has changed from an ad hoc procedural mindset to more sophisticated thinking about coding for large-scale architectures.

Frameworks: Several major MVC frameworks exist (Laravel, Symfony, and CodeIgniter, among others), many of them drawing inspiration from Rails.

Performance: As of 5.5, PHP ships with a built-in opcode cache (OPcache), making it much more performant. An alternative to core PHP is the HHVM project, backed by Facebook, which is a high-performance PHP implementation. HHVM has had a “rising tide” effect: the forthcoming PHP 7 is supposed to be roughly as fast as HHVM. So whatever you use, you can expect good performance at scale.

Tooling: There is sophisticated tooling like Composer and a vibrant ecosystem of packages. While you can still deploy PHP applications the old way, using Apache and mod_php, there is a mature FastCGI Process Manager (PHP-FPM) that isolates PHP processes from the web server. PHP-FPM lets Apache, nginx, or whatever web server handle static content while a pool of processes handles PHP requests. This results in much more efficient memory usage and better performance.
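For what it’s worth, the FPM handoff is typically just a few lines of web server configuration; here is a hedged nginx sketch (the socket path and PHP version vary by distro):

```nginx
location ~ \.php$ {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    # hand the request to the php-fpm process pool over a Unix socket
    fastcgi_pass unix:/var/run/php5-fpm.sock;
}
```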

Success: Many respectable, high-profile products have been built using PHP: WordPress, Drupal, and Facebook, just to name a few.

But all this is just to state a bunch of known facts. To me, the biggest surprise has been in the EXPERIENCE of beginning to write code again in PHP and using Laravel: what does that FEEL like?

In a word, it feels like Java, minus the strong typing. This is an entirely good thing in my opinion, despite criticisms that PHP technologies have become too complex and overdesigned.

The biggest paradigm difference between PHP and other popular web application back-ends is that nothing remains loaded in memory between requests. It’s true that opcode caching means PHP doesn’t have to re-compile PHP source code files to opcodes every time, which speeds things up greatly, but the opcodes still need to be executed for each request, from bootstrapping the application to finishing the HTTP response. In practice, this doesn’t actually matter, but it’s such a significant under-the-hood difference from, say, Django and Rails, that I find myself thinking about it from time to time.

It’s reassuring that when I scour the interwebs researching something PHP-related, I’m finding a lot of informed technical discussions and smart people who have come to PHP from other languages and backgrounds. It bodes well for the strengths and the future of the technology.

On Magic

Kids, I hate to break it to you, but there is no such thing as magic.

The cool whizzy stuff on your screen that impresses you: that’s the result of work. The button that was broken yesterday, that now works correctly today: also the product of work. The screen that was discussed in a meeting last week that suddenly appeared today on the development server: yup, work. When you look for a feature in the web application and it isn’t there, there’s this thing that can create it and put it there: it’s called work.

Someday we’ll all get over the mystifying aura of technology. Someday people will learn to recognize that programmers are not magicians, just workers, and that the work they do involves mundane, non-magical tasks, like wrestling with code libraries and frameworks to get them to do what we want, reorganizing files to make sure stuff exists in sensible places, and figuring out what to do when changing one piece affects three other pieces in unexpected ways.

And this means, someday, people will understand that, like any other kind of work, software development takes resources (namely, time!), not a magic wand. And no amount of “ambition” (read: wishful thinking) can really change that basic equation. You can pretend magic exists, but that doesn’t make it so. You aren’t fooling anyone. You just look childish.

When software development is recognized as work, there can be clarity about what is possible with a given set of resources. Then tasks can be sanely identified, specified, prioritized, coordinated, scheduled, executed, completed.

And then some really cool things can happen. Not magical things, but really cool things. Great things, even. The kind of great things that result from understanding, dedication, and hard work.

Adventures in Docker


After you’ve taken the time to puzzle through what it is exactly, Docker is nothing short of life changing.

In a nutshell, Docker lets you run programs in separate “containers”, isolating the dependencies each one requires. This is similar to virtualization solutions like VMware and VirtualBox, but Docker is a much more fine-grained, customizable tool that operates at a different level.

It took me a week of experimentation to develop a firm grasp of the Docker concepts and how all its pieces work together in practice. I’m now using it at work for development, and I hope to be setting up a configuration for staging soon.

This is a short write-up of what I’ve learned so far.

The Concepts

At first glance, almost everyone (including me) mistakes Docker for another virtualization technology. It’s not. Virtualization lets you run a machine within a machine. A Docker container is more subtle: it segments, or isolates, part of your existing Linux operating system.

A container consists of a separate memory space and filesystem (taken from something called an image). A container actually uses the same kernel as your “host” system. Linux kernel features (namespaces and cgroups) make all of this possible; there is no hardware virtualization going on.

You start using Docker by creating an image or using an existing one made by someone else. An image is a filesystem snapshot. You can build images in an automated fashion using a Dockerfile, which allows you to “script” the running of commands to do things like install software and copy files around.

When you launch a container, Docker makes a “copy” of an image (well, not really, but we’ll pretend for now) and uses that to start a process. The fact that the filesystem is a “copy” is important: if you launch 2 containers using the same image, and the processes in each modify files, those changes happen in each CONTAINER’s filesystem; the image doesn’t change. Processes inside containers only see what’s in the container, so they are isolated from one another. This allows complete dependency separation, at the level of processes.
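The layering above can be pictured with a toy sketch in Python (this is an analogy, not how Docker is implemented): a read-only “image” plus a per-container write layer, stacked like a simplified union filesystem.

```python
from collections import ChainMap

# Read-only "image" layer: a snapshot of files shared by all containers.
image = {"/etc/motd": "welcome", "/app/run.sh": "#!/bin/sh\necho hi"}

def launch_container(image):
    # Each "container" gets its own empty write layer on top of the image.
    # Lookups fall through to the image; writes land in the top layer only.
    return ChainMap({}, image)

c1 = launch_container(image)
c2 = launch_container(image)

c1["/etc/motd"] = "container 1 was here"   # copy-on-write style change

assert c1["/etc/motd"] == "container 1 was here"  # c1 sees its change
assert c2["/etc/motd"] == "welcome"               # c2 is unaffected
assert image["/etc/motd"] == "welcome"            # the image never changes
```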

You can do a lot with containers. You can run multiple processes in them (though this is discouraged). You can start another process in an already running container, so it can interact with the already running process. After a container has stopped (by halting the process running in it), you can start it back up again. Again, any file changes made are in the container’s filesystem; the image remains unchanged.

Issues

There are three issues I’ve personally encountered so far in using Docker:

1) Persistent Storage

Containers are meant to be transient. In a well designed setup, you should, theoretically, be able to spin up new containers and discard old ones all the time, and not lose any data. This means that your persistent storage has to live somewhere else, in one of two places: a special “data container” or a directory mounted from the host filesystem.

Data containers were too complicated and weird, and I couldn’t get them to work the way I expected, so I mounted directories instead. This has the nice side effect that, as Dockerized processes change files, you can see those changes immediately from the host without having to do anything special to access them. I’m not sure, however, what “best practices” are concerning storage.

2) Multi-Container Applications

Many modern applications consist of several processes or require other applications. For example, my project at work consists of a Rails web app, a delayed_job worker process, an Apache Solr instance, and a MySQL database.

Since Docker strongly recommends a one-process-per-container configuration, you need a way to coordinate a set of running processes and make sure they can communicate with one another. Docker Compose does this, allowing you to easily specify whether containers should be able to open connections to each other’s network ports, share filesystems, etc.
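To make that concrete, here is a hedged sketch of what a Compose file for a stack like ours might look like (service names and image tags are illustrative, not our actual configuration):

```yaml
web:
  build: .
  ports:
    - "3000:3000"
  links:
    - db
    - solr
  volumes:
    - .:/app
worker:
  build: .
  command: bundle exec rake jobs:work
  links:
    - db
db:
  image: mysql:5.6
  environment:
    MYSQL_ROOT_PASSWORD: secret
solr:
  image: solr
```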

Currently, Docker Compose is not yet considered “production ready.” While it addresses the need to orchestrate processes, there is also the problem of monitoring and restarting processes as needed. It’s not clear to me yet what the best tool is for doing this (it may even be a completely orthogonal concern to what Docker does).

3) Running Commands

Sometimes you need to use the Rails CLI to do things like run database migrations or a rake task. Running commands takes a bit of extra effort, since they need to happen in a container. A slight complication is whether to run the command in the existing Rails container or to start another one entirely for that new process. It’s a bit of typing, which is annoying.

The Payoffs

How is Docker life changing? There are several scenarios I encounter ALL THE TIME that Docker helps tremendously with.

* On the same machine, you can run several web applications, which each require different versions of, say, Ruby, Rails, MySQL and Apache, without dealing with software conflicts and incompatibilities.

* Related to the previous point, Docker lets you more easily experiment with different software without polluting your base system with a lot of crap you might never use again.

* There is MUCH less overhead and less wasted memory usage than with virtualization. If you allocate 1GB of RAM to a virtual machine but only use 512MB, the other half goes to waste. Docker containers only use as much memory as the processes themselves take up, plus a small bit of overhead. Since Docker uses unionfs to “copy” (really, overlay) images to container filesystems, the actual disk space used isn’t as much as you might think.

* Since Docker containers are entirely self-contained, they can be deployed in development, staging, and production environments with almost NO differences among them, thereby minimizing the problems that usually arise.

For me, a lot of the benefits boil down to this: virtualization is amazing, but in practice, I don’t use it that much because it’s too heavyweight and too coarse for my needs. A VirtualBox VM is a whole other machine that I need to think about. By working at the level of Linux processes, Docker is exactly the right kind of tool for managing application dependencies.

A cautionary note: there’s a lot of buzz right now around containers, including efforts at defining vendor-neutral standards, such as appc. Although Docker releases have been rapid and it is already seeing a lot of adoption, it feels bleeding edge to me. It’s exciting but in a few years, it’s entirely possible that another container solution might surpass it to become the de facto standard. The playing field is just too new, which means Docker comes with some risk. But it’s well worth exploring at this early stage, even if only to get a taste of the new ideas shaping systems configuration and deployment that are definitely here to stay.

A VIAF Reconciliation Service for OpenRefine


OpenRefine is a wonderful tool my coworkers have been using to clean data for my project at work. Our workflow has been nice and simple: they take a CSV dump from a database, transform the data in OpenRefine, and export it as CSV. I write scripts to detect the changes and update the database with the new data.

We have a need, in the next few months, to reconcile the names of various individuals and organizations with standard “universal” identifiers for them in the Virtual International Authority File. The tricky part is that any given name in our system might have several candidates in VIAF, so it can’t be a fully automated process. A human being needs to look at them and make a decision. OpenRefine allows you to do this reconciliation, and also provides an interface that lets you choose among candidates.

Communicating with VIAF is not built in, though. Roderic D. M. Page wrote a VIAF reconciliation service, and it’s publicly accessible at the address listed on the linked page (the PHP source code is available here). It works very nicely.
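The reconciliation API such a service implements is pleasantly simple: OpenRefine sends a batch of queries as JSON, and the service answers with scored candidates for each. Here is a minimal sketch of the response-building side in Python (the candidate data and VIAF-ish identifier are made up; a real service would search VIAF itself):

```python
import json

def reconcile(queries, lookup):
    """Build an OpenRefine reconciliation response.

    queries: dict like {"q0": {"query": "Twain, Mark"}, ...}
    lookup:  function mapping a name string to candidate dicts
             with "id", "name", and "score" keys.
    """
    response = {}
    for key, q in queries.items():
        candidates = lookup(q["query"])
        response[key] = {
            "result": [
                {
                    "id": c["id"],
                    "name": c["name"],
                    "type": [{"id": "/people/person", "name": "Person"}],
                    "score": c["score"],
                    # auto-match only a lone, perfect-scoring candidate;
                    # anything else is left for a human to decide
                    "match": c["score"] == 100 and len(candidates) == 1,
                }
                for c in candidates
            ]
        }
    return response

# A fake lookup standing in for a real VIAF search:
def fake_lookup(name):
    return [{"id": "viaf/12345", "name": "Twain, Mark", "score": 100}]

resp = reconcile({"q0": {"query": "Twain, Mark"}}, fake_lookup)
print(json.dumps(resp, indent=2))
```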

I wanted to write my own version for 2 reasons: 1) I needed it to support the different name types in VIAF, 2) I wanted to host it myself, in case I needed to make large numbers of queries, so as not to be an obnoxious burden on Page’s server.

The project is called refine_viaf and the source code is available at https://github.com/codeforkjeff/refine_viaf.

For those who just want to use it without hosting their own installation, I’ve also made the service publicly accessible at http://refine.codefork.com, where there are instructions on how to configure OpenRefine to use it.

What Django Can (and Can’t) Do for You

I’m joining a team at work for the next few weeks to hammer out what will be my second significant Django project.

I’m not an expert on Django, but I have enough experience with it now to say that it facilitates the vast majority of web application programming tasks with very little pain. It’s highly opinionated and very complex, and has all the issues that come with that, but if you learn its philosophy, it serves you extremely well. And in cases where it doesn’t—say, with database queries that can’t be written easily with the Django ORM—you can selectively bypass parts of the framework and just do it yourself.

So I’ve been puzzled by complaints I’ve been hearing about how difficult it is to work with Django. There’s an initial learning curve, sure, but I didn’t think it was THAT bad. Yet over and over again, I kept hearing the grumbling, “why do I have to do it this way?”

A recent example came up with the way Django does model inheritance. There are a few ways to do it, with important differences in how the database tables are organized. You have to understand this in order to make a good choice, so of course, it takes a little time to research.

Having worked with Java’s Hibernate, I recognized some of the similarities in Django’s approach to addressing the fundamental problem of the impedance mismatch between object classes and database tables. Every ORM must deal with this, and there are nicer and less nice ways to deal with it. The point is, there’s no way to avoid it.

I realized that the complaints weren’t actually about Django per se, despite appearances. They were complaints about not understanding the underlying technologies. People were expecting Django to replace the need for knowledge about databases, HTTP, HTML, MVC architecture, etc. It doesn’t. That’s a poor way to approach what Django can do for you.

The metaphor of tools is useful here. If you gave me a set of sophisticated, high-quality tools to build a house, but I didn’t know the first thing about construction, I might complain that the tools suck because I can’t easily accomplish what I want, because I’m forced to use them in (what seems to me to be) clumsy ways. But that would be terribly misguided.

So the complaints weren’t about the merits of Django compared to how other frameworks do things. What they’re really saying is, “This is different from how I would do it, if I were writing a web framework from scratch.” Which is funny, because I’m not convinced they could invent a better wheel, given their limited knowledge and experience. (This is not a dig: I’ve worked on quite a few projects, many with custom frameworks, and doubt I could conceive of something easier to use and still as powerful as Django. Designing frameworks is hard.) Sometimes the complaints are thinly veiled anti-framework rants, which is fine, I suppose, if you prefer the good old days of 1998. But God help you if you try to create anything really complicated or useful.

Goodbye, Sublime Text?

When one of my coworkers started using Sublime Text about a year ago, I was intrigued. I played with it and found it to be a very featureful and speedy editor. I wasn’t compelled enough to make the switch from Emacs, though. (You’ll pry it from my cold dead hands!) But I really liked the fact that you could write plugins for it in Python.

So for fun, I gradually ported my Emacs library, which integrates with a bunch of custom development tools at work, to Sublime Text. It works very well, and the ST users in the office have been happy with it. Although I don’t actually use ST regularly, I’ve since been following news about its development.

What I discovered is that many of its users are unhappy with the price tag and dissatisfied with the support they received via the forums. So much so, in fact, that there’s now an attempt to create an open source clone by reverse engineering it. The project is named lime.

I learned about this with very mixed feelings. There’s a good chance the project will take off, given how much frustration exists with ST. Of course, the trend is nothing new: open source software has been supplanting closed source commercial software for a long time now. But this isn’t Microsoft or Oracle we’re talking about; it’s a very small company, charging what I think is a reasonable amount of money for its product. While the developers undoubtedly could do more to make their users happier, I imagine they can’t do much without hurting what are probably pretty slim profit margins. That, or not sleeping ever again.

It’s not news that making a software product is much less viable than it used to be. Where money is made, it’s increasingly through consulting and customization, but one wonders about the size of that market.

It’s generally a good thing that open source has “socialized” software development: technology has enabled communities of programmers to contribute and collaborate on a large scale, in a highly distributed fashion, to create good quality software available to all, taking it out of the profit equation. The problem is that the rest of the economy hasn’t caught up with this new kind of economics.

I don’t mean to sound dramatic: there are many jobs out there for programmers, of course. But it saddens me that if you want to create a product to sell, it’s simply not enough to have a good idea anymore. The product has to be dirt cheap or free, you have to respond to every message immediately, and you have to honor every single feature request. Between the open source world and the big software companies that service corporate customers, there is a vast middle ground of small companies that is quickly vanishing.

Exercise Mania

My ex-girlfriend’s mom started doing a jazzercise class in the 80s and hasn’t stopped since. It’s a weird thing.

Programming exercises can be just as addictive as some forms of well-marketed physical exercise. I have to actively resist writing solutions in Lisp to every silly problem I stumble upon online. But this morning, I came across this simple one (a blog post from 2007, but old posts resurface on Hacker News all the time) and, for some reason, couldn’t pass it up:

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.
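For the record, a plain, deliberately uncreative solution in Python:

```python
def fizzbuzz(n):
    # Multiples of both 3 and 5 must be checked first.
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

for i in range(1, 101):
    print(fizzbuzz(i))
```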

It’s so silly that I guess some folks, like the blogger linked above, have turned it into a problem of “what’s the most creative/strange way I can think of to do this?” Which is interesting and fun in its own right.

What I don’t understand is the attitude expressed in the original fizzbuzz blog post, “Using FizzBuzz to Find Developers who Grok Coding”. Does it really matter if it takes someone 5 or 20 minutes to write a solution? (FWIW, I took about 10 mins). A shorter time with such a goofy example means someone “groks” coding more than another? Ridiculous.

These types of puzzles are amusing precisely because a lot of real-world development (which is what the vast majority of people who code for a living do) is unfortunately pretty rote, and doesn’t require a lot of algorithmic thinking. Code monkeys learn to use libraries and frameworks, and spend a lot of time trying to use them correctly and cleanly, in order to implement straightforward requirements. So these exercises are a shift in mindset. In a tiny way, they put you back on the path of engineering. It’s why people find them interesting to do. It’s why I’m reading the Structure and Interpretation of Computer Programs.

Taking a few more minutes might mean you’re not accustomed to solving such problems every minute of your working life. But that’s not at all the same as gauging whether someone understands coding or not. That kind of measurement is naive at best.

Making Emacs an IDE

It’s that time when bloggers wax introspective about the past year. The major personal revelation of 2011 was re-discovering something very old and putting it to new use: for me, 2011 was the year of the Emacs IDE.

I’ve been using Emacs, on and off, for close to a decade now. What’s changed is that, in the past few months, I’ve been writing extensions for it. It started with a simple desire to better navigate files in a complex directory hierarchy that followed specific and somewhat convoluted conventions. At first, learning Emacs Lisp was simply a means to an end, but I ended up liking it so much that I started exploring Common Lisp (and more recently, Clojure, since I’ve worked with Java in the past).

What started as a small task has become a larger project of turning Emacs into an IDE.

To understand this, one needs to know some context about the system I work with. We developers edit tons of XML files and files in other text formats, which all drive our proprietary web application product. We have many command line tools that manipulate these files in various ways; the system was originally designed by folks who followed the UNIX philosophy of building orthogonal tools and chaining them together.

There are pros and cons to this system; for reasons I won’t get into, I don’t love it, but it’s what we work with right now. When I started the job, the vast majority of the developers used screen, vi, and the shell prompt. Typical workflows that involved working with only a few files could be extremely hard to keep track of, and usually required a lot of copying and pasting between screen and/or ssh sessions. Few people seemed to mind, but I found the workflow to contain too much extraneous cognitive load, and the state of the tools made development very prone to error.

Gradually, I’ve been integrating our tools into Emacs. Sometimes that simply means binding a key combination to running a diagnostics program and storing the output in a buffer. Sometimes it means accumulating that output for history’s sake. Sometimes it means parsing program output, processing it in Emacs Lisp, and inserting a result into the current buffer. Sometimes it means running external programs, even GUI applications, and tweaking them a bit to tell Emacs to open specific files you want to look at.

The productivity gains have been amazing. This is no reason to brag: managing several screen sessions with different vi and shell instances wasn’t exactly hard to improve upon. But Emacs made it fairly painless. Emacs Lisp has proved to be wonderful “glue” for integrating existing tools in our environment.

Writing tools that enable you to do other job tasks better is a really interesting experience; I’ve never done it to such an extensive degree. So far, one other person in my group is using this Emacs IDE, and she has been happy with how much it facilitates her work. Others who swing by my desk for something often watch me work for a moment, and ask, “how did you do that?! that’s not vi, is it?”

Getting more people to switch over means convincing them that the steep learning curve of Emacs is worth the gains of the integration I’ve done. I’m not sure how much that will happen, since a big part of it is cultural. But if there aren’t any more converts, I don’t really care. The best thing about this ongoing project is that I am the end user. The software I wrote is really for myself. It is something I use intensively every single day. And that makes it all the more gratifying.

Learning Lisp

For work projects, I often have to jump around a lot of XML files referenced in one another by name. The files follow a strict naming convention, but it’s one that looks like gibberish to human eyes. Typing them is difficult and prone to error; plus there are so many files in a given project that navigating a visual tree of the directory isn’t much easier.

So I wrote a small library for Emacs to find and open a file in the directory tree, if the cursor is over a string of text that matches a valid filename in the convention. I’m embarrassed to admit this took me two full days (a few other functions were in the library too). The only previous exposure to Emacs Lisp I had was tweaking my .emacs file, so I had to learn the language as well as hunt down the right functions to call for what I wanted. The work is paying off: it feels 10x easier to navigate project files now.

I experimented with trying to do this in Komodo Edit, a popular editor among my coworkers, but it involved learning Mozilla’s XUL and Komodo’s own API, as well as writing JavaScript (yuck), so I abandoned that effort pretty quickly.

I didn’t think I would like Lisp, but the experience has been pretty fascinating, and I’m now making my way through the Practical Common Lisp book by Peter Seibel. It’s interesting learning a language that’s half a century old (!), and that’s influenced so many contemporary languages. One might think that there’s nothing there worth learning or revisiting, but that is so wrong. In particular, I’m trying to wrap my head around the power of Lisp macros and the way they allow you to create new syntactic abstractions. The idea of extending the very language itself, rather than just adding new functions, is mind-boggling, to say the least. And, from what I understand, it remains fairly unique to Lisp in spite of the flood of new languages in the past few decades.

It’s disheartening not being able to find much info about who actually uses Lisp anymore, aside from hackers building modules for Emacs. Paul Graham has a cool essay, “Beating the Averages”, about using Lisp to build online store software that Yahoo eventually acquired. And ITA Software, which makes an airfare search engine that powers much of the travel industry, uses Lisp. But aside from these bits of info, there isn’t much out there in the way of Lisp “success stories.”