Announcing conciliator

I’ve just created a GitHub repository for conciliator, a growing collection of OpenRefine reconciliation services, as well as a Java framework for creating them.

conciliator is a major refactoring of my refine_viaf project and supersedes it. This new project cleanly separates the VIAF-specific parts from the more “boilerplate” pieces needed for any OpenRefine reconciliation service. The result is a framework that allows you to easily write new reconciliation services. My intent here is to make some existing code way more flexible, so that it might be useful to more users and have a longer lifespan.

http://refine.codefork.com has already been running conciliator for a week now; if you’ve been using it, you don’t need to make any changes in OpenRefine.

Currently, conciliator out-of-the-box can query VIAF exactly like refine_viaf does, down to the same URLs. Additionally, conciliator can now query ORCID names. This was a somewhat arbitrary choice; I’ve been doing some ORCID integration at work so it was convenient for me to implement a data source for it as a proof of concept.

With VIAF and ORCID, conciliator acts as an intermediate or “bridge” service, but it would be possible to use conciliator to query other types of data sources as well: files, SQL databases, etc. Right now, you’d have to write your own code to read and parse files, open database connections, etc. But in the future, I hope to add support for these options to make them easier to implement.

For details on how to write your own service in Java using conciliator, see the README.

Are there data sources you’d like to see available as a reconciliation service? Leave a comment to this post. No promises, but I’ll at least consider all requests. And if you write your own service for a data source, please consider submitting your code as a pull request so that others can use it too!

The Myth of Artisanal Programming

Paul Chiusano, the author of the excellent Functional Programming in Scala from Manning (one of the few tech publishers I buy from; worth every penny), recently wrote a blog post titled, “The advantages of static typing, simply stated”.

Lately all I seem to do is rant to people about this exact topic. Paul’s post is way more succinct than anything I can write, so go over there and read it.

While he takes pains to give a balanced treatment of static vs dynamic type systems, it seems much more cut and dried to me. Dynamic languages are easier and faster for development when you’re getting started on a project, and they’re great if that project never gets very big. But they scale very poorly, for all the reasons he describes. Recently, I had the daunting task of reading almost 10k lines of Perl code (pretty good Perl, in my opinion). It was hard to make sense of and figure out how to modify and extend, whereas the MUCH larger Java codebase (over 100k lines, if I recall) that I worked with years ago felt very manageable.

My own history as a programmer matches Paul’s very closely. I started with Java, which was annoying but not a bad language by any means. Then Python came along and seemed like a liberation from Java’s rigidity and verbosity. But Python, Ruby and others are showing their weaknesses, and it’s no mystery why people are turning to the newer generation of statically typed languages like Scala, Haskell, Go, etc.

People who haven’t been around as long don’t necessarily have this perspective.

In retrospect, it’s interesting to me how we programmers “got sold” on dynamic languages, from a cultural perspective. You might recall that a big selling point was using simple text editors rather than IDEs, and there was this sense that writing code this way made you closer to the software somehow. Java was corporate, while Python was hand-crafted. There was a vague implicit notion of “artisanal” programming in these circles.

The upshot, of course, is that every time you read a chunk of code or call a function or method, your brain has to do a lot of the work that a statically typed language would be able to enforce and verify for you. But in a dynamic language, you won’t know what happens until the code runs. In large measure, the quality of software hinges on how much you can tell, a priori, about code before it runs at all. In a dynamic world, anything can happen, and often does.

This is a nightmare, pure and simple. Much of the strong focus on writing automated tests is to basically make up for the lack of static typing.
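To make the point concrete, here is a small Ruby sketch. The names (`total_with_tax`, `Order`) are hypothetical and exist only for illustration. Nothing in the method says what `order` has to be, so a wrong argument is accepted without complaint until the call actually executes:

```ruby
# Hypothetical helper: nothing here states that `order` must respond to `total`.
def total_with_tax(order, rate)
  order.total * (1 + rate)
end

# Hypothetical record type used for the "correct" call.
Order = Struct.new(:total)

# Works as intended: an Order responds to `total`.
ok = total_with_tax(Order.new(100.0), 0.5)

# Passing a Hash by mistake raises no complaint when the file is loaded.
# The failure shows up only when this particular call runs.
error = nil
begin
  total_with_tax({ total: 100.0 }, 0.5)
rescue NoMethodError => e
  error = e
end

puts ok
puts error.class
```

In a statically typed language, the second call would be rejected by the compiler before the program ever ran.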

True artisanship lies in design: namely, thinking hard about the data structures and code organization you’re committing to. It’s not about being able to take liberties that can result in things that make no sense to the machine and that can cause errors at runtime that could have been caught beforehand.

Data Streams in Ruby

Recently I wrote up some notes on how to do data processing using streams (lazy enumerators) in Ruby. Doing so served two purposes: 1) to help clarify my own thinking about better ways to write code for common data-munging tasks, 2) to pass along to co-workers in the hopes of establishing some informal best practices and initiating some conversations.

I decided to post my notes on GitHub. Take a peek if this sort of thing interests you.
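For a taste of the idea, here is a minimal sketch of a lazy pipeline. It is not taken from the notes themselves; the sample data and the `parsed_rows` helper are made up for illustration. Each stage is lazy, so rows are only read and transformed as the consumer asks for them:

```ruby
# Hypothetical sample data standing in for lines read from a file.
LINES = ["id,name", "1,Ada", "", "2,Grace", "3,Linus"].freeze

# Builds a lazy pipeline; no line is processed until a consumer pulls values.
def parsed_rows(lines)
  lines.lazy
       .reject { |line| line.strip.empty? }             # skip blank lines
       .drop(1)                                         # skip the header row
       .map { |line| line.split(",") }                  # split into fields
       .map { |id, name| { id: id.to_i, name: name } }  # build a record
end

# Only as many lines as are needed to produce two rows get processed.
first_two = parsed_rows(LINES).first(2)
puts first_two.inspect
```

With a real file, `File.foreach(path)` called without a block returns an enumerator that can be passed in the same way, so the whole file never has to sit in memory.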

Getting OpenVPN to Add DNS Servers

I couldn’t get my OpenVPN client to add a DNS server that I knew the VPN server was telling it about. It turns out that on Xubuntu 16.04 (and all flavors of Ubuntu, probably), you need to supply additional arguments to make it handle the “dhcp-option” information it receives. More specifically, you have to use the --up and --down options to point to an Ubuntu-supplied script that needs to run when the VPN connection goes up and down.

sudo openvpn --config office.ovpn --script-security 2 --up /etc/openvpn/update-resolv-conf --down /etc/openvpn/update-resolv-conf
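OpenVPN also accepts its command-line options as config file directives, written without the leading dashes. So, as an alternative I haven’t described above, the same settings could go at the end of office.ovpn (assuming you’re free to edit that file):

```
script-security 2
up /etc/openvpn/update-resolv-conf
down /etc/openvpn/update-resolv-conf
```

With those lines in place, the command shortens to sudo openvpn --config office.ovpn.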

Upgrading the Touchpad on a Thinkpad x240

This is a stock photo of a Thinkpad x240, stolen from the interwebs:

x240_stock

This is my own x240, which I bought back in January 2015.

x240_trackpad

If you know about Thinkpads, you probably noticed the difference right away. The x240 (and other models that year) suffered from an incredibly crappy buttonless touchpad. It’s so bad that it’s barely usable. Clicking is ridiculously inaccurate: there’s so much travel that the mouse pointer moves during a click, and there are no buttons to use instead. There were so many complaints that Lenovo replaced it with a better one in the next year’s lineup.

This weekend I finally got around to upgrading it with a touchpad replacement part for the x250. It cost $32 on eBay. This modification is popular, so you can find info about it scattered around in forums and such. I followed the instructions on this page, How to change an x240 trackpad, as it’s one of the clearest ones out there. I couldn’t find much info about the author, whose name appears only as “Michael” on that blog.

Some notes and tips from my experience:

1) Michael’s picture shows a set of wires connected to the touchpad along its side, but mine didn’t have them.

2) The touchpad sits in a well, held in place with adhesive tape, so to remove it, you just pry it off. The problem is that it’s hard to reach “under” the entire touchpad assembly, which is sort of like a sandwich with layers. I ended up partially prying off the top layer before I could get to the bottom and pry the whole thing from the case. Needless to say, this bent the touchpad.

I couldn’t figure out a way to avoid effectively destroying the old touchpad. But since it was so crappy, it was also somewhat satisfying.

3) Detaching and re-attaching the small ribbon cable from/to the underside of the touchpad is VERY tricky. The end of the ribbon is held in place to the connector on the touchpad by a thin black “latch” sitting just behind it. You CANNOT just yank the ribbon out. (This took me a while to figure out!) Lift the black latch, and the ribbon will slide from the connector easily. When connecting it to the new touchpad, tuck the ribbon end securely into the connector, then flip the latch down to lock it in place.

4) At first, the new touchpad wasn’t being recognized by the machine. It worked after I re-seated the ribbon in the connector and also reset the BIOS (as shown in this video: stick a paper clip end into the tiny hole beside the battery and press for 20 seconds). I should have tried those things separately, but got a bit too excited. So you may or may not have to do a BIOS reset.

So far so good. The new touchpad is definitely a big improvement. There’s much less click travel using the pad, and it feels snappier. I really like having the buttons.

The only quirk is that the surface of the touchpad now sits just a hair higher than the palm rest. It’s probably slightly more likely for my hand to accidentally brush it while typing, as compared to the original touchpad, but only time will tell for sure.

It feels like a totally different computer.