The Myth of Artisanal Programming

Paul Chiusano, co-author of the excellent Functional Programming in Scala from Manning (one of the few tech publishers I buy from; worth every penny), recently wrote a blog post titled “The advantages of static typing, simply stated”.

Lately all I seem to do is rant to people about this exact topic. Paul’s post is way more succinct than anything I can write, so go over there and read it.

While he takes pains to give a balanced treatment of static vs dynamic type systems, it seems much more cut and dried to me. Dynamic languages are easier and faster for development when you’re getting started on a project, and that’s great if the project never gets very big. But they scale very poorly, for all the reasons he describes. Recently, I had the daunting task of reading almost 10k lines of Perl code (pretty good Perl, in my opinion). It was hard to make sense of and to figure out how to modify and extend, whereas the MUCH larger Java codebase (over 100k lines, if I recall) that I worked with years ago felt very manageable.

My own history as a programmer matches Paul’s very closely. I started with Java, which was annoying but not a bad language by any means. Then Python came along and seemed like a liberation from Java’s rigidity and verbosity. But Python, Ruby and others are showing their weaknesses, and it’s no mystery why people are turning to the newer generation of statically typed languages like Scala, Haskell, Go, etc.

People who haven’t been around as long don’t necessarily have this perspective.

In retrospect, it’s interesting to me how we programmers “got sold” on dynamic languages, from a cultural perspective. You might recall that a big selling point was using simple text editors rather than IDEs, and there was this sense that writing code this way made you closer to the software somehow. Java was corporate, while Python was hand-crafted. There was a vague implicit notion of “artisanal” programming in these circles.

The upshot, of course, is that every time you read a chunk of code or call a function or method, your brain has to do a lot of the checking that a statically typed language would enforce and verify for you. In a dynamic language, you won’t know what happens until the code runs. In large measure, the quality of software hinges on how much you can tell, a priori, about code before it runs at all. In a dynamic world, anything can happen, and often does.
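To make this concrete, here is a minimal sketch in Python. The function `total_length` and its inputs are made up for illustration. The type annotations are only hints: Python itself ignores them at runtime, so the bad call below fails only when that line actually executes, whereas a static checker such as mypy can flag it from the annotations alone.

```python
from typing import List


def total_length(words: List[str]) -> int:
    """Sum the lengths of the given strings (hypothetical example)."""
    return sum(len(word) for word in words)


# A correct call behaves as expected.
print(total_length(["static", "typing"]))

# This call is wrong (an int has no length), but nothing complains
# until the line runs. In a statically typed language, or with a
# static checker run over these annotations, the mistake is reported
# before the program ever starts.
try:
    total_length(["static", 42])
except TypeError as error:
    print("failed only at runtime:", error)
```

In a real codebase the bad call is rarely this close to the function definition, which is exactly why the error tends to surface late.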

This is a nightmare, pure and simple. Much of the strong focus on writing automated tests exists basically to make up for the lack of static typing.

True artisanship lies in design: namely, thinking hard about the data structures and code organization you’re committing to. It’s not about being able to take liberties that can result in things that make no sense to the machine and that can cause errors at runtime that could have been caught beforehand.

Algorithms II: An Aside in Week 2 on Learning to Code

I finished Part I a while ago (yay!) and am currently in week 2 of Part II.

Tangential thoughts: a particularly challenging aspect of studying algorithms is proof of solvability and correctness. How can you tell if a problem is possible to solve at all? If you devise a new method for computing something, how do you know that it really works in every case? Mathematical reasoning can allow you to definitively prove that something does what you intended it to do. This is especially important when it is difficult or even impossible to cover all possible cases through empirical verification.

Sedgewick usually glosses over the proofs in his lectures, since they’re not the core focus of the course. Some of these proofs are pretty hard to grasp even at a general level of description.

This aspect of algorithms dovetails with my excursion into functional programming in that both are deeply mathematical. They both indicate a view of computing as a branch of formal mathematics. Edsger Dijkstra was a strong proponent of this approach to computer science. I don’t claim to understand what this means in a very deep way, but I found the following example in Dijkstra’s essay, “On the Cruelty of Really Teaching Computer Science”, extremely helpful in starting to grasp this principle:

Consider the plane figure Q, defined as the 8 by 8 square from which, at two opposite corners, two 1 by 1 squares have been removed. The area of Q is 62, which equals the combined area of 31 dominos of 1 by 2. The theorem is that the figure Q cannot be covered by 31 such dominos.

Another way of stating the theorem is that if you start with squared paper and begin covering this by placing each next domino on two new adjacent squares, no placement of 31 dominos will yield the figure Q.

So, a possible way of proving the theorem is by generating all possible placements of dominos and verifying for each placement that it does not yield the figure Q: a tremendously laborious job.

The simple argument, however, is as follows. Color the squares of the squared paper as on a chess board. Each domino, covering two adjacent squares, covers 1 white and 1 black square, and, hence, each placement covers as many white squares as it covers black squares. In the figure Q, however, the number of white squares and the number of black squares differ by 2—opposite corners lying on the same diagonal—and, hence, no placement of dominos yields figure Q.

Not only is the above simple argument many orders of magnitude shorter than the exhaustive investigation of the possible placements of 31 dominos, it is also essentially more powerful for it covers the generalization of Q by replacing the original 8 by 8 square with any rectangle with sides of even length. The number of such rectangles being infinite, the former method of exhaustive exploration is essentially inadequate for proving our generalized theorem.

And this concludes my example. It has been presented because it illustrates, in a nutshell, the power of down-to-earth mathematics; needless to say, refusal to exploit this power of down-to-earth mathematics amounts to intellectual and technological suicide. The moral of the story is: deal with all elements of a set by ignoring them and working with the set’s definition.
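The coloring argument in the passage above is easy to check mechanically. The following is a small Python sketch of my own, not something from the essay; the names `color_counts` and `coloring_rules_out_tiling` are made up for illustration. It counts the two chessboard colors on a board with two opposite corners removed. Since every domino covers one square of each color, unequal counts rule out a tiling without trying a single placement.

```python
def color_counts(rows, cols, removed):
    """Count squares of each chessboard color, skipping removed squares.

    A square at (r, c) gets color (r + c) % 2, so any two adjacent
    squares always have different colors.
    """
    counts = [0, 0]
    for r in range(rows):
        for c in range(cols):
            if (r, c) not in removed:
                counts[(r + c) % 2] += 1
    return counts


def coloring_rules_out_tiling(rows, cols):
    """True if removing two opposite corners leaves unequal color counts."""
    removed = {(0, 0), (rows - 1, cols - 1)}
    first, second = color_counts(rows, cols, removed)
    return first != second


# The figure Q: an 8 by 8 square with two opposite corners removed.
print(color_counts(8, 8, {(0, 0), (7, 7)}))

# The generalization to rectangles with sides of even length,
# checked here only for a handful of sizes.
print(all(coloring_rules_out_tiling(r, c)
          for r in range(2, 21, 2)
          for c in range(2, 21, 2)))
```

Note that the loop at the end only samples a few sizes. The argument itself covers all of them at once, which is the point Dijkstra is making.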

The bombshell here is that learning to code shouldn’t be treated as a matter of what he calls its “operational semantics.” It’s a mistake to focus on what code does or how it behaves in its execution. Instead, you should think about code as a purely formal system:

… A programming language, with its formal syntax and with the proof rules that define its semantics, is a formal system for which program execution provides only a model. It is well-known that formal systems should be dealt with in their own right and not in terms of a specific model. And, again, the corollary is that we should reason about programs without even mentioning their possible “behaviors.”

This isn’t academic. When people talk about the ability to “reason about code,” I think this is what they mean. It’s a skill that can be hard to pin down exactly, but you can recognize it right away in the programmers who have it. They can envision the challenges in designing a piece of software without being at a computer or writing any code; they can predict the consequences that a given change has for complex systems; and they can often troubleshoot bugs effectively by asking the right questions rather than rooting around in code. This is the holy grail of programming.

Needless to say, it’s a life-long pursuit.

Algorithms I: Notes in Week 5

Scattered thoughts:

A course on learning a programming language will help answer the question, “how do I do X?” The fun thing about an algorithms course is that the question is “how do I do X within certain parameters of time and space?”

In the real world, the two questions are actually one and the same. I’ve just come away from a project that had serious scalability problems, because many of its features could handle only very small sets of data used in development; when the app was run against live data, things stopped working because they would hit a timeout limit or processes would run out of memory.

I’m learning quickly that I can often intuit the “shape” of how an algorithm will perform, and I now have better language for describing this, but I’m not so good at calculating precisely the order of growth for even slightly complex code. It’s hard!
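When the exact calculation is out of reach, one empirical check is to double the input size and see how the amount of work changes. Below is a minimal Python sketch of that idea; `count_pairs` and `doubling_ratio` are hypothetical names for illustration, and the code counts loop iterations rather than measuring wall-clock time so the result is repeatable.

```python
def count_pairs(n):
    """Count inner-loop iterations when visiting every pair i < j."""
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1
    return steps


def doubling_ratio(n):
    """How much more work the loop does when the input size doubles."""
    return count_pairs(2 * n) / count_pairs(n)


# The loop runs n * (n - 1) / 2 times, so doubling n should multiply
# the work by a factor that approaches 4, the signature of quadratic
# growth. A linear loop would give a ratio near 2, a cubic one near 8.
for n in (100, 200, 400):
    print(n, count_pairs(n), round(doubling_ratio(n), 3))
```

The ratio gives the exponent, not the constant factor, which is usually enough to tell whether code will survive a much larger input.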

One paranoia-inducing aspect of programming assignments: for week 4’s assignment, a single timing test (1 out of 17) failed for my code because it took too long to finish. It’s hard to figure out… does this single failure expose a flaw in my overall implementation (if so, why did the other 16 pass)? Or was this last test thrown in as a “bonus” involving a difficult set of inputs that would require further optimization if you wanted to get full points? This is a tricky thing to assess as a student, and something only a human being would be able to tell you.

Trees are truly magical. I feel like I’ve barely started to grasp their many applications.

Algorithms I on Coursera

I’m currently taking the “Algorithms I” course on Coursera, a session of which started on January 22nd. I thought I’d write up my impressions so far on taking my first MOOC.

As someone who taught at a university for seven years in the humanities, I should say right off the bat that I dislike the idea of online learning for the reasons you might expect. But this course appealed to me for a few reasons. It’s developed and taught by Robert Sedgewick and Kevin Wayne, the authors of the highly regarded Algorithms, 4th Edition book. The syllabi of the two-course sequence on Coursera would make for the type of semester-length course you’d find in a respectable Computer Science department. Finally, Coursera has a reputation for offering more rigorous and demanding courses than other similar MOOC sites.

I’m keeping up with the schedule and am in the middle of the Week 2 material. I’ve found it to be a positive experience so far, and more challenging than I’d expected!

Some initial impressions:

  • The course is a serious time commitment. Per week, it’s 2 hours of lecture + 2 hours for exercises + 4-12 hours for the programming assignment. I’ve chosen to skip the “interview questions” supplementary material.
  • Assignment grading is, thus far, very rigorous. Submitted source code is analyzed and run through a battery of tests measuring not only correctness, but code cleanliness, run times, and memory use, and scored accordingly.
  • The ability to submit exercises and assignments as many times as you like in order to improve your score is a fantastic feature. (I don’t know if all Coursera courses work this way.) It means you can really learn from your mistakes by correcting them; also, it gives you the chance to try out alternative solutions. This is WAY better than the traditional one-shot-only model of graded assignments, which is terrible for actual learning.
  • Basing the course on a published textbook (which is optional) is extremely helpful. There’s material covered more deeply in the text than in the lectures, but the lectures also address some aspects of topics and problems not covered in the book. This makes for a strong complementary relationship between the two; it doesn’t feel like the lectures are simply repeating the textbook.
  • You’re firmly expected to have some basic programming skills and a bit of math as a prerequisite. I like that the lectures keep the focus on the topics at hand, and don’t try to make the course all things to all people. If students need to “catch up” because they’re new to Java or their math is rusty, they can use the discussion forums to do so.

As for the actual material, I’ve already learned a lot:

  • I’ve gotten some exposure to formal methods for algorithm analysis. A week and a half obviously isn’t going to make anyone great at this, but at least I now have some approaches for thinking through correctness, run times, and memory use mathematically, whereas before, I would mostly work empirically.
  • I can better identify different orders of growth and some of the common code patterns that indicate them.
  • The first week’s case study of different algorithms for Union-Find was, for me, a thought-provoking exercise in what is possible with arrays vs trees in representing relationships among data. The programming assignment is so stringent that it’s difficult to satisfy all the run time and memory requirements for a perfect score. This has generated a lot of insightful discussion in the forums about optimization.
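To illustrate the arrays vs trees contrast in Union-Find, here is a rough Python sketch of two well-known variants: quick-find, which stores a component id per element, and weighted quick-union, which stores a parent link per element. The course itself uses Java, and the class and method names here are my own, so treat this as a sketch of the general idea rather than the course’s reference code.

```python
class QuickFind:
    """Flat array of component ids: checking is one lookup, union scans."""

    def __init__(self, n):
        self.ids = list(range(n))

    def connected(self, p, q):
        return self.ids[p] == self.ids[q]

    def union(self, p, q):
        old, new = self.ids[p], self.ids[q]
        if old != new:
            # Every element of the old component must be relabeled.
            self.ids = [new if x == old else x for x in self.ids]


class WeightedQuickUnion:
    """Forest of trees: each element points to a parent, roots to themselves."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n  # number of elements in the tree rooted at i

    def root(self, p):
        while p != self.parent[p]:
            p = self.parent[p]
        return p

    def connected(self, p, q):
        return self.root(p) == self.root(q)

    def union(self, p, q):
        a, b = self.root(p), self.root(q)
        if a == b:
            return
        # Hang the smaller tree under the larger one to keep trees shallow.
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]


for uf in (QuickFind(6), WeightedQuickUnion(6)):
    uf.union(0, 1)
    uf.union(2, 3)
    uf.union(1, 3)
    print(type(uf).__name__, uf.connected(0, 2), uf.connected(0, 4))
```

The trade-off is visible in `union`: the array version touches every element on each merge, while the tree version changes a single parent link after following two short paths to the roots.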

Algorithms really get at the essence of what programming is. Anyone who works as a programmer has to put into practice algorithmic thinking to some degree, even if they aren’t aware of it.

I plan to continue writing about this as a way to keep me accountable for completing the two-course sequence.