codefork.com – building a new world in the shell of the old

Downgrading a package in Ubuntu

A recent regression badly broke the docker compose plugin. A comment in that issue indicated it probably wouldn’t be fixed until after the holidays, so I needed to roll back to a previous working version.

I’ve never actually had to do that with a package before. It was pretty easy, though the steps are not immediately apparent. Here’s what I did; maybe it will help someone trying to accomplish something similar.

I’m using the repository for Docker Engine. You can load the repository URL in a browser to navigate around: https://download.docker.com/linux/ubuntu/

Find the exact version string to tell apt to install. For third-party repositories, the string will often include the distro and version in it. I’m running Ubuntu 22.04 (jammy) so I downloaded this text file containing the released versions for my distro, version, and architecture:
https://download.docker.com/linux/ubuntu/dists/jammy/stable/binary-amd64/Packages

Looking through that file, it’s evident that the last version released prior to 2.32.0 for the docker-compose-plugin package was “2.31.0-1~ubuntu.22.04~jammy”.
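If you’d rather not eyeball the file, here is a minimal Python sketch of pulling the version strings out of it. It assumes the usual Debian index layout: stanzas separated by blank lines, each made of "Field: value" lines. The SAMPLE text and the versions_for helper are made up for illustration; the real file has many more fields and packages, and isn’t guaranteed to be sorted.

```python
# Sketch: list the versions of one package found in a Debian-style Packages index.
# SAMPLE is a made-up excerpt for illustration, not the real file contents.
SAMPLE = """\
Package: docker-compose-plugin
Architecture: amd64
Version: 2.31.0-1~ubuntu.22.04~jammy

Package: docker-compose-plugin
Architecture: amd64
Version: 2.32.0-1~ubuntu.22.04~jammy

Package: some-other-package
Architecture: amd64
Version: 1.0.0
"""


def versions_for(packages_text, package_name):
    """Return the Version field of every stanza whose Package field matches."""
    found = []
    # Stanzas are separated by blank lines; each line is "Field: value".
    for stanza in packages_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines (leading whitespace) and lines with no colon.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if fields.get("Package") == package_name and "Version" in fields:
            found.append(fields["Version"])
    return found


versions = versions_for(SAMPLE, "docker-compose-plugin")
print(versions)
```

To try it against the real index, save the Packages file locally and pass its contents in place of SAMPLE.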

Run this command to install that version:

apt install docker-compose-plugin=2.31.0-1~ubuntu.22.04~jammy

Now apt needs to be told not to try to upgrade docker-compose-plugin the next time you run “apt upgrade”:

apt-mark hold docker-compose-plugin

When a new package is released with the fix, replace “hold” with “unhold” in the above command to allow the upgrade.

Netgear AX5400 Headaches

I recently purchased this wifi router, model RAX54v2, and I’m regretting it.

To my great annoyance, it requires you to create an account on the Netgear website. You need to log in to that account when accessing the admin web page interface on the device. I’m guessing that it’s used for the paid “Netgear Armor” feature, but I don’t want to use that. Regardless, it’s still REQUIRED.

It gets worse. The login doesn’t even work. I successfully created an account and was able to log in via the netgear.com website. But when the router’s admin page asks me to log in using the Netgear account, I can see that it makes a POST request to a URL that returns a 404, and the page shows an extremely helpful “Something went wrong” message. Even playing by their rules, I’m effectively locked out of MY OWN DEVICE.

The good news is I did find a solution!

Just unplug the cable from the Internet port. Then, after you log in to http://192.168.1.1/ using the admin credentials, you will NOT be prompted to log in with your Netgear account. Of course, you won’t be able to use the Internet, which is a pretty horrible experience, but at least you can make your configuration changes. Plug the cable back in when you’re done.

You do NOT have to reset the router and lose all your settings before doing the above.

I posted two messages, a reply and a new post, to the community support forums on the Netgear site containing the above information, but both times, the messages mysteriously disappeared after a minute or two. There are sporadic mentions of this solution buried deep in various threads in the context of other problems. I suspect Netgear doesn’t want to make it too clear to desperate users that they can bypass the non-functional Netgear login IN ORDER TO CONFIGURE THE DEVICE THEY OWN.

I hope someone looking for this information finds it here. Needless to say, I won’t be purchasing any more products from Netgear.

Engineering

Engineering is about making tradeoffs. You consider all the factors, optimize for the ones that are most important, and make the best choices based on that.

Put that way, engineering is pretty damn boring. It’s not easy. But being difficult doesn’t necessarily make something interesting.

In my younger days, I put a lot of energy into exploring the latest fads and keeping up with “best practices,” all in the hopes of becoming a better software engineer. I do that a lot less these days–only as much as I need to for work. Anyone who genuinely gets off on “best practices” is a bit suspect in my book.

At some point, I think my interest in coding changed from “is this well written software?” to something like “what kind of thinking does this code imply about how it solves the problem?”

This is way more interesting! Even so-called “bad” code can be very interesting and teach you something about a unique way to approach the problem it solves.

Obviously, we can’t spend all our time appreciating various aspects of less-than-optimal code. There’s work to be done, efficiency to be measured, best practices to follow. It’s not like we got into coding because we enjoyed thinking or anything.

Generator Expressions

I discovered Python generator expressions this week. They’ve been around since v2.4, way back in 2004, and yet, somehow, they managed to escape my notice until now. I don’t think I’ve ever seen them used in any codebase I’ve worked on.

Generator expression syntax is almost the same as list comprehension syntax, except instead of square brackets, you use parentheses. And instead of building a list, the expression evaluates to a generator object that produces its values lazily, one at a time, as something iterates over it.
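Here’s a small sketch of that difference. The variable names are just for illustration:

```python
import types

squares_list = [n * n for n in range(5)]  # built immediately, all values in memory
squares_gen = (n * n for n in range(5))   # nothing has been computed yet

print(type(squares_list).__name__)                   # list
print(isinstance(squares_gen, types.GeneratorType))  # True

first = next(squares_gen)  # computes only the first value
rest = list(squares_gen)   # consumes whatever is left
again = list(squares_gen)  # the generator is now exhausted, so this is empty

print(first, rest, again)
```

Note the last line: unlike a list, a generator can only be iterated once.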

This is quite a nice bit of syntactic sugar that helps to keep things more succinct than writing iterators and generator functions. But it’s not as succinct as, say, the chained method syntax you find in Scala or Ruby, which I think is vastly superior in terms of clarity. Several years ago, I wrote up some notes about Ruby streams in a GitHub repo for my co-workers because it was a valuable technique in some data processing scripts we were working on. (Since then, I haven’t done much Ruby.)

Here’s a bit of Python code that duplicates the simple Ruby example in the linked repo above, with the identical output. Again, it’s not nearly as nice to look at as chained method calls, but still better than having to write separate generator functions or classes.

nums = list(range(1,6))

# print() returns None, so "print(...) or value" logs the call and then evaluates to value
add10 = lambda i: print(f"adding 10 to {i}") or i + 10
filter_even = lambda i: print(f"filtering {i} on evenness") or i % 2 == 0
output = lambda i: print(f"in each: {i}")

# list comprehensions: eager, so each step runs over the whole list before the next step starts

r1 = [ add10(x) for x in nums ]
r2 = [ x for x in r1 if filter_even(x) ]
r3 = [ output(x) for x in r2 ]

# generator expressions: lazy, so nothing runs until list(r3) pulls each value through all three steps in turn

r1 = ( add10(x) for x in nums )
r2 = ( x for x in r1 if filter_even(x) )
r3 = ( output(x) for x in r2 )
list(r3)

Data as Coping Mechanism

Life under this pandemic has been hard. Oddly, one of the things that’s helped me deal is to play around with the coronavirus data. The numbers in the U.S. are horrifying, of course, but they’ve also been soothing at a technical level, maybe because working with the data is somewhat different from the work I do for my job. It’s also been neat to do hands-on validation of reporting in the media and various claims made about trends. I’ve been showing my results to friends who seem to find them insightful.

The repository is here. There are links in the README to the charts and visualizations.

Some technical reflections on this little hobby project:

I used Python and SQLite. Apache Spark seemed like overkill and I haven’t spent enough time with Spark to be able to troubleshoot intermediate pipeline steps as easily as I can in a plain SQL database. SQLite is fantastic for doing ETL or ELT. I can’t recommend it enough for non-“big data” scenarios. It’s fast (if you use local disk access), has enough SQL features for most ELT/ETL work, and is well-suited for use by a single user. It’s also good if the end goal is to produce data that will ultimately get imported into another system, say, a full-fledged RDBMS data warehouse that serves multiple users.

Currently, with just over 5 months of county- and state-level data, it takes ~2 minutes to load all the raw data, transform it into dimensional tables that calculate various measures, and create data files used by the web pages that display tables and charts. The SQLite file is 850 MB, which includes a lot of stage tables. This is on my laptop with an i5-7300U processor. Not too bad.

I created a Makefile to handle dependencies in the data pipeline, so that it only re-runs parts as needed. It’s currently not as fine-grained as it could be. For example, any change in the JHU CSSE data files will reload ALL the files into the database, but that portion of the code takes only maybe 10 seconds total anyway. Similarly, all the dimensional models are created in a single process and could be split out. I’m happy with how it turned out overall, with the qualification that writing and maintaining the Makefile is a bit of a pain. I might try using Apache Airflow instead at some point.

Storing data files in a git repo feels gross. But I did this so the chart and map web pages served through GitHub Pages could load the static data files. It’s a simple and free hosting solution.

In general, I like how simple this setup turned out to be and how easy it’s been to add new measures or tweak existing ones.
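To give a flavor of the Python-plus-SQLite approach described above, here is a minimal sketch, not the actual code from the repository. The table names (stage_cases, fact_daily), the columns, and the sample rows are all hypothetical. It loads raw rows into a stage table and then derives a measure with plain SQL:

```python
import sqlite3

# Hypothetical raw rows: (county, date, cumulative confirmed cases).
RAW_ROWS = [
    ("Example County", "2020-07-01", 100),
    ("Example County", "2020-07-02", 130),
    ("Example County", "2020-07-03", 175),
]

conn = sqlite3.connect(":memory:")  # a real pipeline would use a file on local disk

# Load: copy the raw data into a stage table as-is.
conn.execute("CREATE TABLE stage_cases (county TEXT, date TEXT, confirmed INTEGER)")
conn.executemany("INSERT INTO stage_cases VALUES (?, ?, ?)", RAW_ROWS)

# Transform: derive a measure (new cases per day) inside the database.
# The first day has no prior row, so its full count is treated as new.
conn.execute(
    """
    CREATE TABLE fact_daily AS
    SELECT cur.county,
           cur.date,
           cur.confirmed - COALESCE(prev.confirmed, 0) AS new_cases
    FROM stage_cases AS cur
    LEFT JOIN stage_cases AS prev
      ON prev.county = cur.county
     AND prev.date = date(cur.date, '-1 day')
    """
)

daily = conn.execute(
    "SELECT date, new_cases FROM fact_daily ORDER BY date"
).fetchall()
print(daily)
conn.close()
```

Because the stage table sticks around, it’s easy to poke at intermediate results with ad hoc queries when a number looks off.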

EDIT: In November 2020, I switched to using BigQuery.