Porting my site to Hugo

First published: Oct 30, 2023

My website used to be hand-crafted in vim. This produced a set of static files that were extremely fast and efficient to serve, but painful to update. Setting up a system that factored out the common elements and autogenerated indexes was a no-brainer, but I needed to find the time to implement it.

However, during my recent downtime between posts, I took the time to rebuild much of my domestic computing infrastructure, including updating my web hosting configuration. In the process I found that my newly deployed webserver, Nginx, does not support full content negotiation, which my previous hand-crafted configuration relied on.

(Indeed, the WHATWG has rejected content negotiation as an approach, in favour of explicitly giving the browser a set of choices that it can make using its full awareness of context.)

So to port my site to my new server, I would need to rewrite it. I wanted to retain the operational simplicity, performance and security of a statically generated site [1], so I looked for a static site generator that I could employ and came across a couple of options: Jekyll, implemented in Ruby, and Hugo, implemented in Go. I tried Hugo first and ultimately ran with it. [2]

Whatever system I used, I wanted to:

Hugo

The Hugo getting-started guide is fairly straightforward, and Hugo itself is packaged in Debian, so getting the code running was trivial. However, I wanted to use my own theme rather than a pre-packaged one, and it turns out that the theme isn’t just responsible for presentation but also for templating, and so does a lot of heavy lifting. Hugo has no default fallback templates (“layouts” in Hugo), so getting to a working site from first principles took quite a lot of time.

I built my theme by working out how existing themes (such as Photon and Ananke) worked and transliterating my existing site structure to fit. This caused me to initially use many of the same abstractions they used for factoring out common elements, which in my case weren’t necessary; in hindsight, I should have just put everything in baseof.html and refactored only when necessary.

Migrating pages

I only had a small number of pages on my previous site, and so manually rewriting the existing static .html files into Markdown was quite tractable, and even allowed me to spot and fix a few old typos — and also marvel slightly at how I used to think! Part of me regrets not writing publicly more, while another part of me is quite glad I said fewer foolish things out loud.

One thing that was annoying to switch out mechanically was links: each manually written <a href="...">link text</a> had to become the Markdown [link text](url) form. The following vim incantation replaces all instances automatically, assuming each anchor contains a single href attribute and nothing else:

%s/<a href="\([^"]\+\)">\([^<]\+\)<\/a>/[\2](\1)/g
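For batch conversion outside the editor, the same substitution can be sketched in Python. This is a rough equivalent of the vim pattern above, not a general HTML parser: it carries the same assumption that each anchor has exactly one href attribute and plain-text content. The function name anchors_to_markdown is made up for illustration.

```python
import re

# Mirrors the vim pattern: <a href="URL">TEXT</a> with a single href
# attribute and no nested tags inside the link text.
ANCHOR = re.compile(r'<a href="([^"]+)">([^<]+)</a>')


def anchors_to_markdown(text: str) -> str:
    """Rewrite simple HTML anchors as Markdown [text](url) links."""
    return ANCHOR.sub(r'[\2](\1)', text)


if __name__ == "__main__":
    sample = 'See <a href="https://gohugo.io/">the Hugo site</a> for details.'
    print(anchors_to_markdown(sample))
```

Anchors with extra attributes (a class or title, say) are deliberately left untouched, so they can be found and converted by hand afterwards.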

I also updated the URL scheme for many links to other locations from http:// to https://. Many of the destinations on my Elsewhere page had bitrotted; these I fairly aggressively pruned.

Limitations on URL choices

In practice, perfectly reflecting my previous URL scheme with Hugo is not possible, so adopting Hugo’s opinionated ‘not ugly’ URL scheme and setting up redirects in the webserver is the path of least resistance.

Hugo doesn’t automatically generate ‘section’ pages for each level of a hierarchical section (so it will make /sections/index.html but not /sections/{2023,2019,...}/index.html), which is frustrating. It’s possible to make it do this, but it requires creating an _index.html file for each yearly subsection (with an appropriate “Title” set in the front-matter), when what I’d like is for Hugo to generate one automatically at every level, based on the path name, without any manual action.
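One way to take the manual step out of this is a small pre-build script that drops a stub index file into every subsection directory that lacks one. The sketch below is only an illustration under assumptions: the content/sections path in the usage note is hypothetical, the front-matter is minimal YAML with just a title taken from the directory name, and write_section_indexes is an invented helper name.

```python
from pathlib import Path


def write_section_indexes(content_root: Path,
                          index_name: str = "_index.html") -> list[Path]:
    """Create a stub index file in every subdirectory that lacks one.

    The title is taken from the directory name (for example "2023").
    Returns the list of files that were created.
    """
    created = []
    # Collect the directories first so that writing files does not
    # interfere with the directory walk.
    directories = sorted(p for p in content_root.rglob("*") if p.is_dir())
    for directory in directories:
        index = directory / index_name
        if index.exists():
            continue  # never overwrite a hand-written index
        # Quote the title so a year such as 2023 stays a string in YAML.
        index.write_text(f'---\ntitle: "{directory.name}"\n---\n',
                         encoding="utf-8")
        created.append(index)
    return created
```

Running something like write_section_indexes(Path("content/sections")) before each build would keep the yearly subsections covered as new years are added.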

Ultimately, I added the following redirects to the webserver:

Adjusting presentation

Since the last real content update, HTML5 has been ratified and has introduced a number of new semantic elements that I was previously implementing using <div>. I’ve replaced most of these and now make use of <header>, <article>, <footer> and <nav> consistently, and have updated the CSS style rules to match. (The CSS stylesheet could probably use some pruning and restructuring, but my revised version appears to function at least as well as it did before.)

Mobile phone access to websites has also developed substantially, and so doing some work to support these devices well seemed appropriate. In particular, I wanted to support the use of multiple image-sets at different resolutions.

Hugo does not make it straightforward to use responsive images in Markdown (though it’s easier in templates); however, I ultimately built something that worked quite well.

The first trick is to know that it’s possible to override the Markdown engine’s expansion of different kinds of element, including image expressions (![text](image URL)), by dropping a render hook into the site or theme. By piecing together some resource lookup logic and resource variant generation logic that makes use of Hugo’s Image Processing functionality, I was able to make Hugo automatically generate and serve responsive images as standard. I was then (after working around some gotchas) able to implement a custom shortcode that, when provided with some image details, uses this customised pipeline to generate boxout asides like the one on my about page.

Considering users with very narrow screens, I also added some responsive layout logic (a first for me!) to my CSS to switch between displaying the boxout in line with the main content and floating it off to one side, so as to avoid crushing the main body of the text into a narrow column when the page is too narrow for both to fit side by side.

It took a lot of architectural spelunking to get there, but the result is pretty effective, even if it could surely use some further tweaking.
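To make the target concrete, here is a rough Python sketch of the kind of markup a responsive-image pipeline ends up emitting: an <img> element whose srcset lists one candidate per generated width. It is not the Hugo render hook itself (that would be written in Hugo's template language), and the variant naming scheme (name_480w.jpg), the default sizes value and the function name responsive_img are all assumptions made for illustration.

```python
from html import escape
from pathlib import PurePosixPath


def responsive_img(src: str, alt: str, widths: list[int],
                   sizes: str = "100vw") -> str:
    """Build an <img> tag with a srcset entry for each resized variant.

    Variants are assumed to sit next to the original and to be named
    <stem>_<width>w<suffix>, e.g. /img/cat_480w.jpg (hypothetical scheme).
    """
    path = PurePosixPath(src)
    candidates = []
    for width in sorted(widths):
        variant = path.with_name(f"{path.stem}_{width}w{path.suffix}")
        # The "w" descriptor tells the browser the variant's pixel width.
        candidates.append(f"{variant} {width}w")
    srcset = ", ".join(candidates)
    return (f'<img src="{escape(src)}" srcset="{escape(srcset)}" '
            f'sizes="{escape(sizes)}" alt="{escape(alt)}">')


if __name__ == "__main__":
    print(responsive_img("/img/cat.jpg", "A cat", [960, 480]))
```

Given a set of candidates like this, the browser picks the most suitable variant from the srcset using its own knowledge of the viewport and pixel density.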

Things still to improve

The new system isn’t perfect, but it’s good enough to go live. One issue is that it doesn’t seem to be possible to write Hugo templates (sorry, ‘layouts’) that include other components in a way that preserves nicely structured formatting. This offended me enough that I now post-process everything with tidy and only then rsync everything into place.

This post-processing causes the mtime values of files that haven’t been modified to be refreshed on every site build, which is annoying. Ideally the mtime of files in the live root would match that in revision control.
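One possible mitigation is to compare file contents rather than timestamps when publishing, so that files whose bytes are unchanged are never rewritten and keep their existing mtime. The following Python sketch shows the idea under assumptions: build_dir and live_dir are hypothetical local directories, sync_changed is an invented name, and it does not delete files that have disappeared from the build. It also does not make mtimes match revision control; it only stops unchanged files from being touched.

```python
import filecmp
import shutil
from pathlib import Path


def sync_changed(build_dir: Path, live_dir: Path) -> list[Path]:
    """Copy files from build_dir to live_dir only when their bytes differ.

    Files with identical content are skipped, so their mtime in live_dir
    is preserved. Returns the relative paths that were copied.
    """
    copied = []
    for src in sorted(p for p in build_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(build_dir)
        dst = live_dir / rel
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting size and mtime, which the post-processing has changed.
        if dst.is_file() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)
        copied.append(rel)
    return copied
```

A staging directory updated this way could then be pushed to the server with timestamps preserved, so that only genuinely changed pages appear modified.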

The new Hugo template generates both a Sitemap and some RSS feeds, but I suspect the auto-generated content of the latter may be suboptimal; I would want to limit it to generating just entries for the various /article pages, and not anything else.

It would be nice to maintain a public link-roll à la Tony and also be able to serve content via ActivityPub. Some people have already put some effort into doing this with Hugo, though the feature request for this to be implemented upstream has been closed unresolved. (In fairness, ActivityPub is difficult to support well with just static files.)


  1. There’s an interesting article titled “database-antipattern” that I discovered via the excellent Tom Morris, which argues that using a database for primary long-term storage is a bad idea because it is fragile. For slowly-changing sites, it’s probably faster and more robust to store data in flat files in revision control and build query-like functionality over the top as needed, rather than to use a general-purpose database and build revision-control functionality on top.

  2. I subsequently discovered that Tony had written a custom tool for dotat.at and would have been tempted to fork that had I known that at the outset.