Denis Defreyne

Weeknotes 2026 W15: Continued decluttering

April 6–12, 2026
 

The decluttering continues, and this week I got a few hours of help from J——. It sure is a lot lighter with somebody else around.

I also finally found my Wasteland playing cards! I put them in my Wasteland 2 collector’s edition ammo box — a place that, in hindsight, made a lot of sense for them.

Quick bits

  • I wonder how much my imagination — I have lots of it — influences my anxiety. If anxiety is the fear of negative events happening in the future, then my imagination of what the future could be for me — the future that I desire, hope for, and work towards — actively contributes to visualizing all the things that I might fail to get.

  • I know I’ve talked at length about how much I dislike the traffic in Berlin, but it turns out that my tiny area has the highest number of recorded speeding violations in all of Berlin.

  • This coming week, I’m taking an exam to prove that I know English. It’s needed for the visa I’m aiming to get. It shouldn’t be too hard to pass, right?

More Zig

I’ve been using a set of home-grown tools that ensure consistency between my beancount data, YNAB, and my bank. I wrote these tools a long time ago and have been maintaining them since.

But it’s been getting slower and slower. It’s no surprise: my local beancount ledger contains seven years of data. I’ve been making tweaks to improve performance,1 but the true bottleneck slowly revealed itself: the tokenizer is too damn slow.
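The date-bucketing tweak described in the footnote can be sketched roughly like this. This is a hypothetical illustration, not the actual tool’s code: the field names, the fixed ±3-day window, and the amount-equality matching criterion are all my assumptions.

```ruby
# Hypothetical sketch: match transactions from one source against another
# by bucketing the second set by date, so each transaction is compared
# only against temporally close candidates instead of the entire set.
require "date"

def match_by_date(ours, theirs, window: 3)
  # Bucket the second set by date for cheap per-date lookup.
  buckets = theirs.group_by { |t| t[:date] }

  ours.map do |txn|
    # Only inspect buckets within ±window days of this transaction's date.
    candidates = (-window..window).flat_map do |offset|
      buckets.fetch(txn[:date] + offset, [])
    end
    match = candidates.find { |c| c[:amount] == txn[:amount] }
    [txn, match]
  end
end
```

With a constant window, each transaction scans only a handful of small buckets rather than every transaction in the other set, which is where the performance boost comes from.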

The tokenizer is written using Ragel and compiled to plain old Ruby. The neat thing about using a generator is that it’s easy to move to another language, and I wanted to move the implementation to a compiled language for speed. Here’s what I did:

  1. Expand the test suite: I wrote an extensive test suite for the existing tokenizer. This would serve as the safety net to ensure I wouldn’t accidentally break anything.2 Refactoring can’t be done without a matching test suite!

  2. Generate C: I made Ragel generate C code instead of Ruby code, adjusted my Makefile to generate a C shared library, and wrote Ruby glue code to use FFI to interact with the new library.

  3. Move to re2c: Ragel is not quite actively maintained, and that feels like a liability. re2c, on the other hand, is actively developed. Translating the tokenizer from Ragel to re2c wasn’t too hard: it’s just under a hundred lines of code, and the re2c syntax is fairly similar to Ragel anyway.

  4. Generate Zig: I’m not fond of C; Zig is a much better language overall. I migrated from Ragel to re2c in large part because re2c supports Zig as an output target.
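The Ruby glue in step 2 follows a common pattern: load a shared library, declare a function signature, and call it as a Ruby method. The sketch below is only an illustration of that pattern, not the actual glue code — it uses Ruby’s standard-library Fiddle rather than the ffi gem, and binds libc’s strlen instead of the private tokenizer library so that it runs anywhere.

```ruby
# Sketch of FFI-style glue: bind a C function from a shared library and
# call it as a Ruby method. The real tool would bind the tokenizer's
# entry points; we bind libc's strlen so this snippet is self-contained.
require "fiddle"
require "fiddle/import"

module LibC
  extend Fiddle::Importer
  # Look up symbols in the already-loaded process image (which includes
  # libc), instead of dlopen-ing a specific .so/.dylib path.
  dlload Fiddle::Handle::DEFAULT
  extern "unsigned long strlen(const char*)"
end

LibC.strlen("tokenizer")  # => 9
```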

The outcome is a tokenizer that is about a hundred times faster than the original.3 The tokenizer is certainly not a bottleneck anymore. The whole toolset is now fast enough that I probably won’t be bothered by its execution speed anymore for the next few years.

I don’t mind writing tokenizers by hand, but in this case it was useful to have used a generator. It also means I can use Unicode categories like L and Lu (letter and uppercase letter, respectively). A handwritten tokenizer could undoubtedly be made even faster, but then again: the tokenizer is really not the bottleneck anymore. Amdahl’s law tells me to stop optimizing the tokenizer.
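For a flavor of what those categories buy you, here is the same idea expressed with Ruby’s built-in Unicode property regexes — just an illustration of the categories, not the generated tokenizer’s code:

```ruby
# Unicode general categories: \p{L} matches any letter, \p{Lu} only
# uppercase letters. A category-based rule handles non-ASCII letters
# without enumerating character ranges by hand.
"Straße 42".scan(/\p{L}+/)  # => ["Straße"]
"Straße".scan(/\p{Lu}/)     # => ["S"]
```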

I tried an LLM for this

I think I found, at long last, a use case where AI might be decent: translating code from one language to another. In this case: from Ragel to re2c.

Some observations from my experimentation here:

  • High test coverage is essential for guiding the LLM in the right direction. Though I suppose high test coverage is useful — probably even necessary — for any sort of refactoring.

  • LLMs need highly detailed instructions; without them, they’ll often end up doing things in confusing and roundabout ways, or do plain stupid things: on multiple occasions, the LLM “fixed” failing tests after a refactoring/translation by undoing all the changes and then claiming success.

  • LLMs are not meant to be interacted with while they are running. I often pause the agent to make manual corrections and push the LLM in the right direction before resuming it. But when I do that, the LLM gets really confused because things have changed mysteriously, and it seems unable to deal with that.

  • LLMs are able to generate decent test cases in bulk, and these test cases are often surprisingly correct. But LLMs nonetheless miss the spirit of what automated tests are about; they’ll often write redundant tests, overly verbose tests, and tests that — even though technically correct — do not test realistic scenarios.

  • Good lord, LLMs are wildly bad at simply writing files to disk. Anything over 100 lines of code regularly fails to be written properly — files get mysteriously truncated.

Was this whole experiment of using an LLM to port the tokenizer from Ragel to re2c worth it? As a means of getting familiar with AI-assisted software development… perhaps. I do feel that the end result of this effort (the new tokenizer) is rather sloppy; even though it works (as proven by the tests), it requires a lot more refactoring and cleanup work.

I have over two decades’ worth of experience as a software developer, which I believe is more than most people in the industry right now. I got into this line of work out of genuine personal interest, and I have spent a lot of time and effort honing my craft. LLMs don’t even get close to that level of quality.

I also found that using LLMs for code generation is remarkably draining. It requires so much effort to prevent the LLM from going off-road and steering itself off a cliff. There is only so much mental energy that I can dedicate to this. Several people have recommended treating LLMs as junior developers, but I find that to be an insult to all junior developers.

All in all, though, I can cautiously claim that, at long last, I’ve had some success with LLMs for code generation.

Lastly, I wanted to clarify that I am OK with experimenting with LLMs on this project because it is a private (and experimental) project that I don’t intend to ever make public. My AI policy remains in place.

Entertainment

  • The Store4 gives a view of a time and society that I am deeply unfamiliar with, like you’re at a strange party at an odd location (an upmarket department store) where you simply don’t know the rules.

  • Pompei: Below the Clouds5 is a peculiar and dark view of daily life in the shadow of Mount Vesuvius. Worth it just for the atmosphere.



  1. Matching one set of transactions to another set of transactions is typically an O(n²) operation, but if you know that two transactions that match will always be temporally close, you can bucket transactions by date and then only search adjacent buckets for a remarkable performance boost. ↩︎

  2. There are high-level tests (and I’ve got my entire financial database to test against), but they’re slow, and test failures tend to come with unusably generic failure messages. ↩︎

  3. Even before adding --release=safe. ↩︎

  4. The Store, directed by Frederick Wiseman (1983). ↩︎

  5. Sotto le nuvole, written and directed by Gianfranco Rosi (21 Unofilm, Stemal Entertainment, Rai Cinema, 2026). ↩︎

You can reply to this weeknotes entry by email. I’d love to hear your thoughts!
If you like what I write, stick your email address below and subscribe. I send out my weeknotes every Sunday morning. Alternatively, subscribe to the web feed.