The decluttering continues, and this week I got a few hours of help from J——. It sure is a lot lighter with somebody else around.
I also finally found my Wasteland playing cards! I put them in my Wasteland 2 collector’s edition ammo box — a place that, in hindsight, made a lot of sense for them.
Quick bits
-
I wonder how much my imagination — I have lots of it — influences my anxiety. If anxiety is the fear of negative events happening in the future, then my imagination of what the future could be for me — the future that I desire, hope for, and work towards — actively contributes to visualizing all the things that I might fail to get.
-
I know I’ve talked at length about how much I dislike the traffic in Berlin, but it turns out that my tiny area has the highest number of recorded speeding violations in all of Berlin.
-
This coming week, I’m taking an exam to prove that I know English. It’s needed for the visa I’m aiming to get. It shouldn’t be too hard to pass, right?
More Zig
I’ve been using a set of home-grown tools that ensure consistency between my beancount data, YNAB, and my bank. I wrote these tools a long time ago and have been maintaining them since.
But it’s been getting slower and slower. It’s no surprise: my local beancount ledger contains seven years of data. I’ve been making tweaks to improve performance,1 but the true bottleneck slowly revealed itself: the tokenizer is too damn slow.
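One of those performance tweaks, described in the footnote, is bucketing transactions by date so matching only compares temporally close candidates. Here is a minimal Ruby sketch of that idea; the `Txn` struct and its fields are hypothetical stand-ins, not the tool’s actual records:

```ruby
require 'date'

# Hypothetical transaction shape; the real tool's records differ.
Txn = Struct.new(:date, :amount)

# Naive matching compares every left/right pair. Bucketing by date means
# each transaction is only compared against temporally close candidates.
def match_by_bucket(left, right, window: 2)
  buckets = right.group_by(&:date)
  left.filter_map do |txn|
    candidates = (-window..window).flat_map { |off| buckets[txn.date + off] || [] }
    match = candidates.find { |c| c.amount == txn.amount }
    [txn, match] if match
  end
end
```

The `window` of two days is an arbitrary choice for the sketch; in practice it would be however far apart a matching pair can plausibly be.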
The tokenizer is written using Ragel and compiled to plain old Ruby. The neat thing about using a generator is that it’s easy to move to another language, and I wanted to move the implementation to a compiled language for speed. Here’s what I did:
-
Expand the test suite: I wrote an extensive test suite for the existing tokenizer. This would serve as the safety net to ensure I wouldn’t accidentally break anything.2 Refactoring can’t be done without a matching test suite!
-
Generate C: I made Ragel generate C code instead of Ruby code, adjusted my Makefile to generate a C shared library, and wrote Ruby glue code to use FFI to interact with the new library.
-
Move to re2c: Ragel is not quite actively maintained, and that feels like a liability. re2c, on the other hand, is actively developed. Translating the tokenizer from Ragel to re2c wasn’t too hard: it’s just under a hundred lines of code, and the re2c syntax is fairly similar to Ragel anyway.
-
Generate Zig: I’m not fond of C; Zig is a much better language overall. I migrated from Ragel to re2c in large part because re2c supports Zig as an output target.
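The glue pattern from the “Generate C” step looks roughly like this. The post uses the ffi gem against the generated shared library; since that library isn’t available here, this sketch uses Ruby’s built-in Fiddle and libc’s strlen purely to show the same open-bind-call shape:

```ruby
require 'fiddle'

# Bind a C function from an already-loaded library. With the real
# tokenizer this would be Fiddle.dlopen("libtokenizer.so") and the
# generated entry points; strlen is just a stand-in everyone has.
strlen = Fiddle::Function.new(
  Fiddle::Handle::DEFAULT['strlen'],  # symbol address
  [Fiddle::TYPE_VOIDP],               # const char *
  Fiddle::TYPE_SIZE_T                 # return type
)

strlen.call('2024-01-02 * "Groceries"')  # → 24
```

The appeal of this setup is that the Ruby side barely changes as the implementation behind the shared library moves from C to Zig.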
The outcome is a tokenizer that is about a hundred times faster than the original.3 The tokenizer is certainly not a bottleneck anymore. The whole toolset is now fast enough that I probably won’t be bothered by its execution speed anymore for the next few years.
I don’t mind writing tokenizers by hand, but in this case it was useful to have used a generator. It also means I can use Unicode categories like L and Lu (letter and uppercase letter, respectively). A handwritten tokenizer could undoubtedly be made even faster, but then again: the tokenizer is really not the bottleneck anymore. Amdahl’s law tells me to stop optimizing the tokenizer.
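Those Unicode categories can be tried directly in Ruby regexps, which expose the same classes; this is a rough stand-in for what the generated rules match, not the tokenizer itself:

```ruby
# re2c's Unicode support lets rules match category classes such as
# L (any letter) and Lu (uppercase letter). Ruby regexps expose the
# same categories via \p{...}.
WORD  = /\p{L}+/
UPPER = /\p{Lu}/

"Straße und Ärmel".scan(WORD)   # → ["Straße", "und", "Ärmel"]
"Straße und Ärmel".scan(UPPER)  # → ["S", "Ä"]
```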
I tried an LLM for this
I think I found, at long last, a use case where AI might be decent: translating code from one language to another. In this case: from Ragel to re2c.
Some observations from my experimentation here:
-
High test coverage is essential for guiding the LLM in the right direction. Though I suppose high test coverage is useful — probably even necessary — for any sort of refactoring.
-
LLMs need highly detailed instructions; without them, they’ll often end up doing things in confusing and roundabout ways, or just do plain stupid things: on multiple occasions, the LLM “fixed” failing tests after a refactoring/translation by undoing all changes and then claiming success.
-
LLMs are not meant to be interacted with while they are running. I often pause the agent to make manual corrections and push it in the right direction before resuming. But when I do that, the LLM gets really confused because things have changed mysteriously, and it seems unable to deal with that.
-
LLMs are able to generate decent test cases in bulk, and these test cases are often surprisingly correct. But LLMs nonetheless miss the spirit of what automated tests are about; they’ll often write redundant tests, overly verbose tests, and tests that — even though technically correct — do not test realistic scenarios.
-
Good lord, LLMs are wildly bad at simply writing files to disk. Anything over 100 lines of code regularly fails to be written properly — files get mysteriously truncated.
Was this whole experiment of using an LLM to port the tokenizer from Ragel to re2c worth it? As a means of getting familiar with AI-assisted software development… perhaps. I do feel that the end result of this effort (the new tokenizer) is rather sloppy; even though it works (as proven by the tests), it requires a lot more refactoring and cleanup work.
I have over two decades’ worth of experience as a software developer, which I believe is more than most people in the industry right now. I got into this line of work because of genuine personal interest, and I spent a lot of time and effort honing my craft. LLMs don’t even get close to that level of quality.
I also found that using LLMs for code generation is remarkably draining. It requires so much effort to prevent the LLM from going off-road and steering itself off a cliff. There is only so much mental energy that I can dedicate to this. Several people have recommended treating LLMs as junior developers, but I find that to be an insult to all junior developers.
All in all, though, I can cautiously claim that, at long last, I’ve had some success with LLMs for code generation.
Lastly, I wanted to clarify that I am OK with experimenting with LLMs on this project because it is a private (and experimental) project that I don’t intend to ever make public. My AI policy remains in place.
Entertainment
-
The Store4 gives a view of a time and society that I am deeply unfamiliar with, like you’re at a strange party at an odd location (an upmarket department store) where you simply don’t know the rules.
-
Pompei: Below the Clouds5 is a peculiar and dark view of daily life in the shadow of Mount Vesuvius. Worth it just for the atmosphere.
Links
-
Pillars Of Eternity 1 - Turn Based Mode Is Officially Live (Mortismal Gaming): Now I’m interested in replaying.
-
Extremely Overdetermined Adventure Game Tier List | Part 1: Re-ranking the Top 20 (Innuendo Studios and Questing Refuge): Now on YouTube! I’ve been following this series on Nebula with great delight. (Also part 2 and part 3.)
-
1D-Chess (Rowan Monk): Cursed.
-
Artemis II Is Competency Porn and We Are Starving For It (Liz Plank)
Tech links:
-
What are your programming "hunches" you haven't yet investigated?: Interesting stuff!
-
Rack::Session::Cookie decrypt failure falls back to accepting unencrypted cookies.: Yikes — that’s a huge one.
-
Brocards for vulnerability triage (William Woodruff)
-
Writing design docs (ceejbot)
-
Matching one set of transactions to another set of transactions is typically an O(n·m) operation, but if you know that two transactions that match will always be temporally close, you can bucket transactions by date and then only search adjacent buckets for a remarkable performance boost. ↩︎
-
There are high-level tests (and I’ve got my entire financial database to test against), but they’re slow, and test failures tend to come with unusably generic failure messages. ↩︎
-
Even before adding --release=safe! ↩︎
-
The Store, directed by Frederick Wiseman (1983). ↩︎
-
Sotto le nuvole, written and directed by Gianfranco Rosi (21 Unofilm, Stemal Entertainment, Rai Cinema, 2026). ↩︎