Weeknotes 2025 W45: Polymorphic

November 3​–​9, 2025

Quick bits:


Shower thoughts:


I’ve been having trouble sleeping, so I’m trying melatonin. It seems to help with falling asleep, though not with staying asleep. I often wake up at 5–6 AM, which doesn’t quite get me my eight hours of sleep. Though perhaps I do not need those eight hours anyway?

It’s a little early to tell how well it works. At least I’ve got absolutely no side effects, which is fantastic.


Acting class is progressing. With just five weeks left until the performance, it’s beginning to get real.

I went second-hand clothes shopping with J—— and we found some nice props to accentuate the characters I’m playing. It was a good amount of fun, even for one of the shopkeepers who noticed us and said “yep, that looks like character work.” We’ve been found out!

As a reminder: the performance will be on the evening of Monday, December 15th at theaterforum kreuzberg. More details to follow. It’s open to the public; drop me a brief message if you’re interested in attending!


Deng, the templating language I’ve been working on, now has the beginnings of an expression language: unary operators, binary operators, functions, and methods. Deng is still not complete, but an expression language is definitely a step in the right direction.

Now I can do stuff like this:

<ol>
  {% for items.find_all("/notes/*.md") as note %}
    <li>{{ note.title }}</li>
  {% endfor %}
</ol>

I haven’t integrated this with my work-in-progress Zig static-site generator yet, but it will happen soon.

My work-in-progress article on expression parsing turned out to be quite useful here, as this is exactly what I needed. Now that I have an expression parser written in Zig, perhaps I can breathe new life into that article with a shift to Zig.

A pending issue is memory management. I’ve skirted around it for now, by assuming that the cleanup of function/method return values lies with the caller. This isn’t ideal, and I will need to revisit that. Some form of garbage collection might be necessary.


Speaking of memory management: in my work-in-progress static-site generator, I’ve been struggling to figure out memory management for filters.

A filter2 is a function with this signature:

pub fn call(
    allocator: std.mem.Allocator,
    str: []const u8,
) ![]const u8 {
    // …
}

In other words: take in a string (that is, a []const u8), and return a string. Think of filters as “transform Markdown to HTML” and the like.

Correct memory management means:

Even though the function signature for a filter is simple, it hides quite some complexity. A filter can do a few different things:

I’d like the static-site generator to deallocate memory when it is no longer in use, but that is surprisingly difficult to do correctly:3

An easy way around all this complexity is to mandate that every filter allocates memory for its return value. That sidesteps the problem entirely, but creates inefficiency — excess memory allocation and copying — that I want to avoid.
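To make the ownership problem concrete, here is a minimal sketch of two filters that share the signature above but differ in who owns the result. Both `trimFilter` and `upperFilter` are hypothetical names made up for illustration, not filters from the actual static-site generator.

```zig
const std = @import("std");

// Hypothetical filter: returns a sub-slice of its input and allocates nothing.
// The result borrows from `str`, so the caller must NOT free it.
pub fn trimFilter(allocator: std.mem.Allocator, str: []const u8) ![]const u8 {
    _ = allocator;
    return std.mem.trim(u8, str, " \n");
}

// Hypothetical filter: allocates a fresh buffer for its result.
// The caller owns the result and must free it with the same allocator.
pub fn upperFilter(allocator: std.mem.Allocator, str: []const u8) ![]const u8 {
    const out = try allocator.alloc(u8, str.len);
    for (str, 0..) |c, i| out[i] = std.ascii.toUpper(c);
    return out;
}
```

Nothing in the return type tells the caller which of the two cases applies. Mandating that every filter allocates would force `trimFilter` to copy its output even though a sub-slice would do.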

There is another slight complication: strings can have a 0 sentinel value. Zig supports C strings (i.e. not []const u8 but rather [:0]const u8), but it’s important that the allocator know whether or not the sentinel value is present to be able to deallocate the memory properly — the allocation length is checked.
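A small sketch of why the sentinel matters for deallocation, assuming the standard `std.mem.Allocator` API. The helper name `roundTripZ` is made up for illustration.

```zig
const std = @import("std");

// Duplicates `text` as a 0-terminated string, frees it again, and returns
// the number of bytes that were actually allocated for it.
pub fn roundTripZ(allocator: std.mem.Allocator, text: []const u8) !usize {
    const z: [:0]u8 = try allocator.dupeZ(u8, text);
    // Freed through the [:0]u8 type, so the allocator counts the sentinel byte.
    defer allocator.free(z);

    // The sentinel sits one past the visible length.
    std.debug.assert(z[z.len] == 0);
    return z.len + 1;
}
```

If the same memory were stored as a plain []const u8, the slice would no longer carry the extra byte, and a later free would be handed a length that is one byte short of the allocation.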

Memory management for filters is complex enough that I need some form of garbage collection. I’m partial to a reference-counting implementation, which is what I used for a String type:4

const String = struct {
    ref_count: u8 = 1,

    content: union(enum) {
        // allocated memory
        owned: struct {
            data: []const u8,
            allocator: std.mem.Allocator,
        },

        // allocated memory for C string
        owned_z: struct {
            data: [:0]const u8,
            allocator: std.mem.Allocator,
        },

        // compile-time constant
        constant: []const u8,

        // slice of another string
        reference: struct {
            data: []const u8,
            retains: *String,
        },
    },
};

This looks intimidating — I know! But it’s the simplest implementation that I can come up with that satisfies my needs.

ref_count is the reference count. Easy enough.5 There will be retain() and release() methods, too:

fn retain(self: *@This()) void {
    // [snip]
}

fn release(self: *@This()) void {
    // [snip]
}

retain() increments the reference count, and release() decrements it. When the reference count hits zero, the release() method deallocates the string.6
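The snipped bodies are not shown here, so what follows is only a sketch of how retain() and release() could behave on a stripped-down type. `RcString` and `create` are hypothetical names, the type models only the "owned" case, and it assumes the instance itself lives on the heap.

```zig
const std = @import("std");

// Hypothetical, simplified reference-counted string (owned data only).
const RcString = struct {
    ref_count: u8 = 1,
    data: []const u8,
    allocator: std.mem.Allocator,

    // Copies `text` onto the heap and returns a string with a count of 1.
    pub fn create(allocator: std.mem.Allocator, text: []const u8) !*RcString {
        const self = try allocator.create(RcString);
        errdefer allocator.destroy(self);
        const data = try allocator.dupe(u8, text);
        self.* = .{ .data = data, .allocator = allocator };
        return self;
    }

    pub fn retain(self: *RcString) void {
        self.ref_count += 1;
    }

    // Drops one reference; frees the data and the struct itself at zero.
    pub fn release(self: *RcString) void {
        self.ref_count -= 1;
        if (self.ref_count == 0) {
            const allocator = self.allocator;
            allocator.free(self.data);
            allocator.destroy(self);
        }
    }
};
```

Copying the allocator into a local before freeing avoids reading a field of `self` after `self` has been destroyed.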

The content field can be:

There is one bit of complexity that I’m fortunately able to ignore: reference cycles. Filters execute linearly; content from earlier filters cannot depend on the output of later filters, so reference cycles won’t happen.7

So, I was rather pleased with this implementation, until I realized that there is, in fact, one more piece of complexity that I cannot get around — a piece of complexity that throws a spanner in the works, making the implementation above not practically usable. I had to start over.


If my weeknotes had ads, this right here, right after a cliffhanger, would be the perfect place to insert an ad break.

It’s called dramatic storytelling, people.


The implementation of reference counting above only works for strings that retain other strings. But this static-site generator will not be limited to dealing with strings:

To be clear: my SSG does not yet support anything but strings for content, but it’ll need to support different types of content. I want this SSG to be flexible and accommodating.

One approach to supporting multiple types of content would be to allow a String to reference any of these other types:

reference: struct {
    data: []const u8,
    retains: union(enum) {
        string: *String,
        rope: *Rope,
        json_tree: *JsonTree,
        toml_tree: *TomlTree,
        xml_tree: *XmlTree,
        markdown_tree: *MarkdownTree,
        lazy_stream: *LazyStream,
        // ...
    },
},

But ugh! That retains field is a mess, and the list of tagged union variants undoubtedly will grow over time.

For the sake of simplicity and ease of maintenance, I would much prefer to have polymorphic reference counting. An interface, of sorts: let String, Rope, MarkdownTree, etc. all implement the Rc interface, and allow a String to retain anything that implements that Rc interface.

Zig, however, does not have interfaces. Still, there is a clean way to do it: allow any reference-countable type to return a vtable (or virtual method table) of type Rc that provides reference-counting methods. Like this:

var thing = Thing.create(...);
// reference count is now 1

thing.rc().retain();
// reference count is now 2

thing.rc().release();
// reference count is now back to 1

thing.rc().release();
// reference count has hit 0
// thing is deallocated

In many programming languages, interfaces are a first-class concept, but in Zig, an interface is simply a struct. I think it is rather neat. The rc() method is implemented on any reference-countable type like this:

pub fn rc(self: *Thing) Rc {
    return .{
        .ptr = @ptrCast(self),
        .ref_count_ptr = &self.ref_count,

        .deinit = rc_deinit,
    };
}

For this to work, two things are needed. First, the reference count needs to live on the instance:8

ref_count: u8 = 1,

It also needs a deinit function for deallocating the contents. I’ve opted to have two functions: rc_deinit(), which takes a *anyopaque, casts it to the proper type, and calls the usual deinit():9

// The pointer is not const here, because deinit() takes a mutable *Thing.
fn rc_deinit(ptr: *anyopaque) void {
    const self: *Thing =
        @ptrCast(@alignCast(ptr));
    self.deinit();
}

pub fn deinit(self: *Thing) void {
    // [snip]
}

That’s what is needed. Now, the String type can retain anything that implements the Rc interface:

const String = struct {
    // [snip]
    content: union(enum) {
        // [snip]
        reference: struct {
            data: []const u8,
            // This used to be `*String`
            retains: Rc,
        },
    },
};
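The Rc struct itself is not shown above. A minimal sketch of what it could look like follows, built only from the three fields that rc() fills in (ptr, ref_count_ptr, deinit). Everything else is an assumption: the field types, the mutable `*anyopaque` (chosen because deinit() takes a mutable pointer), and the `Buffer` type, which is a made-up stand-in for Thing.

```zig
const std = @import("std");

// Hypothetical shape of the Rc handle: a type-erased pointer, a pointer to
// the instance's counter, and the function that tears the instance down.
const Rc = struct {
    ptr: *anyopaque,
    ref_count_ptr: *u8,
    deinit: *const fn (ptr: *anyopaque) void,

    pub fn retain(self: Rc) void {
        self.ref_count_ptr.* += 1;
    }

    // Decrements the count and calls the type-specific deinit at zero.
    pub fn release(self: Rc) void {
        self.ref_count_ptr.* -= 1;
        if (self.ref_count_ptr.* == 0) self.deinit(self.ptr);
    }
};

// Illustrative reference-countable type standing in for `Thing`.
const Buffer = struct {
    ref_count: u8 = 1,
    bytes: []u8,
    allocator: std.mem.Allocator,

    pub fn create(allocator: std.mem.Allocator, len: usize) !*Buffer {
        const self = try allocator.create(Buffer);
        errdefer allocator.destroy(self);
        const bytes = try allocator.alloc(u8, len);
        self.* = .{ .bytes = bytes, .allocator = allocator };
        return self;
    }

    pub fn rc(self: *Buffer) Rc {
        return .{
            .ptr = @ptrCast(self),
            .ref_count_ptr = &self.ref_count,
            .deinit = rc_deinit,
        };
    }

    // Recovers the concrete type from the erased pointer, then defers to deinit().
    fn rc_deinit(ptr: *anyopaque) void {
        const self: *Buffer = @ptrCast(@alignCast(ptr));
        self.deinit();
    }

    pub fn deinit(self: *Buffer) void {
        const allocator = self.allocator;
        allocator.free(self.bytes);
        allocator.destroy(self);
    }
};
```

Because Rc only holds pointers and a function pointer, it is cheap to copy and knows nothing about Buffer. Any other type that can produce the same three values can be retained through the same handle.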

Here’s how that could work, with one string retaining another string (because the former contains a slice of the memory allocated for the latter):

var orig_string = String.fromOwnedZ(
    allocator,
    try allocator.dupeZ(u8, "helloworld"),
);

var string = String.fromReferenced(
    orig_string.getSlice()[3..6],
    orig_string.rc(),
);

orig_string.rc().release();

String.fromReferenced retains the incoming string, which means that orig_string.rc().release() does not deallocate orig_string, as expected.

With this implementation of polymorphic reference counting, I can let a JSON AST retain an incoming string:

var json_string = String.fromOwned(...);
defer json_string.rc().release();

var json_ast: JsonAst = parseJson(json_string);

The parseJson() function has options:

This reference-counting implementation is simple, yet flexible. I am hopeful that this implementation is finally the one that works for me; this is my third or fourth attempt at it, but it looks like this one might be it.

Also: reference counting is neat, isn’t it? I might do a more detailed write-up about polymorphic reference counting in Zig at some point in the future.


That was a lot of writing. I didn’t expect this weeknotes entry to turn into a novella. I might be unemployed still, but I am certainly not idle.


Entertainment:


Links:

Tech links:


  1. I woke up at 4 AM with this epiphany front and center in my mind, and at the time it truly felt like my mind had uncovered a deeply fundamental truth about the universe. (I don’t use Microsoft Word.) ↩︎

  2. “Transformer” might be a better name than “filter,” but the name “filter” is what I am used to, and what Nanoc uses, too. ↩︎

  3. This list contains what is top of mind right now, and is undoubtedly incomplete. ↩︎

  4. Reference counting is a form of garbage collection. It’s just not tracing garbage collection. ↩︎

  5. I could also have used an atomic reference count here. That should be a fairly straightforward change. ↩︎

  6. I’ve opted to use retain and release, borrowing the terminology that I am familiar with from my Objective-C days. ↩︎

  7. “Reference cycles won’t happen” feels like exactly the thing that will turn out to be not true simply because I mentioned it. It’s the inherent irony of the universe. ↩︎

  8. How wide should the reference count be? I’ve opted for a single byte, which seems reasonable in my case. In a more generic reference counting implementation, I’d probably use a u32. ↩︎

  9. Keeping the old and usual deinit() around means that this implementation can be used without reference counting, too. That flexibility is advantageous. ↩︎

  10. Avernum 4: Greed and Glory (Spiderweb Software, 2025), published by Spiderweb Software. ↩︎

  11. Warhammer 40,000: Rogue Trader (Owlcat Games, 2023), published by Owlcat Games. ↩︎

  12. Orlando, directed by Sally Potter (Adventure Pictures, Lenfilm Studio, Mikado Film, 1993). ↩︎
