Nanoc’s lack of parallelism slows down compilation

Nanoc would gain a lot from parallelism, but unfortunately, Ruby does not quite support it in a useful way.

Multithreading

Quite a while ago, I worked on a change that would enable compilation in parallel. I couldn’t get it to work, and the impact wouldn’t have been what I had hoped for anyway. Parallelization with threads isn’t particularly powerful in Ruby, because the global VM lock (GVL) prevents more than one thread from doing CPU-bound work at the same time, and in my experience, most of the work that Nanoc does is CPU-bound.
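
To make the effect of the GVL concrete, here is a minimal sketch that runs the same CPU-bound work first sequentially and then in threads. The names `busy_work`, `jobs`, and `n` are made up for illustration and are not taken from Nanoc. On CRuby the two timings tend to come out roughly the same, though exact numbers depend on the machine.

```ruby
require "benchmark"

# Hypothetical stand-in for CPU-bound work (not Nanoc code).
def busy_work(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

jobs = 4
n = 500_000

sequential_results = nil
threaded_results = nil

# Run the jobs one after another.
sequential_time = Benchmark.realtime do
  sequential_results = Array.new(jobs) { busy_work(n) }
end

# Run the same jobs in threads. Thread#value waits for the thread
# to finish and returns the result of its block.
threaded_time = Benchmark.realtime do
  threads = Array.new(jobs) { Thread.new { busy_work(n) } }
  threaded_results = threads.map(&:value)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential_time, threaded_time)
```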

This thread-based compilation approach would yield benefits when invoking external commands (because waiting for them is not CPU-bound), but in my opinion, that is a bit of an edge case. I really wanted a solution that would enable parallel compilation without running into the GVL.
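
The external-command case can be sketched as follows. This is an illustration under assumptions: `run_external_command` is a hypothetical helper, and a child Ruby process that only sleeps stands in for a real external tool. Because a thread that waits on a child process does not hold the GVL, the four commands can overlap.

```ruby
require "benchmark"
require "rbconfig"

# Hypothetical stand-in for an external tool: a child Ruby process that
# just sleeps. RbConfig.ruby is the path of the running interpreter.
def run_external_command
  system(RbConfig.ruby, "-e", "sleep 0.3")
end

jobs = 4
statuses = nil

elapsed = Benchmark.realtime do
  threads = Array.new(jobs) { Thread.new { run_external_command } }
  # Kernel#system returns true when the command exits successfully.
  statuses = threads.map(&:value)
end

puts format("%d commands finished in %.2fs", jobs, elapsed)
```

If the commands overlap as expected, the total should be closer to the duration of one command than to the sum of all four.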

Spawning subprocesses

To do: Write this. Probably hard on Windows.
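
Until this section is written, here is a rough sketch of what a subprocess-based approach could look like. It is an assumption about the general shape, not a description of anything in Nanoc: it uses `fork`, pipes, and `Marshal`, and `busy_work` is a made-up stand-in for real work. `fork` is not available on Windows, which is one reason this approach is hard there.

```ruby
# Hypothetical stand-in for CPU-bound work (not Nanoc code).
def busy_work(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

inputs = [100_000, 200_000, 300_000]

# Start one child process per input. Each child sends its result
# back to the parent through its own pipe.
workers = inputs.map do |n|
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    writer.write(Marshal.dump(busy_work(n)))
    writer.close
    exit!(0) # leave the child without running at_exit handlers
  end

  writer.close # the parent only reads
  [pid, reader]
end

# Collect the results in the original order.
results = workers.map do |pid, reader|
  value = Marshal.load(reader.read)
  reader.close
  Process.wait(pid)
  value
end

puts results.inspect
```

Each child has its own interpreter and its own GVL, so the work can run in parallel, but every result has to be serialized to cross the process boundary.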

Ractors

Ruby 3.0 came with a new feature for parallel execution called ractors. They are particularly interesting because there is no longer a single GVL shared by everything: each ractor has its own, which enables true parallel execution.
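
A minimal sketch of CPU-bound work spread over ractors might look like this. `busy_work` and `ractor_result` are hypothetical helpers introduced for illustration. The second one exists only because the method for collecting a ractor’s result differs between Ruby versions.

```ruby
# Hypothetical stand-in for CPU-bound work (not Nanoc code).
def busy_work(n)
  sum = 0
  n.times { |i| sum += i * i }
  sum
end

# Ractor#take is the API in Ruby 3.0 through 3.4; later versions
# replace it with Ractor#value, so check which one is available.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

inputs = [100_000, 200_000, 300_000]

# Each input is passed in as an argument, because a ractor's block
# is not allowed to read local variables from the surrounding scope.
ractors = inputs.map do |n|
  Ractor.new(n) { |count| busy_work(count) }
end

results = ractors.map { |ractor| ractor_result(ractor) }

puts results.inspect
```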

The drawback of ractors is that they put considerable constraints on the code: data can’t easily be shared across ractors, which would require (I believe) large-scale changes to the Nanoc codebase, because the architecture of that codebase does not easily allow it.
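
The sharing constraint can be illustrated with a small sketch. The `config` hash is a made-up example, not a Nanoc data structure. A mutable object is not shareable, a ractor’s block may not capture it from the surrounding scope, and `Ractor.make_shareable` works around this by deeply freezing the object.

```ruby
# Hypothetical configuration object: a plain, mutable Hash.
config = { "output_dir" => "output" }

# Mutable objects are not shareable between ractors.
mutable_shareable = Ractor.shareable?(config)

# A ractor's block may not reference outer local variables, so this
# fails when the ractor is created, before the block runs.
isolation_error = nil
begin
  Ractor.new { config["output_dir"] }
rescue StandardError => e
  isolation_error = e
end

# make_shareable deeply freezes the object in place and returns it.
frozen_config = Ractor.make_shareable(config)
frozen_shareable = Ractor.shareable?(frozen_config)

puts "mutable hash shareable?  #{mutable_shareable}"
puts "error when capturing it: #{isolation_error.class}"
puts "after make_shareable:    #{frozen_shareable}"
```

Freezing is only an option for data that never changes after setup. State that is mutated during compilation would need a different design, which is where the large-scale changes come in.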

Note that I’m not saying that the Nanoc architecture is bad — I’m saying that it’s not immediately compatible with what is needed to make ractors work.

When I started working on making Nanoc compile in parallel, the concept of ractors was so new that it wasn’t worth testing it out. It would have made Nanoc depend on the then-latest version of Ruby, which would have required people to upgrade, possibly against their wishes. The good news is that ractors have now been around for long enough that this is no longer a concern.

At this point, I think it would make sense for me to give ractors a proper try. The work needed to make it work, not least for me to fundamentally understand ractors, would be significant, however. I have not used them yet, and I can’t tell how they work, what their limitations are, or whether they are really usable in production.

I also worry that moving to ractors would complicate the codebase too much. It’s already — by necessity — not an easy codebase to navigate, because site compilation in general is not an easy task. Parallelization would certainly add to the complexity.

Lastly, ractors are experimental. Running any code that uses ractors prints this message:

warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.

Note last edited June 2024.