What would I change about Ruby?

Ruby is close to my favorite programming language. That does not mean I’m entirely happy with it. As an exercise, I’m writing up what I’d change about the language.

This is purely hypothetical: the thought of forking Ruby and making changes is preposterous. But it is an interesting exercise!

Ruby was created in the mid-90s. It’s now 25 years later: there have been plenty of advances in programming languages, and the best practices around Ruby have emerged and mostly solidified. This creates a good opportunity to look back and reflect.

Remove symbols

Symbols and (frozen) strings are too similar to warrant a distinction between the two.
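To make the overlap concrete, here is a small sketch in today’s Ruby. The constant names are made up for illustration. Both values are immutable and convert losslessly into each other, yet they never compare equal and act as different Hash keys.

```ruby
SYMBOL = :name
STRING = "name".freeze

# Both are immutable.
puts SYMBOL.frozen?           # true
puts STRING.frozen?           # true

# Converting between them is lossless.
puts SYMBOL.to_s == STRING    # true
puts STRING.to_sym == SYMBOL  # true

# Yet they never compare equal, and they are distinct Hash keys.
puts SYMBOL == STRING         # false
LOOKUP = { name: "Denis" }.freeze
puts LOOKUP[:name].inspect    # "Denis"
puts LOOKUP["name"].inspect   # nil
```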

Interestingly, Clojure has symbols and (immutable) strings, and keywords on top of that. Keywords are similar to immutable strings, but can be namespaced (e.g. ::foo is similar to :my.namespace/foo). I don’t know enough about Clojure to know how namespaced symbols are used and what their uses are.

To do: Ooh, potentially plenty of implications of this.

Use one in-memory string encoding

Always encode strings as UTF-8. Provide a “data array” class to contain arbitrary data, and allow converting between strings and data, e.g.

# string to data
"stuff".to_data(encoding: "UTF-8")

# data to string
some_data.decode_string(encoding: "UTF-8")
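For comparison, the closest equivalent in today’s Ruby is probably a round trip through an array of bytes. The helper names string_to_data and data_to_string below are hypothetical stand-ins for the proposed API, not existing methods.

```ruby
# Hypothetical helpers; not part of Ruby's standard library.
def string_to_data(string)
  # Transcode to UTF-8 first, then expose the raw bytes as Integers.
  string.encode(Encoding::UTF_8).bytes
end

def data_to_string(bytes)
  # pack("C*") builds a binary string; tag it as UTF-8 afterwards.
  bytes.pack("C*").force_encoding(Encoding::UTF_8)
end

data = string_to_data("stuff")
p data                  # [115, 116, 117, 102, 102]
p data_to_string(data)  # "stuff"
```

An Array of Integers is a wasteful “data array”, which is part of why a dedicated class would be nicer.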

Ruby has an ASCII_8BIT (also known as BINARY) encoding, which essentially makes the string a byte string. However, an ASCII_8BIT string is still a string, and the API feels off: for example, it still has #chars, which no longer makes sense.
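A short illustration of that awkwardness, using String#b to get a binary copy (the constant names are only for this example). The binary string still answers to character-oriented methods; it simply treats every byte as a “character”.

```ruby
TEXT = "h\u00E9llo"   # "héllo"; the é takes two bytes in UTF-8
BINARY_TEXT = TEXT.b  # copy tagged as ASCII-8BIT (BINARY)

p BINARY_TEXT.encoding == Encoding::BINARY  # true
p TEXT.chars.length                         # 5
p BINARY_TEXT.chars.length                  # 6: one "character" per byte
p BINARY_TEXT.bytes == TEXT.bytes           # true
```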

Character arrays

Related to the previous point: it might be good to have a distinct character array type, where random access for characters would be O(1) rather than O(n) — UTF-8 strings have O(n) character access but O(1) byte access.

This is useful for parsing/lexing. See Accidentally Quadratic — Ruby parser gem as an example.
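The usual workaround today is to split the string into an Array of characters once, and index into that instead. A minimal sketch, where count_char is a made-up helper standing in for a lexer loop:

```ruby
# Hypothetical helper: counts occurrences of a character by indexing into a
# pre-split array instead of indexing the string on every iteration.
def count_char(text, target)
  chars = text.chars  # one O(n) pass up front
  count = 0
  index = 0
  while index < chars.length
    count += 1 if chars[index] == target  # O(1) array access
    index += 1
  end
  count
end

p count_char("na\u00EFve na\u00EFvet\u00E9", "\u00EF")  # 2
```

The array costs extra memory, since every character becomes its own String object. A dedicated character array type could avoid that.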

Use distinct character literal

For strings, Ruby can use single quotes (e.g. 'example'), double quotes (e.g. "example"), and percent-literal syntax (e.g. %[example]). The ?-syntax can be used to construct single-character strings (e.g. ?x which is identical to "x").
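All of these produce plain String objects, as this quick check in current Ruby shows (the constant names are only for illustration):

```ruby
SINGLE  = 'example'
DOUBLE  = "example"
PERCENT = %[example]
CHAR    = ?x

p SINGLE == DOUBLE && DOUBLE == PERCENT  # true
p CHAR                                   # "x"
p CHAR.class                             # String
```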

I believe character literals are useful, and other languages gravitate towards using single quotes for those.

first_name = "Denis"

# This is a character literal, not a string
first_name[0] # => 'D'

# This is a string, built from two character literals
'D' + 'e' # => "De"

Change keyword for defining functions

A small one, but def always strikes me as odd: you can’t use it for defining anything but functions and methods. Perhaps fn or fun instead?

class Person
  fun initialize(name)
    @name = name
  end
end

Rename initializer method

Ruby’s constructor is called initialize, e.g.

class Person
  def initialize(name)
    @name = name
  end
end

I find it difficult to type initialize, and often make typos when typing it, which leads to hard-to-debug problems because the initializer does not exist.
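Here is what such a typo looks like in practice. The misspelled method is accepted as an ordinary method, so nothing complains until an object is created. The build_person helper exists only to capture the error for this example.

```ruby
class Person
  attr_reader :name

  # Typo: "initalize" instead of "initialize". Ruby accepts this as an
  # ordinary method, so Person.new falls back to the default initializer.
  def initalize(name)
    @name = name
  end
end

# Illustrative helper: returns the error message instead of crashing.
def build_person
  Person.new("Denis")
rescue ArgumentError => e
  e.message
end

puts build_person       # the ArgumentError message about the argument count
p Person.new.name       # nil: without arguments, the mistake goes unnoticed
```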

Perhaps renaming it to init could work. Or constructor (like in JavaScript).

Or, taking a page from Python’s book, using a clearly reserved name such as __init__. This would allow the interpreter to detect typos: an unknown reserved name could be an error.

class Person
  def __init__(name)
    @name = name
  end
end

Ruby already uses the double-underscore approach in some cases: __FILE__, __dir__, __method__ and __callee__. Interestingly, these are implemented as methods on Kernel, except __FILE__, which is a keyword, and that explains the rather unfortunate inconsistency in capitalization.
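The difference can be checked from Ruby itself. This sketch asks Kernel whether it defines each name as a private method. I’m assuming a reasonably recent MRI here.

```ruby
# __method__, __callee__ and __dir__ are private methods defined on Kernel.
KERNEL_HELPERS = %i[__method__ __callee__ __dir__].freeze

KERNEL_HELPERS.each do |name|
  puts "#{name}: #{Kernel.private_method_defined?(name)}"
end

# __FILE__ is not a method at all, so the same check comes back false.
puts "__FILE__: #{Kernel.private_method_defined?(:__FILE__)}"
```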

__init__ is an identifier, not a keyword. What would that mean for other identifiers that are not keywords? Would we have __self__ instead of self?

Simplify assignment in initializer

Ruby’s initializer is verbose and repetitive when passing in many values:

class Person
  def initialize(first_name)
    @first_name = first_name
  end
end

I rather like Crystal’s approach, providing a shorthand:

class Person
  def initialize(@first_name)
  end
end

It looks weird that the #initialize method is now empty, though.

Swift’s approach to structs is also interesting. It automatically generates initializers based on the properties/attributes:

struct Person {
  var firstName: String
  var rank: Int
}

var me = Person(firstName: "Denis", rank: 9000)
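Ruby’s own Struct already gets part of the way there. As a rough sketch (the members mirror the Swift example, and keyword_init: true needs Ruby 2.5 or later):

```ruby
# Struct generates the initializer and the accessors from the member list.
Person = Struct.new(:first_name, :rank, keyword_init: true)

me = Person.new(first_name: "Denis", rank: 9000)
p me.first_name  # "Denis"
p me.rank        # 9000
```

Unlike the Swift version, nothing here checks the types of the values.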

Constructor best practices include not doing any work in the initializer. Could that mean getting rid of traditional constructors entirely? To what extent would that affect developer experience (DX)?

Explicit variable definition

In Ruby, a variable is defined the first time it is used:

amount = 650

Assignment and definition are practically the same. This means that it’s not possible to shadow a variable, e.g. in an if block.

For example, the line amount = 400 modifies the amount variable, rather than defining a new one:

amount = 650
if amount > 400 # too much
  amount = 400
end

After the if block, amount is changed to 400. This isn’t always intended.

It’s easier to understand what is going on with an explicit keyword to declare variables, like var or let:

let amount = 650
if amount > 400 # too much
  let amount = 400
end

The let amount = 400 line creates a new variable, which goes out of scope after the if. This way, amount outside of the if is back to 650.
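The nearest thing in current Ruby is a block-local variable, declared after a semicolon in the block’s parameter list. It only works inside blocks, not in a plain if. The method name below is made up for this sketch.

```ruby
def capped_amount_demo
  amount = 650
  inner = nil

  # The "; amount" part declares a block-local variable that shadows
  # the outer one for the duration of the block.
  1.times do |; amount|
    amount = 400
    inner = amount
  end

  [inner, amount]
end

p capped_amount_demo  # [400, 650]
```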

Treat modules as files

Ruby is not prescriptive in how to organize code. Over time, though, conventions have appeared: Zeitwerk, for example, defines a file structure that is mostly sensible.
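The core of that convention is a mapping from snake_case file paths to CamelCase constant names. A simplified sketch of the idea follows. constant_name_for is a hypothetical helper, not Zeitwerk’s API, and it ignores details such as acronyms and custom inflections.

```ruby
# Hypothetical helper (not Zeitwerk's API): derives a constant name from a
# path relative to the project's root directory.
def constant_name_for(relative_path)
  relative_path
    .delete_suffix(".rb")
    .split("/")
    .map { |segment| segment.split("_").map(&:capitalize).join }
    .join("::")
end

p constant_name_for("my_app/user_profile.rb")  # "MyApp::UserProfile"
```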

I’d like to have that more tightly integrated into the language. To do: OK, but… how would that work exactly?

To do: Then we can remove most of the indentation caused by nesting, which makes the code less rightward.

Make modules not globally available

To do: Ugh this is a big one

Simplify procs, blocks and lambdas

To do

To do: Blocks are useful (distinct from functions) because they support non-local return.
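A small demonstration of that difference in current Ruby (the method names are only for illustration): return inside a block leaves the enclosing method, while return inside a lambda only leaves the lambda.

```ruby
def first_negative(numbers)
  numbers.each do |number|
    return number if number.negative?  # returns from first_negative itself
  end
  nil
end

def lambda_return
  check = lambda { return :from_lambda }
  check.call  # returns only from the lambda
  :from_method
end

p first_negative([3, -1, -7])  # -1
p lambda_return                # :from_method
```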

Swift’s support for closures is neat. A function can take multiple closures, and there’s a special syntax for trailing closures, which allows for a terse Ruby-like syntax:

let names = ["Chris", "Alex", "Ewa", "Barry", "Daniella"]
let reversedNames = names.sorted { $0 > $1 }
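For comparison, the same sort in Ruby, first with named block parameters and then with numbered parameters (available since Ruby 2.7), which come close to Swift’s $0 and $1:

```ruby
NAMES = ["Chris", "Alex", "Ewa", "Barry", "Daniella"].freeze

# Explicit block parameters.
REVERSED_NAMES = NAMES.sort { |a, b| b <=> a }

# Numbered parameters, close to Swift's $0 and $1.
REVERSED_NUMBERED = NAMES.sort { _2 <=> _1 }

p REVERSED_NAMES  # ["Ewa", "Daniella", "Chris", "Barry", "Alex"]
```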

Boolean functions without question mark

See Ruby’s question-mark boolean methods are inconsistent

Rename Hash class

Rename Hash to HashMap. There is far too much confusion between what a hash is and what a Hash is.

For an object to be used as a Hash key, it must implement #hash, which returns a hash (not a Hash), also known as a hash key or hash value (not Hash keys nor Hash values).
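As an example, here is a made-up Point class that can serve as a Hash key. Besides #hash, it also needs #eql?, which Hash uses to confirm that two keys with the same hash really are the same key.

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Two points are interchangeable as Hash keys when eql? is true
  # and their #hash values match.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end

  def hash
    [x, y].hash  # an Integer, i.e. a hash, not a Hash
  end
end

LABELS = { Point.new(1, 2) => "home" }
p LABELS[Point.new(1, 2)]     # "home"
p Point.new(1, 2).hash.class  # Integer
```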

Confused? Yeah. Understandable.

Note last edited November 2023.