Ruby is close to my favorite programming language. That does not mean I’m entirely happy with it. As an exercise, I’m writing up what I’d change about the language.
This is purely hypothetical: the thought of forking Ruby and making changes is preposterous. But it is an interesting exercise!
Ruby was created in the mid-90s. It’s now 25 years later. There have been plenty of advances in programming languages, and the best practices around Ruby have emerged and mostly solidified. This creates a good opportunity to look back and reflect.
Symbols and (frozen) strings are too similar to warrant a distinction between the two.
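To illustrate how close the two already are in today’s Ruby, here is a small sketch. The variable names are just for illustration; the point is that conversion is lossless and both are immutable, yet they never compare equal, which is where the distinction tends to bite:

```ruby
name_symbol = :first_name
name_string = "first_name".freeze

# Converting between the two loses nothing.
name_symbol.to_s == name_string    # => true
name_string.to_sym == name_symbol  # => true

# Both are immutable.
name_symbol.frozen?  # => true
name_string.frozen?  # => true

# But they are never equal, so they are different Hash keys.
name_symbol == name_string  # => false
person = { first_name: "Denis" }
person[:first_name]   # => "Denis"
person["first_name"]  # => nil
```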
Interestingly, Clojure has symbols and (immutable) strings, and keywords on top of that. Keywords are similar to immutable strings, but can be namespaced (e.g. ::foo is similar to :my.namespace/foo). I don’t know enough about Clojure to know how namespaced symbols are used and what their uses are.
To do: Ooh, potentially plenty of implications of this.
Always encode strings as UTF-8. Provide a “data array” class to contain arbitrary data, and allow converting between strings and data, e.g.
# string to data
"stuff".to_data(encoding: "UTF-8")
# data to string
some_data.decode_string(encoding: "UTF-8")
Ruby has an ASCII_8BIT (also known as BINARY) encoding, which essentially makes the string a byte string. However, an ASCII_8BIT string is still a string, and the API feels off: for example, it still has #chars, which no longer makes sense.
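A minimal sketch of what that looks like in current Ruby. The three bytes are an arbitrary example of data that is not valid UTF-8:

```ruby
# Three raw bytes that do not form valid UTF-8.
data = "\xFF\xD8\xFF".b

data.encoding == Encoding::BINARY  # => true
data.bytes                         # => [255, 216, 255]

# It is still a String, so the text-oriented API is all there,
# even though "characters" are meaningless for raw bytes.
data.is_a?(String)  # => true
data.chars.length   # => 3
```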
Related to the previous point: it might be good to have a distinct character array type, where random access for characters would be O(1) rather than O(n) — UTF-8 strings have O(n) character access but O(1) byte access.
This is useful for parsing/lexing. See the Accidentally Quadratic post on the Ruby parser gem as an example.
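Without such a type, the usual workaround is to split the string into an array of one-character strings up front. A rough sketch, assuming the extra memory is acceptable:

```ruby
text = "naïve café"

# String#[] indexes by character. With multibyte characters in the
# string, Ruby may have to scan from the start to find the offset.
text[4]  # => "e"

# Splitting once gives an Array with constant-time indexing afterwards.
characters = text.chars
characters[4]      # => "e"
characters.length  # => 10

# Byte count and character count differ because of "ï" and "é".
text.bytesize  # => 12
```

A lexer that repeatedly indexes into a long string would pay the scan cost on every access; indexing into the array avoids that.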
For strings, Ruby can use single quotes (e.g. 'example'), double quotes (e.g. "example"), and percent-literal syntax (e.g. %[example]). The ?-syntax can be used to construct single-character strings (e.g. ?x which is identical to "x").
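As a quick check that these really are interchangeable in current Ruby:

```ruby
single  = 'example'
double  = "example"
percent = %[example]

# All three literal forms produce equal String objects.
single == double   # => true
double == percent  # => true

# The ?-syntax produces an ordinary one-character String.
?x == "x"  # => true
?x.class   # => String
```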
I believe character literals are useful, and other languages gravitate towards using single quotes for those.
first_name = "Denis"
# This is a character literal, not a string
first_name[0] # => 'D'
# This is a string, built from two character literals
'D' + 'e' # => "De"
A small one, but def always strikes me as odd: you can’t use it for defining anything but functions and methods. Perhaps fn or fun instead?
class Person
  fun initialize(name)
    @name = name
  end
end
Ruby’s constructor is called initialize, e.g.
class Person
  def initialize(name)
    @name = name
  end
end
I find it difficult to type initialize, and often make typos when typing it, which leads to hard-to-debug problems because the initializer does not exist.
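Here is a small sketch of how such a typo plays out. The misspelling initalize is deliberate: Ruby treats it as an ordinary method, so nothing complains at definition time.

```ruby
class Person
  attr_reader :name

  # Typo: "initalize" is just another method, so Person.new never calls it.
  def initalize(name)
    @name = name
  end
end

# The inherited initializer takes no arguments, so passing one fails.
begin
  Person.new("Denis")
rescue ArgumentError => error
  error.class  # => ArgumentError
end

# Without arguments the object is created, but @name is never set.
Person.new.name  # => nil
```

The error points at the call site and at the argument count, not at the misspelled method, which is why it can take a while to spot.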
Perhaps renaming it to init could work. Or construct (similar to JavaScript’s constructor).
Or, taking a page from Python’s book, using a clearly reserved name such as __init__. This would allow the interpreter to detect typos: an unknown reserved name could be an error.
class Person
  def __init__(name)
    @name = name
  end
end
Ruby already uses the double-underscore approach in some cases: __FILE__, __dir__, __method__ and __callee__. Interestingly, these are implemented as methods on Kernel, except __FILE__, which is a keyword that acts like a constant, and that explains the rather unfortunate inconsistency in capitalization.
__init__ is an identifier, not a keyword. What would that mean for other identifiers that are not keywords? Would we have __self__ instead of self?
Ruby’s initializer is verbose and repetitive when passing in many values:
class Person
  def initialize(first_name)
    @first_name = first_name
  end
end
I rather like Crystal’s approach, providing a shorthand:
class Person
  def initialize(@first_name)
  end
end
It looks weird that the #initialize method is now empty, though.
Swift’s approach to structs is also interesting. It automatically generates initializers based on the properties/attributes:
struct Person {
  var firstName: String
  var rank: Int
}
var me = Person(firstName: "Denis", rank: 9000)
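For comparison, the closest thing Ruby has today is probably Struct with keyword_init (available since Ruby 2.5). A minimal sketch, mirroring the Swift example:

```ruby
# Struct generates the initializer and the accessors from the member list.
Person = Struct.new(:first_name, :rank, keyword_init: true)

me = Person.new(first_name: "Denis", rank: 9000)
me.first_name  # => "Denis"
me.rank        # => 9000
```

Unlike the Swift version, there are no types involved, and the members are mutable by default.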
Constructor best practices include not doing any work in the initializer. Could that mean getting rid of traditional constructors entirely? To what extent would that affect the developer experience?
In Ruby, a variable is defined the first time it is assigned:
amount = 650
Assignment and definition are practically the same. This means that it’s not possible to shadow a variable, e.g. in an if block.
For example, the line amount = 400 modifies the amount variable, rather than define a new one:
amount = 650
if amount > 400 # too much
  amount = 400
  do_something(amount)
end
After the if block, amount is changed to 400. This isn’t always intended.
It’s easier to understand what is going on with an explicit keyword to declare variables, like var or let:
let amount = 650
if amount > 400 # too much
  let amount = 400
  do_something(amount)
end
The let amount = 400 line creates a new variable, which goes out of scope after the if. This way, amount outside of the if is back to 650.
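The closest Ruby gets to this today is block-local variables, declared after a semicolon in the block’s parameter list. It only works for blocks, not for if, so it is a sketch of the idea rather than a real substitute. The method name shadow_demo is made up for this example:

```ruby
def shadow_demo
  amount = 650

  # Listing `amount` after the semicolon makes it local to the block,
  # so the assignment inside does not touch the outer variable.
  [400].each do |limit; amount|
    amount = limit
  end

  amount
end

shadow_demo  # => 650
```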
Ruby is not prescriptive in how to organize code. Over time, though, conventions have appeared: Zeitwerk, for example, defines a file structure that is mostly sensible.
I’d like to have that more tightly integrated into the language. To do: OK, but… how would that work exactly?
To do: Then we can remove most of the indentation caused by nesting, which makes the code less rightward.
To do: Ugh this is a big one
To do
To do: Blocks are useful (distinct from functions) because they support non-local return.
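A small sketch of what non-local return means in practice. The method names first_over and lambda_return are made up for illustration:

```ruby
def first_over(limit, numbers)
  numbers.each do |number|
    # `return` inside a block returns from first_over itself,
    # not just from the block.
    return number if number > limit
  end
  nil
end

first_over(2, [1, 2, 3, 4])  # => 3

def lambda_return
  double = ->(n) { return n * 2 }
  double.call(1)  # `return` only leaves the lambda
  :reached_the_end
end

lambda_return  # => :reached_the_end
```

A lambda behaves like a function here, while a block stays tied to the method it was written in.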
Swift’s support for closures is neat. A function can take multiple closures, and there’s a special syntax for trailing closures, which allows for a terse Ruby-like syntax:
let names = ["Chris", "Alex", "Ewa", "Barry", "Daniella"]
let reversedNames = names.sorted { $0 > $1 }
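For reference, the Ruby equivalent of the Swift snippet looks like this, both with named block parameters and with the numbered parameters available since Ruby 2.7:

```ruby
names = ["Chris", "Alex", "Ewa", "Barry", "Daniella"]

# Named block parameters.
reversed_names = names.sort { |a, b| b <=> a }
# => ["Ewa", "Daniella", "Chris", "Barry", "Alex"]

# Numbered parameters, close to Swift's $0 and $1.
names.sort { _2 <=> _1 } == reversed_names  # => true
```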
See Ruby’s question-mark boolean methods are inconsistent
Rename Hash to HashMap. There is far too much confusion between what a hash is and what a Hash is.
For an object to be used as a Hash key, it must implement #hash, which returns a hash (not a Hash), also known as a hash key or hash value (not Hash keys nor Hash values).
Confused? Yeah. Understandable.
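To make the two meanings concrete, here is a minimal sketch with a made-up Point class. Note that Ruby also needs #eql? next to #hash to decide whether two keys are the same:

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Two points are the same key when their coordinates match.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  # The hash (an Integer) that a Hash uses to find the right bucket.
  def hash
    [x, y].hash
  end
end

distances = { Point.new(1, 2) => 5 }   # a Hash
distances[Point.new(1, 2)]             # => 5
Point.new(1, 2).hash.class             # => Integer
```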