The complete guide to implementing equality in Ruby
Ruby is one of the few programming languages that gets equality right. I often play around with other languages, but keep coming back to Ruby. In large part, this is because Ruby’s implementation of equality is so nice.[1]
Nonetheless, equality in Ruby is not straightforward. There is #==, #eql?, #equal?, #===, and more. Even if you’re familiar with using them, implementing them can be a whole other story.
Let’s walk through all forms of equality in Ruby and how to implement them.
- Why implementing equality matters
- Basic equality (double equals)
- Basic equality with type coercion
- Strict equality
- Hash equality
- Case equality (triple equals)
- Ordered comparison
- Wrapping up
- Further reading
Why implementing equality matters
We check whether objects are equal all the time. Sometimes we do this explicitly, sometimes implicitly. Here are some examples:
- Do these two Employees work in the same Team? Or, in code: denis.team == someone.team.
- Is the given DiscountCode valid for this particular Product? Or, in code: product.discount_codes.include?(given_discount_code).
- Who are the (distinct) managers for this given group of employees? Or, in code: employees.map(&:manager).uniq.
A good implementation of equality is predictable; it aligns with our understanding of equality.
An incorrect implementation of equality, on the other hand, conflicts with what we commonly assume to be true. Here is an example of what happens with such an incorrect implementation:
# Find the book “Gödel, Escher, Bach”
repo = BookRepository.new
geb = repo.find(isbn: "978-0-14-028920-6")
geb_also = repo.find(isbn: "978-0-14-028920-6")
geb == geb_also # => false?!
The geb and geb_also objects should definitely be equal. The fact that the code says they’re not is bound to cause bugs down the line. Luckily, we can implement equality ourselves and avoid this class of bugs.
No one-size-fits-all solution exists for an equality implementation. However, there are two kinds of objects where we do have a general pattern for implementing equality: entities and value objects. These two terms come from domain-driven design (DDD for short), but they’re relevant even if you’re not using DDD. Let’s take a closer look.
Entities
Entities are objects that have an explicit identity attribute. Often, entities are stored in some database and have a unique id attribute, corresponding to a unique id table column. The following Employee example class is such an entity:
class Employee
attr_reader :id
attr_reader :name
def initialize(id, name)
@id = id
@name = name
end
end
Two entities are equal when their IDs are equal. All other attributes are ignored. After all, an employee’s name might change, but that does not change their identity. Imagine getting married, changing your name and not getting paid anymore because HR has no clue who you are anymore!
ActiveRecord, the ORM that is part of Ruby on Rails, calls entities models instead, but they’re the same concept. These model objects automatically have an ID. In fact, ActiveRecord models already implement equality correctly out of the box!
Value objects
Value objects are objects without an explicit identity. Instead, their value as a whole constitutes identity. Consider this Point class:
class Point
attr_reader :x
attr_reader :y
def initialize(x, y)
@x = x
@y = y
end
end
Two Points will be equal if their x and y values are equal. The x and y values constitute the identity of the point.
In Ruby, the basic value object types are numbers (both integers and floating-point numbers), characters, booleans, and nil. For these basic types, equality works out of the box:
17 == 17 # => true
false != 12 # => true
1.2 == 3.4 # => false
Arrays of value objects are in themselves also value objects. Equality for arrays of value objects works out of the box: for example, [17, true] == [17, true]. This might seem obvious, but this is not true in all programming languages.
Other examples of value objects are timestamps, date ranges, time intervals, colors, 3D coordinates, and money objects. These are built from other value objects: for example, a money object consists of a fixed-decimal number and a currency code string.
Basic equality (double equals)
Ruby has the == and != operators for checking whether two objects are equal or not:
"orange" == "red" # => false
1 + 3 == 4 # => true
12 - 1 != 10 # => true
Ruby’s built-in types all have a sensible implementation of ==. Some frameworks and libraries provide custom types, which will have a sensible implementation of ==, too. Here is an example with ActiveRecord:
geb = Book.find_by(title: "Gödel, Escher, Bach")
also_geb = Book.find_by(isbn: "978-0-14-028920-6")
geb == also_geb # => true
For custom classes, the == operator returns true if and only if the two objects are the exact same instance. Ruby does this by checking whether the internal object IDs are equal. These internal object IDs are accessible using #__id__. Effectively, gizmo == thing is the same as gizmo.__id__ == thing.__id__.
This behavior is often not a good default, however. To illustrate this, consider the Point class from earlier:
class Point
attr_reader :x
attr_reader :y
def initialize(x, y)
@x = x
@y = y
end
end
The == operator will return true only when calling it on itself:
a = Point.new(4, 6)
b = Point.new(3, 7)
a_also = Point.new(4, 6)
a == a # => true
a == b # => false
a == "soup" # => false
a == a_also # => false?!
This default behavior is undesirable for value objects (such as Point) and for entities (such as the Employee class mentioned earlier). After all, two points are equal if (and only if) their x and y values are equal.
The desired behavior for value objects and entities is as follows:
- Value objects are equal if all their attributes are equal.
- Entities are equal if their explicit identity attributes are equal.
Instances of Point are value objects. With the above in mind, a good implementation of == for Point would look as follows:
class Point
…
def ==(other)
self.class == other.class &&
@x == other.x &&
@y == other.y
end
end
This implementation checks all attributes and the class of both objects. Because the class is checked first, comparing a Point instance with something of a different class returns false rather than raising an exception.
Checking equality on Point objects now works as intended:
a = Point.new(4, 6)
b = Point.new(3, 7)
a_also = Point.new(4, 6)
a == a # => true
a == b # => false
a == "soup" # => false
a == a_also # => true
The != operator works too:
a != b # => true
a != "rabbit" # => true
A correct implementation of equality has three properties: reflexivity, symmetry, and transitivity.
- Reflexivity: an object is equal to itself, so a == a.
- Symmetry: if a == b, then b == a.
- Transitivity: if a == b and b == c, then a == c.
These properties embody a common understanding of what equality means. Ruby won’t check these properties for you, so you’ll have to be vigilant to ensure you don’t break these properties when implementing equality yourself.
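Since Ruby won’t check these properties, a small test helper can do it for a handful of sample objects. The following is a minimal sketch, not a proof: the helper name equality_properties_hold? is made up for this illustration, and it only checks the samples you hand it. The Point class is repeated so that the example runs on its own.

```ruby
# Point as defined in this article, repeated so the sketch is self-contained.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def ==(other)
    self.class == other.class &&
      @x == other.x &&
      @y == other.y
  end
end

# Hypothetical helper (not part of Ruby): returns true when #== is reflexive,
# symmetric, and transitive for every combination of the given samples.
def equality_properties_hold?(samples)
  reflexive = samples.all? { |a| a == a }

  symmetric = samples.product(samples).all? do |a, b|
    (a == b) == (b == a)
  end

  # "If a == b and b == c, then a == c", written as (not premise) or conclusion.
  transitive = samples.product(samples, samples).all? do |a, b, c|
    !(a == b && b == c) || a == c
  end

  reflexive && symmetric && transitive
end

samples = [Point.new(4, 6), Point.new(4, 6), Point.new(3, 7), "soup", nil]
equality_properties_hold?(samples) # => true
```

Including objects of other classes (such as a string and nil) in the samples is deliberate: symmetry bugs tend to show up when the two sides of == are of different types.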
Basic equality for value objects
The Point class is an example of a value object. The identity of a value object, and thereby equality, is based on all its attributes. That is exactly what the earlier example does:
class Point
…
def ==(other)
self.class == other.class &&
@x == other.x &&
@y == other.y
end
end
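For simple value objects, it may not be necessary to write #== by hand at all. As a hedged aside: Ruby’s built-in Struct generates classes whose #== compares the class and all members, which is the same rule as above. The Coordinate name below is made up for this sketch.

```ruby
# Hypothetical value object built with Struct; Struct provides #== out of the box.
Coordinate = Struct.new(:x, :y)

a = Coordinate.new(4, 6)
a_also = Coordinate.new(4, 6)
b = Coordinate.new(3, 7)

a == a_also # => true
a == b      # => false
a == [4, 6] # => false, because an Array is not a Coordinate
```

Whether a Struct is appropriate depends on the class: Struct instances are mutable by default and come with extra methods that a hand-written class would not have.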
Basic equality for entities
Entities are objects with an explicit identity attribute, commonly @id. Unlike value objects, an entity is equal to another entity if and only if their explicit identities are equal.
Entities are uniquely identifiable objects. Typically, any database record with an id column corresponds to an entity.[2] Consider the following Employee entity class:
class Employee
attr_reader :id
attr_reader :name
def initialize(id, name)
@id = id
@name = name
end
end
For entities, the == operator is more involved to implement than for value objects:
class Employee
…
def ==(other)
super || (
self.class == other.class &&
!@id.nil? &&
@id == other.id
)
end
end
This code does the following:
- The super call invokes the default implementation of equality: Object#==. On Object, the #== method returns true if and only if the two objects are the same instance. This super call, therefore, ensures that the reflexivity property always holds.
- As with Point, the implementation of Employee#== checks the class. This way, an Employee instance can be checked for equality against objects of other classes, and this will always return false.
- If @id is nil, the entity is considered not equal to any other entity. This is useful for newly-created entities which have not been persisted yet.
- Lastly, this implementation checks whether the ID is the same as the ID of the other entity. If so, the two entities are equal.
Checking equality on entities now works as intended:
denis = Employee.new(570, "Denis Defreyne")
also_denis = Employee.new(570, "Denis")
denis_v = Employee.new(992, "Denis Villeneuve")
new_denis_1 = Employee.new(nil, "Denis 1")
new_denis_2 = Employee.new(nil, "Denis 2")
denis == denis # => true
denis == also_denis # => true
denis == "cucumber" # => false
denis == denis_v # => false
denis == new_denis_1 # => false
new_denis_1 == new_denis_1 # => true
new_denis_1 == new_denis_2 # => false
Basic equality with type coercion
Typically, an object is not equal to an object of a different class. However, this is not always the case. Consider integers and floating-point numbers:
float_two = 2.0
integer_two = 2
Here, float_two is an instance of Float, and integer_two is an instance of Integer. They are equal: float_two == integer_two is true, despite different classes. Instances of Integer and Float are interchangeable when it comes to equality.
As a second example, consider this Path class:
class Path
attr_reader :components
def initialize(*components)
@components = components
end
def string
"/" + @components.join("/")
end
end
This Path class provides an API for creating paths:
path = Path.new("usr", "bin", "ruby")
path.string # => "/usr/bin/ruby"
The Path class is a value object, and implementing #== could be done just as with other value objects:
class Path
…
def ==(other)
self.class == other.class &&
@components == other.components
end
end
However, the Path class is special because it represents a value that could be considered a string. The == operator will return false when checking equality with anything that isn’t a Path:
path = Path.new("usr", "bin", "ruby")
path.string == "/usr/bin/ruby" # => true
path == "/usr/bin/ruby" # => false :(
It can be beneficial for path == "/usr/bin/ruby" to be true rather than false. To make this happen, the == operator needs to be implemented differently:
class Path
…
def ==(other)
other.respond_to?(:to_str) &&
to_str == other.to_str
end
def to_str
string
end
end
This implementation of == coerces both objects to Strings, and then checks whether they are equal. Checking equality of a Path now works:
path = Path.new("usr", "bin", "ruby")
path == path # => true
path == "/usr/bin/ruby" # => true
path == "/usr/bin/python" # => false
path == 12.3 # => false
This class implements #to_str, rather than #to_s. These methods both return strings, but by convention, the to_str method is only implemented on types that are interchangeable with strings.
The Path class is such a type. By implementing Path#to_str, the implementation states that this class behaves like a String. For example, it’s now possible to pass a Path (rather than a String) to File.open, and it will just work. This is because File.open accepts anything that responds to #to_str.
String#== also takes the to_str method into account: when the other object is not a String but responds to #to_str, String#== delegates to that object’s own #==. Because of this, the == operator is symmetric:
"/usr/bin/ruby" == path # => true
"/usr/bin/python" == path # => false
Strict equality
Ruby provides #equal? to check whether two objects are the same instance:
"snow" + "plow" == "snowplow"
# => true
("snow" + "plow").equal?("snowplow")
# => false
Here, we end up with two String instances with the same content. Because they are distinct instances, #equal? returns false, and because their content is the same, #== returns true.
Do not implement #equal? in your own classes. It is not meant to be overridden. It’ll all end in tears.
Earlier in this article, I mentioned that #== has the property of reflexivity: an object is always equal to itself. Here is a related property for #equal?:
- Given objects a and b: if a.equal?(b), then a == b.
Ruby won’t automatically validate this property for your code. It’s up to you to ensure that this property holds when you implement the equality methods.
For example, recall the implementation of Employee#== from earlier in this article:
class Employee
…
def ==(other)
super || (
self.class == other.class &&
!@id.nil? &&
@id == other.id
)
end
end
The call to super on the first line makes this implementation of #== reflexive. This super invokes the default implementation of #==, which delegates to #equal?. Therefore, I could have used #equal?, rather than super:
class Employee
…
def ==(other)
self.equal?(other) || (
…
)
end
end
I prefer using super, though this is likely a matter of taste.
Hash equality
In Ruby, any object can be used as a key in a Hash. Strings, symbols, and numbers are commonly used as Hash keys, but instances of your own classes can function as Hash keys too, provided that you implement both #eql? and #hash.
The #eql? method
The #eql? method behaves similarly to #==:
"foo" == "foo" # => true
"foo".eql?("foo") # => true
"foo".eql?("bar") # => false
However, #eql?, unlike #==, does not perform type coercion:
1 == 1.0 # => true
1.eql?(1.0) # => false
1.0.eql?(1.0) # => true
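One practical consequence, sketched here with a plain Hash (the variable name is only for illustration): because hash lookups rely on #eql? rather than #==, the integer 1 and the float 1.0 are distinct keys, even though 1 == 1.0.

```ruby
labels = { 1 => "integer key" }

labels[1]   # => "integer key"
labels[1.0] # => nil, because 1.eql?(1.0) is false

# Storing under the float adds a second entry instead of overwriting the first.
labels[1.0] = "float key"
labels.size # => 2
```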
If #== doesn’t perform type coercion, the implementations of #eql? and #== will be identical. Rather than copy-pasting, however, we’ll put the implementation in #eql?, and let #== delegate to #eql?:
class Point
…
def ==(other)
self.eql?(other)
end
end
class Employee
…
def ==(other)
self.eql?(other)
end
end
I made the deliberate decision to put the implementation in #eql? and let #== delegate to it, rather than the other way around. If we were to let #eql? delegate to #==, there’s an increased risk that someone will update #== and inadvertently break the properties of #eql? (which I’ll mention below) in the process.
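Spelled out in full, the arrangement described above could look like the following sketch. It moves the comparison logic shown earlier for Point#== and Employee#== into #eql?, unchanged, and lets #== delegate; the classes are repeated in full so the example runs on its own.

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # The comparison logic lives here...
  def eql?(other)
    self.class == other.class &&
      @x == other.x &&
      @y == other.y
  end

  # ...and #== simply delegates to it.
  def ==(other)
    eql?(other)
  end
end

class Employee
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # super is the default #eql?, which is true only for the same instance.
  def eql?(other)
    super || (
      self.class == other.class &&
        !@id.nil? &&
        @id == other.id
    )
  end

  def ==(other)
    eql?(other)
  end
end

Point.new(4, 6) == Point.new(4, 6)                # => true
Point.new(4, 6).eql?(Point.new(4, 6))             # => true
Employee.new(570, "Denis") == Employee.new(570, "Denis D.") # => true
Employee.new(nil, "A") == Employee.new(nil, "B")  # => false
```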
For the Path value object, whose #== method does perform type coercion, the implementation of #eql? will differ from the implementation of #==:
class Path
…
def ==(other)
other.respond_to?(:to_str) &&
to_str == other.to_str
end
def eql?(other)
self.class == other.class &&
@components == other.components
end
end
Here, #== does not delegate to #eql?, nor the other way around.
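To make the difference between the two methods concrete, here is a small sketch that assembles the Path pieces shown so far into one runnable class and compares a path against a string with both methods.

```ruby
class Path
  attr_reader :components

  def initialize(*components)
    @components = components
  end

  def string
    "/" + @components.join("/")
  end

  def to_str
    string
  end

  # Coercing comparison: anything string-like with the same content is equal.
  def ==(other)
    other.respond_to?(:to_str) &&
      to_str == other.to_str
  end

  # Strict comparison: only another Path with the same components qualifies.
  def eql?(other)
    self.class == other.class &&
      @components == other.components
  end
end

path = Path.new("usr", "bin", "ruby")

path == "/usr/bin/ruby"                   # => true
path.eql?("/usr/bin/ruby")                # => false
path.eql?(Path.new("usr", "bin", "ruby")) # => true
```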
A correct implementation of #eql? has the following two properties:
- Given objects a and b: if a.eql?(b), then a == b.
- Given objects a and b: if a.equal?(b), then a.eql?(b).
These two properties are not explicitly called out in the Ruby documentation. However, to the best of my knowledge, all implementations of #eql? and #== respect these properties.
As before, Ruby will not automatically validate that these properties hold in your code. It’s up to you to ensure that these properties aren’t violated.
The #hash method
For an object to be usable as a key in a Hash, it needs to implement not only #eql?, but also #hash. This #hash method will return an integer, the hash code, that respects the following property:
- Given objects a and b: if a.eql?(b), then a.hash == b.hash.
Typically, the implementation of #hash creates an array of all attributes that constitute identity, and returns the hash of that array. For example, here is Point#hash:
class Point
…
def hash
[self.class, @x, @y].hash
end
end
For Path, the implementation of #hash will look similar:
class Path
…
def hash
[self.class, @components].hash
end
end
For the Employee class, which is an entity rather than a value object, the implementation of #hash will use the class and the @id:
class Employee
…
def hash
[self.class, @id].hash
end
end
If two objects are not equal, the hash code should ideally be different, too. This is not mandatory, however: it is okay for two non-equal objects to have the same hash code. Ruby will use #eql? to tell objects with identical hash codes apart.
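As a deliberately extreme illustration of that last point, the sketch below uses a made-up Label class whose #hash returns the same number for every instance. Lookups still give correct answers, because #eql? has the final say; they just lose the performance benefit of hashing.

```ruby
# Hypothetical class for illustration: every instance has the same hash code.
class Label
  attr_reader :text

  def initialize(text)
    @text = text
  end

  def eql?(other)
    self.class == other.class && @text == other.text
  end

  def ==(other)
    eql?(other)
  end

  # A legal but terrible hash function: all instances collide.
  def hash
    42
  end
end

colors = {
  Label.new("red") => "#f00",
  Label.new("blue") => "#00f",
}

colors[Label.new("red")]   # => "#f00"
colors[Label.new("blue")]  # => "#00f"
colors[Label.new("green")] # => nil
colors.size                # => 2
```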
Putting it together
With both #eql? and #hash in place, the Point, Path, and Employee objects can be used as hash keys:
points = {}
points[Point.new(11, 24)] = true
points[Point.new(11, 24)] # => true
points[Point.new(10, 22)] # => nil
Here, we use a Hash instance to keep track of a collection of Points. We can also use a Set for this, which uses a Hash under the hood, but provides a nicer API:
require "set"
points = Set.new
points << Point.new(11, 24)
points.include?(Point.new(11, 24)) # => true
points.include?(Point.new(10, 22)) # => false
Objects that are used in Sets need to have an implementation of both #eql? and #hash, just like objects that are used as hash keys.
Objects that perform type coercion, such as Path, can also be used as hash keys, and thus also in sets:
require "set"
home = Path.new("home", "denis")
also_home = Path.new("home", "denis")
elsewhere = Path.new("usr", "bin")
paths = Set.new
paths << home
paths.include?(home) # => true
paths.include?(also_home) # => true
paths.include?(elsewhere) # => false
We now have an implementation of equality that works for all kinds of objects.
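This also covers the example from the start of this article, finding the distinct managers with #uniq. According to the Ruby documentation, Array#uniq compares elements using #hash and #eql?, so it benefits from the same two methods. A minimal sketch, with the Employee class repeated in full and made-up sample data:

```ruby
class Employee
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  def eql?(other)
    super || (
      self.class == other.class &&
        !@id.nil? &&
        @id == other.id
    )
  end

  def ==(other)
    eql?(other)
  end

  def hash
    [self.class, @id].hash
  end
end

# Sample data for illustration: two records refer to the same manager (ID 570).
managers = [
  Employee.new(570, "Denis Defreyne"),
  Employee.new(992, "Denis Villeneuve"),
  Employee.new(570, "Denis"),
]

managers.uniq.map(&:id)                               # => [570, 992]
managers.include?(Employee.new(992, "D. Villeneuve")) # => true
```

Note that #include? on an array relies on #== rather than on #hash and #eql?, which is one more reason to keep all three methods consistent with each other.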
So far, we’ve covered #==, #eql?, and #hash. These three methods are sufficient for a correct implementation of equality. However, we can go further to improve that sweet Ruby developer experience, and implement #===.
Case equality (triple equals)
The #=== operator, also called the case equality operator, is not really an equality operator at all. Rather, it’s better to think of it as a membership testing operator. Consider the following:
(10..15) === 14 # => true
(80..99) === 14 # => false
Here, Range#=== checks whether a range covers a certain element. It is also common to use case expressions to achieve the same:
case 14
when 10..15
puts "Kinda small!"
when 80..99
puts "Kinda large!"
end
This is also where case equality gets its name. Triple-equals is called case equality, because case expressions use it.
Strictly speaking, you never need case: any case expression can be rewritten using if and ===. In general, though, case expressions tend to look cleaner. Compare:
if (10..15) === 14
puts "Kinda small!"
elsif (80..99) === 14
puts "Kinda large!"
end
The examples above all use Range#=== to check whether the range covers a certain number. Another commonly used implementation is Class#===, which checks whether an object is an instance of a class:
Integer === 15 # => true
Integer === 15.5 # => false
I’m rather fond of the #grep method, which uses #=== to select matching elements from an array. It can be shorter and sweeter than using #select:
[4, 2.0, 7, 6.1].grep(Integer) # => [4, 7]
[4, 2.0, 7, 6.1].grep(2..6) # => [4, 2.0]
# Same, but more verbose:
[4, 2.0, 7, 6.1].select { |num| Integer === num }
[4, 2.0, 7, 6.1].select { |num| (2..6) === num }
Regular expressions also implement #===. You can use it to check whether a string matches a regular expression:
phone = "+491573abcde"
case phone
when /00000/
puts "Too many zeroes!"
when /[a-z]/
puts "Your phone number has letters in it?!"
end
It helps to think of a regular expression as the (infinite) collection of all strings that can be produced by it. The set of all strings produced by /[a-z]/ includes the example string "+491573abcde". Similarly, you can think of a Class as the (infinite) collection of all its instances, and a Range as the collection of all elements in that range. This way of thinking clarifies that #=== really is a membership testing operator.
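The same way of thinking extends to lambdas. Proc#=== calls the proc with the given argument, so a lambda can be read as the collection of all values for which it returns true. The classify method below is a made-up example to show this in a case expression and with #grep.

```ruby
# Hypothetical helper: each lambda acts as a membership test in `when`.
def classify(number)
  case number
  when ->(n) { n < 0 } then "negative"
  when 0 then "zero"
  when ->(n) { n.even? } then "positive and even"
  else "positive and odd"
  end
end

classify(-3) # => "negative"
classify(0)  # => "zero"
classify(8)  # => "positive and even"
classify(5)  # => "positive and odd"

# #grep uses #=== too, so a lambda works as a pattern.
even = ->(n) { n.even? }
[4, 7, 10, 13].grep(even) # => [4, 10]
```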
An example of a class that could implement #=== is a PathPattern class:
class PathPattern
def initialize(string)
@string = string
end
def ===(other)
File.fnmatch(@string, other)
end
end
An example instance is PathPattern.new("/bin/*"), which matches paths under the /bin directory, such as /bin/ruby.
The implementation of PathPattern#=== uses Ruby’s built-in File.fnmatch to check whether the pattern string matches. Here is an example of it in use:
pattern = PathPattern.new("/bin/*")
pattern === "/bin/ruby" # => true
pattern === "/var/log" # => false
Worth noting is that File.fnmatch calls #to_str on its arguments. This way, #=== automatically works on other string-like objects as well, such as Path instances:
bin_ruby = Path.new("bin", "ruby")
var_log = Path.new("var", "log")
pattern = PathPattern.new("/bin/*")
pattern === bin_ruby # => true
pattern === var_log # => false
The PathPattern class implements #===, and therefore PathPattern instances work with case/when, too:
case "/home/denis"
when PathPattern.new("/home/*")
puts "Home sweet home"
else
puts "Somewhere, I guess"
end
Ordered comparison
For some objects, it’s useful not only to check whether two objects are the same, but also how they are ordered. Are they larger? Smaller? Consider this Score class, which models the scoring system of my university in Ghent, Belgium.
class Score
attr_reader :value
def initialize(value)
@value = value
end
def grade
if @value < 10
"failing"
elsif @value < 14
"passing"
elsif @value < 16
"distinction"
elsif @value < 18
"high distinction"
else
"highest distinction"
end
end
end
(I was a terrible student. I’m not sure if this was really how the scoring even worked — but as an example, it will do just fine.)
In any case, we benefit from having such a Score class. We can encode relevant logic in it, such as determining the grade, and checking whether or not a score is passing. For example, it might be useful to get the lowest and highest score out of a list:
scores = [
Score.new(6),
Score.new(17),
Score.new(14),
Score.new(13),
Score.new(11),
]
p scores.min
p scores.max
However, as it stands right now, the expressions scores.min and scores.max will result in an error: comparison of Score with Score failed (ArgumentError). We haven’t told Ruby how to compare two Score objects. We can do so by implementing Score#<=>:
class Score
…
def <=>(other)
return nil unless other.class == self.class
@value <=> other.value
end
end
An implementation of #<=> returns one of four possible values:
- It returns 0 when the two objects are equal.
- It returns -1 when self is less than other.
- It returns 1 when self is greater than other.
- It returns nil when the two objects cannot be compared.
The #<=> and #== operators are connected:
- Given objects a and b: if (a <=> b) == 0, then a == b.
- Given objects a and b: if (a <=> b) != 0, then a != b.
As before, it’s up to you to ensure that these properties hold when implementing #== and #<=>. Ruby won’t check this for you.
For simplicity, I’ve left out the implementation of Score#== in the Score example above. It’d certainly be good to have that, though.
In the case of Score#<=>, we bail out if other is not a score, and otherwise, we call #<=> on the two values. We can check that this works: the expression Score.new(6) <=> Score.new(12) evaluates to -1, which is correct because a score of 6 is lower than a score of 12.[3]
With Score#<=> in place, scores.max now returns the maximum score. Other methods such as #min, #minmax, and #sort work as well.
However, we can’t yet use operators like <. The expression scores[0] < scores[1], for example, will raise an undefined method error: undefined method `<' for #<Score:0x00112233 @value=6>. We can solve that by including the Comparable mixin:
class Score
include Comparable
…
def <=>(other)
…
end
end
By including Comparable, the Score class automatically gains the <, <=, >, and >= operators, which all call <=> internally. The expression scores[0] < scores[1] now evaluates to a boolean, as expected.
The Comparable mixin also provides other useful methods such as #between? and #clamp.
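Here is a minimal sketch that puts the pieces of this section together in one runnable class (the grade method is left out for brevity). It also shows one thing not mentioned above: Comparable supplies a #== that is derived from #<=>, which keeps the two operators consistent. It does not supply #eql? or #hash, so those would still need to be written by hand for scores to work as hash keys.

```ruby
class Score
  include Comparable

  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Returns nil for anything that is not a Score, meaning "not comparable".
  def <=>(other)
    return nil unless other.class == self.class

    @value <=> other.value
  end
end

scores = [Score.new(6), Score.new(17), Score.new(14)]

scores.min.value         # => 6
scores.max.value         # => 17
scores.sort.map(&:value) # => [6, 14, 17]

Score.new(6) < Score.new(17)                           # => true
Score.new(12).between?(Score.new(10), Score.new(14))   # => true
Score.new(25).clamp(Score.new(0), Score.new(20)).value # => 20

# Comparable#== returns true when <=> returns 0, and false when it returns nil.
Score.new(14) == Score.new(14) # => true
Score.new(14) == "fourteen"    # => false
```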
Wrapping up
We talked about the following topics:
- the #== operator, used for basic equality, with optional type coercion.
- #equal?, which checks whether two objects are the same instance.
- #eql? and #hash, which are used for testing whether an object is a key in a hash.
- #===, which isn’t quite an equality operator, but rather an “is kind of” or “is member of” operator.
- #<=> for ordered comparison, along with the Comparable module, which provides operators such as < and >=.
You now know all you need to know about implementing equality in Ruby.
Further reading
The Ruby documentation is a good place to find out more about equality:
- Documentation on #== (double equals), #eql?, and #equal?
- Documentation on #hash
- Documentation on #=== (triple equals)
- Documentation on #<=>
- Documentation on Comparable
I also found the following resources useful:
Special thanks to Kevin Newton and Chris Seaton for their invaluable input.
[1] You could say that Ruby’s implementation of equality… has no equal. Ahem.
[2] Other forms of ID are possible too. For example, books have an ISBN, and recordings have an ISRC. But if you have a library with multiple copies of the same book, then ISBN won’t uniquely identify your books anymore.
[3] Did you know that the Belgian high school system used to have a scoring system where 1 was the highest and 10 was the lowest? Imagine the confusion!