Ruby #hash implementations
I tested out a handful of different #hash
implementations to see how they perform.
Setup
Given a Point
class:
class Point
attr_reader :x, :y, :z
def initialize(x, y, z)
@x = x
@y = y
@z = z
end
… and four different #hash
implementations.
XOR:
def hash
self.class.hash ^ @x.hash ^ @y.hash ^ @z.hash
end
Prime XOR:
def hash
self.class.hash ^ 3 * @x.hash ^ 5 * @y.hash ^ 7 * @z.hash
end
High prime XOR:
def hash
self.class.hash ^ 53 * @x.hash ^ 59 * @y.hash ^ 61 * @z.hash
end
Array:
def hash
[self.class, @x, @y, @z].hash
end
For each point type, I created 100 000 points, with coordinates 0 ≤ c < 100:
MAX = 100
TIMES = 100_000
points_with_xor_hash = TIMES.times.map do
PointWithXORHash.new(rand(MAX), rand(MAX), rand(MAX))
end
points_with_prime_xor_hash = TIMES.times.map do
PointWithPrimeXORHash.new(rand(MAX), rand(MAX), rand(MAX))
end
points_with_high_prime_xor_hash = TIMES.times.map do
PointWithHighPrimeXORHash.new(rand(MAX), rand(MAX), rand(MAX))
end
points_with_array_hash = TIMES.times.map do
PointWithArrayHash.new(rand(MAX), rand(MAX), rand(MAX))
end
Test: speed
I then benchmarked how fast #hash
is:
xor 34.200 (± 2.9%) i/s - 342.000 in 10.003618s
prime_xor 17.855 (± 0.0%) i/s - 179.000 in 10.027120s
high_prime_xor 16.992 (± 0.0%) i/s - 170.000 in 10.010219s
array 36.327 (± 0.0%) i/s - 366.000 in 10.076506s
Comparison:
array: 36.3 i/s
xor: 34.2 i/s - 1.06x (± 0.00) slower
prime_xor: 17.9 i/s - 2.03x (± 0.00) slower
high_prime_xor: 17.0 i/s - 2.14x (± 0.00) slower
Winners: array
and xor
.
Test: collisions
For each collection of points, I counted how often each hash code occurs:
name range mean
xor 43 1.37
prime_xor 3 1.05
high_prime_xor 3 1.05
array 3 1.05
Winners: all but xor
.
Test: DX
That sweet developer experience. Arguably, hash
is easiest to implement with Array
: the implementation [self.class, @x, @y, @z].hash
is so nicely compact.
Conclusion
It seems that using Array#hash
for constructing hashes is both very fast (even slightly faster than XOR), and creates well-distributed hash codes. it’s also the most compact.