There's a trick to using Hash.new
In my previous post on hashes I showed that in Ruby you can create a hash with a static default value. Here’s the example I used in that post:
dog_counts = Hash.new(0)
# => {}
[
"doberman",
"dachshund",
"doberman",
"doberman",
"whippet",
"labrador"
].each do |dog_seen|
dog_counts[dog_seen] += 1
end
dog_counts
# => {"doberman"=>3, "dachshund"=>1, "whippet"=>1, "labrador"=>1}
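For comparison, here is a minimal sketch of the same count without a default value at all. It assumes Ruby 2.7 or newer, where Enumerable#tally is available; the variable names dogs_seen and poodle_count are only for illustration.

```ruby
# Counting with Enumerable#tally (Ruby 2.7+) instead of Hash.new(0).
dogs_seen = [
  "doberman",
  "dachshund",
  "doberman",
  "doberman",
  "whippet",
  "labrador"
]

dog_counts = dogs_seen.tally
# => {"doberman"=>3, "dachshund"=>1, "whippet"=>1, "labrador"=>1}

# Unlike Hash.new(0), the hash returned by tally has no default value,
# so a breed that was never seen comes back as nil rather than 0.
poodle_count = dog_counts["poodle"]
# => nil
```

The trade-off is that lookups for breeds you never saw return nil, so Hash.new(0) is still the better fit if you want a missing breed to read as zero.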
Grouping Dogs By Breed
This pattern works great for counting dogs, but it can get you into trouble if you’re trying to group them. Let’s look at a second example:
dogs_by_type = Hash.new([])
[
{type: "whippet", name: "Roxie"},
{type: "rhodesian", name: "Cessna"},
{type: "alsatian", name: "Daeva"},
].each do |dog|
dog_type = dog[:type]
# The assignment here is because accessing default values doesn't set them in
# the hash, see: https://jarednorman.ca/hash-new-with-a-block
dogs_by_type[dog_type] = dogs_by_type[dog_type] << dog[:name]
end
Do you see the bug? Hint for Python programmers: it’s the same issue default argument values have in Python. Let’s take a look at the value of dogs_by_type:
{"whippet" => ["Roxie", "Cessna", "Daeva"],
"rhodesian" => ["Roxie", "Cessna", "Daeva"],
"alsatian" => ["Roxie", "Cessna", "Daeva"]}
We meant to set the default value for this hash to an array, but we actually set the default value to the one, specific array that we passed in. Every time we accessed a key that wasn’t set yet we got the same array and then mutated (changed) it. As a result all the keys in the hash get set to the same array. Here’s an example that demonstrates this more explicitly:
my_array = [3, 2, 1]
my_hash = Hash.new(my_array)
my_array.sort!
my_hash[:some_key]
#=> [1, 2, 3]
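To make the sharing visible, here is a small sketch that checks object identity directly. The variable names (shared, first, second, and so on) are made up for this illustration.

```ruby
# Every missing key in a Hash.new([]) hash returns the very same array object.
shared = Hash.new([])

first  = shared[:whippet]   # returns the default array; stores nothing
second = shared[:alsatian]  # returns that same default array again

same_object = first.equal?(second)
# => true (one object, reached through two different keys)

first << "Roxie"            # mutates the single shared default

leaked = shared[:labrador]
# => ["Roxie"] (even though :labrador was never touched)

no_keys = shared.empty?
# => true (reading a default value never stores a key)
```

Note that the hash is still empty at the end: all we ever did was mutate the default.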
Better Grouping
The solution is to use the block syntax from that blog post that I keep linking you to. That might look something like this:
dogs_by_type = Hash.new { [] }
[
{type: "whippet", name: "Roxie"},
{type: "rhodesian", name: "Cessna"},
{type: "alsatian", name: "Daeva"},
].each do |dog|
dog_type = dog[:type]
dogs_by_type[dog_type] = dogs_by_type[dog_type] << dog[:name]
end
dogs_by_type
#=> {"whippet"=>["Roxie"], "rhodesian"=>["Cessna"], "alsatian"=>["Daeva"]}
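The explicit assignment is still doing real work in that version. As a rough sketch (the names fresh, a, and b are just for illustration), a block that only returns a new array hands back a different array on every access and never stores any of them:

```ruby
# A block that returns [] without assigning it builds a new array per access.
fresh = Hash.new { [] }

a = fresh[:whippet]
b = fresh[:whippet]

different_objects = !a.equal?(b)
# => true (each lookup ran the block and built a new array)

fresh[:whippet] << "Roxie"  # appends to a throwaway array that is then lost

still_empty = fresh.empty?
# => true (nothing was ever stored in the hash)
```

So without the assignment, the appended names would simply disappear.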
This gets you the right result, but we can apply what we learned in that other post and assign the default value inside the block we pass in when creating the hash:
dogs_by_type = Hash.new { |hash, key| hash[key] = [] }
[
{type: "whippet", name: "Roxie"},
{type: "rhodesian", name: "Cessna"},
{type: "alsatian", name: "Daeva"},
].each do |dog|
dogs_by_type[dog[:type]] << dog[:name]
end
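One side effect worth knowing about: because the block assigns, merely reading a missing key now adds it to the hash. Here is a minimal sketch of that behaviour, using made-up breeds ("poodle", "beagle") that aren't in the example above:

```ruby
dogs_by_type = Hash.new { |hash, key| hash[key] = [] }

dogs_by_type["whippet"] << "Roxie"

# Just reading a missing key runs the block, which stores an empty array.
dogs_by_type["poodle"]
keys_after_read = dogs_by_type.keys
# => ["whippet", "poodle"]

# key? and fetch do not invoke the default block, so they add nothing.
has_beagle   = dogs_by_type.key?("beagle")
# => false
beagle_names = dogs_by_type.fetch("beagle", [])
# => []
keys_after_fetch = dogs_by_type.keys
# => ["whippet", "poodle"]
```

If stray empty entries would be a problem, checking with key? or fetch before reading is one way to avoid them.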
That cleans things up significantly by getting rid of that extra assignment. You could even go one step further and write the logic in a more “functional” style:
[
{type: "whippet", name: "Roxie"},
{type: "rhodesian", name: "Cessna"},
{type: "alsatian", name: "Daeva"},
].group_by { |dog|
dog[:type]
}.transform_values { |dogs|
dogs.map { |dog|
dog[:name]
}
}
Now we never even create a hash with a default value!
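Here is a sketch of what that version returns and how the result behaves. I've added a second, hypothetical whippet ("Biscuit") so the grouping is visible, and the variable names are only for illustration.

```ruby
dogs = [
  {type: "whippet", name: "Roxie"},
  {type: "rhodesian", name: "Cessna"},
  {type: "alsatian", name: "Daeva"},
  {type: "whippet", name: "Biscuit"}, # hypothetical extra dog, not in the post
]

names_by_type = dogs
  .group_by { |dog| dog[:type] }
  .transform_values { |group| group.map { |dog| dog[:name] } }
# => {"whippet"=>["Roxie", "Biscuit"], "rhodesian"=>["Cessna"], "alsatian"=>["Daeva"]}

# The result is a plain hash with no default, so a missing breed is nil...
missing = names_by_type["poodle"]
# => nil

# ...unless you ask for a fallback explicitly.
fallback = names_by_type.fetch("poodle", [])
# => []
```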
Conclusion
There are three things you should take away from this article:
- Dogs are good.
- Hashes can have default values in Ruby, but make sure you create them dynamically if you’re going to mutate them.
- Sometimes you can lean on Ruby’s Enumerable methods and you won’t even need to make a hash with a default value. That said, the code you end up with might not communicate what you’re trying to do as clearly. It’s a judgement call.