It seems that Hash#merge does not copy any default blocks set on either of the parent Hash objects. This is a bit surprising. Is this the expected behavior?
$ crystal -v
Using compiled compiler at `.build/crystal'
Crystal 0.28.0-dev [5282fdf91] (2019-03-26)
LLVM: 4.0.1
Default target: x86_64-pc-linux-gnu
proc = ->(h : Hash(String, String), k : String) { "oops!" }
hash = Hash(String, String).new(proc)
pp hash["q"]
#=> "oops!"
pp hash.merge({"a" => "b"})["q"]
#=> Unhandled exception: Missing hash key: "q" (KeyError)
That's expected behavior, Hash#merge only merges hash contents.
This seems to mean that one can have either a Hash literal, or a Hash with a default block, but that these two things cannot be combined.
How does it work in Ruby?
I didn't want to initiate that comparison myself (I don't know how far, "but Ruby" goes around here as an argument). However, Ruby does preserve the default values:
Hash.new { "value" }.merge({"a" => "b"})["q"]
#=> "value"
The exact mechanism is not quite clear to me. Ruby also provides Hash#default= and Hash#default_block=, which would allow one to do:
{"a" => "b"}.tap {|h| h.default = "x" }
As a workaround this would come to mind: Hash.new { "value" }.dup.merge!({"a" => "b"})["q"]. But it doesn't work because #dup doesn't duplicate the default block as well. I think that should be fixed.
Regarding #merge, I'm not sure. But probably yes.
@tenebrousedge I misunderstood your issue, in case of already created hash with a fallback, calling #merge on it shouldn't make any difference I guess.
@Sija The semantics of merge is not, merge other into self but merge other and self into a new hash. It would probably still make sense to copy the default value/block.
I understand merge as merge (the content of) other into self but as a new hash, so yeah it should keep the default imo.
What if you're merging 2 hashes that both have their own defaults? I guess it would work the same way if both of them had the same key where the value would come from the hash being passed in to merge, right?
a = Hash(String, String).new { "a" }
b = Hash(String, String).new { "b" }
c = a.merge(b)
c["a"] #=> "b"
@jwoertink I'd expect c["a"] # => "a", i.e. keeping the receivers default proc.
Here's Ruby:
irb(main):002:0> a = Hash.new 0
=> {}
irb(main):003:0> b = Hash.new 1
=> {}
irb(main):004:0> a.merge(b)[10]
=> 0
irb(main):005:0> b.merge(a)[10]
=> 1
We should probably do the same (I think it makes sense). The receiver of merge is the one whose default block is preserved.
Hmm.. That's interesting. I never noticed that with ruby before.
irb(main):001:0> a = Hash.new(0)
=> {}
irb(main):002:0> a["a"] = 1
=> 1
irb(main):003:0> b = Hash.new(1)
=> {}
irb(main):004:0> b["a"] = 2
=> 2
irb(main):005:0> c = a.merge(b)
=> {"a"=>2}
irb(main):006:0> c["b"]
=> 0
irb(main):007:0> c["a"]
=> 2
So it's like c gets the default from a, but the overrides from b when there's a key conflict. TIL 馃槅
So, to my mind, the obvious thing to do here would be to pass @block to the newly created Hash. However, that's probably going to cause merge to blow up if the other hash is not a compatible type. I'm not sure if that's considered to be a problem.
Actually, this isn't trivial to do and we should probably not do anything about it.
The problem in Crystal is that merge will create a Hash with a possibly different type:
h1 = Hash(Int32, String).new("foo")
h2 = Hash(String, Int32).new(10)
h3 = h1.merge(h2)
p typeof(h3) # => Hash(Int32 | String, Int32 | String)
Now, each Hash has a block of type @block : (self, K -> V)? and we can't just copy the block because the types would be incompatible. We could create a new block that, in its implementation, calls the other block but that means the other hash will have to be kept in memory until this new hash is GCed, which is not ideal.
So we should probably do nothing about this. If you want to keep the default block you can use merge!.
Actually, what I said above is nonsense, the Hash can be GCed if the block is kept alive. I'll try to implement this.
Most helpful comment
@jwoertink I'd expect
c["a"] # => "a", i.e. keeping the receivers default proc.