def foo(a : Array(T)) forall T
puts "one"
end
def foo(a)
puts "two"
end
z = [5].as(Array(Array(Int32)) | Array(Int32))
foo(z)
Prints "two" but should print "one". I'm still not exactly sure why this happens.
Without the dependency on Array
:
class Gen(T)
end
def foo(a : Gen(T)) forall T
puts "one"
end
def foo(a)
puts "two"
end
z = Gen(Int32).new.as(Gen(Gen(Int32)) | Gen(Int32))
foo(z)
Not sure if it's relevant, but the bug appears inconsistent.
def foo(a : Array(T)) forall T
a.each_with_index do |a, i|
p "first call for #{a} of class #{a.class}"
foo(a)
end
end
def foo(a : T) forall T
p "second call for #{a} of class #{a.class}"
p a
end
foo([1, [2, [[3]], [4, [[5]]]]])
"first call for 1 of class Int32"
"second call for 1 of class Int32"
1
"first call for [2, [[3]], [4, [[5]]]] of class Array(Array(Array(Array(Int32)) | Int32) | Array(Array(Int32)) | Int32)"
"first call for 2 of class Int32"
"second call for 2 of class Int32"
2
"first call for [[3]] of class Array(Array(Int32))"
"second call for [[3]] of class Array(Array(Int32))"
[[3]]
"first call for [4, [[5]]] of class Array(Array(Array(Int32)) | Int32)"
"first call for 4 of class Int32"
"second call for 4 of class Int32"
4
"first call for [[5]] of class Array(Array(Int32))"
"first call for [5] of class Array(Int32)"
"first call for 5 of class Int32"
"second call for 5 of class Int32"
5
I'm not enough of a mathematician to do this correctly, but bear with me.
So numbers belong to the set N.
[[3]]
and [[5]]
belong to the same set or type A for N ⊂ A.
[2, [[3]]]
and [4, [[5]]]
would also belong to the same set or type B for which (A | N) ⊂ B.
So the array [2, [[3]], [4, [[5]]]
is a collection C which is B(N | B). When the compiler sees this pattern it determines that N | B is the unit type T. When calling the first method on C it will call the second method on any B that is not B(B). Because [[3]]
is A and is not B, the second method is called. Because [4, [[5]]]
is B, the first method is called. For this instance the unit type T is no longer N | B, but rather N | A. Because A is defined as containing N but not itself, N is the unit type T for the first method called on A.
If any of that makes sense and is factual, the compiler's actions are surprising but correct. The expected output would only be possible if the compiler waits to define the unit type T until all possible recursion is done.
It might be a correct guess but there's no recursion going on.
It works wrong because we have a bug in our code.
Recursion might be the wrong term for the algorithm process, but it does seem to have something to do with the recursive union type.
B(A|B) triggers it when C(A|B) does not.
Doing some old-fashioned puts debugging in the compiler and discovered this when I ran it against @stevensonmt's example code:
Crystal::Matches(
@cover=true,
@matches=
[#<Crystal::Match:0x10cf32330
@arg_types=[Array(Array(Int32))],
@context=
#<Crystal::MatchContext:0x10cf52900
@def_free_vars=["T"],
@defining_type=main,
@free_vars={"T" => Array(Int32)},
@instantiated_type=main,
@strict=false>,
@def=
def foo(a : Array(T)) forall T
a.each_with_index do |a, i|
p("first call for #{a} of class #{a.class}")
foo(a)
end
end,
@named_arg_types=nil>,
#<Crystal::Match:0x10cf32300
@arg_types=[(Array(Array(Int32)) | Int32)],
@context=
#<Crystal::MatchContext:0x10cf52800
@def_free_vars=["T"],
@defining_type=main,
@free_vars={"T" => (Array(Array(Int32)) | Int32)},
@instantiated_type=main,
@strict=false>,
@def=
def foo(a : T) forall T
p("second call for #{a} of class #{a.class}")
p(a)
end,
@named_arg_types=nil>],
@owner=main,
@success=true)
Crystal::Type#lookup_matches_without_parents
is correctly matching Array(Array(Int32))
(the [[3]]
) to the first method but also lumping it in with Int32
in a second match. I haven't figured out how that match is used yet, but it seems to be choosing the second match somewhere.
Interesting bug, if I try this: https://carc.in/#/r/8t59
def foo(a : Array(T)) forall T
a.each_with_index do |a, i|
p "first call for #{a} of class #{a.class}"
foo(a)
end
end
def foo(a : Array(Array(Int32)))
a.each_with_index do |a, i|
p "first call for #{a} of class #{a.class}"
foo(a)
end
end
def foo(a : T) forall T
p "second call for #{a} of class #{a.class}"
p a
end
foo([1, [2, [[3]], [4, [[5]]]]])
Then I get this:
"first call for 1 of class Int32"
"second call for 1 of class Int32"
1
"first call for [2, [[3]], [4, [[5]]]] of class Array(Array(Array(Array(Int32)) | Int32) | Array(Array(Int32)) | Int32)"
"first call for 2 of class Int32"
"second call for 2 of class Int32"
2
"first call for [[3]] of class Array(Array(Int32))"
"first call for [3] of class Array(Int32)"
"first call for 3 of class Int32"
"second call for 3 of class Int32"
3
"first call for [4, [[5]]] of class Array(Array(Array(Int32)) | Int32)"
"first call for 4 of class Int32"
"second call for 4 of class Int32"
4
"first call for [[5]] of class Array(Array(Int32))"
"first call for [5] of class Array(Int32)"
"first call for 5 of class Int32"
"second call for 5 of class Int32"
5
Looks like the bug is somewhere in this file. Haven't found it yet. I'll look some more after work today.
The problem is that this is a bug where the problem is not the code. We need to define how we want it to work first.
Here's the code again:
class Gen(T)
end
def foo(a : Gen(T)) forall T
puts "one"
end
def foo(a)
puts "two"
end
z = Gen(Int32).new.as(Gen(Gen(Int32)) | Gen(Int32))
foo(z)
The type of z
is a union of:
Gen(Gen(Int32))
Gen(Int32)
When the compiler tries to match that union type against Gen(T)
it will do this:
Gen(Gen(Int32))
. It matches against Gen(T)
. T
becomes Gen(Int32)
.Gen(Int32)
. Try to match it against Gen(T)
. However, we said T
is Gen(Int32)
. So it's matching against Gen(Gen(Int32))
. No match.The only match is the first one, Gen(Gen(Int32))
, where T
is Gen(Int32)
.
Then the next overload is considered, and the entire union matches.
That's why we are seeing that behavior.
How it should work? Should T
become the union of those types? But if T
is Gen(Int32) | Int32
it would mean we had Gen(Gen(Int32) | Int32)
in the first place, which is not the case.
We basically need to dispatch the union type to two different instantiations of the same overload. But the code never does that in any place of the compiler.
So this is a huge change, if we want to implement it.
Or... other ideas of how this should match?
This makes me question whether I understand forall
. My mental model is that it instantiates a new overload for every single type it was called with, including splitting unions. For example:
def foo(a : T) forall T
end
foo [1, "b"].first
In this example, it seems that there would be 2 overloads for foo
— one for Int32
and one for String
. Is that correct or is there just one for Int32 | String
?
@jgaskins there is just one, for the union type.
It works exactly the same way without forall
. forall
doesn't change which instantiations occur, it just increases the expressiveness of type restrictions.
Exactly. forall
just lets you talk about that T
and use it in code later on. But it doesn't change how things match.
That said, in my mind forall
should also expand union types and provide one overload for each, but that's not the case. And maybe we shouldn't do it. Should we do it for each subclass of a type hierarchy too?
Another thing to consider:
class Gen(T)
end
def foo(a : Gen(T)) forall T
p T
end
foo(Gen(Int32).new || Gen(Char).new)
Currently fails:
Error: no overload matches 'foo' with type (Gen(Char) | Gen(Int32))
Overloads are:
- foo(a : Gen(T))
Couldn't find overloads for these types:
- foo(a : Gen(Int32))
It's because Gen(Char)
matches, T
becomes Char
, then Gen(Int32)
doesn't match and there's no overload to cover that.
The entire situation is very confusing.
My advice is to avoid using forall
if not needed. In the example in the forums the T
wasn't needed at all.
The type of
z
is a union of:
Gen(Gen(Int32))
Gen(Int32)
When the compiler tries to match that union type against
Gen(T)
it will do this:
- Take the first type in the union:
Gen(Gen(Int32))
. It matches againstGen(T)
.T
becomesGen(Int32)
.- The the second type in the union:
Gen(Int32)
. Try to match it againstGen(T)
. However, we saidT
isGen(Int32)
. So it's matching againstGen(Gen(Int32))
. No match.The only match is the first one,
Gen(Gen(Int32))
, whereT
isGen(Int32)
.Then the next overload is considered, and the entire union matches.
That's why we are seeing that behavior.
How it should work? Should
T
become the union of those types? But ifT
isGen(Int32) | Int32
it would mean we hadGen(Gen(Int32) | Int32)
in the first place, which is not the case.We basically need to dispatch the union type to two different instantiations of the same overload. But the code never does that in any place of the compiler.
So this is a huge change, if we want to implement it.
Or... other ideas of how this should match?
Thanks for the clear delineation of the problem. I never expected to uncover such a fundamental bug. How important is this corner case?
How important is this corner case?
There's lucky framework, amber, athena and thousand of shards out there. We have a compiler. There's the standard library with a lot of features.
It's a bug but it's not preventing people from creating wonderful things. I think this is a very low priority bug. It's pretty rare to have to match against Array(T)
and wanting to get that T
. It's a bit leaky. And if you really need to do that, you can do it by doing typeof(array.first)
or something like that too. So there are workarounds.
It works exactly the same way without
forall
.forall
doesn't change which instantiations occur, it just increases the expressiveness of type restrictions.
Exactly.
forall
just lets you talk about thatT
and use it in code later on. But it doesn't change how things match.
Ahh, okay, similar to the idea of &block
vs yield
(purposely ignoring the block inlining for yield
for this analogy), where they both match the exact same way but the &block
version gives you a reference to the block itself but there's only one instantiation of the &block
overload?
Most helpful comment
The problem is that this is a bug where the problem is not the code. We need to define how we want it to work first.
Here's the code again:
The type of
z
is a union of:Gen(Gen(Int32))
Gen(Int32)
When the compiler tries to match that union type against
Gen(T)
it will do this:Gen(Gen(Int32))
. It matches againstGen(T)
.T
becomesGen(Int32)
.Gen(Int32)
. Try to match it againstGen(T)
. However, we saidT
isGen(Int32)
. So it's matching againstGen(Gen(Int32))
. No match.The only match is the first one,
Gen(Gen(Int32))
, whereT
isGen(Int32)
.Then the next overload is considered, and the entire union matches.
That's why we are seeing that behavior.
How it should work? Should
T
become the union of those types? But ifT
isGen(Int32) | Int32
it would mean we hadGen(Gen(Int32) | Int32)
in the first place, which is not the case.We basically need to dispatch the union type to two different instantiations of the same overload. But the code never does that in any place of the compiler.
So this is a huge change, if we want to implement it.
Or... other ideas of how this should match?