Julia: Invariant Tuples

Created on 15 Nov 2017 · 36Comments · Source: JuliaLang/julia

A radical idea that came up in #24166 and on slack: Make Tuple invariant in its parameters.

Pro:

Makes Tuples less special/magical.
(Hopefully) reduces special cases in subtyping code.
Allows abandoning the "diagonal rule" that is useful is dispatch, but nasty in code working with (especially widening) types, ref #24582.
Would allow for e.g. Tuple{Union{Int,Missing}, Union{Int,Missing}} to be concrete. (Added based on comment below.)
...

Con:

Probably huge effort, needs to be well justified.
...

(Hoping to stir up a discussion that adds to these lists.)

Of course the current covariant behavior is useful more often than not, but the new InvariantTuple (for the sake of discussion) could be use to express it: Tuple{Number,Any} would become Tuple{<:Number, <:Any}. The diagonal rule would be trivial: Tuple{T,T} where T would become InvariantTuple{T,T} where T (but T would no longer be confined to concrete types), while Tuple{T,T,Ref{T}} where T (N.B. diagonal rule does not apply!) would become InvariantTuple{<:T,<:T,<:Ref{T}} where T.

Handling Varargs needs more thought: Tuple{Vararg{Any}} would need to be interpreted as InvariantTuple{<:Any,<:Any,...} while Tuple{Vararg{T}} where T would need to be interpreted as InvariantTuple{T,T,...} where T. Hence we would need two different syntaxes for InvariantTuple{Vararg} to distinguish these cases. Maybe InvariantTuple{Vararg{T} where T} for Tuple{Vararg{Any}} vs. InvariantTuple{Vararg{T}} where T for Tuple{Vararg{T}} where T?

I have no idea whether we actually want to go down this road eventually, but I feel like it at least deserves discussion. (CC @andyferris)

speculative types and dispatch

Source

martinholters

👍5 ❤2 🚀1

Most helpful comment

The more I think about this idea, the more I like it. I would make a couple of observations:

It would be more correctly described as "invariant tuple types" – nothing actually changes about tuple values, not their syntax or their type since they are already concrete.
It doesn't change dispatch at all either. What it would change is how you write the signature type of methods with abstract argument types, but methods would continue to have the same type.
It's strictly more expressive: every tuple type that can currently be written can still be written, but many types that cannot currently be written would be expressible.
It makes "invariant position" versus "covariant position" for type vars completely explicit: if T appears as P{<:T} that's covariant position; if it appears as P{T} that's invariant position.
It unifies the way all of Julia's parametric types work.

I also don't think the deprecation process would be too bad: people don't write tuple types all that much and we can deprecate abstract parameters without <: in front of them in Tuple. Then in 1.0 we can re-allow the <:-less syntax with invariant meaning.

StefanKarpinski on 20 Nov 2017

👍6

All 36 comments

Another potential advantage is that Tuple{Union{Int,Missing}, Union{Int,Missing}}would be a concrete type, which is currently thought to make it easier to handle missing data.

JeffBezanson on 15 Nov 2017

Hm, currently

julia> typeintersect(Tuple{T, Ref, T} where T, Tuple{T, Ref{T}, T} where T)
Tuple{T,Ref{T1},T} where T1 where T

which doesn't seem quite right. If I translate this to typeintersect(InvariantTuple{T, <:Ref, T} where T, InvariantTuple{<:T, <:Ref{T}, <:T} where T), I think the answer should be InvariantTuple{T, <:Ref{T}, T} where T *) which I don't think we can represent using our current Tuples. Is that so? That would be another good reason pro this change.

*) Incidentally, if I just define abstract type InvariantTuple{T1,T2,T3} end for this experiment, then

julia> typeintersect(InvariantTuple{T, <:Ref, T} where T, InvariantTuple{<:T, <:Ref{T}, <:T} where T)
InvariantTuple{T,#s11,T} where #s11<:Ref where T

which is also not quite correct...

martinholters on 15 Nov 2017

Tuple{Union{Int,Missing}, Union{Int,Missing}} would be a concrete type

It _could_ be a concrete type. Or we still impose the restriction that only Tuples with all fields concrete types can have instances. No idea whether that has any advantages, but it would be possible.

martinholters on 17 Nov 2017

@JeffBezanson yes, missing was part of my thinking on this one.

andyferris on 17 Nov 2017

I guess part of this thinking would be that the type of a method signature, e.g. f(a::A, b::B) would become Tuple{<:A, <:B} rather than just Tuple{A, B} in order to compensate for the invariance of tuple types? Otherwise this would completely wreak havoc on dispatch in a way that seems totally beyond the pale. But it does seems nice that all parametric types by invariant rather than having this one special case in the system. This would also not be that hard to fix since you would mostly just translate Tuple{A, B} to Tuple{<:A, <:B}.

StefanKarpinski on 17 Nov 2017

I guess part of this thinking would be that the type of a method signature, e.g. f(a::A, b::B) would become Tuple{<:A, <:B}

Yes. Although some way of opting in to it actually becoming Tuple{A, B} might also be useful, if you e.g. want to restrict T to a concrete type in f(x::T, y::T, z::Ref{T}) (i.e. so that x and y are ensured to be of the same type).

martinholters on 17 Nov 2017

I would imagine that the diagonal rule would still apply (although not in that case). Whatever the diagonal rule is expressed in terms of could perhaps be used explicitly for that behavior in that case.

StefanKarpinski on 17 Nov 2017

Yes, the signature for f(x::T1, y::T2) would be Tuple{::A, ::B) where {A <: T1, B <: T2}. Diagonal dispatch f(x::T, y::T) where T would still be a Tuple{T, T} where T <: ....

Usage of tuple(a, b) and f(a, b) would require using the Julia bindings a and b, which each have a concrete type (all bindings have a concrete type). To construct a tuple of abstract types, we could maybe use the Tuple constructor (perhaps like Tuple{Integer}(1)) and to invoke a function using abstract argument types we already have invoke, which dovetails pretty nicely here.

To me, it's interesting to note that the new type system is powerful enough that we no longer need covariant tuples to describe how dispatch behaves! :) The question then becomes: what do we actually gain from covariance at all?

andyferris on 18 Nov 2017

Also - I hope I'm not wrong, but I vaguely remember a comment once discussing that ccall uses a signature which is more like a list of types such as (Integer, Integer) than an actual Julia Tuple. I think this may remove any distinction?

andyferris on 18 Nov 2017

Finally, at JuliaCon it seemed uncertain even if we thought that suitable algorithms things like accurate type intersection for a multiple inheritance type system even exist - I would guess that simplifying the type system might help with forward progress (I'm hardly an expert but it seems logical). I suspect this might be a pretty large simplification of the existing subtyping algorithm, and will be easier for users to grasp.

andyferris on 18 Nov 2017

The more I think about this idea, the more I like it. I would make a couple of observations:

It would be more correctly described as "invariant tuple types" – nothing actually changes about tuple values, not their syntax or their type since they are already concrete.
It doesn't change dispatch at all either. What it would change is how you write the signature type of methods with abstract argument types, but methods would continue to have the same type.
It's strictly more expressive: every tuple type that can currently be written can still be written, but many types that cannot currently be written would be expressible.
It makes "invariant position" versus "covariant position" for type vars completely explicit: if T appears as P{<:T} that's covariant position; if it appears as P{T} that's invariant position.
It unifies the way all of Julia's parametric types work.

StefanKarpinski on 20 Nov 2017

👍6

A wrinkle would be -- if @andyferris's observation is right:

Yes, the signature for f(x::T1, y::T2) would be Tuple{::A, ::B) where {A <: T1, B <: T2}. Diagonal dispatch f(x::T, y::T) where T would still be a Tuple{T, T} where T <: ....

-- that the (mental) translation of a type signature to a tuple type gets complicated. Now the translation is easy:
f(x::T1, y::T2, z::T2) where T2 -> Tuple{T1, T2, T2} where T2 versus with this change:
f(x::T1, y::T2, z::T2) where T2 -> Tuple{A, T2, T2} where {A<:T1, T2}.

mauro3 on 20 Nov 2017

I don't understand enough this topic to be sure that this is related (apologies for the noise if it's not), but I wanted to raise a point about the way tuples work just in case it could be addressed in a tuple overhaul. The problem is related to how types are handled as function arguments: to get type-stable code, we often write a type signature with ::Type{T}. For example you can have f(x) = ...; f(::Type{Int}) = .... But this doesn't work if the arguments of f are wrapped in a tuple: f(x::Tuple{Int}) = ... is fine but f(Tuple{Type{Int}}) fails when called as f((Int,)): ERROR: MethodError: no method matching f(::Tuple{DataType}). Now if the type of Int (resp. (Int,)) could be modified to Type{Int} (resp. Tuple{Type{Int}}), then then call f((Int,)) would work, and together with tuple covariance, the call g((Int,)) would also work where g(::Tuple{DataType}). I can give (if asked) a motivating example for why this behavior can be useful (it would help me a lot while I try to extend the rand API).

rfourquet on 20 Nov 2017

👍1

@mauro3: yes, that's the tradeoff. But the flip side is that right now that complexity is baked into the rules of how Tuple behaves and it's implicit, automatic, unavoidable and kind of arbitrary. With invariant tuple types, you have to do a more complicated method signature to tuple type translation, but after that the complexity is out in plain sight, since the diagonal rule just falls out of a combination of that translation and tuple invariance. The general method translation rule is that f(x1::T1, ..., xn::Tn) has signature type Tuple{X1, ..., Xn} where {X1<:T1, ..., Xn<:Tn}. In the above cases, this looks less uniform than it is because we can simplify that type in non-diagonal cases:

f(x::T1, y::T2) ==> Tuple{A, B} where {A<:T1, B<:T2} ==> Tuple{<:T1, <:T2}
f(x::T,  y::T)  ==> Tuple{A, A} where {A<:T}

The fact that the "diagonal rule" just falls out of the combination of the translation rule and tuple invariance seems considerably easier to explain and justify than having an ad hoc rule.

StefanKarpinski on 20 Nov 2017

👍3

Actually that rule is a bit off: you need deduplicate the Xi<:Ti on the RHS and use the same Xi in the LHS where the same Ti appears in the method signature. That's pretty easy to explain in words but requires annoying double indexing to write out in math syntax, so I won't bother with that.

StefanKarpinski on 20 Nov 2017

I realize also that the change I wish (that typeof(Int) == Type{Int}, which could be questionable in itself, regarding specialization questions) is maybe not crucially dependent on tuple covariance, it could be enough to change a signature g(::Tuple{DataType}) into g(::Tuple{<:DataType}).

rfourquet on 20 Nov 2017

kind of arbitrary

Whatever the other merits of this proposal, I really don't think tuple covariance is arbitrary. It can be formally derived from the other properties of tuples.

f(x::T, y::T) ==> Tuple{A, A} where {A<:T}

Is T a type or a type variable here? The diagonal rule only involves variables, so I'd think they're variables, but then this is just renaming T to A.

@andyferris I don't think this is related to ccall; C doesn't have subtyping so covariance doesn't come up.

JeffBezanson on 20 Nov 2017

Whatever the other merits of this proposal, I really don't think tuple covariance is arbitrary.

I wasn't saying that tuple covariance was arbitrary, I was saying that the diagonal rule is arbitrary: it works the way it does because that way is practical, not as a consequence of something else. But it's very interesting that roughly this same rule also falls naturally out of invariant tuples.

StefanKarpinski on 20 Nov 2017

👍2

For testing the waters here, I have started to add a new built-in type InvariantTuple in a local branch where I plan to experiment with the subtyping rules. If that goes well, I'd try to encode the translation from Tuple to InvariantTuple, perform that during subtyping and get rid of any (other) Tuple handling code in subtyping. That would give an indicator whether it actually helps simplifying the subtyping code by making tuples invariant.

Concerning subtyping, everything without Varargs works out of the box (module any possibly existing bugs, of course). Now for Varargs, I tend to the following interpretation:

InvariantTuple{Vararg{T,2}} where T == InvariantTuple{T,T} where T
InvariantTuple{Vararg{T,2} where T} == InvariantTuple{T1,T2} where {T1,T2}

with the obvious extension when replacing 2 with N ... where N, where (like for Tuples), it doesn't matter whether the where N is inside the InvariantTuple or outside.

Opinions?

martinholters on 21 Nov 2017

Thank you so much for investigating, @martinholters! I think @JeffBezanson’s opinion on that will be most valuable but it all seems sane to me.

StefanKarpinski on 21 Nov 2017

To be clear, I think only the diagonal rule is the source of any complexity, and not covariance by itself. Invariance is possibly more difficult to handle. Vararg is also very complex.

The proposed rule for Vararg makes sense; it matches what we do now:

julia> Tuple{Int, Vararg{T} where T}
Tuple{Int64,Vararg{Any,N} where N}

julia> Tuple{Int, Vararg{T}} where T
Tuple{Int64,Vararg{T,N} where N} where T

One operation we'll need (and already need as long as named tuples stay invariant) is "promoting" types in this way: f(InvTuple{Int}, InvTuple{Void}) => InvTuple{Union{Int,Void}}. This would be used e.g. when collecting results in map. This could be promote_type, except in this context we might not want numeric promotion (and we don't currently do it). Or maybe we do want numeric promotion there?

JeffBezanson on 21 Nov 2017

Do you not agree that the diagonal rule somewhat "falls out of" invariance though? Or is that claiming too much?

StefanKarpinski on 21 Nov 2017

I think it's claiming a bit too much. The whole diagonal thing has two parts:

Identifying which variables are diagonal.

In Tuple{T, Ref{T}} where T, the T is not diagonal. As a result, Tuple{Int, Ref{Any}} <: Tuple{T, Ref{T}} where T holds. If Tuple becomes invariant, that no longer works and something else has to be done. We would probably split variables into invariant and covariant uses, so this type would become Tuple{S, Ref{T}} where S<:T where T. That works, but the right thing didn't fall out of invariance; we had to do extra work. Inserting this transformation in various places adds complexity. We could use it now as a way of implementing tuple covariance, keeping the user-facing Tuple covariant. But if tuple invariance is exposed to the user then everybody has to know about this transformation and sometimes apply it manually.

Constraining diagonal variables to concrete types.

Tuple{T,T} where T currently expresses "a tuple of two things of the same type". With invariant tuples, we wouldn't be able to express that since Tuple{Any,Any} is a subtype of it but might contain two values of different types. It works for dispatch, since dispatch would never internally use a type like Tuple{Any,Any} as the type of actual arguments, but not in general. Of course, we might be ok with giving up this type as long as dispatch does what we want.

JeffBezanson on 22 Nov 2017

@JeffBezanson

It works for dispatch, since dispatch would never internally use a type like Tuple{Any,Any} as the type of actual arguments

Wouldn't that break sorting of methods according to specificity?

ulysses4ever on 24 Nov 2017

I would imagine this would be expressed the same way: InvariantTuple{T,T} where T. Except that instead of being a tuple of two things of the same type because of the diagonal rule, it would have that same meaning because it would only describe objects x such that typeof(x) == InvariantTuple{T,T} for some T, which can only happen when T is a concrete type and therefore when x is a pair of objects with that concrete type. That's what I meant by the diagonal rule "falling out of" invariant tuples.

At this point we're not likely to change this for 0.7, so I have only one observation here:

If we were to introduce an invariant tuple type in the 1.x timeframe (and maybe even give it nice syntax like {T1, T2} for those types, but that doesn't really matter), is there anything that's relatively niche or arbitrary about the way tuples and dispatch work now that would be easier to express in terms of that invariant tuple type as a fundamental construct than in the current design? If so, then this could serve as a differentiating perspective for deciding how such things should work.

Tuple{T,T} where T currently expresses "a tuple of two things of the same type".

StefanKarpinski on 26 Nov 2017

In Tuple{T, Ref{T}} where T, the T is not diagonal.

With invariant tuples, I would think that the corresponding type would be diagonal. I.e. InvariantTuple{T, Ref{T}} where T would contain all x such that typeof(x) == InvariantTuple{T, Ref{T}} for some T so that would "naturally" force T to be concrete since it's the actual type of the first component of x. So yes, if we kept the same behavior for dispatch, the signature f(::T, ::Ref{T}) where T would have to be lowered to InvariantTuple{S, Ref{T}} where {T, S<:T}, but the other option (and I realize this could be disruptive) would be to change the meaning of f(::T, ::Ref{T}) where T so that it corresponds to InvariantTuple{T, Ref{T}} where T. After all the choice that T is not diagonal in such signatures is arbitrary, but the invariant tuple view makes one of the choices more natural. Perhaps that's too inconvenient or too breaking, but that's the sort of thing I was thinking.

StefanKarpinski on 26 Nov 2017

I've started playing with a Tuple to InvTuple transformation algorithm and it gets surprisingly tricky even for simple cases. Consider this: Only looking at tuples of length one, I thought Tuple{X} [where ...] could always be rewritten as InvTuple{T} where T<:X [where ...], no matter what X is (maybe unless it's a Tuple itself). That then gives these two rewrites:

Tuple{Ref{S}} where S<:Number -> InvTuple{T} where T<:Ref{S} where S<:Number
Tuple{Ref{S} where S<:Number} -> InvTuple{T} where T<:(Ref{S} where S<:Number)

But (simply with abstract type InvTuple{T} end):

julia> (Tuple{Ref{S}} where S<:Number) == Tuple{Ref{S} where S<:Number}
true

julia> (InvTuple{T} where T<:Ref{S} where S<:Number) == (InvTuple{T} where T<:(Ref{S} where S<:Number))
false

My current thinking is that it should be valid to unwrap_unionall the Tuple parameters and then to rewrap the resulting InvTuple (with special handling of Varargs). Or am I missing something?

martinholters on 27 Nov 2017

Yes, that's correct (the type system tries to do that internally too, and occasionally gets wrong answers where it fails to do so).

vtjnash on 27 Nov 2017

Mapping the diagonal rule to InvTuples is also surprisingly tricky. First, because Tuple{Union{}} <: Tuple{Int}, we need to map Tuple{Int} to InvTuple{<:Int}, even though Int is a leaf type. It follows that Tuple{Int,Int} maps to InvTuple{<:Int,<:Int}. But of course

julia> InvTuple{Union{},Int} <: InvTuple{<:Int,<:Int}
true # sure

julia> InvTuple{Union{},Int} <: InvTuple{T,T} where T
false # no surprise given invariance

julia> InvTuple{<:Int,<:Int} <: InvTuple{T,T} where T
false # has to follow to maintain transitivity

So either

Tuple{T,T} where T cannot map to InvTuple{T,T} where T
InvTuple would need to have exceptions to its invariance in cases involving Union{}
map Tuple{Int} to InvTuple{Int} even though it leads to a difference when subtyping against Tuple{Union{}}/InvTuple{Union{}} or
something else for which I lack the imagination.

I'm not a fan of 2. because it would defeat the objective of fewer special cases/easier reasoning to some extent. Considering, in theory, mapping Tuple{T,T} where T to Union{InvTuple{T,T}, InvTuple{Union{}, T}, InvTuple{T, Union{}}, InvTuple{Union{}, Union{}}} where T should work, but that approach leads to exponentially large Unions and doesn't work in practice because e.g.

julia> Ref{<:Int} == Union{Ref{Union{}}, Ref{Int}}
false

although we would need that to give true (I think there is an issue, or at least a comment, about this somewhere already).

is simplest implementation-wise, but I wonder what the implications are if Tuple{Union{}} is no longer a subtype of Tuple{Int}...

This leaves me with 4.---so: ideas anyone?

martinholters on 6 Dec 2017

map Tuple{Int} to InvTuple{Int} even though it leads to a difference when subtyping against Tuple{Union{}}/InvTuple{Union{}}

Isn't this one of the areas where covariance of tuples is kind of annoying? E.g. the whole ambiguous_bottom=false business for the Base.isambiguous function.

StefanKarpinski on 6 Dec 2017

Hm, another argument in favor of 3 is that typeof((1.0,)) being mapped to InvTuple{<:Float64} instead of InvTuple{Float64} is a bit weird. After all, we know that element is a Float64, not a non-existing instance of Union{}.

But that might also mean that the idea of a gentle transition by first doing the Tuple to InvTuple translation internally only is misguided.

martinholters on 6 Dec 2017

👍1

Yes, unfortunately I don't think a gentle transition is really possible. The invariant tuple approach seems to lead to some fundamental difference, which is actually precisely what's potentially appealing about it – that it might lead to different and possibly better semantics.

StefanKarpinski on 6 Dec 2017

I think the point of this issue is precisely to change typeof((1.0,)) to InvTuple{Float64}, such that InvTuple{Union{}} ceases to be a subtype of it. So 3 sounds right, but possibly this would be context-sensative? For example, in the expression Tuple{T} (where T isnota TypeVar), it might always mean Tuple{<:T}, but in (1.0,), we always would construct InvTuple{T} and in function (f::T)() end, it would depend on whether T was isconcrete?

vtjnash on 26 Feb 2018

At this point, I'm pretty sure that if we do something like this, it will be breaking, so post-1.x. And if it is breaking anyway, we can go even further. And one thing I have been pondering is to add e.g. x:::T to mean typeof(x) == T (ignoring subtleties if x is a type for simplicity). Then we could e.g. have

foo(x::T, y::AbstractVector{T}) = ...  # same dispatch behavior as now
bar(x:::T, y::AbstractVector{T}) = ... # T has to be the concrete type of x

This would mainly affect dispatch using the diagonal rule which would need to be written with :::T.

martinholters on 26 Feb 2018

❤2

EDIT: Reducing my own noise
What are the drawbacks of introducing x<:T in argument lists as replacement for x::T such that the latter can become exact typing (= types are equal)? In my opinion, while this would be breaking (which the whole thing is anyway), it feels very consistent.
The only problem from the design perspective I currently have no solution for is the following:
When f(x<:T) where T is called then x isa T holds but x<:T doesn't.

EDIT2:
In accordance to parametric types it'd be fancy if f(x>:T) would also work.

rapus95 on 18 Oct 2019

I've just got the following idea:

| meaning | current | new | equivalent expression |
|-|-|-|-|
| instance of type or subtype | a::T | a<::T | typeof(a) <: T |
| subtype | <: | <: | clear as it doesn't change |
| direct instance of type | a::T | a::T | typeof(a) == T |

EDIT It's just that I prefer those operators over the combination :: & ::: due to increased clarity EDITEND

It makes it clear that both subtyping aswell as instance checking is part of the check.

Currently that produces a syntax error in the case with a space

julia> 5 <:: Integer
ERROR: syntax: space not allowed after ":" used for quoting

and a type error due to binding to the "wrong" function in the case without space

julia> 5<::Integer
ERROR: TypeError: in <:, expected Type, got Int64

As symbols and Expr instances behind the subtyping operator <: will never make sense, this even won't introduce ambiguities.

diagonal dispatch: f(x::T, y::T) where T
ordinary dispatch: f(x<::T, y<::T) where T

It also could be extended towards parameters:
struct MyType{N::Int, S<::Integer, T::DataType}

rapus95 on 10 Mar 2020

❤1

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Dates.format regression on master: Width of milliseconds field cannot be fixed anymore

helgee · 3Comments

Help regression for Rational in 0.6

omus · 3Comments

Literal NaNs in ASTs

Keno · 3Comments

+(x::T,y::T) where {T} not parsing correctly

musm · 3Comments

The devdocs page for functions is outdated

ararslan · 3Comments