Julia: Feature request: Add syntactic sugar for covariant actual type parameters

Created on 27 May 2014  Â·  50Comments  Â·  Source: JuliaLang/julia

As per discussion in https://groups.google.com/forum/#!topic/julia-users/alavN8tRdyI :
Let e.g.

Array{<:Real}

act like RArray given by

typealias RArray{T<:Real} Array{T}

This convenient shortcut should go a long way in those cases where people would want covariant types.

The type parameter of Array{<:Real} might become anonymous or hidden; it probably doesn't matter so much since you would rarely re-provide it, e.g. Array{<:Real}{Int}.

julep

Most helpful comment

+1 for just supporting f(x::Vector{<:Real}) and similar simple non-nested cases and punting on everything else. All of the -> proposals seem like a cure worse than the disease here.

We don't need a sugar that covers all possible cases, since in more complicated cases you can always fall back onto the current syntax f{T<:Real}(x::Vector{T}) or define a typealias.

All 50 comments

I agree something like this is necessary. This could also be addressed by triangular dispatch (#3766).

The main motivation is to avoid having to explicitly parametrize methods in many cases (and also avoid having to create a lot of type aliases). Like

f(a::Array{<:Real}, b::Array{<:Real}) = a+b

instead of

f{s<:Real,T<:Real}(a::Array{S}, b::Array{T}) = a+b

or

typealias RArray{T<:Real} Array{T}
f(a::RArray, b::RArray) = a+b

How would triangular dispatch help in this case? Or are you thinking about some other kind of case?

Ah, I see your point.

I was thinking of occasions where you have two or more nested types:

immutable Foo{S <: AbstractArray}
   data::S
end

and then I want to dispatch on real arrays. At the moment, I don't think it's possible to dispatch directly. You could define

typealias RArray{T<:Real} Array{T}

bar{S<:RArray}(a::Foo{S}) = ...

Alternatively, if we had triangular dispatch

bar{T<:Real,S<:Array{T}}(a::Foo{S}) = ...

or, with your proposal

bar(a::Foo{<:Array{<:Real}}) = ...

Ok, that should be useful way to use this idea as well.

+1

This makes a lot of function declarations more concise.

+1 I would like to have this.

I believe a bit of theoretical work is needed to figure out what <:Real actually is. Is it a type? Does it make sense to write isa(x, <:Real)? Is there an implied "forall" quantifier somewhere, and if so, where does it go in T{S{R{<:Real}}}? How does it relate to partially-specialized types like Array and Array{Int}? And so on.

Given the current semantics of covariant parametric types, I agree that this is really useful – but, as @JeffBezanson points out, not entirely clear yet. Basically, it allows you to write the covariant parametric type as Array{<:Real} while Array{Real} is an invariant parametric type. Note also, that Array{<:Real} is abstract, while Array{Real} is concrete. But since Array{<:Real} is the more common case, it makes me wonder if it wouldn't be better for Array{Real} to be how you write the abstract, covariant parametric type and have a less convenient syntax for the concrete, invariant parametric type. Something like Array{=Real} or even Array[Real]. This wouldn't be backwards compatible, but I suspect that switching to covariant parametric types would be surprisingly non-disruptive.

function foo(x::Vector{Real}) end works, it just requires an argument of type Vector{Real} (which is a concrete type, you can create one with e.g. Real[1, 2.5]). I think that the question is more about what you commonly want (in this case, most likely the covariant type).

@sunetos Int is a subtype of Real, so your first foo works as expected. However, both Vector{Int} and Vector{Real} are concrete types, and neither is a subtype of the other, so the second foo doesn't work as you expect. I don't see any inconsistency here. I agree that this behavior may surprise some people though.

When parametric types are involved, things are more complicated. Making parametric types covariant by default may introduce other subtle problems.

@sunetos, I find it frustrating when you speak on behalf of all "newcomers". Could you please stick to claims about your own beliefs rather than making assertions about the beliefs of all people? You don't represent me and I find it quite upsetting when you assert that you do.

Please don't bow out. Your opinion is really valuable. I spoke up because I think it's important to keep in mind that none of us actually know how the modal newcomer will behave since we've never run any experiments. Right now, we're all just speaking from strong personal beliefs.

@sunetos I am sympathetic to the proposal of having parametric types covariant by default, and believe that it may substantially improve the experience of many users.

However, changing the default behavior from invariant to covariant is a major change to the language, which may potentially complicate type inference and method resolution. This needs extensive discussion.

FWIW, I remember bumping into the "Array{Real} is a concrete type" issue at some point and having to take a while to realize what was going on (I may have even posted in julia-users about it). So, like @sunetos, this was something I didn't understand at first. I think making Array{Real} covariant by default would be sensible and probably not too disruptive.

-Jacob

I also found quite unintuitive the current Array{T} behavior at first, though now I understand why it works that way. I think defaulting to covariant behavior may well reduce entry costs and the questions from newcomers on the mailing list (even if as @johnmyleswhite said it's always pretty hard to know this population in advance), without creating real problems in practice. Cases where you do not want this behavior are probably quite rare and would only affect people with advanced understanding of Julia.

I see the arguments in favor of covariance, but you do trade one surprise for another:

function f(x::Vector{Real})
    x[1] = 1.5
    ...

That code does not work for all arguments.

Another "new surprise" is that Array(Any, n) has to give an Array{=Any,1}, and you'd have to learn to insert the = when reasoning about the result type of the Array constructor.

@JeffBezanson's argument is spot on.

When you want to input an array, you probably feel that covariance makes things convenient. However, when you want to supply a collection to receive output results, covariance is probably not what you want (you may like contravariance here)

I also find I have a psychological bias regarding numeric arrays --- it is very tempting to think of a Vector{Int} as a kind of "vector of real numbers". However in other domains you might commonly have a Vector{AbstractGraph}, where subtypes like Vector{EmptyGraph} are mostly useless.

I didn't understand that example. In languages with concrete type inheritance, polymorphism, etc., you would specify the type that has something akin to the "least common denominator" of required behaviors. So if your function tried to do something that would only work on floats (like the example of assigning 1.5), then you would specify the argument as ::Vector{FloatingPoint}, not ::Vector{Real}, because clearly the function requires floats. If the function is generic enough to never require floats, but would work with any numeric type, then you would use ::Vector{Real} or ::Vector{Number}.

Let's consider an example outside numerical computation:

# an action can operate on any instance of type T
type Action{T}  ... end
apply_action{T}(act::Action{T}, x::T) = ...

function apply_action(A::Vector{Float64}, act::Action{Float64})
    for x in A
        apply_action(act, x)
    end
end

In this case, we do wish to have _contravariant_ behavior, which allows that we supply instances of Action{Real} or Action{Any}, which can also operate on float64 numbers.

It is true when we consider arrays, covariance might feel convenient. However, there are plenty of cases where the opposite behavior, namely contravariance, is preferred.

_C#_ is a language that puts a lot of thoughts in this issues, which allows to specify whether each type parameter is covariant, invariant, or contravariant.

(see http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science))

The point of the example is that writing to arrays asks for contravariance, in which case you need to explain to people why you have to write Vector{FloatingPoint} instead of Vector{Real}. No one choice is "obviously intuitive to all beginners".

In spite of the debate about default behaviors, I think it is useful to provide syntactic sugar for people to express covariance when they need it.

My feeling is that Array{<:Real} is a good way to go, where <: indicates covariance, and the part <:Real should not be considered as a standalone type. The type Array{<:Real} should be considered as abstract, and we have Array{T} <: Array{<:Real} whenever T <: Real.

Ok, then I see what you mean.

Yes, I think what @lindahua described would be a good way forward.

Also extends to Array{>:Float64} for contravariance of course.

Ah, yes, the ever-angry supertype of operator.

IMHO <: and >: sound really good and easy to understand.

A contravariant version would defintely be cool and useful too, but that
would be a new type system feature and not just syntactic sugar, right?

I agree that sanest way forward would be to keep invariance as the default.
It will most probably cause more confusion initially, but much less so in
the long run.

It is pretty nifty to have the option open in the future for implementing <: contravariance.

I would also favour to have invariant behaviour as default and allow for this nice syntax for easy covariant behaviour. One thing that made me pause and think for a second, and that should perhaps be included in the manual, is that if you just use Tuples as containers, then a function definition with an argument of type NTuple{N,Real} would actually match an argument (1,2,3), whereas Vector{Real} would not match with [1,2,3]. I actually started using tuples in several cases where I wanted a simple immutable linear container with covariant behaviour (e.g. as field in a new type), and in many cases the additional length information was actually nice and could also be included as parameter in my newly defined type, often with reversed order, e.g.:

I just replaced the original example, which was kind of too simple, with something more complicated: suppose you want a type that stores a list of matrices, where the precise type of matrix is not really relevant but the element type is. You could then define

type MyType{T,N}
   container::NTuple{N,AbstractMatrix{T}}
   other
end

This way, container would match with both a homogeneous tuple of Matrices, with the same element type, or with a heterogeneous list of different types of matrices, all with the same element type. Unfortunately, you would not be able to replace the original matrices. Doing this with container::Vector{AbstractMatrix{T}} requires more code, because you need an additional constructor like

function MyType{TT<:AbstractMatrix}(container::Vector{TT},other)
# check that all matrices have same element type, then convert container to a Vector{AbstractMatrix}

I propose the syntax

{T<:Real}->Vector{T}
etc.

for (in pseudo-julia):

Union({ Vector{T} forall T <: Real }...)

This syntax is currently an error, so it's available.

This provides a canonical syntax for all parameterized types, including diagonal and triangular cases.

This syntax is needed to clarify the scope of quantifiers. For example, Array{Array{<:Real}} is ambiguous, because it might mean {T<:Real}->Array{Array{T}} (all arrays of arrays of some Real) or Array{{T<:Real}->Array{T}} (an array of various types of Real arrays).
EDIT: The syntax Array{<:Real} could still be supported, with the understanding that it means {_<:Real}->Array{_}, with the quantifier moved outside of just one type.

Triangular constraints are naturally supported by nesting; each forall type will introduce only one variable. For example

{T<:Real, A<:AbstractVector{T}}->MyType{T,A}

will desugar to

{T<:Real}->{A<:AbstractVector{T}}->MyType{T,A}

typealias would be a synonym for a const assignment to one of these. That keyword should perhaps be deprecated. All top-level types with parameters would be bound to these kinds of types as well:

Array === {T}->{N}->(array type with parameters T,N)

In method signatures, outermost forall types yield parameters that will be passed in; inner forall types do not. This lets you express new things, such as

function foo(t::{T}->(T,T))

which accepts a tuple of two values of the same type, without any method parameters.

Since it has been requested that someone start bikeshedding this issue, I'm going to wonder if the -> part of this proposal should be spelled --> under the consideration that we used to and might again use that for function types. Then again, this isn't really a function type – it's a type constructor.

I think some more examples would help. If I am reading correctly you are proposing, for example,

function f(x::{T<:Real}->Vector{T})

instead of

function f{T<:Real}(x::Vector{T})

whereas above it was proposed

function f(x::Vector{<:Real})

If I have this part correct, I am in favor of the last version over the first. I realize that your proposal covers more ground then just the covariance issue (e.g., triangular dispatch) and it may be worth it for that but it would be a shame to lose the simpler case.

I was intending to keep the current method parameter syntax, but

function f{T<:Real}(x::Vector{T})

would imply a method type signature of {T<:Real}->(Vector{T},). Currently there is _no way_ to write this type in isolation, which is a pretty major problem.

The other syntax, f(x::{T<:Real}->Vector{T}), means something totally different which cannot currently be expressed at all, namely the type ({T<:Real}->Vector{T},). However since tuples are covariant this particular case is not such a big deal.

Agreed, +1 to keeping the simpler, immediate Vector{<:Real} syntax. As Jeff notes above, this cannot be the complete syntax as it is ambiguous with nested parameters, but it could be supported as a sugar for making the innermost type covariant (see his EDIT). I think that it would probably cover almost all use-cases, so the more verbose syntax would only be used when it is really needed.

At first I balked at the -> punning against anonymous functions. But I think it actually makes sense. You are defining a temporary binding (often with some restrictions) on the LHS that is then used in the RHS. In that regard, I don't see a pressing need to deprecate typealias: it is a convenient syntax for named types, whereas {}-> is the 'anonymous' syntax.

While not immediately applicable, it might be interesting to think about how this would interact with function types. For example a add(t::{T<:Real}->(T,T)) probably returns something of type T. I could see its type being written as: {T<:Real}->((T,T) --> T). That's probably how you might also write the type for the parametric add{T<:Real}(t::(T,T)). Is that ok? Were the covariant type-constructor and function type symbols the same, it would probably be even more confusing.

I like Jeff's proposal, even though it may be a bit more verbose. It unifies the various ways of expressing single, union, and parametric types (which are another form of union types) in the system to one convention which I feel is a good thing. @mbauman brings up a good point about function return types. Last I read the proposal was:

function add{T<:Real})(a::T, b::T)::T
end

for a type stable add of two objects. How would this new syntax interact with function return types?

function add(a::{T<:Real}->T, b::{T<:Real}->T)::{T<:Real}->T
end

seems a bit verbose, although with typealiases this might be less of a concern.

That last example strikes me as equivalent to

function add(a::Real, b::Real)::Real

since each argument (and the return value) has its own quantifier.

@mbauman is right that function types with a-->b could be combined with this, and are totally orthogonal. The syntax function f(a)::Real is also orthogonal, and not a function type, just a declarative way to ask for the return value to be converted.

Again, this proposal does not require any changes to method definition syntax. It just provides a way to write the implied types by themselves.

Ahh I see, thanks for the clarification. Being able to naturally express function types is a nice benefit.

Continuing the discussion about things that don't exist yet, how do you see this proposal coexisting with abstract multiple inheritance?

I wonder if https://github.com/JuliaLang/julia/issues/4859 could be addressed in the new syntax proposal (providing default type parameter values). (if indeed it's a good idea....)

The bad news is that this is closer to being _incompatible_ with the idea of default type parameters. But the good news is that #1470, adding a construct function for constructors, will allow defining constructors for partial types, so you could define a constructor for Dict{K,V} that inserts a third type parameter. I think this will resolve the issue satisfactorily.

I got to this topic, as I need things like this nearly everywhere in my code.

It might not be very constructive what I've to say, but this is one of the most important features in Julia, so I think it's extremely important that there are several iterations for the syntax to make all people feel comfortable with it.

As much as I like JeffBezanson's proposal, I didn't grasp the syntax right away.
It reminded me of my first horrors with Scala, as they make it incredibly easy to disguise the type of an argument and make things nearly unreadable. (Horrors I never had with Julia)
I must admit, that I don't have a killer proposal, as JeffBezanson's proposal is already pretty damn fine.
But I'll try to verbalize my unconscious rejection of the proposed syntax.

  • First problem:

    • I think one parses a type top down, meaning, one would like to start with MyType, and then get to know the parameters, and parameters of parameters.

  • Second:

    • To many Curly Braces and other symbols in one line.

  • Third:

    • It's not really perceived as one item, but rather as a chain of items.

      This is confusing, as you're defining only one item, namely the type.

Some proposals I quickly came up with:

# As a reference, the original proposal
::{T<:Real}->{A<:AbstractVector{T}} -> MyType{T,A}

# Leaving away curly braces and wrapping everything in brackets, to make it feel more like a type
::(T<:Real -> A<:AbstractVector{T} -> MyType{T,A})

# Introducing new parameters on the fly. (Maybe not easy to parse, or ambiguous?)
# Also you could define ambiguous parameters, with more than one type.
::MyType{T<:Real, A <: AbstractVector{T}}, ::SecondType{T, F <: FloatingPoint}

# That's already it (Maybe I can come up with more) 
# To compare them better visually, here are all of them side by side:
::{T<:Real} -> {A <: AbstractVector{T}} -> MyType{T,A}
::(T<:Real -> A <: AbstractVector{T} -> MyType{T,A})
::MyType{T <: Real, A <: AbstractVector{T}}

The first thing that I notice is, that putting MyType in the first place is a lot easier on the eye, as you immediately know the type everything is about, which helps to put things into context....

I hear you on the excessive amount of punctuation. However one problem we
really need to solve is clarifying what parts of a type parameters are
bound in. I agree starting with the type name is easier to read, but as far
as I can tell this makes it impossible to solve this problem. We could have
some rule about how far "out" type variables apply, but in more complex
cases this would be even harder to understand.

~2 years later, what are the thoughts on this now?
(Given as @IainNZ points out, I just made a duplicate proposal: #16086)

Still wanted. Part of the type revamp issue: https://github.com/JuliaLang/julia/issues/8974

+1 for just supporting f(x::Vector{<:Real}) and similar simple non-nested cases and punting on everything else. All of the -> proposals seem like a cure worse than the disease here.

We don't need a sugar that covers all possible cases, since in more complicated cases you can always fall back onto the current syntax f{T<:Real}(x::Vector{T}) or define a typealias.

Should this be added to the 1.0 milestone?

With the new where syntax, Array{<:Number, 3} can just be sugar for Array{T,3} where T<:Number.

Someone might have already mentioned this but I think an issue for this syntax is that it is unclear where the where should be. Array{Array{T where T<:Number,3},3}, Array{Array{T,3} where T<:Number,3}, Array{Array{T,3},3} where T<:Number all have different meanings.

It seems pretty clear to me that the where should go with the immediately enclosing type. e.g. Array{Array{<:Number,3}} should be Array{Array{T,3} where T<:Number}, surely? Anything else would be a messy context-dependent meaning, no?

You can always use where explicitly if you want something else.

Just pushed a PR. This is pretty simple to implement given the new where machinery.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

omus picture omus  Â·  3Comments

felixrehren picture felixrehren  Â·  3Comments

StefanKarpinski picture StefanKarpinski  Â·  3Comments

m-j-w picture m-j-w  Â·  3Comments

ararslan picture ararslan  Â·  3Comments