Language: Feature: Sound declaration-site variance

Created on 14 Aug 2019 · 20Comments · Source: dart-lang/language

This is the tracking issue for introducing a sound mechanism for declaration-site variance in Dart.

Background: The original request motivating this kind of feature is #213; the initial proposal for declaration-site invariance is #213. The initial proposal for the related feature known as use-site invariance is #229, and the corresponding tracking issue is #753.

The text below describes properties of this feature which are good candidates for being adopted. Many things can still change, and a full feature specification will be written and used to manage the discussions about the final design.

Variance in Dart Today

As of Dart 2.4 or earlier, every type variable declared for a generic class is considered covariant. The core meaning of this is that a parameterized type C<T2> is a subtype of C<T1> whenever T2 is a subtype of T1. Other subtype rules can then be used to show subtype relationships like List<int> <: Iterable<dynamic> and Map<String, String> <: Map<Object, Object> <: dynamic.

This type rule is not sound; that is, in order to maintain heap soundness it is necessary to check certain types dynamically. This means that a program with no compile-time errors can fail with a type error at run time.

For instance, with the declaration List<num> xs and some expression e with static type num, it is necessary to check during evaluation of xs.add(e) that the value of e actually has the type which is required by xs: It is possible that it is a List<int> or even a List<Never>, and it would then be a dynamic type error if the value of e is a double, even though the expression had no type errors at compile-time.

Unsound covariance enables many software designs that would be rejected by a traditional sound approach to variance (e.g., as in Java or C#). This allows developers to make a trade-off between more flexible types (e.g., a variable of type List<num> is allowed to refer to a List<int>) in return for accepting the potential dynamic type errors (a List<int> _will_ work safely under the type List<num> in a lot of ways, just not all).

We want to enable a sound typing discipline for variance as well (rejecting more programs, but providing a compile-time guarantee against the run-time type errors described above). This feature is concerned with the provision of support for that.

Declaration-site Variance

Declaration-site variance can be used to declare a strict and sound treatment of variance for each type variable of a generic class.

Syntactically, declaration-site variance consists in allowing each type parameter declaration of a generic class declaration to include one of the following modifiers: out, in, or inout. We say that such a type parameter _has explicit variance_.

The use of type parameters with explicit variance in the body of the enclosing class is restricted. It is a compile-time error for a type variable marked out to occur in a non-covariant position in the signature of a member declaration; and for a type variable marked in to occur in a non-contravariant position. For example:

abstract class Good<out X, in Y, inout Z> {
  X get m1;
  void set m2(Y value);
  Z m3<U extends Z>(List<Z> zs);
}

class Bad<out X, in Y> {
  Y get m1; // Error.
  void set m2(X value); // Error.
  Y m3<U1 extends X, U2 extends Y>(List<X> xs); // Error.
}

Here are some core properties of declaration-site variance:

We obtain the following subtype relationships: Let C be a generic class with one type parameter X. Assume that S is a subtype of T. If X is marked out then C<S> <: C<T>; if X is marked in then C<T> <: C<S>. _Note that there is no subtype relationship between C<S> and C<T> if X is marked inout, unless S == T._

A type parameter with explicit variance can be used in the specification of a superinterface. For example:

class A<out X, in Y, Z> {
  X get m;
}

class B<out X, inout Y, in Z> implements A<X, Y, Z> {}

Soundness is ensured via a number of rules like the following: It is a compile-time error if a type parameter X marked out occurs in a non-covariant position in an actual type argument for a superinterface D when the corresponding type parameter of D is marked out; and if X occurs in a non-contravariant position in an actual type argument for D when the corresponding type parameter is marked in; and if X occurs at all in an actual type argument for D when the corresponding type parameter is marked inout.

The interaction with unsoundly covariant type parameters is similarly guarded: It is a compile-time error if a type parameter X marked out occurs in a non-covariant position in an actual type argument for a superinterface D when the corresponding type parameter of D has no explicit variance.

In return for all these restrictions, we get soundness: For a class where all type parameters have explicit variance, every (non-dynamic) member access which is statically type correct is also dynamically safe (no type checks on parameter types etc. are needed at run time).

In general, declaration-site variance can be used for classes which are intended to be strictly type checked with respect to variance everywhere.

feature variance

Source

eernstg

👍21 ❤9 🚀6 🎉6

Most helpful comment

@leafpetersen : I find the word inout confusing. It's difficult to see how this word may carry the meaning of "exact".
To illustrate, let's compare 2 notations.
In scala,
+ means "covariant"
- means "contravariant"
= means "invariant"
These names (+ and -) are easy to explain: "plus" stands for "MORE specific" (hence the sign)
Minus stands for "LESS specific" (hence the sign). so in one case S>=T, in another S<=T, and clearly
if we have simultaneous relations S>=T and S<=T, then the only solution is S==T, hence the = sign in the third case.

But as soon as we start using names like "in" and "out", we immediately lose the intuition of the "direction of specificity", and cannot see the otherwise obvious fact that "in" and "out" simultaneously mean "invariant" - it may as well mean nothing at all (no restrictions). So, as @mockturtl correctly pointed out, my problem is with the wording. This is not my last problem, unfortunately.

Let's consider your example with addToList. What will our new program look like to prevent this bad outcome? Where "in", "out" and "inout" should be added? And what line in the code will be flagged statically after we do these changes?

(By the way, what combination of "in" and "out" characterizes the current default behavior in dart?)

tatumizer on 23 Dec 2019

👍3

All 20 comments

First draft of feature specification at PR #557.

eernstg on 2 Sep 2019

The "exact" type already exists in Dart in certain places - the type of a literal or object creation expression may have an exact type, which is why List<int> l = <num>[1]; is a compile-time error instead of a run-time downcast failure.
I assume those cases can be subsumed by exact types and still behave the same.

For use-site invariance, we will have both List<Foo> and List<exact Foo> as instantiations, with the latter being a subtype of the former. We don't actually introduce a nww type called exact Foo, just new syntactic forms for type arguments, and a suitably expanded type relation between instantiated generic types. We can say that we introduce exact Foo, but it's not a type, just a type pattern.

As stated, the type arguments can only be used in type pattern expressions (type annotations), not class expressions or literals.
That is, you can write List<exact Foo> x; as a type annotation, but not class C extends Iterable<exact int> or new List<exact int>(). Those make no sense, they are always "exact" in that they are run-time invocations with a single value.

The following should be allowed:

List<exact String> foo(List<exact int> list) => (list..add(42)).map((x) => "$x").toList();

It is meaningful and useful.

What about:

class C<T> {
  List<exact T> mkList() => <T>[];
}

Is this meaningful? Maybe. If you have a C<num> c, then you only know that the result of c.mkList is List<num>. If you have C<exact num> d; then the static type of d.mkList is (or should be) List<exact num>. If the code had not written exact T in the return type, then it would not have been true. The method:

  List<T> mkList() => <T>[];

will never have a static return type of List<exact anything>, not even for D<exact num>.mkList().

So, we need some substitution rules for type arguments to make this work.
exact T[exact S/T] -> exact S
exact T[S/T] -> S
T[exact S/T] -> S (?)

It also suggests that a lot of platform libraries should be changed to have exact in their return type, because they are really intended to be exact. Take: Set<T>'s Set<T> union(Set<T> other). Here we should make it Set<exact T> union(Set<T> other), so that if I have a Set<exact num> then the union returns another Set<exact num>.
(I just realized that I have written exact instead of exactly everywhere. I'm not particularly fond of all these keywords, but shorter seems better :)

That suggests a migration issue. If we change some platform libraries to return, say, List<exact String>, then any existing implementation of the same interface will no longer be valid when it returns merely List<String>, a super-type of the required return type. We may want to have an NNBD-like migration process where legacy libraries are accepted for a while, until everybody has migrated.

If we need migration anyway, then we don't need to make the syntax backwards compatible, and we could do something else, like:

class C<+T , -S, =U> {  // out T, in S, inout U
   List<+T> mkList() => <T>[]; // not exact T
   List<U> mkList() => <U>[]; // exact U
}

We do get the inherent complexity of a two-tier type system where it matters whether you write exact int or int on every type argument in your API. You have to get it right, otherwise you make lock yourself out of possible later changes.
If you say foo(List<exact num> arg) then you can't add elements to arg later, and you prevent anyone with merely a List<num> from calling you. They'll have to do list.cast<exactly num>() to convert their list.

I expect exact types in arguments to be rare (you really do need to do modification), and for them to be common in return types.

For declarations site variance, we already have something similar for function types. For those, the variance is automatically computable from the co-/contra-variance of the occurrences of the type arguments.
We could do the same for class parameters, but that would break all our existing unsafely covariant classes. We could do it anyway, and require migration which would effectively mean that all existing type arguments become covariant, and we write an explicit covariant on all existing contravariant occurrences (well, the ones in parameters, we can't help the ones in function return types).

lrhn on 3 Sep 2019

The "exact" type already exists

Right. The notion of an _exact type_ is not specified, but it is used in some cases by our tools (even to raise an error for certain "impossible" casts).

The notion of a _type argument which is known exactly_ is a different concept (an object can have dynamic type List<exactly T> even though it is an instance of some proper subtype of List). I just checked the text above to make sure that it doesn't ignore that distinction, and adjusted a couple of sentences.

For use-site invariance, we will have both List<Foo> and List<exact Foo> as instantiations

Both of those are instantiations of the generic type List (where instantiation of a generic type means providing actual type arguments), but we will not have an object whose dynamic type is List<Foo>. If the dynamic type T of a given object o is such that List<Foo> is one of the superinterfaces of T then o has type List<exactly Foo>. In general, every type argument in the dynamic type of an object is exactly something, and the superinterfaces will carry this property with them.

class A<X, Y> {}
class B<X> implements A<X, int> {}

With that, B<String>() has dynamic type B<exactly String>, and it is also of type A<exactly String, exactly int>.

We may or may not want to let Type.toString() reveal this bit, but it _must_ be present in the dynamic representation of types in order to maintain soundness.

we introduce exact Foo, but it's not a _type_, just a type pattern

I'd prefer to say that a type accepts type arguments, each of which may have the modifier exactly, which also implies that exactly Foo is not a type.

the type arguments can only be used in type pattern expressions (type annotations),
not class expressions or literals.

exactly can be used on type arguments, e.g., <List<exactly num>>[], but not on types, and we need to make the distinction that "type arguments" given prescriptively are types. So List<exactly num> is OK as a type annotation, but List<exactly num>() as an instance creation is not, and <exactly num>[] is not; but when exactly is nested one level deeper then it is again a type argument, which is the reason why <List<exactly num>>[] is OK.

class C<T> {
  List<exactly T> mkList() => <T>[];
}

That can be allowed, and would be safe, but the computation of the member access type must take into account that the statically known value of T is just an upper bound.

main() {
  C<num> c = C<int>();
  List<num> xs = c.mkList(); // Static type of `c.mkList()` is `List<num>`.
  List<exactly num> ys = c.mkList(); // Error (downcast).
}

Of course, with class C<inout T> we would have an uninterrupted chain of exactness, and c.mkList() would have static type List<exactly num>.

we need some substitution rules

I believe the most straightforward way to get this right is to consider the types of member accesses, with a case for each variance of the type parameters of the class. Type parameters with no variance modifier are the most complex ones, of course, because they are allowed to occur in any position in a member signature.

We may want to have an NNBD-like migration process where legacy libraries
are accepted for a while

Indeed.

we don't need to make the syntax backwards compatible

Right, but C<=X, -Y> x; may not be optimally readable (and in, out, inout isn't all that verbose). In any case, that's probably not more breaking than the keywords.

do get the inherent complexity of a two-tier type system where it matters
whether you write exact int or int on every type argument in your API

I think that wouldn't ideally be such a common situation: The declaration site variance modifiers are required to match the usage (so if you have a type parameter out E then it just can't be the type annotation of a method parameter), and a subtype receiver will have a subtype member (including: less specific parameter types and more specific return types).

The use of exactly in a member signature would be specifically concerned enhancing the type safety in the management of legacy types (with type parameters that have no variance modifiers).

I would expect the reasonable trade-off to be somewhere _between_ just using the legacy types as they are (with the current level of type safety, which has been used in practice for years without complete panic) and teasing out the very last drop of static type safety, by adding exactly as many places as possible. In any case, there are rules for eliminating exactly from interface type members such that the resulting typing is sound.

eernstg on 3 Sep 2019

I more like the +T and -T syntax in Scala

hepin1989 on 20 Dec 2019

I more like the +T and -T syntax in Scala

+1. Looks more "mathy". I think it's better aligned with the overall dart style. Not sure how they denote "in out" though. Is it +- ? Or = ?

tatumizer on 20 Dec 2019

The current in out looks like a primary student and verbose.
Kotlin and C# is using in and out too，but reads badly.

hepin1989 on 23 Dec 2019

Variance is a fascinating issue. One of those things that take 3 years to learn, and then someone asks you to do the dishes, and 20 minutes later, you are back to square 1 in your studies.
Reading stackoverflow discussions on variance is an entertaining pastime. There are 2 sorts of posters: simpletons (those who are wondering what the hell variance is) and experts (those kindly explaining it to them). One thing they have in common is that no one has the slightest clue. Rare comment from an expert comes without a whole sub-thread beneath it that openly calls out said expert as an ignorant imposter.

I suspect in this thread we have a couple of people who know what they are talking about :-)
@eernst: lets's look at your example:

class Bad<out X, in Y> {
  Y get m1; // Error.
  void set m2(X value); // Error.
  Y m3<U1 extends X, U2 extends Y>(List<X> xs); // Error.
}

Question 1:: designations "out" and "in" - who will benefit from them? Is it the AUTHOR of the Bad class, or the USER of said class? Sure, user will benefit, too, if the author makes fewer mistakes, but other than that?

Question 2: you seem to be leaning towards the use of "inout" as a synonym of "exact", but I have difficulty understanding the reasoning leading from "inout" to "exact". After all, according to your definitions, "in" means that the argument can occur in "in" position (whatever that means), and "out" means that it can occur in "out" positions (whatever that means), but "inout"... I would guess it can occur in both positions, but it's a very long way from saying "exact" b/c "inout" looks LESS restrictive than either of "in" and "out" individually, whereas "exact" is certainly something more restrictive.

tatumizer on 23 Dec 2019

Is it the AUTHOR of the Bad class, or the USER of said class? Sure, user will benefit, too, if the author makes fewer mistakes, but other than that?

The primary beneficiary is the USER. The List class is one of the primary examples of something that should be invariant, but is instead unsoundly covariant. We have numerous users who are frustrated that this results in unexpected runtime errors that could have been caught at compile time. Enclosed below is an innocuous looking test program that fails at runtime instead of at compile time. Sound variance allows users who strongly prefer to rule these errors out at compile time to do so.

void addToList<T>(List<T> target, List<T> src) {
  for(var x in src) {
    target.add(x);
  }
}

void test(List<num> nums) {
  addToList(nums, [3, 4, 5]);
}
void main() {
  List<double> doubles = [3.5];
  test(doubles);
}

leafpetersen on 23 Dec 2019

👍1

Question 2: you seem to be leaning towards the use of "inout" as a synonym of "exact", but I have difficulty understanding the reasoning leading from "inout" to "exact".

Reading between the lines of this question, I think you are missing the primary effect of this feature, which is that it changes the subtyping relation between types. Describing inout as exact is sensible because given class Invariant<inout X> ... we know that any variable of type Invariant<num> contains only objects that were allocated as Invariant<num> or a direct subclass thereof. Specifically, it will never contain an Invariant<int>, nor an Invariant<Object>. In this sense then, it contains objects whose generic parameters are exactly as described by the type. Hence exact. Does that help?

leafpetersen on 23 Dec 2019

Describing inout as exact is sensible

I think tatumizer is suggesting exact will make a better (easier to understand) syntax for invariance. (Compare: "let xy mean neither x, nor y.")

mockturtl on 23 Dec 2019

(By the way, what combination of "in" and "out" characterizes the current default behavior in dart?)

tatumizer on 23 Dec 2019

👍3

I find the _word_ inout confusing. It's difficult to see how this word may carry the meaning of "exact".

Ah, sorry, I misunderstood your question. Here's the intution:

a type variable labelled in may only be used to pass things in to methods
- void add (T x)
a type variable labelled out may only be used to pass things out of methods
- T get first
a type variable labelled inout may be used to pass things in to and out of methods
- T swap(T x)

Does that help?

(By the way, what combination of "in" and "out" characterizes the current default behavior in dart?)

Neither. Classes are allowed to use type variables everywhere in a method, which corresponds most closely to inout, but subtyping is covariant, which corresponds to out.

Let's consider your example with addToList. What will our new program look like to prevent this bad outcome? Where "in", "out" and "inout" should be added? And what line in the code will be flagged statically after we do these changes?

For completeness, here is the example rewritten to use an example of an invariant implementation of List. The call to test on the second line of main is statically rejected.

import 'dart:collection';

class Array<inout T> extends ListBase {
  List<T> _store;
  int get length => _store.length;
  void set length(int i) => _store.length = i;
  T operator[](int index) => _store[index];
  void operator[]=(int index, T value) => _store[index] = value;
  Array.fromList(List<T> this._store);
}

void addToList<T>(Array<T> target, Array<T> src) {
  for(var x in src) {
    target.add(x);
  }
}

void test(Array<num> nums) {
  Array<num> ints = Array.fromList([3, 4, 5]);
  addToList(nums, ints);
}
void main() {
  Array<double> doubles = Array.fromList([3.5]);
  test(doubles);
}

leafpetersen on 23 Dec 2019

@leafpetersen : I understand your example. At least I think I do.
Here's my interpretation of what's going on here (please correct me if I'm wrong).
The line class Array<inout T> says that the object returned by the constructor of Array has to be taken with exactly the same parameter type as this constructor got from its input, that is: <double>. In the first line of "main", we see that it's taken correctly. Then, when we try to pass it as Array<num>, the compiler looks into the definition of Array<inout T> and says: wait... we can't use this cast because the definition prohibits that! We can pass around this object only as Array<double>, not Array<num>!

So far so good. But not quite, and certainly not for everybody. Because I'm just lucky to personally know from this thread that 1) "inout" means "exact" (I would never guess so) 2) Due to your example I made a plausible conjecture that the declaration class Array<inout T> means: the object returned by the constructor of the class has to be taken with exactly the same type parameter as I passed on input. This conjecture is not trivial at all. Because in your 3-point explanation, as well as in every explanation I saw in stackoverflow, they were talking about "in" and "out" positions. Positions??? Where??? We don't have any "positions" here. We are talking about the constructor. My prior (wrong, and I knew it was wrong)) understanding was: no matter in which method of the class we see "T", it should be taken as is. This is certainly not true! Your definition has a method T operator[](int index) => _store[index];, but according to the class definition, we have to take T as is, so we cannot assign num x= array[0]. But I don't believe that. This simply can't be true!
Please confirm and explain how "positions" are relevant here at all.

Now - back to the question of why "inout" means "exact".

type variable labelled inout may be used to pass things in to and out of methods
T swap(T x)

Here's the problem. I can't see anything in this statement that says "exact". Probably there's way to hand-wave it into "exact", but why? Why not call "exact" exact by simply saying exact or, better yet, =. True, as soon as we say =, we will have to replace "in" with "+", and "out" with "-", but it's the change for the better anyway, no?

tatumizer on 24 Dec 2019

@tatumizer I think this is getting a bit far afield - variance is definitely a confusing subject, but I don't think we're going to fix that here. Is it fair to summarize your general take here as being that you find + - = more intuitive than out in inout? Or are you expressing a preference for something different than that?

leafpetersen on 24 Dec 2019

I don't feel to be in a position to express a definitive preference b/c I'm not sure in anything, but FWIW that would be a correct summary. This notation might be a better choice to help users understand the concept IMO.
Thanks for your patience, I appreciate it! :-)

tatumizer on 24 Dec 2019

repeat the in 、out and inout here and there make me looks like a fool; why not make use of inputParameter T outputParameterR?

hepin1989 on 24 Dec 2019

@leafpetersen I think the - and + style are more concise and the in and out style are more easy to understand.

hepin1989 on 24 Dec 2019

👍1

Having experienced this feature in Kotlin I _strongly_ recommend we never go there. It is completely impenetrable to most developers. If you need to know type algebra to be able to use a feature then that feature doesn't, IMHO, belong in Dart.

Hixie on 29 Jul 2020

@Hixie Just to be sure we're on the same page: Kotlin has two different kinds of variance - this kind (declaration-site) which is what C# uses, and which I don't usually think of as requiring a lot of type algebra and use-site variance which is what Java uses, and which does require introduce some serious type algebra. And the combining the two, yeah, that gets pretty wild pretty quick. Are you specifically commenting on declaration site variance (marking type parameters classes as usable as in, out, or inout), or on the Kotlin style combination?

leafpetersen on 29 Jul 2020

👍1

The declaration-site version.

Hixie on 4 Aug 2020

Was this page helpful?

0 / 5 - 0 ratings