Language: NNBD support for generic functions where the result nullability depends on if an optional parameter is passed or not

Created on 12 Feb 2020  ·  20 comments  ·  Source: dart-lang/language

A common pattern in Dart is to have an "orElse" callback on functions that need fallback behavior.

A concrete example is Iterable.firstWhere:

final value = [1, 2, 3].firstWhere((i) => <condition>, orElse: () => <fallback>);

Now, this is not specific to Iterable.firstWhere and is very common to Dart in general.\
NNBD doesn't cause any issue with Iterable.firstWhere specifically, as by default, it will throw if it needs a fallback but no orElse was provided.

The problem comes with a variant of that pattern, where instead of throwing, the default behavior of their orElse would be: () => null.

A typical implementation would be:

T doSomething<T>({T orElse()}) {
  // some logic

  if (orElse != null) {
    return orElse();
  } else {
    return null;
  }
}

The issue with that variant is that it is impossible to migrate to NNBD without some inconvenient change.

This pattern gets stuck between two un-ideal solutions:

  • either make orElse required:
    T doSomething<T>({required T orElse()});

This is not ideal because it creates a lot of duplicate code.\
We suddenly have to add tons of orElse: () => null.

  • or make the return type always nullable:

    T? doSomething<T>({T orElse()});
    

    This is not ideal either because if an orElse is specified, then the result may effectively never be null, but it will still be considered nullable because of a language limitation.

So in the end, with NNBD enabled we either have pointless boilerplate or an incorrectly inferred type.
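In practice this often pushes APIs toward a third option: duplicating the function. A minimal sketch (the names `doSomething`/`doSomethingOrNull` are hypothetical, for illustration only):

```dart
// One variant with a required fallback and a non-nullable result,
// and a second `...OrNull` variant with a nullable result.
T doSomething<T>({required T Function() orElse}) {
  // ... some logic that may fail to produce a value ...
  return orElse();
}

T? doSomethingOrNull<T>() {
  // ... the same logic, falling back to null ...
  return null;
}

void main() {
  int a = doSomething<int>(orElse: () => -1); // non-nullable result
  int? b = doSomethingOrNull<int>();          // nullable result
  print([a, b]); // [-1, null]
}
```

This type-checks under null safety, but every such API now ships two names for one operation.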

request

All 20 comments

One way to think about this is that we'd want two function types of the form X Function<X>({required X orElse()}) and X? Function<X>(), where the first one ensures that there is a way to get an X, and the second one does not provide this feature and consequently may return null.

In the first type we probably don't want to specify that the type argument is non-nullable, that is, X extends Object. If the type argument is nullable then the second function type could be used, but if we really want to pass an orElse argument and use a nullable type then it seems a bit too strict to make it an error.

But with two distinct types that we can't immediately merge to something that has all the desired properties, maybe they are just two different functions/methods?

One way to bridge the gap would be to use a case function (of course, we don't have case functions, yet):

X? doSomething<X>({X orElse()}) {
  X? case() => ...
  X case({required X orElse()}) => ...
}

This would allow call sites with sufficient type information to choose either case and get the desired return type (no nulls when orElse is provided and X is non-nullable), and it would preserve the slightly less precise typing `X? Function({X orElse()})` in cases where the call site does not provide sufficient type information to justify a static choice.

The suggested syntax looks confusing. It is weird to have a return type defined on doSomething if it is ultimately dictated by the case inside.

But other than that, that's an interesting idea!

The point is that we want to preserve the notion of having a single function (e.g., for dynamic invocations, or any usage that doesn't justify a choice among multiple entities with the same name), so there is an "overall" function, and it needs to have a return type and a run-time representation.

But then we may know more at particular usage points. When that is true, it is allowed for the compiler to resolve the invocation to the first case that matches and generate a call directly to a helper-function whose body is that case (so each case must also have a representation as an actual function at run time). This also allows us to rely on the specified return type for that case, and it is certainly possible that specific cases can have more specific return types than the overall function.

Of course, this implies that the cases must be part of the type of the function. We may or may not wish to enable this feature in function types of Dart in general. One approach which is less powerful, but maybe more comprehensible (and maybe more performant), is to say that case functions can only dispatch to specific cases when the function itself is statically resolved. One obvious step up would be to also support cases in instance methods, and allow for invocations to call a specific case, and then require all overriding declarations to preserve the set of cases and their ordering (but they could add new cases after the existing ones).

How would that apply to an IDE's type preview or the is operator?

That depends on the degree to which we include cases in function types. The most comprehensive level of support would allow all functions to be case functions, and all function types to contain a full list of cases. A "normal" function type would then simply be the ones we know today, and that would be a supertype of similar ones where we have added some cases.

Another way of migrating it to NNBD is to do the right thing: make it a breaking change, and instead of throwing by default, return null by default. It won't be terribly breaking, and at least will be consistent with map subscript, which returns null if the key is not there. The compiler will force the call site to handle null, so the problem won't go unnoticed.
We have tons of places where map[key] may return null, and a large proportion of these places have to be fixed (b/c they currently rely on the "knowledge" that the key is there). We (probably) have a much smaller number of places where a similar fix should be done for findFirst and friends.

make it a breaking change, and instead of throwing by default, return null by default.

That issue is about the functions that already return null by default.

It's about how functions that default to returning null would have to make their return type always nullable (even when it definitely cannot be null) or make the fallback callback _required_.

It's about how functions that default to returning null would have to make their return type always nullable (even when it definitely cannot be null)

My point is that the definition of firstWhere should be changed so that, in the absence of an orElse parameter, it returns null if no value satisfying the predicate is found. Currently, it throws instead.

Let's forget about orElse for a second, consider only the cases when orElse is omitted.

Then it's difficult to explain why [1, 2, 3].firstWhere((x)=>x<0) throws. This is certainly not the case for the much more common map subscript {"a":0}["b"], which returns null in the same situation. In both cases, the value satisfying the predicate is not found. How is one "not found" different from another "not found"? Why does it throw in one case and return null in the other?

If you think along these lines, it's clear that for the sake of consistency, firstWhere should return null if the value is not found. Then its return type would naturally become T?. Note that orElse has nothing to do with it.
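The `T?`-returning variant argued for here can be sketched as an extension; package:collection later shipped a similar extension under the name firstWhereOrNull:

```dart
// A null-returning firstWhere: yields null instead of throwing
// when no element satisfies the predicate.
extension FirstWhereOrNull<T> on Iterable<T> {
  T? firstWhereOrNull(bool Function(T) test) {
    for (final element in this) {
      if (test(element)) return element;
    }
    return null;
  }
}

void main() {
  print([1, 2, 3].firstWhereOrNull((x) => x < 0)); // null, no throw
  print([1, 2, 3].firstWhereOrNull((x) => x > 1)); // 2
}
```

Note that, as the comment says, orElse plays no role here: the return type is simply `T?`.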

I agree.\
It's probably off-topic though. This suggestion should probably be made on https://github.com/dart-lang/sdk.

I don't think it's necessarily off topic. If firstWhere and friends are fixed, I'm not sure we have a whole lot of examples that fit into a definition of "generic functions where the result nullability depends on if an optional parameter is passed or not", which is exactly the topic of the current thread :-)

One thought I had during the design process was to allow a default value that wasn't valid for all possible type arguments, and then you would have to provide a better value if the default value wasn't valid. So:

Null kNull() => null;
T first<T>(Iterable<T> values, [T orElse() = kNull]) { ... }

Then a first<int>([1, 2, 3]) would be invalid because there is no valid argument for orElse, and the default is not valid, but first<int?>([1, 2, 3]) would be valid.

That idea doesn't work. The problem is that default values are not part of function types, so if I do:

T Function<T>(Iterable<T>) f = first;

then it's seemingly valid, but it doesn't preserve the safety of the orElse parameter.

We definitely do not want the default value to be part of the function type, because that would make a lot of otherwise compatible function types be different.

(Doesn't even work if we use the default value's type in inference when not passing it, for the same reasons).

So, any proposal here, for a very real problem we have had in the platform libraries too, has to solve the problem of the default value not being visible in the function type.
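The hole can be seen in today's Dart with a nullable-parameter encoding (a sketch; `first` and `_kNull` are illustrative names): the tear-off drops the optional parameter from the static type, so nothing stops a caller from omitting orElse at a non-nullable T.

```dart
Null _kNull() => null;

T first<T>(Iterable<T> values, [T Function()? orElse]) {
  final it = values.iterator;
  if (it.moveNext()) return it.current;
  if (orElse != null) return orElse();
  return _kNull() as T; // throws at run time when T is non-nullable
}

void main() {
  // Statically fine: a function type may drop optional parameters.
  T Function<T>(Iterable<T>) f = first;
  print(f<int?>(<int?>[])); // null
  // f<int>(<int>[]) would pass the type checker but throw a TypeError.
}
```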

Idea

Could we let the type of the default value be part of the function type? Since we only care about the type, the exact value might not be necessary.

Then T Function<T>(Iterable<T>, [T Function() = Null Function()]) would be the function type of first.

Let's define this function type as having a potentially unsafe default value type because it has a default value type which is not a subtype of the parameter type.

If neither the parameter type nor the default value type refers to a type variable, then there'd be nothing potential about it. Then it's just a compile-time error: the default value would never be a valid argument.

If the parameter type or the default value type contains a type variable, either from the function or from a surrounding function or class, and the default value type is not a subtype of the parameter type, then we include the default value's type in the function parameter's signature.

So type1 Function([type2 = type3]) can be a function type when the potentially unsafe default value type type3 is neither definitely a subtype of type2 (then it would definitely be safe) nor definitely not a subtype (then it would definitely be unsafe).
Such function types can only be invoked without the argument for that parameter if the default value type is a subtype of the parameter type at the invocation (when all the available type arguments have been supplied).

You can specify a potentially unsafe default value type that is completely unrelated to the parameter type, as long as at least one of them contains a type variable. We will not try to solve for whether it's possible to bind type variables such that the default value becomes valid.

We'd have to define subtype relations on such types. As usual, we'll want "soundly substitutable" as the underlying principle, so a subtype of the above type would be one which allows the same arguments.

Subtypes of T Function<T>(Iterable<T>, [T Function() = Null Function()]) would include:

  • T Function<T>(Iterable<T>, [T Function()]), aka
  • T Function<T>(Iterable<T>, [T Function() = T Function()]). (A function where the second argument can always be omitted.)
  • T Function<T>(Iterable<T>, [T Function() = Never Function()]).

Potentially unsafe default value types in function types are covariant, and they are supertypes of corresponding safe function types.

Since all current functions are safe (you cannot declare a potentially unsafe default value in the current type system), introducing these extra function types should be non-breaking.

That is, until we start using them in the platform libraries. Even then, the constraints we'd introduce with NNBD won't break any legacy code because all legacy types are nullable.

I have not considered whether this introduces something bad into the type system (like, say, undecidability).

Summary:

You can declare function types with potentially unsafe default value types.

Type1 Function([Type2 x = Type3])

If Type3 is a subtype of Type2, this is just Type1 Function([Type2 x]), and you'll get a warning or error if you write it anyway.

A function can be declared with a default value which is potentially not a subtype of the parameter type. Maybe that needs extra syntax:

Null _kNull() => null;
T first<T>(Iterable<T> values, [T orElse() <= _kNull]) {
  var it = values.iterator;
  if (it.moveNext()) return it.current;
  return orElse();
}

The type of first is T Function<T>(Iterable<T>, [T Function() = Null Function()]).
Since Null Function() is not a subtype of T Function() in the NNBD world, the former is allowed as a potentially unsafe default value type for the latter.

Calling first, or anything with the same type, without a second argument is only allowed if the default value type is a subtype of the actual parameter type of the invocation.

  • first<int>([1, 2]); – disallowed
  • first<int?>([1, 2]); – allowed
  • first<int>([1, 2], () => 0); – allowed

Omitting the second argument means that the default value type can be used in inference:

  • var x = first([1, 2]); – allowed, and x has type int?.

Cool idea! ;-)

One thing to think about: When we rely on 'the actual parameter type of the invocation' and that could be determined by the choice of actual type arguments passed to the callee when that is a generic function, inference would generally be able to choose a super-type for some of the type arguments, and thus make the difference between a valid and an unusable default value.

Null _kNull() => null;

T first<T>(Iterable<T> values, [T orElse() <= _kNull]) => ...

void main() {
  var x = first([1, 2]); // Succeeds, passing `int?` to `first` and to `[1, 2]`.
}

The developer who writes this might be happy because "it works", but the ? on the type argument passed to the list literal may come back and bite us later (that won't happen in the concrete example, but it could happen if that list were stored somewhere and used later).

This could serve as a warning about tractability and comprehensibility issues with inference. So we might prefer to ignore the potential default value during inference, and then simply make it an error if the default value is an error with that typing, and no actual argument is provided.

The underlying mechanism is (1) in some context there is an option to specify a default value, (2) a default value is specified, but for some configurations (e.g., for some values of some type variables) that default value is an error, so (3) we just consider the default value to be provided when it's not an error, and omitted when it is an error.

We could use this kind of mechanism in several different contexts, preferably with some syntactic marker (such that we don't just silently accept wrong default values all over the place). For example:

abstract class A<X extends num> {
  X foo() <=
  int foo() => 42; // Default implementation.
}

class B extends A<int> {}

The default implementation of foo would be ignored when it is not a correct override of the abstract one declared by X foo(), and it would be inherited by all classes (such as B) where it is correct. So the class B is OK, even though it's concrete and doesn't say anything about foo, because it will inherit the implementation returning int, because that works for B.

The next step could be to have a list of candidates and taking the first one that works. And so on. And along the way we need to consider when the complexity cost outweighs the benefits, of course. ;-)

The "actual parameter type" would still have to be the static type of the actual parameter of the function type being called, just after any known type variables have been instantiated.

For the nullable type coming back to bite us, I think null safety will actually make that issue go away. When the type of x is int?, the user will very quickly see that in the following code. It's not an accident, it's really the desired behavior that it becomes nullable when you do not provide a default.

It's a clever idea to use the same approach with interface methods or (please) interface default methods. If you are compatible, you get the default implementation, and if not, you don't.

This is also something we have always wanted for List.sort, where the default value only works for comparable type parameters. If we had non-constant default values (still compatible with this idea, because we only use the type), we could sort as void sort([int compare(E a, E b) = (Comparable<E> a, E b) => a.compareTo(b)]) { ... }. Then you could only omit the compare parameter when the element type is comparable to itself. Nice!
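For contrast, a sketch of what sorting does instead today (`sortList` is a hypothetical wrapper, not the SDK's implementation): the default comparator is applied via a dynamic cast, so the "E must be comparable to itself" constraint is checked at run time rather than in the type.

```dart
void sortList<E>(List<E> list, [int Function(E, E)? compare]) {
  if (compare != null) {
    list.sort(compare);
  } else {
    // Throws a TypeError at run time if E is not Comparable.
    list.sort((a, b) => Comparable.compare(a as Comparable, b as Comparable));
  }
}

void main() {
  final nums = [3, 1, 2];
  sortList(nums); // OK: int is Comparable
  print(nums); // [1, 2, 3]

  final words = ['b', 'a'];
  sortList(words, (x, y) => x.compareTo(y)); // explicit comparator
  print(words); // [a, b]
}
```

With the typed default, omitting compare on a list of non-Comparable elements would become a compile-time error instead of a run-time one.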

var x = first([1, 2]); – allowed, and x has type int?.

What about this one:

var x = first([1, 2], orElse: ()=>null);

It shouldn't be allowed because the return type of orElse is Null, not the int required by the signature of the method. But then, it will be difficult to explain why it works with the omitted parameter, which by default does the same thing - returning null.

@tatumizer It will actually work, because it will infer that R is int?. Then covariant generics allow you to pass a List<int> where a List<int?> is expected.

The inference for R first<R>(Iterable<R>, {R orElse()}) will have no context type, so it infers each parameter independently. That's a List<int> and a Null Function(), and the solution for R is then int?, which becomes the return type too.

Indeed, first([1, 2], orElse: ()=>"not an int") would succeed and return an Object.
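These inference claims can be checked under null safety with a hypothetical first whose orElse is nullable (a sketch, not the proposal's syntax):

```dart
T first<T>(Iterable<T> values, {T Function()? orElse}) {
  final it = values.iterator;
  if (it.moveNext()) return it.current;
  return orElse!();
}

void main() {
  // Constraints: T :> int (from the list) and T :> Null (from orElse),
  // so T is inferred as int?.
  var x = first([1, 2], orElse: () => null);
  print(x.runtimeType); // int (the value 1, statically typed int?)

  // T :> int and T :> String solve to Object.
  var y = first(<int>[], orElse: () => 'fallback');
  print(y); // fallback
}
```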

@lrhn:

  1. How does the compiler know that the orElse parameter possesses this magic property that its return type gets factored into the return type of the overall function? What symbol serves as a marker?
  2. Why not make this marker explicit by introducing a new keyword? This would make it unnecessary to incorporate the default value into the type: T first<T>(Iterable<T> values, [magic orElse()])
  3. How can you explain to the user why map subscript returns null when the key is not found, whereas firstWhere (in Iterable) and friends in the same situation require magic to return null? In other words: with your new definition, firstWhere will return null on an empty list. But the same could be achieved by simply returning null (for an empty list) when orElse is missing (currently, a StateError is thrown instead; it is not clear why). The return type of firstWhere can always be T? - in the same way as the return type of map subscript is T?. The rest can be easily handled by the user's program, e.g.

var x = list.firstWhere(predicate) ?? throw "blahblah";
var x = list.firstWhere(predicate) ?? orElseExpression;

  4. Your idea boils down to introducing a notion of sum(T,R) for types T,R such that sum(String, Null) == String?, sum(num, int) == num, etc. - that is, the sum of two types evaluates to their closest common ancestor (roughly speaking). Wouldn't it be better to make the notation more explicit:
     T+X first<T,X>(Iterable<T> values, [X orElse()]) - thus eliminating all magic?

About the "default implementation that uses a different type", this would only partially support my use-case.

On freezed, I generate a when/map method for pattern matching.

Such that for:

@freezed
abstract class Example with _$Example {
  const factory Example.person(String name, int age) = Person;
  const factory Example.city(String name, int population) = City;
  const factory Example.country(String name, int population) = Country;

}

We have:

Example example;

String name =  example.map<String>(
  person: (Person person) => person.name,
  city: (City city) => city.name,
  country: (Country country) => country.name,
  orElse: () => '',
);

Now, in the ideal world, with NNBD map should allow three cases:

  • All scenarios are passed and no orElse is passed:

    // name is not nullable
    String name = example.map<String>(
    person: (Person person) => person.name,
    city: (City city) => city.name,
    country: (Country country) => country.name,
    );
    

    This prevents passing an orElse callback, as it would never be reached (although optional).

  • some scenarios are not handled, but an orElse is provided:

    // name is still not nullable
    String name = example.map<String>(
    person: (Person person) => person.name,
    orElse: () => '',
    );
    
  • some scenarios are not handled and no orElse is provided:

    // name can be null
    String? name = example.map<String>(
    person: (Person person) => person.name,
    );
    

Right now, if I want to achieve the same thing with NNBD, I need to define three different functions that are effectively the same thing three times.

This leads to confusing naming (map vs mapOrElse vs mapOrNull)
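A minimal, single-case sketch of those three near-duplicate methods (hypothetical hand-written code, not freezed's actual generated output):

```dart
class Person {
  final String name;
  Person(this.name);

  // All cases handled: non-nullable result.
  R map<R>({required R Function(Person) person}) => person(this);

  // Fallback provided: still non-nullable.
  R maybeMap<R>({R Function(Person)? person, required R Function() orElse}) =>
      person != null ? person(this) : orElse();

  // No fallback: nullable result.
  R? mapOrNull<R>({R Function(Person)? person}) =>
      person != null ? person(this) : null;
}

void main() {
  final p = Person('Ada');
  print(p.map(person: (x) => x.name));        // Ada
  print(p.maybeMap(orElse: () => 'unknown')); // unknown
  print(p.mapOrNull<String>());               // null
}
```

The three bodies differ only in how the missing-callback case is typed, which is exactly the duplication the proposal would remove.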

@tatumizer wrote:

how does the compiler know that orElse parameter possesses
this magic property that its return type gets factored into the return
type of the overall function?

During inference of the actual type arguments for an invocation of a generic function, the context type may provide some constraints on the type variables based on the return type, and the static types of the actual arguments may provide some constraints based on the declared types of the formal parameters. So there is no special magic about taking the types of value arguments into account when inferring type arguments, which may in turn influence the return type of the invocation. However, the context type is given a high priority during inference, which may give the impression that the types of the value arguments are ignored during inference:

main() async {
  Map<X, Y> f<X, Y>(X x, Y y) => {};
  var map = f(1, true);
  print(map.runtimeType); // 'JsLinkedHashMap<int, bool>'.
  Map<num, double> map2 = f(2, 3);
  print(map2.runtimeType); // 'JsLinkedHashMap<num, double>'.
}

This shows that the types of the actual arguments fully determine the return type in the first call of f, but the context type actually makes 3 evaluate to a double in the second call. So information may go up or down, and the context type may "shadow" other things, but it's not a new thing per se to take the types of actual arguments into account during inference of type arguments to a generic function invocation.

You could say that it is an anomaly to infer a union type (like int?), and we actually don't infer "the other union type" (FutureOr), but we are able to get T? from inference today, with null safety:

void main() {
  f(bool b) {
    if (b) return null;
    return 42;
  }
  String s = f; // Error message reveals static type of `f` is `int? Function(bool)`.
}

So I don't think there's a need for new magic in order to handle the inference in the example.

@eernstg : how the user, while contemplating the declaration in dartdoc

T first<T>(Iterable<T> values, [T orElse() = kNull])

is supposed to guess that the return type of orElse gets mixed into the return type of the method, without looking hard into the implementation of said method?

There is a certain amount of guesstimation involved in this discussion because the declaration of first that you show is a compile-time error today, assuming null safety.

If we allow it and give it the meaning that the default value is considered to exist unless the inferred type makes it an error then you are right that we would need to have some extra support during inference in order to take kNull into account during inference. If we don't do that then we will indeed just choose the type argument int in some cases and determine that there is no default value, so it is an error to omit the orElse argument. In other cases we would have int? from the context type, and it would "just work".

So the question is not so much whether

the return type of orElse gets mixed into the return type of the method

because that's a standard property of type inference: When we have chosen a value for T/R/X then we may also have chosen a value for part of the return type because that type variable occurs in the return type. The question is really whether we'd require type inference to take this situation into account: (1) regular inference caused an error because there is no default value, (2) use the "potential" default value to obtain further constraints on type variables, run inference again (and in that case, in the example, we could potentially choose int? and allow the call).

It's certainly possible that we should rather avoid putting this kind of extra smarts into type inference, because that might eliminate exactly those cases that were described as questionable in this discussion.

