Language: Users want to define union or union-like APIs

Created on 18 Dec 2018 · 14 Comments · Source: dart-lang/language

_Based on conversations with @yjbanov, @leonsenft, @mdebbar._

Currently, the Dart language lacks a way to provide static union or union-like semantics or APIs. Other platforms take a variety of approaches: user-definable union types, algebraic/tagged unions, method overloading, and I'm sure others we missed.

Let's look at two examples:

APIs that take nominal types A or B

void writeLogs(Object stringOrListOfString) {
  if (stringOrListOfString is String) {
    _writeLog(stringOrListOfString);
  } else if (stringOrListOfString is List<String>) {
    stringOrListOfString.forEach(_writeLog);
  } else {
    throw ArgumentError.value(stringOrListOfString, 'stringOrListOfString', 'Not a String or List<String>');
  }
}

Problems:

No static type safety. The user can pass an Octopus, and only receive an error at runtime:

void main() {
  // No static error.
  // Runtime error: "Instance of 'Octopus': Not a String or List<String>".
  writeLogs(Octopus());
}

Relies on complex TFA (type flow analysis) for optimizations, which fall apart with dynamic access:

void main() async {
  // Inferred as "dynamic" for one reason or another.
  var x = something.foo().bar();

  // No static error. Even if it succeeds, all code paths are now retained (disables tree-shaking).
  writeLogs(x);
}

Solutions

A clever user can simply write two functions:

void writeLog(String log) {
  _writeLog(log);
}

void writeLogList(List<String> logs) {
  logs.forEach(_writeLog);
}

... unfortunately, this now means you often need to think of convoluted API names like writeLogList.

Something like user-definable union types:

void writeLog(String | List<String> logOrListOfLogs) {
  if (logOrListOfLogs is String) {
    _writeLog(logOrListOfLogs);
  } else if (logOrListOfLogs is List<String>) {
    logOrListOfLogs.forEach(_writeLog);
  } else {
    // Bonus: Can remove this once we have non-nullable types.
    throw ArgumentError.notNull('logOrListOfLogs');
  }
}

... unfortunately this (a) can't have different return types, (b) might have complex side-effects with reified types (i.e. expensive performance reifying and storing writeLog<T>(T | List<T> | Map<T, List<T>> | ...)), and (c) just looks ugly compared to the rest of the language.

@yjbanov did mention a first-class match or when could help with (c), but not (a) or (b):

void writeLog(String | List<String> logOrListOfLogs) {
  when (logOrListOfLogs) {
    String: {
      _writeLog(logOrListOfLogs);
    }
    List<String>: {
      logOrListOfLogs.forEach(_writeLog);
    }
    Null: {
      // Bonus: Can remove this once we have non-nullable types.
      throw ArgumentError.notNull('logOrListOfLogs');
    }
  }
}

Something like user-definable method overloads (my preference in this scenario):

void writeLog(String log) {
  _writeLog(log);
}

void writeLog(List<String> logs) {
  logs.forEach(_writeLog);
}

... this solves all of the above concerns. It does not allow dynamic calls, but neither will static extension methods and neither do, say, named constructors or separate methods (used today), so I don't see this as a net negative.

APIs that take structural types A or B

@dantup ran into this while defining Microsoft Language Service protocols. Imagine the following JSON:

// success.json
{
  "status": "SUCCESS"
}

// failure.json
{
  "status": "ERROR",
  "reason": "AUTHENTICATION_REQUIRED"
}

Modeling this in Dart is especially difficult:

void main() async {
  Map<String, Object> response = await doThing();
  final status = response['status'] as String;
  if (status == 'SUCCESS') {
    print('Success!');
  } else if (status == 'ERROR') {
    print('Failed: ${response['reason']}');
  }
}

You can write this by hand, of course, but imagine large auto-generated APIs for popular services. At some point you'll drop down to code generation, and it's difficult to generate a good static model for this.

Problems

Let's imagine we get value types or data classes of some form, and let's even assume NNBD to boot:

data class Response {
  String status;
  String? reason;
}

This _works_, but as with the nominal-type problems above, you need runtime checks to use the API correctly. This can get very nasty on giant, popular APIs (like Microsoft's Language Service, and many others, including Google's own):

void main() async {
  var response = await getResponse();
  // Oops; this will never trigger, because we did not capitalize 'ERROR'.
  if (response.status == 'error') {
    print('ERROR!');
    return;
  }
  // Oops; this will print 'Yay: null' because success messages do not have a reason field.
  if (response.status == 'SUCCESS') {
    print('Yay: ${response.reason}');
    return;
  }
}

Solutions

One way this could be solved is having user-definable tagged unions.

TypeScript would model this as:

type Response = IResponseSuccess | IResponseFailure;

interface IResponseSuccess {
  status: "SUCCESS";
}

interface IResponseFailure {
  status: "ERROR";
  reason: string;
}

async function example_1() {
  const response = await getResponse();
  // Static error: "status" must be "SUCCESS" or "ERROR", got "error".
  if (response.status == 'error') {
    console.log('ERROR!');
    return;
  }
}

async function example_2() {
  const response = await getResponse();
  if (response.status == 'ERROR') {
    console.log('ERROR!');
    return;
  }
  // Automatically promotes "response" to "IResponseSuccess"!
  // Static error: "reason" does not exist on "IResponseSuccess".
  console.log('Yay: ', response.reason);
}
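
For comparison, here is a minimal sketch of how the same response could be modeled as a tagged union in Dart. It assumes a Dart 3 or later SDK with sealed classes and switch patterns, which did not exist when this issue was filed. The class names, the fromJson factory, and describe are hypothetical names chosen for illustration.

```dart
// Assumes Dart 3+. All names below are illustrative, not an existing API.
sealed class Response {
  const Response();

  // Hypothetical decoder: picks a variant based on the "status" tag.
  factory Response.fromJson(Map<String, Object?> json) {
    return switch (json['status']) {
      'SUCCESS' => const ResponseSuccess(),
      'ERROR' => ResponseFailure(json['reason'] as String),
      final other => throw FormatException('Unknown status: $other'),
    };
  }
}

final class ResponseSuccess extends Response {
  const ResponseSuccess();
}

final class ResponseFailure extends Response {
  const ResponseFailure(this.reason);

  // Only failures carry a reason, so it can be non-nullable here.
  final String reason;
}

// The switch must cover every subtype of the sealed class.
String describe(Response response) => switch (response) {
      ResponseSuccess() => 'Success!',
      ResponseFailure(:final reason) => 'Failed: $reason',
    };

void main() {
  print(describe(Response.fromJson({'status': 'SUCCESS'})));
  print(describe(Response.fromJson(
      {'status': 'ERROR', 'reason': 'AUTHENTICATION_REQUIRED'})));
}
```

Because Response is sealed, leaving a case out of describe is a static error, and reason is only reachable after matching ResponseFailure. The string comparison on the tag still happens, but only once, inside the decoder.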

request

matanlurey

👍34 ❤14 🎉11 👀1

Most helpful comment

Another random data point: I've been using C# again recently (for a silly hobby project), and wow, having static overloads is so nice. I'd forgotten how nice they are. Really helps with API design, too.

jmesserly on 21 Dec 2018

👍3

All 14 comments

> It does not allow dynamic calls

Is that true? Can't dynamic dispatch to writeLog be implemented as a wrapper on top of the two functions? There will be a dispatch cost, of course, but we're talking about dynamic anyway. I don't think you're worried about method dispatch performance at that point. Without overloads you'd have to do type checks anyway, as your void writeLogs(Object stringOrListOfString) demonstrates.
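
One way to picture such a wrapper is the sketch below, written in today's Dart without overloads. The two targets have distinct names because overloading does not exist. The written list, writeLogDynamic, and the body of _writeLog are assumptions made up for the example, not something a compiler is known to generate.

```dart
// Hypothetical sink standing in for the issue's private _writeLog helper.
final List<String> written = [];
void _writeLog(String log) => written.add(log);

// The two statically typed entry points (distinct names, since Dart has no
// overloading).
void writeLog(String log) => _writeLog(log);
void writeLogList(List<String> logs) => logs.forEach(_writeLog);

// Assumed shape of a wrapper for dynamic call sites: it repeats the runtime
// type checks that statically resolved calls avoid.
void writeLogDynamic(Object? arg) {
  if (arg is String) {
    writeLog(arg);
  } else if (arg is List<String>) {
    writeLogList(arg);
  } else {
    throw ArgumentError.value(arg, 'arg', 'Not a String or List<String>');
  }
}

void main() {
  dynamic x = 'one';
  writeLogDynamic(x);
  writeLogDynamic(<String>['two', 'three']);
  print(written); // [one, two, three]
}
```

Only call sites that go through the wrapper pay for the checks. Calls to writeLog and writeLogList stay fully static.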

yjbanov on 18 Dec 2018

> Can't dynamic dispatch to writeLog be implemented as a wrapper on top of the two functions?

It could. It does mean though, for overloads at least, you do not know the return type. In practice I'm not sure this is worth it. If it was a feature specifically for trying to help migrate existing (non-overloaded) APIs to overload-based ones, I could see value in that.

matanlurey on 18 Dec 2018

In cases where the compiler can't statically determine which overload to call, it could use a union type for the return:

C foo(A a) {}
D foo(B b) {}

void bar(Object obj) {
  var result = foo(obj); // The compiler would infer the type of `result` as `C | D`.
}

mdebbar on 18 Dec 2018

@matanlurey

> It does mean though, for overloads at least, you do not know the return type.

What does it mean to know the return type in dynamic dispatch?

yjbanov on 18 Dec 2018

@matanlurey

> In practice I'm not sure this is worth it. If it was a feature specifically for trying to help migrate existing (non-overloaded) APIs to overload-based ones, I could see value in that.

I agree. I also do not see a lot of value in dynamic dispatch as of Dart 2. But that's different from saying that overloads do not support dynamic dispatch. They do. The question is whether we want it.

yjbanov on 18 Dec 2018

@mdebbar Right, that's a second type of dispatch. Unless @matanlurey and I misunderstood each other, we were talking about dispatching d.foo(a) where d is dynamic. What you are talking about is when a is dynamic. Both kinds of dispatches need to be decided upon.

yjbanov on 18 Dec 2018

👍1

I'm curious if the web platform APIs could provide use cases and example problems which we could add to this request?

sethladd on 18 Dec 2018

https://github.com/Microsoft/TypeScript/blob/master/lib/lib.dom.d.ts is a good source of web platform examples (look for | in that file).

yjbanov on 18 Dec 2018

There are a couple of separate pieces here that I want to try to tease out to understand better. That way we can be more precise about what the actual user need is.

Overloading

This is the ability to have two methods with the same name but different parameter lists. In your example, it's:

void writeLog(String log) {
  _writeLog(log);
}

void writeLog(List<String> logs) {
  logs.forEach(_writeLog);
}

One key question for this is whether overloads should be chosen dynamically or statically. Given:

Object log;
if (isMonday) {
  log = "A string";
} else {
  log = ["Some", "strings"];
}
writeLog(log);

Would you expect this to do the right thing on all days of the week? Or is this a static error because it doesn't know which overload to call at compile-time?

Which answer you choose has profound impact on the design...

Dynamic overloading

If the dispatch does happen at runtime, then you're talking about something like multimethods—runtime dispatch of methods based on the types of their parameters. This is a really cool, powerful feature. It's also very rare in object-oriented languages.

Doing this would let us do things in Dart that few other languages can do, but it could also be fiendishly complex. Consider:

int weird(String s) => 3;
bool weird(List l) => true;

main() {
  var fn = weird;
  Object unknown;
  var o = fn(unknown);
}

What is the static type of fn? What is the static type of o?

Static overloading

This is what C++, Java, C#, etc. do. It's definitely well-explored territory. It solves several real, concrete problems. For example, in Dart, adding a method to a base class may always be a breaking change because some subclass could have a method with the same name but a different signature. In the listed languages, that's much safer. If the signature is compatible, there's no problem. If it isn't, it just becomes a separate overload. The only risk is if there's a compatible signature but an incompatible return type.

Static overloading also has a deserved reputation for adding a ton of complexity to the language. It complicates generics and implicit conversions, sometimes leads to exponential performance cliffs during type-checking, and confuses users.

Union types

This is the ability to define a structural type that permits all values of any two given types. That's:

void writeLog(String | List<String> logOrListOfLogs) {
  if (logOrListOfLogs is String) {
    _writeLog(logOrListOfLogs);
  } else if (logOrListOfLogs is List<String>) {
    logOrListOfLogs.forEach(_writeLog);
  } else {
    // Bonus: Can remove this once we have non-nullable types.
    throw ArgumentError.notNull('logOrListOfLogs');
  }
}

Dart has already taken steps in this direction with FutureOr<T> and will take more steps with non-nullable types. The plan is that a nullable type is effectively sugar for the union of the underlying type and Null. So int? means int | Null. The semantics fall out of that.
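
As a small illustration of the union-like behavior FutureOr<T> already has, the sketch below accepts either an int or a Future<int> and tells them apart with an is check. The function name describe and its messages are made up for the example.

```dart
import 'dart:async';

// FutureOr<int> admits both an int and a Future<int>, much like a built-in
// `int | Future<int>` union. `describe` is a hypothetical example function.
Future<String> describe(FutureOr<int> value) async {
  if (value is Future<int>) {
    // Promoted to Future<int> inside this branch.
    return 'future of ${await value}';
  }
  return 'plain $value';
}

Future<void> main() async {
  print(await describe(1)); // plain 1
  print(await describe(Future<int>.value(2))); // future of 2
}
```

Passing anything else, such as a String, is rejected statically, which is the guarantee the Object-typed writeLogs in the original post lacks.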

I wouldn't be surprised if we eventually get union types, though we don't have plans for it currently. (Non-nullable types will keep us more than busy enough for the immediate future.) Union types are nice, but don't solve as many problems as users think.

Consider +. You'd expect its declaration in the int class to look something like:

class int {
  int | double operator +(int | double rhs) => ...
}

But the union types aren't precise enough. This declaration loses the fact that 1 + 3 should have type int, not int | double. You really want to say "if the parameter type is int, then the return type is int. If the parameter type is double, then the return type is double."

Overloading can express that, but union types can't.
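
The loss of precision can be seen in today's Dart by using num as the closest existing stand-in for int | double. This is only a sketch under that assumption: add is a hypothetical function, and the static error mentioned in the comment assumes implicit downcasts are disabled.

```dart
// num stands in for the hypothetical `int | double` union.
num add(num lhs, num rhs) => lhs + rhs;

void main() {
  final sum = add(1, 3);
  // Statically, sum is only known to be a num, even though the value is an
  // int at runtime.
  // int total = add(1, 3); // static error: num is not assignable to int
  final int total = sum as int; // the caller must cast to recover the type
  print(total); // 4
}
```

With overloads, a separate int-to-int signature could be selected for this call, so no cast would be needed.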

Literal types

The TypeScript example introduces an entirely new feature, singleton types that only contain a single value. That lets you use an == on a property value to determine the type of some surrounding object. It looks to me like a hint of dependent typing.

That's a lot of type system machinery to add, and I'm not sure how useful it is. It quickly falls down if you don't compare to actual literal values. It might be worth looking at, but I'd be surprised if it fit well within a more nominal language like Dart.

munificent on 18 Dec 2018

Thanks @munificent. I agree this is probably a few issues and needs more investigation.

Without a longer reply, my 2 cents:

Dynamic overloading is _cool_, but not necessary. With potentially implicit downcasts being disabled by default (or going away entirely), you'd have to cast with as in order to even invoke the multi-methods, which in turn means that you might as well just have static overloading only.
Literal types (i.e. tagged unions, @yjbanov will want to say more, I'm sure) are cool. I agree maybe they are a "lot" to add now to (mostly nominal) Dart, but they could potentially add a lot of value in our serialization story (JSON, ProtoBufs, etc).

matanlurey on 18 Dec 2018

#148 introduces 'case functions', which is one way to handle the issues described here. Considering some points raised above:

@matanlurey wrote:

> It does not allow dynamic calls

Case functions do allow that.

@munificent wrote:

> you're talking about something like multimethods

Right, case functions rely on a simple, user-specified approach to disambiguation (so you won't ever get "ambiguous invocation" errors, which is otherwise a source of a long list of fine papers ;-).

It is guaranteed in some (but not all) cases that the semantics of a case function invocation is exactly the same for a statically resolved case and for a dynamically resolved case, and I expect that this could be subject to 'strict' warnings. For instance, sealed classes would give some useful guarantees.

So you could say that case functions are a pragmatic take on multimethods.

> But the union types aren't precise enough. This declaration loses the fact that 1 + 3 should have type int, not int | double.

When giving an argument of type int to a case function whose corresponding case has return type int, we would get the type int for the returned result. If the case is chosen dynamically then we may know less.

> Literal Types.

It is probably not too hard to introduce constants as patterns for case functions. But we might want to design a general pattern declaration and matching feature first, such that we can use the same approach everywhere.

eernstg on 18 Dec 2018

> I'm curious if the web platform APIs could provide use cases and example problems which we could add to this request?

Web APIs are littered with these.

Search for or on this page: https://firebase.google.com/docs/reference/js/firebase.firestore.Query
Search for DartName= in https://github.com/dart-lang/sdk/blob/master/tools/dom/idl/dart/dart.idl

WebIDL explicitly supports union types (https://www.w3.org/TR/WebIDL-1/#idl-union), so this is always an issue when interfacing with web/JS APIs.

CC @sethladd

kevmoo on 19 Dec 2018

👍2

Would syntax sugar over callable classes work?

Right now we can achieve the equivalent of named constructors for functions using callable classes:

const someFunction = _SomeFunction();

class _SomeFunction {
  const _SomeFunction();

  void call() {}
  int customName() => 42;
}

which allows

someFunction();
int result = someFunction.customName();

It works but is not very convenient to write.

Allowing . in the names of functions may be a good idea:

void someFunction() {}
int someFunction.customName() => 42;

The bonus point here is that since it's not actual method overload, dynamic invocation still works just fine.

rrousselGit on 14 Jan 2019
