Typescript: Suggestion: Opt-in distributive control flow analysis

Created on 18 Jun 2018 · 15Comments · Source: microsoft/TypeScript

Search Terms

distributive control flow anaylsis; correlated types; union call signatures; conditional types

Suggestion: Opt-in distributive control flow analysis

Provide a way to tell the type checker to evaluate a block of code (or expression) multiple times: once for each possible narrowing of a specified value of a union type to one of its constituents. The block (or expression) should type-check successfully if (and only if) each of these evaluations does. Any values or expressions whose types are computed or affected by any of these evaluations should become the union of that computed or affected by the evaluations... that is, it should be distributive.

This allows some JavaScript patterns that currently don't work in TypeScript.

This suggestion would introduce new syntax at the type system level... preferably something which would currently cause a compiler error so it's guaranteed not to break existing code. I propose the syntax type switch(x) and as if switch(x) below, just to have something concrete to use... better suggestions are welcome.

Use Cases

This would ease the pain of:

using "correlated" or "entangled" records in the absence of existential types #14466
calling overloaded functions on union types #14107 or #7294 without requiring the difficult general solution

Currently these can only be addressed either by using unsafe type assertions, or by duplicating code to walk the compiler through the different possible cases.

Examples

A "correlated" or "entangled" record type

Consider the following discriminated union UnionRecord:

type NumberRecord = { kind: "n", v: number, f: (v: number) => void };
type StringRecord = { kind: "s", v: string, f: (v: string) => void };
type BooleanRecord = { kind: "b", v: boolean, f: (v: boolean) => void };
type UnionRecord = NumberRecord | StringRecord | BooleanRecord;

Note that each constituent of UnionRecord is assignable to the following type for some type T:

type GenericRecord<T> = { kind: string, v: T, f: (v: T) => void };

This means that, for any possible value record of type UnionRecord, one should be able to safely call record.f(record.v). But without direct support for existential types in TypeScript, it isn't possible to convert a UnionRecord value to an exists T. GenericRecord<T>. And thus it is impossible to safely express this in the type system:

function processRecord(record: UnionRecord) {
  record.f(record.v); // error, record.f looks like a union of functions; can't call those
}

Right now, the ways to convince TypeScript that this is okay are either to use an unsafe assertion:

function processRecord(record: UnionRecord) {
    type T = string | number | boolean; // or whatever
    const genericRecord = record as GenericRecord<T>; // unsafe
    genericRecord.f(genericRecord.v);
    genericRecord.f("hmm"); // boom?
}

or to explicitly and unnecessarily narrow record to each possible constituent of UnionRecord ourselves:

const guard = <K extends UnionRecord['kind']>(
  k: K, x: UnionRecord): x is Extract<UnionRecord, {kind: K}> => x.kind === k;
const assertNever = (x: never): never => { throw new Error(); }
function processRecord(record: UnionRecord) {
    if (guard("n", record)) {
        record.f(record.v); // okay
    } else if (guard("s", record)) {
        record.f(record.v); // okay
    } else if (guard("b", record)) {
        record.f(record.v); // okay
    } else assertNever(record);
}

^ This is not just redundant, but brittle: add another GenericRecord<T>-compatible constituent to the UnionRecord, and it breaks.

While it would be nice if the type checker automatically detected and addressed this situation, it would probably be unreasonable to expect it to aggressively distribute its control flow analysis over every union-typed value. I can only imagine the combinatoric nightmare of performance problems that would cause. But what if you could just tell the type checker to do the distributive narrowing for a particular, specified value? As in:

function processRecord(record: UnionRecord) {
  type switch (record) {  // new syntax here
    record.f(record.v); // okay
  }
}

The idea is that type switch(val) tells the type checker to iterate over the union constituents of typeof val, similarly to the way distributive conditional types iterate over the naked type parameter's union constituents. Since the code in the block has no errors in each such narrowing, it has no errors overall. (Note that any performance problem caused by this should be comparable to that of compiling the "brittle" code above)

That is: if a value x is of type A | B, then type switch(x) should iterate over A and B, narrowing x to each in turn. (Also, I think that if a value t is of a generic type T extends U | V, then type switch(t) should iterate over U and V.)

Additionally, any value whose type would normally be narrowed by control flow analysis in the block should end up as the union of all such narrowed types, which is very much like distributive conditional types. That is: if narrowing x to A results in y being narrowed to C, and if narrowing x to B results in y being narrowed to D, then type switch(x) {...y...} should result in y being narrowed to C | D.

Another motivating example which shows the distributive union expression result:

Calling overloaded functions on union types

Imagine an overloaded function and the following oft-reported issue:

declare function overloaded(x: string): number;
declare function overloaded(x: number): boolean;
declare const x: string | number;
const y = overloaded(x); // expected number | boolean, got error

Before TypeScript 2.8 the solutions were either to add a third redundant-information overload which accepted string | number and returned number | boolean:

declare function overloaded(x: string | number): number | boolean;  // add this

or to narrow x manually in a redundant way:

const y = (typeof x === 'string') ? overloaded(x) : overloaded(x); 
// y is number | boolean;

Since TypeScript 2.8 one could replace the three overloaded signatures with a single overloaded signature using conditional types:

// replace with this
declare function overloaded<T extends string | number>(x: T): T extends string ? number : boolean;

which, when the argument is a union, distributes over its types to produce a result of number | boolean.

The suggestion here would allow us to gain this behavior of distributive conditional types acting at the control flow level. Using the type switch(x) code block syntax for a single expression is a bit messy:

let y: string | number | boolean;  // something wide
type switch(x) {
    y = overloaded(x); // okay
}
y; // narrowed to number | boolean;

If we could introduce a way to represent distributive control flow for an expression instead of as a statement, it would be the simpler

const y = (overloaded(x) as if switch(x));  // y is number | boolean

where the as if switch(x) is saying "evaluate the type of the preceding expression by distributing over the union constituents of typeof x".

Whew, that's it. Your thoughts? Thanks for reading!

Checklist

My suggestion meets these guidelines:

[x] This wouldn't be a breaking change in existing TypeScript / JavaScript code (new syntax is currently an error)
[x] This wouldn't change the runtime behavior of existing JavaScript code (has no effect on emitted js)
[x] This could be implemented without emitting different JS based on the types of the expressions (has no effect on emitted js)
[x] This isn't a runtime feature (e.g. new expression-level syntax) (has no effect on emitted js)

Duplicate

Source

jcalz

👍4

Most helpful comment

So, this was closed as a duplicate of #7294 which was recently closed via #29011. I don't think that fix (even after its caveats are addressed) will deal with correlated types. Could I reopen this? Or open a new issue to track narrowing of correlated types? I keep seeing questions about this but I don't know where to direct them for an official answer. Right now the answer is "use type assertions", but I'd love to see an official place for this issue, even if the suggestion here isn't the solution for it.

jcalz on 7 Feb 2019

👍4

All 15 comments

A few notes on this proposal.

we have always overlaid type system concepts on top of JS patterns. e.g. control flow analysis and type guards. And that has served us well. That first guaranteed a natural and ergonomic user experience, and second allows our analysis engine to understand plain JS. having a new syntax to call a function with a union type is not seamless nor ergonomic.
"without requiring the difficult general solution" i do not think that is true. i think the implementation issues with this proposal are more or less the same issues with https://github.com/Microsoft/TypeScript/issues/7294, namely that context sensitive declarations get a type based on the first context they are checked in.

I believe https://github.com/Microsoft/TypeScript/issues/7294 is still the right approach there. we just need to figure out the machinery to make it work.

mhegazy on 22 Jun 2018

Hmm, fair enough.. union-type function calls have other, more attractive potential solutions.

I still keep running into the correlated/entangled record issue and I'd love to have some idiomatic TS way of dealing with it, so if you have any idea there I'd be thrilled to hear it.

Thanks for your feedback!

jcalz on 22 Jun 2018

Another use case from this question not involving function calls:

interface Fish  {
  type: 'FISH',
}

interface Bird  {
  type: 'BIRD',
  flyingSpeed: number,
}

interface Ant  {
  type: 'ANT',
}

type Beast = Fish | Bird | Ant

function buildBeast(animal: 'FISH' | 'BIRD' | 'ANT') {
    const myBeast: Beast = animal === 'BIRD' ? {
        type: animal,
        flyingSpeed: 10
    } : {type: animal} // error
}

In this proposal, something like the following would leave the emitted JavaScript (essentially) unchanged and allow the code to compile in a non-redundant type-safe way:

function buildBeast(animal: 'FISH' | 'BIRD' | 'ANT') {
    const myBeast: Beast = animal === 'BIRD' ? {
        type: animal,
        flyingSpeed: 10
    } : ({type: animal} as if switch(animal)) // okay
}

jcalz on 27 Jun 2018

this is a different issue though. this works in the simple case of one property, i.e. { p: "A" } | { p: "B" } is equivalent to { p : "A" | "B" } but once you add another property it does not, i.e { p1: "A", p2: "A" } | { p1: "B", p2: "B" } is not equivalent to { p1: "A" | "B", p2: "A" | "B" }.

We could add special support for one property, thought it is not clear if the value added would warrant the cost needed to detect that there is only one union property in the type that can be merged.

mhegazy on 28 Jun 2018

I'm sure every particular case that could be addressed by this suggestion could instead be fixed by a cleverer type checker, and that each instance of cleverness needed could be seen as a separate issue (related records ≠ union function parameters ≠ single-property union distribution). But in the absence of such cleverness, it would be nice for developers to have a handle they can turn to lead the type checker through a more exhaustive analysis which could verify (or falsify) the type safety of these patterns.

jcalz on 28 Jun 2018

But in the absence of such cleverness,

I hear you. but our budget to add new features is tight. once something is added, it can not be removed when the compiler gets cleverer, so we tread slowly on these issues.

mhegazy on 28 Jun 2018

👍2

i think the implementation issues with this proposal are more or less the same issues with #7294, namely that context sensitive declarations get a type based on the first context they are checked in. I believe #7294 is still the right approach there. we just need to figure out the machinery to make it work.

Even so, I'm wondering if either #7294 (or the related #14107) would actually handle the use case presented here, namely the correlated nature of the two arguments to the function. For example,

function processRecord(record: UnionRecord) {
  record.f(record.v); // error, record.f looks like a union of functions; can't call those
}

wouldn't be fixed if I interpret record.f as a single function whose argument is string & number & boolean. I'm still forced to walk through each possible narrowing of record manually. Or am I missing something?

jcalz on 15 Jul 2018

wouldn't be fixed if I interpret record.f as a single function whose argument is string & number & boolean. I'm still forced to walk through each possible narrowing of record manually. Or am I missing something?

that works for a single argument function, but not if you have multiple arguments. gets more complicated once you through contextual types for context sensitive constructs (lambdas and object literals) into the mix. Overload resolution is a complicated piece of machinery.

mhegazy on 16 Jul 2018

Even in the single-argument function case, the perfect solution to unions/intersections of functions wouldn't address correlated values, right? It's an unrelated issue as far as I can tell.

In case I haven't explained it properly (and if anyone else is watching), the issue is that you have a value record whose property types exhibit something like a "joint distribution":

| ACTUAL POSSIBILITIES | typeof record.f =
(v: number) => void | typeof record.f =
(v: string) => void | typeof record.f =
(v: boolean) => void |
| ---: | :---: | :---: | :---: |
| typeof record.v = number | ✓ | | |
| typeof record.v = string | | ✓ | |
| typeof record.v = boolean | | | ✓ |

It is obvious from that chart that record.f(record.v) is type safe. But the compiler ends up treating the type of record.f to be independent of that of record.v. That is, it naturally takes a "marginal distribution" for each property, which throws away the relationship between the variables, and instead views record as if it were:

| LOSSY INTERPRETATION | typeof record.f =
(v: number) => void | typeof record.f =
(v: string) => void | typeof record.f =
(v: boolean) => void |
| ---: | :---: | :---: | :---: |
| typeof record.v = number | ✓ | ✓ | ✓ |
| typeof record.v = string | ✓ | ✓ | ✓ |
| typeof record.v = boolean | ✓ | ✓ | ✓ |

in which record.f might not accept an argument of type record.v.

Is there an existing issue filed somewhere which mentions this situation? Mostly I just want to have someplace canonical to point people when they run into it themselves.

jcalz on 16 Jul 2018

I do not think we are disagreeing on the desired behavior. Given call on a union of signatures, the compiler needs to verify that for every signature the call is valid, and the result is a union of the return types of the resolved signatures. When looking at a signature from one constituents of the union, say record.f in your example, we can narrow the type of record to match, this way the type of record.v would have the correct type and the overload resolution would succeed.
but again the challenge here is how to implement this in the current system.

mhegazy on 16 Jul 2018

👍2

That makes sense, thanks.

jcalz on 16 Jul 2018

Automatically closing this issue for housekeeping purposes. The issue labels indicate that it is unactionable at the moment or has already been addressed.