Typescript: Annotate immediately-invoked functions for inlining flow control analysis

Created on 10 Oct 2016  ·  48Comments  ·  Source: microsoft/TypeScript

Per #9998 and its many offshoots, we have problems analyzing code like this

let x: string | number = "OK";
mystery(() => {
  x = 10;
});
if (x === 10) { // Error, we believe this to be impossible  
}

The problem is that we don't know if mystery invokes its argument "now" or "later" (or possibly even never!).

The converse problem appears during reads:

let x: string | number = ...;
if (typeof x === 'string') {
  // Error, toLowerCase doesn't exist on string | number
  let arr = [1, 2, 3].map(n => x.toLowerCase());
}

Proposal is to allow some form of indicating that the function is invoked immediately, e.g.

declare function mystery(arg: immediate () => void): void;

This would "inline" the function body for the purposes of flow control analysis

Needs Proposal Suggestion

Most helpful comment

@oswaldofreitas

not sure if it's related to this issue, anyway this is what happens with my code:

let resolver;
new Promise(resolve => (resolver = resolve));
console.log(resolver); // error: Variable 'resolver' is used before being assigned.ts(2454)

You can write:

// The `!` after the variable identifier tells TypeScript
// to treat this variable as always being assigned.
let resolver!: (value?: unknown) => void;
new Promise(resolve => (resolver = resolve));
console.log(resolver);

All 48 comments

I suggested something similar in https://github.com/Microsoft/TypeScript/issues/7770#issuecomment-216024556.

Though after thinking about it, I thought it was better to annotate callbacks as non-immediate https://github.com/Microsoft/TypeScript/issues/10180#issuecomment-238063683:

Immediate case:

function doSomething(callback: () => any) {
     callback();
}

function fn(x: number|string) {
  if (typeof x === 'number') {
    doSomething(() => x.toFixed()); // No error.
  }
}

Non-immediate case:

function doSomething(callback: async () => any) {
     setTimeout(callback, 0);
}

function fn(x: number|string) {
  if (typeof x === 'number') {
    doSomething(() => x.toFixed()); // Error x is  'number|string'
  }
}

Sync callbacks is only assignable to sync callbacks and vice versa for Async callbacks:

function doSomething(callback: () => any) {
     setTimeout(callback, 0); // Error a sync callback is not assignable to an async one.
}
function doSomething1(callback: () => any) {
     callback();
}
function doSomething2(callback: () => any) {
     doSomething1(callback); // No error
}

Though re-using async keyword as a type annotation might conflict with the meaning of async declarations in async/await which makes a function returning a promise. An alternative is to use nonimmediate, though I think that keyword is too long.
.

For reference, there was also some discussion of deferred and immediate modifiers in #9757.

@RyanCavanaugh you commented here that implementing this kind of type modifier would require a massive architectural change. Is that still the case?

For completeness #11463 proposed an alternative solution for this problem. Since that was just an "escape hatch" and this is an instruction to CFA to better understand the code, I would prefer this.

The use case where we have run into this is converting a Promise into a deferral and having to generate a noop function to get around strict null checks:

const noop = () => { };

function createDeferral() {
    let complete = noop;
    let cancel = noop;
    const promise = new Promise<void>((resolve, reject) => {
        complete = resolve;
        cancel = reject;
    });

    return { complete, cancel, promise };
}

const deferral = createDeferral();

Though after thinking about it, I thought it was better to annotate callbacks as non-immediate

While it is more "burden" on the developer, in the sense of "strictness" I still think it is safer to assume the worst (that it might not be immediate). Like with strictNullChecks overall, so I would be only in favour of an annotation that indicates that it is immediate.

Also 🚲 🏠 while it might be a tad bit confusing, why not sync to be a counter to async:

interface PromiseConstructor {
    new <T>(executor: sync (resolve: (value?: T | PromiseLike<T>) => void, reject: (reason?: any) => void) => void): Promise<T>;
}

I forgot to mention my points on why I think annotating it as async rather sync.

  1. Annotating it as immediate or sync means nearly all callbacks needs to be annotated, assuming most callbacks is sync. Async is the exception, sync is the rule.
  2. We already annotate async functions(as in async/await) as async. Sync functions has no annotations. So to keep consistent with the current design we should not annotate with immediate.

Maybe to minimize confusion that async only accept a Promise returning callback, a deferred keyword might be more desirable.

I think to cover all cases of _"used before assigned"_ checks you would need both a deferred and an immediate modifier.

The deferred modifier would be needed in cases like @kitsonk reported here where a non-immediately invoked callback references a variable that is initialized later in the calling scope (i.e. lexically later but temporally earlier). immediate wouldn't help with that (unless the compiler assumes that an un-annotated callback _must_ be deferred).

What about the interplay between immediate vs deferred invocation and synchronous vs asynchronous functions, which can occur in any combination?

function invokeImmediately<R>(cb: () => R) {
    return cb();
}

let p = invokeImmediately(async () => {
    console.log(1);
    await null;
    console.log(2);
});

console.log(3);

p.then(() => {
    console.log(4);
});

// Output:
// 1
// 3
// 2
// 4

Here we have immediate invocation of an asynchronous function (fairly common in real code). As shown by the output, the async function executes _synchronously_ until it either awaits or returns. Then the _rest_ of the function is deferred to another tick.

It would be pretty hard to do accurate CFA in this case with types and assignments involved, since some of the callback happens 'in line' with the surrounding code and some execution is deferred until later.

@tinganho,

Annotating it as immediate or sync means nearly all callbacks needs to be annotated, assuming most callbacks is sync. Async is the exception, sync is the rule.

In the new Promise() example, the executor function implements the Revealing Constructor Pattern. I wouldn't classify it as a callback — the constructor guarantees it's invoked immediately. Callbacks are often invoked in a future turn.

@yortus,

Here we have immediate invocation of an asynchronous function (fairly common in real code). As shown by the output, the async function executes synchronously until it either awaits or returns. Then the rest of the function is deferred to another tick.

It would be pretty hard to do accurate CFA in this case with types and assignments involved, since some of the callback happens 'in line' with the surrounding code and some execution is deferred until later.

Not knowing anything about how CFA is actually implemented, presumably it could still make a prediction up until the first await statement?

For both your points it would seem that sync doesn't quite cover the nuance of immediate _invocation_, so immediate may be a better keyword.

@novemberborn I'm assuming though no annotation is immediate?

@tinganho,

I'm assuming though no annotation is immediate?

How do you mean?

Still think it would be dangerous to have implicit immediate. There are many callbacks where it doesn't matter and if it does matter, the developer should be explicit in order for CFA not to make assumptions.

@novemberborn

In the new Promise() example, the executor function implements the Revealing Constructor Pattern. I wouldn't classify it as a callback — the constructor guarantees it's invoked immediately. Callbacks are often invoked in a future turn

In below, my arguments so far have been that no annotation means sync:

interface PromiseConstructor {
    new <T>(executor: (resolve: (value?: T | PromiseLike<T>) => void /* is sync*/, reject: (reason?: any) => void) => void): Promise<T>;
}

Still think it would be dangerous to have implicit immediate. There are many callbacks where it doesn't matter and if it does matter, the developer should be explicit in order for CFA not to make assumptions.

@kitsonk the CFA need to assume either implicit deferred or implicit immediate in case of no annotation.

Or are you arguing for that either immediate or deferred must be annotated in all callbacks? Or just implicit deferred is more preferable? In case of the latter, CFA still makes assumption.

@tinganho I don't think this discussion is about how the callback behaves, it's about whether it's called before or after the function it's passed to has returned. PromiseConstructor guarantees it always invokes the `executor callback before returning.

the CFA need to assume either implicit deferred or implicit immediate in case of no annotation

It could be conservative and make _neither_ assumption, consistent with present compiler behaviour. Otherwise all the existing code out there with no annotation would suddenly compile differently (i.e. assuming immediate invocation). Being explicit with immediate (and/or deferred) _opts in_ to extra CFA assumptions that the compiler would not otherwise make.

the CFA need to assume either implicit deferred or implicit immediate in case of no annotation.

No, it needs to assume what it assumes today, which is it could be either, therefore values could have changed, so reset the types.

And picking an example of a Promise resolve is exactly the sort of situation where you can run into problems assuming too much, since promises are always resolved at the end of the turn:

let foo = { foo: 'bar' };

new Promise((resolve) => {
        console.log('executor -->', foo.foo);
        resolve(foo);
    })
    .then((result) => {
        console.log('result -->', result.foo);
    });

foo.foo = 'qat';
console.log('post -->', foo.foo);

/* logs
executor --> bar
post --> qat
result --> qat
*/

What's the difference between neither, both and async? Isn't neither or both assuming worst case which is the async case?

@tinganho currently this example has a "used before assigned" error under conservative (i.e. 'neither') assumptions. But CFA _could_ compile this successfully if it _knew_ that the callback would be invoked later (e.g. with a deferred annotation). So there is a difference between _no_ annotation and a deferred annotation (and an immediate annotation).

@yortus that was filed as a bug and fixed, the neither assumption is still in my opinion an async assumption no?

If you thought the bug should be there, then you are assuming an implicit immediate callback and not neither?

I mean let say a callback can be both. Wouldn't it in all cases assume the async case since it is the worst case?

Just for clarity I think when you say neither, you actually mean either? You must choose between async or sync or both. I don't think a callback can be classified as none of them.

@tinganho yes you are right, I suppose the 'fix' there means the compiler now does assume deferred execution on the basis of pragmatism. But to be clear, it assumes deferred _invocation_, not whether or not the callback itself is asynchronous.

I'm sure you have considered this at length but given the original example

declare function mystery(f: () => void);
let x: string | number = "OK";
mystery(() => {
    x = 10;
});
if (x === 10) { // Error, we believe this to be impossible  
}

I think the most intuitive behavior would be for a closure which rebinds x to cause the type of x to revert to string | number.

As @kitsonk remarked

While it is more "burden" on the developer, in the sense of "strictness" I still think it is safer to assume the worst (that it might not be immediate). Like with strictNullChecks overall, so I would be only in favour of an annotation that indicates that it is immediate.

While I am in favor of what CFA brings especially in catching logic errors and redundant (superstitious) checks, I think the example above should not be an error because the caller _is_ justified in checking x against 10 simply because mystery may have invoked the closure but it may not have invoked the closure.
Consider a case where mystery invokes the closure immediately, but conditionally based on some other arbitrary state.
consider:

function conditionalBuilder() {
  return {
    when: (predicate: () => boolean) => ({
      then: (action: () => void) => {
        if (predicate()) {
          action();
        }
      }
    });
  };
}

conditionalBuilder()
  .when(() => Math.random() * 100) % 2 > 1)
  .then(() => {
    x = 10;
  });

EDIT fixed typo, formatting

The useful distinction isn't between sync vs async.
There are many synchronous higher-order functions (let's not call them callbacks, because that sounds async) which are not always immediately invoked.

// Invoke N times when N may be 0
function f() {
    let x: number
    ([] as number[]).map(n => {
        x = n
    })
    console.log(x)
}

function provideDefault<T>(x: T | undefined, getDefault: () => T) {
    return x === undefined ? getDefault() : x
}

// Invoke 0 or 1 times
function g(n: number | undefined) {
    let x: number
    const y = provideDefault(n, () => {
        x = 0
        return 0
    })
    console.log(x)
}

function createIterator<T>(next: () => IteratorResult<T>) {
    return { next }
}

// Invoke later, but *not* asynchronously
function h() {
    let x: number
    const iter = createIterator(() => {
        x = 0
        return { value: undefined, done: true }
    })
    console.log(x)
}

We correctly flag all of these currently.

We (I) need to categorize all the different use cases here - things invoked immediately, things sometimes invoked immediately, etc, and come up with a taxonomy of what callbacks do so we can think about what a holistic solution might look like.

Dealing with a use case where we _frigged_ some types and were caught by the new weak type detection has highlighted that narrowed types work great, until that is referenced in a immediately invoked callback. It is great that CFA tracks perfectly, until the callback...

type FooObject = { values: string[] };
const map: { [key: string]: FooObject | string[] } = { a: { values: ['b'] }, b: ['c'] };

function bar(callback: () => void) { }

Object.keys(map).forEach((key) => {
    let value = map[key];
    if (Array.isArray(value)) {
        value = {
            values: value
        };
    }
    value; // FooObject - awesome!!!!
    bar(() => {
        value; // string[] | FooObject - sad face emoji
    });
});

There is a way to work around this problem -- and still keep types -- for the case where the compiler mistakenly assumes the value must be undefined or null.

The following reports, "Property 'toLowerCase' does not exist on type 'never'":

function firstOverToLower(above: string, among: string[]): string {
    let found: string|null = null;
    among.forEach(s => {
        if (s > above)
            found = s;
    });
    if (found === null)
        return "NONE";
    return found.toLowerCase();
}

A postfix bang on the last line eliminates the problem:

function firstOverToLower(above: string, among: string[]): string {
    let found: string|null = null;
    among.forEach(s => {
        if (s > above)
            found = s;
    });
    if (found === null)
        return "NONE";
    return found!.toLowerCase();
}

I realize this doesn't address the general case. Is the only solution to the general case to declare the variable in question as any?

There's another case to include here, that none of the examples above seem to cover. This triggers in React Components that initialize a variable to null in the component rather than in a constructor for the component.

A minimal contrived example (you wouldn't normally use null as an option here, but it serves to demonstrate what I am trying to describe):

interface MyComponentState {
  name: string | null;
}
export class MyComponent extends React.Component<{},MyComponentState> {
  state = { name: null };

  setName = (event: React.FormEvent<HTMLInputElement>) => {
    const name = event.currentTarget.value;
    this.setState({ name });
  }

  getNameLength = (): number => {
    const { name } = this.state;
    return name ? name.length : 0;
  }

  render{
    return (
      <div>
        { this.state.name === null && <p>No name set</p> }
        <input type="text" onChange={this.setName} value={this.state.name || ''} />
        <p>Length of name: {this.getNameLength()}</p>
      <div>
    );
  }
}

In the example above, this.state.name is considered to be permanently null - the getNameLength function will complain that it cannot get length from type never. The work-around is to change the initialization for state to be done inside a constructor function:

export class MyComponent extends React.Component<{},MyComponentState> {
  constructor(props: {}) {
    super(props);
    this.state = { name: null };
  }

In reality, that shouldn't be needed when there were no props to be passed in and the initial state did not depend on the input.

That is an unrelated (but also annoying) problem.

You are declaring that your class has a state prop that is of type { name: null }, which happens to be assignable to MyComponentState. But the type of this.state is not MyComponentState.

MyComponentState happens to be the type of super.state (even though it doesn't exist in reality), which is the type that this.state inherits if you don't declare it in the subclass body, like in your second example.

To get the correct type of state while initializing it in the class body, you need to declare it as readonly state: Readonly<MyComponentState> = { ... }.

@Kovensky Thank you for your explanation. I would never have expected to need to declare state as readonly - but it does make some sense given that all further access to change it is via a function instead of directly.

@dawnmist see also #10570

Is there a way to at least get this added for core functions?

Because code like this currently breaks, which it probably shouldn't. And adding unnecessary null checks anywhere is a) tedious and b) bundle bloat.

class A
{
    private test: string | null = null;

    public fun()
    {
        if (this.test === null)
        {
            return;
        }

        [1].forEach(
            el => {
                console.log(this.test.length); // <- this will not be null, but currently it is marked as error.
            }
        )
    } 
}

PS: I know that you can overwrite the prototype, but I am sure that TypeScript can detect that and remove the "is called sync" flag then. 😊

Another funny example I ran into:

let x: number | undefined = undefined;
function c() {
  x = 1;
}
c();
if (x !== undefined) {
  console.log(x.toFixed(1)); // ERROR: Property 'toFixed' does not exist on type 'never'
}

vs

let x: number | undefined;
function c() {
  x = 1;
}
c();
if (x !== undefined) {
  console.log(x.toFixed(1)); // OK
}

My original example had null in place of undefined but with undefined it's even more confusing as the assigning undefined doesn't really do anything.

I'll chime in here and say I ran into this today with code similar to this:

function interesting(arr: number[]) {
    let foundInterestingNumber: boolean = false;
    arr.forEach((item) => {
        if (item === 5) {
            foundInterestingNumber = true;
        }
    });

    if (foundInterestingNumber === true) { // error TS2367: This condition will always return 'false' since the types 'false' and 'true' have no overlap
        return "interesting!";
    } else {
        return "boring";
    }
}

console.log(interesting([5])); // prints "interesting!". So obviously that TS2367 error is wrong.
console.log(interesting([6])); // prints "boring"

Got this same TS2367 error in TypeScript versions 3.1.6, 3.3.1, 3.3.3, 3.4.0-dev.20190228.

What's odd to me, though, is that this does not give the error, even though it is pretty much equivalent:

if (foundInterestingNumber) {

Even without the addition of the immediate keyword for general use, it seems like TypeScript should be able to know that array builtins like forEach are invoked immediately?

I can see 3 features of the annotations:

  1. Do variables need to be initalised?

    let x: number | string;
    mystery(() => {
     return x; // Error or not?
    });
    

    YES if it might be called immediately

  2. Do variables have their discriminated type?

    let x: number | string = ...;
    if (typeof x === 'number') {
     mystery(() => {
       x.toFixed(); // Error or not?
     });
    }
    

    YES if it is only called immediately

  3. Do reassignments affect the calculated type?

    let x: number | string = '';
    mystery(() => {
     x = 10;
    });
    x // string OR number OR string | number
    

    YES, union of new and old type if it might be called immediately, replace with new type if it will be called immediately

Thoughts on deferred (called later 0+ times), immediate (called immediately 0+ times), immediate! (called immediately 1+ times)? Also these unions and intersections (all would require initialisation and not benefit from discriminated types):

  • immediate! & deferred (reassignment replaces type)
  • immediate & deferred (reassignment union type)
  • immediate | deferred (default - ignores reassignment)

I just ran into this as well. Here's my example:

var language: string|null = null;
var lineWithoutTag = line.replace(/\[([a-zA-Z\- ]+)\]/g, function(match, p1, offset, string){
    language = p1;
    return "";
});

if(language !== null) {
    language = language.toLowerCase();
}

I get Type error: Property 'toLowerCase' does not exist on type 'never'. TS2339

Is there any workaround to tell the compiler that it shouldn't be so clever, and consider the fact that language really could be a string or null?

@DouglasHeriot

var language = null as string|null;

Haven't read through the whole thread so not sure if this has been suggested already and I'm also not very familiar with the design/inner workings of the language so not sure if this is a horrible idea/difficult to implement either, but - what if, instead of decorating the parameter, you decorated the return type? That way you could, using already existing patterns like union types and promises and conditional types (which I think is a _massive_ benefit), do...

// Read: this function will call f immediately
function callFunction(f: () => void): calls f {
  f()
}
// Read: this function returns T and calls f
function callAndReturnFunction<T>(f: () => T): T & calls f {
  return f()
}
// Read: this function returns a Promise which, when done, will return T and have called f
function promiseFunctionCall<T>(f: () => void): Promise<T & calls f> {
  return new Promise((resolve) => 
    setTimeout(() => resolve(f()), 1000)
  )
}
// Read: this function either calls f or returns (and calls) nothing 
function mightCallFunction(f: () => void): calls f | undefined {
  if(Math.random() > 0.5) {
    f()
  }
}
// Read: if B is true, this function calls f; otherwise, it returns (and calls) nothing
declare function callFunctionIf<B extends boolean>(f: () => void, bool: B): B extends true ? calls f : undefined

... And this syntax works better than you'd think since TypeScript forces all function types to annotate argument names, anyway, so even in anonymous function you'd be able to refer to the parameter by name in the return type. Not sure how you'd write the type for functions that might call f at any point in the future, unbound by Promises and such, though. defer calls f? Not sure I like that.

(Response is about situation in lieu of immediate annotation -- although such an annotation would be awesome to have!)

@aluanhaddad
RE: https://github.com/microsoft/TypeScript/issues/11498#issuecomment-252886165

I'm sure you have considered this at length but given the original example

declare function mystery(f: () => void);
let x: string | number = "OK";
mystery(() => {
    x = 10;
});
if (x === 10) { // Error, we believe this to be impossible  
}

I think the most intuitive behavior would be for a closure which rebinds x to cause the type of x to revert to string | number.

It blows my mind that this is not what TypeScript does -- any (possible/potential) rebinding of the symbol whatsoever should revert to the explicitly defined type from the variable declaration.

It blows my mind that this is not what TypeScript does -- any (possible/potential) rebinding of the symbol whatsoever should revert to the explicitly defined type from the variable declaration

You're saying that this code should be an error:

function fn(x: string | undefined) {
  mystery(() => x = undefined);

  if (x !== undefined) {
    console.log("It's not undefined");
    console.log(x.toLowerCase());
                ~ Error, value is possibly 'undefined'
  }
}

You're saying that this code should be an error:

Why wouldn't the (nearer) CFA of the if take precedence at this point?

Why wouldn't the (nearer) CFA of the if take precedence at this point?

It can't, because the callback could have just executed:

let _cb: any;
const console = {
  log(s: string) {
    _cb();
  }
}
function mystery(fn: any) {
  _cb = fn;
}
function fn(x: string | undefined) {
  mystery(() => x = undefined);

  if (x !== undefined) {
    console.log("It's not undefined");
    // Crashes
    console.log(x.toLowerCase());
  }
}

🤦‍♂

So if a binding is reassigned by any path of any "passed" function expression, I guess I'm saying I would prefer the string | undefined (from the declaration's explicit type) held true (until support for the immediate annotation comes along that is 😸). This _is_ assuming that all three of the following expressions would work in such a world (well... at least the first 2):

x && x.toLowerCase()

x?.toLowerCase()

if (x) {
  // no "interesting" stuff (i.e., no invocations)
  x.toLowerCase()

_It's a Crappy JavaScript World™_, but the way it is today (CFA assuming that the callback _couldn't_ have executed):

// example1
let result: {} | undefined = undefined;
[""].forEach(() => {
    result = {};
});
return result; // undefined

Seems like too strong an assumption.

That example2 is different from example1 just hurts:

// example2
let result: {} | undefined;
[""].forEach(() => {
    result = {};
});
return result; // {} | undefined

We tried both when writing CFA and felt the current behavior was much, much better despite the obvious limitations. The current behavior also gives you a straightforward workaround:

let x = "OK" as string | number;
mystery(() => {
    x = 10;
});
if (x === 10) { // OK
}

whereas the pessimistic behavior would have no such analog.

It's easy to say "give me the set of problems I don't currently have", because you don't know how many of them there are.

It's easy to say "give me the set of problems I don't _currently_ have"

Oh, for sure! (see my _I guess I'm saying_ rambling in the above response 😄)

Copy-pasting my comment from https://github.com/microsoft/TypeScript/issues/37107#issuecomment-592693563,

The issue I have with the word immediate and sync is that they don't describe this case well,

async function bar (callback : /*beforeReturnedPromiseResolves*/ () => void) {
  await new Promise(resolve => setTimeout(resolve, 1000));
  //Not immediate, not synchronous, but definitely before promise resolves
  callback();
}
async function foo (arg : string|undefined) {
  if (typeof arg == "string") {
    await bar(() => {
        //Should be OK, even though bar() is async
        //And callback is called asynchronously,
        //it is still invoked before bar()'s promise resolves
        const str : string = arg;
    });
    arg = undefined;
  }
}

BTW, this could be done for a very broad subset without any sort of keyword annotation - if the closure is created after the narrowing, you can safely extend it without issue. I find it surprising that only the thunk in this demo is having an error - it suggests to me you all are just processing it almost as if the suspensions aren't even there.

You could capture another broad subset by adding something like deferred (...args: Args[]) => Result as a supertype of the non-deferred functions, and allow those functions to be used in cases like setTimeout and promise .then callbacks. The main constraint is they can't be called during the lifetime of that function.

not sure if it's related to this issue, anyway this is what happens with my code:

let resolver;
new Promise(resolve => (resolver = resolve));
console.log(resolver); // error: Variable 'resolver' is used before being assigned.ts(2454)

@oswaldofreitas

not sure if it's related to this issue, anyway this is what happens with my code:

let resolver;
new Promise(resolve => (resolver = resolve));
console.log(resolver); // error: Variable 'resolver' is used before being assigned.ts(2454)

You can write:

// The `!` after the variable identifier tells TypeScript
// to treat this variable as always being assigned.
let resolver!: (value?: unknown) => void;
new Promise(resolve => (resolver = resolve));
console.log(resolver);
Was this page helpful?
0 / 5 - 0 ratings