Flow: The inferred types of array literals are unsound

Created on 1 Dec 2018  Â·  4Comments  Â·  Source: facebook/flow

As seen here, this program involving a mutated array typechecks, which is unsound:

/* @flow */
const arr = ['1', 0];
arr.sort();
const x: string = arr[0]; // unsound!

As far as I can tell, Flow treats the inferred type of arr sort of like a tuple, so arr[0] has type string and arr[1] has type number. However, it also doesn’t treat it like a tuple, since it allows .sort() and similar mutating methods to be applied to the value.

This only gets worse when I try to deduce what kind of mysterious type Flow is inferring for arr. If I use flow type-at-pos on the binding site (or any of the use sites), it claims it has inferred the type Array<string | number>, but if I add that type as a type annotation, the program (correctly) fails to typecheck, as seen here:

/* @flow */
const arr: Array<string | number> = ['1', 0];
arr.sort();
const x: string = arr[0];
4: const x: string = arr[0];
                     ^ Cannot assign `arr[0]` to `x` because number [1] is incompatible with string [2].
References:
2: const arr: Array<string | number> = ['1', 0];
                             ^ [1]
4: const x: string = arr[0];
            ^ [2]



md5-fe66bb4886f0fcbfe2f1f26b44144818



3: arr.sort();
^ Cannot call arr.sort because property sort is missing in $ReadOnlyArray [1].
References:
2: const arr: [string, number] = ['1', 0];
^ [1]
```

Maybe the typechecker is tracking some refinements on individual array elements in a way I do not understand, which would not be reflected in the type reported by type-at-pos, but even if this is true, then it is still a bug, since .sort() should invalidate those refinements.

soundness tuple bug

Most helpful comment

I went digging through the Flow source code for some answers, just out of curiosity, and I think I have a little more understanding about what’s going on. However, disclaimer: I know essentially nothing about how Flow actually works, and this is my first time looking at any of Flow’s code at all, so this could be totally off the mark.

Anyway, I found the place where the type of array literals appears to be inferred, which comes down to these lines here:

https://github.com/facebook/flow/blob/147eb3fe43b407fb70506b750d73734e49241b05/src/typing/flow_js.ml#L10631-L10632

Notably, literals are handled specially, since they get both an element type and a tuple type, whereas non-literal arrays only get an element type. In a sense, it seems like this is a little bit like an ad-hoc intersection, so that the type can be used both ways, absent a type annotation to disambiguate user intent. Of course, it isn’t actually an intersection type, and in particular, the way it’s handled is unsound!

(I imagine much of what I said above is obvious to some or all of the Flow maintainers, but it certainly wasn’t obvious to me! So consider the above a little bit of thinking out loud.)

I don’t know enough about how Flow actually works to have any idea what the right fix is, but it seems to me like the status quo is unacceptably unsound (much more so than the unsoundness that arises from out-of-bounds array accesses). As a random guess, it seems like perhaps the extra tuple information ought to be dropped if any non-read-only methods are invoked on the array, but I don’t know if that actually means anything in the context of how Flow handles type refinements.

As a final note, I found it extremely confusing that Flow infers a type that can’t be printed using type-as-pos because there is no surface syntax to actually write the type down! I guess maybe that’s just a fact of life for Flow, since the internal type system tracks a lot more information than is made visible in the source syntax, but I wish Flow at least had some way to communicate to me that the type it has printed is imprecise.

All 4 comments

I went digging through the Flow source code for some answers, just out of curiosity, and I think I have a little more understanding about what’s going on. However, disclaimer: I know essentially nothing about how Flow actually works, and this is my first time looking at any of Flow’s code at all, so this could be totally off the mark.

Anyway, I found the place where the type of array literals appears to be inferred, which comes down to these lines here:

https://github.com/facebook/flow/blob/147eb3fe43b407fb70506b750d73734e49241b05/src/typing/flow_js.ml#L10631-L10632

Notably, literals are handled specially, since they get both an element type and a tuple type, whereas non-literal arrays only get an element type. In a sense, it seems like this is a little bit like an ad-hoc intersection, so that the type can be used both ways, absent a type annotation to disambiguate user intent. Of course, it isn’t actually an intersection type, and in particular, the way it’s handled is unsound!

(I imagine much of what I said above is obvious to some or all of the Flow maintainers, but it certainly wasn’t obvious to me! So consider the above a little bit of thinking out loud.)

I don’t know enough about how Flow actually works to have any idea what the right fix is, but it seems to me like the status quo is unacceptably unsound (much more so than the unsoundness that arises from out-of-bounds array accesses). As a random guess, it seems like perhaps the extra tuple information ought to be dropped if any non-read-only methods are invoked on the array, but I don’t know if that actually means anything in the context of how Flow handles type refinements.

As a final note, I found it extremely confusing that Flow infers a type that can’t be printed using type-as-pos because there is no surface syntax to actually write the type down! I guess maybe that’s just a fact of life for Flow, since the internal type system tracks a lot more information than is made visible in the source syntax, but I wish Flow at least had some way to communicate to me that the type it has printed is imprecise.

One more thing: it looks like this issue has been previously reported in #4702, but I think I’ll still leave this issue open for now, since I think it adds some extra context that the linked issue does not include (and includes the scary word “unsound”, so maybe it will capture just a little more attention 😜).

Thank you for your analysis and debugging here. I'm not sure when we'll get around to fixing this, but the level of detail you went into here will make it much easier for whoever picks this up later.

Another example:

const a: [number, string] & Array<number | string> = [0, ''] // it' ok

function f(b: [number, string]) {
  b[0].toFixed(2)
}

function g(b: Array<number | string>) {
  b.pop()
  b.pop()
}

g(a)
f(a)

const a: [number, string] & Array<number | string> = [0, '']

is unsafe, only

const a: [number, string] & $ReadOnlyArray<number | string> = [0, '']

should be allowed.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

l2silver picture l2silver  Â·  3Comments

mjj2000 picture mjj2000  Â·  3Comments

pelotom picture pelotom  Â·  3Comments

bennoleslie picture bennoleslie  Â·  3Comments

Beingbook picture Beingbook  Â·  3Comments