Flow: Request: Safe (statically checked) downcast operation

Created on 15 Jun 2018  Â·  22Comments  Â·  Source: facebook/flow

This is a request for a new expression form $Downcast<S>(t), with the
following semantics:

  • The type of t is required to be some T such that S <: T.
  • The type of the whole expression is S.
  • The runtime elision is simply t: i.e., this has no runtime effect.

The alternate form $Downcast(t) should infer the appropriate S.

This is morally equivalent to a first-class form of the function

function downcast<T, S: T>(t: T): S {
  return (t: any);
}

except that something like (downcast("wat"): number) typechecks
because T can be inferred to mixed or S can be inferred to
empty, so any use case must explicitly bound both type parameters,
which makes this formulation sufficiently inconvenient as to not be
actually useful. A first-class form need not suffer from this failure,
would not need to be defined or imported, and would not incur a runtime
penalty.


As motivation: it is frequently the case that I have a value nominally
of some type T that I know, from runtime invariants that cannot
reasonably encoded in the type system, is actually an instance of S.
The temptation is always to simply cast it through any, but this can
easily lead to errors due to a miscommunication between my mental model
and the actual code.

For instance, consider some code like the following:

declare class FruitStand {
  catalog(): Iterator<{fruit: Fruit, priceInCents: number}>;
}
class FruitMarket {
  appleStand: FruitStand;
  orangeStand: FruitStand;
  *apples(): Iterator<Apple> {
    for (const apple of this.appleStand.catalog()) {
      // We know that the `appleStand` actually only contains apples, so
      // this cast is surely safe!
      yield (apple: any);
    }
  }
  // …
}

The invariant that the apple store only holds Apples may well be true,
but the code is incorrect; it is casting from a {fruit, priceInCents}
record, when it really meant to just cast from a Fruit. (That is, the
code should properly yield (apple.fruit: any) instead.)

This kind of mistake may seem like trivial programmer error, but it is
an easy mistake to make. When the declaration is far from the call site
(say, in the above example, if catalog is defined in another module),
one might simply forget the correct type on the other end. Or, the
mistake can arise during refactoring: if catalog originally returned
an Iterator<Fruit>, a developer might decide to upgrade it to its
current signature by simply changing the type and fixing all the
resulting type errors—but this would fail to catch the any-cast in the
above client.

I’ve made mistakes identical in structure to this example on multiple
occasions.

Downcasts can be a natural and unavoidable part of writing real software
in a system like Flow, and it is unfortunate that in the current state
of affairs one must lose significantly more safety than is theoretically
required.

discussion

All 22 comments

Downcasting is inherently unsafe and unlike languages like java, flow does not and can not provide runtime checks for it. If you want to catch these errors use strict mode, which doesn't allow the any type. In your motivating example you never express your assumption that the apple store only holds apples to the typesystem. If you can not do this, the typesystem can't help you.

Downcasting in flow can safely be done through type refinements.

Try

In your example it would be even better to parameterize FruitStand with the concrete Fruit type

Try

Downcasts are avoidable and should be avoided.

In your motivating example you never express your assumption that the
apple store only holds apples to the typesystem. If you can not do
this, the typesystem can't help you.

I completely agree: the type system can help me only insofar as I
properly express my invariants. And I try my hardest to do so—from
simply adding type parameters and using type refinements to cooking up
complex combinations of type-level dictionaries and $ObjMaps. But some
invariants are simply too complex to express to Flow’s type system. When
Flow simply doesn’t have the vocabulary to express the invariants that I
need, I am forced to cast through any.

You’re right, of course, that in the particular case of my previous
example it suffices to parameterize the type of FruitStand, but such
easy fixes are not always possible. Here; let me provide an example that
of code that we can _prove_ is correct under all circumstances, but for
reasons that we simply cannot express to Flow.

Suppose that we have one module, called tshc.js, with the following
contents:

// @flow
// tshc.js

export opaque type TypesafeHeterogeneousContainer: _TSHC = _TSHC;
interface _TSHC {
  put(key: Symbol, value: mixed): void;
  getAll(key: Symbol): mixed[];
}

export function make(): TypesafeHeterogeneousContainer {
  const data: {[Symbol]: mixed[]} = {};
  function ensureExists(key: Symbol) {
    if (!data[key]) {
      data[key] = [];
    }
  }
  function put(key: Symbol, value: mixed): void {
    ensureExists(key);
    data[key].push(value);
  }
  function getAll(key: Symbol): mixed[] {
    ensureExists(key);
    return data[key].slice();
  }
  return {put, getAll};
}

Because of the way that the opaque types are set up, the only possible
way for a TypesafeHeterogeneousContainer to be constructed is for it
to be the result of a call to make(). Therefore, a client that
receives a TypesafeHeterogeneousContainer knows that it behaves
according to this implementation.

Now, suppose that we have another module, called widgets.js, with the
following contents:

// @flow
// widgets.js

import {type TypesafeHeterogeneousContainer} from "./tshc";

const TSHC_KEY: Symbol = Symbol("widget");
export type Widget = {|+gizmo: string, +doodad: number|};

function addWidget(collection: TypesafeHeterogeneousContainer, widget: Widget) {
  collection.put(TSHC_KEY, widget);
}

function getWidgets(collection: TypesafeHeterogeneousContainer): Widget[] {
  const definitelyWidgets = collection.getAll(TSHC_KEY);  // (*)
  return (((definitelyWidgets: mixed[]): any): Widget[]);
}

Because the TSHC_KEY is a Symbol that is not exported from the
widgets.js module, no other module can access its value, and so no
other client can call collection.put with TSHC_KEY as the first
argument. Therefore, on the line marked (*), we can be absolutely
certain that every element of getAll is actually a Widget.

Other modules like widget.js can of course adopt the same pattern, and
there can be arbitrarily many of these modules, and they might be known
only at runtime, so the typesafe heterogeneous container cannot be
parameterized over all of them.

Let me summarize. On the line marked (*), we have a value that Flow
knows is of type T, and which we know to be of type S, a more
specific type. There is no way to express the appropriate constraint to
Flow. Therefore, we are forced to cast through any.

This is a very real example, by the way. It is the backbone of a _plugin
interface_: the TSHC is the plugin manager, and modules like widget.js
are plugins interacting with a datastore. In fact, such a plugin
architecture is exactly what motivated me to open this feature request
in the first place.

Downcasts are avoidable and should be avoided.

It looks like you’re a Haskell person, so let me take the risk of
providing an analogy. Foreign function calls in Haskell are unsafe and
should be avoided—but sometimes they’re exactly what you need, and so
Haskell provides a facility for them. This facility doesn’t just expose
a blanket operation like

ffi :: FunctionName -> a -> IO b

; instead, it lets you declare the intended types:

foreign import ccall "hypot" hypot :: Double -> Double -> Double
foreign import ccall "getuid" getuid :: IO Int

main = do
  getuid >>= print       -- typesafe!
  print (hypot 3.0 4.0)  -- typesafe!

These declarations are still dangerous: nothing stops me from writing

foreign import ccall "hypot" hypot :: Int -> Int
foreign import ccall "getuid" getuid :: IO Double

and then getting unexpected behavior at runtime. Haskell does not and
cannot provide runtime checks that these declarations are correct. But
Haskell has at least given me the tools to limit the extent of the
danger. As long as my assumption that the actual type of "hypot"
matches the declaration—which I can’t prove to the Haskell
compiler—holds, the rest of the program is still safe.

In the same way, when I downcast from T to S, as long as my
assumption holds that—due to invariants that I can’t express to the Flow
typechecker—the provided value is actually an instance of S, the rest
of the program is still safe.

So, I would amend your claim to: downcasts are _often_ avoidable and
should be avoided _when possible_. In the cases where they are
necessary, Flow should help the user as much as possible. This is all
I’m asking for.

Your problem boils down to the fact that you have badly modeled your _TSHC type. It's a container of some data type, but rather than providing an explicit type you just use mixed. The fact that it is an opaque type doesn't change anything. If you add some trivial generics to your example it works.

Try


Haskell FFI has nothing to do with the problem at hand.

Yes, the flow typesystem is not perfect and is filtering out useful programs. That is the nature of typesystems. But all your examples can be solved inside of flow if you model your data correctly. Flow provides a way to circumvent the typechecker if you know that a program that does not typecheck is in fact correct. I don't think that there should be additional facilities to incentivize circumventing the typesystem.

@chisui, you misunderstand the example. The TSHC is a container not
just of widgets but of many types. (Hence: typesafe _heterogeneous_
container.) We have widget.js and mcguffin.js and zombo.js, all
storing data into _the same instance_ of the data structure. With your
parameterized example, this is a _TSHC<Widget | McGuffin | Zombo>, but
then of course when you getAll(TSHC_KEY) in the widget module you
have to again cast from an array of union type to a more specific type.

In any case, the specifics of the example are not important. The salient
point is that there exist invariants—even relatively simple ones!—that
cannot be modeled inside Flow. In those cases, I want to be able to
easily sacrifice type safety _locally_, not _globally_.

Flow provides a way to circumvent the typechecker if you _know_ that a
program that does not typecheck is in fact correct.

…but it only provides “entirely hands-off, completely unchecked” way to
circumvent this, not a precise “take this one additional axiom” approach.

I don't think that there should be additional facilities to
incentivize circumventing the typesystem.

In cases like these, I am already circumventing the type system. I will
continue to use any, overkill and distasteful though it is, because it
is the only option available to me.

Your typesafe heterogenous lists are not typesafe then.

Type refinements are a safe way of downcasting.

I'll copy the (very simple) example I posted in #6516 here as well. Try this.

In the example @wchargin posted over on the other thread, the $Downcast option would provide a workaround to cases in which Flow's type refinement system simply doesn't work. Ideally, though, the type refinement system would work automatically out of the box. What would be the suggested way to correctly model data to ensure that examples like this work within the confines of type refinement?

As I described there, there are ways to solve this problem with existing flow features. No special downcast utility type needed.

@chisui suggests working around the fact that filter does not refine
the result type by using the (as-of-yet unimplemented) flatMap.
There’s nothing wrong with this approach by itself, but replacing a
simple filter with a flatMap whose callback always returns a
subsingleton list violates Flow’s design principle of supporting
“JavaScript, your way”:

Flow is designed to understand idiomatic JavaScript. It understands
common JavaScript patterns and many of the weird things we JavaScript
developers love to do.

When we have a list of items and want to filter them, we naturally turn
to the filter function. I don’t want to make my code more convoluted
through this layer of indirection, and I certainly don’t want to _read_
code that looks like that.

What this filter example and the typesafe heterogeneous containers
that I mentioned earlier have in common is that they really _are_
typesafe, in the formal sense that guarantees of the preservation
theorem hold. That is, these programs are typesafe because if a variable
has static type Ď„, then at runtime its value always also has type Ď„.
In other words, “nothing goes wrong”, even though we had to cast through
any.

It is Flow’s goal to understand these kinds of natural JavaScript code.
Flow is not a Hindley–Milner typesystem, and not all code that
JavaScript developers write fits neatly into a Hindley–Milner framework.
A safe downcast operation helps Flow help developers more.

By the way, I note that this feature is built in to TypeScript:

const x: number = 3;

const a: 3 = x;  // fails, as expected
const b: 3 = x as 3;  // works!
const c: string = x as string;  // fails! 'number' not subtype of 'string'

Edit to add: this is documented in the “Basic types” section of the
TypeScript documentation
.

I think your typescript example illustrates pretty well why having downcasts inside the language is pretty dangerous. Just change the code of x to any other number and the assignment to b still works. You can even have expressions like 1 as 2 which is simply wrong.

If you expose such a tool to the user it will be abused. It shouldn't be easy to subvert the typesystem. having to write an ugly cast through any makes you aware of what you are about to do. Exposing such a powerful and having it look the same as other utility types normalizes its use, in strict mode it should definitly not be possible to use.

That being said. I'm still not convinced that such a tool is actually needed. All examples I have seen can be reworked to work with the exsting features of flow.

Here is another attempt at your heterogenous collection without the need for the consumer of your data structure having to use ugly casts:

Try

Exposing such a powerful and having it look the same as other utility
types normalizes its use, in strict mode it should definitly not be
possible to use.

This is eminently reasonable, in my opinion. If strict-mode disables
any casts, it makes sense for it to also disable downcasts. (After
all, you can downcast to empty.) I’m totally open to a different
syntax. Calling the form $UncheckedDowncast or $UnsafeDowncast or
whatever would be fine with me, as would changing the grammar: e.g.,
making it an infix operator à la TypeScript’s as.

Just change the code of x to any other number and the assignment to
b still works.

Of course. This is the point.

Here is another attempt at your heterogenous collection without the
need for the consumer of your data structure having to use ugly casts:

Your code still has not managed to avoid an any-cast or take advantage
of type refinements. Furthermore, it is unsound: unlike the code that I
provided, clients can use your code _without any additional casts_ to
violate type safety and produce runtime errors. Neither of these
indicates to me that the code has been “reworked to work with the
exsting features of flow”.

Can you explain to me how the heterogenous container example is unsound? All I did was adding a phantom type to the key type and make put and getAll to use this phantom type to check that only values of that type are put in and taken out. Any unsoundness present in this version has also to be present in the original example.

edit:

The only way I could come up with to break the implementation is if you allow the same Symbol to refer to different type:

const s = Symbol("sym")
const A: ID<A> = s
const B: ID<B> = s

This can easily be made impossible by making ID opaque and providing a constructor function, which I omitted since opaque doesn't do anything in Try Flow.

The only way I could come up with to break the implementation is if
you allow the same Symbol to refer to different type:

Yes, that is precisely the problem:

const container = make();
const key = Symbol();
container.put(key, 17);
const myStrings: string[] = container.getAll(key);
myStrings[0].toLowerCase();  // runtime error, not caught statically

Indeed, making the ID type opaque and exposing a constructor fixes it.

Now, suppose that we make this modification. Where are we? We have
written some code that is fully typesafe, in the sense that any client
code without additional casts will be safe from runtime exceptions. And
yet the implementation of the TSHC still requires a cast through any:
we can prove that the cast is sound, but we can’t express that proof to
Flow.

Tomorrow I might come along and refactor the implementation of the TSHC
to be backed by immutable.js lists instead of raw arrays.
As this is an implementation-only change, the API to the user is not
affected, so my plan is to change the type definition of data and fix
all resulting type errors. Doing so, I find and fix errors in
ensureExists and put, while getAll is already type-correct and
needs no changes. And yet the function is now wrong: Immutable.Lists
do have a slice method, but it returns another Immutable.List, not
an array!

If the (data[key].slice(): any) had instead been a checked downcast,
then Flow would have been able to tell that Array<T> is not a subtype
of Immutable.List<T> and raised an appropriate error.

  1. If you invariants that you don't express in the typesystem you need some other form of proof. The most realistic one beeing tests. If you have done so your tests should fail after this kind of refactoring. You would still need to do this if the proposed downcast existed.
  2. $WhateverDowncast wouldn't help you in this case either since A <: B does not imply T<A> <: T<B>. For that to work we would need some way of stating variance of type parameters.

your tests should fail after this kind of refactoring

Of course, but you could say this about _any_ feature that grants
additional typesafety. Types complement tests; tests complement types.

we would need some way of stating variance of type parameters.

immutable.js lists are covariant.

But this is all beside the larger point. The code with a checked
downcast is objectively _strictly safer_ and has _strictly stronger
invariants_ than the code with the cast through any.

And I demonstrated that most of these invariants can be expressed without subverting the typesystem. If we encounter some that we can not express we should not try to add more facilities to subvert the typesystem but rather make flow expressive enough to do so.

Your request seems self-contradictory (unsafe cast, yet safe and statically analyzed?), but -- if I'm understanding -- what you really want is an unsafe cast that still does some amount of type checking.

Perhaps $Subtype or $Supertype is what you're looking for? Example

I'm not sure how useful they really are, but I figured some documentation (including unsafety warnings) is better than none. Hence #6887

Hi @aij—thanks for your input.

what you really want is an unsafe cast that still does some amount of
type checking.

Yes: an unsafe cast that Flow verifies can _possibly_ succeed.

Perhaps $Subtype or $Supertype is what you're looking for

No, I don’t think so. In your example baz = 42 indeed fails to
typecheck, but (baz: 42) _does_ typecheck. This is because the subtype
of T can be selected to be empty, which can then be _safely_ upcast
to number. For instance:

// @flow
type T = { foo: number };
type U = { foo: number, bar: string };
const baz: $Subtype<T> = { foo: 0 };
const quux: number = baz; // I want this to fail: _no_ `T` is a number.

I figured some documentation (including unsafety warnings) is better than none.

I appreciate the thought, and the documentation helps, but I still don’t
understand what $Subtype and $Supertype are actually doing and how
they interact with inference, so I don’t feel comfortable using them in
actual code. (It’s a shame—I don’t know any other way to specify lower
bounds on type parameters.)

You can combine unsafe types to have differently unsafe types: This fails to typecheck. & will give you a subtype of two types, while | will give you a supertype. I have a feeling that's not quite what you're looking for though.

Of course, no need to go all the way to empty, T & number is also a type, though given the right assumptions/axioms, it could be provably uninhabited. With a sufficiently expressive type system, proving all uninhabited types are uninhabited is not going to be generally possible though.

I'm curious: What does Typescript do? (Not saying flow should necessarily do the same thing...)

You can combine unsafe types to have differently unsafe types

Sure, but this is very ad hoc. Change the type of quux from number
to {| foo: string |} in your example and the trick doesn’t work. I
also think that it’s clear that this is not a clear or readable
workaround.

& will give you a subtype of two types

I _wish_ that this were true, but it’s not. Flow’s intersection types
are badly broken. Contrary to what the docs suggest, these types are
_not_ infima. For instance, they’re neither commutative nor associative
(see #5895, #6641).

Union types do seem to act as suprema, for what it’s worth.

Of course, no need to go all the way to empty, T & number is also
a type, though given the right assumptions/axioms, it could be
provably uninhabited.

Right, yes, of course. This is why existing workarounds do not suffice.
In my proposed $Downcast<S>(t), S would be required to be a subtype
of the declared type of t. (Not an arbitrary inferred type of t,
which could always be mixed.)

I'm curious: What does Typescript do?

In case you missed it, TypeScript has been discussed previously in this
thread. The summary is that its as-operator appears to behave exactly
as I want. This continues to be the case with your example:

// @flow
type T = { foo: number };
type U = { foo: number, bar: string };
const baz: T = { foo: 0 };
const quux: U = baz;  // fails (good)
const xuuq: U = baz as U;  // OK (good)

I wish that this were true, but it’s not. Flow’s intersection types
are badly broken.

I was assuming that was due to bugs rather than intentional... Note that I filed one of the issues you referenced. :smile:

In my proposed $Downcast<S>(t), S would be required to be a subtype of the declared type of t.

Does that imply t would have to be a variable rather than an expression? I'm not exactly sure where Flow uses declared types rather than annotations...

In case you missed it, TypeScript has been discussed previously in this thread.

That's why I asked. It sounded like it behaves like you wanted, so I was wondering what it does in order to produce that behavior.

The documentation you liked earlier says

Type assertions are a way to tell the compiler “trust me, I know what I’m doing.” A type assertion is like a type cast in other languages, but performs no special checking or restructuring of data. It has no runtime impact, and is used purely by the compiler. TypeScript assumes that you, the programmer, have performed any special checks that you need.

From what you said earlier:

Yes: an unsafe cast that Flow verifies can possibly succeed.

Re: incompleteness: Playing with the Typescript playground, whatever TS is doing, it seems to err on the side of rejecting even casts that can succeed:

type T = { foo: number };
type U = { foo: number, bar: string };
type V = { bar: string };
const baz = { foo: 0, bar: "" };
const qux: T = baz;
const quux: V = qux as V;  // fails
const xuuq: V = qux as U;  // OK

Which seems contrary to the documentation, since I know that qux really is a V.

Cool, sounds like we’re mostly on the same page. :-)

Does that imply t would have to be a variable rather than an
expression?

Perhaps. Or, if feasible, inference could find the most specific
inferred type of T assuming that the downcast weren’t there, and use
that type. Or, the T could be explicitly specified in the downcast,
making it $Downcast<T, S>. I’m not sure what’s best.

Re: incompleteness: Playing with the Typescript playground, whatever
TS is doing, it seems to err on the side of rejecting even casts that
can succeed:

Yep, this is definitely strange. A simple modification of this test case
shows that the “allowed to cast” relation is not transitive:

type T = { foo: number };
type U = { foo: number, bar: string };
type V = { bar: string };
declare var t: T;
(t as U);         // OK
((t as U) as V);  // OK
(t as V):         // not OK...?

I guess that I should retract the claim that TS-as does _exactly_ what
I want! It still seems like a reasonably close approximation, though,
and again I haven’t looked into it very much.

Was this page helpful?
0 / 5 - 0 ratings