Typescript: recursive type definitions

Created on 12 Jun 2015  Â·  35Comments  Â·  Source: microsoft/TypeScript

I have a recursive type definition for JSON Serializable entities (Exhibit A, edited after discussion with @JsonFreeman ):

export type IJSONSerializable = IJSONSerializableObject | number | string | IMaybeRecursiveArray<IJSONSerializableObject | number | string>;

interface IMaybeRecursiveArray<T> {
   [i: number]: T | IMaybeRecursiveArray<T>;
}

interface IJSONSerializableObject {
    [paramKey: string]: IJSONSerializable
}

I would like to be able to write the following, but get a circular reference error (Exhibit B):

export type IJSONSerializable = {[paramKey: string]: IJSONSerializable} | number | string | Array<IJSONSerializable>;

How difficult would it be to address this limitation of the type system?

Fix Available Suggestion

Most helpful comment

Picking up on the suggestion in #3988, here is how you would currently write a correct recursive JSON data type:

type JSONValue = string | number | boolean | JSONObject | JSONArray;

interface JSONObject {
    [x: string]: JSONValue;
}

interface JSONArray extends Array<JSONValue> { }

The trick is to make the recursive back references within interface types. This works because resolution of interface base types and interface members is deferred, whereas resolution of type aliases is performed eagerly. Ideally you'd be able to replace JSONArray with JSONValue[] in the definition of JSONValue, but certainly this is a reasonable workaround for now.

All 35 comments

That isn't just recursion, it is infinite recursion. How do you suggest the compiler could resolve that, since it needs to understand IJSONSerializable to be able to type guard IJSONSerializable?

I haven't looked into the typescript compiler code. Lazily evaluate the type definition? Translate something like Exhibit B back into Exhibit A? If you point me to the relevant area in the code base, I can attempt to address it.

Edited previous comments for clarity.

@opensrcken Compiler code is not necessary here to define a general algorithm that will solve the problem.

export type IJSONSerializable = {[paramKey: string]: IJSONSerializable} | number | string | Array<IJSONSerializable>;
var aThing: IJSONSerializable = { foo: 'a' };

Lazily evaluating the original type definition isn't the issue. The problem comes as soon as you try to use it, like here, where we need to check whether something is assignable to that type. So the compiler asks is { foo: string } assignable to IJSONSerializable? Well to answer that question it needs to know whether { foo: string } is assignable to { [paramKey:string]: IJSONSerializable }. So then it has to check whether the type of foo is assignable to IJSONSerializable and then we're back to trying to answer the first check again and infinitely recursing. One way we break out of this situation today is by falling back to any after a certain number of loops, but that's not what you want here since that would make this type definition essentially useless (everything would be assignable to it).

@danquirk, I think the compiler implementation is relevant.

Let's simplify this a bit -- we are trying to define the type of a data structure that is legitimately recursive, theoretically infinitely so. For simplicity's sake, let's think about a vanilla Binary Tree, instead of JSON.

It would seem to me that the compiler should understand the possibility of an infinitely recursive type, and only match a data structure that claims to match that type if it also detects that the data structure itself is theoretically recursive in the same way. The missing piece in your example is the ability to detect infinite recursion in a type definition, without actually having to recurse infinitely, and treat that as a type in and of itself, distinct from any.

Remember, the very first example in the OP _is_ infinitely recursive, it's just not _directly_ self-referential:

export type IJSONSerializable = IJSONSerializableObject | number | string | Array<IJSONSerializableObject | number | string>;

interface IJSONSerializableObject {
    [paramKey: string]: IJSONSerializable
}

So are you saying that the compiler is actually treating this as any under the hood? If not, it seems the compiler is already able to understand this notion to some degree.

@opensrcken The types you've defined in exhibit A and exhibit B are not even the same. In exhibit B, your type would allow Array<Array<number>>, whereas the type in exhibit A would not. Am I correct?

The reason this is an error is the Array<IJSONSerializable> reference. In order to know what that type represents, we cannot create a type for Array<IJSONSerializable> because we don't even know if it's an object type. Furthermore, if it is, we don't know if it's a type we created already, or a new type. Our only way of looking up this information is by knowing exactly what IJSONSerializable is, and we don't.

In order to support this, we'd need to change the architecture of type aliases so that all operations in the type checker know how to process the types created by them. It would involve actually creating a container every time we encounter a type alias. This would possibly create memory overhead, and would add another case to handle in every type operation in the checker. It is nontrivial, but not fundamentally undoable.

Forgot to clarify, it is not a result of failure to detect relations between types that are infinitely recursive.

Also, it's simple recursion in this case (classic mu type), not infinite/generative recursion.

Yes, you are correct about the difference between exhibit A and B. That is an oversight on my part. I have edited exhibit A to correctly reflect the potential recursive nature of JSON arrays.

I think the important thing here is that JSON is not some sort of edge case. Recursive types are fairly commonplace in programming, and it seems worth supporting.

Another example: Promises/A+ promises.

type ThenableLike<T> = T | ThenableLike<Thenable<T>>;

interface Thenable<T> {
  then(callback: (value: T) => ThenableLike<T>): Thenable<T>;
}

class Promise<T> implements Thenable<T> {
  static all<T>(thenables: ThenableLike<T>[]): Promise<T>;
  static race<T>(thenables: ThenableLike<T>[]): Promise<T>;
  static resolve<T>(thenable: ThenableLike<T>): Promise<T>;
  static reject<T>(thenable: ThenableLike<T> | Error): Promise<T>;

  then<U>(
    callback: (value: T) => ThenableLike<U>,
    error?: (err: Error) => ThenableLike<U>
  ): Promise<U>;

  catch<U>(error: (err: Error) => ThenableLike<U>): Promise<U>;
}

Or, the classic array flatten function:

type Nested<T> = T[] | Nested<T[]>;
function flatten<T>(list: Nested<T>): T[] {
  return (<T> []).concat(...list.map<T | T[]>((i: T | Nested<T>) =>
    Array.isArray(i) ? flatten(i) : i));
}

@opensrcken

I think the important thing here is that JSON is not some sort of edge case. Recursive types are fairly commonplace in programming, and it seems worth supporting.

:+1: Definitely not an edge case. That flatten function is in nearly every utility library out there.

Actually @opensrcken I don't think ThenableLike or Nested make sense the way they are defined. Those are infinitely expanding type aliases with no structure other than a union type, and those would degenerate. This problem does _not_ occur with your ideal definition of IJSONSerializable above. To see what I mean, let's take Nested<number> as an example. Let's now try to show that string is not assignable to Nested<number>. I'll go in steps:

  1. Is string assignable to number[]? No. Let's try Nested<number[]>
  2. Is string assignable to number[][]? No. Let's try Nested<number[][]>
  3. Is string assignable to number[][][]? No. Let's try Nested<number[][][]>
  4. Is string assignable to number[][][][]? No. Let's try Nested<number[][][][]>
  5. ...

Eventually, the type system decides that it's never going to get an answer. So it has no basis to give an error, and as a result, string is assignable to Nested<number>. In fact, everything is, and it's a useless type.

Now I'm not saying that recursive types are bad. Just this kind of recursive type is bad. It is much better if Nested is written like this:

type Nested<T> = T[] | Nested<T>[];

Now the type has structure because all constituents are arrays.

The same thing goes for ThenableLike. You'd need to define it like this to make it not degenerate:

type ThenableLike<T> = T | Thenable<ThenableLike<T>>;

Although in this case, perhaps what you want is not recursive at all. Thenable itself already seems to encapsulate the recursion. Why is it not

type ThenableLike<T> = T | Thenable<T>;

@JsonFreeman

  1. You mean me, @impinball? ;)
  2. I came up with those from more of a pure functional mindset (Haskell/OCaml/etc.), which has a type system in which types sometimes can become data. My mistake on that part.
  3. The above definition for Thenable itself doesn't fully encapsulate the possibilities for things such as this, an arbitrary amount of nesting:
Thenable<Thenable<Thenable<...<Thenable<T>>...>>>

Although, I just realized that this could potentially also be used to properly, and completely type Function.prototype.call and Function.prototype.bind. It may take a little bit of ugly hacking to pull it off, but it may be possible. I wouldn't hold my breath for it, though.

Yes, sorry @impinball. I got confused when you addressed @opensrcken.

I realize different type systems have different ways of understanding types and data, so they don't always translate perfectly. Specifically in TypeScript, there is a very clean separation of values and types. So an infinitely recursive type without structure just wouldn't work, even if we were to maximally support recursive types.

For the arbitrary recursion on Thenable, ideally you should be able to do that with type ThenableLike<T> = T | Thenable<ThenableLike<T>>; if we supported it. The type system could handle it just fine, it's just that we have to adjust the compiler architecture.

Picking up on the suggestion in #3988, here is how you would currently write a correct recursive JSON data type:

type JSONValue = string | number | boolean | JSONObject | JSONArray;

interface JSONObject {
    [x: string]: JSONValue;
}

interface JSONArray extends Array<JSONValue> { }

The trick is to make the recursive back references within interface types. This works because resolution of interface base types and interface members is deferred, whereas resolution of type aliases is performed eagerly. Ideally you'd be able to replace JSONArray with JSONValue[] in the definition of JSONValue, but certainly this is a reasonable workaround for now.

The only way I can think of to make this work is to introduce a new kind of type for type aliases, but that would involve lots of extra plumbing. Maybe @ahejlsberg has a better idea.

What this really is starting to sound like now is inductive type inference
as opposed to deductive. Most strongly typed languages with any sort of
type inference deduce types. That includes TypeScript. Haskell and OCaml
are the only two remotely mainstream languages I know of that do their
typing inductively. And as for Haskell, the reason why it can have infinite
types is because its nonstrict nature extends to its type system as well to
a large degree. Not to mention it's nearly Turing complete without the GHC
extensions.

The tricks being described here are taking advantage of the deductive
inference, using it to power the recursive types. Deducting types is hard
enough. Inductively finding them is much harder.

I've started to backtrack a little from my initial stance with this bug for
those reasons.

On Thu, Aug 6, 2015, 21:12 Jason Freeman [email protected] wrote:

The only way I can think of to make this work is to introduce a new kind
of type for type aliases, but that would involve lots of extra plumbing.
Maybe @ahejlsberg https://github.com/ahejlsberg has a better idea.

—
Reply to this email directly or view it on GitHub
https://github.com/Microsoft/TypeScript/issues/3496#issuecomment-128554030
.

I don't know what _inductive_ and _deductive_ mean. But I will say that the type system / compiler in TypeScript is generally lazy. That's why recursive type aliases are supported to some degree. The reason they are not supported in _all_ cases is that type aliases themselves are resolved eagerly. To make type aliases lazy, it would likely be necessary to have an actual type for type aliases, but every single type operation in the compiler would now have to be aware of a type alias type.

What I mean about inductive vs deductive is in the sense of inductive
reasoning vs deductive reasoning in the mathematical sense. Feel free to
correct me if I'm not using the correct terminology here (I don't have a
PhD in type systems).

Inductive:

-- In Haskell
-- Note that this would never be checked
-- eagerly
type Thenable a = a | Thenable a
type Promise a = (Thenable a) => a | Promise a

Deductive:

// In Typescript
type Infinite<T> = Array<T> | Array<Infinite<T>>;

Deductive typing is lazy at type definition (it doesn't explode over simple
recursion), but eager in applying the types. Inductive typing is lazy at
both type definition and type application. TypeScript eagerly applies
types. Haskell lazily applies types. That's the real difference I'm talking
about.

On Fri, Aug 7, 2015, 17:28 Jason Freeman [email protected] wrote:

I don't know what _inductive_ and _deductive_ mean. But I will say that
the type system / compiler in TypeScript is generally lazy. That's why
recursive type aliases are supported to some degree. The reason they are
not supported in _all_ cases is that type aliases themselves are resolved
eagerly. To make type aliases lazy, it would likely be necessary to have an
actual type for type aliases, but every single type operation in the
compiler would now have to be aware of a type alias type.

—
Reply to this email directly or view it on GitHub
https://github.com/Microsoft/TypeScript/issues/3496#issuecomment-128837473
.

I see. I think I also lack sufficient type theory knowledge to comment on induction versus deduction, though it is clear that there is a distinction between lazy and eager processing of types.

There is a section in the spec here:
https://github.com/Microsoft/TypeScript/blob/master/doc/spec.md#310-type-aliases
that defines the "directly depends on" relation. This makes explicit all the places where processing a type is eager. Everything not in this list is lazy.

For type arguments, it's a caching issue in the implementation. In other words, the reason the compiler recurses on Infinite (in your example) is so that it can check the instantiation cache to see if Array has already been instantiated that way. The most obvious way is to create a special type object for type aliases and resolve them lazily like we do for interfaces and classes. The catch with type aliases is that there would need to be some rules about what sorts of self references are _not_ allowed (union/intersection constituents, target type of a generic type reference, etc).

The only way to realistically fix this issue is to completely change how types are even resolved. As I said, it's inductive (in the sense of mathematical induction) vs deductive (in the traditional sense) type checking. Haskell's relies on mathematical induction. That's why this is possible:

-- Part of a Haskell JSON library. Makes use of a GHC extension.
-- Documentation: https://hackage.haskell.org/package/json-0.9.1/docs/Text-JSON.html
-- Source: https://hackage.haskell.org/package/json-0.9.1/docs/src/Text-JSON-Types.html

data JSValue
    = JSNull
    | JSBool     !Bool
    | JSRational Bool{-as Float?-} !Rational
    | JSString   JSString
    | JSArray    [JSValue]
    | JSObject   (JSObject JSValue)
    deriving (Show, Read, Eq, Ord, Typeable)

newtype JSString   = JSONString { fromJSString :: String }
    deriving (Eq, Ord, Show, Read, Typeable)

newtype JSObject e = JSONObject { fromJSObject :: [(String, e)] }
    deriving (Eq, Ord, Show, Read, Typeable )

I couldn't find a single JSON implementation that both enforced correctness fully at compile time and didn't have a type system that uses mathematical induction.

@jbondc All I can say is good luck. Another thing to beware of is that this wouldn't work completely when you're interacting with JS. :frowning:

(I kinda wish I could have dependent types myself...)

Although it might help in typing Function.prototype.bind/Function.prototype.call. And if it can include array tuples, it can also type Function.prototype.apply. :smiley:

The original issue is handled by https://github.com/Microsoft/TypeScript/issues/3496#issuecomment-128553540

it sure would be nice to be able to write type JSONValue = ... | JSONValue[] instead of using the extra interface in type JSONValue = ... | JSONArray, but since this scenario is unblocked i am inclined to close this issue. please let me know if you have other thoughts.

Don't mind submitting a PR for this. Will spend some time exploring incremental parsing before I revisit.

closing.

@isiahmeadows The TypeScript translation of the Haskell json data type that you've posted works just fine:

type JSValue = { kind: 'JSNull' }
             | { kind: 'JSBool',     value: boolean }
             | { kind: 'JSString',   value: string  }
             | { kind: 'JSRational', asFloat: boolean, value: number }
             | { kind: 'JSArray',    value: JSValue[] }
             | { kind: 'JSObject',   value: { [key: string]: JSValue } }

data constructors actually introduce indirection which saves Haskell. The direct Haskell translation of the OP's snippet would require using either type or newtype, but that's not possible.

@isiahmeadows That's called "equirecursive" and "isorecursive" approaches to infinite types. In first case we get "real" infinite types, and it's barely possible to do type inference there, even without any other type system extensions. In a big type system like TS there might not be even a way to typecheck it.

Most programming languages (including Haskell and O'Caml) prefer to use isorecursive approach. Operations on named types (constructing its instance or pattern-matching it) include operations of explicit type folding and unfolding, while types are represented in finite notation and don't equal each other even if they're isomorphic to each other.

Recursion on type aliases is essentially equirecursive feature, and it's most likely impossible to implement in TS at all. Someone could make a proof of this claim, but type system of TS is unsound anyway, so there's not much sense in making it.

but type system of TS is unsound anyway
@polkovnikov-ph is this a personal opinion? Can yo explain a bit?

@metasansana #9825 - It's a theoretical thing.

@metasansana There is even a gist for it: https://gist.github.com/t0yv0/4449351

The following code shouldn't compile, but it does

class A {}

class B extends A {
  foo(x: string) : string {
    return x + "!";
  };
}

function f1(k: (a: A) => void) : void {
  k(new A());
}

function f2(k: (a: B) => void) : void {
  f1(k);
}

f2(function (x: B) { console.log(x.foo("ABC")); });

This is one of the many bugs in type system of TS, and, unfortunately

100% soundness is not a design goal.

did you try compiling your code with --strictFunctions flag? i asm not at the computer, but it should break it (as you expect)

@aleksey-bykov I tried it in TS playground. It doesn't have such a flag. In no way this should be a "feature" disabled by default, let alone the fact it shouldn't even exist. "False positives" are intolerable in type systems.

the problem you are talking about doesnt exist anymore, typescript takes it slow progressing from loose to strict giving us a chance to tighten our code (originally written in js) one step at a time at a comfortable pace, this is the reason for the flag

playground might be lagging behind the latest version in master, but it doesnt stop anyone from using it in production

what else is wrong?

honestly there are very few impurities left in TS that make your code unsound, and TS design team doesnt hesitate rolling out breaking changes for the sake of brighter future, i am personally very happy with that, wish you the same

@aleksey-bykov You meant the --strictFunctionTypes flag right?

Fixed in #33050.

Was this page helpful?
0 / 5 - 0 ratings