TypeScript: Leading |/& is not included in the intersection/union node

Created on 17 Apr 2019  ·  12 Comments  ·  Source: microsoft/TypeScript


TypeScript Version: 3.4.1


Search Terms: Union Intersection Leading Character Parse AST

Code

type Foo =
    | A
    | B;

https://astexplorer.net/#/gist/52725a31dc27bf4bdf88843ebd3d07f8/eccbc97edfc8d520f8c08aff0f2024053eeab1b7

(note this behaviour is exactly the same for IntersectionTypes)

Expected behavior:
The UnionType node should include the leading |.

Actual behavior:
The leading | is not included in the node.
[Image: the underlined node is the one currently hovered, which highlights it in the editor.]

This creates some weirdness: the leading character isn't part of the Union/Intersection node, and is instead technically part of whatever node is its parent.

Related Issues:
https://github.com/typescript-eslint/typescript-eslint/issues/435

Bug good first issue help wanted

All 12 comments

That definitely seems wrong.

Does this mean the following code is considered as an intersection/union type?

type A = | 1
type B = & 2

@mysticatea No. With a single type it'll be a type alias. See here: https://astexplorer.net/#/gist/8c2a38fd42b7326872dfcde0ec74cb5f/fb65f9cee71509112894c26567d5f74ae377ba3e

In that case, I'm not sure if this is a bug.

I don't think the LiteralType node's range should contain the leading operator, so it's consistent that the intersection/union type node's range doesn't contain it either.

cc: @ahejlsberg

IMO it should parse as a union/intersection type with one constituent. That's what happens with a trailing | or &.

@ajafff A trailing | or & is a syntax error: https://www.typescriptlang.org/play/index.html#src=type%20A%20%3D%20%7C%201%0D%0Atype%20B%20%3D%20%26%202%0D%0Atype%20C%20%3D%201%20%7C%3B%0D%0Atype%20D%20%3D%201%20%26%0D%0A

The problem with ignoring it is that you now have this free token in the code which doesn't belong to any node at all, which creates an invalid state in the produced AST.

Not that I want to prescribe a solution (I'm not a language designer), but from the point of view of someone looking for consistency, I'd probably want type x = | T; to be a semantic error. That is, prefixing a type with the | or & operator would not be semantically valid unless there is more than one type (i.e. it is actually a union/intersection).

The problem that I see with the current semantics is that this is 100% valid TypeScript:

type A = | ( | ( | ( | ( | 1) ) ) );

Which of course means that each of the ParenthesizedType nodes "owns" one |, and the TypeAliasDeclaration owns one | as well.

type A = | ( | ( | ( | ( | 1) ) ) );

🤣

I think we have three paths to solve this problem.

  1. Leave the leading operator as a floating token, as it is today, because it has no semantic effect. This is similar to how AST nodes don't contain redundant parentheses. For example, in var a = ((((1))));, the range of the Literal node for 1 doesn't contain the parentheses. That isn't surprising.
  2. The intersection/union node contains the leading operator, and the leading operator creates an intersection/union node even if the type that follows is not an intersection/union type. My first question came from this approach. For example, type A = | 1 produces a UnionType node that contains a LiteralType node. But considering @bradzacher's example, this approach can produce deeply nested intersection/union types.
  3. The intersection/union node contains the leading operator, and the leading operator is a syntax error if the type that follows is not an intersection/union type. This looks like the most reasonable approach, but it's a breaking change.
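
To make the three paths concrete, here is a minimal sketch of a toy parser. It is not the TypeScript compiler's parser or AST: the Mode names ("floating", "wrap", "error"), the node shapes, and parseUnion are all hypothetical, and it only handles bare identifiers separated by |. It just shows how each path changes the node kind and the range that gets reported.

```typescript
// Toy model only: these types are NOT the TypeScript compiler's real AST.
type Mode = "floating" | "wrap" | "error"; // paths 1, 2 and 3 from the list

interface RefNode { kind: "Ref"; name: string; pos: number; end: number }
interface UnionNode { kind: "Union"; types: RefNode[]; pos: number; end: number }
type TypeNode = RefNode | UnionNode;

// Parses text such as "| A | B"; anything after the last constituent is ignored.
function parseUnion(text: string, mode: Mode): TypeNode {
  let i = 0;
  const skipSpace = (): void => {
    while (i < text.length && /\s/.test(text.charAt(i))) i++;
  };
  const parseRef = (): RefNode => {
    skipSpace();
    const pos = i;
    while (/[A-Za-z0-9_]/.test(text.charAt(i))) i++;
    if (i === pos) throw new SyntaxError(`identifier expected at ${pos}`);
    return { kind: "Ref", name: text.slice(pos, i), pos, end: i };
  };

  skipSpace();
  const leadingPos = i;
  const hasLeading = text.charAt(i) === "|";
  if (hasLeading) i++;

  const first = parseRef();
  const types = [first];
  let last = first;
  skipSpace();
  while (text.charAt(i) === "|") {
    i++;
    last = parseRef();
    types.push(last);
    skipSpace();
  }

  if (types.length === 1) {
    if (!hasLeading || mode === "floating") return first; // path 1: '|' belongs to no node
    if (mode === "error") {
      throw new SyntaxError("leading '|' needs at least two constituents"); // path 3
    }
    return { kind: "Union", types, pos: leadingPos, end: last.end }; // path 2
  }
  // Real union: only path 1 leaves the leading '|' outside the node's range.
  const pos = hasLeading && mode !== "floating" ? leadingPos : first.pos;
  return { kind: "Union", types, pos, end: last.end };
}
```

With the input "| A | B", the "floating" mode reports a union that starts at A, while the other two modes report one that starts at the leading |. With "| X", the modes diverge completely: a bare reference, a one-element union, or a thrown SyntaxError.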

Totally new to TypeScript, but: What is the semantic value of nesting types in a union or intersection in the first place? Are dynamic expressions allowed here, such that nesting expressions might be a valid use case?

@platinumazure - if I understand your question, you're talking about the use of parentheses?

They let you clarify types when things become ambiguous. For example:

type UnionIntersection = A | B & C;
type UnionIntersectionClear = (A | B) & C;
type UnionIntersectionClear2 = A | (B & C);

type ArrowFn = () => string | number;
type ArrowFnClear = (() => string) | number;
type ArrowFnClear2 = () => (string | number);
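
As a small sketch of which grouping the unparenthesized forms pick: & binds tighter than |, and a function type's return type extends as far to the right as it can. The object types A, B, C and the variable names below are made up for illustration.

```typescript
// Placeholder types, chosen only so that the assignments below type-check.
type A = { a: string };
type B = { b: string };
type C = { c: string };

type UnionIntersection = A | B & C; // parsed as A | (B & C)
type Grouped = (A | B) & C;

// A alone satisfies A | (B & C)...
const onlyA: UnionIntersection = { a: "x" };
// ...but (A | B) & C always requires C's property as well.
const grouped: Grouped = { a: "x", c: "y" };

type ArrowFn = () => string | number; // parsed as () => (string | number)
type FnOrNumber = (() => string) | number;

// A function returning a number is a valid ArrowFn.
const fn: ArrowFn = () => 42;
// A plain number is a valid FnOrNumber, but would not be a valid ArrowFn.
const notAFunction: FnOrNumber = 7;

console.log(onlyA, grouped, fn(), notAFunction);
```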

You can get pretty funky with your nesting if you'd like; the syntax is expressive enough that you can add in any combination of types. Types can reference other types, and you can look up the type of a property by name if the type has one (type T = { prop: string }; type TProp = T['prop'];).

Thanks @bradzacher, I appreciate the primer. That answers my question.

I wonder if we should add something like prefixToken to UnionTypeNode and IntersectionTypeNode as a place to put this token, and possibly issue a syntax error on a leading | or & if we don't end up parsing a union or intersection:

// ok
type A = 
  | X
  | Y;
// not ok 
type B =
  | X;

Alternatively, we could always create a UnionTypeNode or IntersectionTypeNode with a single constituent when we encounter a leading & or | token, so that we have a place to put the token.
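
A rough sketch of what the prefixToken idea could look like as a data shape. These interfaces are hypothetical stand-ins, not the compiler's actual UnionTypeNode or IntersectionTypeNode declarations, and nodeStart is an invented helper; the point is only that the token gets an owner and the node's start can account for it.

```typescript
// Hypothetical shapes: `prefixToken` is the proposal from the comment above,
// not a property confirmed to exist on the real compiler nodes.
interface Token { kind: "|" | "&"; pos: number; end: number }
interface Constituent { name: string; pos: number; end: number }
interface UnionLike {
  kind: "UnionType" | "IntersectionType";
  prefixToken?: Token; // present only when the source had a leading | or &
  types: Constituent[];
}

// Start of the node: the prefix token if there is one, else the first constituent.
function nodeStart(node: UnionLike): number | undefined {
  return node.prefixToken?.pos ?? node.types[0]?.pos;
}

// Hand-built node for the text "| X | Y" (offsets counted from 0).
const withPrefix: UnionLike = {
  kind: "UnionType",
  prefixToken: { kind: "|", pos: 0, end: 1 },
  types: [
    { name: "X", pos: 2, end: 3 },
    { name: "Y", pos: 6, end: 7 },
  ],
};

// The same node as the parser behaves in this issue: the leading | has no owner.
const withoutPrefix: UnionLike = { kind: "UnionType", types: withPrefix.types };

console.log(nodeStart(withPrefix), nodeStart(withoutPrefix));
```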
