Typescript: Tolerate invalid identifier syntax errors in list parsing

Created on 29 Aug 2020  路  6Comments  路  Source: microsoft/TypeScript

Search Terms

parser variable name list identifiers

Suggestion

Instead of aborting parsing a list when an invalid identifier is reached, have the parser emit a syntax error but otherwise parsing. In other words: it should successfully parse lists such as variable declarations as if the identifiers were valid, with the one difference of emitting a syntax error that doesn't affect parsing.

Use Cases

Issues such as #11648 and #19352 pop up from situations where the parser attempts to read in a list, such as one of variable declarations, and cannot because an identifier is invalid.

const case = 123;
//    ~~~~ 'case' is not allowed as a variable declaration name.(1389)
//         ~ Variable declaration expected.(1134)
//           ~~~ Variable declaration expected.(1134)

Although #40105 improved the first error message (previously it was also Variable declaration expected.), the underlying issue remains: the parser aborts reading in the list, resulting in an odd error message and surprising emitted JavaScript.

const ;
123;

Examples

Collecting a few examples from the linked issues...

const data = { in: 1 };
const { in } = data;

for (const case of [1, 2, 3]) console.log(case);

const case = 123;

let delete = true;

Checklist

My suggestion meets these guidelines:

  • [x] This wouldn't be a breaking change in existing TypeScript/JavaScript code
  • [x] This wouldn't change the runtime behavior of existing JavaScript code
  • [x] This could be implemented without emitting different JS based on the types of the expressions
  • [x] This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, etc.)
  • [x] This feature would agree with the rest of TypeScript's Design Goals.
Experience Enhancement Suggestion

Most helpful comment

it depends really on context. If, for example, you saw function foo() there, it would likely not be a good idea to reinterpret function as an identifier. Keywords are actually super useful for compilers because the majority case is that they are actually keywords, and thus can be used to resync without needing any special parser logic about if that's the right thing to do.

Of course, as you've identified that's not always the best thing. However, rather than unilaterally reinterpretting as an identifier (Which will likely degrade a bunch of error correction), i might conditionally reinterpret based on what's around the keyword. in other words, reinterpret if it seems highly likely that it's an identifier, and that you would have even worse issues treating it as a keyword.

All 6 comments

Maybe it's not obvious to me - what's a case that you have in mind outside of the scope of #11648?

@DanielRosenwasser in my mind, that issue is purely about error messaging, as was #19352, while this one is about changing the parser to emit friendlier code in the error situations -- i.e. fixing the root cause instead of applying another bandaid.

Is that how you're seeing these? As long as there's an issue somewhere about the parser changes, I'm happy!

I would consider this to be kind of meta issue and that the 2 issues that you linked to roughly correspond to 2 specific instances that would need to be fixed to consider the effort a success. I don't think there's any sort of higher-level architectural thing that we need to do in these specific scenarios.

However, one could imagine something unrelated to "list parsing" which is that the parser could reinterpret all unexpected keywords as Identifier nodes after issuing an error message on that Node.

For example, in the following example, you'll get two errors:

let x = export + 100;
//      ~~~~~~ Expression expected.
//             ~ Variable declaration expected.

Whereas we could instead have provided an error like

let x = export + 100;
//      ~~~~~~ Unexpected 'export' keyword.

Now, reinterpreting every unexpected keyword as an Identifier actually does sound appealing, but I don't know what sort of impact that would have on the rest of the compiler. I can imagine that in the language service, we rely on specific keywords to guide us on what sorts of constructs we might encounter, and we do some other error recovery on specific keywords being present, so there'd be quite a bit of churn there.

Maybe @CyrusNajmabadi has some specific thoughts on this one.

it depends really on context. If, for example, you saw function foo() there, it would likely not be a good idea to reinterpret function as an identifier. Keywords are actually super useful for compilers because the majority case is that they are actually keywords, and thus can be used to resync without needing any special parser logic about if that's the right thing to do.

Of course, as you've identified that's not always the best thing. However, rather than unilaterally reinterpretting as an identifier (Which will likely degrade a bunch of error correction), i might conditionally reinterpret based on what's around the keyword. in other words, reinterpret if it seems highly likely that it's an identifier, and that you would have even worse issues treating it as a keyword.

Yup, I think that's what I had in mind as well. I think we'd entertain a PR that did that, but I couldn't guarantee the scope of the work that it entails, or whether we'd be comfortable with the change if it seems way too invasive.

Was this page helpful?
0 / 5 - 0 ratings