Roslyn: Error Recovery Docs?

Created on 30 Dec 2016 · 4 Comments · Source: dotnet/roslyn

Is there any documentation on how the error recovery system works?
If so, could you signpost me to it.

Area-Compilers Question Resolution-Answered

AdamSpeight2008

👍2

Most helpful comment

> Principally I'm interested in how the parser recovers after it finds something it didn't expect.

First, realize that parsing is conceptually just a set of nested loops of contextual recognizers. That is, there is a top-level loop that tries to recognize top-level constructs (usings, namespaces, type declarations, etc.). When you parse one of those constructs, it may have a nested loop of its own, and so on. For example, a class-declaration parsing function has a loop that recognizes and parses class members, and a parameter-list parsing function has a loop that recognizes and parses parameters. So, at any point in time, you're inside some (possibly deeply) nested set of parsing loops.
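To make the "nested loops" picture concrete, here is a minimal sketch in Python. It is not Roslyn's code: the toy grammar (classes that contain only `void` method declarations), the class name `NestedLoopParser`, and the whitespace-separated token format are all assumptions made for illustration. It has no error recovery at all; it only shows how each parsing function owns a loop, and how those loops nest through ordinary function calls.

```python
class NestedLoopParser:
    """Toy recursive-descent parser (hypothetical grammar, no error recovery)."""

    def __init__(self, source):
        self.tokens = source.split()  # tokens are whitespace-separated
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def parse_file(self):
        classes = []
        while self.peek() == "class":            # top-level loop
            classes.append(self.parse_class())
        return classes

    def parse_class(self):
        self.take("class")
        name = self.take()
        self.take("{")
        members = []
        while self.peek() == "void":             # class-member loop
            members.append(self.parse_method())
        self.take("}")
        return ("class", name, members)

    def parse_method(self):
        self.take("void")
        name = self.take()
        self.take("(")
        params = []
        while self.peek() not in (")", None):    # parameter loop
            if params:
                self.take(",")
            params.append((self.take(), self.take()))  # (type, name)
        self.take(")")
        self.take(";")
        return ("method", name, params)


tree = NestedLoopParser("class C { void M ( int a , string b ) ; }").parse_file()
print(tree)
```

While the parameter loop is running, the class-member loop and the top-level loop are both still "open" further up the call stack. That stack of open loops is what the recovery logic described next consults.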

Second, we split "something it didn't expect" into two cases:

1. Something unexpected that might be the start of something valid in one of the parsing contexts we're currently in.
2. Something unexpected that is not.

For example, if we have:

```c#
class C
{
    void M()
    {
        if (a)
    class D
```

While attempting to parse the embedded statement of the `if` statement, the parser sees a token it does not expect (`class`). When this happens, it checks which 'loop contexts' it is currently in and asks whether any of them could handle `class`. In the example above, we actually have three nested parsing loops:

1. The outermost loop that's parsing the top-most file constructs (usings/namespaces/types/etc.)
2. The class-loop that is parsing class members.
3. The block-statement loop (inside 'M') that is parsing out statements.

In this case, the block-statement loop will say "I cannot handle `class`". We then look upwards and ask the class-loop if it can handle `class`. It will say "yes, I can". Because we know an outer loop can handle this token, we simply stop parsing in our inner loop and let parsing bubble back up (i.e. we literally pop our function call stack).
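The "ask each loop, innermost first" check can be sketched like this in Python. The loop names, the `CAN_START` table, and the `find_handler` function are hypothetical and the token sets are deliberately tiny; the real parser's notion of what each context can start is much richer.

```python
# Hypothetical table: which tokens can begin a construct in each loop context.
CAN_START = {
    "file": {"using", "namespace", "class"},
    "class_members": {"class", "void", "int"},
    "block_statements": {"if", "return", "{"},
}


def find_handler(token, active_loops):
    """Return the innermost active loop that can start a construct with
    `token`, or None if no active loop can.

    `active_loops` lists the loops currently on the call stack, outermost first.
    """
    for loop in reversed(active_loops):
        if token in CAN_START[loop]:
            return loop
    return None


# The three nested loops from the example above.
active = ["file", "class_members", "block_statements"]
print(find_handler("class", active))   # class_members
```

If the loop returned here is not the innermost one, the inner loops stop and return, so the call stack unwinds to the loop that claimed the token.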

Now, let's say we have something like:

```c#
class C
{
    void M()
    {
        if (a)
            , return;
```

In this case we perform the same sort of check as above, asking each of our containing loop contexts "can you handle `,`?". This time, however, none of the loops can handle it. So we say "OK, we don't know what to do with this token, so just skip it and continue with whatever loop we're in". In the case above, we're in a block, in a statement-parsing loop. We'll then see `return` and know how to handle it.

So, to sum up: when we encounter something we don't expect, we decide whether to stop what we're currently doing and let someone above us in the stack take over, or to just skip over the token. If we skip, we go back to our normal parsing routines, repeating the same sort of error check whenever we hit another token we can't handle, and so on and so forth.
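Here is a small end-to-end sketch of that skip-or-abort decision in Python. It is an illustration only, not Roslyn's implementation: the toy grammar, the `STARTS` table, the `RecoveringParser` class, and the choice to report a missing token without consuming anything are all assumptions of this sketch.

```python
# Hypothetical table of tokens that can begin a construct in each loop.
STARTS = {
    "file": {"class"},
    "members": {"class", "void"},
    "statements": {"if", "return"},
}


class RecoveringParser:
    def __init__(self, source):
        self.tokens = source.split()
        self.pos = 0
        self.active = []   # loops currently on the call stack, outermost first
        self.errors = []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, tok):
        # A missing token is reported but nothing is consumed.
        if self.peek() == tok:
            self.pos += 1
        else:
            self.errors.append(f"expected {tok!r} before {self.peek()!r}")

    def loop(self, name, closer, parse_item):
        """Run one parsing loop, applying the skip-or-abort decision."""
        self.active.append(name)
        items = []
        while self.peek() not in (closer, None):
            tok = self.peek()
            if tok in STARTS[name]:
                items.append(parse_item())
            elif any(tok in STARTS[outer] for outer in self.active[:-1]):
                break  # an enclosing loop can handle it: stop and unwind
            else:
                self.errors.append(f"skipped {tok!r}")
                self.pos += 1  # nobody can handle it: skip, keep looping
        self.active.pop()
        return items

    def parse_file(self):
        return self.loop("file", None, self.parse_class)

    def parse_class(self):
        self.expect("class")
        name = self.take()
        self.expect("{")
        members = self.loop("members", "}", self.parse_member)
        self.expect("}")
        return ("class", name, members)

    def parse_member(self):
        if self.peek() == "class":
            return self.parse_class()
        self.expect("void")
        name = self.take()
        self.expect("(")
        self.expect(")")
        self.expect("{")
        body = self.loop("statements", "}", self.parse_statement)
        self.expect("}")
        return ("method", name, body)

    def parse_statement(self):
        if self.peek() == "return":
            self.take()
            self.expect(";")
            return ("return",)
        self.expect("if")
        self.expect("(")
        cond = self.take()
        self.expect(")")
        if self.peek() in STARTS["statements"]:
            return ("if", cond, self.parse_statement())
        self.errors.append(f"expected statement before {self.peek()!r}")
        return ("if", cond, None)


p = RecoveringParser("class C { void M ( ) { if ( a ) , return ; } }")
print(p.parse_file())
print(p.errors)
```

On the `, return;` input, the statement loop skips the `,` and then parses `return` normally. On an input like `class C { void M ( ) { if ( a ) class D { }`, the statement loop stops at `class` because an enclosing loop can start a construct with it, and `D` ends up as a nested class of `C`. In both cases the parser keeps going and reports more than one error.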

CyrusNajmabadi on 30 Dec 2016

❤5 👍5

All 4 comments

Which error recovery system are you referring to?

CyrusNajmabadi on 30 Dec 2016

@CyrusNajmabadi I didn't know there were multiple error recovery systems.
Hence the need for documentation.
Principally I'm interested in how the parser recovers after it finds something it didn't expect.
The IDE doesn't report only the first error, so it must be able to resync further on in the grammar.

AdamSpeight2008 on 30 Dec 2016

Sounds like a great wiki article.