Rubberduck: RD does not correctly recognize unorthodox Rem comments, sometimes leading to parser errors

Created on 17 Nov 2016  路  12Comments  路  Source: rubberduck-vba/Rubberduck

You can start a Rem comment with Rem?, which does not seem to be recognized correctly.
For example, in Rem?Debug.Print "bla" code inspections recognize Rem?Debug as a variable and Rem?Debug.Print "bla leads to a parser error. Both are recognized as comments by the VBE.

I am not sure which other terminators for the Rem token are affected.

antlr bug status-declined up-for-grabs

Most helpful comment

I did some experiments and I think I know now what is causing this issue: although it is not documented, none of the characters above is allowed in a VBA identifier, which also applies to '[' and ']'. (At least the VBE reacts with a syntax error.) The problematic characters found further above, which are possible as the start of a Rem comment, are currently allowed in the lexer rule for identifiers.
So, adding them to the exclusion rules in the identifier rule and rebuilding the parser should solve the problem.

All 12 comments

Linking #1057

This could be a tricky one, and I'd deem it a low-priority issue - but it's an issue nonetheless, thanks for reporting it!

All of these are problematic:

Rem?
Rem>
Rem<
Rem{
Rem}

Part of me still thinks this is a bug in VBA's own parser.

Just for completeness, the HasComment string extension in Rubberduck.Parsing.VBA is even less reliable in recognizing Rem comments: it only recognizes Rem followed by a whitespace.

@MDoerner good catch. In v1.4.3 we used it to parse comments, but since 2.x (not sure which pre-release exactly) the comments (and annotations) are parsed by the ANTLR-generated parser; that extension method is used in two tests and 4 places where this bug can potentially bite us in the rear end:

  • IgnoreOnceQuickFix.Fix()
  • ObsoleteCallStatement.GetInspectionResults()
  • RemoveCommentQuickFix.Fix()
  • ReplaceObsoleteCommentMarkerQuickFix.Fix()

Tweaking the grammar to accept those will be hard if at all possible. In the meantime HasComment now accounts for all these:

public static class StringExtensions
{
    // see issues #1057 and #2364.
    private static readonly IList<string> ValidRemCommentMarkers =
        new List<string>
        {
            Tokens.Rem + ' ',
            Tokens.Rem + '?',
            Tokens.Rem + '<',
            Tokens.Rem + '>',
            Tokens.Rem + '{',
            Tokens.Rem + '}',
            Tokens.Rem + '~',
            Tokens.Rem + '`',
            Tokens.Rem + '!',
            Tokens.Rem + '/',
            Tokens.Rem + '*',
            Tokens.Rem + '(',
            Tokens.Rem + ')',
            Tokens.Rem + '-',
            Tokens.Rem + '=',
            Tokens.Rem + '+',
            Tokens.Rem + '\\',
            Tokens.Rem + '|',
            Tokens.Rem + ';',
            Tokens.Rem + ':',
            Tokens.Rem + '\'',
            Tokens.Rem + '"',
            Tokens.Rem + ',',
            Tokens.Rem + '.',
        };

I did some experiments and I think I know now what is causing this issue: although it is not documented, none of the characters above is allowed in a VBA identifier, which also applies to '[' and ']'. (At least the VBE reacts with a syntax error.) The problematic characters found further above, which are possible as the start of a Rem comment, are currently allowed in the lexer rule for identifiers.
So, adding them to the exclusion rules in the identifier rule and rebuilding the parser should solve the problem.

@MDoerner that's definitely worth giving it a shot! Thanks!

Is that still an issue? I think I might have fixed it and forgot to reference the issue...

This issue still exists: Rem?bug is recognized as one variable and Rem?bug " causes a parser error.

That's a VBE bug, and an edge case. tagging [declined], but keeping [up-for-grabs] if anyone wants to take a stab at it.

Was this page helpful?
0 / 5 - 0 ratings