Linguist: Outstanding Grammar Issues

Created on 30 Nov 2017 · 17 Comments · Source: github/linguist

The following is a detailed list of all outstanding issues in the grammars that GitHub.com uses to syntax-highlight code on our website.

These issues are detected by our grammar compiler (https://github.com/github/linguist/pull/3915) and are probably causing minor rendering bugs on the website.

Help is very much welcome! If you're seeing bugs or rendering issues in your source code on GitHub, please start by checking this list to see whether we're detecting any issues in your language's grammar.

Feel free to ask any questions about a given issue and the appropriate way to fix it. I'll keep the issue up-to-date as I work through grammar fixes myself.

cc @github/linguist @pchaigno @Alhadis


  • [ ] repository https://bitbucket.org/Clams/sublimesystemverilog/get/default.tar.gz (1 errors)

    • [ ] Unknown keys in grammar: source.systemverilog (in Clams-sublimesystemverilog-8c37a53ac265/SystemVerilog.tmLanguage) contains invalid keys (hidden)
  • [ ] repository vendor/grammars/Lean.tmbundle (from https://github.com/leanprover/Lean.tmbundle) (1 errors)

    • [ ] Unknown keys in grammar: source.lean (in Syntaxes/Lean.tmLanguage) contains invalid keys (Patterns[3].pattern)
  • [x] repository vendor/grammars/MagicPython (from https://github.com/MagicStack/MagicPython) (1 errors)

    • [x] Unknown keys in grammar: source.python (in grammars/MagicPython.cson) contains invalid keys (first_line_match)
  • [ ] repository vendor/grammars/NimLime (from https://github.com/Varriount/NimLime) (2 errors)

    • [ ] Missing include in grammar: source.nim (in Syntaxes/Nim.YAML-tmLanguage) attempts to include source.asm but the scope cannot be found
    • [ ] Missing include in grammar: source.nim (in Syntaxes/Nim.YAML-tmLanguage) attempts to include text.html.markdown.multimarkdown but the scope cannot be found
  • [ ] repository vendor/grammars/Scalate.tmbundle (from https://github.com/scalate/Scalate.tmbundle) (1 errors)

    • [ ] Missing include in grammar: source.scaml (in Syntaxes/Scaml.tmLanguage) attempts to include source.js.jquery but the scope cannot be found
  • [ ] repository vendor/grammars/atom-language-perl6 (from https://github.com/perl6/atom-language-perl6) (6 errors)

    • [ ] Invalid regex in grammar: source.perl6fe (in grammars/perl6fe.cson) contains a malformed regex (regex "(?x) ( [\p{Digit}\p{Alpha}'\-_]+...": unknown property name after \P or \p (at offset 16))
    • [ ] Invalid regex in grammar: source.perl6fe (in grammars/perl6fe.cson) contains a malformed regex (regex "[\p{Digit}\p{Alpha}'\-_]+": unknown property name after \P or \p (at offset 9))
    • [ ] Invalid regex in grammar: source.perl6fe (in grammars/perl6fe.cson) contains a malformed regex (regex "(?x)(?<!\\)(\$|@|%|&)(?!\$)(...": unknown property name after \P or \p (at offset 75))
    • [ ] Invalid regex in grammar: source.perl6fe (in grammars/perl6fe.cson) contains a malformed regex (regex "(": missing ) (at offset 1))
    • [ ] Invalid regex in grammar: source.perl6fe (in grammars/perl6fe.cson) contains a malformed regex (regex ")": unmatched parentheses (at offset 0))
    • [ ] Invalid regex in grammar: source.perl6fe (in grammars/perl6fe.cson) contains a malformed regex (regex "(?x)(\$|@|%|&)(\.|\*|:|!|\^|~|...": unknown property name after \P or \p (at offset 57))
  • [ ] repository vendor/grammars/dartlang (from https://github.com/dart-atom/dartlang) (1 errors)

    • [ ] Missing include in grammar: source.yaml-ext (in grammars/yaml-ext.cson) attempts to include source.ruby.rails but the scope cannot be found
  • [ ] repository vendor/grammars/language-crystal (from https://github.com/atom-crystal/language-crystal) (1 errors)

    • [ ] Missing include in grammar: source.crystal (in grammars/crystal.cson) attempts to include source.js.jquery but the scope cannot be found
  • [ ] repository vendor/grammars/language-haml (from https://github.com/ezekg/language-haml) (1 errors)

    • [ ] Missing include in grammar: text.haml (in grammars/ruby haml.cson) attempts to include source.ruby.rails but the scope cannot be found
  • [ ] repository vendor/grammars/language-ruby (from https://github.com/atom/language-ruby) (1 errors)

    • [ ] Missing include in grammar: source.ruby (in grammars/ruby.cson) attempts to include source.js.jquery but the scope cannot be found
  • [ ] repository vendor/grammars/language-shellscript (from https://github.com/atom/language-shellscript) (1 errors)

    • [ ] Missing include in grammar: source.shell (in grammars/shell-unix-bash.cson) attempts to include text.html.textile but the scope cannot be found
  • [ ] repository vendor/grammars/language-yaml (from https://github.com/atom/language-yaml) (1 errors)

    • [ ] Missing include in grammar: source.yaml (in grammars/yaml.cson) attempts to include source.ruby.rails but the scope cannot be found
  • [ ] repository vendor/grammars/liquid.tmbundle (from https://github.com/bastilian/validcode-textmate-bundles) (1 errors)

    • [ ] Missing include in grammar: text.html.liquid (in Liquid.tmbundle/Syntaxes/HTML Liquid.plist) attempts to include source.ruby.rails but the scope cannot be found
  • [ ] repository vendor/grammars/mako-tmbundle (from https://github.com/marconi/mako-tmbundle) (1 errors)

    • [ ] Missing include in grammar: text.html.mako (in Syntaxes/HTML (Mako).tmLanguage) attempts to include comment.block but the scope cannot be found
  • [ ] repository vendor/grammars/mathematica-tmbundle (from https://github.com/shadanan/mathematica-tmbundle) (2 errors)

    • [x] Missing include in grammar: source.mathematica (in Syntaxes/Mathematica.tmLanguage) attempts to include meta.scope.any.mathematica but the scope cannot be found
    • [ ] Invalid regex in grammar: source.mathematica (in Syntaxes/Mathematica.tmLanguage) contains a malformed regex (regex "(\b|(?<=_))(Abort|AbortKernels|A...": definition too long (54020 bytes))
  • [ ] repository vendor/grammars/ruby-slim.tmbundle (from https://github.com/slim-template/ruby-slim.tmbundle) (1 errors)

    • [ ] Missing include in grammar: text.slim (in Syntaxes/Ruby Slim.YAML-tmLanguage) attempts to include source.ruby.rails but the scope cannot be found
  • [ ] repository vendor/grammars/turtle.tmbundle (from https://github.com/peta/turtle.tmbundle) (5 errors)

    • [ ] Invalid regex in grammar: source.turtle (in Syntaxes/Turtle.tmLanguage) contains a malformed regex (regex "(?x) (?<PNAME_NS> (?: (?: [\...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 68))
    • [ ] Invalid regex in grammar: source.turtle (in Syntaxes/Turtle.tmLanguage) contains a malformed regex (regex "(?x)( (?: [\p{L}\p{M}] | [:0...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 121))
    • [ ] Invalid regex in grammar: source.turtle (in Syntaxes/Turtle.tmLanguage) contains a malformed regex (regex "(?x) (?<PN_CHARS_U>[\p{L}\p{M...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 73))
    • [ ] Invalid regex in grammar: source.turtle (in Syntaxes/Turtle.tmLanguage) contains a malformed regex (regex "\[[\u20\u9\uD\uA]*\]": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 4))
    • [ ] Invalid regex in grammar: source.turtle (in Syntaxes/Turtle.tmLanguage) contains a malformed regex (regex "(?x)((?<=\s|^|_)(?:[\p{L}\p{M}]...": PCRE does not support \L, \l, \N{name}, \U, or \u (at offset 57))

  • [x] repository vendor/grammars/Sublime-Lasso (from https://github.com/bfad/Sublime-Lasso) (1 errors)

    • [x] Missing include in grammar: file.lasso (in Syntaxes/Lasso.tmLanguage) attempts to include source.smarty but the scope cannot be found
  • [x] repository vendor/grammars/TypeScript-TmLanguage (from https://github.com/Microsoft/TypeScript-TmLanguage) (1 errors)

    • [x] Unknown keys in grammar: source.ts (in TypeScript.YAML-tmLanguage) contains invalid keys (variables)
  • [x] repository vendor/grammars/c.tmbundle (from https://github.com/textmate/c.tmbundle) (1 errors)

    • [x] Unknown keys in grammar: source.c.platform (in Syntaxes/Platform.tmLanguage) contains invalid keys (hideFromUser)
  • [x] repository vendor/grammars/gap-tmbundle (from https://github.com/dhowden/gap-tmbundle) (1 errors)

    • [x] Invalid regex in grammar: source.gap (in Syntaxes/GAP.tmLanguage) contains a malformed regex (regex "\b(16Bits_AssocWord|16Bits_Depth...": definition too long (173512 bytes))
  • [x] repository vendor/grammars/html.tmbundle (from https://github.com/textmate/html.tmbundle) (1 errors) [Removed in #4274]

    • [x] Missing include in grammar: text.html.basic (in Syntaxes/HTML.plist) attempts to include source.smarty but the scope cannot be found
  • [x] repository vendor/grammars/idl.tmbundle (from https://github.com/mgalloy/idl.tmbundle) (1 errors)

    • [x] Unknown keys in grammar: source.idl (in Syntaxes/IDL.tmLanguage) contains invalid keys (isDisabled)
  • [x] repository vendor/grammars/language-babel (from https://github.com/github-linguist/language-babel) (14 errors)

    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(?:^|;)\s*+(\bstatic\b)?\s*+(\ba...": subpattern name expected (at offset 422))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(?<!:)\s*+(\bstatic\b)?\s*+(\bas...": subpattern name expected (at offset 78))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "\s*+(\basync\b)?\s*+(?=(<(?:(?>[...": subpattern name expected (at offset 240))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "\s*+(\b[_$a-zA-Z][$\w]*)\s*+(=)\...": subpattern name expected (at offset 271))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "\s*+(\b[A-Z][$\w]*)?(\.)(prototy...": subpattern name expected (at offset 304))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "\s*+(\b_?[A-Z][$\w]*)?(\.)([_$a-...": subpattern name expected (at offset 291))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(^\s*+(?=([$\w]*\s*+\??\s*+(:|=(...": nothing to repeat (at offset 73))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(?<=^|{|,)\s*+(('|\")([^"']*)(\k...": subpattern name expected (at offset 33))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(?<=^|{|,)\s*+(\b[_$a-zA-Z][$\w]...": subpattern name expected (at offset 281))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(?<=^|{|,)\s*+(\b[_$a-zA-Z][$\w]...": subpattern name expected (at offset 271))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(?<=^|{|,)\s*+(('|\")([^"']*)(\k...": subpattern name expected (at offset 33))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(?<=^|{|,)\s*+(('|\")([^"']*)(\k...": subpattern name expected (at offset 33))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "(^|:|;|=|(?<=:|;|=))\s*+(\((?=((...": subpattern name expected (at offset 52))
    • [x] Invalid regex in grammar: source.js.jsx (in grammars/Babel Language.json) contains a malformed regex (regex "\s*+((("|').*?(?<=[^\\])\k<-1>)|...": subpattern name expected (at offset 27))
  • [x] repository vendor/grammars/language-emacs-lisp (from https://github.com/Alhadis/language-emacs-lisp) (2 errors)

    • [x] Invalid regex in grammar: source.emacs.lisp (in grammars/emacs-lisp.cson) contains a malformed regex (regex "(?x)(?<=[()]|^) (abbrev-all-cap...": definition too long (46034 bytes))
    • [x] Invalid regex in grammar: source.emacs.lisp (in grammars/emacs-lisp.cson) contains a malformed regex (regex "(?x)(?<=[()]|^)(?: \*table--ce...": definition too long (690564 bytes))
  • [x] repository vendor/grammars/language-jison (from https://github.com/cdibbs/language-jison) (1 errors)

    • [x] Unknown keys in grammar: source.jisonlex-injection (in grammars/jisonlex-injection.cson) contains invalid keys (injectionSelector)
  • [x] repository vendor/grammars/language-maxscript (from https://github.com/Alhadis/language-maxscript) (1 errors)

    • [x] Invalid regex in grammar: source.maxscript (in grammars/maxscript.cson) contains a malformed regex (regex "(?i)\b(3D_Studio|3D_Studio_Shape...": definition too long (55719 bytes))
  • [x] repository vendor/grammars/language-pcb (from https://github.com/Alhadis/language-pcb) (1 errors)

    • [x] Invalid regex in grammar: source.pcb.board (in grammars/pcb.board.cson) contains a malformed regex (regex "^\s*(\Po\s+.+\s+)([~F][~P])\s*$": unknown property name after \P or \p (at offset 7))
  • [x] repository vendor/grammars/language-roff (from https://github.com/Alhadis/language-roff) (1 errors)

    • [x] Missing include in grammar: text.roff (in grammars/roff.cson) attempts to include source.AtLilyPond but the scope cannot be found
  • [x] repository vendor/grammars/language-typelanguage (from https://github.com/goodmind/language-typelanguage) (1 errors)

    • [x] Unknown keys in grammar: source.tl (in grammars/typelanguage.tmLanguage.json) contains invalid keys ($schema)
  • [x] repository vendor/grammars/objective-c.tmbundle (from https://github.com/textmate/objective-c.tmbundle) (1 errors)

    • [x] Unknown keys in grammar: source.objc.platform (in Syntaxes/Platform.tmLanguage) contains invalid keys (hideFromUser)
  • [x] repository vendor/grammars/oz-tmbundle (from https://github.com/eregon/oz-tmbundle) (1 errors)

    • [x] Grammar conversion failed. File Originals/Oz.tmLanguage failed to parse: XML syntax error on line 23: expected element name after <
  • [x] repository vendor/grammars/pike-textmate (from https://github.com/hww3/pike-textmate) (1 errors)

    • [x] Unknown keys in grammar: source.pike (in Pike.tmbundle/Syntaxes/Pike.plist) contains invalid keys (Patterns[5].swallow, Patterns[6].swallow, foregroundColor, backgroundColor, increaseIndentPattern)
  • [x] repository vendor/grammars/rascal-syntax-highlighting (from https://github.com/usethesource/rascal-syntax-highlighting) (1 errors)

    • [x] Invalid regex in grammar: source.rascal (in atom/language-rascal/grammars/rascal.cson) contains a malformed regex (regex "/(?!/|*)": nothing to repeat (at offset 6))
  • [x] repository vendor/grammars/sublime-MuPAD (from https://github.com/ccreutzig/sublime-MuPAD) (1 errors)

    • [x] Unknown keys in grammar: source.mupad (in MuPAD.tmLanguage) contains invalid keys (smartTypingPairs, highlightPairs)
  • [x] repository vendor/grammars/sublime-autoit (from https://github.com/AutoIt/SublimeAutoItScript) (1 errors)

    • [x] Invalid regex in grammar: source.autoit (in AutoIt.tmLanguage) contains a malformed regex (regex "\b(?i:_array1dtohistogram|_array...": definition too long (79183 bytes))
  • [x] repository vendor/grammars/sublime-glsl (from https://github.com/euler0/sublime-glsl) (1 errors)

    • [x] Unknown keys in grammar: source.glsl (in GLSL.tmLanguage) contains invalid keys (Patterns[3].beginCapture)
  • [x] repository vendor/grammars/sublime-mask (from https://github.com/tenbits/sublime-mask) (1 errors)

    • [x] Missing include in grammar: source.mask (in Syntaxes/mask.tmLanguage) attempts to include js-expression but the scope cannot be found
  • [x] repository vendor/grammars/sublime-netlinx (from https://github.com/amclain/sublime-netlinx) (1 errors)

    • [x] Missing include in grammar: source.netlinx.erb (in NetLinx.erb.tmLanguage) attempts to include meta.define.event but the scope cannot be found
  • [x] repository vendor/grammars/sublime-spintools (from https://github.com/bitbased/sublime-spintools) (1 errors)

    • [x] Unknown keys in grammar: source.spin (in Spin.tmLanguage) contains invalid keys (bundleUUID)
  • [x] repository vendor/grammars/sublimetext-cuda-cpp (from https://github.com/harrism/sublimetext-cuda-cpp) (1 errors)

    • [x] Invalid regex in grammar: source.cuda-c++ (in cuda-c++.tmLanguage) contains a malformed regex (regex "\printf\b": unknown property name after \P or \p (at offset 2))
  • [x] repository vendor/grammars/vue-syntax-highlight (from https://github.com/vuejs/vue-syntax-highlight) (2 errors)

    • [x] Missing include in grammar: text.html.vue (in vue.YAML-tmLanguage) attempts to include text.slm but the scope cannot be found
    • [x] Missing include in grammar: text.html.vue (in vue.YAML-tmLanguage) attempts to include text.pug but the scope cannot be found
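
For the Turtle entries above, where PCRE rejects `\u` escapes, one possible repair is to rewrite each escape into the `\x{...}` form that PCRE does accept. The following is a minimal Python sketch of that rewrite, not the method used by the compiler or by any grammar author. The function name `rewrite_u_escapes` is hypothetical, and the sketch assumes that every run of up to four hex digits after `\u` belongs to the escape.

```python
import re

# Matches \u followed by 1-4 hex digits, unless the backslash is itself escaped.
# Group 1 keeps any preceding pairs of backslashes (literal backslashes) intact.
_U_ESCAPE = re.compile(r"(?<!\\)((?:\\\\)*)\\u([0-9A-Fa-f]{1,4})")


def rewrite_u_escapes(pattern):
    """Rewrite \\uXXXX escapes as \\x{XXXX} (hypothetical helper, see assumptions above)."""
    return _U_ESCAPE.sub(
        lambda m: m.group(1) + r"\x{" + m.group(2).upper() + "}",
        pattern,
    )


if __name__ == "__main__":
    # One of the regexes reported for source.turtle in the list above.
    print(rewrite_u_escapes(r"\[[\u20\u9\uD\uA]*\]"))
```

A rewritten pattern should still be checked against the real compiler, since a hex digit that follows a short escape such as `\u9` would be absorbed into it by this sketch.
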
Help Wanted

All 17 comments

Thanks for the useful and actionable list @vmg. One question...

repository vendor/grammars/language-babel (from https://github.com/gandm/language-babel) (14 errors)

Is this the actual upstream grammar as it stands now or the old pinned copy we're shipping with Linguist at the moment?

I'm assuming the latter, but thought I'd check to be sure.

I'm assuming the latter, but thought I'd check to be sure.

Correct! I fucked up my submodules. Sorry about that, I'll update the list with the proper URL.

Hey, sorry about the (really) slow response. My MacBook died last week, which means I've been painfully limited in what I'm able to do on GitHub (I'm using my work's computer for the time being, when time permits).

The issues reported by my grammars are easily fixed, but the LilyPond grammar should be fixed with a PR to the upstream repository. Basically, source.AtLilyPond should be replaced with just source.lilypond so it's consistent with other LilyPond grammars (and I can therefore replace the offending rule in text.roff with a single inclusion: {include: "source.lilypond"}).

What's the maximum token length permitted by the new compiler? I was about to start fixing the issues with Emacs Lisp's grammar, but realised I don't have the actual limit to go by.

Admittedly, I'm not really fond of the size limit, because the "fix" here is to simply break the pattern down into multiple rules that're bunched together under the same name. It feels terribly hacky, and the fact that the rules in question were compiled from an external source means that updating the list in future might become more complicated...

@Alhadis: Sorry, you caught me on holiday. The maximum size is enforced by PCRE, not by our parser, and it's 64 KB for a single regexp. I'm aware it's a bummer, but that's how PCRE was designed.
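
The workaround discussed here, breaking one huge alternation into several rules that share a name, can be sketched as a small generator step. This is an illustrative Python sketch only. The function name `split_alternation`, the `\b(?:...)\b` wrapper and the default budget are assumptions rather than anything the compiler prescribes. Source length in bytes is used as a rough proxy for PCRE's limit. The list above reports definitions of 46034 and 54020 bytes as too long, so the default budget is a conservative guess, not a documented threshold.

```python
import re


def split_alternation(words, max_bytes=32_000, prefix=r"\b(?:", suffix=r")\b"):
    """Group words into several alternation patterns of at most max_bytes each.

    A single word longer than the budget still gets its own (oversized) pattern.
    """
    overhead = len((prefix + suffix).encode("utf-8"))
    patterns, current, size = [], [], overhead
    for word in words:
        escaped = re.escape(word)
        cost = len(escaped.encode("utf-8"))
        separator = 1 if current else 0  # one byte for the "|" between alternatives
        if current and size + separator + cost > max_bytes:
            # Budget exceeded: close the current pattern and start a new one.
            patterns.append(prefix + "|".join(current) + suffix)
            current, size, separator = [], overhead, 0
        current.append(escaped)
        size += separator + cost
    if current:
        patterns.append(prefix + "|".join(current) + suffix)
    return patterns


if __name__ == "__main__":
    names = [f"Function{i}" for i in range(500)]
    chunks = split_alternation(names, max_bytes=120)
    print(len(chunks), max(len(c) for c in chunks))
```

Each returned pattern would become its own rule with the same `name`, which keeps the split mechanical when the word list is regenerated from an external source.
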

That's understandable. How is whitespace treated inside expressions which use "expanded" notation...?

```perl
m/
abc
(?:
xyz
)
(?=\w+)
/x;
```

Because there are two different ways to represent that in CSON. One is with an ordinary quoted-string, which includes embedded newlines as part of the pattern...

```cson
pattern: "(?x)
abc
(?:
xyz
)
(?=\w+)
"
```

... and the other is to use triple-quoted strings ("heredocs"):

```cson
pattern: """(?x)
abc
(?:
xyz
)
(?=\w+)
"""
```

The latter will strip as much indentation as it can, leaving some (but not all) horizontal whitespace after the CSON-to-JSON conversion:

```yaml
(?x)
abc
(?:
xyz
)
(?=\w+)
```

Now this won't make any difference to the regex engine, but it will to my subdivision efforts... :grinning:

@Alhadis I'm honestly not sure exactly how PCRE implements this -- you should be able to test it out by simply downloading libpcre and trying to compile the regexps. Our parser has no custom behavior here.
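
As a quick way to see the general behaviour of "expanded" notation without building libpcre, the same experiment can be run in Python. Python's `re` module is a different engine from PCRE, so this is only an analogy under that assumption, not evidence about PCRE itself. The indentation in the sample strings is invented for the illustration. The sketch compares an indented pattern, a dedented one and a hand-compacted one.

```python
import re

# Quoted-string form: indentation and newlines survive into the pattern.
quoted = "(?x)\n    abc\n    (?:\n        xyz\n    )\n    (?=\\w+)\n"
# Heredoc form: common leading indentation stripped, nested indentation kept.
heredoc = "(?x)\nabc\n(?:\n    xyz\n)\n(?=\\w+)\n"
# Hand-compacted form with no insignificant whitespace at all.
compact = r"abc(?:xyz)(?=\w+)"


def spans(pattern, samples):
    """Return the match span (or None) of pattern at the start of each sample."""
    compiled = re.compile(pattern)
    result = []
    for sample in samples:
        match = compiled.match(sample)
        result.append(match.span() if match else None)
    return result


samples = ["abcxyz1", "abc xyz1", "abcxyz", "abcxyz!"]

if __name__ == "__main__":
    for label, pattern in (("quoted", quoted), ("heredoc", heredoc), ("compact", compact)):
        print(label, spans(pattern, samples))
```

In Python's verbose mode all three forms behave identically on these samples. Whether PCRE agrees in every corner case still needs checking against libpcre, as suggested above.
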

Okay, that's the last of my grammars fixed. 😉

@Alhadis You closed this by mistake, right? :smile_cat:

Yeah, sorry. I didn't even notice I'd pressed the wrong button to comment. My mistake. 😓

@Alhadis :bow::bow::bow:

I've updated the sublime-mask output in the OP, as the latest grammar compiler now prefers the "compiled" .tmLanguage file over the YAML file, and sublime-mask has only updated the YAML file. I have pinged the author in https://github.com/tenbits/sublime-mask/pull/1 asking them to update the mask.tmLanguage file too.

repository vendor/grammars/oz-tmbundle (from https://github.com/eregon/oz-tmbundle) (1 errors)
Grammar conversion failed. File Originals/Oz.tmLanguage failed to parse: XML syntax error on line 23: expected element name after <

@vmg Do you have more information on that error? I don't see anything weird in the XML format...

I don't see anything weird in the XML format...

It's mixing tabs and spaces for indentation, but other than that, I can't see anything wrong either... :confused:

The issue is that it's loading Syntaxes/Oz.tmLanguage _and_ Originals/Oz.tmLanguage. The grammar in Originals is not even valid XML, but has the same extension, so we attempt to load both.

This is fixable in the compiler (most likely fix: ensure that the .tmLanguage file is in a Syntaxes folder -- although that could break other grammars). I can't get to that this week, but if anybody wants to pick it up it should be a pretty straightforward fix. The Go codebase is pretty simple! 😄
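
One way to reason about the proposed fix is a small path filter. The sketch below is in Python for brevity, not the compiler's Go code, and `select_tmlanguage_files` is a hypothetical name. It assumes a softer rule than "must be in Syntaxes": files outside a Syntaxes folder are skipped only when the repository has at least one .tmLanguage file inside one, so repositories that keep their grammar at the top level are unaffected.

```python
from pathlib import PurePosixPath


def select_tmlanguage_files(paths):
    """Pick the .tmLanguage files to load from a repository's file list.

    Prefer files under a "Syntaxes" directory; fall back to every
    .tmLanguage file when no such directory is used.
    """
    candidates = [p for p in paths if PurePosixPath(p).suffix == ".tmLanguage"]
    in_syntaxes = [
        p for p in candidates if "Syntaxes" in PurePosixPath(p).parts[:-1]
    ]
    return in_syntaxes or candidates


if __name__ == "__main__":
    # The oz-tmbundle layout described above: only the Syntaxes copy is kept.
    print(select_tmlanguage_files(["Syntaxes/Oz.tmLanguage", "Originals/Oz.tmLanguage"]))
    # A repository with a top-level grammar keeps loading it.
    print(select_tmlanguage_files(["MuPAD.tmLanguage", "README.md"]))
```

The fallback addresses the concern that a strict Syntaxes-only rule could break other grammars, at the cost of still loading stray files in repositories that have no Syntaxes folder.
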

Ok. Thanks for the explanation @vmg!
I won't have time this week either, but I'll try to look into it later. I need to better understand the Go code anyway.

IMHO, the Missing include in grammar: errors should be downgraded to warnings; or, at the very least, passed through a whitelist of manual approvals (similar to how we're whitelisting certain license hashes).

It's currently impossible for a grammar package to support both GitHub and their respective editors if two grammars disagree on their scopeName. For example, I can't publish this or even correctly view its Lex highlighting locally, because Atom's C++ grammar uses source.cpp as its scopeName, whereas Linguist's C++ grammar uses source.c++ instead.

The logical solution would be to include both:

```cson
patterns: [
  {include: "source.c++"}
  {include: "source.cpp"}
]
```

... which the compiler will flag as an error.
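
The whitelist idea could look roughly like the following. This is a Python sketch of the proposal rather than the compiler's actual Go logic. The function name `check_includes`, its parameters and the `allowed_missing` set are all hypothetical. Local references (`#rule`, `$self`, `$base`) are assumed to be resolved elsewhere and are skipped.

```python
def check_includes(grammar_scope, includes, known_scopes, allowed_missing=frozenset()):
    """Split unresolved includes into errors and whitelisted warnings."""
    errors, warnings = [], []
    for scope in includes:
        # Repository-local and self references never name an external grammar.
        if scope.startswith("#") or scope in ("$self", "$base"):
            continue
        if scope in known_scopes:
            continue
        message = (
            f"{grammar_scope} attempts to include {scope} "
            "but the scope cannot be found"
        )
        if scope in allowed_missing:
            warnings.append(message)  # manually approved: report, don't fail
        else:
            errors.append(message)
    return errors, warnings


if __name__ == "__main__":
    errors, warnings = check_includes(
        "source.lex",
        ["source.c++", "source.cpp", "#comments"],
        known_scopes={"source.c++"},
        allowed_missing={"source.cpp"},
    )
    print(errors, warnings)
```

With `source.cpp` approved, the dual-include pattern above would produce a warning instead of an error, while an unapproved missing scope would still fail.
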
