JavaScript is a syntactically complex language, and its complexity continues to increase (#1269). As a result, the syntax test file is over a thousand lines. Similar tests are often grouped together, but there's no overall organizational scheme. As a result, it can be hard to find tests pertaining to a certain feature, and it's even harder to verify that the tests are comprehensive. (I suspect that there are a lot of holes, and of course adding more tests could exacerbate these issues.)
A natural solution might be to split the tests across several files. Each file could cover a major aspect of the language in a single, manageable chunk. The downside is that there would then be multiple files to test.
Currently, the Python and C bundles each have two syntax test files per language. As far as I am aware, no core bundles have more than that. How many ways is it reasonable to split a syntax test? For JavaScript, I might imagine separate files for core syntax, strings & regular expressions, and ES6 classes. Each of these files could be individually longer than most languages' syntax tests.
Thoughts?
I guess you've not looked at the C# package :) https://github.com/sublimehq/Packages/tree/9e83b95ae096816005c69af1d675f065ed079a17/C%23/tests
I saw the syntax_test_c#.cs at the bottom of the package and somehow missed the entire tests directory. With that in mind, I suppose there's no reason not to split the JavaScript tests accordingly.
The YAML package also has several test files inside a subfolder. I personally prefer using multiple files for organization of tests covering various areas of a language, but @wbond was more fond of single test files, which would add to the overall test file complexity and make it more likely to notice regressions.
Imo, closing each file with a test assertion against any sort of meta scope (see also the python test) is sufficient to check against potential context leaking.
There have been multiple times where changes to the C syntax have been caught by the interaction of different contexts, which would not have been caught if the tests were split out into different files. A test at the end of the file for a meta scope would catch certain errors, however the syntax may be in an incorrect context that doesn't have a meta scope. The error may not show up until you try and parse further syntax.
I am not against organizing files into sections, but I think there is a benefit to having the tests in a single file.
I've definitely run into similar situations with the JavaScript syntax. I wonder if there's a good, systematic way to test for such errors explicitly rather than implicitly. In JavaScript, a short sanity check at the end of each file might do it. For instance:
[foo];
^ meta.sequence punctuation.section
If the parse was in the wrong context at this point, the opening bracket might instead match meta.brackets punctuation.section, punctuation.definition.symbol, string, or something else.
Most helpful comment
I saw the
syntax_test_c#.csat the bottom of the package and somehow missed the entiretestsdirectory. With that in mind, I suppose there's no reason not to split the JavaScript tests accordingly.