Followup to #389 & #1131. I'm taking a look at this for work because we have a few thousand files to lint. Running them in serial takes a _long_ time. Our options are:
Within option 2 I see there being two possibilities:
- `Runner` and `Linter` to be asynchronous
- `Linter` in particular would be a breaking API change
- `ProcessSplitter` class between `tslint-cli.ts` and `Runner`
- `--parallel` is provided and > 1
- `--parallel`
- 1/<`--parallel`> of those files

Thoughts?
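For illustration, the "1/<`--parallel`> of those files" split could be sketched roughly like this. `splitFiles` is a hypothetical helper, not an existing TSLint function, and the round-robin distribution is an assumption (contiguous slices would work just as well):

```typescript
// Hypothetical helper, not part of TSLint: distribute files round-robin
// across `parallel` groups so each process gets about 1/parallel of them.
function splitFiles(files: readonly string[], parallel: number): string[][] {
    // Never create more groups than files, and always create at least one.
    const count = Math.max(1, Math.min(Math.floor(parallel), files.length));
    const chunks: string[][] = Array.from({ length: count }, () => []);
    files.forEach((file, index) => chunks[index % count].push(file));
    return chunks;
}
```

Each resulting group could then be handed to its own process, with the parent collecting the results.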
Some random thoughts:
async processing:
- `--type-check` or do we get race conditions from that?
- `Runner` async
- `glob-stream` for globbing to start linting as fast as possible (although that's not the bottleneck)
- `ts.Program` (created when `--project` is passed) when not type checking. We could then avoid the globbing and reuse the `SourceFile`s instead of reading the files again and creating new `SourceFile`s
- `Configuration` and `RuleLoader`
- `Linter` async does not seem to be necessary?
- `ts.updateSourceFile` instead of `this.getSourceFile`

multiple threads:

result cache:
Should probably do #2282 first.
Also, #2235 would address the problem of reading files multiple times.
Glad the reaction has been favorable so far. :) I'll start work on some of the low hanging fruit.
Re the downsides of multiple threads: agreed, and this should definitely not be used for "normal" (most) projects. The scenario I'm approaching this for is overpowered 16-core monstrosities (dev machines, build lab machines, obscene Azure VMs, and the like) made to deal with huge code bases. They're fantastic at MSBuild/gcc/etc. but amusingly running TSLint tasks slows them down something terrible. The argument could (and should) be made that projects big enough to slow down TSLint should be split up into sub-projects, and those sub-projects' TSLint tasks should be parallelized... but that's not always possible.
maybe merge-able with https://github.com/palantir/tslint/issues/943
Parallel linting is not supported so far; my workaround is:
find src -name '*.ts' | parallel node_modules/.bin/tslint -p . --fix
I sometimes also use git status to skip some files.
Hi there,
I have one suggestion for this: what if we move all rules that don't require type checking into separate processes and split all files between them, but keep all rules that do require type checking in the main thread (to avoid compiling the project multiple times)? 🤔
I just played with that and have some results.
I created 3 separate processes (`os.cpus().length - 1`) to lint with the non-typed rules and left linting with the typed rules in the main thread. So each separate process lints ~630 files with 123 rules, and the main thread lints all 1894 files with 9 rules.
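A minimal sketch of that split, assuming each rule can be tagged with whether it needs type information. `RuleInfo`, `LintPlan`, and `planLint` are hypothetical names for illustration only; this is not the code from my experiment and not TSLint's API:

```typescript
// Hypothetical shapes for illustration only.
interface RuleInfo {
    name: string;
    requiresTypeInfo: boolean;
}

interface LintJob {
    files: string[];
    rules: string[];
}

interface LintPlan {
    workers: LintJob[]; // non-typed rules, files split between processes
    main: LintJob;      // typed rules, all files, single ts.Program
}

function planLint(files: readonly string[], rules: readonly RuleInfo[], cpuCount: number): LintPlan {
    const typed = rules.filter((r) => r.requiresTypeInfo).map((r) => r.name);
    const untyped = rules.filter((r) => !r.requiresTypeInfo).map((r) => r.name);
    // Leave one core for the main thread, as in os.cpus().length - 1.
    const workerCount = Math.max(1, cpuCount - 1);
    const workers: LintJob[] = Array.from({ length: workerCount }, () => ({
        files: [] as string[],
        rules: untyped,
    }));
    files.forEach((file, index) => workers[index % workerCount].files.push(file));
    return { workers, main: { files: [...files], rules: typed } };
}
```

With 4 cores this yields 3 worker jobs that each see about a third of the files, plus one main-thread job that sees every file but only the typed rules.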
In my runs, each worker process takes ~70-85 seconds, and the main thread lints all files in ~70 seconds. Creating the `ts.Program` takes 15-16 seconds. So the total time to lint the whole project is ~100-105 seconds. `tslint` from npm does it in ~170 seconds (the parallel version is ~40% faster).
To summarize the results:
So, the parallel version is ~40% faster in my case.
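To spell out where the ~40% comes from, here is the arithmetic on the timings quoted above (170 seconds serial versus 100-105 seconds parallel). `timeReduction` and `speedup` are just illustrative helper names:

```typescript
// Fraction of wall-clock time saved relative to the serial run.
function timeReduction(serialSeconds: number, parallelSeconds: number): number {
    return (serialSeconds - parallelSeconds) / serialSeconds;
}

// How many times faster the parallel run is.
function speedup(serialSeconds: number, parallelSeconds: number): number {
    return serialSeconds / parallelSeconds;
}

const bestCase = timeReduction(170, 100);  // 70 / 170, roughly 0.41
const worstCase = timeReduction(170, 105); // 65 / 170, roughly 0.38
```

So "~40% faster" here means about 40% less wall-clock time, which corresponds to a speedup of roughly 1.6x to 1.7x.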
What do you think about that?
For me (I'm not very familiar with tslint's code) there are still some open questions:
`LintResult` object to the main thread. In my case I just use `errorCount` and `output` from that object - it seems that's enough to print the error(s).

/cc @JoshuaKGoldberg @ajafff
This looks like a great start @timocov! Super exciting that you got this to work 👏
- We can't send/receive the code/nodejs state between processes
Why not? Can you elaborate? https://nodejs.org/api/child_process.html#child_process_options_stdio implies there are some built-in options there.
Applying fixes is done by a formatter AFAICS. For both 2/fixes and 3/sorting, from my perspective it should be possible to run something like:
Thoughts?
Why not? Can you elaborate? https://nodejs.org/api/child_process.html#child_process_options_stdio implies there are some built-in options there.
I meant that we can't just send a whole class instance from one process to another without implementing (de)serialization code. Or can we? If so, I can't figure out how Node.js does it 🙂
Of course, we can send serialized state/JSON between processes to share results (and this is what I did in my first implementation).
If you wish, I can prepare the code of the first implementation so we can discuss it (you can actually already see a rough solution here, but I need to make it more readable and remove duplication).
we can send serialized state/JSON between processes to share results (and this is what I did in my first implementation)
Awesome, that's exactly what we'd need. +1 to preparing the code!
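As a rough sketch of what "serialized state/JSON" implies: a class instance loses its prototype when it goes through JSON, so only plain data such as `errorCount` and `output` survives, and the main thread merges those. `WorkerResult`, `ResultHolder`, and `mergeResults` are hypothetical names, not TSLint types, and the sample output string is made up:

```typescript
// Hypothetical plain-data shape a worker could send back.
interface WorkerResult {
    errorCount: number;
    output: string;
}

class ResultHolder implements WorkerResult {
    constructor(public errorCount: number, public output: string) {}
    hasErrors(): boolean {
        return this.errorCount > 0;
    }
}

// What would cross the process boundary: a JSON string, not the instance.
const wire = JSON.stringify(new ResultHolder(2, "a.ts: 2 failures\n"));

// The receiver gets a plain object: the data is there, but methods such
// as hasErrors() are gone and `instanceof ResultHolder` is false.
const received = JSON.parse(wire) as WorkerResult;

// The main thread combines the plain results from all workers.
function mergeResults(results: readonly WorkerResult[]): WorkerResult {
    return results.reduce<WorkerResult>(
        (acc, r) => ({
            errorCount: acc.errorCount + r.errorCount,
            output: acc.output + r.output,
        }),
        { errorCount: 0, output: "" },
    );
}
```

Keeping the message to plain fields like these avoids writing custom (de)serialization code for whole class instances.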
@JoshuaKGoldberg @adidahiya I just created #4483 with somewhat cleaned-up code for running the linter in parallel.
As I said before, I'm not very familiar with tslint's code base and may have made some epic fails in the PR, so keep this in mind 🙂.
☠️ TSLint's time has come! ☠️
TSLint is no longer accepting most feature requests per #4534. See typescript-eslint.io for the new, shiny way to lint your TypeScript code with ESLint. ✨
It was a pleasure open sourcing with you all!