This is derived from the conversations had here: https://github.com/bcoe/c8/issues/116
Is your feature request related to a problem? Please describe.
The problem is with intergrating V8 coverage into downstream systems such as test runners or processes which run nodejs programs.
Describe the solution you'd like
The most ideal solution would be something like
// where it comes from isn't too important
const coverage = require('some-nodejs-module-maybe?');
const collection = coverage.collectCoverage({reporters: ['text-summary', 'lcov']});
jasmine.execute();
const reports = await collection.completeCoverage();
const textSummaryReport = reports.get(''text-summary');
console.log(textSummaryReport);
cons lcovReport = reports.get('lcov');
fs.writeFileSync('to/some/location', lcovReport);
// or maybe merge/process it in some other way
However it's my understanding that this approach wouldn't work very well since internally the collectCoverage would have to activate covergae via the inspector api which isn't the best.
https://github.com/bcoe/c8/issues/116#issuecomment-502826818
With that in mind another approach is using the NODE_V8_COVERAGE=/some/dir env var.
However an issue came accross with this one where we need to somehow signal Nodejs to write out the coverage file since we need to process it in in the same process.
https://github.com/bcoe/c8/issues/116#issuecomment-503039423
Describe alternatives you've considered
The two points above
IIUC the feature request that actually applies here is the ability to turn on coverage at run time and write out the reports at some time other than right before exiting? That should be possible with two bindings. However off the top my head, we need to notify (at least) the (subsequently spawned) child processes about the change of coverage directory after the API is invoked. One option might be to set process.env. NODE_V8_COVERAGE but that could also be too surprising.
the ability to turn on coverage at run time and write out the reports at some time other than right before exiting
Yeah I think that would solve our problem here, thanks for summing that up.
I'm not very familar with whats going on here inside Nodejs so leaning on your knowledge to come up with the best solution.
I'm not sure what you mean by the spawned child process in this situation though.
What part of this is spawning a sub process?
I'm not sure what you mean by the spawned child process in this situation though. What part of this is spawning a sub process?
I think @joyeecheung is talking about child processes being spawned by the script being profiled.
In the linked issue @bcoe says that working with the inspector API is problematic but it's not clear to me what the difficulties are. I'm inclined to say no change to Node is necessary because all the building blocks already exist:
$ grep 'command .*Coverage' deps/v8/src/inspector/js_protocol.pdl
command getBestEffortCoverage
command startPreciseCoverage
command stopPreciseCoverage
command takePreciseCoverage
@bcoe Can you expand? Maybe @SimenB can chime in too?
@bnoordhuis old issues on the c8 repo are a pretty good history of why interacting with the Inspector protocol doesn't work great for test coverage:
@Toxicable is the problem that you want to collect coverage for an application before it exits? I'm not fully understanding why running Node.js with an environment variable is a blocker.
@SimenB brings up the need for "running tests in watch mode", which I think means you'd want coverage before the process exits?
In this case, I wonder could you run the script with NODE_V8_COVERAGE set, so that source is instrumented for block coverage immediately, and then use an inspector session like @bnoordhuis suggests?
@bnoordhuis see my post here: https://github.com/bcoe/c8/issues/116#issuecomment-502585658
Using the inspector API is not as accurate as the env var. Running with NODE_V8_COVERAGE makes no difference. Is it because some new session is created instead of connecting to the current one? I have to admit I don't know why they differ, but point 2 brought up above here _sounds_ like a likely suspect to me.
If I can run with NODE_V8_COVERAGE to trigger some instrumentation and access that through the inspector API, that'd work for us. The API is a bit clunky with i's callbacks and whatnot, but I'll happily go for that! More than good enough 馃檪
@SimenB brings up the need for "running tests in watch mode", which I think means you'd want coverage before the process exits?
Yes - I'd like to print the coverage after each run. The process might be running for hours (I've had it going for the 8 hours I'm at work multiple times without ever exiting it). I'd also like to clear collected coverage before kicking off a new run
@Toxicable is the problem that you want to collect coverage for an application before it exits? I'm not fully understanding why running Node.js with an environment variable is a blocker.
@bcoe Being able to gather and process the report in memory, without touching the FS is the high level thing I want here
So possibly some API such as
// the process should be started with `NODE_V8_COVERAGE`
require('assert').notNull(process.env['NODE_V8_COVERAGE']);
//run some code that needs to be instrumented
jasmine.execute();
// stop the current run of coverage
const v8Report = require('inspector').send('stopPreciseCoverage');
// process the report in memory
@bcoe Apropos (1), calling inspector.close() from process.exit() should work. If it doesn't, please file a new issue with a test case and I'll take a look.
W.r.t. (2) and cc @SimenB: start the inspector like this to get block coverage:
const options = {callCount: true, detailed: true};
session.post('Profiler.startPreciseCoverage', options, () => { /* ... */ });
As far as I'm aware that works fine (at least with Node.js master) with an already running isolate. If it doesn't, I'd like to know what exactly doesn't work.
Is that sufficient? Is anything else needed?
I agree with @bnoordhuis that this can be done with using the inspector protocol directly. NODE_V8_COVERAGE is also implemented with inspector but it's in core because we also want to use it to collect coverage of internals which requires the connection to be created earlier before the bootstrap, and other implications mentioned by @bcoe in https://github.com/nodejs/node/issues/28283#issuecomment-504240397.
If the request is to get user land coverage on demand, then none of those issues apply here. I don't really see why it can't be done with the inspector module directly?
Didn't know about those options! I'll test it out and report back.
It would be nice with a cleaner api, but that should be very much possible to build in userland
@joyeecheung @bnoordhuis if we could pull it off without too much trouble, I think it would be a somewhat cleaner API if someone could trigger the existing inspector session we have open and tell it to dump its coverage.
My thinking is, then @SimenB or @Toxicable could start their process with NODE_V8_COVERAGE set... my gut is that this does have a few benefits from a simplicity point of view: you don't need a second inspector session, you piggyback off the one already running; you ensure coverage was enabled before any of your applications dependencies load; the upstream library doesn't need to speak the inspector API.
I'm not quite sure what the mechanism would be for triggering a dump ... maybe if you have NODE_V8_COVERAGE enabled, SIGUSR1 or SIGUSR2 would trigger a process to dump, without interfering with any other hooks on these signals?
If we're not open to adding this to Node.js, I think that it's a great candidate for a userland module that @SimenB, @Toxicable, and myself could hopefully share for our use-cases.
For Jest's use case, it's very explicit when we load user code, so we don't need any coverage guarantee about "library code". As long as vm and hopefully worker_threads are supported, we're good. Hopefully we won't need the env variable(?)
More than happy to figure out some abstraction for collecting the coverage in user land that can be shared, sounds like a great goal.
you ensure coverage was enabled before any of your applications dependencies load; the upstream library doesn't need to speak the inspector API.
This also isn't a concern for us.
There is a clear separation between our "framework" code and the users code that we actually want to instrument and we control when it's called.
@bcoe With that in mind, do you think c8 would be the place for such an api?
All it would do is call
const options = {callCount: true, detailed: true};
session.post('Profiler.startPreciseCoverage', options, () => { /* ... */ });
But just to ensure anyone integrating it downstream dosen't have to have this conversation again
Adding the options worked an absolute treat - I get the same numbers as if starting the node process with NODE_V8_COVERAGE now. Thanks! I guess we can move the discussion about abstractions on top of the inspector API back to the c8 repo?
EDIT: btw, the docs state that worker_threads are not supported (https://nodejs.org/api/cli.html#cli_node_v8_coverage_dir) - is that a limitation of the env var or worker_threads? And is there an issue I can subscribe to for it? While Jest uses the child_process module for now to do parallelization, we also have an implementation using worker_threads, it's just not enabled by default yet
Oh actually, there is one API I'm missing - getting instrumentation data for an uncovered file. istanbul-lib-instrument has a readInitialCoverage function which does this, but it relies on Babel: https://github.com/istanbuljs/istanbuljs/blob/371adb154f0189b793e636b5bbcc63f183d36c2b/packages/istanbul-lib-instrument/src/read-coverage.js. Is that possible with v8 coverage? Or are we bound to _executed_ code?
@SimenB, @Toxicable let's move this conversation to c8 and/or v8-to-istanbul, as from conversations we've been having it sounds like the inspector API is working like a charm 馃憤
Most helpful comment
@joyeecheung @bnoordhuis if we could pull it off without too much trouble, I think it would be a somewhat cleaner API if someone could trigger the existing inspector session we have open and tell it to dump its coverage.
My thinking is, then @SimenB or @Toxicable could start their process with NODE_V8_COVERAGE set... my gut is that this does have a few benefits from a simplicity point of view: you don't need a second inspector session, you piggyback off the one already running; you ensure coverage was enabled before any of your applications dependencies load; the upstream library doesn't need to speak the inspector API.
I'm not quite sure what the mechanism would be for triggering a dump ... maybe if you have
NODE_V8_COVERAGEenabled,SIGUSR1orSIGUSR2would trigger a process to dump, without interfering with any other hooks on these signals?If we're not open to adding this to Node.js, I think that it's a great candidate for a userland module that @SimenB, @Toxicable, and myself could hopefully share for our use-cases.