Node: RFC: add a flag similar to --prof, but based on v8 CPU profiler

Created on 23 Mar 2019  路  3Comments  路  Source: nodejs/node

Hi all,

I am currently working on adding a flag similar to --prof but use the better-maintained v8 CPU profiler instead. Working prototype is here, which reuses the recent refactored NODE_V8_COVERAGE implementation (also depends on another refactor in https://github.com/nodejs/node/pull/26874).

When running

node --cpu-prof entry.js

It starts the CPU profiler at start up, when the Node.js instance (main thread or worker threads) exits, it writes the CPU profile under the current working directory with the file name returned by the unified DiagnosticFileName('CPU', 'cpuprofile') (example: CPU.20190323.191111.95080.0.cpuprofile).

It also provides --cpu-prof-name for the user to specify the path where the profile will be written to (the directory of the path would be created if it does not exist yet). (we can't make that a value of --cpu-prof because then node --cpu-prof entry.js would be ambiguous, or at least our current option parser does not support something smart enough to work around it).

The generated CPU profile can be visualized using Chrome's dedicated DevTools for Node.js (available in chrome://inspect), this is much easier to use than --prof-process for --prof, although V8's CPU profiler is currently only capable of profiling JavaScript. However there are also discussions about including native stacks there as well (according to @psmarshall ).

We may also be able to hook this into our benchmark runner to get CPU profiles for the core benchmarks.

Some questions regarding this feature:

  • Should the CPU profiler be started before bootstrap, or before pre-execution, or right before we start to execute the first user script?

    • Bootstrap includes the setup of the Node.js environment that does not depend on any run time states e.g. CLI flags, environment variables. This part of execution will go away once embedded v8 snapshot is implemented and used in core (because then we'll be deserializing the context from binary data instead).

    • Pre-execution includes the setup that depends on run time states (e.g. process.argv, process.env), as well as preparations done before executing the entry point (e.g. running preloaded modules specified with --require) - currently these two are mixed together but we should be able to refactor and separate them.

  • The prototype is currently doing something similar to coverage collection when it serialize the results: it does a JSON::Parse on the message returned from the inspector protocol, and then does a JSON::Stringify on the specific subtree (in this case, message.result.profile). It should be fine as an initial implementation, however:

    • This is not exactly cheap, but it saves us the trouble of maintaining a CPU profile serializer by ourselves (this is not currently available in v8-profile.h)

    • For reference v8-inspector maintains their own serializer, but the format of the CPU profile is unspecified and subject to changes so that requires manual maintainance unless the support is added in the upstream

    • Is it possible to customize the inspector protocol or improve the inspector API to get the results directly and avoid this detour? cc @nodejs/v8-inspector

Any feedback regarding this idea is welcomed!

cli discuss inspector performance

Most helpful comment

I think it's a good idea to keep a --cpu-prof-process-like option to avoid requiring a browser to view results.

It would be good as a follow-up for the flag that writes the CPU profile. I can think of two ways to implement this:

  1. We implement this by ourselves. However since the CPU profile format is subject to changes, we need to maintain it by ourselves and it could become in the same situation as --prof-process which is sometimes broken without being noticed (even though the scripts mostly comes from deps/v8/tools) - or it does work, but does not produce enough meaningful output.
  2. V8 implements this in their tools (similar to --prof-process) and we vendor it in over here along with v8 (or it could be placed somewhere else maintained by the upstream). This may get better test coverage in the upstream, since unlike --prof the CPU profiler is considered public and is well-supported.

Even if we go with option 1, I think this should ideally be developed in a separate repo, published to npm, before it matures and be moved here (similar to node-report). That is not even blocked by the feature proposed here though, since it is just a regular CLI that takes a CPU profile and prints things.

This also helps moving the v8 CPU profiler from tier 3 in our diagnostics tooling support to tier 1.

cc @nodejs/v8 @nodejs/diagnostics

All 3 comments

I think it's a good idea to keep a --cpu-prof-process-like option to avoid requiring a browser to view results.

I think it's a good idea to keep a --cpu-prof-process-like option to avoid requiring a browser to view results.

It would be good as a follow-up for the flag that writes the CPU profile. I can think of two ways to implement this:

  1. We implement this by ourselves. However since the CPU profile format is subject to changes, we need to maintain it by ourselves and it could become in the same situation as --prof-process which is sometimes broken without being noticed (even though the scripts mostly comes from deps/v8/tools) - or it does work, but does not produce enough meaningful output.
  2. V8 implements this in their tools (similar to --prof-process) and we vendor it in over here along with v8 (or it could be placed somewhere else maintained by the upstream). This may get better test coverage in the upstream, since unlike --prof the CPU profiler is considered public and is well-supported.

Even if we go with option 1, I think this should ideally be developed in a separate repo, published to npm, before it matures and be moved here (similar to node-report). That is not even blocked by the feature proposed here though, since it is just a regular CLI that takes a CPU profile and prints things.

This also helps moving the v8 CPU profiler from tier 3 in our diagnostics tooling support to tier 1.

cc @nodejs/v8 @nodejs/diagnostics

--cpu-prof was implemented. Can we close?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jmichae3 picture jmichae3  路  3Comments

srl295 picture srl295  路  3Comments

mcollina picture mcollina  路  3Comments

filipesilvaa picture filipesilvaa  路  3Comments

loretoparisi picture loretoparisi  路  3Comments