Node: 100% CPU caused by thread spike (in Fibers, Meteor)

Created on 16 Apr 2018  路  19Comments  路  Source: nodejs/node

Version: 8.11.1
Platform: any

We have been experiencing 100% CPU in production, presumably caused by the use of Fibers in Node.js by Meteor and the way the data lookup for threads is implemented in V8.

While this issue probably should not be fixed in this repository it seemed worthwhile to track it here as well, in order to find out whether more users of Node.js are suffering from this issue.

As @kentonv describes the issue in the Meteor repository:
V8 uses a linked list to map thread IDs to some thread-local data. Fibers aren't threads, but the fibers package fakes out V8 into thinking that fibers are threads, so this applies to fibers too. If a large number of fibers are created, the list becomes long, and so every lookup into the table becomes slow, because a lookup in a linked list is O(n). These lookups happen frequently, slowing down everything. Using a hash table makes the lookups O(1).

Issue on Meteor repository:
https://github.com/meteor/meteor/issues/9796

Issue on Fibers repository:
https://github.com/laverdet/node-fibers/issues/371

Commit which fixes the issue in Node.js:
https://github.com/nodejs/node/commit/0b882566078cb14537f9f91220f1a58530fa7687

V8 issue:
https://bugs.chromium.org/p/v8/issues/detail?id=5338

Proposed contribution which fixes the issue in V8:
Not available yet.

V8 Engine

All 19 comments

node already has the patch, and v8 is oss, why not just submit the patch to them? this also seems like the wrong place to be asking.

@devsnek

"While this issue probably should not be fixed in this repository it seemed worthwhile to track it here as well, in order to find out whether more users of Node.js are suffering from this issue."

I've created a PR in the V8 repo: https://github.com/v8/v8/pull/24/

@KoenLav please note v8 doesn't accept prs to that repo, they use gerrit. you can follow this guide to learn how to submit a patch: http://dev.chromium.org/developers/contributing-code

i'm going to close this for now, feel free to ping me if you have any other questions.

ping @nodejs/v8 just in case. Although seems like this is on the right track as is.

@devsnek no problem to leave this here closed (people should still be able to find it more easily).

@devsnek the V8 patch has landed (https://chromium-review.googlesource.com/c/v8/v8/+/1014407). Any indication of when it will be available in Node.js?

@KoenLav I have requested an upstream merge for V8 6.7 and 6.6 so that that can end up in Node 10. Let's see if upstream is willing to back merge. Otherwise I'm +1 on floating it as a patch here otherwise.

@KoenLav you can run git node v8 backport b49206ded97c4eaac7c273ce004d840a0185d40e (using https://github.com/nodejs/node-core-utils/) in a checkout of node and then open a pr

@KoenLav you can run git node v8 backport b49206ded97c4eaac7c273ce004d840a0185d40e (using https://github.com/nodejs/node-core-utils/) in a checkout of node and then open a pr

Yes, but wait for a disposition upstream first.

@devsnek @ofrobots awaiting merge from upstream first.

Any chance of getting this applied in Nodejs 8.* as well?

Yes, that is possible. The LTS policy is that a fix has to be released on the current branch for a bit before we back-port to LTS branches. This helps shake any potential instability. The process as I see it, from here:

  1. If upstream approves a merge, we will pick up the fix on the next V8 update.
  2. If upstream doesn't want to merge back, we can float the fix it on master. The release team will back-port and release on 10.

If all looks good, this will subsequently be considered for a merge back to older release branches.

@ofrobots as the merge has been approved for V8 6.7, what would be the proper course of action from here?

Our fuzzers found a crash so I would like to understand what's going on there before moving forward with this. @ofrobots has been CC'ed on the issue.

@hashseed link?

@KoenLav unfortunately I can't share the link as it is security sensitive by default. Kenton has access to it though.

@hasheed would be happy to try and shed some light if you CC ;)

It appears to have been a problem with the fuzzer setup, not an actual bug in the code.

@kentonv good to hear!

Have they communicated any ETA on merging this commit to V8 6.7? Or is it already done?

I've opened the 6.7 merge change upstream. Once there is LGTM, this can land upstream, and will be picked up by Node.js when we move up to V8 6.7 (~ end of the month).

For V8 6.6, we need to float the patch here. I've opened PR here: #20727

Was this page helpful?
0 / 5 - 0 ratings