Node: Significant map/reduce performance regression in node 12

Created on 28 May 2019 · 31 Comments · Source: nodejs/node

  • Version: v12.1.0
  • Platform: Windows 10 64-bit

Some background: I recently read an old article that, among other things, compared the performance of map/reduce versus a for loop in Node (version 6 at the time). After seeing a comment about the performance changes in more recent versions of V8, I decided to see whether the results were any different on newer versions of Node. So I set out to reproduce their test and results across newer Node versions, and ended up stumbling upon what appears to be a rather significant performance regression.

The test is fairly simple: take a list of 32 million random numbers and calculate the sum of the squares. The full code for the test I ran can be found here: https://gist.github.com/kbjr/1a551ddb3b59e9685b04bb051daef44c

But the two basic samples are these:

const result = data
  .map((x) => x ** 2)
  .reduce((memo, x) => memo + x, 0);

let result = 0;

for (let i = 0; i < data.length; i++) {
  result += data[i] ** 2;
}

The for loop version is faster, but that has always been the case and is not the point of this issue. Here are the results I got from running this test on 10.15.0:

runMapReduce        12602.4ms   13285.9ms   12729.4ms
runForEach          2407.41ms   2358.85ms   2432.26ms
runForLoop          69.4525ms   47.4004ms   56.8418ms
runForLoopNoLength  66.4152ms   51.5009ms   55.3084ms

That looks about the same as the original results in the article I had read, so I wasn't really surprised by the outcome. What did surprise me was what happened when I ran the same test on 12.1.0:

runMapReduce        126262ms    126607ms    129501ms
runForEach          1451.66ms   1457.29ms   1490.63ms
runForLoop          38.9766ms   45.9452ms   34.7133ms
runForLoopNoLength  38.9530ms   44.0065ms   35.1938ms

The other tests all actually got small speed-ups, but map/reduce slowed down by basically a whole order of magnitude.

V8 Engine performance

Most helpful comment

Short status update: the regression likely comes from a change in V8's GC where we reduced the page size. That moved an odd performance cliff that existed before, in a way that this test case exposes. We are working on this.

All 31 comments

iirc fast paths for handling holey arrays (the one you create is holey) were removed while rewriting parts of v8 internals.

@nodejs/v8

> iirc fast paths for handling holey arrays (the one you create is holey) were removed while rewriting parts of v8 internals.

By "holey", you mean "has gaps"? I don't think data has gaps in the code provided:

let data = new Array(size);

for (let i = 0; i < size; i++) {
    data[i] = rand();
}

Or am I misunderstanding?

@trott it's an internal array kind in V8 describing its memory layout. holey arrays operate more like dictionaries since they could be missing indices. afaik even though the array is eventually filled it remains in that initial type.
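
For anyone unfamiliar with V8's elements kinds, here is a minimal sketch (not from the original gist) contrasting the two construction patterns being discussed. The PACKED/HOLEY labels in the comments follow the explanation above rather than anything this snippet verifies; to inspect the actual elements kind you could run node with --allow-natives-syntax and call %DebugPrint(arr), assuming a Node/V8 build where that native reports element details.

```
// Sketch of the two ways the test data can be built; labels follow the
// explanation above.

const size = 1000;

// Pre-sized: the array starts out with `size` holes, so V8 marks its
// elements as HOLEY. Per the comment above, filling every index afterwards
// does not transition it back to a PACKED kind.
const preSized = new Array(size);
for (let i = 0; i < size; i++) {
  preSized[i] = Math.random() * 100;
}

// Built by push: the array never contains a hole, so it keeps a PACKED
// elements kind (PACKED_DOUBLE_ELEMENTS here, since the values are doubles).
const pushed = [];
for (let i = 0; i < size; i++) {
  pushed.push(Math.random() * 100);
}
```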

@devsnek If we assume that is accurate, does that mean that populating the array like

const data = [ ];

for (...) {
  data.push(rand());
}

should remove this "holey" status and make the process fast again?

If that's the case, I would be happy to rerun the test like this to verify.

@kbjr that would indeed not be a holey array

So, I ran the test again like this, but the results are basically the same:

runMapReduce        126969ms    126892ms    126506ms
runForEach          1642.75ms   1642.28ms   1688.50ms
runForLoop          37.5993ms   42.6008ms   34.8042ms
runForLoopNoLength  36.9135ms   45.7981ms   34.7570ms

The code change:

// let data = new Array(size);

// for (let i = 0; i < size; i++) {
//  data[i] = rand();
// }

let data = [ ];

for (let i = 0; i < size; i++) {
    data.push(rand());
}

i guess that's not the issue then :)

Something contributing to this might be my rewrite of V8's exp operator, which did make it a bit slower (and more accurate), but I doubt that would account for the enormous difference here.

What interests me the most is that there doesn't seem to be any impact on the forEach test (in fact, it's actually a bit faster).

If it was something to do with the way V8 reads the array (or how the exp operator works), I would expect to see a similar result in that test as well (or all of them, if it were the exp operator).

My gut instinct is that it's something to do with the map function and the way it creates the new array, but that's just a hunch. I'm going to try breaking the test apart into a separate map and reduce and see whether one of those steps is indeed causing the majority of the slowdown.

Okay, I ran the map and reduce separately with the following new tests:

let squares;

const runMap = () => runWithTimer(() => {
    squares = data.map((x) => x ** 2);
});

const runReduce = () => runWithTimer(() => {
    const result = squares.reduce((memo, x) => memo + x, 0);
});

And here are the results (only run once because I'm lazy and this takes a while):

runMap              126779ms
runReduce           862.096ms

So it is most definitely the map call that is slow.

I'm going to try map without the exponent, just to rule that out as well...

As expected, removing the exponent had basically no effect:

runMap              124786ms
runReduce           883.284ms

// squares = data.map((x) => x ** 2);
squares = data.map((x) => x);

So, it seems the real regression comes down to this: the map call itself is very slow, even if you do literally nothing in the callback.

So, here's the most basic reproducible case:

console.log('Creating test data');

const data = [ ];

for (let i = 0; i < 32000000; i++) {
    data.push(Math.random() * 100);
}

console.log('Running test');

const start = Date.now();

data.map((x) => x);

console.log(`Done. Duration: ${Date.now() - start}ms`);

And the output:

```
λ nvm use 10.15
10.15.0
Now using node v10.15.0 (64-bit)

λ node .map-regression.js
Creating test data
Running test
Done. Duration: 6060ms

λ nvm use 12.3.1
Now using node v12.3.1 (64-bit)

λ node .map-regression.js
Creating test data
Running test
Done. Duration: 125899ms
```

weeee ok, I read through the actual source and found that the fast path now only considers PACKED_SMI_ELEMENTS (arrays of 32-bit integers). Indeed, if I use data.push(Math.floor(Math.random() * 100)); the time drops to around 500ms.

/cc @bmeurer
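
To see the Smi fast path described above for yourself, a rough comparison along these lines should work. This is not part of the original benchmark; the array size is smaller than the 32 million used in the repro above purely to keep memory usage modest, and the exact timings will vary by machine and Node version.

```
// Compare map() over a double-valued array vs. a small-integer (Smi) array.

const N = 8000000; // the original repro used 32 million

const doubles = [];
const smis = [];
for (let i = 0; i < N; i++) {
  doubles.push(Math.random() * 100);          // PACKED_DOUBLE_ELEMENTS
  smis.push(Math.floor(Math.random() * 100)); // PACKED_SMI_ELEMENTS
}

console.time('map over doubles');
doubles.map((x) => x);
console.timeEnd('map over doubles');

console.time('map over smis');
smis.map((x) => x);
console.timeEnd('map over smis');
```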

Can you file a bug at https://crbug.com/v8/new please?

Thanks for this.

@devsnek you're indeed absolutely correct when you say that PACKED arrays can become HOLEY but not the other way 'round.

Basically, once a HOLEY array, always a HOLEY array.

@bmeurer if it's fine by you, I'd love to help out with this.

Short status update: the regression likely comes from a change in V8's GC where we reduced the page size. That moved an odd performance cliff that existed before, in a way that this test case exposes. We are working on this.

Is there an update on this issue? We started adopting Node 12 now that it has become LTS, and one of our services was hit hard by this performance regression.

Would it make sense to start removing map from our code bases, or is there a chance that this is going to be fixed in a future 12.x minor/patch release?

As this issue has been open for a while, I fear it's more the first option.

I would love to offer my help, but my C++ knowledge is rusty at best.
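
As a stopgap for the question above about pulling map out of hot code paths, something like the following loop-based helper could stand in for Array.prototype.map at the affected call sites. The name mapWithLoop is hypothetical, not an existing API, and this is only a sketch of the workaround being considered, not a fix for the regression itself.

```
// Hypothetical drop-in for hot Array#map call sites while staying on Node 12.
function mapWithLoop(arr, fn) {
  // Pre-sizing creates a HOLEY array (see the discussion above); that did not
  // hurt the plain-loop timings in this thread, but pushing into [] would keep
  // the result PACKED if that matters for downstream code.
  const out = new Array(arr.length);
  for (let i = 0; i < arr.length; i++) {
    out[i] = fn(arr[i], i, arr);
  }
  return out;
}

// e.g. replace `data.map((x) => x ** 2)` with `mapWithLoop(data, (x) => x ** 2)`
```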

Has there been any update on this? Maybe I'm misunderstanding, but this seems like a massive regression, especially if it hits all map calls.

> Has there been any update on this? Maybe I'm misunderstanding, but this seems like a massive regression, especially if it hits all map calls.

Testing locally, it seems the bug still exists in Node.js 12.14.1 (which was released within the last 24 hours), but was fixed in Node.js 13.2.0.

13.2.0 updated V8 to 7.9. I believe 12.x is still running V8 7.7.

It's possible that the fix to this in 12.x would be to find the right patch to float on V8 7.7, but it may not apply easily/cleanly. /ping @nodejs/v8-update

Tested with v12.16.0, which ships with V8 7.8, but performance is still not at Node v10 levels.

Using the script from https://github.com/nodejs/node/issues/27930#issuecomment-496353854

$ nvm use 10
Now using node v10.18.1 (npm v6.13.4)
$ node bench.js
Creating test data
Running test
Done. Duration: 4805ms

$ nvm use v12.15.0
Now using node v12.15.0 (npm v6.13.4)
$ node bench.js
Creating test data
Running test
Done. Duration: 73486ms

$ nvm use v12.16.0
Now using node v12.16.0 (npm v6.13.4)
$ node bench.js
Creating test data
Running test
Done. Duration: 73772ms

@targos sorry for the ping, but I couldn't find any meeting notes or other documentation on whether V8 7.9 is planned to be back-ported to v12. https://github.com/nodejs/node/pull/30020 has a backport-requested-v12.x label, but no comments on whether this is still planned.

Could you provide some insights?

There is no plan to backport V8 7.9 to v12.

> There is no plan to backport V8 7.9 to v12.

Is cherry-picking the V8 fix onto the 12.x branch possible/likely?

> There is no plan to backport V8 7.9 to v12.
>
> Is cherry-picking the V8 fix onto the 12.x branch possible/likely?

Sure, but I'm not aware of which commit fixed it.

I'm not familiar with the V8 code base, but the issue you opened references https://chromium-review.googlesource.com/c/1529006 as the culprit. I guess we could go through the history after that change, or revert it, or is this thinking too simple?

Just an FYI, v8 has closed this issue as "won't fix"

https://bugs.chromium.org/p/v8/issues/detail?id=9302#c4

> Just an FYI, v8 has closed this issue as "won't fix"
>
> https://bugs.chromium.org/p/v8/issues/detail?id=9302#c4

If V8 is going to live with this, I think we will have to as well. Should this be closed?

I'm fine either way, but keeping it open might be better for discoverability.

Let's not keep it open; that wouldn't seem to serve much purpose at this point.
