Node: Performance degradation when doing JSON.stringify(obj).replace(/regexp/, func) on Node 6.2.2 compared to 4.4.5

Created on 20 Jun 2016 · 10 comments · Source: nodejs/node

Here is the code example:

console.time('Test 1, working additions and multiplications (1000ms)');
var r = 1;
for (var i = 0; i < 140000000; i++) r = r + 0.5*Math.random();
console.timeEnd('Test 1, working additions and multiplications (1000ms)');

console.time('Test 2, working with arrays and hashsets (1000ms)');
var arr = [], obj = {};
for (i = 0; i < 4500000; i++) {
    var a = [Math.random(), Math.random(), Math.random()];
    var b = {a: a[0], b: a[1], c:a[2]};
    arr.push(b);
    obj[i] = b;
}
console.timeEnd('Test 2, working with arrays and hashsets (1000ms)');

console.time('Test 3, working with JSON parse/stringify (1000ms)');
for (i = 0; i < 270000; i++) {
    arr[i] = JSON.stringify(obj[i]);
    obj[i] = JSON.parse(arr[i]);
}
console.timeEnd('Test 3, working with JSON parse/stringify (1000ms)');

obj = null;

arr.splice(800000);
console.time('Test 4. JSON.stringify for big object (530ms)');
var str = JSON.stringify(arr);
console.timeEnd('Test 4. JSON.stringify for big object (530ms)');

var JSONcompact = require('./JSONcompact');
console.log('Test 5. JSON.stringify + replace regexp for big object, repeating 50 times...');
var result = [];
for (i = 0; i < 50; i++) {
    var start = +new Date();
    str = JSONcompact.stringify(arr);
    result.push((+new Date() - start) + 'ms');
}
console.log('Test 5 results:', result.join(', '));

JSONcompact.js:

'use strict';
function replaceAllQuotes(x) { return x.replace(/"/ig, ''); }
var JSONcompact = {
    stringify: function(obj) {
        return JSON.stringify(obj).replace(/"[A-Za-z0-9_-]{1,20}":/g, replaceAllQuotes);
    }
};
if (typeof(module) !== 'undefined') module.exports = JSONcompact;
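For reference, here is a minimal standalone sketch of what this compaction produces (the `compactStringify` name is invented for this example; the logic mirrors the module above):

```javascript
'use strict';
// Same compaction as JSONcompact.js: strip the quotes from short,
// identifier-like object keys in the stringified output.
function replaceAllQuotes(x) { return x.replace(/"/g, ''); }
function compactStringify(obj) {
  return JSON.stringify(obj).replace(/"[A-Za-z0-9_-]{1,20}":/g, replaceAllQuotes);
}

console.log(compactStringify({ a: 1.5, b: 2, c: 3 }));
// → {a:1.5,b:2,c:3}
```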

Results on Node 6.2.x:
Test 5 results: 1672ms, 1676ms, 1721ms, 1530ms, 1589ms, 1656ms, 1465ms, 1492ms, 1688ms, 1622ms, 1568ms, 1454ms, 1529ms, 1577ms, 1750ms, 1600ms, 1498ms, 1709ms, 1580ms, 1504ms, 1484ms, 1541ms, 1574ms, 1521ms, 1507ms, 1660ms, 1592ms, 1640ms, 1676ms, 1742ms, 1668ms, 1653ms, 1628ms, 1712ms, 1715ms, 2076ms, 1980ms, 2624ms, 2588ms, 7734ms, 9265ms, 9092ms, 9489ms, 9982ms, 10044ms, 10148ms, 10164ms, 10413ms, 10061ms, 9990ms

Results on Node 4.4.5:
Test 5 results: 1617ms, 1879ms, 1796ms, 1691ms, 1758ms, 1633ms, 1760ms, 1734ms, 1651ms, 1703ms, 1612ms, 1751ms, 1754ms, 1648ms, 1715ms, 1734ms, 1881ms, 1939ms, 1709ms, 1799ms, 1634ms, 1764ms, 1741ms, 1689ms, 1826ms, 1692ms, 1975ms, 1839ms, 1814ms, 1982ms, 1762ms, 1986ms, 1973ms, 1853ms, 1800ms, 1800ms, 2058ms, 1742ms, 1689ms, 1838ms, 1764ms, 1780ms, 1727ms, 1608ms, 1702ms, 1610ms, 1752ms, 1752ms, 1618ms, 1708ms

Hardware: MacBook Pro 2013, Intel Core i7, 3 GHz, 256KB L2 cache, 4MB L3 cache, SSD, OS X 10.11.5

V8 Engine performance

Most helpful comment

--trace_gc clearly shows that v4 scavenges much more aggressively. Throughput in v6 seems to be higher in general but from time to time it gets overwhelmed by GC interrupt checks in ParseJsonValue() that result in many expensive mark-sweep cycles. The numbers more or less match v4 when I crudely disable it:

diff --git a/deps/v8/src/heap/heap.cc b/deps/v8/src/heap/heap.cc
index f5110f8..fa8091a 100644
--- a/deps/v8/src/heap/heap.cc
+++ b/deps/v8/src/heap/heap.cc
@@ -796,8 +796,6 @@ class GCCallbacksScope {
 void Heap::HandleGCRequest() {
   if (incremental_marking()->request_type() ==
       IncrementalMarking::COMPLETE_MARKING) {
-    CollectAllGarbage(current_gc_flags_, "GC interrupt",
-                      current_gc_callback_flags_);
   } else if (incremental_marking()->IsMarking() &&
              !incremental_marking()->finalize_marking_completed()) {
     FinalizeIncrementalMarking("GC interrupt: finalize incremental marking");

I suspect an issue with V8's GC heuristics (when to do what kind of collection.) It would be best to report it over at https://bugs.chromium.org/p/v8/issues/list.

All 10 comments

/cc @nodejs/v8


The JSON stringifier used not to contain any interrupt checks, but that caused uninterruptible freezes for large objects. The trade-off, of course, is that GC also gets a chance to run.
@jeisinger is this actionable?

Thanks for filing the bug. I cc'd some team members on it for their feedback.

This is not due to JSON.stringify itself, but perhaps due to replace() with a regexp and a function callback.

When I change this line:
return JSON.stringify(obj).replace(/"[A-Za-z0-9_-]{1,20}":/g, replaceAllQuotes);

to this:
return JSON.stringify(obj).replace(/"(\w+)"\s*:/g, '$1:');

The performance degradation disappears.
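A hypothetical micro-benchmark (variable names invented for this sketch) that runs both replace() strategies on the same input may make the comparison easier to reproduce:

```javascript
'use strict';
// Compare replace() with a function callback against replace() with a
// '$1' string pattern, on a stringified array of small objects.
const input = JSON.stringify(
  Array.from({ length: 100000 }, (_, i) => ({ key: i }))
);

function stripQuotes(x) { return x.replace(/"/g, ''); }

let start = Date.now();
const viaCallback = input.replace(/"[A-Za-z0-9_-]{1,20}":/g, stripQuotes);
console.log('function callback:', Date.now() - start, 'ms');

start = Date.now();
const viaPattern = input.replace(/"(\w+)"\s*:/g, '$1:');
console.log('string pattern:  ', Date.now() - start, 'ms');

// Both strategies produce the same compacted output for this input.
console.log('outputs equal:', viaCallback === viaPattern);
```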

@oebspm Interestingly, right when you posted your comment I was checking whether the upgrade to V8 5.1 resolves this issue, and it does: results are all within 1750-2250 ms on my machine.

The plan is to land 5.1 in the v6 branch before it goes LTS so this should be resolved within the next two or three months.

This has nothing to do with JSON.stringify.

Testcase:

var strs = Array(1600000).fill('"a"0').join('');
var str, start;

function identity(x) { return x; }

for (var i = 0; i < 55; i++) {
  start = +new Date();
  str = strs.replace(/"[ab]"/g, identity);
  console.log(i, (+new Date() - start) + 'ms');
}

My application works with big data volumes in multiple threads. Even after I changed the code to avoid replace with a regexp and a function callback, the performance degradation still appears on Node 6.2.2, while on 4.4.7 it works fine.

I confirm this is fixed with V8 5.1 in v6.5.0.
