Over at Angular CLI we recently instructed our users to update to Node 12. In https://github.com/angular/angular-cli/issues/13734 we started getting reports that projects that used to build correctly were now hitting the memory limit.
I looked a bit further into these reports in https://github.com/filipesilva/angular-cli-perf-benchmark and found that projects were using more memory in Node 12 than they were using in Node 10. In one case the memory usage went up by 70%.
Reproduction steps:
1. git clone https://github.com/filipesilva/clarity && cd clarity && git checkout d19039902 && npm i
2. npm run populate-build-cache to populate the disk cache
3. npm run benchmark to benchmark the build over 5 builds, save these results
4. switch to the other Node version and run npm run benchmark again to benchmark the build over 5 builds, save these results as well
Results on my Windows dev machine were (Node 12.4.0 first, then Node 10.16.0):
[benchmark] Benchmarking process over 5 iterations, with up to 5 retries.
[benchmark] node --max_old_space_size=2400 ./node_modules/@angular/cli/bin/ng build website --prod (at D:\sandbox\memory-debug\clarity)
[benchmark] Process Stats
[benchmark] Elapsed Time: 252701.40 ms (259868.00, 260823.00, 244032.00, 255378.00, 243406.00)
[benchmark] Average Process usage: 1.00 process(es) (1.00, 1.00, 1.00, 1.00, 1.00)
[benchmark] Peak Process usage: 1.00 process(es) (1.00, 1.00, 1.00, 1.00, 1.00)
[benchmark] Average CPU usage: 13.03 % (13.16, 13.18, 12.87, 13.55, 12.40)
[benchmark] Peak CPU usage: 189.12 % (187.60, 187.60, 203.10, 193.80, 173.50)
[benchmark] Average Memory usage: 1906.98 MB (1923.10, 1953.03, 1886.74, 1900.24, 1871.77)
[benchmark] Peak Memory usage: 2674.04 MB (2659.36, 2691.45, 2613.00, 2706.36, 2700.01)
[benchmark] Benchmarking process over 5 iterations, with up to 5 retries.
[benchmark] node --max_old_space_size=2400 ./node_modules/@angular/cli/bin/ng build website --prod (at D:\sandbox\memory-debug\clarity)
[benchmark] Process Stats
[benchmark] Elapsed Time: 206732.00 ms (192481.00, 215809.00, 205091.00, 209814.00, 210465.00)
[benchmark] Average Process usage: 1.00 process(es) (1.00, 1.00, 1.00, 1.00, 1.00)
[benchmark] Peak Process usage: 1.00 process(es) (1.00, 1.00, 1.00, 1.00, 1.00)
[benchmark] Average CPU usage: 10.12 % (8.58, 10.55, 10.12, 10.67, 10.71)
[benchmark] Peak CPU usage: 168.48 % (125.10, 167.20, 176.60, 204.70, 168.80)
[benchmark] Average Memory usage: 1451.49 MB (1474.81, 1445.85, 1449.72, 1447.81, 1439.26)
[benchmark] Peak Memory usage: 1979.37 MB (2165.37, 1945.54, 1926.42, 1947.23, 1912.31)
Results on a CircleCI Linux machine were (again Node 12.4.0 first, then Node 10.16.0):
[benchmark] Benchmarking process over 5 iterations, with up to 5 retries.
[benchmark] node --max_old_space_size=2400 ./node_modules/@angular/cli/bin/ng build website --prod (at /home/circleci/project/project)
[benchmark] Process Stats
[benchmark] Elapsed Time: 147158.00 ms (201190.00, 139490.00, 135690.00, 133010.00, 126410.00)
[benchmark] Average Process usage: 1.18 process(es) (1.90, 1.01, 1.00, 1.00, 1.00)
[benchmark] Peak Process usage: 8.40 process(es) (36.00, 3.00, 1.00, 1.00, 1.00)
[benchmark] Average CPU usage: 156.38 % (170.09, 151.25, 151.50, 156.67, 152.41)
[benchmark] Peak CPU usage: 954.00 % (2500.00, 740.00, 520.00, 510.00, 500.00)
[benchmark] Average Memory usage: 1981.30 MB (2041.02, 1945.87, 2002.34, 1947.21, 1970.06)
[benchmark] Peak Memory usage: 3450.42 MB (6018.51, 2851.30, 2806.87, 2773.70, 2801.71)
[benchmark] Benchmarking process over 5 iterations, with up to 5 retries.
[benchmark] ng build website --prod (at /home/circleci/project/project)
[benchmark] Process Stats
[benchmark] Elapsed Time: 122160.00 ms (120890.00, 113270.00, 161590.00, 112800.00, 102250.00)
[benchmark] Average Process usage: 1.53 process(es) (3.67, 1.01, 1.00, 1.00, 1.00)
[benchmark] Peak Process usage: 8.40 process(es) (36.00, 3.00, 1.00, 1.00, 1.00)
[benchmark] Average CPU usage: 156.52 % (221.59, 144.47, 132.13, 140.85, 143.58)
[benchmark] Peak CPU usage: 1218.22 % (3700.00, 780.00, 544.44, 533.33, 533.33)
[benchmark] Average Memory usage: 1198.62 MB (1405.18, 1134.82, 1208.82, 1088.15, 1156.14)
[benchmark] Peak Memory usage: 2335.99 MB (4962.54, 1781.51, 1672.10, 1604.10, 1659.72)
Please note that in the Linux results I only used --max_old_space_size=2400 on the Node 12.4.0 benchmark, because Node 10.16.0 didn't need it. Without increasing the max heap size, the Node 12.4.0 build fails by hitting the memory limit (which is how we found https://github.com/nodejs/node/issues/28202).
The numbers that matter most here are Average Memory usage and Peak Memory usage.
On Windows these are around 1900/2670 MB for Node 12.4.0 and 1450/1980 MB for Node 10.16.0. On Linux these are around 1950/2800 MB for Node 12.4.0 and 1150/1650 MB for Node 10.16.0.
This indicates Node 12.4.0 is using a lot more memory in some cases.
The Windows and Linux results for Node 10.16.0 are a bit different. I think the difference is because on the Linux test I didn't increase the default memory limit, which forced more aggressive GC. But it might also be that the machine specs influenced it. I don't feel that changes the conclusion, though.
https://github.com/filipesilva/angular-cli-perf-benchmark contains more examples, and in all of them Node 12.4.0 used more memory. It's true that Node 12 also has a dynamic memory limit, but in this reproduction we can see a difference even when setting the memory limit manually.
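As an aside, a minimal sketch for checking which heap limit a given Node binary actually applies, with or without --max_old_space_size (illustrative only, not part of the benchmark setup; it only uses Node's built-in v8 and process APIs):

import * as v8 from 'v8';

const toMB = (bytes: number): string => (bytes / 1024 / 1024).toFixed(1);

// heap_size_limit reflects the effective heap limit (roughly the old-space
// limit plus young-generation space), including any --max_old_space_size
// override passed on the command line.
const { heap_size_limit, used_heap_size } = v8.getHeapStatistics();
const { rss, heapUsed } = process.memoryUsage();

console.log(`heap limit: ${toMB(heap_size_limit)} MB`);
console.log(`heap used:  ${toMB(used_heap_size)} MB (heapUsed: ${toMB(heapUsed)} MB)`);
console.log(`rss:        ${toMB(rss)} MB`);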
/cc @ulan
I can reproduce the regression. At first glance it seems to be caused by slow progress in marking phase of GC. Investigating.
Actually, this looks like a memory leak in V8.
I bisected to [deps: update V8 to 7.4]: https://github.com/nodejs/node/pull/26685
Bisect within V8 regression range points to [Reland "Ship constant field tracking"]
https://chromium-review.googlesource.com/c/v8/v8/+/1458238
The flag that controls the feature is read-only and can only be set at build time.
I checked that setting track_constant_fields to false in v8/src/flag-definitions.h on the v12.x branch fixes the issue locally. @ofrobots, @hashseed do we want to disable the feature in Node 12?
Unfortunately, the flag no longer exists in upstream V8. So it is not possible to disable the feature on the master branch. I will file an issue for upstream V8 and start investigation with compiler folks.
The plot thickens! Thanks for looking at this @ulan 👍
Upstream issue: https://bugs.chromium.org/p/v8/issues/detail?id=9507
@filipesilva you're welcome and thank you for filing a bug with detailed repro steps.
I investigated, and it's triggered by a bug in Angular that used to get very lucky with an optimization we removed:
In https://github.com/angular/angular/blob/e8ae3c5f2efe01acee894f3ea094544689310066/packages/compiler/src/output/output_ast.ts#L1467
there is the following code:
private _clone(obj: any): any {
  const clone = Object.create(obj.constructor.prototype);
  for (let prop in obj) {
    clone[prop] = obj[prop];
  }
  return clone;
}
The for-in loop also iterates over the prototype chain, copying many functions from the prototypes onto the cloned object as own properties. This doesn't make sense, since the cloned object also has the right prototype, so in a way it has all these functions twice now: on the prototype chain and as own properties.
Putting a check in this loop and only copying own properties will improve performance and should fully recover this regression.
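For illustration, a minimal sketch of that own-property check (not necessarily the exact change that later landed in Angular):

// The duplication only happens when prototype properties are enumerable,
// as with ES5-style Foo.prototype.bar = function () { ... } output;
// for-in skips non-enumerable properties such as ES2015 class methods.
function cloneOwn(obj: any): any {
  const clone = Object.create(obj.constructor.prototype);
  for (const prop in obj) {
    // Copy only the properties that live on the instance itself; inherited
    // (enumerable) ones are already reachable through the prototype chain.
    if (Object.prototype.hasOwnProperty.call(obj, prop)) {
      clone[prop] = obj[prop];
    }
  }
  return clone;
}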
The reason this was better before is that in older versions of V8, we had an optimization that detected function properties that are always the same value and shared them in the hidden class instead of actually putting them on the object. This is, in a way, V8 doing something like prototypes without the user using prototypes. When we enabled constant field tracking, which serves a similar purpose in other cases, we removed this old optimization. As a result, own properties are now always stored on the object itself, so they always consume memory, as one might actually expect.
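To make the cost concrete, a small standalone sketch comparing the heap used by objects that reach a method through their prototype with objects that additionally carry the same function as an own property, similar to what the for-in clone produced (illustrative only; exact numbers vary between V8 versions):

// Run with: node --expose-gc own-vs-proto.js
// (--expose-gc is optional; it just makes the before/after numbers less noisy)
const gc: (() => void) | undefined = (global as any).gc;

function heapUsedMB(): number {
  gc?.();
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

class Point {
  constructor(public x: number) {}
  getX(): number { return this.x; } // lives on Point.prototype, shared by all instances
}

const N = 200_000;
const base = heapUsedMB();

// Objects that reach getX through the prototype.
const shared: Point[] = [];
for (let i = 0; i < N; i++) shared.push(new Point(i));
const afterShared = heapUsedMB();

// Same shape, but each object also stores getX as an own property. The
// function itself is not duplicated, yet every object now pays for an extra
// property slot, which is the cost the removed "constant function"
// hidden-class optimization used to avoid.
const own: Point[] = [];
for (let i = 0; i < N; i++) {
  const p = new Point(i);
  (p as any).getX = Point.prototype.getX;
  own.push(p);
}
const afterOwn = heapUsedMB();

console.log(`prototype method:    ~${(afterShared - base).toFixed(1)} MB for ${shared.length} objects`);
console.log(`own-property method: ~${(afterOwn - afterShared).toFixed(1)} MB for ${own.length} objects`);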
@tebbi thank you so much for pinpointing what was happening here!
Can you tell me how you figured out it was exactly in that clone that things were going awry? I'm trying to identify other places where we use too much memory and would appreciate better techniques for doing so.
I used --trace-gc-object-stats to find out that the memory regression came from JavaScript objects getting bigger and going to dictionary mode. Then I instrumented the V8 code to print a stack trace whenever an object goes to dictionary mode because of too many properties (that's in src/objects/map.cc, Map::TransitionToDataProperty()). These stack traces were pointing to the _clone function.
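For reference, a tiny sketch that forces an object into dictionary mode (illustrative; %HasFastProperties is a V8-internal test helper that requires starting Node with --allow-natives-syntax, and it is wrapped in eval() here only to keep the snippet valid TypeScript):

// Run with: node --allow-natives-syntax dictionary-mode-demo.js
// Adding --trace-gc-object-stats as well prints V8's per-type object
// statistics in the GC trace, as described above.

function makeWideObject(propCount: number): Record<string, number> {
  const obj: Record<string, number> = {};
  for (let i = 0; i < propCount; i++) {
    // Each added named property extends the hidden-class transition chain;
    // past an internal limit V8 gives up and moves the object to dictionary
    // (hash-table) mode, which is the transition that the instrumented
    // Map::TransitionToDataProperty stack traces were pointing at.
    obj['p' + i] = i;
  }
  return obj;
}

const narrow = makeWideObject(10);
const wide = makeWideObject(5000);

console.log('narrow has fast properties:', eval('%HasFastProperties(narrow)'));
console.log('wide has fast properties:  ', eval('%HasFastProperties(wide)'));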
This is fixed in Angular upstream now. Does this resolve the issue for Node.js?
@tebbi I'll re-run benchmarks for the original test cases when Angular releases the fix, which should be today I think.
I tried on Windows again, same steps as in my original comment, but checking out commit 357f05053 instead.
Node 12.4.0
[benchmark] Benchmarking process over 5 iterations, with up to 5 retries.
[benchmark] node --max_old_space_size=2400 ./node_modules/@angular/cli/bin/ng build website --prod (at D:\sandbox\clarity)
[benchmark] Process Stats
[benchmark] Elapsed Time: 236350.00 ms (249539.00, 238937.00, 218797.00, 240053.00, 234424.00)
[benchmark] Average Process usage: 1.00 process(es) (1.00, 1.00, 1.00, 1.00, 1.00)
[benchmark] Peak Process usage: 1.80 process(es) (3.00, 1.00, 1.00, 3.00, 1.00)
[benchmark] Average CPU usage: 16.58 % (17.64, 15.61, 15.06, 17.64, 16.94)
[benchmark] Peak CPU usage: 206.26 % (236.00, 187.50, 203.10, 204.70, 200.00)
[benchmark] Average Memory usage: 1375.30 MB (1363.32, 1418.53, 1369.96, 1356.29, 1368.40)
[benchmark] Peak Memory usage: 2038.24 MB (2145.49, 2219.04, 2113.53, 1869.96, 1843.18)
Node 10.16.0
$ npm run benchmark
> @clr/[email protected] benchmark D:\sandbox\clarity
> benchmark -- node --max_old_space_size=2400 ./node_modules/@angular/cli/bin/ng build website --prod
[benchmark] Benchmarking process over 5 iterations, with up to 5 retries.
[benchmark] node --max_old_space_size=2400 ./node_modules/@angular/cli/bin/ng build website --prod (at D:\sandbox\clarity)
[benchmark] Process Stats
[benchmark] Elapsed Time: 219195.60 ms (184729.00, 199793.00, 222094.00, 261570.00, 227792.00)
[benchmark] Average Process usage: 1.00 process(es) (1.00, 1.00, 1.00, 1.00, 1.00)
[benchmark] Peak Process usage: 1.40 process(es) (1.00, 1.00, 1.00, 3.00, 1.00)
[benchmark] Average CPU usage: 14.49 % (11.25, 13.54, 15.35, 16.62, 15.68)
[benchmark] Peak CPU usage: 174.08 % (137.50, 201.60, 165.60, 135.90, 229.80)
[benchmark] Average Memory usage: 1441.90 MB (1460.18, 1439.92, 1436.15, 1443.31, 1429.94)
[benchmark] Peak Memory usage: 1921.71 MB (1901.18, 1892.90, 1914.77, 1977.76, 1921.93)
So if we compare average/peak memory usage with my original results, Node 12.4.0 now shows roughly the same numbers as Node 10.16.0 after https://github.com/angular/angular/commit/24ca582bc5f133bf43004eb87f446fc558fc8c64 (the change that @tebbi suggested): around 1375/2038 MB for Node 12.4.0 versus 1442/1922 MB for Node 10.16.0, compared with 1907/2674 MB versus 1451/1979 MB before the fix.
Indeed, I think this issue is resolved. The 30% to 70% increase in memory usage was caused by that loop. I wonder if there's a good way to identify these things in the future.