Sdk: dart2js uses too much memory

Created on 23 Nov 2016  ·  108Comments  ·  Source: dart-lang/sdk

I am no longer able to build the SDK on my DragonBoard (a 1GB ARM64 board) because dart2js runs out of memory while compiling the Observatory.

area-web type-bug web-dart2js

All 108 comments

We are doing some long term changes that can help with this, but you wont see the benefits for a while.

Is observatory built using pub build or calling dart2js directly? If you use pub-build, do you actually need to use transformers? I recall you used to depend on polymer transformers, but if you don't anymore consider moving out of using pub-build. If you must use pub-build, you might benefit from splitting the build in 2 steps:

  • disable dart2js in your pub build by adding a $dart2js transformer with no inputs:
transformers:
...
- $dart2js:
    $include: []
  • run pub build --mode=debug to generate dart files in the output build folder
  • run dart2js on the output entry point file, e.g. build/web/main.dart

The reason I suggest this is that we have noticed on some customer apps that pub-build increased the overall memory pressure by a lot (sometimes 50% more on some large complex apps). This was more so on apps that had lots of transformers though, so YMMV.

Observatory is built directly with dart2js now, not with pub build which should help some

I think there were also changes that improved memory usage in dart2js. I can't quite recall if it relied on flags like --omit-implicit-checks to get the memory improvements.

Is this still a problem? Can you give that flag a try? It was also recommended here: https://github.com/dart-lang/sdk/issues/32503#issuecomment-405731621

I see the dart2js invocation has a peak RSS of 900MB. --omit-implicit-checks does not improve this. In any case, compiling with --omit-implicit-checks means we are not running Dart. dart2js should be able to compile a modest app like Observatory on a 1GB system without cheating on semantics.

(We can also no longer build on ARM/ARM64 hosts because of build tool issues that probably won't addressed until Fuchsia starts self-hosting.)

I do not expect the various optimization flags to make much difference to the memory requirements of dart2js.

The memory footprint of dart2js has recently bit us. We've started running unit tests for some of our repositories on dart2js. When we made this change we had to increase the instance size of our runners in CI because of the increased memory footprint of dart2js vs. ddc.

Here is a graph of the memory consumption when running unit tests on one of our packages using DDC. The point at which the memory drops off is when the build ends and execution of the unit tests begins.
image

And now, here is a similar chart when running those same unit tests on dart2js.
image

Because of this increased memory footprint we had to change our runners from 16G of RAM to 32G. To be clear, the examples above are from one of our internal projects, not from Observatory.

CC @vsmenon

We're continuing to have increased build memory issues. For a couple of our applications we're in the neighborhood of 25G:
Screen Shot 2019-10-03 at 10 57 34 AM

At this point we have pretty limited recourse:

1) Some changes to dart2js somehow reduce the memory load. We don't have any expertise in this area to offer, so we're 100% reliant on the Google Dart team to offer help here if there are any ideas.

2) We keep throwing more hardware at the builds. This is reasonably easy to do in the short term, but is unpalatable long term because it increases costs and limits our CI system options (we'd like to consider swapping our in house CI system for an OTS solution, but many OTS options have 16G memory limitations).

3) We make architectural changes in our large applications that enable us to break up the monolithic dart2js app build. I think we need to figure out how to do this for other dev efficiency and app extensibility reasons, but it's not really a model that Dart handles well at this point (https://github.com/dart-lang/sdk/issues/10530).

I'd love to hear any advice you've gleaned from other LARGE Dart apps.

@todbachman-wf @trentgrover-wf - thanks for following up and sharing your details. I have a few questions for you:

  • can you provide the output of how time is spent in dart2js (passing --verbose to dart2js should print a table with how time was spent) - it currently doesn't show how memory is used, but it may hint is into what phases are taking a long time and may give clues if something is really off.
  • Is this the usage by a single invocation of dart2js? Or are the 25G caused by running multiple processes at once? Our very large internal apps don't use that much RAM, so I'm curious if the configuration is different or the size of the apps is different.

  • If it's OK for you to share, how large is your Dart app at this time? (an useful data point for us is to share what number does dart2js print when it says how many bytes of sources it read)

    • what flags are currently being used with dart2js? Most flags have no effect, but I'm curious if you are using anything that could be contributing to more memory use. For instance, by any chance are you using --dump-info? It would also be worth looking at the cost of features you don't need for your unit tests. For example, do you need source-maps on those CI bots? If not, would --no-source-maps help?

We have recently added a mode in dart2js to split the build in 3 steps: front-end, global-analysis, and backend, but I'm not sure how much it will help. The steps run serially and do the same amount of work, the only difference is that data is not kept in memory, it is dumped and re-read from disk. Technically we save all the data we use and we would have GC'd everything else, but if we were leaking anything, this split could potentially get rid of that.

The backend step can be sharded into many subprocesses, these shards are then merged by a single step that may again use as much memory as before.

The flags for splitting the build are not well documented because they are mostly used behind the scenes in build systems like bazel. Unfortunately they are not used in webdev at this time, so there is not an easy way to enable them without some work on our side. If you believe this could help on your end, we can try to figure out how to get it out to you.

@jakemac53 - what are your thoughts on how hard it would be to integrate the split dart2js steps in webdev?

For an idea of scale: Roughly 4 million lines of Dart code, about 240 packages.

It should be feasible to do the split steps for dart2js in webdev (build_web_compilers really), but I don't know that it really buys anything externally? Probably not more than two days of work.

I think the main reasoning for that feature though is dealing with our timeout restrictions internally which don't really apply externally? We also won't get the parallelism benefits that we get with bazel, so it seems like it would not help much.

  • Is this the usage by a single invocation of dart2js? Or are the 25G caused by running multiple processes at once?

Actually, this could easily be part of the problem. By default we use this formula to determine the parallelism: min((Platform.numberOfProcessors / 2).ceil(), 4).

This is overridable using the BUILD_MAX_WORKERS_PER_TASK environment variable. You could try setting that to 1 and see if it helps.

For an idea of scale: Roughly 4 million lines of Dart code, about 240 packages.

Great to know - we have some internal apps of ~5 million LOC and we build them under 12Gb of RAM. I'm more and more thinking that parallel builds might be what's causing the biggest issue here. Let us know if it helps limiting the number of workers as Jake suggested!

@sigmundch Below you'll find the compilation log for our main application.

pub global run webdev build --output web:build/web -- --delete-conflicting-outputs
[INFO] Building new asset graph completed, took 13.0s
[INFO] Checking for unexpected pre-existing outputs. completed, took 2ms
[INFO]build_web_compilers:entrypoint on web/unified.dart: Running dart2js with -O3 --csp --verbose --packages=.package-c8d7b36e9d30a695d64c7220770b4729 -oweb/unified.dart.js web/unified.dart
[INFO]build_web_compilers:entrypoint on web/modal_home.dart: Running dart2js with -O3 --csp --verbose --packages=.package-f781111427abdec6cb53f112cf302ab5 -oweb/modal_home.dart.js web/modal_home.dart
[INFO]build_web_compilers:entrypoint on web/main.dart: Running dart2js with -O3 --csp --verbose --packages=.package-2db771eee24daf207a15a3c4f223a656 -oweb/main.dart.js web/main.dart
[WARNING] No actions completed for 15.1s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
- build_web_compilers:entrypoint on web/modal_home.dart
- build_web_compilers:entrypoint on web/unified.dart
[WARNING] No actions completed for 15.1s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
- build_web_compilers:entrypoint on web/modal_home.dart
- build_web_compilers:entrypoint on web/unified.dart
[INFO]build_web_compilers:entrypoint on web/unified.dart: Dart2Js finished with:
Info: Compiling file:///tmp/scratch_spaceQINVQV/web/unified.dart (2.4.1)
Info: Kernel load complete
Info: Resolved 21557 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(getProperty))
static uses:
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=4, types=1))
const-literals:
"Object|Null"
"#[#]"
native-data:
NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1)).
Info: Resolved 812 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Added 21034 elements in inferencing graph.
Info: Inferred 398193 types.
Info: Compiling methods
Info: Compiled 14323 methods.
Info: Compiled 812 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
kernel loader:                                   17.050s (39.8%)
Front end:                                        1.698s  (4.0%)
Front end > closures:                             0.609s  (1.4%)
Front end > worldImpact:                          0.867s  (2.0%)
Type inference:                                   4.465s (10.4%)
Deferred Loading:                                 0.893s  (2.1%)
Deferred Loading > prepare:                       0.022s  (0.1%)
Deferred Loading > find-imports:                  0.153s  (0.4%)
Enqueue:                                          4.709s (11.0%)
Enqueue > resolution.staticUse:                   0.091s  (0.2%)
Enqueue > resolution.typeUse:                     0.857s  (2.0%)
Enqueue > resolution.dynamicUse:                  3.339s  (7.8%)
Enqueue > codegen.typeUse:                        0.177s  (0.4%)
Enqueue > codegen.staticUse:                      0.118s  (0.3%)
Enqueue > codegen.dynamicUse:                     0.065s  (0.2%)
Enqueue > codegen.constantUse:                    0.059s  (0.1%)
self:                                             3.187s  (7.4%)
*self > impl.run:                                  0.004s  (0.0%)
self > run:                                       0.753s  (1.8%)
self > compileFromKernel:                         0.007s  (0.0%)
self > computeClosedWorld:                        1.360s  (3.2%)
self > processQueue:                              0.042s  (0.1%)
self > emptyQueue:                                0.076s  (0.2%)
self > applyImpact:                               0.297s  (0.7%)
self > work.run:                                  0.644s  (1.5%)
SSA builder:                                      2.496s  (5.8%)
SSA optimizer:                                    4.412s (10.3%)
SSA optimizer > SsaInstructionSimplifier:         0.629s  (1.5%)
SSA optimizer > SsaTypeconversionInserter:        0.072s  (0.2%)
SSA optimizer > SsaRedundantPhiEliminator:        0.021s  (0.0%)
SSA optimizer > SsaDeadPhiEliminator:             0.029s  (0.1%)
SSA optimizer > SsaTypePropagator:                0.441s  (1.0%)
SSA optimizer > SsaCheckInserter:                 0.065s  (0.2%)
SSA optimizer > SsaDeadCodeEliminator:            0.138s  (0.3%)
SSA optimizer > SsaGlobalValueNumberer:           0.103s  (0.2%)
SSA optimizer > SsaCodeMotion:                    0.044s  (0.1%)
SSA optimizer > SsaLoadElimination:               0.205s  (0.5%)
SSA optimizer > SSA value range builder:          0.135s  (0.3%)
SSA optimizer > SsaSimplifyInterceptors:          2.245s  (5.2%)
SSA code generator:                               1.579s  (3.7%)
SSA code generator > SsaInstructionSelection:     0.059s  (0.1%)
SSA code generator > SsaTypeKnownRemover:         0.027s  (0.1%)
SSA code generator > SsaTrustedCheckRemover:      0.006s  (0.0%)
SSA code generator > SsaAssignmentChaining:       0.029s  (0.1%)
SSA code generator > SsaInstructionMerger:        0.095s  (0.2%)
SSA code generator > SsaConditionMerger:          0.020s  (0.0%)
SSA code generator > SsaShareRegionConstants:     0.025s  (0.1%)
SSA code generator > SsaLiveIntervalBuilder:      0.204s  (0.5%)
SSA code generator > SsaVariableAllocator:        0.333s  (0.8%)
Code emitter:                                     2.183s  (5.1%)
Code emitter > finalize rti:                      0.064s  (0.1%)
Code emitter > build program:                     0.345s  (0.8%)
Code emitter > emit program:                      0.208s  (0.5%)
Code emitter > write fragments:                   1.156s  (2.7%)
Code emitter > source-maps:                       0.339s  (0.8%)
Code emitter > emit buffers:                      0.069s  (0.2%)
Diagnostic handler:                               0.000s  (0.0%)
Total compile-time 42.889s; setup 0.000s; async 0.018s; unaccounted 0.193s (0.45%)
Compiled 48,074,370 characters Dart to 1,895,627 characters JavaScript in 42.9 seconds
Info: 883,913 characters JavaScript in web/unified.dart.js
Dart file web/unified.dart compiled to JavaScript: web/unified.dart.js
Emitted file 4 JavaScript files.
[WARNING] No actions completed for 15.0s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
- build_web_compilers:entrypoint on web/modal_home.dart
[WARNING] No actions completed for 15.1s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
- build_web_compilers:entrypoint on web/modal_home.dart
[WARNING] No actions completed for 15.1s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
- build_web_compilers:entrypoint on web/modal_home.dart
[INFO]build_web_compilers:entrypoint on web/modal_home.dart: Dart2Js finished with:
Info: Compiling file:///tmp/scratch_spaceQINVQV/web/modal_home.dart (2.4.1)
Info: Kernel load complete
Info: Resolved 62949 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(getProperty))
static uses:
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=4, types=1))
const-literals:
"Object|Null"
"#[#]"
native-data:
NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1)).
Info: Resolved 852 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Added 61991 elements in inferencing graph.
Info: Inferred 1013912 types.
Info: Compiling methods
Info: Compiled 29475 methods.
Info: Compiled 852 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
kernel loader:                                   26.314s (29.2%)
Front end:                                        3.512s  (3.9%)
Front end > closures:                             0.987s  (1.1%)
Front end > worldImpact:                          1.929s  (2.1%)
Type inference:                                  20.554s (22.8%)
Deferred Loading:                                 1.616s  (1.8%)
Deferred Loading > prepare:                       0.017s  (0.0%)
Deferred Loading > find-imports:                  0.315s  (0.3%)
Enqueue:                                          7.064s  (7.8%)
Enqueue > resolution.staticUse:                   0.192s  (0.2%)
Enqueue > resolution.typeUse:                     2.100s  (2.3%)
Enqueue > resolution.dynamicUse:                  3.966s  (4.4%)
Enqueue > codegen.typeUse:                        0.309s  (0.3%)
Enqueue > codegen.staticUse:                      0.245s  (0.3%)
Enqueue > codegen.constantUse:                    0.125s  (0.1%)
Enqueue > codegen.dynamicUse:                     0.123s  (0.1%)
self:                                             6.055s  (6.7%)
*self > impl.run:                                  0.006s  (0.0%)
self > run:                                       1.096s  (1.2%)
self > compileFromKernel:                         0.011s  (0.0%)
self > computeClosedWorld:                        2.612s  (2.9%)
self > processQueue:                              0.058s  (0.1%)
self > emptyQueue:                                0.172s  (0.2%)
self > applyImpact:                               0.795s  (0.9%)
self > work.run:                                  1.301s  (1.4%)
SSA builder:                                      5.937s  (6.6%)
SSA optimizer:                                   10.762s (11.9%)
SSA optimizer > SsaInstructionSimplifier:         1.377s  (1.5%)
SSA optimizer > SsaTypeconversionInserter:        0.130s  (0.1%)
SSA optimizer > SsaRedundantPhiEliminator:        0.039s  (0.0%)
SSA optimizer > SsaDeadPhiEliminator:             0.056s  (0.1%)
SSA optimizer > SsaTypePropagator:                1.070s  (1.2%)
SSA optimizer > SsaCheckInserter:                 0.118s  (0.1%)
SSA optimizer > SsaDeadCodeEliminator:            0.251s  (0.3%)
SSA optimizer > SsaGlobalValueNumberer:           0.213s  (0.2%)
SSA optimizer > SsaCodeMotion:                    0.067s  (0.1%)
SSA optimizer > SsaLoadElimination:               0.481s  (0.5%)
SSA optimizer > SSA value range builder:          0.231s  (0.3%)
SSA optimizer > SsaSimplifyInterceptors:          6.219s  (6.9%)
SSA code generator:                               2.986s  (3.3%)
SSA code generator > SsaInstructionSelection:     0.119s  (0.1%)
SSA code generator > SsaTypeKnownRemover:         0.053s  (0.1%)
SSA code generator > SsaTrustedCheckRemover:      0.011s  (0.0%)
SSA code generator > SsaAssignmentChaining:       0.055s  (0.1%)
SSA code generator > SsaInstructionMerger:        0.181s  (0.2%)
SSA code generator > SsaConditionMerger:          0.038s  (0.0%)
SSA code generator > SsaShareRegionConstants:     0.053s  (0.1%)
SSA code generator > SsaLiveIntervalBuilder:      0.356s  (0.4%)
SSA code generator > SsaVariableAllocator:        0.657s  (0.7%)
Code emitter:                                     4.847s  (5.4%)
Code emitter > finalize rti:                      0.171s  (0.2%)
Code emitter > build program:                     0.704s  (0.8%)
Code emitter > emit program:                      0.467s  (0.5%)
Code emitter > write fragments:                   2.656s  (2.9%)
Code emitter > source-maps:                       0.700s  (0.8%)
Code emitter > emit buffers:                      0.145s  (0.2%)
Diagnostic handler:                               0.000s  (0.0%)
Total compile-time 90.112s; setup 0.000s; async 0.029s; unaccounted 0.429s (0.48%)
Compiled 75,567,544 characters Dart to 3,898,517 characters JavaScript in 90.2 seconds
Info: 3,898,517 characters JavaScript in web/modal_home.dart.js
Dart file web/modal_home.dart compiled to JavaScript: web/modal_home.dart.js
Emitted file 1 JavaScript files.
[WARNING] No actions completed for 15.0s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
[WARNING] No actions completed for 15.1s, waiting on:
- build_web_compilers:entrypoint on web/main.dart

... snipped ...

[WARNING] No actions completed for 15.1s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
[INFO]build_web_compilers:entrypoint on web/main.dart: Dart2Js finished with:
Info: Compiling file:///tmp/scratch_spaceQINVQV/web/main.dart (2.4.1)
Info: Kernel load complete
Info: Resolved 353764 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(callMethod))
static uses:
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=6, types=1))
type uses:
TypeUse(String,TypeUseKind.PARAMETER_CHECK)
TypeUse(List<dynamic>,TypeUseKind.PARAMETER_CHECK)
const-literals:
"Object|Null"
"#[#].apply(#, #)"
native-data:
NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes anything), throws: may).
Info: Resolved 959 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Added 348316 elements in inferencing graph.
Info: Inferred 7457920 types.
Info: Compiling methods
Info: Compiled 261047 methods.
Info: Compiled 959 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
kernel loader:                                   48.227s  (3.5%)
Front end:                                       16.635s  (1.2%)
Front end > closures:                             3.546s  (0.3%)
Front end > worldImpact:                         10.483s  (0.8%)
Type inference:                                 809.963s (59.0%)
Deferred Loading:                                14.607s  (1.1%)
Deferred Loading > prepare:                       0.032s  (0.0%)
Deferred Loading > find-imports:                  3.196s  (0.2%)
Enqueue:                                         50.264s  (3.7%)
Enqueue > resolution.staticUse:                   1.042s  (0.1%)
Enqueue > resolution.typeUse:                    23.682s  (1.7%)
Enqueue > resolution.dynamicUse:                 15.228s  (1.1%)
Enqueue > codegen.typeUse:                        3.204s  (0.2%)
Enqueue > codegen.staticUse:                      4.571s  (0.3%)
Enqueue > codegen.dynamicUse:                     1.397s  (0.1%)
Enqueue > codegen.constantUse:                    1.136s  (0.1%)
self:                                            38.161s  (2.8%)
*self > impl.run:                                  0.006s  (0.0%)
self > run:                                       1.919s  (0.1%)
self > compileFromKernel:                         0.160s  (0.0%)
self > computeClosedWorld:                       17.912s  (1.3%)
self > processQueue:                              0.122s  (0.0%)
self > emptyQueue:                                1.181s  (0.1%)
self > applyImpact:                               5.702s  (0.4%)
self > work.run:                                 11.156s  (0.8%)
SSA builder:                                    105.507s  (7.7%)
SSA optimizer:                                  195.665s (14.3%)
SSA optimizer > SsaInstructionSimplifier:        21.150s  (1.5%)
SSA optimizer > SsaTypeconversionInserter:        1.208s  (0.1%)
SSA optimizer > SsaRedundantPhiEliminator:        0.355s  (0.0%)
SSA optimizer > SsaDeadPhiEliminator:             0.646s  (0.0%)
SSA optimizer > SsaTypePropagator:               22.848s  (1.7%)
SSA optimizer > SsaCheckInserter:                 1.080s  (0.1%)
SSA optimizer > SsaDeadCodeEliminator:            2.491s  (0.2%)
SSA optimizer > SsaGlobalValueNumberer:           2.209s  (0.2%)
SSA optimizer > SsaCodeMotion:                    0.646s  (0.0%)
SSA optimizer > SsaLoadElimination:               5.562s  (0.4%)
SSA optimizer > SSA value range builder:          1.976s  (0.1%)
SSA optimizer > SsaSimplifyInterceptors:        129.364s  (9.4%)
SSA code generator:                              26.896s  (2.0%)
SSA code generator > SsaInstructionSelection:     0.973s  (0.1%)
SSA code generator > SsaTypeKnownRemover:         0.465s  (0.0%)
SSA code generator > SsaTrustedCheckRemover:      0.109s  (0.0%)
SSA code generator > SsaAssignmentChaining:       0.472s  (0.0%)
SSA code generator > SsaInstructionMerger:        1.775s  (0.1%)
SSA code generator > SsaConditionMerger:          0.279s  (0.0%)
SSA code generator > SsaShareRegionConstants:     0.469s  (0.0%)
SSA code generator > SsaLiveIntervalBuilder:      3.666s  (0.3%)
SSA code generator > SsaVariableAllocator:        6.119s  (0.4%)
Code emitter:                                    63.477s  (4.6%)
Code emitter > finalize rti:                      7.248s  (0.5%)
Code emitter > build program:                    22.425s  (1.6%)
Code emitter > emit program:                      2.978s  (0.2%)
Code emitter > write fragments:                  22.764s  (1.7%)
Code emitter > source-maps:                       6.407s  (0.5%)
Code emitter > emit buffers:                      1.649s  (0.1%)
Diagnostic handler:                               0.005s  (0.0%)
Total compile-time 1372.998s; setup 0.000s; async 0.051s; unaccounted 3.532s (0.26%)
Compiled 135,107,008 characters Dart to 34,420,510 characters JavaScript in 1373 seconds
Info: 11,630,196 characters JavaScript in web/main.dart.js
Dart file web/main.dart compiled to JavaScript: web/main.dart.js
Emitted file 399 JavaScript files.
[WARNING] No actions completed for 15.0s, waiting on:
- Instance of 'Dart2JsArchiveExtractor' on wdesk|web/main.dart.js.tar.gz
[WARNING] No actions completed for 15.1s, waiting on:
- Instance of 'Dart2JsArchiveExtractor' on wdesk|web/main.dart.js.tar.gz
[INFO] Running build completed, took 24m 30s
[INFO] Caching finalized dependency graph completed, took 2.5s
[INFO] Creating merged output dir `build/web` completed, took 28.1s
[INFO] Writing asset manifest completed, took 41ms
[INFO] Succeeded after 25m 1s with 19701 outputs (66017 actions)

The following charts are for this actual compilation in CI. I modified the CI run so that all it does is a pub get followed by webdev build, so the majority of this is just the compilation.

image

image

The only dart2js flags we've been using are -O3 and --csp. Of course, for the data above I added the --verbose flag too.

From the output it looks like webdev might be spawning 3 dart2js processes, one for each entrypoint. Am I interpreting that correctly?

Thank you this is very valuable information.

it looks like webdev might be spawning 3 dart2js processes, one for each entrypoint. Am I interpreting that correctly?

Correct. Is this the regular CI job that you are concerned about? or are you also running tests with pub run build_runner test --release which can also spawn dart2js processes?

From the data above, the main.dart entrypoint shows a couple important areas we need to work on on our end:

  • our global inference is huge and is likely causing the memory issues. We've known for a long time that this phase was expensive and have done a bunch of improvements this last year (mostly by taking advantage of Dart 2.0 semantics and trying to reduce the work we do). There is clearly more to be done and unfortunately it is not something we can solve quickly. Unfortunately, the data here is telling me that we will unlikely gain from splitting the compile in pieces like I suggested in an earlier post, since the most expensive part is on the earlier phase.
  • the time spent in SsaSimplifyInterceptors seems extremely high too. There is a piece of logic there that could potentially be quadratic or possibly cubic depending on the structure of the code. We'll take a look to fix that. It is possible that you have some functions that trigger our pathological running time. If so, breaking apart those functions could be a way to cut that piece of compile-time. If you are interested and it is easy for you to compile this code with a custom dart2js, we could create a patch that helps narrow down where that time is being spent.

We do have some larger packages on which we run tests in CI using pub run build_runner test --release. My original comment on this issue from August shows a memory usage chart for one of those packages.

We'd be more than happy to test out a custom dart2js to help get to the root of the problem!

^ As some added context, in packages that we do run tests in CI, they are not run in parallel to the build step. We also tried another run where we only ran pub get and the build, and we split the build up into 3 parts (one for each entry point in web/) so that only one dart2js process was running at any given time. That did decrease the amount of swap used during the build, but peak memory remained pegged at ~16GB and cumulative compilation time increased (unsurprisingly) due to the lack of parallelization.

We do have some larger packages on which we run tests in CI using pub run build_runner test --release. My original comment on this issue from August shows a memory usage chart for one of those packages.

OK - that would explain why you are seeing ~20G earlier. Let us know if the tip from @jakemac53 to use export BUILD_MAX_WORKERS_PER_TASK=1 helps in that context when running tests. It will simply remove the parallelization and hopefully reduce the amount of swap.

We'd be more than happy to test out a custom dart2js to help get to the root of the problem!

Awesome, thanks!

I can confirm that using a reduced number of workers does help reduce the memory footprint. But it's not a silver bullet as it seems to increase the compilation time pretty dramatically on the package I tested it on.

Here's what I'm seeing in CI as I vary BUILD_MAX_WORKERS_PER_TASK for the package that we're running unit tests using dart2js on.

Worker Count | Peak Memory (GB) | Compile & Test Execution Time (min)
-- | -- | --
4 | 20 | 21
3 | 15 | 24
2 | 11 | 33
1 | 7 | 58

It sounds like separating out the building from the testing would be beneficial, that is what we do internally.

You could run 4 separate jobs that each build one of the apps, and then a single job that runs the test using the built outputs from the other jobs. That would give you the fastest turnaround in terms of your CI and will ultimately scale the best as it will limit the amount of ram required for any individual job.

Yeah we have already started moving tests to separate jobs, and we can split up the build steps in the case where we have multiple entry points, but it looks like we would still be interested in experimenting with a custom dart2js for the largest dart2js build step since it sustains a peak memory usage of 16-20GB even when run completely by itself.

We have similar problems. If @evanweible-wf @todbachman-wf want to call, we can talk about some of our solutions to speed up.
Maybe @sigmundch will help us with dart2js(Total compile-time 461s)?

[INFO] GrinderTasksBuilder: dart2js --verbose --terse --enable-diagnostic-colors --minify --trust-primitives --omit-implicit-checks --show-package-warnings --new-deferred-split -O3 --out=___.dart.js --packages=___.dart
Info: Compiling file:///tmp/OVLTHP/___.dart (2.4.0)
Info: Kernel load complete
Info: Resolved 246037 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(callConstructor))
dynamic uses:
Selector(operator, ==, arity=1),StrongModeConstraint(k:class(List),ClassRelation.subtype),[]
Selector(call, addAll, arity=1),StrongModeConstraint(k:class(List),ClassRelation.subtype),[]
static uses:
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=3, types=1))
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[bool],CallStructure(arity=3, types=1))
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[int],CallStructure(arity=3, types=1))
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=4, types=1))
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=5, types=1))
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=6, types=1))
StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=7, types=1))
type uses:
TypeUse(Function,TypeUseKind.PARAMETER_CHECK)
TypeUse(List,TypeUseKind.PARAMETER_CHECK)
list-literals:
ListLiteralUse(List,isConstant:false,isEmpty:false)
const-literals:
null
"Object"
"new #()"
"bool"
"# instanceof Array"
"int"
"#.length"
0
1
""
"#[0]"
"new #(#)"
2
"#[1]"
"new #(#, #)"
3
"#[2]"
"new #(#, #, #)"
4
"#[3]"
"new #(#, #, #, #)"
"#.bind.apply(#, #)"
"String"
"String(#)"
native-data:
NativeBehavior(returns: [Object], creates: [Object], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [bool], creates: [bool], sideEffects: SideEffects(reads static; writes nothing), throws: may)
NativeBehavior(returns: [int], creates: [int], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object], creates: [Object], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object], creates: [Object], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object], creates: [Object], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object], creates: [Object], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1))
NativeBehavior(returns: [Object], creates: [Object], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [Object, Null], creates: [], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [String], creates: [String], sideEffects: SideEffects(reads anything; writes anything), throws: may)
NativeBehavior(returns: [Object], creates: [Object], sideEffects: SideEffects(reads anything; writes anything), throws: may).
Info: Resolved 1062 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Added 240586 elements in inferencing graph.
Info: Inferred 4930897 types.
Info: Compiling methods
Info: Compiled 227811 methods.
Info: Compiled 1062 native classes, 0 native classes omitted.
Info: Emitting JavaScript
packages/*/src/common/model/folder.dart:288:11:
Hint: Overriding 'noSuchMethod' causes the compiler to generate more code and prevents the compiler from doing some optimizations.
dynamic noSuchMethod(Invocation invocation) => null;
^
Info: Timings:
kernel loader: 47.534s (10.3%)
Front end: 17.534s (3.8%)
Front end > closures: 3.343s (0.7%)
Front end > worldImpact: 11.668s (2.5%)
Type inference: 148.534s (32.2%)
Deferred Loading: 5.609s (1.2%)
Deferred Loading > prepare: 0.186s (0.0%)
Enqueue: 25.177s (5.5%)
Enqueue > resolution.staticUse: 1.045s (0.2%)
Enqueue > resolution.typeUse: 13.548s (2.9%)
Enqueue > resolution.dynamicUse: 3.600s (0.8%)
Enqueue > codegen.typeUse: 2.775s (0.6%)
Enqueue > codegen.staticUse: 2.608s (0.6%)
Enqueue > codegen.dynamicUse: 0.635s (0.1%)
Enqueue > codegen.constantUse: 0.964s (0.2%)
self: 48.778s (10.6%)
*self > impl.run: 0.005s (0.0%)
self > run: 2.056s (0.4%)
self > compileFromKernel: 0.114s (0.0%)
self > computeClosedWorld: 29.614s (6.4%)
self > processQueue: 0.142s (0.0%)
self > emptyQueue: 1.311s (0.3%)
self > applyImpact: 5.461s (1.2%)
self > work.run: 10.071s (2.2%)
SSA builder: 45.692s (9.9%)
SSA optimizer: 36.937s (8.0%)
SSA optimizer > SsaInstructionSimplifier: 9.841s (2.1%)
SSA optimizer > SsaTypeconversionInserter: 1.002s (0.2%)
SSA optimizer > SsaRedundantPhiEliminator: 0.265s (0.1%)
SSA optimizer > SsaDeadPhiEliminator: 0.434s (0.1%)
SSA optimizer > SsaTypePropagator: 6.998s (1.5%)
SSA optimizer > SsaCheckInserter: 0.248s (0.1%)
SSA optimizer > SsaDeadCodeEliminator: 1.962s (0.4%)
SSA optimizer > SsaGlobalValueNumberer: 2.180s (0.5%)
SSA optimizer > SsaCodeMotion: 0.503s (0.1%)
SSA optimizer > SsaLoadElimination: 4.462s (1.0%)
SSA optimizer > SSA value range builder: 1.529s (0.3%)
SSA optimizer > SsaSimplifyInterceptors: 3.464s (0.7%)
SSA code generator: 22.719s (4.9%)
SSA code generator > SsaInstructionSelection: 1.038s (0.2%)
SSA code generator > SsaTypeKnownRemover: 0.400s (0.1%)
SSA code generator > SsaTrustedCheckRemover: 0.307s (0.1%)
SSA code generator > SsaAssignmentChaining: 0.490s (0.1%)
SSA code generator > SsaInstructionMerger: 1.468s (0.3%)
SSA code generator > SsaConditionMerger: 0.230s (0.0%)
SSA code generator > SsaShareRegionConstants: 0.426s (0.1%)
SSA code generator > SsaLiveIntervalBuilder: 2.992s (0.6%)
SSA code generator > SsaVariableAllocator: 4.777s (1.0%)
Code emitter: 61.255s (13.3%)
Code emitter > finalize rti: 7.132s (1.5%)
Code emitter > build program: 21.180s (4.6%)
Code emitter > emit program: 2.611s (0.6%)
Code emitter > write fragments: 22.409s (4.9%)
Code emitter > source-maps: 6.304s (1.4%)
Code emitter > emit buffers: 1.615s (0.3%)
Diagnostic handler: 0.003s (0.0%)
Total compile-time *
461.902s
; setup 0.000s; async 0.085s; unaccounted 2.039s (0.44%)
Compiled 96,689,428 characters Dart to 22,393,135 characters JavaScript in 462 seconds
Info: 4,651,368 characters JavaScript in ___.dart.js
Dart file ___.dart compiled to JavaScript: ___.dart.js
Emitted file 504 JavaScript files.
[INFO] GrinderTasks: dart2js took 490780 ms

There is a piece of logic there that could potentially be quadratic or possibly cubic depending on the structure of the code. We'll take a look to fix that. It is possible that you have some functions that trigger our pathological running time. If so, breaking apart those functions could be a way to cut that piece of compile-time.

@sigmundch How would we identify the offending code?

@todbachman-wf - I have 2 changes I'd like you to try, even better if done separately. The changes are currently only in code reviews:
https://dart-review.googlesource.com/c/sdk/+/121645 - to print the methods that are taking a while on that optimization phase
https://dart-review.googlesource.com/c/sdk/+/121241 - replacing a part of the algorithm to be linear and not cubic (which may be the issue at fault for this phase).

We'd be more than happy to test out a custom dart2js to help get to the root of the problem!

What's the best way for us to provide these changes to you? Do you need prebuilt binaries? Or by any chance do you have a setup to build the dart SDK, so a patch like the CL above would be enough for you to test things out?

(Tod is on PTO for a couple days.)

We don't currently have a CI setup that builds the dart SDK, so prebuilt binaries would be helpful, but we can also look into building from a CL. Do we need to be compatible with the latest dev channel of the SDK in order to use these dart2js refs?

Thanks @evanweible-wf

I'll look into what options we have available on our end as well (maybe we can give you a small snapshot of the compiler that could be copied into your sdk for testing, or maybe we have a channel that is frequent enough that we can ship this under a flag and let you test it out). If the former idea is an option, we should be able to rebase the code changes on top of the version of the sdk you are currently using.

Can you remind me what version are you currently using?

We're on 2.4.1. There was an analyzer regression that prevented us from upgrading to 2.5.0 and although there was a related fix merged, we are unable to compile on 2.5.2. We might have better luck with 2.6.0, but currently our app fails to build due to what looks like a typing regression.

@sigmundch I built the master branch of the SDK yesterday to see if https://dart-review.googlesource.com/c/sdk/+/121241 would help with memory but the build fails due to this issue https://github.com/dart-lang/sdk/issues/38880
So I wasn't able to determine if it helps for our situation.

@evanweible-wf - see https://github.com/dart-lang/sdk/issues/38880 for a fix in your code (and other context).

Hi @robbecker-wf - great to hear you have a build of the SDK! and thanks for giving it a shot!

With that setup it should be a lot easier to try a few things on the SDK that is working for you. We tag each release, so you can simply do git checkout 2.4.1; gclient sync and you should be able to build the SDK at that version. With that, I'm curious if you could try both patches:

  • with https://dart-review.googlesource.com/c/sdk/+/121645 we are especially curious if it all the time comes from one very pathological case, or if the time is shared among many methods. If it's from a single place, we could learn about what patterns make the algorithm worse and whether changes in your code could help reduce compilation time.

  • with https://dart-review.googlesource.com/c/sdk/+/121241 we
    are curious if this was the main issue or if there is more to do. We have tried creating artificial inputs to recreate the scenarios where this code is slow under the SsaSimplifyInterceptors step, but we haven't successfully found examples that don't make other parts of the compiler slower too. The fact that this phase is so much slower than the rest is pretty unusual.

@vsmenon I hacked in the workaround from the linked issue and built master of the dart sdk from today, Oct 16. It got much further, but eventually OOM'd after 20m instead of about 5m.

I tried it with vanilla 2.6-dev8 which incorporates https://dart-review.googlesource.com/c/sdk/+/121241 and it still crashes with OOM. Next I'll apply https://dart-review.googlesource.com/c/sdk/+/121645 and report back here.

6m in and all it's turned up so far is:
interceptor simplifier on j:constructor(IdTokenClaims.fromJwt) took 1164 ms

I added a print at the beginning so I could tell if this was working. It printed all the _ in the first ~6m and then finally crashed with OOM just under 19m.
My function looked like this:

  @override
  void visitGraph(HGraph graph) {
    print('_');
    Stopwatch watch = new Stopwatch()..start();
    this._graph = graph;
    visitDominatorTree(graph);
    if (watch.elapsedMilliseconds > 1000) {
      print('interceptor simplifier on ${graph.element} '
          'took ${watch.elapsedMilliseconds} ms');
    }
  }

FYI 16GB macbook pro.
running with export BUILD_MAX_WORKERS_PER_TASK=3
limiting to 1 worker didn't help, it just took longer to crash.

Interesting, I wonder if CL-121645 alone (without the other change) will show a larger number.

In #38797 the crash was happening in global inference, which would be before we hit the SsaSimplifyInterceptor step entirely. If you are seeing the first print, it seems to indicate that you are crashing at a later?

If that's the case, what OOM exception are you getting? Maybe we are in a different stage of the compiler?

Also, what flags do you have enabled?

Also, what flags do you have enabled?

Sorry for this duplicate question, I initially didn't see it, but I found again the answer you provided to this earlier

pbrb -o build --delete-conflicting-outputs --low-resources-mode --release
and build.yaml of

targets:
  $default:
    sources:
      exclude:
        - "lib/src/local_unified_via_bigsky_transformer.dart"
        - "test/performance/performance_test.dart"
    builders:
      build_web_compilers|entrypoint:
        generate_for:
          include:
            - "test/unit/**.browser_test.dart"
            - "web/*.dart"
        release_options:
          dart2js_args:
            - -O3
            - --csp

# Hack-y solution for source maps pending more permanent solution:
# https://github.com/dart-lang/build/issues/1779
global_options:
  build_web_compilers|dart_source_cleanup:
    release_options:
      enabled: false
  build_web_compilers|dart2js_archive_extractor:
    release_options:
      filter_outputs: false

I also tried with -O2. still crashes.

I think you're right. The fix seems to have gotten us past the first big hurdle and now it crashes much later on.

 - build_web_compilers:entrypoint on web/main.dart

[SEVERE] build_web_compilers:entrypoint on web/main.dart:
Dart2Js finished with:

Exhausted heap space, trying to allocate 262176 bytes.
web/main.dart:
Internal Error: The compiler crashed when compiling this element.

The compiler is broken.

When compiling the above element, the compiler crashed. It is not
possible to tell if this is caused by a problem in your program or
not. Regardless, the compiler should not crash.

The Dart team would greatly appreciate if you would take a moment to
report this problem at http://dartbug.com/new.

Please include the following information:

* the name and version of your operating system,

* the Dart SDK build number (2.6.0-edge.d9fa27a9ed8f52eae5d93831c7e4633e852d0b73), and

* the entire message you see here (including the full stack trace
  below as well as the source location above).

The compiler crashed: Out of Memory
#0      _LinkedHashMapMixin._init (dart:collection-patch/compact_hash.dart)
#1      _LinkedHashMapMixin._rehash (dart:collection-patch/compact_hash.dart:183:7)
#2      _LinkedHashMapMixin._insert (dart:collection-patch/compact_hash.dart:227:7)
#3      _LinkedHashMapMixin.putIfAbsent (dart:collection-patch/compact_hash.dart:296:7)
#4      DynamicCallSiteTypeInformation.computeType.<anonymous closure> (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/type_graph_nodes.dart:402:14)
#5      DynamicCallSiteTypeInformation.computeType (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/type_graph_nodes.dart:345:15)
#6      TypeInformation.refine (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/type_graph_nodes.dart:148:54)
#7      InferrerEngineImpl.refine (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:895:36)
#8      InferrerEngineImpl.runOverAllElements (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:515:5)
#9      TypeGraphInferrer.analyzeMain (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/type_graph_inferrer.dart:68:14)
#10     GlobalTypeInferenceTask.runGlobalTypeInference.<anonymous closure> (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/types.dart:181:41)
#11     CompilerTask.measure (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/common/tasks.dart:64:51)
#12     GlobalTypeInferenceTask.runGlobalTypeInference (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/inferrer/types.dart:174:12)
#13     Compiler.performGlobalTypeInference (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/compiler.dart:377:28)
#14     Compiler.compileFromKernel.<anonymous closure> (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/compiler.dart:420:13)
#15     CompilerTask.measureSubtask (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/common/tasks.dart:179:35)
#16     Compiler.compileFromKernel (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/compiler.dart:414:14)
#17     Compiler.runInternal (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/compiler.dart:255:13)
#18     _FutureListener.handleValue (dart:async/future_impl.dart:814:22158)
#19     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#20     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#21     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#22     KernelLoaderTask.load.<anonymous closure> (file:///Users/robbecker/code/dart-sdk/sdk/pkg/compiler/lib/src/kernel/loader.dart)
#23     _FutureListener.handleValue (dart:async/future_impl.dart:814:22158)
#24     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#25     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#26     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#27     compile (package:front_end/src/api_unstable/dart2js.dart)
#28     _FutureListener.handleValue (dart:async/future_impl.dart:814:22158)
#29     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#30     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#31     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#32     compile.<anonymous closure> (package:front_end/src/api_unstable/dart2js.dart)
#33     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#34     _rootRunUnary (dart:async/zone.dart:1132:38)
#35     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#36     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#37     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#38     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#39     withCrashReporting (package:front_end/src/fasta/crash.dart)
#40     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#41     _rootRunUnary (dart:async/zone.dart:1132:38)
...
...
#49     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#50     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#51     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#52     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#53     withCrashReporting (package:front_end/src/fasta/crash.dart)
#54     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#55     _rootRunUnary (dart:async/zone.dart:1132:38)
#56     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#57     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#58     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#59     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#60     KernelTarget.buildComponent.<anonymous closure> (package:front_end/src/fasta/kernel/kernel_target.dart)
#61     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#62     _rootRunUnary (dart:async/zone.dart:1132:38)
#63     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#64     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#65     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#66     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#67     Loader.buildBodies (package:front_end/src/fasta/loader.dart)
#68     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#69     _rootRunUnary (dart:async/zone.dart:1132:38)
#70     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#71     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#72     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#73     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#74     SourceLoader.buildBody (package:front_end/src/fasta/source/source_loader.dart)
#75     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#76     _rootRunUnary (dart:async/zone.dart:1132:38)
#77     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#78     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#79     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#80     Future._asyncComplete.<anonymous closure> (dart:async/future_impl.dart:522:5)
#81     _rootRun (dart:async/zone.dart:1124:13)
#82     _CustomZone.bindCallback.<anonymous closure> (dart:async/zone.dart:1021:19)
#83     _microtaskLoop (dart:async/schedule_microtask.dart:41:21)
#84     _startMicrotaskLoop (dart:async/schedule_microtask.dart:50:5)
#85     _runPendingImmediateCallback (dart:isolate-patch/isolate_patch.dart:116:13)
#86     _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:173:5)


[INFO] Running build completed, took 17m 14s
[INFO] Caching finalized dependency graph completed, took 3.9s
[SEVERE] Failed after 17m 18s

Looks like Type inference. Wasn't this the area that was using a lot of memory for us under Dart 1? We took to using --trust-type-annotations and that helped.
-O3 should include the dart 2 equivalent of --omit-implicit-checks

DynamicCallSiteTypeInformation.computeType makes me wonder if this is related to heavily using noSuchMethod ? or maybe I should say call() like this https://github.com/Workiva/over_react/blob/master/lib/src/component_declaration/component_base.dart#L611

Also worth pointing out that in Dart 1 we were getting close to and sometimes exceeding the max number of classes allowed in the VM (65,536 .. max uint value?)
In Dart 2, everything compiles to JS so this particular thing isn't a limitation, but it may be related here?

Wasn't this the area that was using a lot of memory for us under Dart 1?

Yes, this is the phase that has historically been the most expensive bit of the compiler. I'm confused though because the fix in my patch affects phases that come much later in the pipeline. I'm starting to wonder again if the multiple tasks are confusing us a bit. Maybe the other tasks were adding more pressure before the fix?

To isolate things further, I think it will help if you can run dartj2s externally without using dart_web_compilers. It is possible that the build folder (under .dart_tool/) already has all the generated code (e.g. if you are using frameworks like angular that generate a bunch of code). So you could then invoke dart2js directly on the commandline and give it inputs from that folder. If that doesn't work, @jakemac53 proposed two ideas on this comment https://github.com/dart-lang/sdk/issues/38797#issuecomment-542235006.

Once we have an isolated way to run dart2js, we may be able to run it with a profiler to narrow down where the time is being spent.

DynamicCallSiteTypeInformation.computeType makes me wonder if this is related to heavily using noSuchMethod ?

Yes - I was wondering why workiva is hitting this problem when our internal apps of similar size are not. The more dynamic the code is, the worse it is for analyses like ours.

One particular issue I wanted to investigate is that there is a potential quadratic explosion of state when you have a lot of dynamic calls that could match a lot of targets. I have to look closer, but I believe noSuchMethod can create a lot of those scenarios and could be a reason behind this.

There is one possible way to remove the quadratic explosion in the algorithm, but it is a non trivial bit of work. I tried to measure the impact of this long ago in an internal app (> year ago now), and it didn't show a potential benefit so we didn't pursue it.

maybe I should say call() like this https://github.com/Workiva/over_react/blob/master/lib/src/component_declaration/component_base.dart#L611

Function.apply is pretty dynamic too :), I have a hunch that it is better than noSuchMethod.

Another idea: to get the answer from CL-121645 you could try to run the compiler by disabling this global analysis via --disable-type-inference. This may cause other issues and the app will be less optimized, so we don't recommend it for your production build, but it can be handy to see what happens on the other parts of the compiler.

https://github.com/Workiva/over_react/blob/master/lib/src/component_declaration/component_base.dart#L611

Are there many call targets like these too?

Any closure invocation whose target is not known will force us to create edges from every argument of the closure to every parameter of any call method that matches the signature. The same is true with other kinds of dynamic calls.

I ran pbr build with --disable-type-inference and the build succeeded in 15m which is about 5m faster than with the optimization enabled.

Compiled 135,875,875 characters Dart to 36,947,383 characters JavaScript in 771 seconds
Dart file web/main.dart compiled to JavaScript: web/main.dart.js

Are there many call targets like these too?

That code used to be based on NSM, but moved to call for Dart 2. I don't think there are many places in our codebase that have the call method .. but that one for instance is the base class for our entire react-based UI component library. So there are many (thousands) subclasses of that class. Are there specific code patterns we should try to move to that would result in faster compile times and less memory? ie: the less dynamic the better? How would we handle JS interop in that world?

Searching our code for NSM .. it mostly shows up in mocks for tests. There are 2 smaller uses. Would it be best to get rid of them?

Looking under build, I'm not seeing a main.dart or an obvious entrypoint to run dart2js directly on. Where would it be?

I just did some searching and in our app, we have discrete implementations of call within:

  • ~66 classes extending Redux's MiddlewareClass or one of its subclasses

    • call implementations look like void call(Store store, dynamic action, NextDispatcher next)

  • ~20 other miscellaneous classes, the majority of which take 0 or 1 arguments
    Note that this is a count of concrete call implementations does not include classes that inherit a call implementation.
    We've got very roughly around ~1422 props classes, all of which inherit the call from UiProps linked above.

It would be good to know whether inherited calls are problematic.

I'm experimenting with editing inferrer_engine.dart and type_graph_nodes.dart to see what I can see. Do you have any ideas on how I would verify if our usage of call is problematic?

... succeeded in 15m which is about 5m faster than with the optimization enabled.

glad to hear!

If it is not too much trouble, it would be great to see the effect of our change in simplify-interceptors. Could you provide the --verbose output of a couple configurations? I'm interested in:

(note that patch 121241 already landed on master, so another way to try is to use the old branch you checked out a few days ago with patch 121645, and repeat the experiment on the latest master branch with patch 121645.)

Searching our code for NSM .. it mostly shows up in mocks for tests. There are 2 smaller uses. Would it be best to get rid of them?

That's good to hear overall. In general yes it's a source of code bloat and usually we recommend not using them in production code. From your description though I don't believe this is where the bloat is coming from.

... It would be good to know whether inherited calls are problematic.

I don't believe so, only if you override call again in the subclass then it is an issue. The main problem with scale in this algorithm is having N dynamic calls that look the same with M different possible targets with P parameters each. We'd then generate NxMxP edges in our inference graph.

Question: Is it important for your framework to use the name call? Can this method be named something else like callComponent? The name call is special because it can be used to pretend that an object implements Function. That means that the compiler doesn't know that any closure in your program could be a target of a call of this kind.

Looking under build, I'm not seeing a main.dart or an obvious entrypoint to run dart2js directly on. Where would it be?

I was operating under the assumption that a file was created on disk there, but I believe I was wrong. @jakemac53 - if there is no codegen involved for the main entrypoint, how do you get a directory where all the sources are available and we can invoke dart2js directly on the command-line?

Using build_runner you would pass -o <output-dir> or -o <input-dir>:<output-dir>

After chatting with @jakemac53 he clarified that you need to run without --release (release mode deletes the source files and only keeps the generated code).

Ok. I first did pub run build_runner -o build --delete-conflicting-outputs to get :allthethings: into the build/web dir. Then in the build/web dir I did dart2js -v -o out.js -O3 --csp main.dart.
As a baseline, here's the output from Dart 2.4.1. I'll put up others to compare with the simplify-interceptors fixes.

✗ dart2js -v -o out.js -O3 --csp main.dart
Info: Compiling file:///Users/robbecker/code/wdesk/build/web/main.dart (2.4.1)
Info: Kernel load complete
Info: Resolved 355056 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(callMethod))
 static uses:
  StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=6, types=1))
 type uses:
  TypeUse(String,TypeUseKind.PARAMETER_CHECK)
  TypeUse(List<dynamic>,TypeUseKind.PARAMETER_CHECK)
 const-literals:
  "Object|Null"
  "#[#].apply(#, #)"
 native-data:
  NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes anything), throws: may).
Info: Resolved 959 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Added 349582 elements in inferencing graph.
Info: Inferred 7494248 types.
Info: Compiling methods
Info: Compiled 261611 methods.
Info: Compiled 959 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
    kernel loader:                                   87.746s  (5.2%)
    Front end:                                       15.798s  (0.9%)
    Front end > closures:                             4.435s  (0.3%)
    Front end > worldImpact:                          8.693s  (0.5%)
    Type inference:                                 978.316s (58.5%)
    Deferred Loading:                                15.265s  (0.9%)
    Deferred Loading > prepare:                       0.029s  (0.0%)
    Deferred Loading > find-imports:                  3.604s  (0.2%)
    Enqueue:                                         46.269s  (2.8%)
    Enqueue > resolution.staticUse:                   1.044s  (0.1%)
    Enqueue > resolution.typeUse:                    20.815s  (1.2%)
    Enqueue > resolution.dynamicUse:                 11.294s  (0.7%)
    Enqueue > codegen.typeUse:                        3.859s  (0.2%)
    Enqueue > codegen.staticUse:                      6.630s  (0.4%)
    Enqueue > codegen.dynamicUse:                     1.390s  (0.1%)
    Enqueue > codegen.constantUse:                    1.234s  (0.1%)
    self:                                            36.849s  (2.2%)
   *self > impl.run:                                  0.037s  (0.0%)
    self > run:                                       1.774s  (0.1%)
    self > compileFromKernel:                         0.220s  (0.0%)
    self > computeClosedWorld:                       13.891s  (0.8%)
    self > processQueue:                              0.221s  (0.0%)
    self > emptyQueue:                                1.317s  (0.1%)
    self > applyImpact:                               7.508s  (0.4%)
    self > work.run:                                 11.876s  (0.7%)
    SSA builder:                                    105.408s  (6.3%)
    SSA optimizer:                                  281.659s (16.8%)
    SSA optimizer > SsaInstructionSimplifier:        46.200s  (2.8%)
    SSA optimizer > SsaTypeconversionInserter:        1.308s  (0.1%)
    SSA optimizer > SsaRedundantPhiEliminator:        0.398s  (0.0%)
    SSA optimizer > SsaDeadPhiEliminator:             0.655s  (0.0%)
    SSA optimizer > SsaTypePropagator:               23.976s  (1.4%)
    SSA optimizer > SsaCheckInserter:                 1.167s  (0.1%)
    SSA optimizer > SsaDeadCodeEliminator:            2.416s  (0.1%)
    SSA optimizer > SsaGlobalValueNumberer:           2.057s  (0.1%)
    SSA optimizer > SsaCodeMotion:                    0.618s  (0.0%)
    SSA optimizer > SsaLoadElimination:               5.082s  (0.3%)
    SSA optimizer > SSA value range builder:          1.888s  (0.1%)
    SSA optimizer > SsaSimplifyInterceptors:        188.863s (11.3%)
    SSA code generator:                              27.314s  (1.6%)
    SSA code generator > SsaInstructionSelection:     0.999s  (0.1%)
    SSA code generator > SsaTypeKnownRemover:         0.476s  (0.0%)
    SSA code generator > SsaTrustedCheckRemover:      0.123s  (0.0%)
    SSA code generator > SsaAssignmentChaining:       0.458s  (0.0%)
    SSA code generator > SsaInstructionMerger:        1.679s  (0.1%)
    SSA code generator > SsaConditionMerger:          0.306s  (0.0%)
    SSA code generator > SsaShareRegionConstants:     0.413s  (0.0%)
    SSA code generator > SsaLiveIntervalBuilder:      3.591s  (0.2%)
    SSA code generator > SsaVariableAllocator:        6.007s  (0.4%)
    Code emitter:                                    71.452s  (4.3%)
    Code emitter > finalize rti:                      6.252s  (0.4%)
    Code emitter > build program:                    20.407s  (1.2%)
    Code emitter > emit program:                      3.588s  (0.2%)
    Code emitter > write fragments:                  29.615s  (1.8%)
    Code emitter > source-maps:                       9.043s  (0.5%)
    Code emitter > emit buffers:                      2.539s  (0.2%)
    Diagnostic handler:                               0.002s  (0.0%)
    Total compile-time 1672.363s; setup 0.000s; async 0.079s; unaccounted 6.199s (0.37%)
Compiled 135,713,484 characters Dart to 34,586,546 characters JavaScript in 1673 seconds
Info: 11,697,912 characters JavaScript in out.js
Dart file main.dart compiled to JavaScript: out.js
Emitted file 400 JavaScript files.

Question: Is it important for your framework to use the name call? Can this method be named something else like callComponent? The name call is special because it can be used to pretend that an object implements Function. That means that the compiler doesn't know that any closure in your program could be a target of a call of this kind.

Yes, these classes are intended to be callable classes to be able to build Flutter-like fluent code. See https://github.com/Workiva/over_react/#fluent-style-component-consumption

render() {
  // Create a builder for a <div>,
  // add a CSS class name by cascading a typed setter,
  // and invoke the builder with the HTML DOM <h1> and <button> children.
  return (Dom.div()..className = 'container')(

    // Create a builder for an <h1> and invoke it with children.
    // No need for wrapping parentheses, since no props are added.
    Dom.h1()('Click the button!'),

    // Create a builder for a <button>,
    (Dom.button()
      // add a ubiquitous DOM prop exposed on all components,
      // which Dom.button() forwards to its rendered DOM,
      ..id = 'main_button'
      // add another prop,
      ..onClick = _handleClick
    // and finally invoke the builder with children.
    )('Click me')
  );
}

This was built before Flutter was a thing, but maybe there's something to learn from how Flutter widgets are coded.

This is the first case you wanted to see.

  • --disable-type-inference with patch 121645 but without patch 121241
    It's on 2.6.0-dev.7.0 so it doesn't have the ssa fix yet and I reduced the time from 1000ms to 500ms to get more info and added a print the first time it goes through visitGraph(HGraph graph)
✗ dart2js -v -o out.js -O3 --csp --disable-type-inference main.dart
Info: Compiling file:///Users/robbecker/code/wdesk/build/web/main.dart (2.6.0-dev.7.0)
Info: Kernel load complete
Info: Resolved 297619 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(callMethod))member: k:method(callMethod)
 static uses:
  StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=6, types=1))
 type uses:
  TypeUse(String,TypeUseKind.PARAMETER_CHECK)
  TypeUse(List<dynamic>,TypeUseKind.PARAMETER_CHECK)
 const-literals:
  "Object|Null"
  "#[#].apply(#, #)"
 native-data:
  NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes anything), throws: may).
Info: Resolved 959 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Compiling methods
visitGraph(HGraph graph)
interceptor simplifier on j:method(AdminIntelligence.sendFilingDefaultsModifyAnalytic) took 528 ms
interceptor simplifier on j:method(_FormatUpdate.addFormat) took 867 ms
interceptor simplifier on j:constructor(AccessTokenClaims.fromJwt) took 538 ms
interceptor simplifier on j:method(valueFormatFromEntity) took 531 ms
interceptor simplifier on j:method(valueFormatFromEntity) took 529 ms
Info: Compiled 277260 methods.
Info: Compiled 959 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
    kernel loader:                                  121.169s (13.3%)
    Front end:                                       15.574s  (1.7%)
    Front end > closures:                             2.802s  (0.3%)
    Front end > worldImpact:                         10.382s  (1.1%)
    Type inference:                                   0.003s  (0.0%)
    Deferred Loading:                                15.819s  (1.7%)
    Deferred Loading > prepare:                       0.041s  (0.0%)
    Deferred Loading > find-imports:                  3.405s  (0.4%)
    Enqueue:                                         59.581s  (6.5%)
    Enqueue > resolution.staticUse:                   1.204s  (0.1%)
    Enqueue > resolution.typeUse:                    32.240s  (3.5%)
    Enqueue > resolution.dynamicUse:                 14.825s  (1.6%)
    Enqueue > codegen.typeUse:                        3.265s  (0.4%)
    Enqueue > codegen.staticUse:                      5.482s  (0.6%)
    Enqueue > codegen.dynamicUse:                     1.421s  (0.2%)
    Enqueue > codegen.constantUse:                    1.141s  (0.1%)
    self:                                            40.752s  (4.5%)
   *self > impl.run:                                  0.007s  (0.0%)
    self > run:                                       4.427s  (0.5%)
    self > compileFromKernel:                         0.152s  (0.0%)
    self > computeClosedWorld:                       15.515s  (1.7%)
    self > processQueue:                              0.094s  (0.0%)
    self > emptyQueue:                                1.341s  (0.1%)
    self > applyImpact:                               6.615s  (0.7%)
    self > work.run:                                 12.596s  (1.4%)
    SSA builder:                                    103.053s (11.3%)
    SSA optimizer:                                  431.732s (47.3%)
    SSA optimizer > SsaInstructionSimplifier:        38.904s  (4.3%)
    SSA optimizer > SsaTypeconversionInserter:        1.620s  (0.2%)
    SSA optimizer > SsaRedundantPhiEliminator:        0.462s  (0.1%)
    SSA optimizer > SsaDeadPhiEliminator:             0.620s  (0.1%)
    SSA optimizer > SsaTypePropagator:               19.853s  (2.2%)
    SSA optimizer > SsaCheckInserter:                 1.469s  (0.2%)
    SSA optimizer > SsaDeadCodeEliminator:            2.711s  (0.3%)
    SSA optimizer > SsaGlobalValueNumberer:           2.531s  (0.3%)
    SSA optimizer > SsaCodeMotion:                    0.777s  (0.1%)
    SSA optimizer > SsaLoadElimination:               4.286s  (0.5%)
    SSA optimizer > SSA value range builder:          1.995s  (0.2%)
    SSA optimizer > SsaSimplifyInterceptors:        349.495s (38.3%)
    SSA code generator:                              28.984s  (3.2%)
    SSA code generator > SsaInstructionSelection:     1.042s  (0.1%)
    SSA code generator > SsaTypeKnownRemover:         0.736s  (0.1%)
    SSA code generator > SsaTrustedCheckRemover:      0.146s  (0.0%)
    SSA code generator > SsaAssignmentChaining:       0.623s  (0.1%)
    SSA code generator > SsaInstructionMerger:        1.819s  (0.2%)
    SSA code generator > SsaConditionMerger:          0.316s  (0.0%)
    SSA code generator > SsaShareRegionConstants:     0.545s  (0.1%)
    SSA code generator > SsaLiveIntervalBuilder:      3.324s  (0.4%)
    SSA code generator > SsaVariableAllocator:        6.137s  (0.7%)
    Code emitter:                                    89.789s  (9.8%)
    Code emitter > finalize rti:                      5.527s  (0.6%)
    Code emitter > build program:                    31.903s  (3.5%)
    Code emitter > emit program:                      3.359s  (0.4%)
    Code emitter > write fragments:                  28.809s  (3.2%)
    Code emitter > source-maps:                      18.187s  (2.0%)
    Code emitter > emit buffers:                      2.001s  (0.2%)
    Diagnostic handler:                               0.000s  (0.0%)
    Total compile-time 913.153s; setup 0.000s; async 0.076s; unaccounted 6.617s (0.72%)
Compiled 136,005,610 characters Dart to 36,944,206 characters JavaScript in 913 seconds
Info: 12,694,424 characters JavaScript in out.js
Dart file main.dart compiled to JavaScript: out.js
Emitted file 401 JavaScript files.

This is the 2nd use case you asked about

  • --disable-type-inference with patch 121645 and with patch 121241
    I did this by checking out sdk 2.6.0-dev.8.1 (which has 121241) and applying the same patch to time and print in visitGraph
✗ dart2js -v -o out.js -O3 --csp --disable-type-inference main.dart
Info: Compiling file:///Users/robbecker/code/wdesk/build/web/main.dart (2.6.0-dev.8.1)
Info: Kernel load complete
Info: Resolved 297620 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(callMethod))member: k:method(callMethod)
 static uses:
  StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=6, types=1))
 type uses:
  TypeUse(String,TypeUseKind.PARAMETER_CHECK)
  TypeUse(List<dynamic>,TypeUseKind.PARAMETER_CHECK)
 const-literals:
  "Object|Null"
  "#[#].apply(#, #)"
 native-data:
  NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes anything), throws: may).
Info: Resolved 959 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Compiling methods
visitGraph(HGraph graph)
interceptor simplifier on j:method(AdminIntelligence.sendFilingDefaultsModifyAnalytic) took 548 ms
interceptor simplifier on j:method(_FormatUpdate.addFormat) took 718 ms
interceptor simplifier on j:method(LoadTestPhasesForToCCommand.parseResults) took 847 ms
Info: Compiled 277254 methods.
Info: Compiled 959 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
    kernel loader:                                  119.914s (14.2%)
    Front end:                                       13.247s  (1.6%)
    Front end > closures:                             2.491s  (0.3%)
    Front end > worldImpact:                          8.683s  (1.0%)
    Type inference:                                   0.001s  (0.0%)
    Deferred Loading:                                16.168s  (1.9%)
    Deferred Loading > prepare:                       0.038s  (0.0%)
    Deferred Loading > find-imports:                  3.379s  (0.4%)
    Enqueue:                                         50.159s  (6.0%)
    Enqueue > resolution.staticUse:                   0.961s  (0.1%)
    Enqueue > resolution.typeUse:                    25.634s  (3.0%)
    Enqueue > resolution.dynamicUse:                 12.583s  (1.5%)
    Enqueue > codegen.typeUse:                        3.074s  (0.4%)
    Enqueue > codegen.staticUse:                      5.351s  (0.6%)
    Enqueue > codegen.dynamicUse:                     1.364s  (0.2%)
    Enqueue > codegen.constantUse:                    1.188s  (0.1%)
    self:                                            41.091s  (4.9%)
   *self > impl.run:                                  0.007s  (0.0%)
    self > run:                                       2.685s  (0.3%)
    self > compileFromKernel:                         0.160s  (0.0%)
    self > computeClosedWorld:                       17.484s  (2.1%)
    self > processQueue:                              0.084s  (0.0%)
    self > emptyQueue:                                1.412s  (0.2%)
    self > applyImpact:                               7.077s  (0.8%)
    self > work.run:                                 12.178s  (1.4%)
    SSA builder:                                     93.235s (11.1%)
    SSA optimizer:                                  379.206s (45.0%)
    SSA optimizer > SsaInstructionSimplifier:        36.452s  (4.3%)
    SSA optimizer > SsaTypeconversionInserter:        1.508s  (0.2%)
    SSA optimizer > SsaRedundantPhiEliminator:        0.464s  (0.1%)
    SSA optimizer > SsaDeadPhiEliminator:             0.637s  (0.1%)
    SSA optimizer > SsaTypePropagator:               19.699s  (2.3%)
    SSA optimizer > SsaCheckInserter:                 1.423s  (0.2%)
    SSA optimizer > SsaDeadCodeEliminator:            2.641s  (0.3%)
    SSA optimizer > SsaGlobalValueNumberer:           2.572s  (0.3%)
    SSA optimizer > SsaCodeMotion:                    0.734s  (0.1%)
    SSA optimizer > SsaLoadElimination:               5.011s  (0.6%)
    SSA optimizer > SSA value range builder:          1.941s  (0.2%)
    SSA optimizer > SsaSimplifyInterceptors:        298.630s (35.4%)
    SSA code generator:                              29.486s  (3.5%)
    SSA code generator > SsaInstructionSelection:     1.059s  (0.1%)
    SSA code generator > SsaTypeKnownRemover:         0.704s  (0.1%)
    SSA code generator > SsaTrustedCheckRemover:      0.162s  (0.0%)
    SSA code generator > SsaAssignmentChaining:       0.625s  (0.1%)
    SSA code generator > SsaInstructionMerger:        1.788s  (0.2%)
    SSA code generator > SsaConditionMerger:          0.330s  (0.0%)
    SSA code generator > SsaShareRegionConstants:     0.537s  (0.1%)
    SSA code generator > SsaLiveIntervalBuilder:      3.729s  (0.4%)
    SSA code generator > SsaVariableAllocator:        6.345s  (0.8%)
    Code emitter:                                    93.881s (11.1%)
    Code emitter > finalize rti:                      7.406s  (0.9%)
    Code emitter > build program:                    26.068s  (3.1%)
    Code emitter > emit program:                      3.429s  (0.4%)
    Code emitter > write fragments:                  21.791s  (2.6%)
    Code emitter > source-maps:                      33.069s  (3.9%)
    Code emitter > emit buffers:                      2.115s  (0.3%)
    Diagnostic handler:                               0.000s  (0.0%)
    Total compile-time 842.829s; setup 0.000s; async 0.084s; unaccounted 6.351s (0.75%)
Compiled 136,005,330 characters Dart to 36,943,984 characters JavaScript in 843 seconds
Info: 12,694,198 characters JavaScript in out.js
Dart file main.dart compiled to JavaScript: out.js
Emitted file 401 JavaScript files.

So, takeaways

  • SSA optimizer is better with 121241 applied, but still substantially longer than in Dart 2.4.1
  • Type inference in 2.4.1 is by far the most expensive phase at 978.316s (58.5%). This is what we need to improve either in the compiler or by changing or reducing our code.

I'm also investigating when exactly the increase to ssa time happened. Looks like it was after 2.5.
Dart 2.5.0: SSA optimizer: 207.537s (26.0%)

Ok, so I left type inference on and compiled a smaller project to determine when the ssa step got longer. (Don't compare these numbers or % to the ones above.) It appears to have increased from 2.6-dev.1.0 to 2.6-dev.1.0. It's all in the simplifier.

1.0 SSA optimizer > SsaSimplifyInterceptors: 4.448s (6.7%)
2.0 SSA optimizer > SsaSimplifyInterceptors: 9.140s (11.9%)

2.6-1.0

✗ dart2js -v -o out.js -O3 --csp main.dart
Info: Compiling file:///Users/robbecker/code/wdesk_sdk/app/build/web/main.dart (2.6.0-dev.1.0)
Info: Kernel load complete
Info: Resolved 39295 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(getProperty))member: k:method(getProperty)
 static uses:
  StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=4, types=1))
 const-literals:
  "Object|Null"
  "#[#]"
 native-data:
  NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1)).
Info: Resolved 812 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Added 38250 elements in inferencing graph.
Info: Inferred 787957 types.
Info: Compiling methods
Info: Compiled 32393 methods.
Info: Compiled 812 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
    kernel loader:                                   14.801s (22.4%)
    Front end:                                        2.184s  (3.3%)
    Front end > closures:                             0.366s  (0.6%)
    Front end > worldImpact:                          1.480s  (2.2%)
    Type inference:                                  10.657s (16.1%)
    Deferred Loading:                                 1.623s  (2.5%)
    Deferred Loading > prepare:                       0.010s  (0.0%)
    Deferred Loading > find-imports:                  0.342s  (0.5%)
    Enqueue:                                          5.814s  (8.8%)
    Enqueue > resolution.staticUse:                   0.129s  (0.2%)
    Enqueue > resolution.typeUse:                     1.457s  (2.2%)
    Enqueue > resolution.dynamicUse:                  3.403s  (5.1%)
    Enqueue > codegen.typeUse:                        0.313s  (0.5%)
    Enqueue > codegen.staticUse:                      0.247s  (0.4%)
    Enqueue > codegen.dynamicUse:                     0.115s  (0.2%)
    Enqueue > codegen.constantUse:                    0.147s  (0.2%)
    self:                                             5.441s  (8.2%)
   *self > impl.run:                                  0.004s  (0.0%)
    self > run:                                       1.302s  (2.0%)
    self > compileFromKernel:                         0.017s  (0.0%)
    self > computeClosedWorld:                        2.005s  (3.0%)
    self > processQueue:                              0.048s  (0.1%)
    self > emptyQueue:                                0.145s  (0.2%)
    self > applyImpact:                               0.699s  (1.1%)
    self > work.run:                                  1.217s  (1.8%)
    SSA builder:                                      5.921s  (8.9%)
    SSA optimizer:                                    9.477s (14.3%)
    SSA optimizer > SsaInstructionSimplifier:         1.450s  (2.2%)
    SSA optimizer > SsaTypeconversionInserter:        0.164s  (0.2%)
    SSA optimizer > SsaRedundantPhiEliminator:        0.046s  (0.1%)
    SSA optimizer > SsaDeadPhiEliminator:             0.059s  (0.1%)
    SSA optimizer > SsaTypePropagator:                1.156s  (1.7%)
    SSA optimizer > SsaCheckInserter:                 0.148s  (0.2%)
    SSA optimizer > SsaDeadCodeEliminator:            0.268s  (0.4%)
    SSA optimizer > SsaGlobalValueNumberer:           0.208s  (0.3%)
    SSA optimizer > SsaCodeMotion:                    0.077s  (0.1%)
    SSA optimizer > SsaLoadElimination:               0.494s  (0.7%)
    SSA optimizer > SSA value range builder:          0.256s  (0.4%)
    SSA optimizer > SsaSimplifyInterceptors:          4.448s  (6.7%)
    SSA code generator:                               3.146s  (4.8%)
    SSA code generator > SsaInstructionSelection:     0.129s  (0.2%)
    SSA code generator > SsaTypeKnownRemover:         0.061s  (0.1%)
    SSA code generator > SsaTrustedCheckRemover:      0.015s  (0.0%)
    SSA code generator > SsaAssignmentChaining:       0.062s  (0.1%)
    SSA code generator > SsaInstructionMerger:        0.191s  (0.3%)
    SSA code generator > SsaConditionMerger:          0.035s  (0.1%)
    SSA code generator > SsaShareRegionConstants:     0.051s  (0.1%)
    SSA code generator > SsaLiveIntervalBuilder:      0.358s  (0.5%)
    SSA code generator > SsaVariableAllocator:        0.640s  (1.0%)
    Code emitter:                                     6.392s  (9.7%)
    Code emitter > finalize rti:                      0.203s  (0.3%)
    Code emitter > build program:                     0.783s  (1.2%)
    Code emitter > emit program:                      0.451s  (0.7%)
    Code emitter > write fragments:                   3.412s  (5.2%)
    Code emitter > source-maps:                       0.772s  (1.2%)
    Code emitter > emit buffers:                      0.770s  (1.2%)
    Diagnostic handler:                               0.000s  (0.0%)
    Total compile-time 66.194s; setup 0.000s; async 0.015s; unaccounted 0.716s (1.08%)
Compiled 49,450,585 characters Dart to 4,366,161 characters JavaScript in 66.3 seconds
Info: 3,629,730 characters JavaScript in out.js
Dart file main.dart compiled to JavaScript: out.js
Emitted file 13 JavaScript files.

2.6-2.0

✗ dart2js -v -o out.js -O3 --csp main.dart
Info: Compiling file:///Users/robbecker/code/wdesk_sdk/app/build/web/main.dart (2.6.0-dev.2.0)
Info: Kernel load complete
Info: Resolved 39302 elements.
Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(getProperty))member: k:method(getProperty)
 static uses:
  StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=4, types=1))
 const-literals:
  "Object|Null"
  "#[#]"
 native-data:
  NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes nothing), throws: null(1)).
Info: Resolved 812 native elements used, 0 native elements dead.
Info: Performing global type inference
Info: Added 38254 elements in inferencing graph.
Info: Inferred 788031 types.
Info: Compiling methods
Info: Compiled 32399 methods.
Info: Compiled 812 native classes, 0 native classes omitted.
Info: Emitting JavaScript
Info: Timings:
    kernel loader:                                   15.153s (19.7%)
    Front end:                                        2.404s  (3.1%)
    Front end > closures:                             0.346s  (0.5%)
    Front end > worldImpact:                          1.697s  (2.2%)
    Type inference:                                  11.764s (15.3%)
    Deferred Loading:                                 1.773s  (2.3%)
    Deferred Loading > prepare:                       0.010s  (0.0%)
    Deferred Loading > find-imports:                  0.337s  (0.4%)
    Enqueue:                                          7.094s  (9.2%)
    Enqueue > resolution.staticUse:                   0.159s  (0.2%)
    Enqueue > resolution.typeUse:                     1.650s  (2.1%)
    Enqueue > resolution.dynamicUse:                  4.398s  (5.7%)
    Enqueue > codegen.typeUse:                        0.337s  (0.4%)
    Enqueue > codegen.staticUse:                      0.271s  (0.4%)
    Enqueue > codegen.dynamicUse:                     0.121s  (0.2%)
    Enqueue > codegen.constantUse:                    0.155s  (0.2%)
    self:                                             6.063s  (7.9%)
   *self > impl.run:                                  0.004s  (0.0%)
    self > run:                                       1.172s  (1.5%)
    self > compileFromKernel:                         0.018s  (0.0%)
    self > computeClosedWorld:                        2.586s  (3.4%)
    self > processQueue:                              0.034s  (0.0%)
    self > emptyQueue:                                0.188s  (0.2%)
    self > applyImpact:                               0.801s  (1.0%)
    self > work.run:                                  1.258s  (1.6%)
    SSA builder:                                      6.113s  (8.0%)
    SSA optimizer:                                   14.862s (19.3%)
    SSA optimizer > SsaInstructionSimplifier:         1.629s  (2.1%)
    SSA optimizer > SsaTypeconversionInserter:        0.188s  (0.2%)
    SSA optimizer > SsaRedundantPhiEliminator:        0.050s  (0.1%)
    SSA optimizer > SsaDeadPhiEliminator:             0.067s  (0.1%)
    SSA optimizer > SsaTypePropagator:                1.352s  (1.8%)
    SSA optimizer > SsaCheckInserter:                 0.169s  (0.2%)
    SSA optimizer > SsaDeadCodeEliminator:            0.268s  (0.3%)
    SSA optimizer > SsaGlobalValueNumberer:           0.235s  (0.3%)
    SSA optimizer > SsaCodeMotion:                    0.080s  (0.1%)
    SSA optimizer > SsaLoadElimination:               0.634s  (0.8%)
    SSA optimizer > SSA value range builder:          0.260s  (0.3%)
    SSA optimizer > SsaSimplifyInterceptors:          9.140s (11.9%)
    SSA code generator:                               3.281s  (4.3%)
    SSA code generator > SsaInstructionSelection:     0.149s  (0.2%)
    SSA code generator > SsaTypeKnownRemover:         0.065s  (0.1%)
    SSA code generator > SsaTrustedCheckRemover:      0.015s  (0.0%)
    SSA code generator > SsaAssignmentChaining:       0.067s  (0.1%)
    SSA code generator > SsaInstructionMerger:        0.201s  (0.3%)
    SSA code generator > SsaConditionMerger:          0.035s  (0.0%)
    SSA code generator > SsaShareRegionConstants:     0.058s  (0.1%)
    SSA code generator > SsaLiveIntervalBuilder:      0.385s  (0.5%)
    SSA code generator > SsaVariableAllocator:        0.652s  (0.8%)
    Code emitter:                                     7.583s  (9.9%)
    Code emitter > finalize rti:                      0.210s  (0.3%)
    Code emitter > build program:                     0.792s  (1.0%)
    Code emitter > emit program:                      0.470s  (0.6%)
    Code emitter > write fragments:                   3.683s  (4.8%)
    Code emitter > source-maps:                       1.930s  (2.5%)
    Code emitter > emit buffers:                      0.495s  (0.6%)
    Diagnostic handler:                               0.000s  (0.0%)
    Total compile-time 76.818s; setup 0.000s; async 0.013s; unaccounted 0.709s (0.92%)
Compiled 49,454,793 characters Dart to 4,366,556 characters JavaScript in 76.9 seconds
Info: 3,630,119 characters JavaScript in out.js
Dart file main.dart compiled to JavaScript: out.js
Emitted file 13 JavaScript files.

scrolling through https://github.com/dart-lang/sdk/compare/2.6.0-dev.1.0...2.6.0-dev.2.0 ... my guess is something about the inference changes in this breaking change caused the performance drop?

Inference is changed when using Null values in a FutureOr context.

Or maybe something added for static extension methods?

Thank you so much for all this information, this is incredibly useful.

We have still more to do, but I'm glad to see that our patch cut down almost a minute from the simplifier phase. For what it is worth, on internal apps that are as big as yours, we see numbers like these:

    SSA builder:                                     58.690s (22.2%)
    SSA optimizer:                                   63.157s (23.8%)
    SSA optimizer > SsaSimplifyInterceptors:          9.918s  (3.7%)

Almost always the SSA-builder and SSA-optimizer take about the same amount of time, and generally the simplify-interceptor step takes about 10s. I hope we can recognize what's wrong to get your app to similar numbers as well.

It may be possible to learn what are the causes by looking at methods where it is slow. Are any of the methods you printed above part of an open source package that we can look at? I'll try to find examples where it is slow today on internal apps, but given that our app don't show this slow down I don't believe we'll find much there.

Meanwhile, I agree that the global inference is the biggest issue, I'm trying to explore patches we can create for you to gather more data and validate whether some algorithm changes could make a difference (making the actual changes could take a while, so we want to try to find something that can accelerate the process.).

I was also going to ask you, are there sample over_react apps open source that we can play with? I presume part of the code patterns from the framework could be at play in making the global inference slow, but we are not sure. I'm wondering if a sample app could help us validate this.

Thanks again for all the help!

We're digging around to see if there are open source examples for you to test with. I'll dig into some of the methods that were reported as slow, but those are small fired compared to the type inference portion, which I don't have much control over.

Should I be worried about this?

Info: All native types marked as used due to ResolutionWorldImpactBuilder(k:method(callMethod))member: k:method(callMethod)
 static uses:
  StaticUse(k:method(JS),StaticUseKind.STATIC_INVOKE,null,[dynamic],CallStructure(arity=6, types=1))
 type uses:
  TypeUse(String,TypeUseKind.PARAMETER_CHECK)
  TypeUse(List<dynamic>,TypeUseKind.PARAMETER_CHECK)
 const-literals:
  "Object|Null"
  "#[#].apply(#, #)"
 native-data:
  NativeBehavior(returns: [Object, Null], creates: [Object, Null], sideEffects: SideEffects(reads anything; writes anything), throws: may).
Info: Resolved 959 native elements used, 0 native elements dead.

especially "#[#].apply(#, #)"

Not sure - in terms of the actual warning about Native types, I wouldn't worry about that. Apps of your size commonly get this warning. We are indicating that we couldn't tree-shake much of the DOM (our native types) because somewhere in your program there is something that makes it hard for us to figure out what is actually used. Small apps benefit more from addressing this warning because dart:html is big compared to their app size (it could double they app size if they are really small!).

In terms of .apply, maybe. The error is referring to Function.apply, which is a very dynamic feature and very hard for the compiler to work with in general, so it might be worth studying whether that's what causing us so much trouble during inference.

I know you have this one use here: https://github.com/Workiva/over_react/blob/master/lib/src/component_declaration/component_base.dart#L638. Would it be possibel to test what happens if you didn't have it? For example, if you could ensure that you only have ReactComponentFatoryProxys, could you remove the else branch and see how much this changes?

I removed that instance of .apply. It still crashes unless skipping inference but it did shave a few seconds off the ssa steps for a total of about 32s faster. (total of 843s before, 811s with just that else block removed)
Here's the ssa section diff
image

I thought I'd log out some info (every hundred) in the refine method in inferrer_engine.dart since this is where it crashes with OOM. This is right before it crashes on 2.6-dev.8.1:

number of times through while loop: 6,608,850
overallRefineCount: 3,582,150
DynamicCallSiteTypeInformation: 2,707,508

Is there anything you can do in this method to handle things in batches? To finalize a chunk and release memory?

  void refine() {
    progress.startPhase();
    num count = 0;
    num dynamics = 0;
    while (!workQueue.isEmpty) {
      count++;
      if (count % 50 == 0) {
        print('times through while loop: $count');
      }
      progress.showProgress('Inferred ', overallRefineCount, ' types.');
      TypeInformation info = workQueue.remove();
      AbstractValue oldType = info.type;
      if (info is DynamicCallSiteTypeInformation) {
        dynamics++;
        if (dynamics % 50 == 0) {
          print('DynamicCallSiteTypeInformation: $dynamics');
        }
      }
      AbstractValue newType = info.refine(this);
      // Check that refinement has not accidentally changed the type.
      assert(oldType == info.type);
      if (info.abandonInferencing) info.doNotEnqueue = true;
      if ((info.type = newType) != oldType) {
        overallRefineCount++;
        info.refineCount++;
        if (overallRefineCount % 50 == 0) {
          print('overallRefineCount: $overallRefineCount');
        }  
        if (info.refineCount > MAX_CHANGE_COUNT) {
          //if (debug.ANOMALY_WARN) {
            print("ANOMALY WARNING: max refinement reached for $info");
          //}
          info.giveUp(this);
          info.type = info.refine(this);
          info.doNotEnqueue = true;
        }
        workQueue.addAll(info.users);
        if (info.hasStableType(this)) {
          info.stabilize(this);
        }
      }
    }
  }

and I only got 1 ANOMALY (before it crashed)
ANOMALY WARNING: max refinement reached for Parameter local(HashBiMap.remove#key) Union([exact=ProgressItem], [null|exact=JSString], [null|exact=StandardAlert], [subclass=JSNumber])

Did the same thing in 2.4.1 where our app compiles without crashing
overallRefineCount: 7,494,200
times through while loop: 4,660,000
DynamicCallSiteTypeInformation: 2,038,900
and I'm seeing LOTS of ANOMALYs.

by any chance are the numbers in 2.4.1 swapped (refine count 4.6M and iterations 7.5M?) If not, it seems quite odd that we do more iterations in 2.6.*, given that your program is the same internally.

If they were swapped, that seems to indicate that we hit the OOM issue when we are 80% of the way. That seems like all the sudden the memory usage in dart2js changed dramatically between 2.4.1 and now. Two theories behind that could be: a change on our earlier stages (unlikely), a regression in the VM in terms of memory usage.


Is there anything you can do in this method to handle things in batches? To finalize a chunk and release memory?

Nothing immediate, unfortunately. We've been taking baby steps towards improving things like this. For example, we want to try propagating information local to a method once (and do this modularly), and only do this worklist algorithm for propagating information across methods. This is however a big refactor of what this algorithm does today. To be honest, it is more likely that we'll replace the entire algorithm given how much of it came from modeling Dart 1 and that we believe modeling Dart 2 directly will yield better results today.


I didn't get a chance to put together some CLs for you today, but I wanted to try instrumenting the algorithm to gather a few more numbers. In case you have a chance to explore it before I do, some of the ideas I wanted to try are:

  • collect statistics about which methods are being visited over and over again. (both in terms of what is the enclosing method of a DynamicTypeInformation node, and in terms of what is the target of those calls)
  • determine whether some methods have a disproportionate number of incoming edges. For example, long ago I was playing with some logic to track edges into parameter nodes because they can grow quickly for dynamic targets with multiple definitions that may be a target across many call sites in the program. We could try that patch with 2.4.1 (it only reports the data at the end of inference) or collect similar data differently throughout the algorithm so we can get it before crashing.

Another idea here could be to set up a git-bisect to try to detect if there is one change that is causing this spike in memory usage between 2.4.1 and 2.5.0, maybe it is something we can recover back?

I think the fragment of JavaScript containing 'apply' comes from callMethod
in dart:js. As Siggi said, this warning has a limited impact in large
programs.

On Tue, Oct 22, 2019, 21:23 sigmundch notifications@github.com wrote:

Not sure - in terms of the actual warning about Native types, I wouldn't
worry about that. Apps of your size commonly get this warning. We are
indicating that we couldn't tree-shake much of the DOM (our native types)
because somewhere in your program there is something that makes it hard for
us to figure out what is actually used. Small apps benefit more from
addressing this warning because dart:html is big compared to their app
size (it could double they app size if they are really small!).

In terms of .apply, maybe. The error is referring to Function.apply,
which is a very dynamic feature and very hard for the compiler to work with
in general, so it might be worth studying whether that's what causing us so
much trouble during inference.

I know you have this one use here:
https://github.com/Workiva/over_react/blob/master/lib/src/component_declaration/component_base.dart#L638.
Would it be possibel to test what happens if you didn't have it? For
example, if you could ensure that you only have ReactComponentFatoryProxys,
could you remove the else branch and see how much this changes?


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/dart-lang/sdk/issues/27883?email_source=notifications&email_token=AAQC3FMMRYIRYUDB763SDFTQP5OMRA5CNFSM4CXLRDQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEB7CG6A#issuecomment-545137528,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAQC3FMNSMGPLUZHFGHA2ATQP5OMRANCNFSM4CXLRDQQ
.

I don't believe the numbers are switched. I'm currently digging to find out exactly where the memory increase started crashing us.
To experiment on your own, you can use the web directory in over_react or this project https://github.com/kealjones-wk/ap_over_react

Compilation succeeds on 2.5-dev.2.1 and OOM on 2.5 stable release. So the change where it started crashing should be in https://github.com/dart-lang/sdk/compare/2.5.0-dev.2.1...2.5.0

Another thought: consider running dart2js with this environment variable defined first:
DART_VM_OPTIONS=--print_metrics dart2js foo.dart -o out.js --verbose -O3

This will make the underlyng VM emit heap metrics statistics at the end of the run. This may be useful to detect if we had one big regression in one change or if the regression was gradual and happened over several changes.

@robbecker-wf - after you mentioned that the failure was between -dev.2.1 and -dev.3.0, I started looking at our changes, but then noticed an anomaly: it appears that 2.5.0 stable branched off around 2.2.0-dev.2.0 (it seems 2.2.0-dev.3.0 did not make it into the stable channel for some reason)

I'll try to find out why the numbers are off, but this means we should narrow the bisecting strictly between 2.5.0-dev.2.1 and 2.5.0

@robbecker-wf - another random question. It seems like you are running on a Mac, correct? Can you verify whether this OOM issue replicate on linux host machines?

I was looking through the list of changes on the VM within the affected range, and I see several relating to MacOS and Catalina support. This might be a red herring, but it may be worth ruling it out.

It seems like you are running on a Mac, correct? Can you verify whether this OOM issue replicate on linux host machines?

Yes, I'm on Mac. It happens in our CI also which is debian based.

We've discovered the commit where we started crashing with OOM. https://github.com/dart-lang/sdk/commit/13005eec86c40d8a2d804b4fe41b2fdded3aae22
I'm not sure if this max memory check is OS / platform specific... but reverting it (locally) allows our app to compile on newer versions of the SDK (>=2.5.0) and adding this check into 2.4.1 causes the build to fail with OOM.
Basically, we aren't actually running out of memory in any of the Dart SDK versions, but this ceiling is making the build exit thinking it's about the run out of memory.

I might suggest either reverting it, or making it configurable via a DART_VM_OPTIONS

Thanks so much @robbecker-wf for all the hard work investigating and bisecting this!

@a-siva @rmacnak-google - please take a look at the last comment above. Can that change be reverted or made configurable?

The heap size limit is already configurable with --old-gen-heap-size=size-in-MB. However, a program that exceeds the default value is very likely to soon hit Linux's page table limit.

I'm worried that other large apps may start hitting this issue too. Is there anything we can be do to make that less likely? For example, could the limit be set a bit different so we continue to support apps that didn't run OOM before? Or could we make the OOM error message also indicate how to do change the limit, so those that hit the issue can try?

edited: sorry I had a redundant question from an earlier draft and didn't edit it properly

Tried running build with DART_VM_OPTIONS="--print_metrics" flag and discovered that heap.old.capacity.max in 2.4.1 resolved to 20.377 GB. We had concern that with newer version of sdk that number will go up, but it actually went down to 19.517 GB. Applying == fix helped reduce another ~4.5GB (wow). This brought us down to 15.000 GB for 2.5.0 which is right on the edge, but our app successfully builds.

Thanks @olesiathoms-wk for the details, that's really great to see.

After doing some testing with internal apps I noticed that my patch for == needs some additional work to avoid regressions on code quality. I expect the extra change will continue to be on the same order of memory savings, but I'll want to confirm what the final number looks like once we land it.

@rmacnak-google can you comment a bit more on how/when the system's page table limit kicks in? Is this platform specific? Desktop/mobile or all? Could the limit be more adaptive? Seeing the numbers from @olesiathoms-wk concern me as they will quickly hit the limit as their application grows.

@sigmundch Is that a different CL I should track or will watching https://dart-review.googlesource.com/c/sdk/+/122743 be the best place?

Most likely I'll update the same CL, but if not I can post a new CL here.

Just wanted to let you know that we had several interruptions the last couple days so I have not made much progress on it yet.

It seems the limit on the number of mappings is Linux specific. It is configurable and controlled by Kernel parameter vm.max_map_count which is by default set to 2^16. AFAIK Windows does not have such a limit. I could not find any information about Mac - but maybe it also does not have any limit on the number of mappings.

So it seems we could change that code to lower the default memory limit only on Linux, while bumping it back up on other OSes.

Meanwhile you can simply pass --old-gen-heap-size to the VM when running dart2js on Mac to avoid OOM.

If you ever need to build on Linux you will need to use --old-gen-heap-size in conjunction with bumping vm.max_map_count using sysctl.

Thanks @mraleph for investigating on different platforms! That's helpful information!

@robbecker-wf after tracking a few issues on our internal apps, we had to practically rewrite how we'd support == in the global inference. I'm still working through a few additional details here internally, so I'm not ready yet to land it. However, I want to verify that this new approach continues to work well on your app in terms of memory usage in the compiler. Could you give this a try and let us know?

Here is the patch to try https://dart-review.googlesource.com/c/sdk/+/123980

Trying it now

We tried setting both --old-gen-heap-size==30000000 and vm.max_map_count to 262144 and still got OOM. Meaning I don't think setting the max map count is sufficient to get past the ceiling that was added in https://github.com/dart-lang/sdk/commit/13005eec86c40d8a2d804b4fe41b2fdded3aae22
This is on Linux w/ 16Gb physical and 18Gb swap stock 2.6-dev.8.2.

@sigmundch I verified that the latest on your == patch (https://dart-review.googlesource.com/c/sdk/+/123980) still shows the memory reduction to 15.4GB and reduced inference for == methods for us.

Thanks!

We tried setting both --old-gen-heap-size==30000000 and vm.max_map_count to 262144 and still got OOM.

The flag is in megabytes. Try --old-gen-heap-size=30000. The commit just changes default value for this flag - so setting it explicitly can overwrite it.

This just in: We started hitting the memory/page table limit in Linux on 2.4.1. setting both --old-gen-heap-size==30000000 and vm.max_map_count to 262144 on Dart 2.4.1 does in fact not crash where it used to. I feel like this should work on 2.6 as well if it weren't for the ceiling added in https://github.com/dart-lang/sdk/commit/13005eec86c40d8a2d804b4fe41b2fdded3aae22

@robbecker-wf I want to make sure that we are on the same page. You need to use --old-gen-heap-size=30000 not --old-gen-heap-size=30000000 (number too large) and not --old-gen-heap-size==30000 (== instead of =).

@mraleph Yes, thank you. The double equals was a typo on my part (on github). I've also updated to MB instead of bytes. I can verify the effectiveness again if you think the flag would be ignored if too large? Currently using --old-gen-heap-size=32000

@robbecker-wf I was finally able to land the changes modeling == in e65d95e8c869a6ef26a5fdb6951267ea06c3e6fe. It took us a bit longer than I wanted to get this done (there were several corner cases to work through to ensure we didn't regress on code quality and we had to validate this on several of our large internal apps.)

The technique we use here could be extended one day to dynamically handle cases where the number of edges is too large. At this time, however, we only use the technique for == where it mattered the most. Before we invest time in generalizing the technique, we should assess whether it will make a difference.

When you have a chance, I'd be curious to see if adding some of the methods that you saw from the instrumented compile (like addAll) changes the overall memory and time spent in this phase. Now that this logic is in, we can test that out fairly easily by checking for those method names here:

https://github.com/dart-lang/sdk/blob/e65d95e8c869a6ef26a5fdb6951267ea06c3e6fe/pkg/compiler/lib/src/inferrer/inferrer_engine.dart#L1166

@sigmundch Fantastic! Looks like we just missed the 2.6.0 release. I suppose this would make 2.6.1?
I'll give adding other methods a try. Is it just adding the other method names in inferrer_engine or are there other changes needed?

Also, wow. That sure did escalate from basically a 1-liner fix. Thank you so much for putting in the effort to get this through!

Looks like we just missed the 2.6.0 release. I suppose this would make 2.6.1?

Correct, we would have missed the cut last week too given that we were already in the stabilization period. I submitted a merge request for the next stable release. Meanwhile, it should be visible in the next dev channel release.

Is it just adding the other method names in inferrer_engine or are there other changes needed?

Only that one line should be enough. We included == in a couple invariant assertions (see validCallType and the assert in CallSiteTypeInformation) that could technically fail if you don't update them, but because you'll be running vm with assertions disabled by default, then I believe you won't have to fix them.

I can confirm that vm.max_map_count=262144 and setting --old-gen-heap-size=32000 as a DART_VM_OPTIONS does help our (Linux) CI on 2.4.1.
These settings still do not help for >=2.5. It still reports the graceful out of memory even though it isn't (The builders have been upgraded to 32GB and dart2js still reports OOM)

Maybe in isolate.cc it is not actually using FLAG_old_gen_heap_size (because is_service_or_kernel_isolate is true) but instead using kDefaultMaxOldGenHeapSize when running dart2js .. meaning I can't actually override that limit afterall?

  Heap::Init(result,
             is_vm_isolate
                 ? 0  // New gen size 0; VM isolate should only allocate in old.
                 : FLAG_new_gen_semi_max_size * MBInWords,
             (is_service_or_kernel_isolate ? kDefaultMaxOldGenHeapSize
                                           : FLAG_old_gen_heap_size) *
                 MBInWords);

@robbecker-wf you can't override heap limit for internal isolates used by the VM. this should not matter because dart2js compiler itself is running within main isolate.

So, what I'm hearing is >= Dart 2.5 I can't use more than 16GB on Linux even if I have more than that in physical memory?

@robbecker-wf that's not at all what I said - I would expect --old_gen_heap_size to have exactly the same effect on Dart 2.5 as it had before.

Do you have an exact error message you are getting?

Sorry. Based on Ryan's comment https://github.com/dart-lang/sdk/issues/27883#issuecomment-546496952 and following discussion I was under the impression that setting the 2 settings would effectively give dart2js the ability to access more than 16GB.

Here's the OOM error

[WARNING] No actions completed for 15.1s, waiting on:
- build_web_compilers:entrypoint on web/main.dart
[SEVERE]build_web_compilers:entrypoint on web/main.dart: Dart2Js finished with:
Exhausted heap space, trying to allocate 131104 bytes.
packages/licensing_frugal/licensing_frugal_v3_roles/f_role_package_response.dart:123:10:
Internal Error: The compiler crashed when compiling this element.
String toString() {
^
The compiler is broken.
When compiling the above element, the compiler crashed. It is not
possible to tell if this is caused by a problem in your program or
not. Regardless, the compiler should not crash.
The Dart team would greatly appreciate if you would take a moment to
report this problem at http://dartbug.com/new.
Please include the following information:
* the name and version of your operating system,
* the Dart SDK build number (2.5.2), and
* the entire message you see here (including the full stack trace
below as well as the source location above).
The compiler crashed: Out of Memory
#0      _CompactLinkedHashSet._init (dart:collection-patch/compact_hash.dart)
#1      _CompactLinkedHashSet._rehash (dart:collection-patch/compact_hash.dart:505:7)
#2      _CompactLinkedHashSet.add (dart:collection-patch/compact_hash.dart:555:7)
#3      InferrerEngineImpl.updateSideEffects (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:203:30)
#4      InferrerEngineImpl.registerCalledSelector.<anonymous closure> (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:1142:7)
#5      SetMixin.forEach (dart:collection/set.dart:146:30)
#6      Setlet.forEach (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/util/setlet.dart)
#7      InferrerEngineImpl.registerCalledSelector (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:1141:47)
#8      KernelTypeGraphBuilder._handleDynamic (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/builder_kernel.dart:965:22)
#9      KernelTypeGraphBuilder.visitMethodInvocation (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/builder_kernel.dart:994:12)
#10     MethodInvocation.accept (package:kernel/ast.dart:3071:36)
#11     KernelTypeGraphBuilder.visitExpressionStatement (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/builder_kernel.dart:256:24)
#12     ExpressionStatement.accept (package:kernel/ast.dart:4128:35)
#13     KernelTypeGraphBuilder.visitBlock (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/builder_kernel.dart:256:24)
#14     Block.accept (package:kernel/ast.dart:4156:35)
#15     KernelTypeGraphBuilder.visitFunctionNode (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/builder_kernel.dart:256:24)
#16     FunctionNode.accept (package:kernel/ast.dart:2322:30)
#17     KernelTypeGraphBuilder.run (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/builder_kernel.dart:256:24)
#18     InferrerEngineImpl.computeMemberTypeInformation (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:828:20)
#19     InferrerEngineImpl.analyze.<anonymous closure> (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:760:14)
#20     CompilerDiagnosticReporter.withCurrentElement (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/compiler.dart:751:15)
#21     InferrerEngineImpl.analyze (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:759:14)
#22     InferrerEngineImpl.analyzeAllElements.<anonymous closure> (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:699:38)
#23     Iterable.forEach (dart:core/iterable.dart:277:30)
#24     InferrerEngineImpl.analyzeAllElements (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:694:22)
#25     InferrerEngineImpl.runOverAllElements (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/inferrer_engine.dart:509:5)
#26     TypeGraphInferrer.analyzeMain (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/type_graph_inferrer.dart:68:14)
#27     GlobalTypeInferenceTask.runGlobalTypeInference.<anonymous closure> (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/types.dart:181:41)
#28     CompilerTask.measure (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/common/tasks.dart:64:51)
#29     GlobalTypeInferenceTask.runGlobalTypeInference (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/inferrer/types.dart:174:12)
#30     Compiler.performGlobalTypeInference (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/compiler.dart:377:28)
#31     Compiler.compileFromKernel.<anonymous closure> (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/compiler.dart:420:13)
#32     CompilerTask.measureSubtask (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/common/tasks.dart:179:35)
#33     Compiler.compileFromKernel (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/compiler.dart:414:14)
#34     Compiler.runInternal (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/compiler.dart:255:13)
#35     _FutureListener.handleValue (dart:async/future_impl.dart:814:22158)
#36     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#37     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#38     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#39     KernelLoaderTask.load.<anonymous closure> (file:///tmp/tmpAGuAto/dart-2.5.2/dart/pkg/compiler/lib/src/kernel/loader.dart)
#40     _FutureListener.handleValue (dart:async/future_impl.dart:814:22158)
#41     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#42     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#43     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
...
...
#68     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#69     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#70     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#71     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#72     withCrashReporting (package:front_end/src/fasta/crash.dart)
#73     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#74     _rootRunUnary (dart:async/zone.dart:1132:38)
#75     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#76     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#77     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#78     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#79     KernelTarget.buildComponent.<anonymous closure> (package:front_end/src/fasta/kernel/kernel_target.dart)
#80     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#81     _rootRunUnary (dart:async/zone.dart:1132:38)
#82     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#83     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#84     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#85     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#86     Loader.buildBodies (package:front_end/src/fasta/loader.dart)
#87     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#88     _rootRunUnary (dart:async/zone.dart:1132:38)
#89     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#90     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#91     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#92     _completeOnAsyncReturn (dart:async-patch/async_patch.dart:313:7343)
#93     SourceLoader.buildBody (package:front_end/src/fasta/source/source_loader.dart)
#94     _asyncThenWrapperHelper.<anonymous closure> (dart:async-patch/async_patch.dart)
#95     _rootRunUnary (dart:async/zone.dart:1132:38)
#96     _FutureListener.handleValue (dart:async/future_impl.dart:814:10681)
#97     Future._propagateToListeners.handleValueCallback (dart:async/future_impl.dart:678:45)
#98     Future._propagateToListeners (dart:async/future_impl.dart:707:32)
#99     Future._asyncComplete.<anonymous closure> (dart:async/future_impl.dart:522:5)
#100    _rootRun (dart:async/zone.dart:1124:13)
#101    _CustomZone.bindCallback.<anonymous closure> (dart:async/zone.dart:1021:19)
#102    _microtaskLoop (dart:async/schedule_microtask.dart:41:21)
#103    _startMicrotaskLoop (dart:async/schedule_microtask.dart:50:5)
#104    _runPendingImmediateCallback (dart:isolate-patch/isolate_patch.dart:116:13)
#105    _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:173:5)
[INFO] Running build completed, took 10m 29s
[INFO] Caching finalized dependency graph completed, took 3.3s
[SEVERE] Failed after 10m 32s
[SEVERE] FailureType: 1

@robbecker-wf - I wonder how you are providing the heap arguments when using webdev. I ask because I believe webdev is launching subprocesses internally, so those flags need to be plumb through all the way. I'm not as familiar with the configuration with webdev so I'm not sure if they expose a way to provide the flags in your configuration files or if more work is needed.

That's a good point. In our Dockerfile I just added a
ENV DART_VM_OPTIONS "--print_metrics --old-gen-heap-size=32000"
Here's the command that's running.
pub global run webdev build --output web:build/web -- --delete-conflicting-outputs
Perhaps I should be using build runner or dart2js directly?

I have noticed that the --print-metrics only seem to print one block vs many blocks of info when running dart2js directly .. so maybe those aren't getting passed into dart2js.

Trying pbr and dart2js directly.. will report back.

I wonder how you are providing the heap arguments when using webdev. I ask because I believe webdev is launching subprocesses internally, so those flags need to be plumb through all the way.

That did it. Apparently the DART_VM_OPTIONS do not tunnel through to internal processes. I ran dart2js directly and was able to compile the app on Linux in CI using Dart 2.6. It's not as convenient but it works.
Is there a sub-ticket here to somehow pass along or pick up DART_VM_OPTIONS when launching subprocesses?

Is there a sub-ticket here to somehow pass along or pick up DART_VM_OPTIONS when launching subprocesses?

Yes, I've created one at https://github.com/dart-lang/webdev/issues/814. I asked @jakemac53 and he doesn't believe this is currently supported yet.

🍾 This issue can finally be closed now that the fixes in https://github.com/dart-lang/sdk/issues/40217 have been merged and released as 2.7.1

Woo hoo!

No, peak RSS is still 890MB.

Can you give more details @rmacnak-google ? What's peak RSS?
Update: Googled it. Resident Set Size?

Was this page helpful?
0 / 5 - 0 ratings