TruffleRuby 20.1.0-dev-20200204_0342
The files in the repo
https://github.com/petenorth/bundle-install-memory.git
attempt to install the rails gem into an image. In a memory constrained system
it falls over. Memory can be constrained either via the Docker desktop preferences or via the a flag --memory=4g .
The test case is
docker build -f Dockerfile-graalvm -t graalvm:latest .
docker build -f Dockerfile-rails -t rails:latest .
The second build fails in memory constrained systems. In unconstrained systems (reported by @eregon) it succeeds with time output something like
176.33user 20.10system 1:42.24elapsed 192%CPU (0avgtext+0avgdata 10311016maxresident)k 1120inputs+174888outputs (35major+15184614minor)pagefaults 0swaps
The value of the maxresident memory here is very large (~10GB)
Thanks for the report.
Probably relates to #1824.
I could reproduce it with --memory=4g on Linux.
Running bundle install with --jvm --vm.Xmx3g (JVM) and --vm.Xmn256m --vm.Xmx3g (Native Image) seem to both work.
Just --vm.Xmn256m or just --vm.Xmx3g still OOM.
The command line looks like this with those flags:
RUN env time ruby --vm.Xmn256m -S bundle install
free -m shows the full host memory in the container:
$ docker run --memory=4g --rm -it b59b3b0bc5cd bash
bash-4.2# free -m
total used free shared buff/cache available
Mem: 31808 8833 9793 1349 13182 21189
Swap: 0 0 0
Maybe Native Image considers it has much memory available than the container has?
That seems to be the case because:
$ ruby --vm.XX:+PrintGCSummary -e0
PrintGCSummary: YoungGenerationSize: 268435456 = 256MB
PrintGCSummary: MinimumHeapSize: 536870912 = 512 MB
PrintGCSummary: MaximumHeapSize: 26683254320 = ~25GB (80% of the 32GB I have on this laptop)
...
If we trick the Ruby launcher into thinking we're running Bundler, Xmn is bumped (as expected):
$ ruby --vm.XX:+PrintGCSummary -S bundle --version
Bundler version 1.17.2
PrintGCSummary: YoungGenerationSize: 1073741824 = 1GB
PrintGCSummary: MinimumHeapSize: 2147483648 = 2GB
PrintGCSummary: MaximumHeapSize: 26683254320 = ~25GB
So Xms changes, but Xmx does not (as expected, Xmn is at most half of the heap).
I think some of my docker commands went wrong yesterday, because now I see that Native Image correctly detects the maximum memory size of the container:
$ docker run --memory=4g --rm -it b59b3b0bc5cd bash
bash-4.2# free -m
total used free shared buff/cache available
Mem: 31808 6954 14015 1105 10839 23290
Swap: 0 0 0
bash-4.2# ruby --vm.XX:+PrintGCSummary -e0
PrintGCSummary: YoungGenerationSize: 268435456 = 256MB
PrintGCSummary: MinimumHeapSize: 536870912 = 512MB
PrintGCSummary: MaximumHeapSize: 3435973760 = 3.2GB
It also seems fine during docker build.
@eregon so you don't see any
[truffle] opt fail Gem::Package::TarHeader.from ~/.rubies/truffleruby-20.1.0-dev/lib/mri/rubygems/package/tar_header.rb:102|ASTSize 252 |Reason java.lang.OutOfMemoryError: Garbage-collected heap size exceeded.
java.lang.OutOfMemoryError: Garbage-collected heap size exceeded.
can these be ignored? Can you confirm the bundle install command used in your Dockerfile?
Yes, I noticed one:
[truffle] opt fail Gem::Package::TarHeader.from /opt/graalvm-ce-java8-20.1.0-dev/jre/languages/ruby/lib/mri/rubygems/package/tar_header.rb:102|ASTSize 252 |Reason java.lang.OutOfMemoryError: Garbage-collected heap size exceeded.
java.lang.OutOfMemoryError: Garbage-collected heap size exceeded.
when running RUN env time ruby --vm.Xmn256m --vm.Xmx3g --vm.XX:+PrintGCSummary -S bundle install (which succeeds).
The fact that the compilation should be safe, although obviously it can impact performance.
I think that problem is due to too much inlining and noticed by @bjfish we well.
We're experimenting to fix that one as currently we have a higher setting than the default for inlining.
Yes, I noticed one:
[truffle] opt fail Gem::Package::TarHeader.from /opt/graalvm-ce-java8-20.1.0-dev/jre/languages/ruby/lib/mri/rubygems/package/tar_header.rb:102|ASTSize 252 |Reason java.lang.OutOfMemoryError: Garbage-collected heap size exceeded. java.lang.OutOfMemoryError: Garbage-collected heap size exceeded.when running
RUN env time ruby --vm.Xmn256m --vm.Xmx3g --vm.XX:+PrintGCSummary -S bundle install(which succeeds).The fact that the compilation should be safe, although obviously it can impact performance.
I think that problem is due to too much inlining and noticed by @bjfish we well.
We're experimenting to fix that one as currently we have a higher setting than the default for inlining.
@eregon I thought that bundle install now runs with --engine.Mode=latency meaning that there is _no_ inlining or splitting?
@petenorth You're right. So too much inlining shouldn't be the problem here.
And Gem::Package::TarHeader.from doesn't seem really big (about 29 calls in it and a 17-entries Hash literal): https://github.com/oracle/truffleruby/blob/fdcc3f2ee9ff2e3c8a84c998ca26cb3888d8a0a2/lib/mri/rubygems/package/tar_header.rb#L102-L126
So probably something during bundle install is using all the heap and the compiler really hasn't much left.
We'll keep investigating further.
We found the issue, there was an unintended loop explosion in the node creating the Hash literal when running on Native Image.
I've got a GitHub workflow set up, for the rails bundle install which runs nightly using the truffleruby head
https://github.com/petenorth/bundle-install-memory/actions
so I once a fix gets merged it should be picked up there. Curious as whether you can see the workflow runs under the action tabs in my repo?
Yes I can see it, action runs are public for public repos (just like for TravisCI).
I reran the workload with that fix (locally, not in Docker):
$ env time ruby --vm.Xmn256m --vm.Xmx3g --vm.XX:+PrintGCSummary -S bundle install
...
PrintGCSummary: YoungGenerationSize: 268435456
PrintGCSummary: MinimumHeapSize: 536870912
PrintGCSummary: MaximumHeapSize: 3221225472
PrintGCSummary: AlignedChunkSize: 1048576
PrintGCSummary: CollectedTotalChunkBytes: 44986285048
PrintGCSummary: CollectedTotalObjectBytes: 44392746760
PrintGCSummary: AllocatedNormalChunkBytes: 45518658928
PrintGCSummary: AllocatedNormalObjectBytes: 44917655048
PrintGCSummary: IncrementalGCCount: 146
PrintGCSummary: IncrementalGCNanos: 11769340212
PrintGCSummary: CompleteGCCount: 23
PrintGCSummary: CompleteGCNanos: 11846421316
PrintGCSummary: GCNanos: 23615761528
PrintGCSummary: TotalNanos: 84509171981
PrintGCSummary: GCLoadPercent: 28
164.17user 14.76system 1:24.64elapsed 211%CPU (0avgtext+0avgdata 1152224maxresident)k
552inputs+172336outputs (38major+9858266minor)pagefaults 0swaps
That's just 1.09GB of RSS (vs 10GB of RSS above).
And without an explicit Xmx:
$ env time ruby --vm.XX:+PrintGCSummary -S bundle install
...
PrintGCSummary: YoungGenerationSize: 1073741824
PrintGCSummary: MinimumHeapSize: 2147483648
PrintGCSummary: MaximumHeapSize: 26683254320
PrintGCSummary: AlignedChunkSize: 1048576
PrintGCSummary: CollectedTotalChunkBytes: 44574058624
PrintGCSummary: CollectedTotalObjectBytes: 44262839856
PrintGCSummary: AllocatedNormalChunkBytes: 45331724056
PrintGCSummary: AllocatedNormalObjectBytes: 45010602728
PrintGCSummary: IncrementalGCCount: 35
PrintGCSummary: IncrementalGCNanos: 3817875581
PrintGCSummary: CompleteGCCount: 7
PrintGCSummary: CompleteGCNanos: 3849537282
PrintGCSummary: GCNanos: 7667412863
PrintGCSummary: TotalNanos: 69120499322
PrintGCSummary: GCLoadPercent: 11
154.53user 7.80system 1:09.49elapsed 233%CPU (0avgtext+0avgdata 2434052maxresident)k
40inputs+172352outputs (34major+3607781minor)pagefaults 0swaps
Which is 2.32GB RSS, due to the larger young gen:
$ env time ruby --vm.Xmn256m --vm.XX:+PrintGCSummary -S bundle install
...
PrintGCSummary: YoungGenerationSize: 268435456
PrintGCSummary: MinimumHeapSize: 536870912
PrintGCSummary: MaximumHeapSize: 26683254320
PrintGCSummary: AlignedChunkSize: 1048576
PrintGCSummary: CollectedTotalChunkBytes: 44722296984
PrintGCSummary: CollectedTotalObjectBytes: 44136069224
PrintGCSummary: AllocatedNormalChunkBytes: 45354680064
PrintGCSummary: AllocatedNormalObjectBytes: 44761168800
PrintGCSummary: IncrementalGCCount: 145
PrintGCSummary: IncrementalGCNanos: 11466049817
PrintGCSummary: CompleteGCCount: 23
PrintGCSummary: CompleteGCNanos: 11649559645
PrintGCSummary: GCNanos: 23115609462
PrintGCSummary: TotalNanos: 80220843315
PrintGCSummary: GCLoadPercent: 29
159.45user 14.85system 1:20.57elapsed 216%CPU (0avgtext+0avgdata 1145756maxresident)k
16inputs+172344outputs (34major+9989431minor)pagefaults 0swaps
is again 1.09GB RSS.
@eregon I noticed this recent commit to oracle/graal is it relevant/related?
https://github.com/oracle/graal/commit/6faf23d1b060a6a0afbe2176ab80278a397d9205
it is to do with inlining exploded loops.
@petenorth Yes, that's the fix. I'll make a PR to bump the truffle/graal import.
Fixed in c138bc04b656a09b0e72b37cf839cc82f2bd8e70, the fix will be in 20.0 too.
I'll trigger a nightly build.
Thank you for the report, this was indeed a major bug.
Most helpful comment
I reran the workload with that fix (locally, not in Docker):
That's just 1.09GB of RSS (vs 10GB of RSS above).
And without an explicit Xmx:
Which is 2.32GB RSS, due to the larger young gen:
is again 1.09GB RSS.