Graal: RC15 G1GC OSX crashes

Created on 13 Apr 2019 · 13 Comments · Source: oracle/graal

With RC15, which defaults to -XX:+UseJVMCINativeLibrary, we're seeing crashes when using -XX:+UseG1GC on some of our OSX machines, including Mojave and High Sierra.

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000010ee4dc26, pid=24193, tid=0x0000000000005103
#
# JRE version: OpenJDK Runtime Environment (8.0_202-b08) (build 1.8.0_202-20190206132754.buildslave.jdk8u-src-tar--b08)
# Java VM: OpenJDK GraalVM CE 1.0.0-rc15 (25.202-b08-jvmci-0.58 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.dylib+0x24dc26]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/.../hs_err_pid24193.log
Compiled method (JVMCI)   18800 11827       4       org.graalvm.compiler.truffle.runtime.OptimizedCallTarget::callRoot (68 bytes)
 total in heap  [0x0000000115d79ed0,0x0000000115d7aa08] = 2872
 relocation     [0x0000000115d7a000,0x0000000115d7a048] = 72
 main code      [0x0000000115d7a060,0x0000000115d7a340] = 736
 stub code      [0x0000000115d7a340,0x0000000115d7a360] = 32
 oops           [0x0000000115d7a360,0x0000000115d7a3e8] = 136
 metadata       [0x0000000115d7a3e8,0x0000000115d7a690] = 680
 scopes data    [0x0000000115d7a690,0x0000000115d7a818] = 392
 scopes pcs     [0x0000000115d7a818,0x0000000115d7a8e8] = 208
 dependencies   [0x0000000115d7a8e8,0x0000000115d7a990] = 168
 handler table  [0x0000000115d7a990,0x0000000115d7a9c0] = 48
 nul chk table  [0x0000000115d7a9c0,0x0000000115d7a9e0] = 32
 JVMCI data     [0x0000000115d7a9e0,0x0000000115d7aa08] = 40

These crashes no longer happen with -XX:-UseJVMCINativeLibrary.

hashtag-smashtag
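
Since the crash depends on the combination of G1 and -XX:+UseJVMCINativeLibrary, it can help to confirm what a given JVM is actually running with. The following is a minimal sketch, not something from the original report: the class and method names are hypothetical, and it only uses the standard java.lang.management API. It assumes that G1 collector beans have names starting with "G1 " (as in "G1 Young Generation") and that the last occurrence of a repeated -XX flag is the one that takes effect.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports whether G1 is active and whether the
// UseJVMCINativeLibrary flag was set explicitly on the command line.
public class GcConfigCheck {

    /** True if any reported collector name looks like a G1 collector. */
    public static boolean usesG1(List<String> collectorNames) {
        for (String name : collectorNames) {
            if (name.startsWith("G1 ")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Returns "+" or "-" for the last explicit -XX:[+-]UseJVMCINativeLibrary
     * argument, or null if the flag was not passed (the JVM default applies).
     */
    public static String explicitLibgraalSetting(List<String> jvmArgs) {
        String setting = null;
        for (String arg : jvmArgs) {
            if (arg.equals("-XX:+UseJVMCINativeLibrary")) {
                setting = "+";
            } else if (arg.equals("-XX:-UseJVMCINativeLibrary")) {
                setting = "-";
            }
        }
        return setting;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Collectors: " + names);
        System.out.println("G1 in use: " + usesG1(names));
        System.out.println("Explicit UseJVMCINativeLibrary: " + explicitLibgraalSetting(jvmArgs));
    }
}
```

Running main inside the affected process (or a test with the same VM options) prints the collector names and the explicit flag setting. A null result means the flag was not passed, so the RC15 default applies.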

All 13 comments

Would you be able to attach hs_err_pid24193.log to this issue? That may offer some clues as to what is going wrong. Even more helpful would be steps to reproduce the crash.

Using the default collector should be a workaround until this is resolved.

dougxc on 13 Apr 2019

@dougxc Unfortunately I no longer have that exact file, but I've attached another one that occurred during a test I was running.
hs_err_pid21035_redacted.log

hashtag-smashtag on 15 Apr 2019

Here is another one, which occurred while running a local version of our server in debug mode, accompanied by this on the console:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000000011004dc26, pid=30930, tid=0x0000000000005403
#
# JRE version: OpenJDK Runtime Environment (8.0_202-b08) (build 1.8.0_202-20190206132754.buildslave.jdk8u-src-tar--b08)
# Java VM: OpenJDK GraalVM CE 1.0.0-rc15 (25.202-b08-jvmci-0.58 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.dylib+0x24dc26]  G1ParScanThreadState::copy_to_survivor_space(InCSetState, oopDesc*, markOopDesc*)+0x44
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/...tomcat/bin/hs_err_pid30930.log
Compiled method (JVMCI)  210906 20315       4       org.graalvm.compiler.truffle.runtime.OptimizedCallTarget::callRoot (68 bytes)
 total in heap  [0x0000000119cd49d0,0x0000000119da25e8] = 842776
 relocation     [0x0000000119cd4b00,0x0000000119cd6ea0] = 9120
 main code      [0x0000000119cd6ea0,0x0000000119d040c0] = 184864
 stub code      [0x0000000119d040c0,0x0000000119d04880] = 1984
 oops           [0x0000000119d04880,0x0000000119d05958] = 4312
 metadata       [0x0000000119d05958,0x0000000119d07c10] = 8888
 scopes data    [0x0000000119d07c10,0x0000000119d9a880] = 601200
 scopes pcs     [0x0000000119d9a880,0x0000000119da0070] = 22512
 dependencies   [0x0000000119da0070,0x0000000119da0c58] = 3048
 handler table  [0x0000000119da0c58,0x0000000119da18b8] = 3168
 nul chk table  [0x0000000119da18b8,0x0000000119da25b8] = 3328
 JVMCI data     [0x0000000119da25b8,0x0000000119da25e8] = 48

hs_err_pid30930_redacted.log

hashtag-smashtag on 15 Apr 2019
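
Both console reports above name the same problematic frame offset, libjvm.dylib+0x24dc26, even though the absolute pc values differ. That suggests (but does not prove) that both crashes hit the same location, which the second report resolves to G1ParScanThreadState::copy_to_survivor_space. A small sketch for extracting and comparing that line across several hs_err files follows. The class name and the regular expression are assumptions based only on the two lines quoted in this thread, not a complete parser for the hs_err format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for comparing "Problematic frame" lines from crash reports.
public class ProblematicFrame {
    // Matches e.g. "# V  [libjvm.dylib+0x24dc26]  Symbol(args)+0x44";
    // the symbol part after the bracket is optional.
    private static final Pattern FRAME = Pattern.compile(
            "^#\\s+(\\S)\\s+\\[([^\\]]+)\\+(0x[0-9a-fA-F]+)\\]\\s*(.*)$");

    public final char frameType;   // 'V' marks a VM frame in these reports
    public final String library;   // e.g. "libjvm.dylib"
    public final long offset;      // offset into the library
    public final String symbol;    // empty if the report had no symbol

    private ProblematicFrame(char frameType, String library, long offset, String symbol) {
        this.frameType = frameType;
        this.library = library;
        this.offset = offset;
        this.symbol = symbol;
    }

    /** Parses one report line; returns null if it does not look like a frame line. */
    public static ProblematicFrame parse(String line) {
        Matcher m = FRAME.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        long offset = Long.parseLong(m.group(3).substring(2), 16);
        return new ProblematicFrame(m.group(1).charAt(0), m.group(2), offset, m.group(4));
    }

    /** True if both frames point at the same offset in the same library. */
    public boolean sameLocation(ProblematicFrame other) {
        return other != null && library.equals(other.library) && offset == other.offset;
    }
}
```

Comparing by library and offset rather than by pc sidesteps the fact that the library is loaded at a different base address in each run.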

Do you know if the crashes still occur without libgraal (i.e., -XX:-UseJVMCINativeLibrary)?

dougxc on 15 Apr 2019

These crashes only occur when using libgraal.

hashtag-smashtag on 15 Apr 2019

Sorry, I just saw that you stated that earlier. Do you only get the crashes with G1, and not with the parallel collector (the default on JDK 8)?

dougxc on 15 Apr 2019

Correct. We see the crashes with G1, but not with the default collector.

hashtag-smashtag on 15 Apr 2019

@hashtag-smashtag where is this info coming from:

Compiled method (JVMCI)  210906 20315       4       org.graalvm.compiler.truffle.runtime.OptimizedCallTarget::callRoot (68 bytes)

I don't see it in the hs_err files.

To get more details, it would be great if you can add these options:

-XX:+UnlockDiagnosticVMOptions -XX:+VerifyBeforeGC -XX:+VerifyAfterGC -Dgraal.PrintCompilation=true -Dgraal.TraceTruffleCompilation=true

We need to find out which compilation is not producing the right GC info/barriers and why it only happens with libgraal and not JavaGraal. Note that these flags will also slow down execution considerably.

dougxc on 16 Apr 2019
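
For a test harness that launches the server in a child JVM, the options requested above could be assembled roughly as follows. This is only a sketch: the class name, the java binary path and the main class are hypothetical placeholders, and only the flags themselves come from this thread. The -D options are system properties for the launched JVM, so they have to appear before the main class name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds a java command line with the diagnostic
// options suggested in this thread. It does not launch anything itself.
public class DiagnosticLaunch {

    public static List<String> command(String javaBinary, String mainClass, List<String> appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBinary);
        // Configuration under which the crash was reported.
        cmd.add("-XX:+UseG1GC");
        // Diagnostic options, in the same order as given in the thread.
        cmd.add("-XX:+UnlockDiagnosticVMOptions");
        cmd.add("-XX:+VerifyBeforeGC");
        cmd.add("-XX:+VerifyAfterGC");
        cmd.add("-Dgraal.PrintCompilation=true");
        cmd.add("-Dgraal.TraceTruffleCompilation=true");
        // JVM options end here; everything after the main class goes to the application.
        cmd.add(mainClass);
        cmd.addAll(appArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // "com.example.ServerMain" is a placeholder, not a class from the report.
        List<String> cmd = command("java", "com.example.ServerMain", new ArrayList<String>());
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder, with the child's console output redirected to a file so that the compilation trace can be attached alongside the hs_err log.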

When I run the server, it often doesn't crash in the same way. Here are two hs_err files from runs with the options you specified:

hs_err_pid64055_redacted.log

hs_err_pid63727_redacted.log

hashtag-smashtag on 16 Apr 2019

There should also be a bunch of console output from the -Dgraal.PrintCompilation=true -Dgraal.TraceTruffleCompilation=true options. Is it possible for you to attach a (redacted) copy of that?

At this point, we're going to have to try and reproduce locally. In the meantime, I would suggest not using G1 with GraalVM for your workload.

dougxc on 16 Apr 2019

It was easier to reproduce the crash in an integrated test (with the VM options you provided) and redact the console output and logs:

hs_err_pid78925_redacted.log

console for hs_err_pid78925.log

Hope those help.

hashtag-smashtag on 17 Apr 2019

This should be fixed by https://github.com/graalvm/graal-jvmci-8/commit/a53131efce4e1c7d4e5aa3eefef504fee750e9f3 which did not make it into GraalVM RC16. It will be in the subsequent GraalVM release.

dougxc on 25 Apr 2019

Please re-open if this is not fixed in the subsequent GraalVM release.

dougxc on 5 May 2019