openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.7+10)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.20.0, JRE 11 Linux amd64-64-Bit Compressed References 20200416_574 (JIT enabled, AOT enabled)
OpenJ9 - 05fa2d361
OMR - d4365f371
JCL - 838028fc9d based on jdk-11.0.7+10)
OpenJ9 Segmentation Fault
#0: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x85a535) [0x7fb4b7a99535]
#1: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x865350) [0x7fb4b7aa4350]
#2: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x13b87e) [0x7fb4b737a87e]
#3: /opt/java/openjdk/lib/compressedrefs/libj9prt29.so(+0x1ac1a) [0x7fb4bdfa9c1a]
#4: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890) [0x7fb4c03c2890]
#5: /opt/java/openjdk/lib/compressedrefs/libj9vm29.so(+0xba041) [0x7fb4be6df041]
#6: /opt/java/openjdk/lib/compressedrefs/libj9vm29.so(+0xbac83) [0x7fb4be6dfc83]
#7: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x8fff7d) [0x7fb4b7b3ef7d]
#8: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x273e14) [0x7fb4b74b2e14]
#9: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x273fff) [0x7fb4b74b2fff]
#10: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x270b62) [0x7fb4b74afb62]
#11: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x2707f9) [0x7fb4b74af7f9]
#12: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x278bd8) [0x7fb4b74b7bd8]
#13: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x279319) [0x7fb4b74b8319]
#14: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14ab51) [0x7fb4b7389b51]
#15: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14cbfa) [0x7fb4b738bbfa]
#16: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14d979) [0x7fb4b738c979]
#17: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14e321) [0x7fb4b738d321]
#18: /opt/java/openjdk/lib/compressedrefs/libj9prt29.so(+0x1b753) [0x7fb4bdfaa753]
#19: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14fff5) [0x7fb4b738eff5]
#20: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x150598) [0x7fb4b738f598]
#21: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14bf7b) [0x7fb4b738af7b]
#22: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14c472) [0x7fb4b738b472]
#23: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14c51a) [0x7fb4b738b51a]
#24: /opt/java/openjdk/lib/compressedrefs/libj9prt29.so(+0x1b753) [0x7fb4bdfaa753]
#25: /opt/java/openjdk/lib/compressedrefs/libj9jit29.so(+0x14c974) [0x7fb4b738b974]
#26: /opt/java/openjdk/lib/compressedrefs/libj9thr29.so(+0xe326) [0x7fb4be418326]
#27: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fb4c03b76db]
#28: function clone+0x3f [0x7fb4bfccb88f]
Unhandled exception
Type=Segmentation error vmState=0x0005ffff
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007FB4BE6B7ED0 Handler2=00007FB4BDFA99F0 InaccessibleAddress=0000000FF755F640
RDI=0000000000018A00 RSI=00000000007E6DC0 RAX=0000000FF755F640 RBX=0000000000000000
RCX=0000000000000008 RDX=0000000000000000 R8=0000000000000008 R9=00007FB4A1AE5E68
R10=00007FB4BEDF2140 R11=0000000000000000 R12=0000000000018A00 R13=00007FB469D81B08
R14=00000000007E6DC0 R15=00007FB4A1AE5F08
RIP=00007FB4BE6DF041 GS=0000 FS=0000 RSP=00007FB4A1AE5D00
EFlags=0000000000010246 CS=0033 RBP=00000000FF6D7888 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000FF755F640
xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1 65636e6174736e69 (f: 1953721984.000000, d: 2.519686e+180)
xmm2 ffffffffffffffff (f: 4294967296.000000, d: -nan)
xmm3 656a624f2f676e61 (f: 795307648.000000, d: 3.421278e+180)
xmm4 ffffffffffffffff (f: 4294967296.000000, d: -nan)
xmm5 ffffffffffffffff (f: 4294967296.000000, d: -nan)
xmm6 ffffffffffffffff (f: 4294967296.000000, d: -nan)
xmm7 bbb92bb0162116ec (f: 371267296.000000, d: -5.330094e-21)
xmm8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 3b100d062b461a1e (f: 726014464.000000, d: 3.319243e-24)
xmm11 0000000049d70a38 (f: 1238829568.000000, d: 6.120632e-315)
xmm12 000000004689a022 (f: 1183424512.000000, d: 5.846894e-315)
xmm13 0000000047ac082f (f: 1202456576.000000, d: 5.940925e-315)
xmm14 0000000048650dc0 (f: 1214582272.000000, d: 6.000833e-315)
xmm15 0000000046b73e38 (f: 1186414080.000000, d: 5.861665e-315)
Module=/opt/java/openjdk/lib/compressedrefs/libj9vm29.so
Module_base_address=00007FB4BE625000
Method_being_compiled=akka/stream/impl/GraphStageIsland.onIslandReady()V
Target=2_90_20200416_574 (Linux 4.15.0-1077-azure)
CPU=amd64 (2 logical CPUs) (0x1f12ef000 RAM)
----------- Stack Backtrace -----------
(0x00007FB4BE6DF041 [libj9vm29.so+0xba041])
(0x00007FB4BE6DFC83 [libj9vm29.so+0xbac83])
(0x00007FB4B7B3EF7D [libj9jit29.so+0x8fff7d])
(0x00007FB4B74B2E14 [libj9jit29.so+0x273e14])
(0x00007FB4B74B2FFF [libj9jit29.so+0x273fff])
(0x00007FB4B74AFB62 [libj9jit29.so+0x270b62])
(0x00007FB4B74AF7F9 [libj9jit29.so+0x2707f9])
(0x00007FB4B74B7BD8 [libj9jit29.so+0x278bd8])
(0x00007FB4B74B8319 [libj9jit29.so+0x279319])
(0x00007FB4B7389B51 [libj9jit29.so+0x14ab51])
(0x00007FB4B738BBFA [libj9jit29.so+0x14cbfa])
(0x00007FB4B738C979 [libj9jit29.so+0x14d979])
(0x00007FB4B738D321 [libj9jit29.so+0x14e321])
(0x00007FB4BDFAA753 [libj9prt29.so+0x1b753])
(0x00007FB4B738EFF5 [libj9jit29.so+0x14fff5])
(0x00007FB4B738F598 [libj9jit29.so+0x150598])
(0x00007FB4B738AF7B [libj9jit29.so+0x14bf7b])
(0x00007FB4B738B472 [libj9jit29.so+0x14c472])
(0x00007FB4B738B51A [libj9jit29.so+0x14c51a])
(0x00007FB4BDFAA753 [libj9prt29.so+0x1b753])
(0x00007FB4B738B974 [libj9jit29.so+0x14c974])
(0x00007FB4BE418326 [libj9thr29.so+0xe326])
(0x00007FB4C03B76DB [libpthread.so.0+0x76db])
clone+0x3f (0x00007FB4BFCCB88F [libc.so.6+0x12188f])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2020/05/18 08:15:35 - please wait.
JVMDUMP032I JVM requested System dump using '//core.20200518.081535.6.0001.dmp' in response to an event
JVMPORT030W /proc/sys/kernel/core_pattern setting "|/usr/share/apport/apport %p %s %c %d %P %E" specifies that the core dump is to be piped to an external program. Attempting to rename either core or core.152.
JVMDUMP012E Error in System dump: The core file created by child process with pid = 152 was not found. Expected to find core file with name "//core"
JVMDUMP032I JVM requested Java dump using '//javacore.20200518.081535.6.0002.txt' in response to an event
JVMDUMP010I Java dump written to //javacore.20200518.081535.6.0002.txt
JVMDUMP032I JVM requested Snap dump using '//Snap.20200518.081535.6.0003.trc' in response to an event
JVMDUMP010I Snap dump written to //Snap.20200518.081535.6.0003.trc
JVMDUMP007I JVM Requesting JIT dump using '//jitdump.20200518.081535.6.0004.dmp'
#JITDUMP: vmThread=0000000000018A00 Crashed while printing out current IL.JVMDUMP010I JIT dump written to //jitdump.20200518.081535.6.0004.dmp
JVMDUMP013I Processed dump event "gpf", detail "".
JAVA_OPTS:
-Xtune:virtualized -Xshareclasses:cacheDir=/jvmcache -Xjit:enableSelfTuningScratchMemoryUsageBeforeCompile
This is happening in a stateless k8s pod, so it is difficult to get the dump. Either way, it looks like OpenJ9 is crashing while creating the dump.
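The JVMPORT030W message in the log shows that core files are piped to apport, and the other dumps were written to the container root, which disappears with the pod. A minimal sketch of how one might check the core pattern and redirect the JVM dump files, assuming a persistent volume is mounted at the hypothetical path /dumps (that path is not from this report):

```shell
# Show how the kernel handles core files. A leading "|" means cores are
# piped to a helper program (apport in the log above).
if [ -r /proc/sys/kernel/core_pattern ]; then
  cat /proc/sys/kernel/core_pattern
fi

# Hypothetical mount point for a volume that outlives the pod.
DUMP_DIR=/dumps

# Append the dump directory option to whatever JAVA_OPTS already holds.
JAVA_OPTS="${JAVA_OPTS:-} -Xdump:directory=$DUMP_DIR"
export JAVA_OPTS
```

This only changes where the Java dump, Snap dump and JIT dump files land. Whether a system core can be captured still depends on the core_pattern setting of the node.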
fyi @andrewcraik
Some kind of core / diagnostic output beyond the raw dump message is likely needed to make progress on this. Does the failure persist if the -Xjit option is removed? How about if it is run with -Xshareclasses:none?
Hi @andrewcraik. I'll try to get the diagnostic output.
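The two experiments suggested here could be set up roughly as follows. This is a sketch that reuses the JAVA_OPTS from the report; the variable names are only illustrative:

```shell
# Experiment 1: the reported options without the -Xjit option.
OPTS_NO_XJIT='-Xtune:virtualized -Xshareclasses:cacheDir=/jvmcache'

# Experiment 2: the reported options with class sharing switched off.
OPTS_NO_SCC='-Xtune:virtualized -Xshareclasses:none -Xjit:enableSelfTuningScratchMemoryUsageBeforeCompile'

# Pick one experiment per run.
JAVA_OPTS="$OPTS_NO_SCC"
export JAVA_OPTS
```

Running one variant at a time keeps it clear which option the crash depends on.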
I googled for vmState=0x0005ffff, and found this. We have disabled AOT for now and so far so good. I'll report back if I see the same error with AOT enabled.
I also tried removing the -Xjit option and leaving the AOT compiler enabled. I wasn't able to reproduce the issue, but honestly I wouldn't rule out that it could still happen with those settings, since this issue doesn't reproduce 100% of the time (and running without the -Xjit option isn't an option for us in production).
To give you a bit more context: we are using OpenJ9 on K8s and we are mounting a host path with the SCC to reduce the memory footprint of our services. If we hit this issue, the pods enter a CrashLoopBackOff state, since they hit the same crash on every restart. If the pods are deleted and recreated (which might happen on a different node), then the pods work just fine. This makes me think that the problem might indeed be related to the SCC, and the IBM support page I linked before seems somewhat related to the behavior I'm seeing (as in, it might be a problem when an AOT-compiled method has to be JITted).
Thanks @edrevo. The vmstate just tells us which general area of the compiler might be related. That link is for the IBM SDK for Java, not OpenJ9, and is from 2015, well before the OpenJ9 codebase existed, so it is not likely to be relevant.
@dsouzai seems we have another report of an AOT/SCC crash - does this look anything like ones that have been fixed recently?
No, a crash with vmstate 0x0005ffff looks new, though a coredump would validate that / help with debugging.
I manually resolved the backtrace:
resolveStaticFieldRefInto
jitCTResolveStaticFieldRefWithMethod
TR_RelocationRecordDataAddress::findDataAddress(TR_RelocationRuntime*, TR_RelocationTarget*)
TR_RelocationRecordDataAddress::applyRelocation(TR_RelocationRuntime*, TR_RelocationTarget*, unsigned char*)
TR_RelocationRecord::applyRelocationAtAllOffsets(TR_RelocationRuntime*, TR_RelocationTarget*, unsigned char*)
...
This looks similar to the issue fixed by https://github.com/eclipse/openj9/pull/9418. @cathyzhyi does this backtrace look similar?
Yes, it's the same backtrace as the problem I fixed. This should be fixed by #9418.
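For anyone wanting to repeat the manual backtrace resolution, the first step is to pull the module and offset out of each frame. The helper below is a rough sketch (the function name extract_frames is hypothetical) that handles the format used in the "Stack Backtrace" section of the report:

```shell
# Print "module offset" for every backtrace frame read on stdin.
# Accepts lines such as: (0x00007FB4BE6DF041 [libj9vm29.so+0xba041])
extract_frames() {
  sed -n 's/.*\[\([^]+]*\)+\(0x[0-9a-fA-F]*\)]).*/\1 \2/p'
}

# Example with the first three frames from the report above.
extract_frames <<'EOF'
(0x00007FB4BE6DF041 [libj9vm29.so+0xba041])
(0x00007FB4BE6DFC83 [libj9vm29.so+0xbac83])
(0x00007FB4B7B3EF7D [libj9jit29.so+0x8fff7d])
EOF
```

Each resulting pair could then be given to a symbolizer such as addr2line (for example `addr2line -f -C -e <module> <offset>`), assuming a copy of the exact same build with symbols is available. The thread does not say which tool was actually used.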
I am thinking the best way to avoid this bug is to just exclude AOT for the problematic method. I see there is an option -Xaot:exclude=<method> in https://www.eclipse.org/openj9/docs/xaot/, but there's no mention as to what the correct syntax for specifying the method is. What should I put as method?
akka/stream/impl/GraphStageIsland.onIslandReady()V
akka/stream/impl/GraphStageIsland.onIslandReady()
akka/stream/impl/GraphStageIsland.onIslandReady
akka.stream.impl.GraphStageIsland.onIslandReady
It would be nice to document that somewhere.
FYI - I've opened https://github.com/eclipse/openj9-docs/issues/566 for updating the doc
Thanks for opening the documentation issue. Do you happen to know what the correct syntax is? After several attempts, I have ended up with -Xaot:exclude={akka/stream/impl/GraphStageIsland.onIslandReady}, but I'm still getting the segmentation fault with that, so I'm guessing it isn't the correct syntax.
It's a regex-based format so it might be sufficient to add a * to the end of the method name a la "-Xaot:exclude={akka/stream/impl/GraphStageIsland.onIslandReady*}".
@dsouzai @andrewcraik @cathyzhyi would know better on the exact format
Hm, I think the option you want is -Xaot:loadExclude={akka/stream/impl/GraphStageIsland.onIslandReady*}. Looking at the code there doesn't seem to be a way to say "Don't generate a relocatable body", but loadExclude should prevent a method in the SCC from getting loaded.
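As a sketch of how the suggested option might be combined with the JAVA_OPTS from the report (the variable names BASE_OPTS and AOT_EXCLUDE are only illustrative): the braces and the trailing * are special characters in most shells, so the option is kept in single quotes to make sure it reaches the JVM unchanged.

```shell
# Options from the original report.
BASE_OPTS='-Xtune:virtualized -Xshareclasses:cacheDir=/jvmcache -Xjit:enableSelfTuningScratchMemoryUsageBeforeCompile'

# Single quotes keep the shell from touching the braces or the asterisk.
AOT_EXCLUDE='-Xaot:loadExclude={akka/stream/impl/GraphStageIsland.onIslandReady*}'

JAVA_OPTS="$BASE_OPTS $AOT_EXCLUDE"
export JAVA_OPTS
printf '%s\n' "$JAVA_OPTS"
```

When the launcher script later expands the variable unquoted, as in `java $JAVA_OPTS ...`, pathname expansion can still act on the asterisk if a matching path happens to exist, so it is worth printing the final command line once to confirm the option is intact.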
Many thanks everyone for the help. I'll go ahead and close the issue since it seems to be fixed already in https://github.com/eclipse/openj9/pull/9418.