https://ci.eclipse.org/openj9/job/Test-sanity.system-JDK8-linux_x86-64/146
===============================================
Running test MauveSingleThreadLoadTest_0 ...
===============================================
MauveSingleThreadLoadTest_0 Start Time: Wed Nov 21 07:26:22 2018 Epoch Time (ms): 1542785182759
variation: NoOptions
JVM_OPTIONS: -Xnocompressedrefs
STF 07:26:22.925 - ========================= S T F =========================
systemtest-prereqs has been processed, and set to: /home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/jvmtest/systemtest/systemtest_prereqsRetrieving amount of free space on drive containing /home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15427850089951/MauveSingleThreadLoadTest_0
There is 90568 Mb free
STF 07:26:22.931 - ================== G E N E R A T I O N ==================
STF 07:26:22.932 - Checking JVM: /home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/openjdkbinary/j2sdk-image/jre/bin/../../
STF 07:26:22.932 - Starting process to generate scripts: /home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/openjdkbinary/j2sdk-image/jre/bin/../..//bin/java -Dlog4j.skipJansi=true -Djava.system.class.loader=net.adoptopenjdk.stf.runner.StfClassLoader -classpath /home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/openjdk-tests/TestConfig/scripts/testKitGen/../../../../jvmtest/systemtest/systemtest_prereqs/log4j-2.3/log4j-api-2.3.jar:/home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/openjdk-tests/TestConfig/scripts/testKitGen/../../../../jvmtest/systemtest/systemtest_prereqs/log4j-2.3/log4j-core-2.3.jar:/home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/jvmtest/systemtest/stf/stf.core/scripts/../bin net.adoptopenjdk.stf.runner.StfRunner -properties "/home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15427850089951/MauveSingleThreadLoadTest_0/20181121-072622-MauveSingleThreadLoadTest/stf_parameters.properties, , /home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/jvmtest/systemtest/stf/stf.core/config/stf.properties" -testDir "/home/jenkins/workspace/Test-sanity.system-JDK8-linux_x86-64/openjdk-tests/TestConfig/scripts/testKitGen/../../../TestConfig/test_output_15427850089951/MauveSingleThreadLoadTest_0/20181121-072622-MauveSingleThreadLoadTest"
...
STF 07:41:25.139 - Heartbeat: Process LT is still running
LT 07:41:27.046 - Completed 24.4%. Number of tests started=471521 (+0)
LT 07:41:47.070 - Completed 24.4%. Number of tests started=471521 (+0)
LT 07:42:07.029 - Completed 24.4%. Number of tests started=471521 (+0)
LT 07:42:17.342 - **POSSIBLE HANG DETECTED**
STF 07:42:17.705 - **FAILED** Process LT has hung
STF 07:42:17.705 - Collecting dumps for: LT
Looked at the three javacore in the result file. The threads seem not progressing hence this is a hang (or deadlock) instead of overrun.
The javacore file shows that following two threads hold vm access and apparently there is a thread attempting to acquire exclusive VM access which probably is the cause of this hang.
3XMTHREADINFO "MemoryMXBean notification dispatcher" J9VMThread:0x0000000000336300, omrthread_t:0x00003FFF50031598, java/lang/Thread:0x00000000E0113F80, state:R, prio=6
3XMJAVALTHREAD (java/lang/Thread getId:0x16, isDaemon:true)
3XMTHREADINFO1 (native thread ID:0x7FA0, native priority:0x6, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x000000a1)
3XMTHREADINFO2 (native stack address range from:0x00003FFF89220000, to:0x00003FFF89260000, size:0x40000)
3XMCPUTIME CPU usage total: 0.391388890 secs, current category="Application"
3XMHEAPALLOC Heap bytes allocated since last GC cycle=28648 (0x6FE8)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at com/ibm/lang/management/internal/MemoryNotificationThread.processNotificationLoop(Native Method)
4XESTACKTRACE at com/ibm/lang/management/internal/MemoryNotificationThread.run(MemoryNotificationThread.java:183)
3XMTHREADINFO "Attach API wait loop" J9VMThread:0x0000000000103800, omrthread_t:0x00003FFF50031020, java/lang/Thread:0x00000000E00298D0, state:R, prio=10
3XMJAVALTHREAD (java/lang/Thread getId:0x12, isDaemon:true)
3XMTHREADINFO1 (native thread ID:0x7F9D, native priority:0xA, native policy:UNKNOWN, vmstate:R, vm thread flags:0x000000a1)
3XMTHREADINFO2 (native stack address range from:0x00003FFF89750000, to:0x00003FFF89790000, size:0x40000)
3XMCPUTIME CPU usage total: 0.000940426 secs, current category="System-JVM"
3XMHEAPALLOC Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at com/ibm/tools/attach/target/IPC.waitSemaphore(Native Method)
4XESTACKTRACE at com/ibm/tools/attach/target/CommonDirectory.waitSemaphore(CommonDirectory.java:254)
4XESTACKTRACE at com/ibm/tools/attach/target/WaitLoop.waitForNotification(WaitLoop.java:66)
4XESTACKTRACE at com/ibm/tools/attach/target/WaitLoop.run(WaitLoop.java:154)
These two threads have vm thread flags 0xa1 in which 0x1 (many other threads have 0x1 as well) means there is a thread trying to acquire exclusive VM access, 0x20 means both threads have vm access.
@pdbain-ibm any insights about the Attach API thread holding the vm access?
@LinHu2016 FYI
> !threads flags
Attached Threads List. For more options, run !threads help
!j9vmthread 0x1bf6700 publicFlags=400 privateFlags=1008 inNative=0 // main
!j9vmthread 0x1bfa300 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-0
!j9vmthread 0x1bfdf00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-1 Suspended
!j9vmthread 0x1c01b00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-2 Suspended
!j9vmthread 0x1c05700 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-3 Suspended
!j9vmthread 0x1c09300 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-4 Suspended
!j9vmthread 0x1c0cf00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-5 Suspended
!j9vmthread 0x1c10b00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-6 Suspended
!j9vmthread 0x1c14700 publicFlags=0 privateFlags=101a inNative=0 // JIT Diagnostic Compilation Thread-7 Suspended
!j9vmthread 0x1c18300 publicFlags=0 privateFlags=101a inNative=0 // JIT-SamplerThread
!j9vmthread 0x1c1bf00 publicFlags=0 privateFlags=101a inNative=0 // IProfiler
!j9vmthread 0x1cdf200 publicFlags=0 privateFlags=1a inNative=0 // Concurrent Mark Helper
!j9vmthread 0x1ce2e00 publicFlags=0 privateFlags=1a inNative=0 // GC Slave
!j9vmthread 0x1ce6a00 publicFlags=0 privateFlags=1a inNative=0 // GC Slave
!j9vmthread 0x1cea600 publicFlags=0 privateFlags=1a inNative=0 // GC Slave
!j9vmthread 0x1cfb200 publicFlags=20 privateFlags=2 inNative=1 // Attach API wait loop
!j9vmthread 0x1cfe200 publicFlags=0 privateFlags=20080a inNative=0 // Finalizer thread
!j9vmthread 0x1d0bb00 publicFlags=20 privateFlags=2 inNative=1 // MemoryMXBean notification dispatcher
!j9vmthread 0x1fa9a00 publicFlags=20 privateFlags=8000 inNative=0 // load-0
!j9vmthread 0x1f2f900 publicFlags=100 privateFlags=2 inNative=0 // ClassCache Reaper
!j9vmthread 0x20a9c00 publicFlags=100 privateFlags=2 inNative=0 // Java2D Disposer
!j9vmthread 0x23ee300 publicFlags=20000 privateFlags=2 inNative=0 // TimerQueue
!j9vmthread 0x2186f00 publicFlags=0 privateFlags=a inNative=0 // SIGABRT Thread
There is a difference between system core and javacore. System core doesn't show that there is a thread trying to acquire exclusive VM access.
3XMTHREADINFO "load-0" J9VMThread:0x0000000001FA9A00, omrthread_t:0x00007FA3B87AFFA0, java/lang/Thread:0x00000000E04B2F18, state:R, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0x18, isDaemon:false)
3XMTHREADINFO1 (native thread ID:0x5E68, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000081)
3XMTHREADINFO2 (native stack address range from:0x00007FA3B4101000, to:0x00007FA3B4142000, size:0x41000)
3XMCPUTIME CPU usage total: 957.309699992 secs, current category="Application"
3XMHEAPALLOC Heap bytes allocated since last GC cycle=8042624 (0x7AB880)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at java/awt/image/SinglePixelPackedSampleModel.getPixel(SinglePixelPackedSampleModel.java:442(Compiled Code))
4XESTACKTRACE at gnu/testlet/java/awt/image/SinglePixelPackedSampleModel/getPixel.test(getPixel.java:86(Compiled Code))
4XESTACKTRACE at sun/reflect/GeneratedMethodAccessor519.invoke(Bytecode PC:40)
4XESTACKTRACE at sun/reflect/DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43(Compiled Code))
4XESTACKTRACE at java/lang/reflect/Method.invoke(Method.java:498(Compiled Code))
4XESTACKTRACE at net/adoptopenjdk/loadTest/adaptors/MauveAdaptor.executeTest(MauveAdaptor.java:73(Compiled Code))
4XESTACKTRACE at net/adoptopenjdk/loadTest/LoadTestRunner$2.run(LoadTestRunner.java:177)
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
4XESTACKTRACE at java/lang/Thread.run(Thread.java:813)
3XMTHREADINFO3 Native callstack:
4XENATIVESTACK (0x00007FA3B78C7882 [libj9prt29.so+0x3c882])
4XENATIVESTACK (0x00007FA3B78ABD27 [libj9prt29.so+0x20d27])
4XENATIVESTACK (0x00007FA3B78C78FC [libj9prt29.so+0x3c8fc])
4XENATIVESTACK (0x00007FA3B78C7A06 [libj9prt29.so+0x3ca06])
4XENATIVESTACK (0x00007FA3B78ABD27 [libj9prt29.so+0x20d27])
4XENATIVESTACK (0x00007FA3B78C7751 [libj9prt29.so+0x3c751])
4XENATIVESTACK (0x00007FA3B78C3FBC [libj9prt29.so+0x38fbc])
4XENATIVESTACK (0x00007FA3BDAEC4B0 [libc.so.6+0x354b0])
4XENATIVESTACK pthread_cond_wait+0xbe (0x00007FA3BD6A335E [libpthread.so.0+0xd35e])
4XENATIVESTACK omrthread_monitor_wait+0x24b (0x00007FA3BC276D3B [libj9thr29.so+0x8d3b])
4XENATIVESTACK (0x00007FA3BC5084EC [libj9vm29.so+0x7f4ec])
4XENATIVESTACK (0x00007FA3BC4A1FE6 [libj9vm29.so+0x18fe6])
4XENATIVESTACK (0x00007FA3B6489DFA [libj9jit29.so+0x75fdfa])
4XENATIVESTACK (0x00007FA3B649E3B6 [libj9jit29.so+0x7743b6])
The Java code in question are:
public int [] getPixel(int x, int y, int iArray[], DataBuffer data) {
....
int value = data.getElem(y * scanlineStride + x);
for (int i=0; i<numBands; i++) { ---> line 442
pixels[i] = (value & bitMasks[i]) >>> bitOffsets[i];
}
return pixels;
}
The native code in stack probably looks like:
jitCheckAsyncMessages --> old_slow_jitCheckAsyncMessages --> javaCheckAsyncMessages --> internalAcquireVMAccessNoMutex --> internalAcquireVMAccessNoMutexWithMask --> omrthread_monitor_wait
The matching source code is:
void internalAcquireVMAccessNoMutex(J9VMThread * vmThread)
{
internalAcquireVMAccessNoMutexWithMask(vmThread, J9_PUBLIC_FLAGS_HALT_THREAD_ANY);
}
void internalAcquireVMAccessNoMutexWithMask(J9VMThread * vmThread, UDATA haltMask)
{
....
while (vmThread->publicFlags & haltMask) {
omrthread_monitor_wait(vmThread->publicFlagsMutex); ----> waiting here
}
}
2LKREGMON Thread public flags mutex lock (0x00007FA3B87899E8): <unowned>
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "load-0" (J9VMThread:0x0000000001FA9A00)
Note: the thread public flags from the core dump doesn't make vmThread->publicFlags & haltMask to be true.
The core file was captured only once after three javacore files were generated 30 seconds apart, ideally, system core should be captured right after javacore generation and repeated three times.
Those system core dump files in the job result do not match the thread states in the javacore files as per analysis above, it appears the thread states might be modified by JVM RAS system dump generation process.
Attempted to force a system core via de-referencing a NULL pointer when a signal -6 is received to skip RAS dump mechanism, it works in a simple stand-alone case but no core was collected by the job (the core might be generated elsewhere or doesn't work in this particular test).
It is difficult to debug and collect data without machine access considering each experiment requires a custom JVM build and a test grinder run.
Hence requesting machine access to ub16-x86-1.
Reviewers: @jdekonin @DanHeidinga @pshipton
Hence requesting machine access to ub16-x86-1.
+1 assuming access will be revoked once the issue is debugged. FYI @jdekonin
I contacted @JasonFengJ9 direct and provided temporary access to the machine. It has been disabled until Jason is finished with it.
Got the machine access, and attempting to reproduce the hang.
Update:
Reproduced the hang at the machine, and generated the system core with current threads.
The javacore appears same as those reported in earlier investigation (following contents looks same though from a new set of dump files);
3XMTHREADINFO "load-0" J9VMThread:0x0000000001950400, omrthread_t:0x00007F14A40F77F8, java/lang/Thread:0x00000000E0486E38, state:R, prio=5 3XMJAVALTHREAD (java/lang/Thread getId:0x1A, isDaemon:false) 3XMTHREADINFO1 (native thread ID:0x488C, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000001) 3XMTHREADINFO2 (native stack address range from:0x00007F148C3EB000, to:0x00007F148C42C000, size:0x41000) 3XMCPUTIME CPU usage total: 990.198539253 secs, current category="Application" 3XMHEAPALLOC Heap bytes allocated since last GC cycle=20315560 (0x135FDA8) 3XMTHREADINFO3 Java callstack: 4XESTACKTRACE at java/awt/image/SinglePixelPackedSampleModel.getPixel(SinglePixelPackedSampleModel.java:453(Compiled Code)) 4XESTACKTRACE at gnu/testlet/java/awt/image/SinglePixelPackedSampleModel/getPixel.test(getPixel.java:86(Compiled Code)) 4XESTACKTRACE at sun/reflect/GeneratedMethodAccessor519.invoke(Bytecode PC:40) 4XESTACKTRACE at sun/reflect/DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55(Compiled Code)) 4XESTACKTRACE at java/lang/reflect/Method.invoke(Method.java:508(Compiled Code)) 4XESTACKTRACE at net/adoptopenjdk/loadTest/adaptors/MauveAdaptor.executeTest(MauveAdaptor.java:73(Compiled Code)) 4XESTACKTRACE at net/adoptopenjdk/loadTest/LoadTestRunner$2.run(LoadTestRunner.java:177) 4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160) 4XESTACKTRACE at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) 4XESTACKTRACE at java/lang/Thread.run(Thread.java:812) 3XMTHREADINFO3 Native callstack: 4XENATIVESTACK (0x00007F14A340F462 [libj9prt29.so+0x4e462]) 4XENATIVESTACK (0x00007F14A33E2E18 [libj9prt29.so+0x21e18]) 4XENATIVESTACK (0x00007F14A340F4DE [libj9prt29.so+0x4e4de]) 4XENATIVESTACK (0x00007F14A340F5D4 [libj9prt29.so+0x4e5d4]) 4XENATIVESTACK (0x00007F14A33E2E18 [libj9prt29.so+0x21e18]) 4XENATIVESTACK (0x00007F14A340F33B [libj9prt29.so+0x4e33b]) 4XENATIVESTACK (0x00007F14A340B77F [libj9prt29.so+0x4a77f]) 4XENATIVESTACK (0x00007F14A9A9D390 [libpthread.so.0+0x11390]) 4XENATIVESTACK pthread_cond_wait+0xbe (0x00007F14A9A9935E [libpthread.so.0+0xd35e]) 4XENATIVESTACK omrthread_monitor_wait+0x287 (0x00007F14A384A6F7 [libj9thr29.so+0x86f7]) 4XENATIVESTACK (0x00007F14A3ADC91C [libj9vm29.so+0x7f91c]) 4XENATIVESTACK (0x00007F14A3A75E90 [libj9vm29.so+0x18e90]) 4XENATIVESTACK (0x00007F14A1A4488A [libj9jit29.so+0x83d88a]) 4XENATIVESTACK (0x00007F14A1A5B7D6 [libj9jit29.so+0x8547d6])
The Java code where thread stopped is:
public int [] getPixel(int x, int y, int iArray[], DataBuffer data) {
...
int value = data.getElem(y * scanlineStride + x);
for (int i=0; i < numBands; i++) { SinglePixelPackedSampleModel.java:453(Compiled Code)
pixels[i] = (value & bitMasks[i]) >>> bitOffsets[i];
}
return pixels;
}
From Native callstack in this javacore file, it seems like
jitCheckAsyncMessages --> old_slow_jitCheckAsyncMessages --> javaCheckAsyncMessages --> internalAcquireVMAccessNoMutex --> internalAcquireVMAccessNoMutexWithMask --> omrthread_monitor_wait
The source code is:
void internalAcquireVMAccessNoMutex(J9VMThread * vmThread)
{
internalAcquireVMAccessNoMutexWithMask(vmThread, J9_PUBLIC_FLAGS_HALT_THREAD_ANY);
}
void internalAcquireVMAccessNoMutexWithMask(J9VMThread * vmThread, UDATA haltMask)
{
....
while (vmThread->publicFlags & haltMask) {
omrthread_monitor_wait(vmThread->publicFlagsMutex); waiting here
}
}
However the system core doesn't have vmThread->publicFlags to support waiting at omrthread_monitor_wait(vmThread->publicFlagsMutex);
> !threads flags
Attached Threads List. For more options, run !threads help
!j9vmthread 0x161a700 publicFlags=400 privateFlags=1008 inNative=0 // main
!j9vmthread 0x161e300 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-0
!j9vmthread 0x1621f00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-1 Suspended
!j9vmthread 0x1625b00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-2 Suspended
!j9vmthread 0x1629700 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-3 Suspended
!j9vmthread 0x162d300 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-4 Suspended
!j9vmthread 0x1630f00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-5 Suspended
!j9vmthread 0x1634b00 publicFlags=0 privateFlags=101a inNative=0 // JIT Compilation Thread-6 Suspended
!j9vmthread 0x1638700 publicFlags=0 privateFlags=101a inNative=0 // JIT Diagnostic Compilation Thread-7 Suspended
!j9vmthread 0x163c300 publicFlags=0 privateFlags=101a inNative=0 // JIT-SamplerThread
!j9vmthread 0x163ff00 publicFlags=0 privateFlags=101a inNative=0 // IProfiler
!j9vmthread 0x170b300 publicFlags=20 privateFlags=2 inNative=1 // Signal Dispatcher
!j9vmthread 0x170ee00 publicFlags=0 privateFlags=1a inNative=0 // Concurrent Mark Helper
!j9vmthread 0x1712a00 publicFlags=0 privateFlags=1a inNative=0 // GC Slave
!j9vmthread 0x1716600 publicFlags=0 privateFlags=1a inNative=0 // GC Slave
!j9vmthread 0x171a200 publicFlags=0 privateFlags=1a inNative=0 // GC Slave
!j9vmthread 0x1722e00 publicFlags=20 privateFlags=2 inNative=1 // Attach API wait loop
!j9vmthread 0x1725e00 publicFlags=0 privateFlags=20080a inNative=0 // Finalizer thread
!j9vmthread 0x1744a00 publicFlags=20 privateFlags=2 inNative=1 // MemoryMXBean notification dispatcher
!j9vmthread 0x1950400 publicFlags=20 privateFlags=8000 inNative=0 // load-0this publicFlags value can't satisfy (vmThread->publicFlags & haltMask)
!j9vmthread 0x19c9a00 publicFlags=100 privateFlags=2 inNative=0 // ClassCache Reaper
!j9vmthread 0x1acb100 publicFlags=100 privateFlags=2 inNative=0 // Java2D Disposer
!j9vmthread 0x1e99c00 publicFlags=20000 privateFlags=2 inNative=0 // TimerQueue
!j9vmthread 0x2188600 publicFlags=0 privateFlags=a inNative=0 // SIGABRT Thread
This thread native stack from the system core isn't very helpful.
#0 0x00007f148e3e58fd in ?? () #1 0x0000000000000001 in ?? () #2 0x00000000ffb3f940 in ?? () #3 0x00000000ffb3f940 in ?? () #4 0x00007f148e32b571 in ?? () #5 0x00000000ffb3fa60 in ?? () #6 0x0000000000000001 in ?? () #7 0x00000000ffb3fa40 in ?? () #8 0x00000000ffb3fad0 in ?? () #9 0x00000000ffb3fa60 in ?? () #10 0x00000000ffb3fa60 in ?? () #11 0x00000000ffb3fb18 in ?? () #12 0x00007f14a1a65050 in returnFromJITConstructor0 () from /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9jit29.so #13 0x00000000ffb3f940 in ?? () #14 0x00000000ffb3f790 in ?? () #15 0x00000000ffb3f790 in ?? () #16 0x00000000019d33d9 in ?? () #17 0x00000000019d3450 in ?? () #18 0x00000000019c4128 in ?? () #19 0x00007f14a49d2822 in ?? () #20 0x0000000000000000 in ?? ()
Attempted to build a debug version of libj9jit29.so via modification of compiler/makefile BUILD_CONFIG?=debug, however this hits an assertion:
Assertion failed at /team/ive/axxon/jvmxa6480-ib-403141-jit-debug/jre/lib/amd64/compressedrefs/compiler/../omr/compiler/infra/OMRCfg.cpp:176: !f->hasSuccessor(t)
VMState: 0x000501ff
adding an exception edge when there's already a non exception edge
compiling java/security/KeyStore.isCertificateEntry(Ljava/lang/String;)Z at level: warm
#0: function TR_LinuxCallStackIterator::printStackBacktrace(TR::Compilation*)+0x45 [0x7f3177298eaf]
#1: function TR_Debug::printStackBacktrace()+0x2f [0x7f31772abced]
#2: /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9jit29.so(+0xb727b0) [0x7f3176f217b0]
#3: function TR::assertion(char const*, int, char const*, char const*, ...)+0xc3 [0x7f3176f21966]
#4: function OMR::CFG::addExceptionEdge(TR::CFGNode*, TR::CFGNode*, TR_AllocationKind)+0xfa [0x7f3176f2b51e]
#5: function TR_J9EstimateCodeSize::realEstimateCodeSize(TR_CallTarget*, TR_CallStack*, bool)+0x3e05 [0x7f3176b93301]
#6: function TR_J9EstimateCodeSize::estimateCodeSize(TR_CallTarget*, TR_CallStack*, bool)+0x35 [0x7f3176b8ec3b]
#7: function TR_EstimateCodeSize::calculateCodeSize(TR_CallTarget*, TR_CallStack*, bool)+0xf3 [0x7f3176b69c63]
#8: function TR_J9InlinerUtil::estimateAndRefineBytecodeSize(TR_CallSite*, TR_CallTarget*, TR_CallStack*, int&)+0x26a [0x7f3176b7ffc2]
#9: function TR_InlinerBase::applyPolicyToTargets(TR_CallStack*, TR_CallSite*)+0x1da [0x7f317701394a]
#10: function TR_InlinerBase::getSymbolAndFindInlineTargets(TR_CallStack*, TR_CallSite*, bool)+0x5a4 [0x7f3177013166]
#11: function TR_MultipleCallTargetInliner::inlineCallTargets(TR::ResolvedMethodSymbol*, TR_CallStack*, TR_InnerPreexistenceInfo*)+0x594 [0x7f3176b77e9e]
#12: function TR_InlinerBase::performInlining(TR::ResolvedMethodSymbol*)+0xdf [0x7f3177003dbb]
#13: function TR_Inliner::perform()+0x15d [0x7f3176b74d6d]
#14: function OMR::Optimizer::performOptimization(OptimizationStrategy const*, int, int, int)+0x2c77 [0x7f31770ea7a7]
#15: function OMR::Optimizer::optimize()+0x856 [0x7f31770e7252]
#16: function OMR::Compilation::performOptimizations()+0x139 [0x7f3176e7be7b]
#17: function OMR::Compilation::compile()+0xc35 [0x7f3176e7ae23]
#18: function TR::CompilationInfoPerThreadBase::compile(J9VMThread*, TR::Compilation*, TR_ResolvedMethod*, TR_J9VMBase&, TR_OptimizationPlan*, TR::SegmentAllocator const&)+0xba0 [0x7f3176a7901c]
#19: function TR::CompilationInfoPerThreadBase::wrappedCompile(J9PortLibrary*, void*)+0x2391 [0x7f3176a77a5f]
#20: /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9prt29.so(+0x21e18) [0x7f317d057e18]
#21: function TR::CompilationInfoPerThreadBase::compile(J9VMThread*, TR_MethodToBeCompiled*, J9::J9SegmentProvider&)+0x60b [0x7f3176a752ab]
#22: function TR::CompilationInfoPerThread::processEntry(TR_MethodToBeCompiled&, J9::J9SegmentProvider&)+0x4ac [0x7f3176a6c44e]
#23: function TR::CompilationInfoPerThread::processEntries()+0x23c [0x7f3176a6b764]
#24: function TR::CompilationInfoPerThread::run()+0x45 [0x7f3176a6b011]
#25: function protectedCompilationThreadProc(J9PortLibrary*, TR::CompilationInfoPerThread*)+0x1e6 [0x7f3176a6ae1a]
#26: /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9prt29.so(+0x21e18) [0x7f317d057e18]
#27: function compilationThreadProc(void*)+0x48d [0x7f3176a6abd8]
#28: /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9thr29.so(+0xe2c6) [0x7f317d4c52c6]
#29: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f317f5dc6ba]
java: /team/ive/axxon/jvmxa6480-ib-403141-jit-debug/jre/lib/amd64/compressedrefs/compiler/../omr/compiler/infra/Assert.cpp:72: void TR::trap(): Assertion `0' failed.
It would be nice to get JIT hands to identify the native stack and what it was blocked on.
FYI: @andrewcraik @fjeremic
Unfortunately debug JIT has not been tested in so long it definitely now has bugs in it. This one is an optimizer bug somewhere. Do you need a debug JIT however for a backtrace? You should be able to get it for the default JIT, at least the symbols. You may not be able to get line information.
This is the JIT thread backtrace, there is no symbols.
#0 0x00007f148e3e58fd in ?? ()
#1 0x0000000000000001 in ?? ()
#2 0x00000000ffb3f940 in ?? ()
#3 0x00000000ffb3f940 in ?? ()
#4 0x00007f148e32b571 in ?? ()
#5 0x00000000ffb3fa60 in ?? ()
#6 0x0000000000000001 in ?? ()
#7 0x00000000ffb3fa40 in ?? ()
#8 0x00000000ffb3fad0 in ?? ()
#9 0x00000000ffb3fa60 in ?? ()
#10 0x00000000ffb3fa60 in ?? ()
#11 0x00000000ffb3fb18 in ?? ()
#12 0x00007f14a1a65050 in returnFromJITConstructor0 () from /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9jit29.so
#13 0x00000000ffb3f940 in ?? ()
#14 0x00000000ffb3f790 in ?? ()
#15 0x00000000ffb3f790 in ?? ()
#16 0x00000000019d33d9 in ?? ()
#17 0x00000000019d3450 in ?? ()
#18 0x00000000019c4128 in ?? ()
#19 0x00007f14a49d2822 in ?? ()
#20 0x0000000000000000 in ?? ()
This is the JIT thread backtrace, there is no symbols.
What does gdb say about the core? If you load the core up in gdb and set your solib-search-path correctly so that info sharedlibrary shows the symbols for libj9jit29.so are loaded you can make a script using the above symbols like this:
define printBackTrace
info symbol 0x00007f148e3e58fd
info symbol 0x0000000000000001
info symbol 0x00000000ffb3f940
info symbol 0x00000000ffb3f940
info symbol 0x00007f148e32b571
info symbol 0x00000000ffb3fa60
info symbol 0x0000000000000001
info symbol 0x00000000ffb3fa40
info symbol 0x00000000ffb3fad0
info symbol 0x00000000ffb3fa60
info symbol 0x00000000ffb3fa60
info symbol 0x00000000ffb3fb18
info symbol 0x00007f14a1a65050
info symbol 0x00000000ffb3f940
info symbol 0x00000000ffb3f790
info symbol 0x00000000ffb3f790
info symbol 0x00000000019d33d9
info symbol 0x00000000019d3450
info symbol 0x00000000019c4128
info symbol 0x00007f14a49d2822
info symbol 0x0000000000000000
end
Then run the printBackTrace script and it will dump the backtrace in a human readable form.
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007f14a12c0290 0x00007f14a1a96cba Yes (*) /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9jit29.so
(gdb) bt
#11 0x00000000ffb3fb18 in ?? ()
#12 0x00007f14a1a65050 in returnFromJITConstructor0 () from /home/jenkins/jasonf/jvmxa6480-ib-403141-V0/jre/lib/amd64/compressedrefs/libj9jit29.so
#13 0x00000000ffb3f940 in ?? ()
(gdb) info symbol 0x00000000ffb3fb18
No symbol matches 0x00000000ffb3fb18.
(gdb) info symbol 0x00000000ffb3f940
No symbol matches 0x00000000ffb3f940.
@JasonFengJ9 apologies when you said native stack did you mean the JIT generated code or code within the libj9jit29.so JIT module itself? If the former the above won't work and we'll need someone familiar with x86 to take a peek for you with kca or similar tool which understands JIT generated code backtraces.
Judging from the pointers it looks like they're JIT generated offsets (0x00007f14a1a65050 vs. 0x00000000ffb3f940).
Thanks @fjeremic it appears JIT generated code instead of JIT code itself.
Just run a 100x grinder with option -Xjit:exclude={*SinglePixelPackedSampleModel*} and they all passed. Considering the fail rate was 2 out 25 before, I would ask JIT investigate the generated code in question. Please ping me if any info is required.
Additional note: -Xjit:exclude={*SinglePixelPackedSampleModel*getPixel*} is sufficient to make the test pass.
@0dvictor @andrewcraik we'll need help to investigate this one.
It seems to be an optimizer issue as it happens on both X86 and Power.
@JasonFengJ9 are you able to collect logs? If you are, a JIT trace with -Xjit:"{*SinglePixelPackedSampleModel*getPixel*}(traceFull,log=trace)" would be very helpful.
Collected jit trace data and passed to @0dvictor
Analyzed the trace log above:
Identified a ~dead~ infinite loop at post-optimization trees:
n1242n BBStart <block_55> (freq 7612) (in loop 55) [0x7efc800c43f0] bci=[2,79,454] rc=0 vc=1418 vn=- li=-2 udi=- nc=1
n2605n GlRegDeps [0x7efc804dee00] bci=[2,79,454] rc=1 vc=1418 vn=- li=-1 udi=- nc=2
n2606n iRegLoad r8d <auto slot 6>[#763 Auto] [flags 0x3 0x0 ] (cannotOverflow SeenRealReference ) [0x7efc804dee50] bci=[2,83,454] rc=3 vc=1418 vn=- li=1 udi=- nc=0 flg=0x9000
n2607n aRegLoad ebx <auto slot 5>[#758 Auto] [flags 0x4007 0x0 ] (X!=0 SeenRealReference sharedMemory ) [0x7efc804deea0] bci=[2,79,454] rc=3 vc=1418 vn=- li=1 udi=- nc=0 flg=0x8004
n1294n asynccheck jitCheckAsyncMessages[#22 helper Method] [flags 0x400 0x0 ] [0x7efc800c5430] bci=[2,105,453] rc=0 vc=1418 vn=- li=- udi=- nc=0
n1251n compressedRefs [0x7efc800c46c0] bci=[2,86,454] rc=0 vc=1418 vn=- li=- udi=- nc=2
n1249n aloadi java/awt/image/SinglePixelPackedSampleModel.bitMasks [I[#765 Shadow +24] [flags 0x80607 0x0 ] [0x7efc800c4620] bci=[2,86,454] rc=2 vc=1418 vn=- li=- udi=- nc=1
n1248n aload m1<auto slot 2>[#623 Auto] [flags 0x4007 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c45d0] bci=[2,85,454] rc=3 vc=1418 vn=- li=- udi=54 nc=0 flg=0x104
n1250n lconst 0 (highWordZero X==0 X>=0 X<=0 ) [0x7efc800c4670] bci=[2,86,454] rc=2 vc=1418 vn=- li=- udi=- nc=0 flg=0x4302
n1267n compressedRefs [0x7efc800c4bc0] bci=[2,94,454] rc=0 vc=1418 vn=- li=- udi=- nc=2
n1265n aloadi java/awt/image/SinglePixelPackedSampleModel.bitOffsets [I[#766 Shadow +28] [flags 0x80607 0x0 ] [0x7efc800c4b20] bci=[2,94,454] rc=2 vc=1418 vn=- li=- udi=- nc=1
n1248n ==>aload
n1250n ==>lconst 0
n1289n istorei <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] [0x7efc800c52a0] bci=[2,101,454] rc=0 vc=1418 vn=- li=- udi=- nc=2
n1288n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c5250] bci=[2,101,454] rc=1 vc=1418 vn=- li=- udi=- nc=2 flg=0x8100
n2607n ==>aRegLoad
n1287n lconst 8 (highWordZero X!=0 X>=0 ) [0x7efc800c5200] bci=[2,101,454] rc=3 vc=1418 vn=- li=- udi=- nc=0 flg=0x4104
n1279n iushr [0x7efc800c4f80] bci=[2,100,454] rc=1 vc=1418 vn=- li=- udi=- nc=2
n1263n iand [0x7efc800c4a80] bci=[2,92,454] rc=1 vc=1418 vn=- li=- udi=- nc=2
n1262n iloadi <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] (cannotOverflow ) [0x7efc800c4a30] bci=[2,91,454] rc=1 vc=1418 vn=- li=- udi=- nc=1 flg=0x1000
n1261n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c49e0] bci=[2,91,454] rc=1 vc=1418 vn=- li=- udi=- nc=2 flg=0x8100
n1249n ==>aloadi
n1287n ==>lconst 8
n2606n ==>iRegLoad
n1278n iloadi <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] (cannotOverflow ) [0x7efc800c4f30] bci=[2,99,454] rc=1 vc=1418 vn=- li=- udi=- nc=1 flg=0x1000
n1277n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c4ee0] bci=[2,99,454] rc=1 vc=1418 vn=- li=- udi=- nc=2 flg=0x8100
n1265n ==>aloadi
n1287n ==>lconst 8
n1243n BBEnd </block_55> [0x7efc800c4440] bci=[2,105,453] rc=0 vc=1418 vn=- li=- udi=- nc=0
n1527n BBStart <block_74> (freq 6316) (extension of previous block) (in loop 55) [0x7efc800c9d00] bci=[2,70,453] rc=0 vc=1418 vn=- li=- udi=- nc=0
n1529n ificmpgt --> block_55 BBStart at n1242n () [0x7efc800c9da0] bci=[2,76,453] rc=0 vc=1418 vn=- li=- udi=- nc=3 flg=0x20020
n1531n iloadi java/awt/image/SinglePixelPackedSampleModel.numBands I[#757 Shadow +16] [flags 0x603 0x0 ] (cannotOverflow ) [0x7efc800c9e40] bci=[2,73,453] rc=1 vc=1418 vn=- li=- udi=- nc=1 flg=0x1000
n1248n ==>aload
n1280n iconst 1 (X!=0 X>=0 ) [0x7efc800c4fd0] bci=[2,101,454] rc=1 vc=1418 vn=- li=- udi=- nc=0 flg=0x104
n2609n GlRegDeps [0x7efc804def40] bci=[2,76,453] rc=1 vc=1418 vn=- li=- udi=- nc=2
n2606n ==>iRegLoad
n2607n ==>aRegLoad
n1528n BBEnd </block_74> [0x7efc800c9d50] bci=[2,70,453] rc=0 vc=1418 vn=- li=- udi=- nc=0
Node n1529n jumps back to the loop header when java/awt/image/SinglePixelPackedSampleModel.numBands is greater than constant 1 while numBands is not updated in the loop.
n1280n iconst 1 (X!=0 X>=0 ) first occurred in LocalCSE. Before LocalCSE it was:
n1293n istore <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800c53e0] bci=[2,102,453] rc=0 vc=710 vn=- li=- udi=22 nc=1
n1280n ==>iconst 1
n1243n BBEnd </block_55> [0x7efc800c4440] bci=[2,105,453] rc=0 vc=705 vn=- li=- udi=- nc=0
n1527n BBStart <block_74> (freq 6316) (extension of previous block) [0x7efc800c9d00] bci=[2,70,453] rc=0 vc=710 vn=- li=- udi=- nc=0
n1529n ificmplt --> block_55 BBStart at n1242n () [0x7efc800c9da0] bci=[2,76,453] rc=0 vc=710 vn=- li=- udi=- nc=2 flg=0x20
n1530n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow ) [0x7efc800c9df0] bci=[2,70,453] rc=1 vc=710 vn=- li=- udi=121 nc=0 flg=0x1100
n1531n iloadi java/awt/image/SinglePixelPackedSampleModel.numBands I[#757 Shadow +16] [flags 0x603 0x0 ] (cannotOverflow ) [0x7efc800c9e40] bci=[2,73,453] rc=1 vc=710 vn=- li=- udi=- nc=1 flg=0x1000
n1532n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c9e90] bci=[2,72,453] rc=1 vc=710 vn=- li=- udi=122 nc=0 flg=0x104
n1528n BBEnd </block_74> [0x7efc800c9d50] bci=[2,70,453] rc=0 vc=705 vn=- li=- udi=- nc=0
This transformation looks valid.
Then I inspected node n1293n, <auto slot 7> became storing the constant in GlobalValuePropagation. Before the optimization, it was:
n1326n BBStart <block_52> (freq 3687) [0x7efc800c5e30] bci=[2,62,452] rc=0 vc=138 vn=- li=- udi=- nc=0
n1238n istore <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800c42b0] bci=[2,68,453] rc=0 vc=138 vn=- li=- udi=- nc=1
n1237n iconst 0 [0x7efc800c4260] bci=[2,67,453] rc=1 vc=138 vn=- li=- udi=- nc=0
n1177n BBEnd </block_52> ===== [0x7efc800c2fa0] bci=[2,68,453] rc=0 vc=138 vn=- li=- udi=- nc=0
n1180n BBStart <block_56> (freq 10000) [0x7efc800c3090] bci=[2,70,453] rc=0 vc=138 vn=- li=- udi=- nc=0
n1244n ificmpge --> block_54 BBStart at n1178n () [0x7efc800c4490] bci=[2,76,453] rc=0 vc=138 vn=- li=- udi=- nc=2 flg=0x20
n1239n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800c4300] bci=[2,70,453] rc=1 vc=138 vn=- li=- udi=- nc=0
n1241n iloadi java/awt/image/SinglePixelPackedSampleModel.numBands I[#757 Shadow +16] [flags 0x603 0x0 ] [0x7efc800c43a0] bci=[2,73,453] rc=1 vc=138 vn=- li=- udi=- nc=1
n1240n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c4350] bci=[2,72,453] rc=1 vc=138 vn=- li=- udi=- nc=0 flg=0x104
n1181n BBEnd </block_56> ===== [0x7efc800c30e0] bci=[2,76,453] rc=0 vc=138 vn=- li=- udi=- nc=0
n1242n BBStart <block_55> (freq 7612) [0x7efc800c43f0] bci=[2,79,454] rc=0 vc=138 vn=- li=- udi=- nc=0
n1294n asynccheck jitCheckAsyncMessages[#22 helper Method] [flags 0x400 0x0 ] [0x7efc800c5430] bci=[2,105,453] rc=0 vc=151 vn=- li=- udi=- nc=0
n1251n compressedRefs [0x7efc800c46c0] bci=[2,86,454] rc=0 vc=151 vn=- li=- udi=- nc=2
n1249n aloadi java/awt/image/SinglePixelPackedSampleModel.bitMasks [I[#765 Shadow +24] [flags 0x80607 0x0 ] [0x7efc800c4620] bci=[2,86,454] rc=3 vc=151 vn=- li=- udi=- nc=1
n1248n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c45d0] bci=[2,85,454] rc=2 vc=151 vn=- li=- udi=- nc=0 flg=0x104
n1250n lconst 0 (highWordZero ) [0x7efc800c4670] bci=[2,86,454] rc=2 vc=151 vn=- li=- udi=- nc=0 flg=0x4000
n1254n NULLCHK on n1249n [#30] [0x7efc800c47b0] bci=[2,91,454] rc=0 vc=151 vn=- li=- udi=- nc=1
n1253n arraylength (stride 4) [0x7efc800c4760] bci=[2,91,454] rc=2 vc=151 vn=- li=- udi=- nc=1
n1249n ==>aloadi
n1255n BNDCHK [#1] [0x7efc800c4800] bci=[2,91,454] rc=0 vc=151 vn=- li=- udi=- nc=2
n1253n ==>arraylength
n1252n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800c4710] bci=[2,89,454] rc=5 vc=151 vn=- li=- udi=- nc=0
n1267n compressedRefs [0x7efc800c4bc0] bci=[2,94,454] rc=0 vc=151 vn=- li=- udi=- nc=2
n1265n aloadi java/awt/image/SinglePixelPackedSampleModel.bitOffsets [I[#766 Shadow +28] [flags 0x80607 0x0 ] [0x7efc800c4b20] bci=[2,94,454] rc=3 vc=151 vn=- li=- udi=- nc=1
n1248n ==>aload
n1250n ==>lconst 0
n1270n NULLCHK on n1265n [#30] [0x7efc800c4cb0] bci=[2,99,454] rc=0 vc=151 vn=- li=- udi=- nc=1
n1269n arraylength (stride 4) [0x7efc800c4c60] bci=[2,99,454] rc=2 vc=151 vn=- li=- udi=- nc=1
n1265n ==>aloadi
n1271n BNDCHK [#1] [0x7efc800c4d00] bci=[2,99,454] rc=0 vc=151 vn=- li=- udi=- nc=2
n1269n ==>arraylength
n1252n ==>iload
n1281n NULLCHK on n1245n [#30] [0x7efc800c5020] bci=[2,101,454] rc=0 vc=151 vn=- li=- udi=- nc=1
n1280n arraylength (stride 4) [0x7efc800c4fd0] bci=[2,101,454] rc=2 vc=151 vn=- li=- udi=- nc=1
n1245n aload <auto slot 5>[#758 Auto] [flags 0x7 0x0 ] [0x7efc800c44e0] bci=[2,79,454] rc=2 vc=151 vn=- li=- udi=- nc=0
n1282n BNDCHK [#1] [0x7efc800c5070] bci=[2,101,454] rc=0 vc=151 vn=- li=- udi=- nc=2
n1280n ==>arraylength
n1252n ==>iload
n1289n istorei <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] [0x7efc800c52a0] bci=[2,101,454] rc=0 vc=151 vn=- li=- udi=- nc=2
n1288n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c5250] bci=[2,101,454] rc=1 vc=151 vn=- li=- udi=- nc=2 flg=0x8100
n1245n ==>aload
n1287n lsub [0x7efc800c5200] bci=[2,101,454] rc=3 vc=151 vn=- li=- udi=- nc=2
n1285n lmul [0x7efc800c5160] bci=[2,101,454] rc=1 vc=151 vn=- li=- udi=- nc=2
n1284n i2l (X>=0 ) [0x7efc800c5110] bci=[2,101,454] rc=1 vc=151 vn=- li=- udi=- nc=1 flg=0x100
n1252n ==>iload
n1283n lconst 4 (highWordZero ) [0x7efc800c50c0] bci=[2,101,454] rc=1 vc=151 vn=- li=- udi=- nc=0 flg=0x4000
n1286n lconst -8 [0x7efc800c51b0] bci=[2,101,454] rc=1 vc=151 vn=- li=- udi=- nc=0
n1279n iushr [0x7efc800c4f80] bci=[2,100,454] rc=1 vc=151 vn=- li=- udi=- nc=2
n1263n iand [0x7efc800c4a80] bci=[2,92,454] rc=1 vc=151 vn=- li=- udi=- nc=2
n1262n iloadi <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] [0x7efc800c4a30] bci=[2,91,454] rc=1 vc=151 vn=- li=- udi=- nc=1
n1261n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c49e0] bci=[2,91,454] rc=1 vc=151 vn=- li=- udi=- nc=2 flg=0x8100
n1249n ==>aloadi
n1287n ==>lsub
n1247n iload <auto slot 6>[#763 Auto] [flags 0x3 0x0 ] [0x7efc800c4580] bci=[2,83,454] rc=1 vc=151 vn=- li=- udi=- nc=0
n1278n iloadi <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] [0x7efc800c4f30] bci=[2,99,454] rc=1 vc=151 vn=- li=- udi=- nc=1
n1277n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c4ee0] bci=[2,99,454] rc=1 vc=151 vn=- li=- udi=- nc=2 flg=0x8100
n1265n ==>aloadi
n1287n ==>lsub
n1293n istore <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800c53e0] bci=[2,102,453] rc=0 vc=151 vn=- li=- udi=- nc=1
n1292n isub [0x7efc800c5390] bci=[2,102,453] rc=1 vc=151 vn=- li=- udi=- nc=2
n1252n ==>iload
n1291n iconst -1 [0x7efc800c5340] bci=[2,102,453] rc=1 vc=151 vn=- li=- udi=- nc=0
n1295n goto --> block_56 BBStart at n1180n [0x7efc800c5480] bci=[2,105,453] rc=0 vc=151 vn=- li=- udi=- nc=0
n1243n BBEnd </block_55> ===== [0x7efc800c4440] bci=[2,105,453] rc=0 vc=151 vn=- li=- udi=- nc=0
…
n1180n BBStart <block_56> (freq 10000) (in loop 56) [0x7efc800c3090] bci=[2,70,453] rc=0 vc=215 vn=- li=- udi=- nc=0
n1244n ificmpge --> block_54 BBStart at n1178n () [0x7efc800c4490] bci=[2,76,453] rc=0 vc=215 vn=- li=- udi=- nc=2 flg=0x20
n1239n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow ) [0x7efc800c4300] bci=[2,70,453] rc=1 vc=215 vn=- li=34 udi=75 nc=0 flg=0x1100
n1241n iloadi java/awt/image/SinglePixelPackedSampleModel.numBands I[#757 Shadow +16] [flags 0x603 0x0 ] (cannotOverflow ) [0x7efc800c43a0] bci=[2,73,453] rc=1 vc=215 vn=- li=- udi=- nc=1 flg=0x1000
n1240n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c4350] bci=[2,72,453] rc=1 vc=215 vn=- li=- udi=76 nc=0 flg=0x104
n1181n BBEnd </block_56> ===== [0x7efc800c30e0] bci=[2,76,453] rc=0 vc=212 vn=- li=- udi=- nc=0
[ 1924] O^O VALUE PROPAGATION: Constant folding isub [00007EFC800C5390] to iconst 1
This transformation seems invalid.
Further analyzed the trace logs, it turned out Global Value Propagation did the correct transformation, as the loop induction variable is indeed only updated once and hence can be turned into a constant.
The second interaction of the loop should exit due to a BNDCHK. Last tree containing this BNDCHK is before Last Loop Versioner Group:
n1180n BBStart <block_56> (freq 10000) (in loop 56) [0x7efc800c3090] bci=[2,70,453] rc=0 vc=459 vn=- li=- udi=- nc=0
n1244n ificmpge --> block_54 BBStart at n1178n () [0x7efc800c4490] bci=[2,76,453] rc=0 vc=459 vn=- li=- udi=- nc=2 flg=0x20
n1239n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow ) [0x7efc800c4300] bci=[2,70,453] rc=1 vc=459 vn=- li=1 udi=75 nc=0 flg=0x1100
n1241n iloadi java/awt/image/SinglePixelPackedSampleModel.numBands I[#757 Shadow +16] [flags 0x603 0x0 ] (cannotOverflow ) [0x7efc800c43a0] bci=[2,73,453] rc=1 vc=459 vn=- li=1 udi=- nc=1 flg=0x1000
n1240n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c4350] bci=[2,72,453] rc=1 vc=459 vn=- li=1 udi=76 nc=0 flg=0x104
n1181n BBEnd </block_56> ===== [0x7efc800c30e0] bci=[2,76,453] rc=0 vc=459 vn=- li=- udi=- nc=0
n1242n BBStart <block_55> (freq 7612) (in loop 56) [0x7efc800c43f0] bci=[2,79,454] rc=0 vc=459 vn=- li=- udi=- nc=0
n1294n asynccheck jitCheckAsyncMessages[#22 helper Method] [flags 0x400 0x0 ] [0x7efc800c5430] bci=[2,105,453] rc=0 vc=459 vn=- li=- udi=- nc=0
n1251n compressedRefs [0x7efc800c46c0] bci=[2,86,454] rc=0 vc=459 vn=- li=- udi=- nc=2
n1249n aloadi java/awt/image/SinglePixelPackedSampleModel.bitMasks [I[#765 Shadow +24] [flags 0x80607 0x0 ] [0x7efc800c4620] bci=[2,86,454] rc=3 vc=459 vn=- li=1 udi=- nc=1
n1248n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c45d0] bci=[2,85,454] rc=2 vc=459 vn=- li=1 udi=77 nc=0 flg=0x104
n1250n lconst 0 (highWordZero X==0 X>=0 X<=0 ) [0x7efc800c4670] bci=[2,86,454] rc=2 vc=459 vn=- li=1 udi=- nc=0 flg=0x4302
n1254n NULLCHK on n1249n [#30] [0x7efc800c47b0] bci=[2,91,454] rc=0 vc=459 vn=- li=- udi=- nc=1
n1253n arraylength (stride 4) (X>=0 cannotOverflow ) [0x7efc800c4760] bci=[2,91,454] rc=2 vc=459 vn=- li=1 udi=- nc=1 flg=0x1100
n1249n ==>aloadi
n1255n BNDCHK [#1] [0x7efc800c4800] bci=[2,91,454] rc=0 vc=459 vn=- li=- udi=- nc=2
n1253n ==>arraylength
n1252n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow ) [0x7efc800c4710] bci=[2,89,454] rc=5 vc=459 vn=- li=1 udi=78 nc=0 flg=0x1100
n1267n compressedRefs [0x7efc800c4bc0] bci=[2,94,454] rc=0 vc=459 vn=- li=- udi=- nc=2
n1265n aloadi java/awt/image/SinglePixelPackedSampleModel.bitOffsets [I[#766 Shadow +28] [flags 0x80607 0x0 ] [0x7efc800c4b20] bci=[2,94,454] rc=3 vc=459 vn=- li=1 udi=- nc=1
n1248n ==>aload
n1250n ==>lconst 0
n1270n NULLCHK on n1265n [#30] [0x7efc800c4cb0] bci=[2,99,454] rc=0 vc=459 vn=- li=- udi=- nc=1
n1269n arraylength (stride 4) (X>=0 cannotOverflow ) [0x7efc800c4c60] bci=[2,99,454] rc=2 vc=459 vn=- li=1 udi=- nc=1 flg=0x1100
n1265n ==>aloadi
n1271n BNDCHK [#1] [0x7efc800c4d00] bci=[2,99,454] rc=0 vc=459 vn=- li=- udi=- nc=2
n1269n ==>arraylength
n1252n ==>iload
n1492n treetop [0x7efc800c9210] bci=[2,79,454] rc=0 vc=459 vn=- li=- udi=- nc=1
n1245n aload <auto slot 5>[#758 Auto] [flags 0x7 0x0 ] (X!=0 sharedMemory ) [0x7efc800c44e0] bci=[2,79,454] rc=2 vc=459 vn=- li=1 udi=79 nc=0 flg=0x4
n1281n treetop [0x7efc800c5020] bci=[2,101,454] rc=0 vc=459 vn=- li=- udi=- nc=1
n1280n iconst 1 (X!=0 X>=0 ) [0x7efc800c4fd0] bci=[2,101,454] rc=2 vc=459 vn=- li=1 udi=- nc=0 flg=0x104
n1282n BNDCHK [#1] [0x7efc800c5070] bci=[2,101,454] rc=0 vc=459 vn=- li=- udi=- nc=2
n1280n ==>iconst 1
n1252n ==>iload
n1495n treetop [0x7efc800c9300] bci=[2,89,454] rc=0 vc=459 vn=- li=- udi=- nc=1
n1252n ==>iload
n1289n istorei <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] [0x7efc800c52a0] bci=[2,101,454] rc=0 vc=459 vn=- li=- udi=- nc=2
n1288n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c5250] bci=[2,101,454] rc=1 vc=459 vn=- li=1 udi=- nc=2 flg=0x8100
n1245n ==>aload
n1287n lconst 8 (highWordZero X!=0 X>=0 ) [0x7efc800c5200] bci=[2,101,454] rc=3 vc=459 vn=- li=1 udi=- nc=0 flg=0x4104
n1279n iushr [0x7efc800c4f80] bci=[2,100,454] rc=1 vc=459 vn=- li=1 udi=- nc=2
n1263n iand [0x7efc800c4a80] bci=[2,92,454] rc=1 vc=459 vn=- li=1 udi=- nc=2
n1262n iloadi <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] (cannotOverflow ) [0x7efc800c4a30] bci=[2,91,454] rc=1 vc=459 vn=- li=1 udi=- nc=1 flg=0x1000
n1261n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c49e0] bci=[2,91,454] rc=1 vc=459 vn=- li=1 udi=- nc=2 flg=0x8100
n1249n ==>aloadi
n1287n ==>lconst 8
n1247n iload <auto slot 6>[#763 Auto] [flags 0x3 0x0 ] (cannotOverflow ) [0x7efc800c4580] bci=[2,83,454] rc=1 vc=459 vn=- li=1 udi=80 nc=0 flg=0x1000
n1278n iloadi <array-shadow>[#482 Shadow] [flags 0x80000603 0x0 ] (cannotOverflow ) [0x7efc800c4f30] bci=[2,99,454] rc=1 vc=459 vn=- li=1 udi=- nc=1 flg=0x1000
n1277n aladd (X>=0 internalPtr sharedMemory ) [0x7efc800c4ee0] bci=[2,99,454] rc=1 vc=459 vn=- li=1 udi=- nc=2 flg=0x8100
n1265n ==>aloadi
n1287n ==>lconst 8
n1498n treetop [0x7efc800c93f0] bci=[2,89,454] rc=0 vc=459 vn=- li=- udi=- nc=1
n1252n ==>iload
n1293n istore <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800c53e0] bci=[2,102,453] rc=0 vc=459 vn=- li=- udi=24 nc=1
n1292n iconst 1 (X!=0 X>=0 ) [0x7efc800c5390] bci=[2,102,453] rc=1 vc=459 vn=- li=1 udi=- nc=0 flg=0x104
n1295n goto --> block_56 BBStart at n1180n [0x7efc800c5480] bci=[2,105,453] rc=0 vc=459 vn=- li=- udi=- nc=0
n1243n BBEnd </block_55> ===== [0x7efc800c4440] bci=[2,105,453] rc=0 vc=459 vn=- li=- udi=- nc=0
n1282n is the BNDCHK that throws an exception when the induction variable (auto slot 7) is 1, during the second iteration.
Loop Versioner versioned out the BNDCHK:
[ 1964] O^O LOOP VERSIONER: Creating test outside loop for checking if 00007EFC800C5070 exceeds bounds
1: The node 00007EFC800CAF70 has been created for testing if exceed bounds
loopLimit has been adjusted to 00007EFC800CB600
2: The node 00007EFC800CB6A0 has been created for testing if exceed bounds
3: The node 00007EFC800CC140 has been created for testing if exceed bounds
The new bound check looks invalid, for example:
n1870n BBStart <block_93> (freq 3684) [0x7efc80270840] bci=[2,68,453] rc=0 vc=507 vn=- li=- udi=- nc=0
n1609n ifiucmpgt --> block_90 BBStart at n1872n () [0x7efc800cb6a0] bci=[2,101,454] rc=0 vc=507 vn=- li=- udi=- nc=2 flg=0x10020
n1607n isub [0x7efc800cb600] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=2
n1606n iadd [0x7efc800cb5b0] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=2
n1605n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800cb560] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=0
n1603n imul [0x7efc800cb4c0] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=2
n1602n iadd [0x7efc800cb470] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=2
n1598n iadd [0x7efc800cb330] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=2
n1595n idiv [0x7efc800cb240] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=2
n1591n isub [0x7efc800cb100] bci=[2,73,453] rc=2 vc=507 vn=- li=- udi=- nc=2
n1587n iloadi java/awt/image/SinglePixelPackedSampleModel.numBands I[#757 Shadow +16] [flags 0x603 0x0 ] (cannotOverflow ) [0x7efc800cafc0] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=1 flg=0x1000
n1588n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800cb010] bci=[2,72,453] rc=1 vc=507 vn=- li=- udi=76 nc=0 flg=0x104
n1590n iload <auto slot 7>[#764 Auto] [flags 0x3 0x0 ] [0x7efc800cb0b0] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=0
n1592n iconst 0 [0x7efc800cb150] bci=[2,101,454] rc=2 vc=507 vn=- li=- udi=- nc=0
n1597n icmpne [0x7efc800cb2e0] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=2
n1596n irem [0x7efc800cb290] bci=[2,73,453] rc=2 vc=507 vn=- li=- udi=- nc=2
n1591n ==>isub
n1592n ==>iconst 0
n1594n iconst 0 [0x7efc800cb1f0] bci=[2,101,454] rc=2 vc=507 vn=- li=- udi=- nc=0
n1601n iand [0x7efc800cb420] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=2
n1599n icmpeq [0x7efc800cb380] bci=[2,73,453] rc=1 vc=507 vn=- li=- udi=- nc=2
n1596n ==>irem
n1594n ==>iconst 0
n1600n iconst 0 [0x7efc800cb3d0] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=0
n1593n iconst 0 [0x7efc800cb1a0] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=0
n1604n iconst -1 [0x7efc800cb510] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=0
n1608n iconst 1 (X!=0 X>=0 ) [0x7efc800cb650] bci=[2,101,454] rc=1 vc=507 vn=- li=- udi=- nc=0 flg=0x104
n1871n BBEnd </block_93> ===== [0x7efc80270890] bci=[2,68,453] rc=0 vc=507 vn=- li=- udi=- nc=0
Note the divide-by-zeros occur on node n1595n and n1596n.
As the first transformation of the Last Loop Versioner Group, Induction Variable Analysis produced a suspicious log:
<optimization id=40 name=inductionVariableAnalysis method=gnu/testlet/java/awt/image/SinglePixelPackedSampleModel/getPixel.test(Lgnu/testlet/TestHarness;)V>
Performing 40: inductionVariableAnalysis
Loop Controlling Induction Variable 764 (00007EFC801E70E0):
Branch Block is 38 (00007EFC801050E0)
EntryValue:
n1512n iconst 0 (X==0 X>=0 X<=0 ) [0x7efc800c9850] bci=[0,67,453] rc=0 vc=459 vn=- li=- udi=- nc=0 flg=0x302
ExitBound:
n1513n iloadi java/awt/image/SinglePixelPackedSampleModel.numBands I[#757 Shadow +16] [flags 0x603 0x0 ] (cannotOverflow ) [0x7efc800c98a0] bci=[0,73,453] rc=0 vc=465 vn=- li=- udi=- nc=1 flg=0x1000
n1514n aload m1<auto slot 2>[#623 Auto] [flags 0x7 0x0 ] (X!=0 X>=0 sharedMemory ) [0x7efc800c98f0] bci=[0,72,453] rc=1 vc=465 vn=- li=- udi=52 nc=0 flg=0x104
DeltaOnBackEdge: 1
DeltaOnExitEdge: 0
UsesUnchangedValueInLoopTest: 0
No transformations done by this pass -- omitting listings
</optimization>
MauveSingleThreadLoadTest is temporarily excluded in https://github.com/AdoptOpenJDK/openjdk-tests/pull/746
Is the machine (ub16-x86-1) still required for debugging or can it be re-enabled in the Jenkins machine pool?
@AdamBrousseau the investigation hasn't finished, JIT might still collect log if required, @0dvictor, please advise if otherwise.
Sorry we are still investigating the issues and hence may still need the machine.
@0dvictor do we still need this machine? it has been disabled from the jenkins machine pool for more than a month.
@JasonFengJ9 Thanks for reminding. I think I've got enough logs for the investigation. We can return the machine.
@AdamBrousseau please re-enable the machine.
@0dvictor Is there any chance to resolve this in the next week?
What is the severity of this issue, how likely is it to affect a customer?
Much of the following is recap from @0dvictor's analysis above.
The second iteration of the loop can't finish (or reach the counter update) because a BNDCHK will fail first.
GVP (correctly) creates an IV for the loop, with entry: 0, exit: {0, 1}, increment: 1. Then it (also correctly) folds the new loop counter value to 1.
Since the update is constant, IVA doesn't detect an IV for the loop. However, it leaves the existing IV in place.
Finally, versioner versions the BNDCHK because it's testing an IV that occurs in the loop test against a loop-invariant bound. Versioner doesn't recognize the IV update tree, so the versioning test it generates is wrong - it has a bunch of zeroes where the step should be. It has div and rem by zero, but we don't trip over them because they're beneath a mul by zero and they get folded away.
The broken versioning test passes and we enter the hot loop, which now runs forever because the counter stays at 1 and the BNDCHK is gone.
At least one of the optimizations above must have done something wrong, but we need to decide which.
Personally I found IVA's behaviour surprising, and I think it would be good for versioner to be more robust even if it isn't supposed to need to be.
@andrewcraik, thoughts?
my feeling is that we should do 3 in a form where we check the trees and the IV agree and if they don't we should get conservative with a warning message that always prints in the traceFull log and a good static debug counter. I think having GVP clean up IVs when folding the RHS of the update is also sensible. Whether IVA should remove all previous IVs is less clear - there may well be a compile-time saving to be had by doing incremental rebuilds so we may not want to do that in general - more investigation on point 2 would be needed before we could decide that was the right thing to do or not. FYI @vijaysun-omr
What is the severity of this issue, how likely is it to affect a customer?
In terms of 2, I don't believe induction variable analysis has proven to be particularly significant from a compile time perspective. So we could remove all previous IVs when we run another pass of IVA. I personally don't think we rely on "preserving" past IVs for anything (compile time or anything else) and the mental model the optimizer has is that IVs will be rebuilt from scratch when IVA runs.
For severity, I think typically this should just result in a hang, but the consequences depend on the body of the loop. Effectively we've transformed this loop:
for i in 0..n { BODY }
(where the particular loop body can only run for 1–1.5 iterations), into the following:
i = 0
forever { BODY; i = 1 }
The code that now runs infinitely often (instead of approx. once) can be completely arbitrary, since we can do this even if the body contains a call.
That said, I expect that it should be pretty rare to encounter this particular issue. It requires a counted loop that can't complete two full iterations, in such a way that VP can tell that that's the case, but with block frequencies that suggest the loop iterates often enough to version. Note that the two-iteration bound is tight; if a loop can complete two (or more) full iterations, then there is no way to constant fold the IV update.
As for fix strategy, I've reproduced the problem and when I trace IVA, I can see that it's analyzing the loop, not simply trusting the existing IV. The IV is preserved despite this result in the log:
Reject - could not find a loop controlling test
This seems unintentional to me. So, consistent with @vijaysun-omr's comment, I can't imagine there's any deliberate reuse of analysis results for compile time savings here.
the mental model the optimizer has is that IVs will be rebuilt from scratch when IVA runs.
That was my mental model too. So I think I'd like to fix this mainly by ensuring that IVA doesn't leave stale IVs, with an issue to follow up for extra robustness in VP and versioner.
sounds good to me - thank you both for the analysis
eclipse/omr#3462 has been merged. Here is the repro for posterity.
$ java -Xjit:'limit={*.go(*)*},count=0,tryToInline={*.inline*},dontInline={*.opaque*}' Repro
@jdmpapin are you able to package that up as a unit test and add it to our test bucket so no one else gets hit by this problem in the future?
@llxia can you re-include the test please since the fix has promoted.
The test is included and passed last night.