I've opened this bug at Eclipse because Eclipse crashes while running a custom process:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=569886
They asked me to also open a bug report here because it seems to be an OpenJ9 crash.
Their answer was:
To me it looks like this is a crash in OpenJ9, please report this crash at
OpenJ9 too [1] and add a reference to that bug here.
[1] https://github.com/eclipse/openj9/issues.
1XHEXCPCODE Windows_ExceptionCode: C0000005
1XHEXCPCODE J9Generic_Signal: 00000004
1XHEXCPCODE ExceptionAddress: 00007FF8809F0FBA
1XHEXCPCODE ContextFlags: 0010005F
1XHEXCPCODE Handler1: 00007FF880510D80
1XHEXCPCODE Handler2: 00007FF8B244AC10
1XHEXCPCODE InaccessibleWriteAddress: 00007FF40000002E
NULL
1XHEXCPMODULE Module: E:\AdoptOpenJdk11\bin\compressedrefs\j9jit29.dll
1XHEXCPMODULE Module_base_address: 00007FF880480000
1XHEXCPMODULE Offset_in_DLL: 0000000000570FBA
1XMCURTHDINFO Current thread
3XMTHREADINFO "JIT Compilation Thread-001" J9VMThread:0x0000000002C91200,
omrthread_t:0x00000000186FA4F0, java/lang/Thread:0x0000000500450BA0, state:R,
prio=10
1CIJAVAVERSION JRE 11 Windows 8 amd64-64 (build 11.0.9+11)
1CIVMVERSION 20201022_795
1CIJ9VMTAG openj9-0.23.0
1XHEXCPMODULE Compiling method: org/eclipse/jdt/internal/compiler/util/HashtableOfObject.rehash()V
NULL
1XHFLAGS VM flags:00000000000516FF
vmState [0x516ff]: {J9VMSTATE_JIT_CODEGEN} {partialRedundancyElimination}
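For readers following the javacore: the Offset_in_DLL value is simply the exception address minus the module base address. A minimal sketch of that arithmetic with the values from the excerpt above; `offsetInModule` is a made-up helper name for illustration, not an OpenJ9 function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of OpenJ9): distance of an address from the
// base address of the module that contains it.
constexpr std::uint64_t offsetInModule(std::uint64_t address, std::uint64_t moduleBase)
{
    return address - moduleBase;
}

// Values copied from the javacore excerpt above.
constexpr std::uint64_t kExceptionAddress = 0x00007FF8809F0FBAULL; // ExceptionAddress
constexpr std::uint64_t kJitModuleBase    = 0x00007FF880480000ULL; // Module_base_address

// The result matches the reported Offset_in_DLL.
static_assert(offsetInModule(kExceptionAddress, kJitModuleBase) == 0x570FBAULL,
              "offset should match Offset_in_DLL from the javacore");
```

The offset (j9jit29+0x570fba) identifies the faulting instruction regardless of where the DLL happened to be loaded in this process.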
@0xdaryl fyi
@a7ehuo : would you mind triaging this problem please? Some of the crash artifacts are available as an attachment on the bugzilla issue above.
@gpezzini : was there a larger Windows crash dump file produced (*.dmp) and is it still available? If so, could you either transfer it via Slack (you can message it to me (0xdaryl) on the OpenJ9 workspace) or your favourite file sharing service (e.g., Google Drive, Box, etc.)?
@gpezzini are there any other diagnostics? There is usually a "javacore." and a "jitdump." file created as well. These names are printed to stderr when the crash happens. They are almost always in the same directory as the core file (core.20201223.085735.16160.0001.dmp).
The "jitdump.*" file in particular could be very useful to aid in the investigation here.
Hi, you will find all the files I found at:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=569886
Please, let me know if it's enough.
Otherwise I will try to reproduce it, although I'm fairly sure I have posted all the files I found.
Thanks
@a7ehuo see the bugzilla link. There is a jitdump.20201223.085735.16160.0004.dmp attached which shows the following:
<optimization id=103 name=partialRedundancyElimination method=org/eclipse/jdt/internal/compiler/util/HashtableOfObject.rehash()V>
Performing 103: partialRedundancyElimination
...
...
Block Number (ordered list) : 56
Unaffected by order <165>
Unaffected by order <169>
Affected by order <166>
Affected by order <168>
Affected by order <167>
Affected by order <170>
</logRecompilation>
</jitDump>
The jitdump recompilation has reproduced the issue and has also enabled tracePRE. We can see the crash happening at the end of the log when the tracing starts. The jitdump along with the core dump should be able to tell us enough about where in the code the problem happened.
Given we know the exact trees which caused the problem I bet you can force a compilation of that method in a unit test and reproduce the issue with some extra JIT options to force the same inlining.
Also @vijaysun-omr FYI another success for the jitdump work!
Great to see @fjeremic I did want to just mention @klangman and @JamesKingdon to make them aware that Filip has been improving the JIT dump functionality in recent months and that you may be able to reap the benefit of that work on the service streams as well.
Thanks @fjeremic for the quick assessment! I was looking at the attached files. There is no core file. The jitdump shows the method being compiled is "org/eclipse/jdt/internal/compiler/util/HashtableOfObject.rehash()V" at hot. I'll see if I can reproduce the crash with a unit test.
The jitdump recompilation stopped in TR_ExceptionCheckMotion::perform() https://github.com/eclipse/openj9-omr/blob/a9b64bdc8106fa8bda834b195667c6f3d2e9f26f/compiler/optimizer/PartialRedundancy.cpp#L2581
for (j=size;j<orderedListSize;j++)
   {
   if (trace())
      traceMsg(comp(), "Affected by order <%d>\n", nextElement->getData()->getLocalIndex());
   _orderedOptNumbersList[i][j] = nextElement->getData()->getLocalIndex();
   nextElement = nextElement->getNextElement();
   }
The crashed compilation thread stack
<jitDump>
#INFO: Crashed in compilation thread 0000000002C91200.
--------------------
3XMTHREADINFO "JIT Compilation Thread-001" J9VMThread:0x0000000002C91200, omrthread_t:0x00000000186FA4F0, java/lang/Thread:0x0000000500450BA0, state:R, prio=10
3XMJAVALTHREAD (java/lang/Thread getId:0x4, isDaemon:true)
3XMTHREADINFO1 (native thread ID:0x240C, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000000)
3XMCPUTIME CPU usage total: 32.093750000 secs, user: 30.796875000 secs, system: 1.296875000 secs, current category="JIT"
3XMHEAPALLOC Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3 No Java callstack associated with this thread
3XMTHREADINFO3 Native callstack:
4XENATIVESTACK Java_java_lang_invoke_MutableCallSite_invalidate+0x4a09fa (0x00007FF8809F0FBA [j9jit29+0x570fba])
4XENATIVESTACK Java_java_lang_invoke_MutableCallSite_invalidate+0x496640 (0x00007FF8809E6C00 [j9jit29+0x566c00])
4XENATIVESTACK Java_java_lang_invoke_MutableCallSite_invalidate+0x4297c9 (0x00007FF880979D89 [j9jit29+0x4f9d89])
4XENATIVESTACK Java_java_lang_invoke_MutableCallSite_invalidate+0x42828f (0x00007FF88097884F [j9jit29+0x4f884f])
4XENATIVESTACK Java_java_lang_invoke_MutableCallSite_invalidate+0x426125 (0x00007FF8809766E5 [j9jit29+0x4f66e5])
4XENATIVESTACK Java_java_lang_invoke_MutableCallSite_invalidate+0x275b89 (0x00007FF8807C6149 [j9jit29+0x346149])
4XENATIVESTACK (0x00007FF8804FD87E [j9jit29+0x7d87e])
4XENATIVESTACK (0x00007FF880500A3C [j9jit29+0x80a3c])
4XENATIVESTACK j9port_isCompatible+0x18946 (0x00007FF8B244B656 [j9prt29+0x1b656])
4XENATIVESTACK j9port_isCompatible+0x1a044 (0x00007FF8B244CD54 [j9prt29+0x1cd54])
4XENATIVESTACK (0x00007FF8804FCF9E [j9jit29+0x7cf9e])
4XENATIVESTACK (0x00007FF880502EBC [j9jit29+0x82ebc])
4XENATIVESTACK (0x00007FF88050289A [j9jit29+0x8289a])
4XENATIVESTACK (0x00007FF880510B3F [j9jit29+0x90b3f])
4XENATIVESTACK j9port_isCompatible+0x1a07f (0x00007FF8B244CD8F [j9prt29+0x1cd8f])
4XENATIVESTACK (0x00007FF88051075D [j9jit29+0x9075d])
4XENATIVESTACK omrthread_get_category+0xa42 (0x00007FF8CED74452 [J9THR29+0x4452])
4XENATIVESTACK _configthreadlocale+0x92 (0x00007FF8FA1F14C2 [ucrtbase+0x214c2])
4XENATIVESTACK BaseThreadInitThunk+0x14 (0x00007FF8FA417034 [KERNEL32+0x17034])
4XENATIVESTACK RtlUserThreadStart+0x21 (0x00007FF8FC2BD0D1 [ntdll+0x4d0d1])
@a7ehuo the windows stack trace above is not accurate; it's because the symbols are in a different place and I guess the javacore writer doesn't know where to grab the symbols from... If you want the real stack trace, given that you have no core file, you'll likely have to manually go through the exe using something like DUMPBIN (similar to objdump on linux).
See https://github.com/eclipse/openj9/issues/11569#issuecomment-754739789 for the core file.
Got the backtrace from the core. It crashed while executing *_actualOptSetInfo[i] |= *_tempContainer; in TR_ExceptionCheckMotion::perform()
_actualOptSetInfo[i] looks corrupted. i is block 56 if I understand it correctly. That's also where the jitdump recompilation stopped processing.
# Child-SP RetAddr Call Site
00 00000000`19193708 00007ff8`f9dba34e ntdll!NtWaitForSingleObject+0x14
01 00000000`19193710 00007ff8`b2444f90 KERNELBASE!WaitForSingleObjectEx+0x8e
02 00000000`191937b0 00007ff8`af1a5f13 j9prt29!omrdump_create(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, char * filename = 0x00000000`19193b80 "E:\TestBuildAllInBatch\core.20201223.085735.16160.0001.dmp", char * dumpType = 0x00000000`02af15c0 "???", void * userData = 0x00000000`00000000)+0x300 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win32\omrosdump.c @ 185]
03 00000000`19193850 00007ff8`af1a5a05 j9dmp29!doSystemDump(struct J9RASdumpAgent * agent = 0x00000000`02af15c0, char * label = 0x00000000`19193b80 "E:\TestBuildAllInBatch\core.20201223.085735.16160.0001.dmp", struct J9RASdumpContext * context = <Value unavailable error>)+0xa3 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\rasdump\dmpagent.c @ 751]
04 00000000`191938b0 00007ff8`b244b656 j9dmp29!protectedDumpFunction(struct J9PortLibrary * portLibrary = 0x00000000`00000000, void * userData = <Value unavailable error>)+0x15 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\rasdump\dmpagent.c @ 2904]
05 00000000`191938e0 00007ff8`b244cd54 j9prt29!runInTryExcept(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * fn = 0x00000000`00000000, void * fn_arg = 0x00000000`00000000, <function> * handler = 0x00007ff8`af1a5a60, void * handler_arg = 0x00000000`00000000, unsigned int flags = 0x7d, unsigned int64 * result = 0x00000000`19193b40)+0x16 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 220]
06 00000000`19193920 00007ff8`af1a39a4 j9prt29!omrsig_protect(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * fn = 0x00007ff8`af1a59f0, void * fn_arg = 0x00000000`19193b58, <function> * handler = 0x00007ff8`af1a5a60, void * handler_arg = 0x00000000`00000000, unsigned int flags = 0x7d, unsigned int64 * result = 0x00000000`19193b40)+0x214 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 285]
07 (Inline Function) --------`-------- j9dmp29!runDumpFunction(void)+0x6f [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\rasdump\dmpagent.c @ 2878]
08 00000000`19193b00 00007ff8`af1b9119 j9dmp29!runDumpAgent(struct J9JavaVM * vm = 0x00000000`00a77750, struct J9RASdumpAgent * agent = 0x00000000`02af15c0, struct J9RASdumpContext * context = 0x00000000`19194050, unsigned int64 * state = 0x00000000`19194028, char * detail = 0x00000000`191940e0 "", unsigned int64 timeNow = 0x00000176`8e9a748e)+0x2f4 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\rasdump\dmpagent.c @ 2804]
09 00000000`19193fe0 00007ff8`9e3c1f3f j9dmp29!triggerDumpAgents(struct J9JavaVM * vm = 0x00000000`00a77750, struct J9VMThread * self = 0x00000000`02c91200, unsigned int64 eventFlags = 0x2000, struct J9RASdumpEventData * eventData = 0x00000000`00000000)+0x349 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\rasdump\trigger.c @ 1046]
0a 00000000`19194350 00007ff8`b244b656 j9vm29!generateDiagnosticFiles(struct J9PortLibrary * portLibrary = <Value unavailable error>, void * userData = 0x00000000`00000000)+0x1df [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\vm\gphandle.c @ 1181]
0b 00000000`19194810 00007ff8`b244cd54 j9prt29!runInTryExcept(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * fn = 0x00007ff4`607f4b70, void * fn_arg = 0x00007ff4`60690010, <function> * handler = 0x00007ff8`9e3c16d0, void * handler_arg = 0x00000000`19194a78, unsigned int flags = 0x7d, unsigned int64 * result = 0x00000000`19194a70)+0x16 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 220]
0c 00000000`19194850 00007ff8`9e3c2142 j9prt29!omrsig_protect(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * fn = 0x00007ff8`9e3c1d60, void * fn_arg = 0x00000000`19194a88, <function> * handler = 0x00007ff8`9e3c16d0, void * handler_arg = 0x00000000`19194a78, unsigned int flags = 0x7d, unsigned int64 * result = 0x00000000`19194a70)+0x214 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 285]
0d 00000000`19194a30 00007ff8`b244b57c j9vm29!vmSignalHandler(struct J9PortLibrary * portLibrary = 0x00007ff8`bc1dd770, unsigned int gpType = 4, void * gpInfo = 0x00000000`19195970, void * userData = <Value unavailable error>)+0x1d2 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\vm\gphandle.c @ 850]
0e 00000000`19195920 00007ff8`b2465ca3 j9prt29!structuredExceptionHandler(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * handler = 0x00000000`00000000, void * handler_arg = 0x00000000`02c91200, unsigned int flags = 0x19195bc8, struct _EXCEPTION_POINTERS * exceptionInfo = 0x00000000`19195b80)+0x16c [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 1292]
0f 00000000`19195b10 00007ff8`e3c9c6a0 j9prt29!runInTryExcept$filt$0+0x23 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 221]
10 00000000`19195b50 00007ff8`fc31130f VCRUNTIME140!__C_specific_handler(struct _EXCEPTION_RECORD * ExceptionRecord = 0x00000000`191967f0, void * EstablisherFrame = 0x00000000`1919de70, struct _CONTEXT * ContextRecord = <Value unavailable error>, struct _DISPATCHER_CONTEXT * DispatcherContext = 0x00000000`19196180)+0xa0
11 00000000`19195bc0 00007ff8`fc2bb5e4 ntdll!RtlpExecuteHandlerForException+0xf
12 00000000`19195bf0 00007ff8`fc30fe3e ntdll!RtlDispatchException+0x244
13 00000000`19196300 00007ff8`809f0fba ntdll!KiUserExceptionDispatch+0x2e
14 (Inline Function) --------`-------- j9jit29!TR_BitVector::operator|=(class TR_BitVector * v2 = 0x00007ff4`60880bb0)+0x42 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\infra\bitvector.hpp @ 462]
15 00000000`19196a00 00007ff8`809e6c00 j9jit29!TR_ExceptionCheckMotion::perform(void)+0xdaa [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\optimizer\partialredundancy.cpp @ 2596]
16 00000000`19197b70 00007ff8`80979d89 j9jit29!TR_PartialRedundancy::perform(void)+0xf40 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\optimizer\partialredundancy.cpp @ 449]
17 00000000`19198cb0 00007ff8`8097884f j9jit29!OMR::Optimizer::performOptimization(struct OptimizationStrategy * optimization = 0x00007ff8`80cca7f0, int firstOptIndex = <Value unavailable error>, int lastOptIndex = <Value unavailable error>, int doTiming = 0n0)+0x1879 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\optimizer\omroptimizer.cpp @ 2061]
18 00000000`19199f40 00007ff8`809766e5 j9jit29!OMR::Optimizer::performOptimization(struct OptimizationStrategy * optimization = 0x00007ff8`80c45510, int firstOptIndex = 0n0, int lastOptIndex = 0n2147483647, int doTiming = 0n0)+0x33f [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\optimizer\omroptimizer.cpp @ 1607]
19 00000000`1919b1d0 00007ff8`807c6149 j9jit29!OMR::Optimizer::optimize(void)+0x3e5 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\optimizer\omroptimizer.cpp @ 1143]
1a (Inline Function) --------`-------- j9jit29!OMR::Compilation::performOptimizations(void)+0x26 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\compile\omrcompilation.cpp @ 1293]
1b 00000000`1919cb00 00007ff8`804fd87e j9jit29!OMR::Compilation::compile(void)+0x699 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\compiler\compile\omrcompilation.cpp @ 1088]
1c 00000000`1919db40 00007ff8`80500a3c j9jit29!TR::CompilationInfoPerThreadBase::compile(struct J9VMThread * vmThread = 0x00000000`02c91200, class TR::Compilation * compiler = 0x00007ff4`603a0000, class TR_ResolvedMethod * compilee = 0x00000000`1919e358, class TR_J9VMBase * vm = 0x00000000`2dcdfae0, class TR_OptimizationPlan * optimizationPlan = 0x00000000`3d3ab4d0, class TR::SegmentAllocator * scratchSegmentProvider = 0x00000000`1919e1f0)+0x7be [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 9210]
1d 00000000`1919dcd0 00007ff8`b244b656 j9jit29!TR::CompilationInfoPerThreadBase::wrappedCompile(struct J9PortLibrary * portLib = 0x00000000`186fa4f0, void * opaqueParameters = 0x00000000`1919e140)+0x144c [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 8735]
1e 00000000`1919de70 00007ff8`b244cd54 j9prt29!runInTryExcept(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * fn = 0x00000000`0001c902, void * fn_arg = 0x00000000`0001c902, <function> * handler = 0x00007ff8`80510d80, void * handler_arg = 0x00000000`02c91200, unsigned int flags = 0x3d, unsigned int64 * result = 0x00000000`1919e0e8)+0x16 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 220]
1f 00000000`1919deb0 00007ff8`804fcf9e j9prt29!omrsig_protect(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * fn = 0x00007ff8`804ff5f0, void * fn_arg = 0x00000000`1919e140, <function> * handler = 0x00007ff8`80510d80, void * handler_arg = 0x00000000`02c91200, unsigned int flags = 0x3d, unsigned int64 * result = 0x00000000`1919e0e8)+0x214 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 285]
20 00000000`1919e090 00007ff8`80502ebc j9jit29!TR::CompilationInfoPerThreadBase::compile(struct J9VMThread * vmThread = 0x00000000`02c91200, struct TR_MethodToBeCompiled * entry = 0x00000000`2e033610, class J9::J9SegmentProvider * scratchSegmentProvider = 0x00000000`03cac900)+0x3ee [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 7831]
21 00000000`1919f360 00007ff8`8050289a j9jit29!TR::CompilationInfoPerThread::processEntry(struct TR_MethodToBeCompiled * entry = 0x00000000`2e033610, class J9::J9SegmentProvider * scratchSegmentProvider = 0x00000000`1919f440)+0x34c [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 4261]
22 00000000`1919f3e0 00007ff8`80510b3f j9jit29!TR::CompilationInfoPerThread::processEntries(void)+0x15a [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 3951]
23 (Inline Function) --------`-------- j9jit29!TR::CompilationInfoPerThread::run(void)+0x200 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 3816]
24 00000000`1919f4d0 00007ff8`b244cd8f j9jit29!protectedCompilationThreadProc(struct J9PortLibrary * __formal = 0x00000000`186fa4f0, class TR::CompilationInfoPerThread * compInfoPT = 0x00000000`18533300)+0x34f [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 3755]
25 00000000`1919f520 00007ff8`8051075d j9prt29!omrsig_protect(struct OMRPortLibrary * portLibrary = 0x00007ff8`bc1dd770, <function> * fn = 0x00007ff8`805107f0, void * fn_arg = 0x00000000`18533300, <function> * handler = 0x00007ff8`9e3c1120, void * handler_arg = 0x00000000`02c91200, unsigned int flags = 0x7e, unsigned int64 * result = 0x00000000`1919f768)+0x24f [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\port\win64amd\omrsignal.c @ 297]
26 00000000`1919f700 00007ff8`ced74452 j9jit29!compilationThreadProc(void * entryarg = 0x00000000`18533300)+0x27d [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\compiler\control\compilationthread.cpp @ 3663]
27 00000000`1919f760 00007ff8`fa1f14c2 j9thr29!thread_wrapper(void * arg = 0x00000000`186fa4f0)+0xf2 [j:\jenkins\tmp\workspace\build\src\build\windows-x86_64-normal-server-release\vm\omr\thread\common\omrthread.c @ 1718]
28 00000000`1919f790 00007ff8`fa417034 ucrtbase!thread_start<unsigned int +0x42
29 00000000`1919f7c0 00007ff8`fc2bd0d1 kernel32!BaseThreadInitThunk+0x14
2a 00000000`1919f7f0 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Debugger.Sessions[0].Processes[16160].Threads[9228].Stack.Frames[20].SwitchTo()
@rdi class TR_BitVector * this = 0x00007ff4`6087fbc0
@rbx class TR_BitVector * v2 = 0x00007ff4`60880bb0
<unavailable> int v2Used = <value unavailable>
@r8d int i = 0n1
0:003> dx -r1 ((j9jit29!TR_BitVector *)0x7ff46087fbc0)
((j9jit29!TR_BitVector *)0x7ff46087fbc0) : 0x7ff46087fbc0 [Type: TR_BitVector *]
nullContainerCharacteristic : -1 [Type: int]
[+0x000] _chunks : 0x7ff40000001e : Unable to read memory at Address 0x7ff40000001e [Type: unsigned __int64 *]
[+0x008] _region : 0x19197c10 [Type: TR::Region *]
[+0x010] _numChunks : 4 [Type: int]
[+0x014] _firstChunkWithNonZero : 2 [Type: int]
[+0x018] _lastChunkWithNonZero : 2 [Type: int]
[+0x01c] _growable : growable (1) [Type: TR_BitVectorGrowable]
0:003> dx -r1 ((j9jit29!TR_BitVector *)0x7ff460880bb0)
((j9jit29!TR_BitVector *)0x7ff460880bb0) : 0x7ff460880bb0 [Type: TR_BitVector *]
nullContainerCharacteristic : -1 [Type: int]
[+0x000] _chunks : 0x7ff460880bd0 : 0x0 [Type: unsigned __int64 *]
[+0x008] _region : 0x19197c10 [Type: TR::Region *]
[+0x010] _numChunks : 4 [Type: int]
[+0x014] _firstChunkWithNonZero : 2 [Type: int]
[+0x018] _lastChunkWithNonZero : 2 [Type: int]
[+0x01c] _growable : growable (1) [Type: TR_BitVectorGrowable]
0:003> dv
this = 0x00007ff4`60659c20
exprsNotRedundant = 0x00007ff4`60892280
stackMemoryRegion = class TR::StackMemoryRegion
arraySize = <value unavailable>
nextNode = 0x00007ff4`60540000
cfg = 0x00007ff4`603ea590
numIterations = 0n0
rootStructure = 0x00007ff4`60696b50
redundantSetAdjustmentRequired = false
i = <value unavailable>
i = <value unavailable>
size = 0n2
i = 0n56
bvi = class TR_BitVectorCursor
orderedListSize = <value unavailable>
nextElement = 0x00000000`00000000
redundantExpressionAdjustment = 0x00007ff4`603a0000
unavailableSetInfo = <value unavailable>
currentTree = <value unavailable>
Block Number (ordered list) : 56
Unaffected by order <165>
Unaffected by order <169>
Affected by order <166>
Affected by order <168>
Affected by order <167>
Affected by order <170>
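A side observation from the numbers above, not something stated in the thread: the InaccessibleWriteAddress in the javacore (00007FF40000002E) is exactly two 8-byte chunks past the corrupted `_chunks` pointer (0x7ff40000001e). That would be consistent with `operator|=` writing chunk index 2, which is `_firstChunkWithNonZero` of `v2`. A small sketch of the arithmetic, where `chunkAddress` is a made-up helper name:

```cpp
#include <cassert>
#include <cstdint>

// One chunk is 8 bytes: the debugger reports _chunks as "unsigned __int64 *".
constexpr std::uint64_t kChunkBytes = 8;

// Hypothetical helper: address of chunk `index` given the raw _chunks pointer value.
constexpr std::uint64_t chunkAddress(std::uint64_t chunksPointer, std::uint64_t index)
{
    return chunksPointer + index * kChunkBytes;
}

// Values copied from the javacore and the debugger output above.
constexpr std::uint64_t kCorruptChunks     = 0x00007FF40000001EULL; // this->_chunks
constexpr std::uint64_t kInaccessibleWrite = 0x00007FF40000002EULL; // InaccessibleWriteAddress
constexpr std::uint64_t kFirstNonZeroChunk = 2;                     // v2->_firstChunkWithNonZero

static_assert(chunkAddress(kCorruptChunks, kFirstNonZeroChunk) == kInaccessibleWrite,
              "faulting address should equal &_chunks[2] under this assumption");
```

If that reading is right, the access violation is the first write through the bad pointer rather than a stray write elsewhere.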
I’m trying to match the crashed compilation locally, but I have not been able to match the exact nested inlining yet. The call flow is that HashtableOfObject.rehash() calls HashtableOfObject.putUnsafely(), which in turn can call HashtableOfObject.rehash(). The crashed compilation has 3 layers of nested inlining. Locally I'm only able to force the nested inlining of putUnsafely() once.
crashed compilation in jitdmp
Trees: for org/eclipse/jdt/internal/compiler/util/HashtableOfObject.rehash()V
Call Stack Info
CalleeIndex CallerIndex ByteCodeIndex CalleeMethod
0 -1 10 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.<init>(I)V
1 0 1 java/lang/Object.<init>()V
|--- 2 -1 42 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.putUnsafely([CLjava/lang/Object;)V
| 3 2 7 org/eclipse/jdt/core/compiler/CharOperation.hashCode([C)I
| 4 3 1 java/util/Arrays.hashCode([C)I
|-> 5 2 74 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.rehash()V // <== inlined again
6 5 10 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.<init>(I)V
7 6 1 java/lang/Object.<init>()V
|--- 8 5 42 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.putUnsafely([CLjava/lang/Object;)V
| 9 8 7 org/eclipse/jdt/core/compiler/CharOperation.hashCode([C)I
| 10 9 1 java/util/Arrays.hashCode([C)I
|-> 11 8 74 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.rehash()V // <== inlined again
12 11 10 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.<init>(I)V
13 12 1 java/lang/Object.<init>()V
14 11 42 org/eclipse/jdt/internal/compiler/util/HashtableOfObject.putUnsafely([CLjava/lang/Object;)V
15 14 7 org/eclipse/jdt/core/compiler/CharOperation.hashCode([C)I
16 15 1 java/util/Arrays.hashCode([C)I
tryToInline={HashtableOfObject.<init>(I)V|HashtableOfObject.putUnsafely([CLjava/lang/Object;)V|HashtableOfObject.rehash()V|CharOperation.hashCode([C)I|java/lang/Object.<init>()V|java/util/Arrays.hashCode([C)I}
local test
Pre Instruction Selection Trees: for HashtableOfObject.rehash()V
Call Stack Info
CalleeIndex CallerIndex ByteCodeIndex CalleeMethod
0 -1 10 HashtableOfObject.<init>(I)V
1 0 1 java/lang/Object.<init>()V
|--- 2 -1 46 HashtableOfObject.putUnsafely([CLjava/lang/Object;)V
| 3 2 7 CharOperation.hashCode([C)I
| 4 3 1 java/util/Arrays.hashCode([C)I
|-> 5 2 74 HashtableOfObject.rehash()V // <== inlined again
6 5 10 HashtableOfObject.<init>(I)V
7 6 1 java/lang/Object.<init>()V
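The difference between the two tables can be checked mechanically by following CallerIndex links. Below is a rough sketch, assuming a toy `InlinedCall` record and shortened method names (both invented here, not taken from the JIT), that encodes the crashed compilation's table and counts how many frames on one inlining chain are calls to a given method.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One row of the "Call Stack Info" table. Toy type for illustration only.
struct InlinedCall
{
    int callerIndex;    // -1 means the method being compiled, rehash()V
    std::string callee; // shortened method name
};

// Rows 0..16 of the crashed compilation's table, indexed by CalleeIndex.
inline std::vector<InlinedCall> crashedCallStackInfo()
{
    return {
        {-1, "HashtableOfObject.<init>"}, {0, "Object.<init>"},
        {-1, "putUnsafely"}, {2, "CharOperation.hashCode"}, {3, "Arrays.hashCode"},
        {2, "rehash"}, {5, "HashtableOfObject.<init>"}, {6, "Object.<init>"},
        {5, "putUnsafely"}, {8, "CharOperation.hashCode"}, {9, "Arrays.hashCode"},
        {8, "rehash"}, {11, "HashtableOfObject.<init>"}, {12, "Object.<init>"},
        {11, "putUnsafely"}, {14, "CharOperation.hashCode"}, {15, "Arrays.hashCode"},
    };
}

// Walk caller links from calleeIndex up to the outermost method and count
// how many frames on that chain are calls to `name`.
inline int countOnChain(const std::vector<InlinedCall> &table, int calleeIndex,
                        const std::string &name)
{
    int count = 0;
    for (int i = calleeIndex; i != -1; i = table[i].callerIndex)
    {
        if (table[i].callee == name)
            ++count;
    }
    return count;
}
```

Starting from CalleeIndex 14 the chain is 14, 11, 8, 5, 2, which contains putUnsafely three times and rehash twice. Rows 0 to 7 are the only ones present in the local test, where the deepest chain contains putUnsafely once.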
Maybe the paths that are not being inlined in your local test don't cross some frequency threshold that makes them candidates for inlining. I wonder if you would get the inlining you want if you relaxed some of the frequency-based thresholds that the inliner uses (note this could inline a lot more and cause inlining differences in general, but it may at least be worth a shot en route to forcing the specific inlining that you want). @ashu-mehra has recently tried the long set of options to avoid considering frequency in the inliner, and @mpirvu might also be able to share that options string.
"disableConservativeColdInlining,disableConservativeInlining,bigCalleeThreshold=600,bigCalleeHotOptThreshold=600,bigCalleeScorchingOptThreshold=600,inlineVeryLargeCompiledMethods"
Thanks @mpirvu @vijaysun-omr! I'll give those options a try.
Hi, note that I can always reproduce the problem.
If you need anything else from me, e.g. running the JVM with additional options to give you more information, please let me know and I will do it.
Thanks
@gpezzini, could you help try the following? Thanks!
1. Run with “-Xjit:disableInlining” on the jdk-11.0.9+11_openj9-0.23.0 build that crashes.
2. Run with a nightly build. I hit a crash when using jdk-11.0.9+11_openj9-0.23.0 to catch logs, but it works when I use a nightly build. I wonder if another build works for you.
@a7ehuo I will. I'll get back to you when I have the results.
Note that the error does not occur right away, but after about 3 hours of processing.
Thanks
I tried the options from https://github.com/eclipse/openj9/issues/11569#issuecomment-756275270 along with all the methods in tryToInline. It still does not reproduce the nested inlining of the crashed case. I got sidetracked by a crash (not related to PRE) when I tried to collect the trace log along with tryToInline using build jdk-11.0.9+11_openj9-0.23.0. Adding only tryToInline without collecting the trace log, or using another build, bypasses that crash.
@a7ehuo
I'm running test 1.
In the meantime I've downloaded the build you suggested, but Avast says that the file is infected.
Please see:
This does not happen with jdk-11.0.9+11_openj9-0.23.0.
Right now I do not know whether I can continue or not.
I found the nightly build on the AdoptOpenJDk nightly build page: https://adoptopenjdk.net/nightly.html?variant=openjdk11&jvmVariant=openj9 and chose the build on Jan 7th which is the one I tried locally. It's likely a false hit. I'll open an issue in AdoptOpenJDK support to track it.
Windows x64 | jdk | normal | 7 January 2021 | .zip (197 MB) | .msi | Checksum
What infection was detected? I used to have problems with anti-virus on Windows, but the detected problem seemed to be just that the binary wasn't recognized, which makes sense for a nightly build.
Still looking. Since _actualOptSetInfo[56]->_chunks is invalid, I also checked _actualOptSetInfo[55]->_chunks, _actualOptSetInfo[58]->_chunks, and _actualRednSetInfo[56]->_chunks. They all look good.
[+0x000] _chunks : 0x7ff40000001e : Unable to read memory at Address 0x7ff40000001e [Type: unsigned __int64 *]
_actualOptSetInfo is allocated on the stack. At the time TR_ExceptionCheckMotion::analyzeBlockStructure() is called, _actualOptSetInfo[56]->_chunks should still be good; otherwise it would crash when _actualOptSetInfo[blockNum]->set() or _actualOptSetInfo[blockNum]->get() is called.
In the meantime I've downloaded the build you suggested, but Avast says that the file is infected.
@gpezzini I asked the question in AdoptOpenJDK slack channel. Here is the answer copied from the reply in case you don't have access to it:
"Reports like that are usually false positives and not that uncommon. A good test is to upload files to virustotal.com which scans the file with 30+ AV engines. None reported a virus for the file you gave."
@a7ehuo
Test 1.
“-Xjit:disableInlining” with the jdk-11.0.9+11_openj9-0.23.0
It ended correctly.
Test 2.
Ok, I will start with test 2 and I will let you know
@a7ehuo
Hi, test 2 has finished, run without “-Xjit:disableInlining” and with the downloaded build:
openjdk version "11.0.10" 2021-01-19
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.10+8-202101062342)
Eclipse OpenJ9 VM AdoptOpenJDK (build master-137829daa, JRE 11 Windows 10 amd64-64-Bit Compressed References 20210106_889 (JIT enabled, AOT enabled)
OpenJ9 - 137829daa
OMR - 13926a736
JCL - 6319cd4ec9 based on jdk-11.0.10+8)
The job ends correctly
@gpezzini Thanks for helping try the two things! To work around this issue while we're investigating it, either you could continue using the above nightly build, or add "-Xjit:disablePRE" to build jdk-11.0.9+11_openj9-0.23.0.
@a7ehuo Thanks a lot for your support!!
With the help from @vijaysun-omr, we found the problem. A potential fix is being tested.
@vijaysun-omr found that the jitdump shows expressions 28 and 30 appearing on both the affected and unaffected lists in block_58, before the compilation crashed in block_56. Expression 28 is n419n. n1034n is considered to be the same as n419n (li=28). However, the first child of n419n (li=7) and the first child of n1034n (li=11) are actually different. Other awrtbari nodes that are different from n419n are also mistakenly assigned li=28.
Block Number (ordered list) : 58
Unaffected by order <28>
Unaffected by order <30>
Unaffected by order <165>
Affected by order <28>
Affected by order <30>
Expression #28 is :
n419n ==>awrtbari
n419n awrtbari org/eclipse/jdt/internal/compiler/util/HashtableOfObject.keyTable [[C[#374 Shadow +8] [flags 0x607 0x0 ] (X!=0 skipWrtBar sharedMemory ) [0x00007FF45F4472B0] bci=[0,37,41] rc=1 vc=7174 vn=27 li=28 udi=- nc=3 flg=0x282c
n414n loadaddr <temp slot 13>[#572 Auto] [flags 0x6000000e 0x0 ] (X!=0 X>=0 cannotOverflow nodePointsToNonNull cannotTrackLocalUses escapesInColdBlock sharedMemory ) [0x00007FF45F447120] bci=[0,32,41] rc=4 vc=7174 vn=10 li=7 udi=306 nc=0 flg=0x9104
n417n ==>anewarray
n414n ==>loadaddr
n1034n awrtbari org/eclipse/jdt/internal/compiler/util/HashtableOfObject.keyTable [[C[#374 Shadow +8] [flags 0x607 0x0 ] (X!=0 skipWrtBar sharedMemory ) [0x00007FF45F4E32F0] bci=[6,37,41] rc=1 vc=7216 vn=112 li=28 udi=- nc=3 flg=0x282c
n1029n loadaddr <temp slot 11>[#564 Auto] [flags 0x6000000e 0x0 ] (X!=0 X>=0 cannotOverflow nodePointsToNonNull cannotTrackLocalUses escapesInColdBlock sharedMemory ) [0x00007FF45F4E3160] bci=[6,32,41] rc=4 vc=7216 vn=13 li=11 udi=313 nc=0 flg=0x9104
n1032n ==>anewarray
n1029n ==>loadaddr
In TR_LocalAnalysisInfo::hasOldExpressionOnRhs(), we temporarily change awrtbari to aloadi for the syntactic comparison in areSyntacticallyEquivalent(). When it’s an indirect store, the number of children of the node is set to 1, otherwise 0.
https://github.com/eclipse/omr/blob/bded46caf73287830139853961ffacb96cc9758e/compiler/optimizer/LocalAnalysis.cpp#L530
TR::Node::recreate(node, _compilation->il.opCodeForCorrespondingLoadOrStore(node->getOpCodeValue()));
if (node->getOpCode().isStoreIndirect())
   {
   node->setNumChildren(1);
   }
else
   {
   node->setNumChildren(0);
   }
Because the check on whether or not the node is an indirect store happens after the awrtbari node is recreated as aloadi, node->getOpCode().isStoreIndirect() is false for aloadi, and the number of children for the awrtbari node ends up as 0. This prevents areSyntacticallyEquivalent() from comparing the first child of the two nodes, so it concludes that different expressions are the same. The fix is to check whether it’s an indirect store using the original node.
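To make the ordering problem concrete, here is a toy model of it. The `Op`, `Node`, `convertBuggy`, `convertFixed` and `syntacticallyEquivalent` names are all invented for this sketch and are much simpler than the real OMR classes. It only shows why checking the opcode after the node has been recreated loses the first child, and why checking the original opcode keeps it.

```cpp
#include <cassert>

// Toy stand-ins for the JIT IL; these are NOT the OMR classes.
enum class Op { awrtbari, aloadi };

struct Node
{
    Op op;
    int numChildren;
    int firstChildLocalIndex; // stands in for the base address child (li=7 vs li=11)
};

inline bool isStoreIndirect(Op op) { return op == Op::awrtbari; }
inline Op correspondingLoad(Op op) { return op == Op::awrtbari ? Op::aloadi : op; }

// Order described in the thread: the opcode is switched first, so the
// indirect store check sees aloadi and the child count becomes 0.
inline void convertBuggy(Node &node)
{
    node.op = correspondingLoad(node.op);
    node.numChildren = isStoreIndirect(node.op) ? 1 : 0;
}

// Fix described in the thread: decide from the original opcode.
inline void convertFixed(Node &node)
{
    const bool wasStoreIndirect = isStoreIndirect(node.op);
    node.op = correspondingLoad(node.op);
    node.numChildren = wasStoreIndirect ? 1 : 0;
}

// Simplified comparison: same opcode, same child count, and only the
// children that are counted get compared.
inline bool syntacticallyEquivalent(const Node &a, const Node &b)
{
    if (a.op != b.op || a.numChildren != b.numChildren)
        return false;
    if (a.numChildren >= 1 && a.firstChildLocalIndex != b.firstChildLocalIndex)
        return false;
    return true;
}
```

With two awrtbari nodes whose first children have li=7 and li=11, the buggy order leaves both with zero children, so they compare as equivalent. The fixed order keeps one child each, and the comparison tells them apart.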
FYI @gpezzini
Very good find @a7ehuo & @vijaysun-omr 👏
Thanks @fjeremic I was pleasantly surprised to see the JIT dump show tracePRE output in addition to traceFull and this was invaluable in this case for detecting what the problem was. How is the decision taken on which optimization to trace in more depth (e.g. tracePRE) for a given crash (I am assuming this is done automatically when the JIT dump compilation is done) ? I asked @a7ehuo to look into this aspect as well but I thought I'd ask since you were following and commented on this issue.
Attn : @klangman @JamesKingdon and @0xdaryl for this type of a crash in PRE since I won't be surprised if there are duplicates from the change that apparently regressed this behavior in Aug 2020. You may want to make a note of it from a service viewpoint.
@vijaysun-omr Do we expect it to always show-up as a crash in the TR_ExceptionCheckMotion::perform(void)?
How is the decision taken on which optimization to trace in more depth (e.g. tracePRE) for a given crash (I am assuming this is done automatically when the JIT dump compilation is done) ?
The jitdump's had this ability for a while; a crash in an opt will trigger tracing for that opt:
https://github.com/eclipse/openj9/blob/5bd152fc4a5a60e63540bdacb439ecc53798a592/runtime/compiler/control/CompilationThread.cpp#L8151-L8160
@fjeremic also has a new PR that cleans this up further (https://github.com/eclipse/openj9/pull/11610)
@klangman it is possible for other symptoms, such as a wrong field value being privatized for example.
The compile time crash seen in this issue (or somewhere around that code) is certainly one of the more likely places to crash though, and so you should probably take special note of that while recognizing that there could be other (even run time) problems.
To give an example of the kind of run time problem that could be caused by this bug:
1) Incorrect commoning of field accesses with different base
load o1.f
...
load o2.f
could be commoned up when o1 != o2
2) Incorrect copy propagation of field accesses with different base
store o1.f = rhs
...
load o2.f
could be wrongly copy propagated, i.e. the load of o2.f could be changed to pick up the rhs value from the earlier store (either via a temp or a register) even in the case when o1 != o2
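The two cases can be illustrated outside the JIT with ordinary C++. This is a hand-written sketch only: the "Miscommoned" and "Mispropagated" functions imitate by hand what the wrong transformation would effectively produce, and all names here are invented.

```cpp
#include <cassert>

struct Obj { int f; };

// What the program means: two independent loads.
inline int sumOfFields(const Obj &o1, const Obj &o2)
{
    return o1.f + o2.f;
}

// Case 1, incorrect commoning: the second load reuses the value loaded from o1.
inline int sumOfFieldsMiscommoned(const Obj &o1, const Obj &o2)
{
    const int loaded = o1.f;
    (void)o2; // o2.f is never read
    return loaded + loaded;
}

// What the program means: store through o1, then load from o2.
inline int storeThenLoad(Obj &o1, const Obj &o2, int rhs)
{
    o1.f = rhs;
    return o2.f;
}

// Case 2, incorrect copy propagation: the load of o2.f is replaced by rhs.
inline int storeThenLoadMispropagated(Obj &o1, const Obj &o2, int rhs)
{
    o1.f = rhs;
    (void)o2; // o2.f is never read
    return rhs;
}
```

Both wrong versions happen to give the right answer when o1 and o2 are the same object, so a problem like this may only show up for some inputs.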