Openj9: OpenJDK java/util/WeakHashMap/GCDuringIteration.java failed to do a "full" gc

Created on 27 Jul 2020  路  25Comments  路  Source: eclipse/openj9

https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.openjdk_ppc64le_linux_Nightly/121
java/util/WeakHashMap/GCDuringIteration.java
ub16-ppcle-3

java.lang.AssertionError: failed to do a "full" gc
    at GCDuringIteration.forceFullGc(GCDuringIteration.java:74)
    at GCDuringIteration.gcAwait(GCDuringIteration.java:81)
    at GCDuringIteration.test(GCDuringIteration.java:241)
    at GCDuringIteration.instanceMain(GCDuringIteration.java:263)
    at GCDuringIteration.main(GCDuringIteration.java:261)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    at java.base/java.lang.Thread.run(Thread.java:836)
java.lang.AssertionError: Some tests failed
    at GCDuringIteration.instanceMain(GCDuringIteration.java:265)
    at GCDuringIteration.main(GCDuringIteration.java:261)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
    at java.base/java.lang.Thread.run(Thread.java:836)
test test excluded test failure

Most helpful comment

I wrote up a description of the problem on AArch64 here -> https://github.com/eclipse/openj9/issues/8897#issuecomment-607897793

I don't recall the exact name of the method, but the problem occurred when that method was compiled with optLevel cold which does not run an aggressive liveness optimization (the reason why this is a problem is described in that link above).

To see if this is the same issue you can try grinding by adding -Xjit:enableAggressiveLiveness to the options.

Any test that expects precise behaviour from the GC to occur at certain (un-speced) points is not really a valid test in my opinion.

All 25 comments

@dmitripivkine fyi

30x grinder using latest Adopt build https://ci.eclipse.org/openj9/job/Grinder/992

@0xdaryl @andrewcraik
As @cedrichansen found out https://github.com/eclipse/openj9/issues/9043 (with reference to https://github.com/eclipse/openj9/issues/8897) is similar to this problem (just use to be failure for ARM).
Would you please advice possible course of actions? Is it a JIT related test?

I don't know anything about the test so I am not sure

I wrote up a description of the problem on AArch64 here -> https://github.com/eclipse/openj9/issues/8897#issuecomment-607897793

I don't recall the exact name of the method, but the problem occurred when that method was compiled with optLevel cold which does not run an aggressive liveness optimization (the reason why this is a problem is described in that link above).

To see if this is the same issue you can try grinding by adding -Xjit:enableAggressiveLiveness to the options.

Any test that expects precise behaviour from the GC to occur at certain (un-speced) points is not really a valid test in my opinion.

@0xdaryl Thank you very much!
I found source for the test https://github.com/ibmruntimes/openj9-openjdk-jdk11/blob/openj9/test/jdk/java/util/WeakHashMap/GCDuringIteration.java

There is a misleading message from assertion. The real problem is:
Test creates Weak Ref and run System GC in a loop expecting reference to be Finalized but it never happen.
I observed this problem earlier in https://github.com/eclipse/openj9/issues/9651 , https://github.com/eclipse/openj9/issues/8872 and a few more cases - object can not be sent to Finalization because it kept alive by O-slot in JIT frame.
This test does not generate system core to examine but this is exactly the same pattern in the test. I believe there is no reason to waste time for extra investigation. This failure is quite intermittent, so how many times we need to iterate with -Xjit:enableAggressiveLiveness to confirm it works?
I am not sure should test be excluded again.

In the AArch64 case, the failure rate was about 1/25. The test didn't take too long to run as I recall so I was able to let it run hundreds of times in a script with -Xjit:enableAggressliveLiveness to demonstrate no failures. Your mileage may vary though.

@cedrichansen Would you please run grinders "as is" to determine failure rate and with provided option?

Yes, setting up grinders now 馃憤馃徎

Ran GCDuringIteration 200x through grinder (with no jvm options provided) and no failures were recorded. After speaking with @dmitripivkine , this may be a slight difference in test environment (grinder tests running only GCDuringIteration vs nightly eclipse builds which run a larger set of tests) which might cause a different level of jit optimization to be used as the tests are run, and this resulted in different fail rate for the test in question.

Another occurrence at https://ci.eclipse.org/openj9/job/Test_openjdk15_j9_sanity.openjdk_s390x_linux_Nightly/51/console

01:55:26  STDERR:
01:55:26  java.lang.AssertionError: failed to do a "full" gc
01:55:26    at GCDuringIteration.forceFullGc(GCDuringIteration.java:74)
01:55:26    at GCDuringIteration.gcAwait(GCDuringIteration.java:81)
01:55:26    at GCDuringIteration.test(GCDuringIteration.java:232)
01:55:26    at GCDuringIteration.instanceMain(GCDuringIteration.java:263)
01:55:26    at GCDuringIteration.main(GCDuringIteration.java:261)
01:55:26    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
01:55:26    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
01:55:26    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
01:55:26    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
01:55:26    at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
01:55:26    at java.base/java.lang.Thread.run(Thread.java:845)
01:55:26  java.lang.AssertionError: Some tests failed
01:55:26    at GCDuringIteration.instanceMain(GCDuringIteration.java:265)
01:55:26    at GCDuringIteration.main(GCDuringIteration.java:261)
01:55:26    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
01:55:26    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
01:55:26    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
01:55:26    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
01:55:26    at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:298)
01:55:26    at java.base/java.lang.Thread.run(Thread.java:845)

If the ultimate issue here is that the test is relying on un-spec'ed behaviour to succeed and that success only happens when the test is run at a "high enough" optimization level then is there any value in continuing to run this arguably invalid test as part of our standard suite of tests?

I haven't been following this issue, but what is the next course of action expected here? It certainly can't be to run the aggressive liveness optimization at lower optimization levels. Perhaps the next action should be to exclude the test so that it doesn't continue to consume triaging resources.

I am reclassifying this failure as a test issue because based on our investigation the JIT is working as designed. As mentioned above, it would seem we should either fix the test case or stop running it.

Created https://github.com/AdoptOpenJDK/openjdk-tests/pull/2083 to exclude the test for jdk11, 15, 16.

Was this page helpful?
0 / 5 - 0 ratings