Openjdk-infrastructure: build-equinix-ubuntu2004-armv8-1 intermittently fails to build openj9

Created on 12 Apr 2021 · 21 Comments · Source: AdoptOpenJDK/openjdk-infrastructure

https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk16u/job/jdk16u-linux-aarch64-openj9/17/

05:38:00  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/build-scripts/jobs/jdk16u/jdk16u-linux-aarch64-openj9/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.desktop/_the.java.desktop_batch' failed
05:38:00  gmake[3]: *** [/home/jenkins/workspace/build-scripts/jobs/jdk16u/jdk16u-linux-aarch64-openj9/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.desktop/_the.java.desktop_batch] Error 1
05:38:00  make/Main.gmk:197: recipe for target 'java.desktop-java' failed
05:38:00  gmake[2]: *** [java.desktop-java] Error 1

No clear cause, but it fails only on this node, and not every time.

ansible openj9

All 21 comments

I'll take this one since it's sufficiently weird that I'm intrigued by it ;-)

Failed in the final run out of the first five attempts: https://ci.adoptopenjdk.net/job/SXA-JDK16J9-aarch64-equinix/
Now attempting a further five in parallel (#6-#10)

This is odd given that it's within docker containers. If I can't identify an obvious reason I'll look at switching off this machine until we can resolve it at some point tomorrow in order to avoid any risk to next week's GAs.

Hmmm one of my builds has come up with this:

17:38:08  ccache: error: /home/jenkins/.ccache/ccache.conf: No such file or directory

I wonder if it's hitting a race condition (entirely possible on a machine with 160 cores).
I'll attempt to build with ccache disabled (it's entirely pointless in the dynamic docker containers anyway and may even slow things down).

Nope - still happening with ccache disabled

And also still a problem with the build reduced to use 16 cores. ...

Have removed the dockerBuild label from the machine for now. Very odd problem though ...

Current testing:

  • Does it fail with --with-jobs=1? So far, no: 84 84 85 86 87 88 95
  • Does it fail running on a CentOS8 host system (i.e. not docker)? Only once; the rest passed: 83 89 90 91 92 93 94 96 97 98 99 100
  • Does it fail outside jenkins in a docker container?
  • Does it fail in a docker container with the file system mapped to the host? (Yes - First 3 out of 10 runs)
  • Does it fail if a hotspot boot JDK15 is used instead of OpenJ9?

Also need to try:

  • Reboot to enable the updated Linux kernel (currently 5.4.0-40-generic, latest installed is 5.4.0-72.80) - 4/8 failed between 121 and 128
  • Upgrade docker (was 19.03.8-0ubuntu1.20.04.2, now 20.10.6~3-0~ubuntu-focal):
apt-get remove docker docker.io containerd runc
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo   "deb [arch=arm64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install docker-ce docker-ce-cli containerd.io

OK - almost all the failures are at this point:

18:48:24  Updating support/src.zip
18:48:45  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@6/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.desktop/_the.java.desktop_batch' failed
18:48:45  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@6/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.desktop/_the.java.desktop_batch] Error 1
18:48:45  make/Main.gmk:197: recipe for target 'java.desktop-java' failed
18:48:45  gmake[2]: *** [java.desktop-java] Error 1
18:48:45  gmake[2]: *** Waiting for unfinished jobs....

The one failure I've seen on the CentOS host systems was at this point in run 106:

18:22:47  Compiling 5507 files for openj9.dtfj
18:22:49  Creating jdk/modules/jdk.jpackage/jdk/jpackage/internal/resources/jpackageapplauncher from 15 file(s)
18:23:08  gmake[3]: *** [CompileJavaModules.gmk:605: /home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@4/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/openj9.dtfj/_the.openj9.dtfj_batch] Error 1
18:23:08  gmake[2]: *** [make/Main.gmk:197: openj9.dtfj-java] Error 2

115 failed in a container on the CentOS8 source, but that was a different problem as it didn't get as far as running the build.

130 (container on Ubuntu 20.04 host):

12:32:26  Compiling 15 files for java.prefs
12:32:26  Compiling 77 files for java.sql
12:32:26  Compiling 94 files for jdk.xml.dom
12:32:26  Compiling 275 files for java.xml.crypto
12:32:26  Compiling 225 files for jdk.javadoc
12:32:32  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.jdeps/_the.jdk.jdeps_batch' failed
12:32:32  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.jdeps/_the.jdk.jdeps_batch] Error 1
12:32:32  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.xml.dom/_the.jdk.xml.dom_batch' failed
12:32:32  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.xml.dom/_the.jdk.xml.dom_batch] Error 1
12:32:32  make/Main.gmk:197: recipe for target 'jdk.jdeps-java' failed
12:32:32  gmake[2]: *** [jdk.jdeps-java] Error 1
12:32:32  gmake[2]: *** Waiting for unfinished jobs....
12:32:32  make/Main.gmk:197: recipe for target 'jdk.xml.dom-java' failed
12:32:32  gmake[2]: *** [jdk.xml.dom-java] Error 1
12:32:32  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.xml.crypto/_the.java.xml.crypto_batch] Error 1

131:

12:39:45  Compiling 136 files for jdk.jdeps
12:39:48  Compiling 84 files for jdk.jlink
12:39:48  Compiling 15 files for java.prefs
12:39:48  Compiling 77 files for java.sql
12:39:48  Compiling 94 files for jdk.xml.dom
12:39:48  Compiling 275 files for java.xml.crypto
12:39:48  Compiling 225 files for jdk.javadoc
12:39:51  Compiling 95 files for jdk.jshell
12:39:51  Compiling 56 files for java.sql.rowset
12:39:59  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.javadoc/_the.jdk.javadoc_batch' failed
12:39:59  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.javadoc/_the.jdk.javadoc_batch] Error 1
12:39:59  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.jshell/_the.jdk.jshell_batch' failed
12:39:59  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/jdk.jshell/_the.jdk.jshell_batch] Error 1
12:39:59  make/Main.gmk:197: recipe for target 'jdk.javadoc-java' failed
12:39:59  gmake[2]: *** [jdk.javadoc-java] Error 1
12:39:59  gmake[2]: *** Waiting for unfinished jobs....
12:39:59  make/Main.gmk:197: recipe for target 'jdk.jshell-java' failed
12:39:59  gmake[2]: *** [jdk.jshell-java] Error 1
12:39:59  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.sql.rowset/_the.java.sql.rowset_batch' failed
12:39:59  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.sql.rowset/_the.java.sql.rowset_batch] Error 1
12:39:59  make/Main.gmk:197: recipe for target 'java.sql.rowset-java' failed
12:39:59  gmake[2]: *** [java.sql.rowset-java] Error 1
12:40:15  Blob written to file: /home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/vm/runtime/j9ddr.dat
12:40:15  Superset written to file: /home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@5/workspace/build/src/build/linux-aarch64-server-release/vm/superset.dat
12:40:15  [100%] Built target j9ddr

Job 143 (Using --with-jobs=2)

13:48:01  Compiling 8 files for jdk.unsupported.desktop
13:48:02  Creating support/modules_cmds/openj9.traceformat/traceformat from 1 file(s)
13:48:04  Compiling 5505 files for openj9.dtfj
13:48:17  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@6/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/openj9.dtfj/_the.openj9.dtfj_batch' failed
13:48:17  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@6/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/openj9.dtfj/_the.openj9.dtfj_batch] Error 1
13:48:17  gmake[2]: *** [openj9.dtfj-java] Error 1
13:48:17  make/Main.gmk:197: recipe for target 'openj9.dtfj-java' failed
13:48:17  gmake[2]: *** Waiting for unfinished jobs....
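
Since the failing module changes from run to run (java.desktop, openj9.dtfj, jdk.jdeps, jdk.javadoc, ...), it can help to tally the failed targets across console logs rather than eyeball them. The following is only a rough sketch, not something that was run as part of this investigation: `failed_modules` is a hypothetical helper name, and the regex assumes the two `gmake[N]: *** [...-java] Error N` line shapes that appear in the logs above.

```python
import re
from collections import Counter

# Matches only the gmake "*** [<module>-java] Error N" lines, in both forms
# seen in this thread:
#   gmake[2]: *** [java.desktop-java] Error 1
#   gmake[2]: *** [make/Main.gmk:197: openj9.dtfj-java] Error 2
# The "recipe for target ..." and "_the.<module>_batch" lines are ignored so
# that each failed module is counted once per run.
FAILED_TARGET = re.compile(
    r"gmake\[\d+\]: \*\*\* \[(?:[^\]]*:\s*)?([\w.]+)-java\] Error \d+"
)


def failed_modules(console_log: str) -> Counter:
    """Count how often each module's '-java' target failed in a console log."""
    counts = Counter()
    for line in console_log.splitlines():
        match = FAILED_TARGET.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example with lines copied from runs 130 and 106 above.
sample = """\
12:32:32  gmake[2]: *** [jdk.jdeps-java] Error 1
12:32:32  gmake[2]: *** [jdk.xml.dom-java] Error 1
18:23:08  gmake[2]: *** [make/Main.gmk:197: openj9.dtfj-java] Error 2
"""
print(failed_modules(sample))
```

Feeding it the concatenated console text of several runs gives a per-module failure count, which makes it easier to see whether the failures cluster on one module or are spread across whatever javac happens to be compiling at the time.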

This one has shown up a small number of times too:

Compiling 6 properties into resource bundles for java.base
Creating javadoc element list
jvmtiGen error: java.lang.NullPointerException: Cannot load from object array because "this.m_extendedTypes" is nulljavax.xml.transform.TransformerException: java.lang.NullPointerException: Cannot load from object array because "this.m_extendedTypes" is null
    at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:797)
    at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:383)
    at jvmtiGen.main(jvmtiGen.java:137)
Caused by: java.lang.NullPointerException: Cannot load from object array because "this.m_extendedTypes" is null
    at java.xml/com.sun.org.apache.xml.internal.dtm.ref.sax2dtm.SAX2DTM2._type2(SAX2DTM2.java:1942)
    at java.xml/com.sun.org.apache.xml.internal.dtm.ref.sax2dtm.SAX2DTM2.getStringValueX(SAX2DTM2.java:2832)
    at java.xml/com.sun.org.apache.xml.internal.dtm.ref.sax2dtm.SAX2DTM2.getStringValue(SAX2DTM2.java:2930)
    at java.xml/com.sun.org.apache.xalan.internal.xsltc.dom.AdaptiveResultTreeImpl.getStringValue(AdaptiveResultTreeImpl.java:133)
    at jdk.translet/die.verwandlung.jvmti.template$dot$152()
    at jdk.translet/die.verwandlung.jvmti.applyTemplates18()
    at jdk.translet/die.verwandlung.jvmti.template$dot$107()
    at jdk.translet/die.verwandlung.jvmti.applyTemplates15()
    at jdk.translet/die.verwandlung.jvmti.template$dot$106()
    at jdk.translet/die.verwandlung.jvmti.applyTemplates15()
    at jdk.translet/die.verwandlung.jvmti.template$dot$102()
    at jdk.translet/die.verwandlung.jvmti.applyTemplates()
    at jdk.translet/die.verwandlung.jvmti.template$dot$100()
    at jdk.translet/die.verwandlung.jvmti.applyTemplates()
    at jdk.translet/die.verwandlung.jvmti.applyTemplates()
    at jdk.translet/die.verwandlung.jvmti.transform()
    at java.xml/com.sun.org.apache.xalan.internal.xsltc.runtime.AbstractTranslet.transform(AbstractTranslet.java:624)
    at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:790)
    ... 2 more

https://ci.adoptopenjdk.net/job/SXA-JDK16J9-aarch64-equinix/150/consoleFull failed with the build set to only use a single core (First time I've seen that):

15:39:07  Compiling 56 files for java.sql.rowset
15:39:07  Compiling 275 files for java.xml.crypto
15:39:10  Compiling 1 files for java.se
15:39:11  Compiling 22 files for java.smartcardio
15:39:13  CompileJavaModules.gmk:604: recipe for target '/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@6/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.smartcardio/_the.java.smartcardio_batch' failed
15:39:13  gmake[3]: *** [/home/jenkins/workspace/SXA-JDK16J9-aarch64-equinix@6/workspace/build/src/build/linux-aarch64-server-release/jdk/modules/java.smartcardio/_the.java.smartcardio_batch] Error 1
15:39:13  make/Main.gmk:197: recipe for target 'java.smartcardio-java' failed
15:39:13  gmake[2]: *** [java.smartcardio-java] Error 1

Hmmm, I've just done 20 builds on this machine using a HotSpot boot JDK instead of OpenJ9 15.0.2-ea and have had zero failures. Given that this is an ea and the issue appears to be when processing Java code, I'm inclined to say that whatever we're seeing is specific to the OpenJ9 boot JDK at the moment.
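
To put a rough number on how convincing 20 clean runs are, here is a back-of-the-envelope sketch. It assumes builds fail independently at a fixed rate, which may not hold for a race condition, and it reads the earlier notes as 3 failures in 10 runs (container with host-mapped file system) and 4 in 8 (runs 121 to 128). `prob_all_pass` is just an illustrative helper name.

```python
def prob_all_pass(failure_rate: float, runs: int) -> float:
    """Probability that `runs` independent builds all pass at a fixed per-build failure rate."""
    return (1.0 - failure_rate) ** runs


# Failure rates read (as an assumption) from earlier comments in this thread.
observed_rates = [("3/10", 3 / 10), ("4/8", 4 / 8)]

for label, rate in observed_rates:
    print(
        f"failure rate {label}: "
        f"P(20 clean runs) = {prob_all_pass(rate, 20):.2e}, "
        f"P(7 clean runs) = {prob_all_pass(rate, 7):.3f}"
    )
```

Under those assumptions, 20 consecutive passes would happen by chance well under 0.1% of the time at either rate, so the HotSpot result is strong evidence. Seven consecutive passes at a 30% failure rate would still happen about 8% of the time, which fits with the single-core configuration looking clean at first and then failing in run 150.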

Tagging @pshipton @0xdaryl for any input - this is happening on a 160-core Ampere Altra (arm system), but as per previous messages it has been seen even with --with-jobs=1.

Have you tried a new boot JVM? You can use OpenJ9 16 to build 16. Given that aarch64 is ea, you can expect a number of problems to have been fixed in the newer release.

Have been testing 16 for the last half hour - I got a failure in each of 10 attempts, so it seems to be worse than J9/JDK15 so far.

Which 16 are you trying, the last release or the latest nightly build? Might as well try the latest.

One thought I had was to pull the latest version of Docker and see if the problem persists. I agree that 160 cores gives you lots of room for race conditions, especially within a containerized environment.

Which 16 are you trying, the last release or the latest nightly build? Might as well try the latest.

It was last release, but I'm conscious of how much time it's taking at the moment for me to run the variations I already have

One thought I had was to pull the latest version of Docker and see if the problem persists. I agree that 160 cores gives you lots of room for race conditions, especially within a containerized environment.

Yeah, I've already eliminated that (since it was something you suggested to me, I added it to the list of things to try). I'm on 20.10.6~3-0~ubuntu-focal from the standalone docker repos and it hasn't made a difference. I'll be honest, I think I've also eliminated core count as a cause, since I got the same failure when forcing the build to use only one core.

On some of the runs I've had problems with the jvmti compilations which seem similar to those seen occasionally on AIX (Ref: https://github.com/eclipse/openj9/issues/12061)
e.g.

Creating javadoc element list
In file included from ./src/hotspot/share/oops/arrayKlass.cpp:31:0:
./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmti.h:1589:5: error: expected identifier before ')' token
     );
     ^

and

Creating javadoc element list
./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnterTrace.cpp: In function 'jvmtiError jvmtiTrace_RedefineClasses(jvmtiEnv*, jint, const jvmtiClassDefinition*)':
./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnterTrace.cpp:9051:36: error: no matching function for call to 'JvmtiEnv::RedefineClasses()'
   err = jvmti_env->RedefineClasses();
                                    ^
In file included from ./src/hotspot/share/prims/jvmtiEnter.inline.hpp:29:0,
                 from ./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnterTrace.cpp:33:
./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnv.hpp:160:16: note: candidate: jvmtiError JvmtiEnv::RedefineClasses(jint, const jvmtiClassDefinition*)
     jvmtiError RedefineClasses(jint class_count, const jvmtiClassDefinition* class_definitions);
                ^~~~~~~~~~~~~~~
./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnv.hpp:160:16: note:   candidate expects 2 arguments, 0 provided
./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnterTrace.cpp: In function 'jvmtiError jvmtiTrace_GetObjectSize(jvmtiEnv*, jobject, jlong*)':
./build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles/jvmtiEnterTrace.cpp:9125:34: error: no matching function for call to 'JvmtiEnv::GetObjectSize()'
   err = jvmti_env->GetObjectSize();
                                  ^

Looks like it might be ok with the latest JDK16 nightly - I've been able to run it ten times without a failure on this one:

BOOT JDK: openjdk version "16" 2021-03-16
BOOT JDK: OpenJDK Runtime Environment AdoptOpenJDK-16+36-202104192334 (build 16+36-202104192334)
BOOT JDK: Eclipse OpenJ9 VM AdoptOpenJDK-16+36-202104192334 (build master-21140c9f9, JRE 16 Linux aarch64-64-Bit Compressed References 20210419_22 (JIT enabled, AOT enabled)
BOOT JDK: OpenJ9   - 21140c9f9
BOOT JDK: OMR      - 840473a5d
BOOT JDK: JCL      - 073accb86d based on jdk-16+36)

The failures mentioned in https://github.com/AdoptOpenJDK/openjdk-infrastructure/issues/2119#issuecomment-823216467 are compiling C/C++ code, not sure why this would have anything to do with the boot JDK.

The failures mentioned in #2119 (comment) are compiling C/C++ code, not sure why this would have anything to do with the boot JDK.

@pshipton it's not the C compilation itself. The jvmtiGen.java "build tool" that generates the C source is run using the boot JDK, and for whatever reason it generates bad C source with OpenJ9 but not with HotSpot. That generated source is what then fails to compile.
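
One way to check that the generated source, rather than the compiler, is what differs would be to compare the gensrc/jvmtifiles directory from a passing workspace against the one from a failing workspace. This is a minimal sketch of that idea and was not part of the testing above; `digest_tree`, `changed_files` and the example paths are all hypothetical.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict:
    """Map each file's path (relative to `root`) to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(good: Path, bad: Path) -> list:
    """List files that exist in only one tree or whose contents differ."""
    good_digests = digest_tree(good)
    bad_digests = digest_tree(bad)
    all_names = good_digests.keys() | bad_digests.keys()
    return sorted(
        name for name in all_names if good_digests.get(name) != bad_digests.get(name)
    )


# Hypothetical usage, with placeholder workspace locations:
# changed_files(
#     Path("passing/build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles"),
#     Path("failing/build/linux-aarch64-server-release/hotspot/variant-server/gensrc/jvmtifiles"),
# )
```

If files such as jvmti.h or jvmtiEnterTrace.cpp show up in the result for two builds of the same source level, that points at the generator run on the boot JDK rather than at gcc.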

Latest GA of 16 also seems to fail: https://ci.adoptopenjdk.net/job/SXA-JDK16J9-aarch64-equinix/180/consoleFull

18:51:25  BOOT JDK: openjdk version "16.0.1-ea" 2021-04-20
18:51:25  BOOT JDK: OpenJDK Runtime Environment AdoptOpenJDK-16.0.1+9 (build 16.0.1-ea+9)
18:51:25  BOOT JDK: Eclipse OpenJ9 VM AdoptOpenJDK-16.0.1+9 (build openj9-0.26.0, JRE 16 Linux aarch64-64-Bit Compressed References 20210421_23 (JIT enabled, AOT enabled)
18:51:25  BOOT JDK: OpenJ9   - b4cc246d9
18:51:25  BOOT JDK: OMR      - 162e6f729
18:51:25  BOOT JDK: JCL      - cea22090ecf based on jdk-16.0.1+9)