Openjdk-infrastructure: AIX nodes System zlib not setup ?

Created on 18 Feb 2021  ·  36 Comments  ·  Source: AdoptOpenJDK/openjdk-infrastructure

Not sure this fails every time, but about 50% of the time:
https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-hotspot/854/console

18:20:45  checking for which libpng to use... bundled
18:20:45  checking for compress in -lz... yes
18:20:45  checking for which zlib to use... system
18:20:46  checking for system zlib functionality... configure: error: System zlib not working correctly
18:20:46  not ok
18:20:46  configure exiting with result code 1
18:20:46  Error: No configurations found for /home/jenkins/workspace/build-scripts/jobs/jdk11u/jdk11u-aix-ppc64-hotspot/workspace/build/src.
18:20:46  Please run 'bash configure' to create a configuration.
ansible

All 36 comments

I can't see zlib.h in /usr/include ?

I think the ansible playbooks are missing zlib install ?

zlib is one of those strange things. AIX provides libz.a via rpm.rte, but not zlib.h - and that created regular issues with installing the RPM packaging of libz and libz-devel.

Historically, the RPM packaging of libz was just different enough that it broke things expecting the IBM libz (that was still using the member name libz(shr.o) or libz(libz.so.1), while the RPM had libz(libz.so.2)) - all from memory.

The AIX toolbox packages libz.a better, i.e., it extracts and inserts the archive members in /usr/lib/libz.a into /opt/freeware/lib/libz.a.

Just had a thought: I think rpm.rte provides libz.a by having /usr/lib/libz.a point, as a symbolic link, to /opt/freeware/lib/libz.a. If that is the case, then an update of libz.a (e.g., as part of an update of rpm.rte) may break whatever was there before.

And, lastly: what version of libz is installed? Historically, AIX has been way behind.

Note: the whole yum dependency tree often installs a new rpm.rte - but it might not need to update the RPM libz - and the library is in an unknown state.

Anyway - fixable - but, imho, inherent in having two package managers updating the same file.

This is the situation on build-osuosl-aix71...-{1,2}:

# rpm -qa | grep zlib
zlib-1.2.11-2.ppc
zlib-devel-1.2.11-2.ppc
# find /usr /opt -name zlib.h
/opt/freeware/include/zlib.h

On a system installed later (test-osuosl-aix71-ppc64-1) I see this:

root@p9-aix1-ojdk05:[/root]rpm -qa | grep zlib
zlib-1.2.11-1.ppc
zlib-devel-1.2.11-1.ppc
root@p9-aix1-ojdk05:[/root]find /opt /usr -name zlib.h
/opt/freeware/include/zlib.h
/usr/include/zlib.h
root@p9-aix1-ojdk05:[/root]find /opt /usr -name zlib.h -ls
21833   94 -rw-r--r--  1 root      system       96239 Feb 14  2017 /opt/freeware/include/zlib.h
14203    1 lrwxrwxrwx  1 root      system          33 Sep  4 15:21 /usr/include/zlib.h -> ../../opt/freeware/include/zlib.h

So, I expect that either the current playbooks handle this properly, OR the recent update removed the link (?). Note the later RPM packaging label on the build servers (zlib-1.2.11-2.ppc there versus zlib-1.2.11-1.ppc here); they were, iirc, recently updated using yum update.

I propose to manually add/restore the symbolic links.

OK - additional proposal (for PR) - just looked at test-osuosl-aix72-ppc64-{1,2} and they are also missing the symbolic link.

Needed: an update to the playbooks to ensure there is a symbolic link /usr/include/zlib.h when it does not exist as a file AND /opt/freeware/include/zlib.h exists. Also needed, from experience: the same link for zconf.h. Configure, iirc, only checks for the existence of zlib.h.

find /opt /usr -name zlib.h -ls
21833   94 -rw-r--r--  1 root      system       96239 Feb 14  2017 /opt/freeware/include/zlib.h
14203    1 lrwxrwxrwx  1 root      system          33 Sep  4 15:21 /usr/include/zlib.h -> ../../opt/freeware/include/zlib.h
root@p9-aix1-ojdk05:[/root]find /opt /usr -name zconf.h -ls
21832   16 -rw-r--r--  1 root      system       16262 Feb 14  2017 /opt/freeware/include/zconf.h
14202    1 lrwxrwxrwx  1 root      system          34 Sep  4 15:21 /usr/include/zconf.h -> ../../opt/freeware/include/zconf.h
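
A minimal sketch of the rule described above, written as plain shell rather than as an actual playbook task. The function name `ensure_header_link` is hypothetical, and the default directories are simply the paths seen in this thread. It creates the link only when the toolbox header exists and nothing usable is already at the destination, so it should be safe to run repeatedly.

```shell
# Hypothetical helper (not part of the playbooks): link a toolbox header
# into the system include directory when it is missing there.
# $1 = header name, $2 = source dir, $3 = destination dir.
# The defaults are the paths quoted in this thread.
ensure_header_link() {
    hdr=$1
    src_dir=${2:-/opt/freeware/include}
    dst_dir=${3:-/usr/include}

    # Nothing to link to: leave the system untouched.
    if [ ! -f "$src_dir/$hdr" ]; then
        echo "skip $hdr: $src_dir/$hdr not found"
        return 0
    fi

    # A regular file or a working link is already there: do not overwrite it.
    if [ -e "$dst_dir/$hdr" ]; then
        echo "keep $hdr: $dst_dir/$hdr already present"
        return 0
    fi

    # Missing, or a dangling link: (re)create the symbolic link.
    ln -sf "$src_dir/$hdr" "$dst_dir/$hdr" &&
        echo "link $hdr: $dst_dir/$hdr -> $src_dir/$hdr"
}
```

Run as root, `ensure_header_link zlib.h` followed by `ensure_header_link zconf.h` would cover both headers.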

@aixtools above changes sound good to me, thanks

@andrew-m-leonard Hmmm it is a bit of a fudge - I wonder if we can make the build process pick it up from the expected location instead?

So there is a variance of Bundled vs System across jdk versions, variants and platforms:
ZLib AIX "defaults":

Hotspot:

  • jdk8: Bundled
  • jdk11: System, if not found default to Bundled. If System found does compile check.
  • jdk16: Bundled
  • jdk17: Bundled

OpenJ9:

  • jdk8: System, if not found default to Bundled. If System found no compile check.
  • jdk11: System, if not found default to Bundled. If System found does compile check.
  • jdk16: System, if not found default to Bundled. If System found does compile check.
  • jdk17: System, if not found default to Bundled. If System found does compile check.

Build Nodes:
test-ibm-aix71-ppc64-1 : Failure
build-osuosl-aix71-ppc64-1 : Success
build-osuosl-aix71-ppc64-2 : Success

So only test-ibm-aix71-ppc64-1 needs fixing
@aixtools
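
For scripting against the matrix above, here is a small lookup sketch. The function name `zlib_default` is hypothetical; the values are only the AIX defaults listed in this comment, with the "fall back to Bundled if not found" behaviour left out.

```shell
# Hypothetical lookup of the AIX zlib defaults listed in this comment.
# $1 = variant (hotspot|openj9), $2 = version (jdk8|jdk11|jdk16|jdk17).
# Prints "system" or "bundled"; prints "unknown" and fails otherwise.
zlib_default() {
    case "$1:$2" in
        hotspot:jdk11)
            echo system ;;
        hotspot:jdk8|hotspot:jdk16|hotspot:jdk17)
            echo bundled ;;
        openj9:jdk8|openj9:jdk11|openj9:jdk16|openj9:jdk17)
            echo system ;;
        *)
            echo unknown
            return 1 ;;
    esac
}
```

For example, `zlib_default hotspot jdk11` prints `system`, which matches the failing jdk11u hotspot job in the original report.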

Re the question of making the build process pick zlib up from the expected location: the OpenJDK makefiles do not have a way of specifying the system zlib location.

@Haroon-Khel can we try running the latest playbooks on test-ibm-aix71-ppc64-1 ?

I'll give it a go

@andrew-m-leonard The playbook ran fine on the machine

Running a jdk 11 openj9 job on test-ibm-aix71-ppc64-1
https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-openj9/918/console

@Haroon-Khel thanks, looks like it didn't resolve it though. I guess we need to fix up the paths manually

Made the zlib symlink as above
lrwxrwxrwx 1 root system 28 Feb 22 08:16 /usr/include/zlib.h -> /opt/freeware/include/zlib.h
Rerunning the job
https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-openj9/919/

Job still failed

13:27:01  checking for which zlib to use... system
13:27:02  checking for system zlib functionality... configure: error: System zlib not working correctly

I just saw above that @aixtools suggests making the zconf.h link as well, so I've now done that.

p159a01:/ # ls -la /usr/include/zconf.h
lrwxrwxrwx 1 root system 29 Feb 22 08:43 /usr/include/zconf.h -> /opt/freeware/include/zconf.h
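
A likely reason the zlib.h link alone was not enough: zlib.h itself includes zconf.h, so anything that compiles against zlib.h needs both headers visible. Below is a rough stand-in for that kind of compile check. It is not the actual configure test; the function name `zlib_header_probe`, the default compiler `cc` and the scratch directory name are all assumptions.

```shell
# Hypothetical probe, NOT the real configure check: compile (no link) a
# tiny file that includes zlib.h. Because zlib.h includes zconf.h, this
# fails if either header is missing from the include path.
# $1 = include dir (assumed default /usr/include); compiler from $CC, else cc.
zlib_header_probe() {
    inc_dir=${1:-/usr/include}
    work=${TMPDIR:-/tmp}/zlib_probe.$$
    mkdir "$work" || return 2

    printf '#include <zlib.h>\nint main(void) { return zlibVersion() == 0; }\n' \
        > "$work/probe.c"

    if ${CC:-cc} -I"$inc_dir" -c "$work/probe.c" -o "$work/probe.o" >/dev/null 2>&1; then
        echo "ok"
        rc=0
    else
        echo "not ok"
        rc=1
    fi

    rm -rf "$work"
    return $rc
}
```

Running `zlib_header_probe` before and after adding the zconf.h link would be one way to confirm the headers without waiting for a full Jenkins job.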

Rerunning the jdk11 openj9 job on test-ibm-aix71-ppc64-1
https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-openj9/920/

That fixed the zlib issue

13:53:29  checking for which zlib to use... system
13:53:29  checking for system zlib functionality... ok

But now cmake isn't loading

13:56:15  Could not load program cmake:
13:56:15  Symbol resolution failed for cmake because:
13:56:15    Symbol _ZTINSt6thread6_StateE (number 358) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZTVNSt6thread6_StateE (number 359) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt6thread4joinEv (number 379) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt18condition_variableD1Ev (number 399) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt18condition_variableC1Ev (number 400) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt18condition_variable4waitERSt11unique_lockISt5mutexE (number 443) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt18condition_variable10notify_oneEv (number 444) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt18condition_variable10notify_allEv (number 445) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt6thread6_StateD2Ev (number 457) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15    Symbol _ZNSt6thread15_M_start_threadESt10unique_ptrINS_6_StateESt14default_deleteIS1_EEPFvvE (number 458) is not exported from dependent
13:56:15      module /opt/freeware/lib/libstdc++.a[libstdc++.so.6].
13:56:15  Examine .loader section symbols with the 'dump -Tv' command.

@Haroon-Khel I raised the cmake issue with openj9, maybe a different problem: https://github.com/eclipse/openj9/issues/12018

cmake --version appears to run ok on both of the IBM systems. Symbol resolution failures in the tools on our machine are unlikely to be something that OpenJ9 will fix.

While I'd be happier with somehow convincing the build system to pick it up from elsewhere I'm happy with the symlink for the two zlib headers being put in place via the playbooks

p159a01:/ # ldd `which cmake`
/opt/freeware/bin/cmake needs:
         /usr/lib/libc.a(shr_64.o)
         /usr/lib/libpthreads.a(shr_xpg5_64.o)
         /opt/freeware/lib/pthread/ppc64/libstdc++.a(libstdc++.so.6)

I've temporarily made /opt/freeware/lib/libstdc++.a point to /opt/freeware/lib/pthread/ppc64/libstdc++.a to see if this fixes the problem (and to see what else breaks!)
Rerunning at https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-aix-ppc64-openj9/922/console
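
Worth noting: the ldd output names /opt/freeware/lib/pthread/ppc64/libstdc++.a, while the loader error in the job names /opt/freeware/lib/libstdc++.a. A small sketch for pulling the libstdc++ archive path out of ldd output so the two can be compared in a script. The function name `libstdcxx_from_ldd` is hypothetical, and it assumes the AIX ldd line format shown above, `path(member)`.

```shell
# Hypothetical filter: read `ldd` output on stdin and print the path of
# any libstdc++.a archive it lists, without the "(member)" suffix.
# Assumes lines of the form "   /some/dir/libstdc++.a(libstdc++.so.6)".
libstdcxx_from_ldd() {
    sed -n 's|^[[:space:]]*\(/[^ (]*libstdc++\.a\)(.*$|\1|p'
}
```

Usage would be something like ldd "$(which cmake)" | libstdcxx_from_ldd, compared against the archive named in the "not exported from dependent module" messages.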

Can this issue be closed since https://github.com/AdoptOpenJDK/openjdk-infrastructure/pull/1964 was merged? Or do we want to keep this open to investigate how to make the build scripts look for zlib in the appropriate place?

This issue again: https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk16u/job/jdk16u-aix-ppc64-openj9/4/console

23:35:55  checking for which zlib to use... system
23:35:55  checking for system zlib functionality... not ok
23:35:55  configure: error: System zlib not working correctly
23:35:55  configure exiting with result code 1

Node: test-ibm-aix71-ppc64-2

Looking...

The links
/usr/include/zlib.h
/usr/include/zconf.h

are not set up on the machine. I'm fixing that now

Also cmake is v3.16.0 and not 3.14.3

Was just going to mention the links are missing - but @Haroon-Khel beats me to it!

ibm...2:

p159a02:/ # find /opt /usr -name zlib.h -ls
26370   94 -rw-r--r--  1 root      system       96239 Dec 17 10:29 /opt/freeware/include/zlib.h
p159a02:/ # find /opt /usr -name zconf.h -ls
26395   16 -rw-r--r--  1 root      system       16262 Dec 17 10:29 /opt/freeware/include/zconf.h
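
To make the same check repeatable across nodes, a read-only sketch that reports whether a given path is a symlink (and to what), a regular file, or missing. The function name `header_state` is hypothetical. It parses `ls -l` instead of relying on a `readlink` command being installed.

```shell
# Hypothetical read-only check: describe what exists at a path.
# Prints one of:
#   <path>: symlink -> <target>
#   <path>: regular file
#   <path>: missing
header_state() {
    path=$1
    if [ -L "$path" ]; then
        # Take everything after " -> " in the long listing as the target.
        target=$(ls -l "$path" | sed 's/^.* -> //')
        echo "$path: symlink -> $target"
    elif [ -f "$path" ]; then
        echo "$path: regular file"
    else
        echo "$path: missing"
    fi
}
```

On a node in the state shown above, `header_state /usr/include/zlib.h` and `header_state /usr/include/zconf.h` would both be expected to report missing.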

The job hit an expected cmake error. I've downgraded cmake on the machine to 3.14.3. Rerunning the build job
https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk16u/job/jdk16u-aix-ppc64-openj9/6/console
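
Since the cmake level differed between nodes, a quick sketch for checking it in a script. The function name `cmake_version_matches` is hypothetical. It relies only on the first line of `cmake --version` having the form "cmake version X.Y.Z"; the expected version is passed in, with 3.14.3 being the value mentioned in this thread.

```shell
# Hypothetical check: read `cmake --version` output on stdin and compare
# the reported version with the expected one given as $1.
# Returns 0 on a match, 1 otherwise.
cmake_version_matches() {
    want=$1
    have=$(sed -n '1s/^cmake version \([0-9][0-9.]*\).*$/\1/p')
    if [ "$have" = "$want" ]; then
        echo "cmake $have: ok"
    else
        echo "cmake ${have:-unknown}: expected $want"
        return 1
    fi
}
```

Usage would be `cmake --version | cmake_version_matches 3.14.3`.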

The job passed. Hopefully this shouldn't be a problem on this machine anymore

Looks like we need to rinse and repeat the current status of the playbooks on our 'ansible-test' node (ojdk06) using different OS levels - and when happy with that - make a backup of ojdk01 (e.g., clone) and run the playbooks on that node - and see if it cleans, or makes a mess.

My simple soul says it should be safe to reapply the playbooks at any time. But simple souls sometimes face surprises! :wink:

@Haroon-Khel If this is resolved can you self-assign and close please

Closing as the problem is resolved
