openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
Eclipse OpenJ9 VM (build openj9-0.14.2, JRE 1.8.0 Windows 10 amd64-64-Bit Compressed References 20190521_368 (JIT enabled, AOT enabled)
OpenJ9 - 4b1df46fe
OMR - b56045d2
JCL - a8c217d402 based on jdk8u212-b04)
I'm experimenting with both AdoptOpenJDK and openJ9 but I'm having weirdly poor performance with our Spring Boot based application in comparison to OpenJDK and HotSpot.
The exact version of AdoptOpenJDK is:
C:\Program Files\Java>"jdk-12.0.1+12\bin\java.exe" -version
openjdk version "12.0.1" 2019-04-16
OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.1+12)
Eclipse OpenJ9 VM AdoptOpenJDK (build openj9-0.14.1, JRE 12 Windows 10 amd64-64-Bit Compressed References 20190418_57 (JIT enabled, AOT enabled)
OpenJ9 - 412258978
OMR - 4a4278e6
JCL - 9fad1c64af based on jdk-12.0.1+12)
And the version of OpenJDK is:
C:\Program Files\Java>jdk-12.0.1\bin\java -version
openjdk version "12.0.1" 2019-04-16
OpenJDK Runtime Environment (build 12.0.1+12)
OpenJDK 64-Bit Server VM (build 12.0.1+12, mixed mode, sharing)
I know the following results are not real benchmarks or anything but just to give an idea of the magnitude how much slower the openJ9 is in comparison.
With HotSpot:
20:42:12.197 [main] INFO c.y.p.application.Application - Starting Application on ...
20:42:20.064 [main] INFO org.reflections.Reflections - Reflections took 92 ms to scan 1 urls, producing 1 keys and 1 values
20:42:25.918 [main] INFO c.y.p.application.Application - Started Application in 14.462 seconds (JVM running for 16.663)
And then with openJ9:
20:37:33.995 [main] INFO c.y.p.application.Application - Starting Application on ...
20:39:01.341 [main] INFO org.reflections.Reflections - Reflections took 3333 ms to scan 1 urls, producing 1 keys and 1 values
20:40:05.520 [main] INFO c.y.p.application.Application - Started Application in 161.122 seconds (JVM running for 168.174)
I am launching the openJ9 version with flags -Xms256m -Xmx1G -Xshareclasses -Xquickstart.
Also, partial output of systeminfo cmd:
$ systeminfo
Host Name: xxx
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.17763 N/A Build 17763
OS Manufacturer: Microsoft Corporation
OS Configuration: Standalone Workstation
OS Build Type: Multiprocessor Free
Registered Owner: Jani
Registered Organization: Microsoft
Product ID: 00330-80000-00000-AA476
Original Install Date: 21.2.2019, 16.44.02
System Boot Time: 28.5.2019, 21.27.11
System Manufacturer: Dell Inc.
System Model: XPS 15 9550
System Type: x64-based PC
Processor(s): 1 Processor(s) Installed.
[01]: Intel64 Family 6 Model 94 Stepping 3 GenuineIntel ~2601 Mhz
BIOS Version: Dell Inc. 1.3.0, 11.8.2017
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\system32
Boot Device: \Device\HarddiskVolume1
System Locale: en-us;English (United States)
Input Locale: fi;Finnish
Time Zone: (UTC+02:00) Helsinki, Kyiv, Riga, Sofia, Tallinn, Vilnius
Total Physical Memory: 32聽630 MB
Available Physical Memory: 23聽378 MB
Virtual Memory: Max Size: 44聽553 MB
Virtual Memory: Available: 34聽241 MB
Virtual Memory: In Use: 10聽312 MB
Page File Location(s): C:\pagefile.sys
Any tips or pointers what could be wrong with my setup here? Thanks!
Some followup discussion can be found from openj9 slack #newcomers: https://openj9.slack.com/archives/C862YFGL9/p1559066270001000
Also testing the performance with JDK8:
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
Eclipse OpenJ9 VM (build openj9-0.14.2, JRE 1.8.0 Windows 10 amd64-64-Bit Compressed References 20190521_368 (JIT enabled, AOT enabled)
OpenJ9 - 4b1df46fe
OMR - b56045d2
JCL - a8c217d402 based on jdk8u212-b04)
And the startup time with args -Xms256m -Xmx1G -Xshareclasses:
21:42:37.876 [main] INFO c.y.p.application.Application - Started Application in 83.586 seconds (JVM running for 86.686)
Then another try with JDK8 and args -Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog:
21:45:31.391 [main] INFO c.y.p.application.Application - Started Application in 35.898 seconds (JVM running for 38.25)
Here is the vlog file (changed the file type to .txt to make GitHub happy)
vlog.20190528.220511.15780.txt
$ perl avgCompTime.pl vlog.20190528.220511.15780.txt
+ (warm) org/opensaml/core/xml/config/XMLConfigurator.initializeObjectProviders(Lorg/w3c/dom/Element;)V @ 0x00000000579A92C0-0x00000000579B40D8 OrdinaryMethod - Q_SZ=23 Q_SZI=23 QW=48 j9m=0000000003F80978 bcsz=233 OSR time=1533159us mem=[region=65216 system=65536]KB compThread=1 CpuLoad=178%(22%avg) JvmCpu=163%
vlog.20190528.220511.15780.txt
Samples MAX(ms) TOTAL(ms) AVG(usec) MIN(usec)
Total 2771 1533.2 27939.7 10083 308
cold 2046 136.2 12358.8 6040 308
warm 724 1533.2 15545.3 21471 771
hot 1 35.6 35.6 35614 35614
MAXLINE=+ (warm) org/opensaml/core/xml/config/XMLConfigurator.initializeObjectProviders(Lorg/w3c/dom/Element;)V @ 0x00000000579A92C0-0x00000000579B40D8 OrdinaryMethod - Q_SZ=23 Q_SZI=23 QW=48 j9m=0000000003F80978 bcsz=233 OSR time=1533159us mem=[region=65216 system=65536]KB compThread=1 CpuLoad=178%(22%avg) JvmCpu=163%
numCGR recompilations = 0
numgcr recompilations = 0
pushy recompilations = 0
coldupgr recompilations = 8 (AOT=0, JIT=8
forcedAOTupgrades = 0
numAOTLoadsNotRecompiled= 0
inv recompilations = 0
sync compilations = 0
DLT compilations = 0
numFailures = 202
max_Q_SZ = 56
lastTimestamp = 199922
total compilation time = 27939.71 ms
Doesn't look like any AOT compilations happened :S. From the vlog:
#INFO: Shared Class Cache Information:
#INFO: SCCname:sharedcc_Jani Simomaa SCCpath:C:\Users\Jani\AppData\Local\javasharedresources\C290M4F1A64P_sharedcc_Jani_G37
#INFO: SCC_stats_bytes: size=314572192 free=19665902 ROMClass=40628752 AOTCode=0 AOTData=0 JITHint=0 JITProfile=0
#INFO: SCC_stats_#: ROMClasses=16814 AOTMethods=0 AOTDataEntries=0 AOTHints=0 AOTJitProfiles=0
there isn't even any AOT code in the SCC, which is weird...
Could you destroy the cache and run your program again, collecting the vlog both times? to destroy the cache, do java -Xshareclasses:destroyAll
Could you destroy the cache and run your program again, collecting the vlog both times? to destroy the cache, do java -Xshareclasses:destroyAll
I destroyed the cache by invoking:
java -Xshareclasses:destroyAll -Xjit:verbose={compilePerformance},vlog=vlog
but this outputted an empty vlog file. However, the cache was deleted.
Rerun with
java -Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog
produced this vlog:
I destroyed the cache by invoking:
java -Xshareclasses:destroyAll -Xjit:verbose={compilePerformance},vlog=vlog
but this outputted an empty vlog file. However, the cache was deleted.
Sorry, I should've been more clear, I meant collecting the vlog for both invocations of your benchmark :). However, doesn't look like we need the second invocation; we didn't do any AOT compilations in that vlog...I don't know why :S.
For the sake of sanity, could you do the following:
#destroy cache
java -Xshareclasses:destroyAll
#run version
java -Xshareclasses -Xjit:count=0,verbose={compilePerformance},vlog=vlog -version
I did this with
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
Eclipse OpenJ9 VM (build openj9-0.14.2, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20190521_315 (JIT enabled, AOT enabled)
OpenJ9 - 4b1df46fe
OMR - b56045d2
JCL - a8c217d402 based on jdk8u212-b04)
from https://github.com/AdoptOpenJDK/openjdk8-binaries/releases/download/jdk8u212-b04_openj9-(0.14.2/OpenJDK8U-jdk_x64_linux_openj9_8u212b04_openj9-0.14.2.tar.gz)
And the vlog does show AOT compilations happening:
Samples MAX(ms) TOTAL(ms) AVG(usec) MIN(usec)
Total 1344 170.7 3156.0 2348 32
aotl 11 0.1 0.5 48 32
cold 7 1.7 3.6 519 146
aotw 1275 170.7 3138.0 2461 202
jni 50 0.7 13.7 274 160
fails 1 0.1 0.1 116 116
This will verify that you're able to at least get AOT compilations going.
I destroyed the cache and invoked
java -Xshareclasses -Xjit:count=0,verbose={compilePerformance},vlog=C:/git/vlog -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
Eclipse OpenJ9 VM (build openj9-0.14.2, JRE 1.8.0 Windows 10 amd64-64-Bit Compressed References 20190521_368 (JIT enabled, AOT enabled)
OpenJ9 - 4b1df46fe
OMR - b56045d2
JCL - a8c217d402 based on jdk8u212-b04)
The only difference seems to be that I'm on Windows 10 host
Here is the vlog of that:
vlog.20190528.224833.7220.txt
Ok, so looks like you did get AOT compilations when running -version:
Total 1256 110.0 2284.1 1819 60
aotl 12 0.4 1.3 106 60
cold 7 1.2 3.8 536 193
aotw 1183 110.0 2261.9 1912 221
jni 53 0.9 16.9 318 183
fails 1 0.3 0.3 336 336
so now the question becomes, why are there no AOT compilations when you run your benchmark..
Here is how my cache dir looks like:
C:\Users\Jani\AppData\Local\javasharedresources>dir
Volume in drive C is OS
Volume Serial Number is F2BE-BE2C
Directory of C:\Users\Jani\AppData\Local\javasharedresources
28.05.2019 22.48 <DIR> .
28.05.2019 22.48 <DIR> ..
28.05.2019 11.50 314聽572聽800 C290M12F1A64P_sharedcc_Jani_G37
28.05.2019 22.48 314聽572聽800 C290M4F1A64P_sharedcc_Jani_G37
2 File(s) 629聽145聽600 bytes
2 Dir(s) 12聽341聽964聽800 bytes free
Ok, just for the sake of doing a second sanity, and also to get a sense of how much perf could improve, could you do the following:
# destroy cache
java -Xshareclasses:destroyAll
# Benchmark run 1 (populate SCC)
java -Xms256m -Xmx1G -Xshareclasses -Xaot:forceAOT -Xjit:forceAOT,verbose={compilePerformance},vlog=vlog1 ...
# Benchmark run 2 (load from SCC)
java -Xms256m -Xmx1G -Xshareclasses -Xaot:forceAOT -Xjit:forceAOT,verbose={compilePerformance},vlog=vlog2 ...
I think the key lies in the large number of compilation failures. A quick glance at the vlog shows many "compilationIlGenFailure". We need to understand the reason of that. I'll look at the code.
There's 200 failures out of ~2900 methods, but that doesn't explain why methods like java/lang/StringBuffer.lengthInternalUnsynchronized()I aren't AOT compiled.
There are many possible reasons why the JIT throws ILGenFailure
@jsimomaa To get a better understanding of the failures could you please do a run with -Xjit:printErrorInfoOnCompFailure. For example:
java -Xms256m -Xmx1G -Xshareclasses -Xjit:printErrorInfoOnCompFailure,verbose={compilePerformance},vlog=vlog ...
There will be some messages printed to stderr about the reason for those failures.
@dsouzai Can you update the script to give the total time spent in failed compilations?
The script already prints that out, eg:
Samples MAX(ms) TOTAL(ms) AVG(usec) MIN(usec)
Total 2944 1679.9 29779.0 10115 65
cold 1792 67.8 8044.6 4489 334
warm 949 334.5 17365.6 18299 692
hot 2 1437.3 1447.8 723914 10552
fails 201 1679.9 2921.0 14532 65
ah, ok. Looked straight past that. Thanks.
The problem with compilationIlGenFailures is that compilations are not retried, so the methods subject to these failures will run interpreted for the entire duration of the JVM. This could lead to severe slowdowns depending on the importance of these methods.
That fits - the 10x deficit made me wonder if we were running in the interpreter. If we fail some hot methods it would be similar.
Ok, just for the sake of doing a second sanity, and also to get a sense of how much perf could improve, could you do the following:
# destroy cache java -Xshareclasses:destroyAll # Benchmark run 1 (populate SCC) java -Xms256m -Xmx1G -Xshareclasses -Xaot:forceAOT -Xjit:forceAOT,verbose={compilePerformance},vlog=vlog1 ... # Benchmark run 2 (load from SCC) java -Xms256m -Xmx1G -Xshareclasses -Xaot:forceAOT -Xjit:forceAOT,verbose={compilePerformance},vlog=vlog2 ...
Please find attached here both vlogs:
vlog1.20190528.233827.17284.txt
vlog2.20190528.233925.15140.txt
There are many possible reasons why the JIT throws ILGenFailure
@jsimomaa To get a better understanding of the failures could you please do a run with-Xjit:printErrorInfoOnCompFailure. For example:
java -Xms256m -Xmx1G -Xshareclasses -Xjit:printErrorInfoOnCompFailure,verbose={compilePerformance},vlog=vlog ...
There will be some messages printed to stderr about the reason for those failures.
Here are the prints from stderr:
Compilation Failed Because: IL Gen Failure
Compilation Failed Because: IL Gen Failure
Compilation Failed Because: IL Gen Failure
Compilation Failed Because: IL Gen Failure
Compilation Failed Because: IL Gen Failure
Compilation Failed Because: IL Gen Failure
Compilation Failed Because: IL Gen Failure
... (totaling 168 hits)
Compilation Failed Because: FSD_HAS_INVOKEHANDLE 0
... (totaling 18 hits)
Compilation Failed Because: FSD_HAS_INVOKEHANDLE 2
What did the perf look like for the run that produced vlog2?
vlog2.20190528.233925.15140.txt
Samples MAX(ms) TOTAL(ms) AVG(usec) MIN(usec)
Total 5064 4589.7 15643.5 3089 22
aotl 4117 14.3 2253.0 547 22
cold 322 75.8 2436.7 7567 291
aotw 106 276.6 3030.0 28585 386
warm 96 520.2 2514.5 26193 559
hot 1 4589.7 4589.7 4589724 4589724
jni 183 17.0 314.9 1721 191
fails 239 31.5 504.6 2111 171
$ grep '!' vlog2.20190528.233925.15140.txt | wc
239 2390 56293
$ grep '!' vlog2.20190528.233925.15140.txt | grep 'compilationAotClassReloFailure' | wc
206 2060 47232
Looks like the majority of failures were due to AOT relocations when doing forceAOT. Additionally, there weren't as many failures in the compile run.
Compilation Failed Because: FSD_HAS_INVOKEHANDLE 0
are you running with an agent attached by any chance for debugging purposes? Because that might explain why there's no AOT happening (when we don't specify forceAOT)
First run (vlog1):
23:39:14.749 [main] INFO c.y.p.application.Application - Started Application in 44.284 seconds (JVM running for 47.72)
Second run (vlog2):
23:39:59.526 [main] INFO c.y.p.application.Application - Started Application in 32.227 seconds (JVM running for 33.905)
Compilation Failed Because: FSD_HAS_INVOKEHANDLE 0are you running with an agent attached by any chance for debugging purposes? Because that might explain why there's no AOT happening.
Yes, that is actually true! I should have stated that in the beginning that I was using Visual Studio Code for the first vlogs and it attaches a debugger.
The vlog1 and vlog2 was then purely with cmd and no debuggers attached or anything. Sorry about this, I should have mentioned about this in the setup.
But without the debugger attached I wasn't able to get any output to stderr with printErrorInfoOnCompFailure.
Ahh ok, that at least explains why there was no AOT before. For the sake a comparison to allow OpenJ9 to get the benefit of AOT heh, would you mind doing a run without the debugger :). ie:
-Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog1 ...-Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog2 ...Thanks!
Here is one more vlog no debuggers attached and with args: -Xms256m -Xmx1G -Xshareclasses -Xjit:printErrorInfoOnCompFailure,verbose={compilePerformance},vlog=vlog-new.
vlog-new.20190529.000410.16028.txt
Also, with the shareclasses populated on previous runs the result of the "benchmark" is now ~31s (compared to ~16s with HotSpot):
00:04:43.525 [main] INFO c.y.p.application.Application - Started Application in 31.485 seconds (JVM running for 33.054)
Haha beat me to it :)
Ahh ok, that at least explains why there was no AOT before. For the sake a comparison to allow OpenJ9 to get the benefit of AOT heh, would you mind doing a run without the debugger :). ie:
- Destroy SCC
- Populate SCC; options
-Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog1 ...- Load from SCC; options
-Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog2 ...Thanks!
java -Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog1java -Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog2Here are the files:
vlog1.20190529.001023.4224.txt
vlog2.20190529.001127.2660.txt
perl avgCompTime.pl vlog2.20190529.001127.2660.txt
+ (hot) org/springframework/boot/loader/LaunchedURLClassLoader.loadClass(Ljava/lang/String;Z)Ljava/lang/Class; @ 0x0000000057C50668-0x0000000057C55FF8 OrdinaryMethod 35.20% T Q_SZ=6 Q_SZI=6 QW=46 j9m=0000000003323A40 bcsz=75 time=3105992us mem=[region=91712 system=98304]KB compThread=1 CpuLoad=186%(23%avg) JvmCpu=133%
vlog2.20190529.001127.2660.txt
Samples MAX(ms) TOTAL(ms) AVG(usec) MIN(usec)
Total 5179 3106.0 14187.9 2740 22
aotl 4092 14.3 2269.1 555 22
cold 325 92.0 2039.1 6274 206
aotw 146 332.0 2827.6 19367 310
warm 224 200.4 3352.4 14966 394
hot 1 3106.0 3106.0 3105992 3105992
jni 181 20.4 290.1 1603 170
fails 210 8.5 303.6 1446 387
$ grep '!' vlog2.20190529.001127.2660.txt | grep 'compilationAot' | wc
209 2090 48200
So looks like with AOT, the majority of failures are AOT load failures that https://github.com/eclipse/openj9/issues/4897 was opened to address; this at least removes the ILGenFailures as a potential problem. However, since we're still at 30s, I'm wondering if the aotw and hot compilations in addition to the load failures all play into this. @mpirvu what do you think?
The hot compilation is expensive, no doubt, but I do see other compilations being executed in parallel with. The JVM can use up to 8 cores and the JVM utilization is always under 150%, so the hot compilation should not interfere with application or other compilation threads.
Compilation queue grows up to 370 entries in the first 100 ms. The are bigger jumps in Q_SZ around AOT failures, so I would say that AOT relocation failures play a role.
AOT warm compilations in the second run can be eliminated by making the cache readonly.
There are some relatively big periods of time when compilation queue is close to 0 (for instance between t=18.7s and t=27.7s), so code quality may matter as well.
@dsouzai I created a reduced example for you to test yourself: https://github.com/jsimomaa/spring-boot-openj9
In order to launch the app do:
git clone https://github.com/jsimomaa/spring-boot-openj9.git
cd spring-boot-openj9/
mvn package
After the build is complete the app can be run by invoking
# Insert your vmargs here as needed
java -jar target/spring.boot.openj9-0.0.1-SNAPSHOT.jar
With this example my startup times are
HotSpot (java -Xms256m -Xmx1G -jar target/spring.boot.openj9-0.0.1-SNAPSHOT.jar)
08:52:43.119 [main] INFO com.example.application.Application - Started Application in 7.3 seconds (JVM running for 7.979)
OpenJ9 (java -Xms256m -Xmx1G -Xshareclasses -Xjit:verbose={compilePerformance},vlog=vlog -jar target/spring.boot.openj9-0.0.1-SNAPSHOT.jar)
08:53:12.988 [main] INFO com.example.application.Application - Started Application in 15.356 seconds (JVM running for 16.398)
So the startup time is roughly the same (2x) with OpenJ9 than compared to HotSpot. Hope this helps you to debug further!
Thanks @jsimomaa , that should be really useful.
I will have to spin up my Windows machine but running quickly on my Linux machine I see the following results.
VMs being used are the latest JDK8 LTS builds from AdoptOpenJDK:
OpenJ9
[cgracie@breathe1 5918]# jdk8u212-b04/bin/java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
Eclipse OpenJ9 VM (build openj9-0.14.2, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20190521_315 (JIT enabled, AOT enabled)
OpenJ9 - 4b1df46fe
OMR - b56045d2
JCL - a8c217d402 based on jdk8u212-b04)
HotSpot
[cgracie@breathe1 5918]# jdk8u212-b03/bin/java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)
Results from the run:
OpenJ9:
[cgracie@breathe1 5918]# jdk8u212-b04/bin/java -Xms1g -Xmx1g -jar spring-boot-openj9/target/spring.boot.openj9-0.0.1-SNAPSHOT.jar
.......
07:43:09.737 [main] INFO com.example.application.Application - Started Application in 9.503 seconds (JVM running for 10.881)
OpenJ9 cold run with -Xshareclasses:
[cgracie@breathe1 5918]# jdk8u212-b04/bin/java -Xshareclasses -Xms1g -Xmx1g -jar spring-boot-openj9/target/spring.boot.openj9-0.0.1-SNAPSHOT.jar
....
07:49:18.634 [main] INFO com.example.application.Application - Started Application in 15.504 seconds (JVM running for 18.042)
OpenJ9 warm run with -Xshareclasses:
[cgracie@breathe1 5918]# jdk8u212-b04/bin/java -Xshareclasses -Xms1g -Xmx1g -jar spring-boot-openj9/target/spring.boot.openj9-0.0.1-SNAPSHOT.jar
....
07:51:02.075 [main] INFO com.example.application.Application - Started Application in 6.093 seconds (JVM running for 6.554)
HotSpot:
[cgracie@breathe1 5918]# jdk8u212-b03/bin/java -Xms1g -Xmx1g -jar spring-boot-openj9/target/spring.boot.openj9-0.0.1-SNAPSHOT.jar
....
07:42:22.512 [main] INFO com.example.application.Application - Started Application in 10.318 seconds (JVM running for 11.328)
My results are very consistent.
Since I had JDK 12 builds as well I tested them and they were pretty similar, except the OpenJ9 runs were all about 1 second slower. I did destroy my Shareclasses cache before I started using the new JDK12 builds.
In light of @charliegracie's results, the one difference I see in the options he used is that Xms was also specified. I was having a discussion with @r30shah and he mentioned that in the past, adding the Xms option does tend to improve the performance. I wonder if that's what's resulting in @charliegracie's results showing better perf. Granted, we should probably wait until Charlie spins up his Windows box and tries this out.
I just re-ran with the same -Xms option as @jsimomaa and I reproduced my results. The initial heap setting is not affecting these results.
Ok, I got the testcase running. I find a smaller deficit when running without shareclasses, and an improvement when running with. It looks like there's a significant platform dependency that I don't understand, but I wonder if it relates somehow to the OS or filesystem.
Java versions compared, both downloaded from AdoptOpenJDK a week or so ago:
Hotspot:
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)
OpenJ9:
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b03)
Eclipse OpenJ9 VM (build openj9-0.14.0, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20190417_286 (JIT enabled, AOT enabled)
OpenJ9 - bad1d4d06
OMR - 4a4278e6
JCL - 5590c4f818 based on jdk8u212-b03)
Startup times (three runs, mx1G) :
Hotspot:
3.777, 3.799, 3.749
OpenJ9, without SCC:
4.594, 4.606, 4.747
OpenJ9, with SCC:
3.103, 3.367, 2.928
A fixed heap brings things down a little more:
OpenJ9, with SCC and fixed 1G heap
2.772, 2.692, 2.688
ah, sorry, I should have refreshed before posting - I missed Charlie's results. Oh well, more data points fwiw.
I just re-ran with the same -Xms option as @jsimomaa and I reproduced my results. The initial heap setting is not affecting these results.
Makes a difference for me.
Did you go from no -Xms to a -Xms? My comments were about switching from -Xms1g to -Xms128m.
Yes, I went from no -Xms to setting -Xms = -Xmx
ah, default Xms is only 8M, that would hurt
Windows results on a Windows Server 2012 R2 with 8CPUs
HotSpot:
C:\temp\OpenJ9>jdk8u212-b03\bin\java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)
jdk8u212-b03\bin\java -Xms128m -Xmx1g -jar spring-boot-openj9\target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
18.59s,16.844s,16.313
OpenJ9:
C:\temp\OpenJ9>jdk8u212-b04\bin\java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
Eclipse OpenJ9 VM (build openj9-0.14.2, JRE 1.8.0 Windows Server 2012 R2 amd64-64-Bit Compressed References 20190521_368 (JIT enabled, AOT enabled)
OpenJ9 - 4b1df46fe
OMR - b56045d2
JCL - a8c217d402 based on jdk8u212-b04)
jdk8u212-b04\bin\java -Xms128m -Xmx1g -jar spring-boot-openj9\target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
11.985s,14.252s,12.188s
OpenJ9 with -Xshareclasses:
jdk8u212-b04\bin\java -Xms128m -Xmx1g -Xshareclasses -jar spring-boot-openj9\target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
Cold run 22.058s
Warm runs 8.896s,9.672s,9.25s
That looks pretty good. Which leaves us with the question of what's going on with the OP's setup.
Yep. Trying to see if I can limit my runs to a single CPU to see if that matters. The OP only has a single CPU.
@jsimomaa would it be possible run the following command using the OpenJ9 VM and provide the javacore*.txt and the gc.log?
java -Xms128m -Xmx1g -Xdump:java:events=vmstop -Xverbosegclog:gc.log -verbose:gc Foo
Interesting results indeed - of course Windows Server 2012 is not the same than Windows 10 so there is still a difference. Is there something more I could do or test with my setup?
I think I can get an access to a Windows Server 2012/2016 as well to run my benchmark there.
Single cpu, but it was 8 cores I think? I seem to remember seeing multiple active JIT compilation threads in the logs.
Here is the output of the command @charliegracie suggested:
C:/Progra~1/Java/jdk8u212-b04/bin/java -Xms128m -Xmx1g -Xdump:java:events=vmstop -Xverbosegclog:gc.log -verbose:gc Foo
Error: Could not find or load main class Foo
JVMDUMP039I Processing dump event "vmstop", detail "#0000000000000001" at 2019/05/29 18:45:09 - please wait.
JVMDUMP032I JVM requested Java dump using 'C:\git\spring-boot-openj9\javacore.20190529.184509.18644.0001.txt' in response to an event
JVMDUMP010I Java dump written to C:\git\spring-boot-openj9\javacore.20190529.184509.18644.0001.txt
JVMDUMP013I Processed dump event "vmstop", detail "#0000000000000001".
And here are the files:
gc.log
javacore.20190529.184509.18644.0001.txt
Single cpu, but it was 8 cores I think? I seem to remember seeing multiple active JIT compilation threads in the logs.
Yes, single CPU and 8 cores (I7-6700HQ).
Using start /affinity 1 <command> slowed both JVMs down and appeared to affect HotSpot even more than OpenJ9.
I should be able to spin up a Windows 10 machine to check things out.
I7-6700HQ
That's very close to the i7-6820HQ that I ran on, which makes the difference in absolute times we are seeing pretty weird. IO? I'm running on Linux with an ext4 filesystem on SSD.
@jsimomaa while I spin up a Windows 10 image could you do one more test. Could you add -Xdump:java:events=vmstop to your OpenJ9 command line when you are starting the spring application and provide that javacore.*.txt file? Thanks so much for helping dig into this issue!
I7-6700HQ
That's very close to the i7-6820HQ that I ran on, which makes the difference in absolute times we are seeing pretty weird. IO? I'm running on Linux with an ext4 filesystem on SSD.
Here NVMe PM951 SAMSUNG 512GB and supposedly on NTFS filesystem..
@jsimomaa while I spin up a Windows 10 image could you do one more test. Could you add
-Xdump:java:events=vmstopto your OpenJ9 command line when you are starting the spring application and provide that javacore.*.txt file? Thanks so much for helping dig into this issue!
Well this is getting interesting - I ran (note: without -Xshareclasses):
java -Xms128m -Xmx1g -Xdump:java:events=vmstop -Xverbosegclog:gc.log -verbose:gc -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
And my startup was:
19:06:50.449 [main] INFO com.example.application.Application - Started Application in 7.793 seconds (JVM running for 8.652)
But now with -Xshareclasses (which I've used for every run during this debugging):
java -Xms128m -Xmx1g -Xshareclasses -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
19:07:58.376 [main] INFO com.example.application.Application - Started Application in 11.833 seconds (JVM running for 12.546)
So it seems the -Xshareclasses is slowing things down on my setup..
Here is the javacore.txt file @charliegracie requested:
javacore.20190529.190736.13032.0001.txt
And here javacore.txt with both -Xshareclasses -Xdump:java:events=vmstop:
java -Xms128m -Xmx1g -Xshareclasses -Xdump:java:events=vmstop -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
19:11:29.314 [main] INFO com.example.application.Application - Started Application in 14.251 seconds (JVM running for 14.936)
That is very interesting. For some reason adding -Xshareclasses is making things slower on your system. @JamesKingdon or @dsouzai or @mpirvu can we disable AOT for the shared classes cache via a command line option? It would be interesting to see if just have the ROM classes in the cache slows things down. If it does then it feels like some sort of file IO or something that is making shared classes slow. @jsimomaa is your machine running any virus software? Just trying to think about things that may make file IO slow.
My Windows 10 image does not have anything installed so getting it up to date with GIT, etc will likely take me until tomorrow.
@jsimomaa is your machine running any virus software? Just trying to think about things that may make file IO slow.
Yes, Windows Defender but I've turned it off and have verified that the process is not active during the Spring Boot app startup.
If it does then it feels like some sort of file IO or something that is making shared classes slow
Also made me think of my SSD performanace but I ran couple of benchmarks and the disk seems to be in good shape.
-Xnoaot will disable AOT, but will leave the sharing of classes.
@jsimomaa Just to make sure, is the slowdown induced by -Xshareclasses seen in the second run, or just in the first? The first run is expected to be slower because the cache is empty (non-existent).
The slowdown is on the second run. The first run takes even longer (approximately ~23 seconds).
If you delete your share classes cache and then run with -Xshareclasses and -Xnoaot what is the time for your first and second run? If it is still slow then there is likely some sort of issue with how the file is being read on your system.
java -Xshareclasses:destroyAlljava -Xshareclasses -Xnoaot -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar23:27:24.813 [main] INFO com.example.application.Application - Started Application in 15.577 seconds (JVM running for 17.207)
java -Xshareclasses -Xnoaot -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar23:28:55.581 [main] INFO com.example.application.Application - Started Application in 12.814 seconds (JVM running for 13.777)
java -Xshareclasses -Xnoaot -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar23:29:41.825 [main] INFO com.example.application.Application - Started Application in 12.816 seconds (JVM running for 13.771)
So that's roughly a 60% penalty for just using the SCC. How strange.
That is very strange. I am very interested in seeing how my Windows machine runs once I get all of the pre-reqs installed.
Is there anything strange about the C:UsersJaniAppDataLocaljavasharedresources directory? Extra encryption or compression maybe?
Yeah.. I am starting to think that
If it is still slow then there is likely some sort of issue with how the file is being read on your system.
there is something not quite right with my machine here..
However, we have come down from the initial 160s startup time to ~12s so at least I've learned a lot.
Is there anything strange about the C:UsersJaniAppDataLocaljavasharedresources directory? Extra encryption or compression maybe?
Not that I'm aware of. The folder contains two files.
However, we have come down from the initial 160s startup time to ~12s so at least I've learned a lot.
True, although it ought to be faster, so it's hard to let it go :) We were looking at making shared classes on by default, so we'd definitely want to know if there's a gotcha lurking on some platforms.
We are aware of some opportunities to improve debug mode, and hopefully that gap will be narrowed in future releases.
hmm my Windows 10 box is not showing a slowdown with -Xshareclasses.
default runs:
7.9s,7.8s,7.9s
shareclasses cold:
14.4s
shareclasses warm:
7.1s,6.9s,7.0s
I would have expected -Xshareclasses to provide more of a win but it is still a win even on this Windows 10 machine.
I did not measure Hotspot as I do not believe it is required now since it looks like an issue with -Xshareclasses. @jsimomaa would it be possible to share your jar file just to make sure that my setup is as close as possible?
The improvements I see on my laptop (Windows 10 based) are significant:
-Xms256m -Xmx1G
Started Application in 6.133 seconds (JVM running for 7.087)
Started Application in 6.433 seconds (JVM running for 7.332)
Started Application in 6.539 seconds (JVM running for 7.485)
-Xms256m -Xmx1G -Xshareclasses -Xscmx100M
Started Application in 14.555 seconds (JVM running for 16.19)
Started Application in 3.696 seconds (JVM running for 4.052)
Started Application in 3.505 seconds (JVM running for 3.806)
Started Application in 3.729 seconds (JVM running for 4.004)
@jsimomaa would it be possible to share your jar file just to make sure that my setup is as close as possible?
Here you go: https://drive.google.com/open?id=1dvNNXXmYQZqvlQRzkST4DRGMrm5nsfhh
I tracked the file accesses to javasharedresources folder with Process Monitor and there were total of 5412 tracked events. I'm not sure if this helps at all but still I'm attaching the results here.
True, although it ought to be faster, so it's hard to let it go :) We were looking at making shared classes on by default, so we'd definitely want to know if there's a gotcha lurking on some platforms.
We are aware of some opportunities to improve debug mode, and hopefully that gap will be narrowed in future releases.
Okay, this is good to know.
So if I want to debug my application I either just use OpenJDK+HotSpot for that or suffer the performance loss with openJ9.
The app will run in production inside a container so there openJ9 should give me a nice boost in terms of performance. I can try the performance inside a container later this week and can then provide feedback on that.
I had the chance myself to test this on a fresh Windows 10 machine and here are the results:
$ java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)
$ java -jar target/spring.boot.openj9-0.0.1-SNAPSHOT.jar
09:45:58.547 [main] INFO com.example.application.Application - Started Application in 3.249 seconds (JVM running for 3.643)
C:\Progra~1\AdoptOpenJDK\jdk8u212-b04\bin\java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (build 1.8.0_212-b04)
Eclipse OpenJ9 VM (build openj9-0.14.2, JRE 1.8.0 Windows 10 amd64-64-Bit Compressed References 20190521_368 (JIT enabled, AOT enabled)
OpenJ9 - 4b1df46fe
OMR - b56045d2
JCL - a8c217d402 based on jdk8u212-b04)
# First run, no cache
$ java -Xshareclasses -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
09:54:47.119 [main] INFO com.example.application.Application - Started Application in 11.768 seconds (JVM running for 13.267)
````
$ java -Xshareclasses -jar targetspring.boot.openj9-0.0.1-SNAPSHOT.jar
09:55:59.893 [main] INFO com.example.application.Application - Started Application in 8.138 seconds (JVM running for 8.716)
````
So similar results - the machine itself is a bit newer Dell model with the following specs:
OS Name: Microsoft Windows 10 Pro
OS Version: 10.0.17134 N/A Build 17134
OS Manufacturer: Microsoft Corporation
OS Configuration: Standalone Workstation
OS Build Type: Multiprocessor Free
Registered Owner: N/A
Registered Organization: N/A
Product ID: 00330-50000-00000-AAOEM
Original Install Date: 30/05/2019, 18.56.58
System Boot Time: 30/05/2019, 8.56.55
System Manufacturer: Dell Inc.
System Model: XPS 15 9570
System Type: x64-based PC
Processor(s): 1 Processor(s) Installed.
[01]: Intel64 Family 6 Model 158 Stepping 10 GenuineIntel ~2904 Mhz
BIOS Version: Dell Inc. 1.8.1, 01/02/2019
Windows Directory: C:\Windows
System Directory: C:\Windows\system32
Boot Device: \Device\HarddiskVolume1
System Locale: en-us;English (United States)
Input Locale: fi;Finnish
Time Zone: (UTC+02:00) Helsinki, Kyiv, Riga, Sofia, Tallinn, Vilnius
Total Physical Memory: 32聽504 MB
Available Physical Memory: 23聽750 MB
Virtual Memory: Max Size: 37聽624 MB
Virtual Memory: Available: 27聽090 MB
Virtual Memory: In Use: 10聽534 MB
Page File Location(s): C:\pagefile.sys
I can reproduce the regression with Xshareclasses using the jdk8u212-b04_openj9-0.14.2 SDK (the one used by @jsimomaa). My results above (https://github.com/eclipse/openj9/issues/5918#issuecomment-497107621) were using a development driver I had on my laptop.
The regression is seen even with -Xnoaot, so it's related to class sharing infra rather than AOT.
The regression is seen even with -Xnoaot, so it's related to class sharing infra rather than AOT.
fyi @hangshao0
Hmm interesting since I was using the same VM as the OP and I was not seeing any regression, but I am glad someone can reproduce it :)
@mpirvu any chance you can grab a profile so we can see where all of the time is being spent? The only Windows machines I have access to with Windows 10 are virtualized instances so that may be why I can not reproduce.
Also @mpirvu what was the java -version from your development build? The SHAs will help track down changes
I pulled the 14.2 build for Linux just to see if there was any sign of the regression on that platform, but I'm not seeing it.
If it does then it feels like some sort of file IO or something that is making shared classes slow.
I suspect it is related to timestamp checking of the jars. We do timestamp checking to make sure the jar files are not updated when returning classes from the shared cache. Probably it is worth trying with -Xshareclasses:noTimestampChecks @jsimomaa
Nice - while 14.2 didn't have a regression, turning off the timestamps does give me a startup boost:
1.93, 1.947, 1.985
vs around 2.6s before. It's really flying now!
I remember @mpirvu did some benchmark tests before showing that the timestamp checking has more performance impact on Windows than on Linux.
That fits. I wonder what else we did going from 14.0 to 14.2 that compensates for the timestamp checking overhead (at least on Linux).
So, after we have started loading classes from the SCC, if we discover the timestamp has changed what do we do? I'm just wondering if there is any need to check the timestamps on every access, or if once per run or once per 'arbitrary interval' would be sufficient.
if we discover the timestamp has changed what do we do? I'm just wondering if there is any need to check the timestamps on every access, or if once per run or once per 'arbitrary interval' would be sufficient.
We do not return the class from the shared cache. The class loader goes to the disk to load the class as usually.
A bit of history: On Windows, the jar file cannot be updated if it is hold open. So we skip timestamp checking if the jar file is opened. JCL knows the info if a Jar file is open or not, and used to provide such info to the JVM. Howerver, the patch on the JCL side to detect such info was removed, making the JVM doing timestamp checking on every class loaded from the shared cache. There is an issue open on the extension JCL repo to add this patch back.
OK, I have run the test on my Windows10 using the jdk8u212-b04_openj9-0.14.2 SDK (the one used by @jsimomaa).
java -jar spring.boot.openj9-0.0.1-SNAPSHOT.jar
12:17:48.151 [main] INFO com.example.application.Application - Started Application in 9.556 seconds (JVM running for 10.502)
java -Xshareclasses:name=spring -jar spring.boot.openj9-0.0.1-SNAPSHOT.jar
12:18:39.609 [main] INFO com.example.application.Application - Started Application in 19.274 seconds (JVM running for 21.551)
java -Xshareclasses:name=spring -jar spring.boot.openj9-0.0.1-SNAPSHOT.jar
12:19:03.437 [main] INFO com.example.application.Application - Started Application in 13.58 seconds (JVM running for 14.522)
java -Xshareclasses:name=spring,noTimestampChecks -jar spring.boot.openj9-0.0.1-SNAPSHOT.jar
12:19:55.992 [main] INFO com.example.application.Application - Started Application in 4.127 seconds (JVM running for 4.456)
So timestamp checking is slowing things down (from 4s to 13s). This issue should be gone once JCL add the missing patch back that detects jar file open. But for now if you are not updating the jar file, you can use -Xshareclasses:noTimestampChecks @jsimomaa
@hangshao0 Yep, I can confirm the same behavior on my setup!
java -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
08:24:50.842 [main] INFO com.example.application.Application - Started Application in 4.577 seconds (JVM running for 5.159)
java -Xshareclasses:name=spring -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
08:26:19.972 [main] INFO com.example.application.Application - Started Application in 12.431 seconds (JVM running for 13.818)
java -Xshareclasses:name=spring -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
08:27:40.098 [main] INFO com.example.application.Application - Started Application in 8.646 seconds (JVM running for 9.228)
noTimestampChecksjava -Xshareclasses:name=spring,noTimestampChecks -jar target\spring.boot.openj9-0.0.1-SNAPSHOT.jar
08:28:58.045 [main] INFO com.example.application.Application - Started Application in 2.77 seconds (JVM running for 2.984)
So to me seems like we've found the culprit that's causing the overhead here.
So timestamp checking is slowing things down (from 4s to 13s). This issue should be gone once JCL add the missing patch back that detects jar file open.
Once that change is in, can we revisit this question to see how much overhead remains on different platforms? I'm intrigued by the 30% or so benefit that noTimestampChecks gives on Linux at the moment - I don't think we should leave that on the table if we can avoid it. @mpirvu
I'm intrigued by the 30% or so benefit that noTimestampChecks gives on Linux at the moment
On Linux, using Liberty start-up, noTimestampChecks gives me less than 1% which is noise.
I wonder why this test shows so much more impact - maybe Spring boot brings in a larger number of jars?
-- looks like the whole app is in only one jar on the filesystem which is spring.boot.openj9-0.0.1-SNAPSHOT.jar. That jar contains 157 jars inside it, but it wouldn't make sense to be checking timestamps of those would it? There seems to be an element of mystery here.
edit 2, delta is still reproducible, a solid 30% from disabling timestamp checks.
edit 3: -verbose indicates we load 9072 classes during startup for this app, I can't imagine that's a lot more than a typical liberty test case, but it would be interesting to know what that number is.
I was also able to reproduce a big win on Linux launching this spring application by adding the noTimestampChecks. I definitely think this is something we should investigate further! With -Xshareclasses:noTimestampCheck on Linux this application starts up 2X as fast as HotSpot.
Originally the jar of jar files did not work with -Xshareclasses so I wouldn't be surprised if the timestamp for each individual jar inside the wrapping jar file is also being checked. 157 jar files would be larger than any app I normally run by a very big margin.
We only check the timestamp of the outside most jar, which is spring.boot.openj9-0.0.1-SNAPSHOT.jar in this case. But we are checking this every time class is loaded from the shared cache. (If the JCL patch is added back, this check would only happen once I believe)
For Liberty, one possibility is that they might have called SharedClassURLHelper.setMinimizeUpdateChecks() in their java code that disabled most timestamp checks.
Probably on the VM side, we can change the default behavior to do timestamp check only once for each jar and introduce a new option "forceTimestampChecks" which does timestamp check on every load. @pshipton
Looking at the code, it looks like there's a need to add the concept of time-last-checked and the ability to specify an interval between checks. Checking the same jar timestamp at micro-second resolution makes little sense, particularly when walking the classpath and everything is coming from the same jar file.
If I recall correctly, many of the design decisions for timestamp checking came about from supporting the SCC in development scenarios when you want to be sure you're running the very latest version of the code you just wrote.
Wouldn't that require a mechanism to monitor the FS and pro-actively unload jitted code?
That would be a further improvement / different feature. This was to ensure that we picked the most recent version of the class available.
We don't really "monitor" the FS, we do timestamp check (when necessary) before returning class from the shared cache. Not sure about jitted code, but for classes, if any JVM connecting to the shared cache detects an updated jar, we will mark all potentially stale classes in the shared cache as stale pessimistically. When we later know the class is not actually updated we will remove the stale marking.
Wouldn't that require a mechanism to monitor the FS and pro-actively unload jitted code?
Wouldn't that be equivalent (in a non SCC run) to a JVM loading classes, and then someone modified the jars on disk? I don't think the currently running JVM is expected to now start using the updated jars is it? I would've thought that it just affected the next JVM instance.
Yes, that's where I was heading. I don't think we can be sure that we're running the very latest version of the code, so it probably doesn't matter to anyone if we only check for updates every 100ms instead of on every class load.
it probably doesn't matter to anyone if we only check for updates every 100ms instead of on every class load
if classes are modified and we get some from the cache and some from the jar, they may not be compatible with each other
What has been described so far certainly doesn't prevent that case.
File timestamps are not atomically updated and are not guaranteed to be updated for any given file update. Given the limited resolution of operating system timers and the aforementioned limitations of timestamps no scheme based on timestamp checking can ensure a file read twice at two different points in time is actually the same file.
@pshipton and I have been discussing alternative approaches. There is a tentative design being considered but it needs more discussion in particular input from @hangshao0 is needed.
This will be updated when that design has been fleshed out more.
Peter just discussed with me. In Java 11+, we can get the jar file open information from the class loader rather than using the JCL zip_hook. Since this change may not make it to the 0.15.0 release, we can change the code to check the timestamp only once for each jar during the startup phase for the 0.15.0 release. This will at least partially solve this problem for 0.15.0. Later with jar file open info from class loader, this issue should be fully solved.
Moving this to 0.16.0 as the changes aren't ready yet.
This should be resolved now by #6174
Fixed for the 0.15 release via https://github.com/eclipse/openj9/pull/6401
Most helpful comment
I can reproduce the regression with Xshareclasses using the jdk8u212-b04_openj9-0.14.2 SDK (the one used by @jsimomaa). My results above (https://github.com/eclipse/openj9/issues/5918#issuecomment-497107621) were using a development driver I had on my laptop.
The regression is seen even with -Xnoaot, so it's related to class sharing infra rather than AOT.