Given a simple class like:
public class Foo {
public static void main(String [] args) {
System.out.println("Hello");
}
}
$ javac Foo.java
$ brew install Tenzer/tap/multitime
$ export GRAALVM_HOME=/Users/borkdude/Downloads/graalvm-ce-java8-19.3.0/Contents/Home
$GRAALVM_HOME/bin/native-image Foo --no-server
$ multitime -n100 ./foo
===> multitime results
1: ./foo
Mean Std.Dev. Min Median Max
real 0.007 0.001 0.005 0.008 0.009
user 0.002 0.000 0.002 0.002 0.003
sys 0.002 0.000 0.001 0.002 0.004
$ export GRAALVM_HOME=/Users/borkdude/Downloads/graalvm-ce-java8-20.1.0-dev/Contents/Home
$GRAALVM_HOME/bin/native-image Foo --no-server
$ multitime -n100 ./foo
===> multitime results
1: ./foo
Mean Std.Dev. Min Median Max
real 0.015 0.001 0.011 0.015 0.016
user 0.007 0.001 0.005 0.007 0.008
sys 0.005 0.000 0.004 0.005 0.006
I noticed this when trying out 20.1-dev jdk8 with babashka, a scripting tool made with GraalVM. On GraalVM 19.3.0 java8 it has a startup time of around 14ms, but when I was trying GraalVM 20.1-dev it became 24ms which is 10ms slower.
$ export GRAALVM_HOME=/Users/borkdude/Downloads/graalvm-ce-java8-19.3.0/Contents/Home
$ script/compile
$ multitime -n100 ./bb '(System/getProperty "java.vm.version")'
===> multitime results
1: ./bb "(System/getProperty \"java.vm.version\")"
Mean Std.Dev. Min Median Max
real 0.014 0.001 0.012 0.014 0.016
user 0.005 0.000 0.004 0.005 0.006
sys 0.006 0.001 0.004 0.006 0.008
$ export GRAALVM_HOME=/Users/borkdude/Downloads/graalvm-ce-java8-20.1.0-dev/Contents/Home
$ script/compile
$ multitime -n100 ./bb '(System/getProperty "java.vm.version")'
===> multitime results
1: ./bb "(System/getProperty \"java.vm.version\")"
Mean Std.Dev. Min Median Max
real 0.024 0.002 0.018 0.024 0.026
user 0.011 0.001 0.008 0.011 0.011
sys 0.010 0.001 0.007 0.010 0.012
To build babashka:
$ git clone https://github.com/borkdude/babashka --recursive
Install lein, a Clojure build tool.
Set GRAALVM_HOME to either 19 or 20 with java8.
Then run script/compile (on linux or macos).
This will produce a bb executable.
This does not seem to be the case on linux, so it seems to be mac specific:
On linux:
19:
1: ./bb -e "(+ 1 2 3)"
Mean Std.Dev. Min Median Max
real 0.015 0.010 0.005 0.010 0.035
user 0.008 0.006 0.000 0.006 0.023
sys 0.008 0.007 0.000 0.005 0.031
20:
1: ./bb -e "(+ 1 2 3)"
Mean Std.Dev. Min Median Max
real 0.014 0.009 0.006 0.010 0.037
user 0.006 0.005 0.000 0.005 0.024
sys 0.008 0.006 0.000 0.006 0.027
This data was provided to me by @sogaiu.
@borkdude thanks for your bug report. I will take a look.
@olpaw I re-tested with the newly released 20.0.0 java8 version, but the problem is still there. If it helps, this is an uberjar of babashka:
https://github.com/borkdude/babashka/releases/download/v0.0.70/babashka-0.0.70-standalone.jar
It is provided which each new release:
https://github.com/borkdude/babashka/releases
This is the part of the compilation script using that jar:
if [ -z "$BABASHKA_XMX" ]; then
export BABASHKA_XMX="-J-Xmx3g"
fi
$BABASHKA_VERSION=0.0.70
$GRAALVM_HOME/bin/native-image \
-jar target/babashka-$BABASHKA_VERSION-standalone.jar \
-H:Name=bb \
-H:+ReportExceptionStackTraces \
-J-Dclojure.spec.skip-macros=true \
-J-Dclojure.compiler.direct-linking=true \
"-H:IncludeResources=BABASHKA_VERSION" \
"-H:IncludeResources=SCI_VERSION" \
-H:ReflectionConfigurationFiles=reflection.json \
--initialize-at-run-time=java.lang.Math\$RandomNumberGeneratorHolder \
--initialize-at-build-time \
-H:Log=registerResource: \
-H:EnableURLProtocols=http,https \
--enable-all-security-services \
-H:+JNI \
--verbose \
--no-fallback \
--no-server \
"$BABASHKA_XMX"
@olpaw Some additional data related to general performance, not only startup time.
With 20.0.0-java8 on macOS:
$ time ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
./bb 1.98s user 0.10s system 99% cpu 2.089 total
With 19.3.1-java8 on macOS:
$ time ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
./bb 1.59s user 0.08s system 99% cpu 1.676 total
That's a 25% loss in performance for v20.
On linux the performance on version 20-java8 is slightly worse (6%) but not as noticeable as with macOS:
19.3.1:
$ multitime -n10 -s1 -q ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
===> multitime results
1: -q ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
Mean Std.Dev. Min Median Max
real 1.785 0.037 1.737 1.780 1.873
user 1.709 0.039 1.653 1.716 1.797
sys 0.076 0.015 0.040 0.080 0.092
12:44 AM
20.0.0:
$ multitime -n10 -s1 -q ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
===> multitime results
1: -q ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
Mean Std.Dev. Min Median Max
real 1.897 0.055 1.837 1.872 1.988
user 1.819 0.044 1.762 1.800 1.889
sys 0.078 0.015 0.052 0.078 0.100
This data was provided by @sogaiu who uses linux.
With 20.0.0-java8 on macOS
Thanks for checking with latest releases!
Atm, this is my prime subject:
> otool -L hellojava_19.3.1
hellojava_19.3.1:
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.200.5)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1535.12.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
> otool -L hellojava_latest
hellojava_latest:
/System/Library/Frameworks/CoreServices.framework/Versions/A/CoreServices (compatibility version 1.0.0, current version 921.2.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.200.5)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1535.12.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
We are now additionally dynamically linking to CoreServices and this likely adds to the startup time.
Yep, that's it. If I comment out https://github.com/oracle/graal/blob/9a53453d5528e04dcc9bf8895705c62c5a74c4c7/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/jdk/JNIRegistrationJava.java#L139 I'm back to the startup characteristics of 19.3.1.
While we cannot fully eliminate this dynamic dependency we could make the dependency conditional on actual requirement instead of unconditional linking. @eginez please take this one from here and implement the conditional linking so that we only have to pay for the increased startup time if we actually need it.
@olpaw That's good news! Is this also the cause of the general performance regression of about 25% on macOS? https://github.com/oracle/graal/issues/2136#issuecomment-587941241
That makes sense, we had it conditionally linked before if I recall correctly. I'll issue a fix for this , thanks @olpaw and @borkdude
@borkdude most likely yes. I only tested with a trivial hello world that does nothing except startup. Your examples with babashka are actually doing a bit of work so pure startup is mixed with actual work that would likely water down the effect of increased startup overhead.
@olpaw That's true, but the "general performance" I was referring to later was not about startup (although this issue reported that as the first thing and it's already if that can be fixed). A loop with a 1000000 iterations takes 25% longer on macOS 20-java8 (1.98s) than macOS 19.3.1-java8 (1.59s).
@borkdude tomorrow I will take a look at your bb data by building it locally with the recipe you provide in https://github.com/oracle/graal/issues/2136#issuecomment-587865438
@borkdude I cannot reproduce your +25% against master:
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_old "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.556s
user 0m1.403s
sys 0m0.144s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_old "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.616s
user 0m1.429s
sys 0m0.180s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_old "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.562s
user 0m1.387s
sys 0m0.170s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.614s
user 0m1.411s
sys 0m0.200s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.583s
user 0m1.421s
sys 0m0.150s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.573s
user 0m1.416s
sys 0m0.148s
where bb_old is built with graalvm-ce-java8-19.3.1 and bb with master
babashka-0.0.70> strings bb | grep -p '^com.oracle.svm.core.VM.*='
com.oracle.svm.core.VM.Target.CCompiler=clang|apple|x86_64|10.0.1
com.oracle.svm.core.VM.Target.StaticLibraries=libnet.a|libzip.a|libjava.a|libnio.a|libsunec.a|libffi.a|liblibchelper.a|libstrictmath.a|libjvm.a
com.oracle.svm.core.VM.Target.Platform=org.graalvm.nativeimage.Platform$DARWIN_AMD64
com.oracle.svm.core.VM.Target.LibC=com.oracle.svm.core.c.libc.GLibc
com.oracle.svm.core.VM=GraalVM 20.1.0-dev Java 8 LIBGRAAL
com.oracle.svm.core.VM.Target.Libraries=-framework CoreServices|stdc++|pthread|-framework CoreFoundation|dl|z
@olpaw That sounds promising. But were you able to reproduce it against the released version 20 java8 macOS? I don't know the relation between the master branch and that release.
You are right 20.0 binaries are a tad slower
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_10.0 "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.610s
user 0m1.456s
sys 0m0.151s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_10.0 "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.625s
user 0m1.448s
sys 0m0.175s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_10.0 "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.619s
user 0m1.434s
sys 0m0.182s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_master "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.576s
user 0m1.419s
sys 0m0.151s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_master "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.576s
user 0m1.418s
sys 0m0.156s
paul@pauls-iMac:~/OLabs/bb/babashka-0.0.70> time ./bb_master "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
real 0m1.577s
user 0m1.400s
sys 0m0.172s
Whatever it was I cannot reproduce on master thus it will be fixed in 20.1.
Thank you! I'll keep an eye on the dev releases here https://github.com/graalvm/graalvm-ce-dev-builds/releases.
Yup. Wait for the next GraalVM CE 20.1.0-dev release and let me know if you can confirm my test results from my local graalvm build from master.
@olpaw I just realized that the recipe I gave above with the .jar file also requires a reflection.json. I forgot to give you that. In case that may influence the performance, I'll paste it here anyway. I assume you compiled with an empty reflection.json just to make the build work? Maybe also try this one, if it's not too much to ask.
reflection.json.zip.
@olpaw I'm not able to reproduce the "general performance" problem anymore. The reflection.json doesn't seem to have an impact.
Compiled with "empty" reflection.json: []
$ time ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
./bb 1.53s user 0.08s system 99% cpu 1.614 total
Compiled with full reflection.json:
$ ./bb '(System/getProperty "java.vm.version")'
"GraalVM 20.0.0 CE"
$ time ./bb "(loop [val 0 cnt 1000000] (if (pos? cnt) (recur (inc val) (dec cnt)) val))"
1000000
./bb 1.58s user 0.09s system 99% cpu 1.677 total
So please ignore my previous message. I don't know what caused the issue at the time of my initial reproduction of the second issue, I might have been making a mistake somewhere. Sorry to waste your time on the second issue. I'm glad everything is sorted out now. Thanks a lot!
@olpaw I just realized that the recipe I gave above with the .jar file also requires a reflection.json. I forgot to give you that.
No worries, I ran a full build with lein (thus I had reflection.json in place).
BTW, if you put your reflection configuration instead into a file named reflect-config.json somewhere below META-INF/native-image it will automatically get picked up and also become part of the jar itself. Thus the result of your image building will not depend on external files anymore. See https://github.com/graalvm/graalvm-demos/tree/master/native-image-configure-examples/configure-at-runtime-example/src/main/resources/META-INF/native-image/org.graalvm.nativeimage/configure-at-runtime-example for an example.
@borkdude, @olpaw commits a conditional linking for core services https://github.com/oracle/graal/commit/aa9769b5ddbcd906dd326cfcc5b25790f3cc2aa1
which should help with the performance
Thanks. I'll test this once a new dev release comes out. I'm assuming that will be based on the master branch.
@eginez @olpaw I tested babashka with the new Darwin dev release from https://github.com/graalvm/graalvm-ce-dev-builds/releases/download/20.1.0-dev-20200304_0848/graalvm-ce-java8-darwin-amd64-20.1.0-dev.tar.gz.
I'm still seeing the same slowdown as compared to 19.3.1.
GraalVM CE 20.1.0-dev-20200304_0848 Darwin java 8:
$ multitime -n100 -s0 ./bb -e 'nil'
===> multitime results
1: ./bb -e nil
Mean Std.Dev. Min Median Max
real 0.024 0.011 0.021 0.022 0.133
user 0.010 0.001 0.009 0.010 0.015
sys 0.008 0.003 0.007 0.008 0.033
19.3.1 Darwin java 8:
$ multitime -n100 -s0 /usr/local/bin/bb -e 'nil'
===> multitime results
1: /usr/local/bin/bb -e nil
Mean Std.Dev. Min Median Max
real 0.013 0.001 0.013 0.013 0.017
user 0.005 0.000 0.004 0.005 0.006
sys 0.005 0.001 0.004 0.005 0.008
That can only mean your application uses one of the methods defined in https://github.com/oracle/graal/blob/d3d82d1f23b0d2b584a7d36ea5632dcbe450c6e6/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/jdk/JNIRegistrationJava.java#L152 and therefore depends on CoreServices. Using -H:+PrintAnalysisCallTree should be able to show you what causes one of theses methods to end up in your application. @borkdude can you attach the output of reports/call_tree_native-image_*.txt when you build bb.
@olpaw This file is +400MB unzipped, 37MB zipped, but too big according to Github to upload here. I am hosting it on my Dropbox now here: https://www.dropbox.com/s/hja2zwjtttcnbks/call_tree_bb_20200304_140919.txt.zip?dl=0
@borkdude there are many places where the static analysis sees potential calls to one of the methods listed in https://github.com/oracle/graal/issues/2136#issuecomment-594474124
A lot of the reachabilities can be blamed on the inaccuracies of our static analysis. Currently the analysis is context insensitive and drags in methods just because they are reachable from some method that is reachable from some Runnable implementation. See e.g. the whole subtree of java.security.AccessController.doPrivileged. The CoreServices triggering DefaultProxySelector.getSystemProxy e.g. gets reachable via this inaccuracy.
I do not have a good idea what to do here other than providing substitutions and @Delete classes/methods that you do not want to have in your image and that you know that you will not need at runtime. @christianwimmer any ideas?
Maybe another fix would be to opt out of the dynamic loading of the keychain library (at the penalty of a runtime error)?
I don't think we can do much on this issue. The problem is the inefficiency of the Apple dynamic linker/loader.
Due to impreciseness of the static analyzer the exact same applications become slower by upgrading from 19.3.1 to 20 because the binary ends up dynamically linking a library that isn't even used. Is this acceptable? I think one of the goals of GraalVM native-image is to have the best startup time possible?
Yes, startup time is important, but correctness can of course never be compromised.
You are correct that a more precise static analysis could detect that, but our current context insensitive analysis it too limited.
@christianwimmer @olpaw @eginez @thomaswue
What about having a setting of forcibly opting out of dynamic linking of certain libraries (with the responsibility of any runtime errors being on the person who uses these options) to improve startup time in case the static analyzer is too broad?
I don't think 10 ms startup time (on MacOS only) are worth the risk of strange linking errors later at run time.
The problem is the inefficiency of the Apple dynamic linker/loader.
Thanks for putting time into this. I hope the Apple people can improve their dynamic linking speed.
No worries thanks for working with us. I am closing this issue for now
The suggested manual substitutions to "help" the static analysis until we further improve it should be a possible workaround?
@thomaswue I'd like to try this idea but I have no idea where to start. Can I find any docs on this and which methods should I mark as @Delete?
Can I find any docs on this and which methods should I mark as @Delete?
I would start with something like
import org.graalvm.nativeimage.Platform;
import org.graalvm.nativeimage.Platforms;
import com.oracle.svm.core.annotate.TargetClass;
import com.oracle.svm.core.annotate.Delete;
public final class CutOffCoreServicesDependencies {
}
@Platforms(Platform.DARWIN.class)
@TargetClass(className = "sun.net.spi.DefaultProxySelector")
@Delete
final class Target_sun_net_spi_DefaultProxySelector {
}
@Platforms(Platform.DARWIN.class)
@TargetClass(className = "apple.security.KeychainStore")
@Delete
final class Target_apple_security_KeychainStore {
}
If this is too aggressive and breaks building (because reports/call_tree_native-image_*.txt shows situations where the deleted classes are referenced for a reason (i.e. not due to analysis imprecision)) then you have to cut off (or substitute) branches on a more fine granular (@Delete or @Substitute on methods) level. If you grep for these annotations in our repo you will find pleny of examples.
To javac the above snippet you will need to have svm.jar on you classpath (contains com.oracle.svm.core.annotate):
$GRAALVM_HOME/bin/javac -cp $GRAALVM_HOME/lib/svm/builder/svm.jar CutOffCoreServicesDependencies.java
# or on 8:
$GRAALVM_HOME/bin/javac -cp $GRAALVM_HOME/jre/lib/svm/builder/svm.jar CutOffCoreServicesDependencies.java
_Caveat emptor, ..._
Follow up:
With the above Java code the startup time was fixed:
$ multitime -n100 -s0 ./bb -e 'nil'
===> multitime results
1: ./bb -e nil
Mean Std.Dev. Min Median Max
real 0.013 0.000 0.013 0.013 0.016
user 0.005 0.000 0.005 0.005 0.005
sys 0.006 0.001 0.004 0.006 0.008
I had to add the flag --report-unsupported-elements-at-runtime to get it to compile, but a lot of my unit tests broke with the error message:
Unsupported method sun.net.spi.DefaultProxySelector.select(URI) is reachable
I have a good set of unit tests and additionally I test babashka with a selection of libraries that are supposed to work with them.
Adding the sun.net.spi.DefaultProxySelector back, while only "deleting" apple.security.KeychainStore got me the desired results:
I still need the flag --report-unsupported-elements-at-runtime probably due to the analyzer ending up discovering that apple.security.KeychainStore is used, while it's probably not used for anything.
Note to self:
I had to add:
["/Users/borkdude/Downloads/graalvm-ce-java8-20.1.0-dev/Contents/Home//jre/lib/boot/graal-sdk.jar"
"/Users/borkdude/Downloads/graalvm-ce-java8-20.1.0-dev/Contents/Home//jre/lib/svm/builder/svm.jar"]
to the classpath to get it to compile.
Thanks for all the help.
@olpaw Would it be an idea to have an option to turn:
error:com.oracle.svm.hosted.substitute.DeletedElementException: Unsupported type apple.security.KeychainStore is reachable
into a warning so I won't need the --report-unsupported-elements-at-runtime flag? I mean, I already know the KeychainStore is reachable, that's the reason why I added a @Delete annotation, but I would still like to get exceptions for other classes maybe?
I mean, I already know the KeychainStore is reachable, that's the reason why I added a @Delete annotation
Think about all the other @Delete annotations we have. Usually we want a user to know when a build makes something reachable that we cannot support. What you suggest would be something else. Like @Delete(strict = false). Community contributions are welcome :-)
cc @christianwimmer
@olpaw Alternatively there could be an option --report-unsupported-elements-at-build-time that can be used together with --report-unsupported-elements-at-runtime. If the flag is provided, GraalVM could emit the unsupported classes to a file or stdout, while still not crashing (the name of the flag probably needs some thought).
@olpaw Despite the fix with substitutions in https://github.com/oracle/graal/issues/2136#issuecomment-595688524 I'm now back to where we started with the dev build https://github.com/graalvm/graalvm-ce-dev-builds/releases/tag/20.2.0-dev-20200423_0149.
When compiling with the new dev-build I'm seeing a startup of around 28ms instead of 13ms and that a lot more dynamic libraries are loaded compared to compiling with 19.3.1:
19.3.1:
$ otool -L /usr/local/bin/bb
/usr/local/bin/bb:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 307.5.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1349.8.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.8)
Dev build:
$ otool -L bb
bb:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1575.17.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1575.17.0)
/usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
Can I prevent this somehow, if I'm not interested in these libraries, to still get good startup time on macOS?
Another issue I found is that the binary with the new dev build is around 6MB larger. For what reason?
We figured out that /usr/lib/libobjc.A.dylib is needed on OSX. @eginez knows more about that.
@eginez I'd like to learn more about the reason why this was needed. So far I haven't seen any problems with babashka on 19.3.1 on macOS where the library wasn't loaded. Is it maybe possible to opt out of loading certain dynamic libraries at the cost of errors (on the person who chooses to opt out) to restore startup time in macOS? Adding more dynamic libraries adds tens of milliseconds of startup time, is the end of this in sight?
To be honest I don't recall any previous change bringing in libobjc directly. However we did decide to increate the size of the image in order to properly support time zones. 6Mb though does not sound correct and we would have take a look at why libobjc is getting linked in
Ok I took some time today to analyze some of the variation. I built bb with
Below the results
➜ babashka git:(master) ✗ for i in `ls bb-*`
do
multitime -n100 -s0 ./$i -e 'nil'
echo -Size-
bsz=`stat -f "%z" bb-19.3.1`
sz=`stat -f "%z" $i`
diff=$(echo $sz - $bsz|bc)
printf '%s\t\t%s\t%s\n' $i $sz $diff
otool -L $i
done
===> multitime results
1: ./bb-19.3.1 -e nil
Mean Std.Dev. Min Median Max
real 0.010 0.005 0.008 0.010 0.061
user 0.004 0.000 0.003 0.004 0.004
sys 0.004 0.001 0.003 0.004 0.009
-Size-
bb-19.3.1 48145656 0
bb-19.3.1:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 800.7.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.0.0)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1673.126.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
===> multitime results
1: ./bb-20.2.0 -e nil
Mean Std.Dev. Min Median Max
real 0.014 0.000 0.013 0.013 0.015
user 0.006 0.000 0.006 0.006 0.006
sys 0.005 0.000 0.005 0.005 0.006
-Size-
bb-20.2.0 54356548 6210892
bb-20.2.0:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 800.7.0)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1673.126.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.0.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1673.126.0)
/usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
===> multitime results
1: ./bb-master -e nil
Mean Std.Dev. Min Median Max
real 0.012 0.008 0.011 0.011 0.097
user 0.005 0.000 0.005 0.005 0.006
sys 0.005 0.001 0.004 0.004 0.013
-Size-
bb-master 54356564 6210908
bb-master:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 800.7.0)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1673.126.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.0.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1673.126.0)
/usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
===> multitime results
1: ./bb-reverts -e nil
Mean Std.Dev. Min Median Max
real 0.010 0.000 0.009 0.010 0.013
user 0.004 0.000 0.004 0.004 0.005
sys 0.004 0.000 0.004 0.004 0.005
-Size-
bb-reverts 54160544 6014888
bb-reverts:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 800.7.0)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1673.126.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1281.0.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
Some observations
28ms run-time for this particular work load in any of the builds.@loicottet can probably expand on the need for 5c5417b. To investigate the size @borkdude I'd like to run a git bisect to see if I can find a probable cause. On your earlier comment(https://github.com/oracle/graal/issues/2136#issuecomment-595814343) it seems like you successfully created a build that was considered good using an snapshot build of 20.1 would you be able to provide the git hash of such build? (found in $GRAALVM_HOME/Contents/Home/release)
Ok I actually ran git bisect with the range starting with the commit from 19.3.1 and ending in master and I found what seems to be the issue causing the increase:
https://github.com/oracle/graal/commit/9d5ce43fbb2093ab2d8d76e38b3266a91c59815d Load classes and resources for 'javax.xml.*'.
I confirmed that is actually the problem by disabling this feature https://github.com/oracle/graal/blob/0ee434c01b4fac9403d8c988f13041af8394467c/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/xml/JavaxXmlClassAndResourcesLoaderFeature.java and on that build the size
➜ babashka git:(master) ✗ stat -f "%z" bb
46696340
@arodionov I know you have been working on this feature, is there something we can do minimize the impact on the image size
@eginez thanks a lot for investigating!
I don't recall any previous change bringing in libobjc directly.
Ahh sorry I remember now ... this was driven by the need to generate OSX binaries that do not link against the internal CoreFoundation library. @loicottet implemented that. Its needed for OSX so that the binaries that we produce can be uploaded to AppStore. See https://github.com/oracle/graal/pull/2266
Thanks @eginez @olpaw !
If this is related to XML, then it makes sense. Babashka includes xml functionality, but to do so, I've only added these classes to the reflection config:
com.sun.xml.internal.stream.XMLInputFactoryImpl
{:methods [{:name "<init>" :parameterTypes []}]}
com.sun.xml.internal.stream.XMLOutputFactoryImpl
{:methods [{:name "<init>" :parameterTypes []}]}
How does the XML feature work? Does it automatically detect XML usage and can I now remove these classes from the reflection config for version 20.1 or 20.2?
This is what I'm seeing with the latest dev release (https://github.com/graalvm/graalvm-ce-dev-builds/releases/tag/20.2.0-dev-20200423_0149) on my system, a Macbook Pro 2019 i9, 32GB ram, macOS mojave 10.14.6:
$ multitime -n100 -s0 ./bb -e 'nil'
===> multitime results
1: ./bb -e nil
Mean Std.Dev. Min Median Max
real 0.023 0.002 0.019 0.023 0.031
user 0.010 0.001 0.008 0.010 0.012
sys 0.010 0.002 0.007 0.010 0.017
$ otool -L ./bb
./bb:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1575.17.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1575.17.0)
/usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
md5-c492ea8b7bf21d3d46c1804716dbcff6
$ multitime -n100 -s0 tmp/bb-19.3.1 -e 'nil'
===> multitime results
1: tmp/bb-19.3.1 -e nil
Mean Std.Dev. Min Median Max
real 0.011 0.005 0.009 0.010 0.061
user 0.004 0.000 0.003 0.004 0.005
sys 0.004 0.001 0.003 0.004 0.010
md5-1a4b05ab991407565104282a7706c569
$ otool -L tmp/bb-19.3.1
tmp/bb-19.3.1:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1575.17.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
This is a 12ms startup regression on my system. I'm not planning to deploy my binaries to the AppStore and I would like to opt out of this startup regression if that is possible.
@borkdude
How does the XML feature work?
During the analysis, it detects if there are methods in a call-tree that instantiate XML-parsers via reflection, and registers those parsers. The feature code.
Does it automatically detect XML usage and can I now remove these classes from the reflection config for version 20.1 or 20.2?
Yes, you can remove reflection config for XML.
For 20.1 it will be used the previous version of this feature (that brings all XML-parser and is not optimal). For 20.2, I think, it will be the current implementation.
@eginez
I know you have been working on this feature, is there something we can do minimize the impact on the image size
This benchmark uses the last (already optimized) version of the feature. We can try to check maybe some libraries or frameworks also triggers XML-parsers instantiation and substitute those calls. Should have a call-tree for this
Have checked, during image build, the following parsers are instantiated via methods calls (it means that those methods are present somewhere in a babashka code or libraries):
javax.xml.stream.FactoryFinder.find(java.lang.Class, java.lang.String) -> StAXParserClassesjavax.xml.transform.FactoryFinder.find(java.lang.Class, java.lang.String) -> TransformerClassesAndResourcesjavax.xml.parsers.DocumentBuilderFactory.newInstance() -> DOMParserClassesjavax.xml.parsers.SAXParserFactory.newInstance() -> SAXParserClasses@arodionov Thank you for explaining
@borkdude as I can see, the most XML-parsers coming via https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/PgSQLXML.java#L240
@arodionov I only recently added support for postgres to babashka, only as of today actually. Try with commit 2fef8a793e22237a53059737cab91f336aef699b, that was the commit before psql support was added.
I'm also adding clojure.data.xml to it, so you can do things like this:
$ echo '"<foo></foo>"' | bb '(-> *input* xml/parse-str xml/emit-str)'
"<?xml version=\"1.0\" encoding=\"UTF-8\"?><foo/>"
so I'm not surprised that XML-related classes are detected, I was only surprised by the difference in binary size between 19.3.1 and 20.1-dev/20.2-dev.
@borkdude yes, you are right, for the provided commit the following methods bring all XML via:
clojure.data.xml.jvm.parse$make_input_factory.invokeStatic(java.lang.Object)clojure.data.xml.jvm.emit$write_document.invokeStatic(java.lang.Object, java.lang.Object, java.lang.Object)java.util.Properties.storeToXML(java.io.OutputStream, java.lang.String, java.lang.String) -> XmlSupport.savejava.util.Properties.loadFromXML(java.io.InputStream) -> XmlSupport.loadMost of the XML classes are brought by java.util.Properties methods. For Java 11 storeToXML and loadFromXML are implemented in a different way (without reflection usage and triggering almost all parsers).
@borkdude in your reflection.json config I'd like to advise to specify methods for reflection more selectively (it can reduce the size of the image).
For example, for
{ "name" : "java.util.Properties", "allPublicMethods" : true, ... }
you can exclude storeToXML and loadFromXML to be analyzed and included into the image (with all XML-related stuff).
@arodionov Thanks for the advice.
I have tried this now with the 19.3.1 version of babashka:
$ ./bb '(.storeToXML (System/getProperties) System/out "foo")'
clojure.lang.ExceptionInfo: Provider com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl not found [at line 1, column 1]
As you can see this fails. But with 20.2-dev this works:
$ tmp/bb-20.2-dev '(.storeToXML (System/getProperties) System/out "foo")'
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>foo</comment>
<entry key="java.class.path"/>
<entry key="user.name">borkdude</entry>
<entry key="java.ext.dirs"/>
<entry key="java.vendor">Oracle Corporation</entry>
So this explains the extra binary size: there is actually more functionality available now.
Since this is a Clojure interpreter, I don't know ahead of time which classes and methods people are going to use, but I'll consider removing java.util.Properties.storeToXML and java.util.Properties.loadFromXML since these methods are not likely to be used in Clojure programs.
@sogaiu did some measurements and omitting the java.util.Property classes saves about 5MB of binary size, so this explains that part.
@eginez @olpaw I think the only unresolved bit in this issue is startup time and dynamically linking. I'll just past that part of the comment here again.
This is what I'm seeing with the latest dev release (https://github.com/graalvm/graalvm-ce-dev-builds/releases/tag/20.2.0-dev-20200423_0149) on my system, a Macbook Pro 2019 i9, 32GB ram, macOS mojave 10.14.6:
$ multitime -n100 -s0 ./bb -e 'nil'
===> multitime results
1: ./bb -e nil
Mean Std.Dev. Min Median Max
real 0.023 0.002 0.019 0.023 0.031
user 0.010 0.001 0.008 0.010 0.012
sys 0.010 0.002 0.007 0.010 0.017
$ otool -L ./bb
./bb:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1575.17.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1575.17.0)
/usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)
md5-c492ea8b7bf21d3d46c1804716dbcff6
$ multitime -n100 -s0 tmp/bb-19.3.1 -e 'nil'
===> multitime results
1: tmp/bb-19.3.1 -e nil
Mean Std.Dev. Min Median Max
real 0.011 0.005 0.009 0.010 0.061
user 0.004 0.000 0.003 0.004 0.005
sys 0.004 0.001 0.003 0.004 0.010
md5-1a4b05ab991407565104282a7706c569
$ otool -L tmp/bb-19.3.1
tmp/bb-19.3.1:
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.250.1)
/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1575.17.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.11)
This is a 12ms startup regression on my system. I'm not planning to deploy my binaries to the AppStore and I would like to opt out of this startup regression if that is possible.
This is a 12ms startup regression on my system. I'm not planning to deploy my binaries to the AppStore and I would like to opt out of this startup regression if that is possible.
If Apple does not want users to link to CoreFoundation directly we would like to respect that.
It would be interesting to see why your measurement results do not seem to match the results from @loicottet as shown on https://github.com/oracle/graal/pull/2266. Maybe a different version of OSX or XCode is responsible for that mismatch.
@olpaw
I see two more dynamic libraries linked in 20.2-dev:
All of the other dynamic libraries are still there, so also the older API that should not be used anymore? See previous comment https://github.com/oracle/graal/issues/2136#issuecomment-619351276.
Was this a request made by Apple? I read #2266 as a way of getting things into the AppStore, not as something that should be avoided?
I don't think so. See https://github.com/oracle/graal/issues/1568#issue-476699005 from @johanvos
Relying on linking against non-public symbols sounds like a bad idea to me ... independent from being able to upload native-images to the AppStore. @christianwimmer how should we handle this case?
@olpaw To be clear, I'm only interested in getting the good startup time on macOS back like it was in 19.3.1. How that is done is an implementation detail to me, but maybe there is not much that can be done? Is the result of loading the public API that 2 more libraries compared to 19.3.1 are being loaded, i.e. the public API loading 2 other libraries (including the private one that was used before)?
Is the result of loading the public API that 2 more libraries compared to 19.3.1 are being loaded, i.e. the public API loading 2 other libraries (including the private one that was used before)?
https://github.com/oracle/graal/pull/2266/files#diff-bfc44cccccc87507300243b7d38779bd
I assume us now having -ObjC on the linker command line plays out like this.
Most helpful comment
No worries, I ran a full build with lein (thus I had
reflection.jsonin place).BTW, if you put your reflection configuration instead into a file named
reflect-config.jsonsomewhere belowMETA-INF/native-imageit will automatically get picked up and also become part of the jar itself. Thus the result of your image building will not depend on external files anymore. See https://github.com/graalvm/graalvm-demos/tree/master/native-image-configure-examples/configure-at-runtime-example/src/main/resources/META-INF/native-image/org.graalvm.nativeimage/configure-at-runtime-example for an example.