2.0.0 seems to be slower to start and I'd like to know why!
Some timing notes:
2.0.0
java -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -jar target/*.jar | grep GC
2017-12-18T13:55:08.899+0800: 0.656: [GC (Allocation Failure) 65536K->4743K(251392K), 0.0040314 secs]
2017-12-18T13:55:09.181+0800: 0.938: [GC (Allocation Failure) 70279K->7921K(251392K), 0.0060079 secs]
2017-12-18T13:55:09.430+0800: 1.187: [GC (Allocation Failure) 73457K->8195K(251392K), 0.0061011 secs]
2017-12-18T13:55:09.436+0800: 1.194: [GC (Metadata GC Threshold) 9643K->8065K(316928K), 0.0054321 secs]
2017-12-18T13:55:09.442+0800: 1.199: [Full GC (Metadata GC Threshold) 8065K->7701K(220672K), 0.0215755 secs]
2017-12-18T13:55:10.129+0800: 1.887: [GC (Allocation Failure) 138773K->14315K(220672K), 0.0055671 secs]
2017-12-18T13:55:10.783+0800: 2.540: [GC (Allocation Failure) 145387K->21461K(335360K), 0.0098723 secs]
2017-12-18T13:55:11.693+0800: 3.450: [GC (Metadata GC Threshold) 247050K->29365K(338944K), 0.0139084 secs]
2017-12-18T13:55:11.707+0800: 3.464: [Full GC (Metadata GC Threshold) 29365K->23158K(371712K), 0.0749999 secs]
1.5.x
java -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -jar target/*.jar | grep GC
2017-12-18T13:55:01.892+0800: 0.518: [GC (Allocation Failure) 65536K->4312K(251392K), 0.0036483 secs]
2017-12-18T13:55:02.135+0800: 0.761: [GC (Allocation Failure) 69848K->8099K(251392K), 0.0061107 secs]
2017-12-18T13:55:02.380+0800: 1.006: [GC (Allocation Failure) 73635K->7604K(251392K), 0.0052713 secs]
2017-12-18T13:55:02.625+0800: 1.251: [GC (Metadata GC Threshold) 66324K->9032K(316928K), 0.0078808 secs]
2017-12-18T13:55:02.633+0800: 1.259: [Full GC (Metadata GC Threshold) 9032K->8723K(250368K), 0.0269571 secs]
2017-12-18T13:55:03.387+0800: 2.013: [GC (Allocation Failure) 139795K->20439K(250368K), 0.0090122 secs]
2017-12-18T13:55:03.842+0800: 2.468: [GC (Allocation Failure) 151511K->22637K(311296K), 0.0090661 secs]
2017-12-18T13:55:04.448+0800: 3.074: [GC (Allocation Failure) 212077K->27815K(313344K), 0.0122303 secs]
Spring Security's BCrypt
class appears as hot method. Add a NoOpPasswordEncoder
bean yields the following results:
NoOp:
Started SampleActuatorUiApplication in 3.632 seconds (JVM running for 4.032)
Started SampleActuatorUiApplication in 3.536 seconds (JVM running for 3.968)
Started SampleActuatorUiApplication in 3.437 seconds (JVM running for 3.846)
BCrypt:
Started SampleActuatorUiApplication in 3.826 seconds (JVM running for 4.237)
Started SampleActuatorUiApplication in 3.777 seconds (JVM running for 4.193)
Started SampleActuatorUiApplication in 3.654 seconds (JVM running for 4.067)
I've raise https://github.com/spring-projects/spring-security/issues/4915
Finally had a chunk of time to dedicate to this, breaking out a number of new issues:
BeanInfo
mechanism (#9873 & SPR-16322)OnEnableEndpointCondition
default enabled result (#11407)Binder
(#11352)PropertyMapper
to use arrays rather than lists (#11411)Benchmark timings with Framework and Security patches and all Boot changes except the BeanInfo
update:
SpringBoot15xBenchmark.explodedJarLauncher avgt 10 2.910 ± 0.079 s/op
SpringBoot15xBenchmark.explodedJarMain avgt 10 2.007 ± 0.032 s/op
SpringBoot15xBenchmark.fatJar avgt 10 2.738 ± 0.025 s/op
SpringBoot20M7Benchmark.explodedJarLauncher avgt 10 3.610 ± 0.125 s/op
SpringBoot20M7Benchmark.explodedJarMain avgt 10 2.390 ± 0.052 s/op
SpringBoot20M7Benchmark.fatJar avgt 10 3.383 ± 0.035 s/op
SpringBoot20xBenchmark.explodedJarLauncher avgt 10 2.952 ± 0.048 s/op
SpringBoot20xBenchmark.explodedJarMain avgt 10 2.288 ± 0.034 s/op
SpringBoot20xBenchmark.fatJar avgt 10 2.815 ± 0.040 s/op
And petclinic:
--- 1.5.x
PetclinicLatestBenchmark.devtoolsRestart avgt 10 1.347 ± 0.066 s/op
PetclinicLatestBenchmark.explodedJarMain avgt 10 4.400 ± 0.077 s/op
PetclinicLatestBenchmark.fatJar avgt 10 5.868 ± 0.093 s/op
PetclinicLatestBenchmark.noverify avgt 10 5.184 ± 0.071 s/op
--- 2.0.0.M7
PetclinicLatestBenchmark.devtoolsRestart avgt 10 0.001 ± 0.001 s/op
PetclinicLatestBenchmark.explodedJarMain avgt 10 5.561 ± 0.159 s/op
PetclinicLatestBenchmark.fatJar avgt 10 7.616 ± 0.083 s/op
PetclinicLatestBenchmark.noverify avgt 10 6.936 ± 0.171 s/op
--- 2.0.0.BUILD-SNAPSHOT
PetclinicLatestBenchmark.devtoolsRestart avgt 10 0.001 ± 0.001 s/op
PetclinicLatestBenchmark.explodedJarMain avgt 10 5.166 ± 0.080 s/op
PetclinicLatestBenchmark.fatJar avgt 10 6.766 ± 0.087 s/op
PetclinicLatestBenchmark.noverify avgt 10 6.001 ± 0.060 s/op
Relevant info from @dsyer
Here's some interesting performance data for 2.0.0.M7
Boot Benchmark (sample) Mode Cnt Score Error Units
2.0.0.M7 MainBenchmark.main actr ss 10 1.623 ± 0.036 s/op
2.0.0.M7 MainBenchmark.main actr(*) ss 10 1.469 ± 0.028 s/op
1.5.9 OldBenchmark.old actr ss 10 1.386 ± 0.083 s/op
The (*)
sample has -Dspring.autoconfigure.exclude=org.springframework.boot.actuate.autoconfigure.metrics.MetricsAutoConfiguration -Dmanagement.endpoints.jmx.enabled=false
I spotted those as hotspots using an Aspect that prints timings for AbstractAutowireCapableBeanFactory.initializeBean()
That takes us about half way back to 1.5.9. Only considering Actuator features. Their initialization probably could be deferred - e.g. WebMvcMetricsFilter
eagerly instantiates a bean in Spring MVC that would otherwise be @Lazy
(HandlerMappingIntrospector
)
Excluding JMX also speeds up 1.5.9, but I don't think there was an option to exclude only Actuator JMX in 1.5.9
TomcatServletWebServerFactory
also shows up on that hostspot list
Can't switch that one off, but maybe it shows us a place to look
Using https://github.com/dsyer/spring-boot-startup-bench with the following:
cd static; mvn install; java -jar target/benchmarks.jar main snap old
Currently gives me:
Benchmark (sample) Mode Cnt Score Error Units
SnapBenchmark.snap empt avgt 10 0.823 ± 0.014 s/op
SnapBenchmark.snap demo avgt 10 1.681 ± 0.071 s/op
SnapBenchmark.snap actr avgt 10 2.222 ± 0.067 s/op
SnapBenchmark.snap jdbc avgt 10 1.926 ± 0.061 s/op
SnapBenchmark.snap actj avgt 10 2.526 ± 0.035 s/op
SnapBenchmark.snap jpae avgt 10 3.002 ± 0.078 s/op
SnapBenchmark.snap conf avgt 10 3.026 ± 0.362 s/op
MainBenchmark.main empt ss 10 0.958 ± 0.037 s/op
MainBenchmark.main demo ss 10 1.869 ± 0.095 s/op
MainBenchmark.main actr ss 10 2.514 ± 0.063 s/op
MainBenchmark.main jdbc ss 10 2.107 ± 0.082 s/op
MainBenchmark.main actj ss 10 2.797 ± 0.205 s/op
MainBenchmark.main jpae ss 10 3.168 ± 0.219 s/op
MainBenchmark.main conf ss 10 2.884 ± 0.111 s/op
MainBenchmark.main erka ss 10 4.098 ± 0.232 s/op
MainBenchmark.main busr ss 10 4.374 ± 0.642 s/op
MainBenchmark.main zuul ss 10 5.800 ± 0.460 s/op
MainBenchmark.main erkb ss 10 5.872 ± 0.447 s/op
MainBenchmark.main slth ss 10 6.006 ± 0.611 s/op
OldBenchmark.old empt ss 10 0.747 ± 0.094 s/op
OldBenchmark.old demo ss 10 1.761 ± 0.261 s/op
OldBenchmark.old actr ss 10 2.190 ± 0.190 s/op
OldBenchmark.old jdbc ss 10 1.984 ± 0.093 s/op
OldBenchmark.old actj ss 10 2.470 ± 0.140 s/op
OldBenchmark.old jpae ss 10 2.887 ± 0.178 s/op
OldBenchmark.old conf ss 10 3.185 ± 0.276 s/op
OldBenchmark.old erka ss 10 4.083 ± 0.846 s/op
OldBenchmark.old buna ss 10 3.076 ± 0.313 s/op
OldBenchmark.old busr ss 10 3.944 ± 0.330 s/op
OldBenchmark.old zuul ss 10 5.926 ± 0.504 s/op
OldBenchmark.old erkb ss 5 5.713 ± 0.387 s/op
OldBenchmark.old slth ss 10 5.915 ± 0.671 s/op
(this is with the Spring Security patch applied)
| Project | 1.5.x | Milestone | Snapshot |
| --- | --- | --- | --- |
| empt | 0.747 ± 0.094 | 0.958 ± 0.037 | 0.823 ± 0.014 |
| demo | 1.761 ± 0.261 | 1.869 ± 0.095 | 1.681 ± 0.071 |
| actr | 2.190 ± 0.190 | 2.514 ± 0.063 | 2.222 ± 0.067 |
| jdbc | 1.984 ± 0.093 | 2.107 ± 0.082 | 1.926 ± 0.061 |
| actj | 2.470 ± 0.140 | 2.797 ± 0.205 | 2.526 ± 0.035 |
| jpae | 2.887 ± 0.178 | 3.168 ± 0.219 | 3.002 ± 0.078 |
| conf | 3.185 ± 0.276 | 2.884 ± 0.111 | 3.026 ± 0.362 |
Those results look a bit different to mine so it's a surprise. Also, JPA isn't particularly expensive in Boot 2.0 for me. One observation: your error bars are quite large. This often indicates that you ran the tests on a noisy system (not enough free resources, memory, CPU etc.). If you strip down to just running this process and no IDE/browser etc, does it stabilize?
Also (although it doesn't matter for the "snap" comparison), I forgot to say, the "erk*" samples need a local Eureka. You can run it with Spring Boot CLI (with the Cloud extension), or just a vanilla @EnableEurekaServer
running on port 8761.
Here's my run from this morning on different hardware (still Ubuntu 16.04):
$ java -jar target/benchmarks.jar main snap old -p sample=empt,demo,actr,jdbc,actj,jpae,conf
Benchmark (sample) Mode Cnt Score Error Units
SnapBenchmark.snap empt avgt 10 0.825 ± 0.060 s/op
SnapBenchmark.snap demo avgt 10 1.713 ± 0.055 s/op
SnapBenchmark.snap actr avgt 10 2.312 ± 0.129 s/op
SnapBenchmark.snap jdbc avgt 10 1.983 ± 0.074 s/op
SnapBenchmark.snap actj avgt 10 2.682 ± 0.087 s/op
SnapBenchmark.snap jpae avgt 10 2.929 ± 0.073 s/op
SnapBenchmark.snap conf avgt 10 2.759 ± 0.108 s/op
MainBenchmark.main empt ss 10 0.824 ± 0.061 s/op
MainBenchmark.main demo ss 10 1.762 ± 0.054 s/op
MainBenchmark.main actr ss 10 2.352 ± 0.087 s/op
MainBenchmark.main jdbc ss 10 2.046 ± 0.088 s/op
MainBenchmark.main actj ss 10 2.668 ± 0.089 s/op
MainBenchmark.main jpae ss 10 3.072 ± 0.087 s/op
MainBenchmark.main conf ss 10 2.908 ± 0.118 s/op
OldBenchmark.old empt ss 10 0.716 ± 0.077 s/op
OldBenchmark.old demo ss 10 1.538 ± 0.060 s/op
OldBenchmark.old actr ss 10 1.983 ± 0.082 s/op
OldBenchmark.old jdbc ss 10 1.805 ± 0.111 s/op
OldBenchmark.old actj ss 10 2.326 ± 0.121 s/op
OldBenchmark.old jpae ss 10 2.760 ± 0.137 s/op
OldBenchmark.old conf ss 10 2.534 ± 0.096 s/op
Slower overall than my "fast" desktop. Less memory available (might explain the error bars). No difference in the conclusion that 2.0.0 snapshots are marginally better than 2.0.0.M7 (and not on a par with 1.5.9).
Maybe someone else could run that on a Mac and post back some results?
Here's my run on latest MBP with Spring Boot 2.0.0 SNAP b96f862c9eff8
$ java -jar target/benchmarks.jar main snap old -p sample=empt,demo,actr,jdbc,actj,jpae,conf
Benchmark (sample) Mode Cnt Score Error Units
SnapBenchmark.snap empt ss 10 0.962 ± 0.092 s/op
SnapBenchmark.snap demo ss 10 1.868 ± 0.084 s/op
SnapBenchmark.snap actr ss 10 2.412 ± 0.105 s/op
SnapBenchmark.snap jdbc ss 10 2.131 ± 0.110 s/op
SnapBenchmark.snap actj ss 10 2.649 ± 0.133 s/op
SnapBenchmark.snap jpae ss 10 3.178 ± 0.206 s/op
SnapBenchmark.snap conf ss 10 2.646 ± 0.079 s/op
MainBenchmark.main empt ss 10 0.878 ± 0.033 s/op
MainBenchmark.main demo ss 10 1.914 ± 0.101 s/op
MainBenchmark.main actr ss 10 2.489 ± 0.135 s/op
MainBenchmark.main jdbc ss 10 2.136 ± 0.201 s/op
MainBenchmark.main actj ss 10 2.825 ± 0.231 s/op
MainBenchmark.main jpae ss 10 3.317 ± 0.279 s/op
MainBenchmark.main conf ss 10 3.240 ± 0.757 s/op
OldBenchmark.old empt ss 10 0.786 ± 0.089 s/op
OldBenchmark.old demo ss 10 1.615 ± 0.098 s/op
OldBenchmark.old actr ss 10 2.136 ± 0.234 s/op
OldBenchmark.old jdbc ss 10 1.992 ± 0.112 s/op
OldBenchmark.old actj ss 10 2.517 ± 0.285 s/op
OldBenchmark.old jpae ss 10 2.764 ± 0.198 s/op
OldBenchmark.old conf ss 10 2.610 ± 0.254 s/op
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
Second run without any background apps:
Benchmark (sample) Mode Cnt Score Error Units
SnapBenchmark.snap empt avgt 10 0.854 ± 0.014 s/op
SnapBenchmark.snap demo avgt 10 1.716 ± 0.044 s/op
SnapBenchmark.snap actr avgt 10 2.300 ± 0.042 s/op
SnapBenchmark.snap jdbc avgt 10 1.960 ± 0.031 s/op
SnapBenchmark.snap actj avgt 10 2.651 ± 0.072 s/op
SnapBenchmark.snap jpae avgt 10 3.070 ± 0.061 s/op
SnapBenchmark.snap conf avgt 10 2.787 ± 0.088 s/op
MainBenchmark.main empt ss 10 0.893 ± 0.030 s/op
MainBenchmark.main demo ss 10 1.886 ± 0.069 s/op
MainBenchmark.main actr ss 10 2.498 ± 0.084 s/op
MainBenchmark.main jdbc ss 10 2.176 ± 0.095 s/op
MainBenchmark.main actj ss 10 2.795 ± 0.121 s/op
MainBenchmark.main jpae ss 10 3.068 ± 0.122 s/op
MainBenchmark.main conf ss 10 2.773 ± 0.101 s/op
MainBenchmark.main erka ss 10 3.678 ± 0.109 s/op
MainBenchmark.main busr ss 10 3.986 ± 0.068 s/op
MainBenchmark.main zuul ss 10 5.302 ± 0.047 s/op
MainBenchmark.main erkb ss 10 5.322 ± 0.077 s/op
MainBenchmark.main slth ss 10 5.410 ± 0.050 s/op
OldBenchmark.old empt ss 10 0.725 ± 0.008 s/op
OldBenchmark.old demo ss 10 1.514 ± 0.023 s/op
OldBenchmark.old actr ss 10 1.917 ± 0.017 s/op
OldBenchmark.old jdbc ss 10 1.797 ± 0.066 s/op
OldBenchmark.old actj ss 10 2.252 ± 0.074 s/op
OldBenchmark.old jpae ss 10 2.630 ± 0.140 s/op
OldBenchmark.old conf ss 10 2.436 ± 0.066 s/op
OldBenchmark.old erka ss 10 3.407 ± 0.084 s/op
OldBenchmark.old buna ss 10 2.797 ± 0.101 s/op
OldBenchmark.old busr ss 10 3.411 ± 0.061 s/op
OldBenchmark.old zuul ss 10 4.891 ± 0.093 s/op
OldBenchmark.old slth ss 10 4.794 ± 0.065 s/op
Pushing this one back to RC2 for now. I think performance is good enough (especially in the IDE) for RC1. I still can't work out why the benchmark shows a 100ms difference.
1.5.10 SNAPSHOT in IDE
2018-01-08 17:35:56.682 INFO 16454 --- [ main] s.a.ui.SampleActuatorUiApplication : Started SampleActuatorUiApplication in 2.647 seconds (JVM running for 2.927)
2018-01-08 17:36:24.634 INFO 16464 --- [ main] s.a.ui.SampleActuatorUiApplication : Started SampleActuatorUiApplication in 2.694 seconds (JVM running for 2.959)
2018-01-08 17:36:35.993 INFO 16469 --- [ main] s.a.ui.SampleActuatorUiApplication : Started SampleActuatorUiApplication in 2.658 seconds (JVM running for 2.926)
2.0.0 in IDE with Spring Security patch
2018-01-08 17:26:40.757 INFO 16102 --- [ main] s.a.ui.SampleActuatorUiApplication : Started SampleActuatorUiApplication in 2.737 seconds (JVM running for 3.066)
2018-01-08 17:27:18.883 INFO 16119 --- [ main] s.a.ui.SampleActuatorUiApplication : Started SampleActuatorUiApplication in 2.696 seconds (JVM running for 3.01)
2018-01-08 17:27:26.456 INFO 16123 --- [ main] s.a.ui.SampleActuatorUiApplication : Started SampleActuatorUiApplication in 2.732 seconds (JVM running for 3.093)
Snapshots looking much better now ("main" is 2.0.0.M7, "old" is 1.5.9):
Benchmark (sample) Mode Cnt Score Error Units
MainBenchmark.main empt ss 10 0.586 ± 0.025 s/op
MainBenchmark.main demo ss 10 1.276 ± 0.044 s/op
MainBenchmark.main actr ss 10 1.653 ± 0.056 s/op
MainBenchmark.main jdbc ss 10 1.437 ± 0.069 s/op
MainBenchmark.main actj ss 10 1.910 ± 0.086 s/op
MainBenchmark.main jpae ss 10 2.307 ± 0.184 s/op
MainBenchmark.main conf ss 10 1.972 ± 0.058 s/op
OldBenchmark.old empt ss 10 0.491 ± 0.016 s/op
OldBenchmark.old demo ss 10 1.064 ± 0.027 s/op
OldBenchmark.old actr ss 10 1.399 ± 0.090 s/op
OldBenchmark.old jdbc ss 10 1.295 ± 0.080 s/op
OldBenchmark.old actj ss 10 1.637 ± 0.078 s/op
OldBenchmark.old jpae ss 10 1.904 ± 0.053 s/op
OldBenchmark.old conf ss 10 1.786 ± 0.040 s/op
SnapBenchmark.snap empt ss 10 0.552 ± 0.012 s/op
SnapBenchmark.snap demo ss 10 1.166 ± 0.086 s/op
SnapBenchmark.snap actr ss 10 1.489 ± 0.044 s/op
SnapBenchmark.snap jdbc ss 10 1.334 ± 0.052 s/op
SnapBenchmark.snap actj ss 10 1.748 ± 0.079 s/op
SnapBenchmark.snap jpae ss 10 2.066 ± 0.059 s/op
SnapBenchmark.snap conf ss 10 1.794 ± 0.038 s/op
Marginally slower than 1.5.9, but not so noticeable now.
Most helpful comment
Finally had a chunk of time to dedicate to this, breaking out a number of new issues:
Reduce number of exceptions thrown on startup(no benefit in the end)ImplementClassUtils.getMethodIfAvailable()
to removeNoSuchMethodException
(SPR-16320)RefactorAnnotatedElementUtils
to removeNoSuchMethodException
(SPR-16321)Use alternative(doesn't seem to help)BeanInfo
mechanism (#9873 & SPR-16322)OnEnableEndpointCondition
default enabled result (#11407)Binder
(#11352)PropertyMapper
to use arrays rather than lists (#11411)