JDK 9 had some major performance improvements for SIMD / auto-vectorization that resulted in 2x to 4x improvements in many cases. Does GraalVM have the "same" improvements as JDK 9 did for SIMD / auto-vectorization support? (In fact, does it support SIMD / auto-vectorization even at the JDK 8 level?)
Put another way, does the fact that GraalVM will eventually support JDK 11 (according to this post: https://github.com/oracle/graal/issues/651) mean that it will automatically support SIMD / auto-vectorization, and do so at least as well as JDK 11?
There will be 2 versions of the VM – Community Edition (CE) and Enterprise Edition (EE).
GraalVM EE provides additional performance, security, and scalability relevant for running critical applications in production.
As I understand it, vectorization support will only be in the EE.
@apete OK, but to confirm: is vectorization support already in EE, or is it slated for some future version of Graal EE? (And, conversely: is there zero auto-vectorization in the Community Edition?)
We will improve performance for both the CE and EE versions, and we do accept suggestions or patches for the CE version. Generally, the CE version tries to compile faster, and we therefore avoid very expensive kinds of analysis. So in the trade-off between compilation speed and quality of machine code, the CE version values compilation speed higher than the EE version does. We currently do not auto-vectorize loops in the CE version, because the transformation is complex and the benefits are not always as clear. We are open to adding vectorization for important cases though.
To answer the original question: Our JDK11 version would not automatically pick up 1:1 the vectorization support added to C2 in JDK11.
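For readers less familiar with the term, here is a rough sketch of the kind of loop that auto-vectorization usually targets: unit stride, no dependency between iterations, and a simple arithmetic body. The class and method names are made up for illustration, and whether a particular JIT (C2, Graal CE or Graal EE) actually emits SIMD instructions for it depends on the compiler version, flags and CPU.

```java
// Illustrative only: VectorizableLoop and add are hypothetical names,
// not part of any library mentioned in this thread.
final class VectorizableLoop {

    // Element-wise add. Each iteration reads a[i] and b[i] and writes out[i],
    // so no iteration depends on the result of another one.
    // Assumes a and b are at least as long as out.
    static void add(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        float[] a = new float[1024];
        float[] b = new float[1024];
        float[] out = new float[1024];
        java.util.Arrays.fill(a, 1.5f);
        java.util.Arrays.fill(b, 2.0f);

        // Call the method many times so the JIT has a chance to compile it.
        for (int rep = 0; rep < 20_000; rep++) {
            add(a, b, out);
        }
        System.out.println(out[0]);
    }
}
```

A loop with a cross-iteration dependency (for example `out[i] = out[i - 1] + a[i]`) is generally much harder for a compiler to vectorize, which is part of why the transformation is described above as complex.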
@thomaswue Thanks for the clear answer. I am not a SIMD expert (hence my joy with using Java), but I suspect that there are many cases where the auto-vectorization that already exists in C2 as of JDK 9 (and up) is very useful, especially for programs that run many computationally heavy loop-based operations (which is common in the scientific community). So, at minimum, I think the Graal version should have comparable performance on the standard SIMD chips, e.g. Intel Skylake, to the free OpenJDK C2 in JDK 11. If at some later point it can be even faster, even better, but I think just being on par with C2 in JDK 11 would be good enough for now. (Right now, JDK 11 auto-vectorization can achieve an order-of-magnitude speedup in the right scenarios, which is huge.) Just my 2 cents.
I benchmarked HotSpot, Graal CE and Graal EE doing matrix multiplication using 3 commonly used pure Java linear algebra libraries.
The chart speaks for itself and the results are as expected. Graal CE is typically slower than HotSpot, and Graal EE faster. The difference between the CE and EE versions of Graal is significant.
https://ojalgo.blogspot.com/2019/02/oracles-jvms-hotspot-graal-ce-graal-ee.html
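For anyone who wants to try a comparison on their own machine, below is a minimal sketch of timing a plain matrix multiplication. It is not the benchmark from the linked post and does not use any of the libraries tested there; the class name, matrix size and repetition counts are arbitrary assumptions. A proper harness such as JMH would give more trustworthy numbers than this hand-rolled loop.

```java
// Illustrative only: MatMulTiming is a hypothetical class, and the sizes
// and repetition counts are arbitrary choices, not taken from the post.
final class MatMulTiming {

    // Plain triple loop in i-k-j order, so the innermost loop walks
    // rows of b and c contiguously.
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, m = b[0].length, inner = b.length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < inner; k++) {
                double aik = a[i][k];
                for (int j = 0; j < m; j++) {
                    c[i][j] += aik * b[k][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int n = 200;
        double[][] a = new double[n][n];
        double[][] b = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                a[i][j] = i + j;
                b[i][j] = i - j;
            }
        }

        // Warm-up runs so the JIT compiles the hot loop before timing.
        double sink = 0;
        for (int rep = 0; rep < 20; rep++) {
            sink += multiply(a, b)[0][0];
        }

        int timedRuns = 20;
        long start = System.nanoTime();
        for (int rep = 0; rep < timedRuns; rep++) {
            sink += multiply(a, b)[0][0];
        }
        long elapsed = System.nanoTime() - start;

        // Print sink as well so the computation cannot be discarded.
        System.out.println("avg ms per multiply: " + elapsed / 1e6 / timedRuns
                + " (checksum " + sink + ")");
    }
}
```

Running the same class file under HotSpot, Graal CE and Graal EE would give a rough first impression of the differences, though results will vary with hardware and JVM version.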