Julia: exp, @fastmath, SVML vectorization.

Created on 20 Apr 2017 · 22Comments · Source: JuliaLang/julia

In Julia 0.6, I noticed that exp is no longer a call to libm but has been implemented in Julia itself. I wonder if this decision has potential performance implications not far down the road. Through SVML, LLVM is able to provide vectorization for exp, if exp is invoked through a SVML intrinsic or a call into libm. It won't vectorize if the expanded LLVM from Julia's exp implementation is included. We can use fastmath to revert to a call to libm but this raises the question of the semantics of fastmath. It seems like the semantics of fastmath should be a loss of accuracy in exchange for performance. My understanding is that this is indeed the behavior for Julia fastmath adds in that Julia will use the LLVM fastmath flag. I also believe that currently Julia fastmath exp is not consistent in that it does not signal a lower accuracy version and so we would expect Julia's exp to have the same accuracy/performance as fastmath/libm exp?

I have been told that SVML provides three points within the accuracy/performance tradeoff space. We can debate but it seems like fastmath should map to one of the lower two accuracy (higher performance) levels. The question then becomes, how do you get vectorization at the highest accuracy level with SVML? It seems that implementing exp in Julia precludes this possibility unless more code is added to detect potential vectorization with SVML and in that case to revert back to a libm call. Why not just always libm then? In what circumstance is Julia exp superior?

codegen maths performance

Source

DrTodd13

👍2

Most helpful comment

In the long run, it would be neat to have something like ISPC (https://ispc.github.io/) in Julia itself.

simonbyrne on 19 Jul 2018

❤3 👍3

All 22 comments

Through SVML, LLVM is able to provide vectorization for exp, if exp is invoked through a SVML intrinsic or a call into libm.

Have you actually seen this happen? I don't think we have ever lowered it in any way that llvm can recognize.

yuyichao on 20 Apr 2017

Yes, we don't lower exp in a way that LLVM can recognize at the moment. However, it should be fairly simple to add a generic hook to fix that. As you noted, manually calling exp or the llvm intrinsic will work for testing purposes. The julia native implementation is generally faster than libm. In any case, there's no rush on this since we can't drop in SVML at the moment anyway.

Keno on 20 Apr 2017

I also believe that currently Julia fastmath exp is not consistent in that it does not signal a lower accuracy version

It seems intuitive to me that fastmath would allow calling a lower-accuracy version but not require it. In this case I believe the intent of the fastmath version was to skip error checks.

I don't think we have ever lowered it in any way that llvm can recognize.

This doesn't really matter --- we could implement exp in such a way that llvm could recognize it, but that would mean skipping the julia implementation, so the point still stands.

JeffBezanson on 20 Apr 2017

We can use fastmath to revert to a call to libm

This is most likely an oversight that should be fixed. In fact, the fast math version seems slower.

This doesn't really matter

My point being there should be no regression caused by this change in 0.6. Being able to vectorize it would obviously be even better.

that would mean skipping the julia implementation

Hopefully no since that means a failure to vectorize will create slower code....

yuyichao on 20 Apr 2017

There's no problem with just telling LLVM that our exp function is the same as what it considers exp to be. Just one extra hook in TargetLibraryInfo. Even better we could come up with a generic way of annotating This function is a vectorized version of this other function.

Keno on 20 Apr 2017

👍3

That'll be cool. How hard would it be to tell LLVM that a julia function can be vectorized (either because there's no complex control flow in it or because we defined a version that operate on NTuple{...,VecElement{...}} directly)?

yuyichao on 20 Apr 2017

The hooks are already there in TargetLibraryInfo as I said, but may require some hacking to have it do anything other than what is hardcoded right now. For functions without complex control flow, LLVM should be able to figure out by itself that the function can be vectorized, so we should just fix that in LLVM.

Keno on 20 Apr 2017

We are working on experimental patch to LLVM 4.0 which enables vectorization for all the SVML functions. Here is the list of enabled functions: sin cos pow exp log acos acosh asin asinh atan2 atan atanh cbrt cdfnorm cdfnorminv ceil cosd cosh erf erfc erfcinv erfinv exp10 exp2 expm1 floor fmod hypot invsqrt log10 log1p log2 logb nearbyint rint round sind sinh sqrt tan tanh trunc
Moreover, Intel will soon provide you a license to redistribute SVML the same way as your redistribute MKL in your binary Julia Distribution. It would be very cool if we can enable vectorization of these functions not only in the fastmath mode, SVML provides HA functions for high accuracy as well.

anton-malakhov on 20 Apr 2017

👍1

We can't actually legally distribute Julia with MKL unless Julia is built without any GPL libraries, which is not a standard build setup, so we won't be able to ship with SVML either. If we get rid of all the GPL libraries from Base Julia (which is a long term goal) then we'll be able to ship with MKL and SVML.

StefanKarpinski on 20 Apr 2017

👍1

Of course if Intel wanted to open source SVML under a GPL-compatible license that'd be great (and we could start using it immediately).

Keno on 20 Apr 2017

👍4

Ditto with MKL 😀

StefanKarpinski on 20 Apr 2017

While we are considering open-sourceing SVML (though it might be still limitted and takes time to release).. it's quite unlikely to happen for MKL

anton-malakhov on 20 Apr 2017

👍4

I think this is a duplicate of #15265.

simonbyrne on 21 Apr 2017

@StefanKarpinski @Keno, Viral assured us that "We expect that JuliaPro will start shipping with mkl by juliacon."
Thus our question is w.r.t. integration of SVML into MKL build of Julia Pro distro is still valid and urgent enough.

anton-malakhov on 21 Apr 2017

👍1

As I said SVML is not currently integrable into Julia for technical reasons. Intel NDA prevents me from giving details in this forum. Feel free to email me.

Keno on 21 Apr 2017

Is there an update to having SVML under Julia?

It seems to be holding back Julia (At part of the reason) in the following test:

https://www.modelsandrisk.org/appendix/speed/

Though Python + Numba is still faster when Julia is using @inbounds and Apple libm (See https://julialang.slack.com/archives/C67910KEH/p1531490464000597?thread_ts=1531475750.000264&cid=C67910KEH).

RoyiAvital on 13 Jul 2018

No.

StefanKarpinski on 13 Jul 2018

In the long run, it would be neat to have something like ISPC (https://ispc.github.io/) in Julia itself.

simonbyrne on 19 Jul 2018

❤3 👍3

@simonbyrne , Using SVML and ispc like approach are complementary of each other, aren't they?
Not that I'm an expert on that but I would assume integrating SVML is easier especially when Intel offers assistance.

RoyiAvital on 19 Jul 2018

They are complimentary, but properly integrating SVML requires julia at the frontend level to be aware of vector lanes, which we currently don't have, but would be a prerequisite for exposing a general spmd programming model.

Keno on 19 Jul 2018

@Keno, numba is not aware of vector lanes, still thanks to ability of LLVM to recognize libm calls and transform them into svml calls along allows it to enjoy nice speedups on transcendental functions. I know that Julia goes away from libm calls.. but can you consider having a mixed or opt-in approach to enable them back? E.g. we can start with some` @fastmath(SVML)-like macro which will enable good old libm functions emitting and switch LLVM into SVML mode?

anton-malakhov on 20 Jul 2018

👍1

Sure, that's why I said "properly integrating". A plethora of other hacks are and have always been possible. E.g. we used SVML for Celeste just fine.

Keno on 20 Jul 2018

Was this page helpful?

0 / 5 - 0 ratings