Julia: keyword arguments are very slow

Created on 2 Jan 2015  Â·  18Comments  Â·  Source: JuliaLang/julia

The script that demonstrate this can be found here. Pasted below

#!/usr/bin/julia -f

f() = nothing

forward_nothing(func) = func()
forward_pos_only(func, args...) = func(args...)
forward_kw(func, args...; kws...) = func(args...; kws...)

do_sth(a, b, c) = begin
    global g = a * b * c
end

macro time_forward(ex)
    quote
        println($(Expr(:quote, ex)))
        gc()
        @time for i in 1:10000000
            $(esc(ex))
        end
    end
end

@time_forward forward_nothing(f)
@time_forward forward_pos_only(f)
@time_forward forward_kw(f)
@time_forward do_sth("a", "b", "c")
@time_forward do_sth(1, 2, 3)
@time_forward do_sth(1.2, 2.3, 3.4)
@time_forward (do_sth(1, 2, 3); do_sth(1.2, 2.3, 3.4))

The output of the script on my laptop

forward_nothing(f)
elapsed time: 0.019369954 seconds (0 bytes allocated)
forward_pos_only(f)
elapsed time: 6.2e-8 seconds (0 bytes allocated)
forward_kw(f)
elapsed time: 0.95345393 seconds (960000000 bytes allocated, 33.34% gc time)
do_sth("a","b","c")
elapsed time: 1.014413397 seconds (800000000 bytes allocated, 28.80% gc time)
do_sth(1,2,3)
elapsed time: 0.04512597 seconds (0 bytes allocated)
do_sth(1.2,2.3,3.4)
elapsed time: 0.125168819 seconds (160000000 bytes allocated, 37.53% gc time)
begin 
    do_sth(1,2,3)
    do_sth(1.2,2.3,3.4)
end
elapsed time: 0.157331348 seconds (160000000 bytes allocated, 30.24% gc time)

There are several interesting things here,

  1. It seems that the compiler can inline f(args...) and figure out it is useless and eliminate the loop altogether (I don't think my cpu can excute a million loops in tens of nanosecond) but it cannot do it for f(). This particular case isn't very interesting but I'm wondering if there's any missing optimizations here.
  2. Forwarding keyword argument is very costy, even when there's nothing to be forwarded (probably a big part of it is memory allocation?). Although it would be nice to make keyword argument fast in general, the slowness of forwarding empty keyword argument means that it is very costy to support keyword arguments. One example (also how I discover this issue) is a version of invoke that supports keyword arguments. This version is ~3 times slower than the builtin invoke and a version without keyword argument support and simply forward to the builtin invoke is as fast as the builtin one.
  3. Just for comparison, I've also included some other random functions and it seems that keyword arguments are much more expensive than global variables and is as expensive as concatenating strings. What I don't quite understand either is that why is it necessary to allocate memory when assigning floating point numbers to a global variable but not for integers? (Edit: it seems that the integer version does no allocation because of the cache of boxed value.)

Anothing thing I've noticed is that the extra keyword argument got by the functions is in a different format (Array{(Symbol, Any), 1}) from the format of argument (Array{Any, 1}) to the .env.kwsorter. Is it necessary to do this conversion? This probably won't affect the performance for empty forwarding but feels inefficient in general.

Test done on current master as of a few hours ago.

julia> versioninfo()
Julia Version 0.4.0-dev+2369
Commit bab9ec5* (2015-01-02 01:47 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  BLAS: libblas
  LAPACK: liblapack
  LIBM: libm
  LLVM: libLLVM-3.3

The llvm version installed is 3.5 and I have no idea why it prints llvm-3.3....

keyword arguments performance

Most helpful comment

This would be one of the few things I had wished would make it for 1.0: indeed, many APIs have been designed with their slowness constraint, so they have been often avoided. I occasionally see a function in base which would benefit from a redesign to use keywords, so a pass could be done before freezing the APIs to update such functions (if keywords get faster).

All 18 comments

The performance penalty associated with keyword arguments is documented as a performance tip and is a duplicate of #8843.

I see. Now I remember that I've read those tip before but didn't really understand/remember them very well because I didn't really know how kwargs work at that time......

Please feel free to rename or open a new one for the other minor issues mentioned above if you think it is not the time yet to optimize keyword argument performance.

Although actually I think it might be possible to get a lot better performance if the argument to the kwsorter changes to ::Array{(Symbol, Any), 1}. I just hacked my version of invoke that support keyword argument by extending .env.kwsorter directly and the performance is much better (3x).

The llvm version installed is 3.5 and I have no idea why it prints llvm-3.3....

I fixed this bit in #9566

@tkelman Thanks. I noticed that it prints the right version yesterday but didn't figured out why..... =)

Slightly off topic: I just stumbled across this as well and, as recommended above, had a look at the performance tips. I found this sentence confusing: "Functions are specialized on the types of keyword arguments, so these declarations will not affect performance of code inside the function." Is it not contradictory? Is there a "not" missing, although, that wouldn't make sense either. If the function is specialized on the types of the keyword it should be faster, no? Anyways, maybe something to clean up.

AFAIK, what it means is that the function body is specialized so the function body will be compiled to the same level as if those were normal arguments.

julia> f(;a=0, b=1) = a + b
f (generic function with 1 method)

julia> @code_llvm f()

define i64 @julia_f_42853() {
top:
  ret i64 1
}

The overhead here is that when the function is called, it goes through another dispatch function which is slow but that's not part of the "function body" and you don't need to write a separate function with only normal arguments to let julia do type inference and increase performance.

I don't know if this is really what it mean and I do find that confusing too...

Thanks, that kind of makes sense. But I don't think that sentence helps anyone who doesn't know about the internals of kw-arg functions. And those who know probably don't need to read the performance tips.

(BTW, @code_llvm and friends often don't work properly with kw-args, see #9818, but should be ok for your example)

Yeah, I figured what I understand now after I figured out how kw-args works.

And yeah, I know @code_* and code_* doesn't work for keyword arguments. Here's a more "complete" example for that...

julia> f(;a=0, b=1) = if a > 0 && b < 1
       a * b + b + a
       else
       a - b * a - b
       end
f (generic function with 1 method)

julia> @code_llvm f()

define i64 @julia_f_42805() {
top:
  %0 = call i64 @"julia___f#0___42806"(i64 0, i64 1)
  ret i64 %0
}                                                                                  

julia> @code_typed f.env.kwsorter(Any[])
1-element Array{Any,1}:                                                            
 :($(Expr(:lambda, Any[symbol("#s2")], Any[Any[:a,:b,symbol("#s8"),symbol("#s9"),symbol("#s10"),symbol("#s1"),symbol("#s3"),symbol("#s4"),:_var1,:_var2,:_var3,:_var0,:_var4],Any[Any[symbol("#s2"),Array{Any,1},0],Any[:a,Any,2],Any[:b,Any,2],Any[symbol("#s8"),UnitRange{Int64},18],Any[symbol("#s9"),Int64,2],Any[symbol("#s10"),(Int64,Int64),18],Any[symbol("#s1"),Int64,18],Any[symbol("#s3"),Int64,18],Any[symbol("#s4"),Any,18],Any[:_var1,Int64,18],Any[:_var2,Int64,18],Any[:_var3,(ASCIIString,Any,ASCIIString),18],Any[:_var0,Int64,18],Any[:_var4,Int64,18]],Any[]], :(begin $(Expr(:line, 0, symbol(""), symbol("")))                                                    
        a = 0                                                                      
        b = 1                                                                      
        _var1 = (top(arraylen))(#s2::Array{Any,1})::Int64                          
        _var2 = (top(box))(Int64,(top(ashr_int))(_var1::Int64,(top(box))(Int32,(top(checked_trunc_sint))(Int32,1))))                                                  
        #s8 = $(Expr(:new, UnitRange{Int64}, 1, :(((top(getfield))(Intrinsics,:select_value))((top(sle_int))(1,_var2::Int64)::Bool,_var2::Int64,(top(box))(Int64,(top(sub_int))(1,1)))::Int64)))                                                         
        #s9 = (top(getfield))(#s8::UnitRange{Int64},:start)::Int64                 
        unless (top(box))(Bool,(top(not_int))(#s9::Int64 === (top(box))(Int64,(top(add_int))((top(getfield))(#s8::UnitRange{Int64},:stop)::Int64,1))::Bool)) goto 1   
        2:                                                                         
        _var0 = #s9::Int64
        _var4 = (top(box))(Int64,(top(add_int))(#s9::Int64,1))
        #s1 = _var0::Int64
        #s9 = _var4::Int64
        #s3 = (top(box))(Int64,(top(sub_int))((top(box))(Int64,(top(mul_int))(#s1::Int64,2)),1))
        #s4 = (top(arrayref))(#s2::Array{Any,1},#s3::Int64)::Any
        unless #s4 === :b::Bool goto 4
        b = (top(arrayref))(#s2::Array{Any,1},(top(box))(Int64,(top(add_int))(#s3::Int64,1)))::Any
        goto 6
        4: 
        unless #s4 === :a::Bool goto 5
        a = (top(arrayref))(#s2::Array{Any,1},(top(box))(Int64,(top(add_int))(#s3::Int64,1)))::Any
        goto 6
        5: 
        _var3 = $(Expr(:call1, :(top(tuple)), "unrecognized keyword argument \"", symbol("#s4"), "\""))
        throw(ErrorException((top(_apply))(call,string,_var3::(ASCIIString,Any,ASCIIString)))::ErrorException)
        6: 
        3: 
        unless (top(box))(Bool,(top(not_int))((top(box))(Bool,(top(not_int))(#s9::Int64 === (top(box))(Int64,(top(add_int))((top(getfield))(#s8::UnitRange{Int64},:stop)::Int64,1))::Bool)))) goto 2
        1: 
        0: 
        return __f#0__(a,b)::Any
    end::Any))))

julia> code_typed(eval(symbol("__f#0__")), (Int, Int))
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:a,:b], Any[Any[symbol("#s13")],Any[Any[:a,Int64,0],Any[:b,Int64,0],Any[symbol("#s13"),Bool,2]],Any[]], :(begin $(Expr(:line, 1, :none, :f))
        unless (top(slt_int))(0,a::Int64)::Bool goto 0
        #s13 = (top(slt_int))(b::Int64,1)::Bool
        goto 1
        0: 
        #s13 = false
        1: 
        unless #s13::Bool goto 2 # line 2:
        return (top(box))(Int64,(top(add_int))((top(box))(Int64,(top(add_int))((top(box))(Int64,(top(mul_int))(a::Int64,b::Int64)),b::Int64)),a::Int64))
        goto 3
        2:  # line 4:
        return (top(box))(Int64,(top(sub_int))((top(box))(Int64,(top(sub_int))(a::Int64,(top(box))(Int64,(top(mul_int))(b::Int64,a::Int64)))),b::Int64))
        3: 
    end::Int64))))

This would be a breaking feature, so it's not eligible for a point release

@vtjnash said that:

This would be a breaking feature, so it's not eligible for a point release

So, maybe this issue should change its actual milestone (0.6.x).

What here would be a breaking feature? I'm confused.

Depends how it gets implemented. If we can make them faster without changing semantics at all then it could be backported, but who knows how likely that is at this point.

Keyword args need a redesign anyway, so I don't think it would be worth the effort to optimize this for 0.6, then again after the redesign, so moving milestone.

Given this is a performance issue, perhaps it can be addressed post 1.0.

This would be one of the few things I had wished would make it for 1.0: indeed, many APIs have been designed with their slowness constraint, so they have been often avoided. I occasionally see a function in base which would benefit from a redesign to use keywords, so a pass could be done before freezing the APIs to update such functions (if keywords get faster).

We'll improve keyword argument performance in the future. This may open up some API changes, at which point we can add keyword methods of those functions and keep the positional argument versions around as legacies. We'll probably want an opt-in mechanism to make sure your code is up to the standards of your current Julia version.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

musm picture musm  Â·  3Comments

sbromberger picture sbromberger  Â·  3Comments

StefanKarpinski picture StefanKarpinski  Â·  3Comments

TotalVerb picture TotalVerb  Â·  3Comments

manor picture manor  Â·  3Comments