Julia: Vararg vs "..." slurping has huge performance impacts which defy analysis

Created on 1 Aug 2019  ·  15Comments  ·  Source: JuliaLang/julia

Here are two almost-identical functions:

function getindex1!(dest::AbstractArray, src::AbstractArray, I::Union{Real, AbstractArray}...)
    @inbounds for (i, j) in zip(eachindex(dest), Iterators.product(I...))
        dest[i] = src[j...]
    end
    return dest
end
function getindex2!(dest::AbstractArray, src::AbstractArray, I::Vararg{Union{Real, AbstractArray}, N}) where N
    @inbounds for (i, j) in zip(eachindex(dest), Iterators.product(I...))
        dest[i] = src[j...]
    end
    return dest
end



md5-951807fec754c4068ede884619927cae



julia> a = zeros(300, 300); b = rand(500, 500);

julia> @benchmark getindex1!($a, $b, 201:500, 201:500)
BenchmarkTools.Trial:
  memory estimate:  20.59 MiB
  allocs estimate:  629500
  --------------
  minimum time:     17.831 ms (0.00% GC)
  median time:      20.793 ms (0.00% GC)
  mean time:        20.941 ms (5.62% GC)
  maximum time:     31.831 ms (8.14% GC)
  --------------
  samples:          239
  evals/sample:     1

julia> @benchmark getindex2!($a, $b, 201:500, 201:500)
BenchmarkTools.Trial:
  memory estimate:  352 bytes
  allocs estimate:  6
  --------------
  minimum time:     77.827 μs (0.00% GC)
  median time:      77.973 μs (0.00% GC)
  mean time:        87.117 μs (0.00% GC)
  maximum time:     514.022 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1



md5-c6d122f604b4d16888cfee64fbe90edd



julia> foo1(a, b) = getindex1!(a, b, 201:500, 201:500)
foo1 (generic function with 1 method)

julia> foo2(a, b) = getindex2!(a, b, 201:500, 201:500)
foo2 (generic function with 1 method)

julia> @code_typed foo1(a, b)
CodeInfo(
1 ─ %1 = invoke Main.getindex1!(_2::Array{Float64,2}, _3::Array{Float64,2}, $(QuoteNode(201:500))::UnitRange{Int64}, $(QuoteNode(201:500))::Vararg{UnitRange{Int64},N} where N)::Array{Float64,2}
└──      return %1
) => Array{Float64,2}

julia> @code_typed foo2(a, b)
CodeInfo(
1 ─ %1 = invoke Main.getindex2!(_2::Array{Float64,2}, _3::Array{Float64,2}, $(QuoteNode(201:500))::UnitRange{Int64}, $(QuoteNode(201:500))::UnitRange{Int64})::Array{Float64,2}
└──      return %1
) => Array{Float64,2}

The reason for this difference in performance/compilation is not clear. The function signatures describe identical methods. If you define both as methods for the same function, there is still only one method in the function. I think this is a bug?

Most helpful comment

I thought this was documented but I might have mixed it up with variables captured in closures. I think adding something small about this is a good idea. It tends to happen for Function, Type and Vararg.

Cool, I'll write up an addition to Performance Tips today and check to see if there are any good places to link to it from other places.

All 15 comments

This is not a dup, as this issue is primarily about the resulting performance issue and how expressing the same method in two different ways causes wildly different performance. I don't care what the @code_ tools do specifically, I was just showing that I did not have the tools to identify what was causing the performance difference.

This is due to the N static parameter. It forces us to specialize the method for every value of N; otherwise we'll try to reuse the same code for different calls.

Is this documented somewhere?

Julia is always described as specializing on the runtime types of the arguments. I don't think many people know that type parameters change the specialization behaviour of otherwise identical methods.

I thought this was documented but I might have mixed it up with variables captured in closures. I think adding something small about this is a good idea. It tends to happen for Function, Type and Vararg.

Count me as flabbergasted by this too. I would have expected both versions to be identical. Actually, I don't even understand the explanation about the specialization on N. The slow benchmark involves a single value N=2, so only one method should be generated and reused, right?

The method generated will work for any N.

I thought this was documented but I might have mixed it up with variables captured in closures. I think adding something small about this is a good idea. It tends to happen for Function, Type and Vararg.

Cool, I'll write up an addition to Performance Tips today and check to see if there are any good places to link to it from other places.

I've come up with some good examples for Type and Vararg but I'm having trouble getting one for Function. Anyone have any suggestions?

Sometimes foo(f, args...) behaves differently from foo(f::F, args...) where F. Generally, I think, when things start getting more complicated (and dependent on Julia version). Do a grep "::F.*where F" * on base/ for possible sources of inspiration.

Thanks! I had checked `"::F.*Function" but yours returns more useful results

Interestingly, the two most prominent uses in Base specialize either way (ntuple(::F, ::Integer) where F and map!(::F, ::AbstractArray, ::AbstractArray) where F. I wonder if this just isn't applicable to functions anymore.

If I remember correctly, the f::F where F specialization was only required if you didn't call f directly inside the function body but instead just passed it to another function — probably a non-inlined function. Or at least that's the way it worked a long time ago (year+).

@mbauman That seems to be the key observation, thank you! That also explains why I had to search a long time to find an example for Type.

Based on that (and checking specialization behaviour in the REPL), it seems like there are a number of functions in Base which could have their F<:Function parameter removed.

Was this page helpful?
0 / 5 - 0 ratings