Julia: Performance of value type construction

Created on 22 Jun 2020  ·  12 Comments  ·  Source: JuliaLang/julia

I've already posted about this issue on Discourse, but I didn't get any responses, so I thought I would try here.

I noticed the following performance discrepancy between constructing a value type and constructing an instance of that value type from a runtime value:

julia> using BenchmarkTools

julia> x = "x"
"x"

julia> f(x::String) = Val{Symbol(x)}
f (generic function with 1 method)

julia> @btime f($x)
  138.326 ns (0 allocations: 0 bytes)
Val{:x}

julia> g(x::String) = Val{Symbol(x)}()
g (generic function with 1 method)

julia> @btime g($x)
  4.037 μs (0 allocations: 0 bytes)
Val{:x}()

This is with Julia v1.4.2:

julia> versioninfo()
Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) E-2176M  CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

but I see a similar discrepancy with Julia v1.5 (though it is a bit faster in v1.5, which is great!).

Is there a fundamental reason for this discrepancy? We are using a custom value type to do "runtime dispatch" (take a runtime value like a string, turn it into a value type, and then dispatch on that type). Originally we wanted to dispatch on instances of the type. However, the 4 microsecond constructor overhead creates a bottleneck in the function: once the type is created, the dispatch and the overloaded function are very fast, and we dispatch on multiple value types, so the overhead adds up. We can dispatch on the type instead of the instance, but we were wondering if the constructor can (in principle) be made fast.
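
A minimal sketch of the two dispatch styles described above, for illustration only. The handle function and the :add / :mul operations are hypothetical names and do not come from the original code; only Val and Symbol are taken from the report.

```julia
# Hypothetical handlers that dispatch on the *type* Val{...}; no instance is built.
handle(::Type{Val{:add}}, a, b) = a + b
handle(::Type{Val{:mul}}, a, b) = a * b

# Turn a runtime string into a value type and dispatch on the type itself.
dispatch_on_type(op::String, a, b) = handle(Val{Symbol(op)}, a, b)

# Hypothetical handlers that dispatch on an *instance* of Val{...}.
handle(::Val{:add}, a, b) = a + b
handle(::Val{:mul}, a, b) = a * b

# Same idea, but Val{...}() is constructed first; this is the call whose
# constructor overhead is reported in this issue.
dispatch_on_instance(op::String, a, b) = handle(Val{Symbol(op)}(), a, b)

dispatch_on_type("add", 2, 3)
dispatch_on_instance("mul", 2, 3)
```

Both variants select the same handler; they differ only in whether an instance of the value type is constructed before dispatch.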

Sorry if this topic came up before; I tried searching for it but didn't find anything addressing this particular question.

performance

All 12 comments

Dup of https://github.com/JuliaLang/julia/issues/21730 (and https://github.com/JuliaLang/julia/issues/29887).

Workaround is to use f(x::String) = Val{Symbol(x)}.instance (I think that should work at least).

That's really helpful, thanks for the quick response. We will try that out. Feel free to close in favor of the previous issues.

We don't really want people depending on implementation details of that sort though. What's interesting here is that something is blocking inlining:

julia> @code_typed g("")
CodeInfo(
    @ REPL[11]:1 within `g'
   ┌ @ boot.jl:456 within `Symbol'
1 ─│ %1 = $(Expr(:foreigncall, :(:jl_string_ptr), Ptr{UInt8}, svec(Any), 0, :(:ccall), Core.Argument(2)))::Ptr{UInt8}
│  │ %2 = Core.sizeof(x)::Int64
│  │ %3 = $(Expr(:foreigncall, :(:jl_symbol_n), Ref{Symbol}, svec(Ptr{UInt8}, Int64), 0, :(:ccall), :(%1), :(%2), :(%2), :(%1)))::Symbol
│  └
│   %4 = Core.apply_type(Main.Val, %3)::Type{Val{_A}} where _A
│   %5 = (%4)()::Val{_A} where _A
└──      return %5
) => Val{_A} where _A

I think it's the unbound TypeVar in the representation of the method signature after intersection. This hack does much better (though we then have to write the error checking into this method ourselves):

julia> @eval struct Val2{x}; (f::Type{<:Val2})() = (Base.isconcretetype(f) ? $(Expr(:new, :f)) : throw(UndefVarError(:x))); end

julia> Val2{T}() where T
ERROR: UndefVarError: x not defined
Stacktrace:
 [1] Val2{T}() at ./REPL[2]:1
 [2] top-level scope at REPL[3]:1

julia> @noinline g(x::String) = Val2{Symbol(x)}()
g (generic function with 1 method)

julia> @code_typed g("")
CodeInfo(
    @ REPL[4]:1 within `g'
   ┌ @ boot.jl:456 within `Symbol'
1 ─│ %1 = $(Expr(:foreigncall, :(:jl_string_ptr), Ptr{UInt8}, svec(Any), 0, :(:ccall), Core.Argument(2)))::Ptr{UInt8}
│  │ %2 = Core.sizeof(x)::Int64
│  │ %3 = $(Expr(:foreigncall, :(:jl_symbol_n), Ref{Symbol}, svec(Ptr{UInt8}, Int64), 0, :(:ccall), :(%1), :(%2), :(%2), :(%1)))::Symbol
│  └
│   %4 = Core.apply_type(Main.Val2, %3)::Type{Val2{_A}} where _A
│  ┌ @ REPL[2]:1 within `Val2'
│  │┌ @ reflection.jl:530 within `isconcretetype'
│  ││┌ @ Base.jl:28 within `getproperty'
│  │││ %5 = Base.getfield(%4, :isconcretetype)::Any
│  │└└
└──│      goto #3 if not %5
2 ─│ %7 = %new(%4)::Val2{_A} where _A
└──│      goto #4
   │┌ @ boot.jl:262 within `UndefVarError'
3 ─││ %9 = %new(Core.UndefVarError, :x)::UndefVarError
│  │└
│  │      Main.throw(%9)::Union{}
└──│      unreachable
   └
4 ─      return %7
) => Val2{_A} where _A

julia> @btime g("1") # with Val2
  175.574 ns (0 allocations: 0 bytes)
Val2{Symbol("1")}()

julia> @btime g("1") # with .instance
  172.044 ns (0 allocations: 0 bytes)
Val{Symbol("1")}()

julia> @btime g("1") # with Val
  12.810 μs (0 allocations: 0 bytes)
Val{Symbol("1")}()

Great, we are happy to use that approach and wait for a native Julia fix.

This is interesting --- I think we could change constructor lowering to that, since new will already error if given a non-concrete type.

Ah, that makes it much easier. I didn't realize it did that, so I was trying to manually throw the same error. I suppose I should have tested it, haha.

More drastically, I think it'd be interesting for us to try to bind any function name as a local variable, so that we could detect any code with this common pattern during lowering:

Array{T,1}(::UndefInitializer, m::Int) where {T} =
    ccall(:jl_alloc_array_1d, Array{T,1}, (Any, Int), Array{T,1}, m)

And statically rewrite it to use the argument value, instead of preserving that static param and the apply_type calls (though that might complicate handling the isdefined check on usage and recursive closures, so ymmv).

But for starters, handling it as a pattern for new seems like it could be quite good already.
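
As a hand-written analogue of the rewrite described above (not what lowering does, and not a proposal for its implementation), the two spellings can be compared on a hypothetical type. Box, make_static, and make_bound are invented names used only for this sketch.

```julia
# Hypothetical parametric type, used only for illustration.
struct Box{T}
    n::Int
end

# Static-parameter spelling: the method captures T and rebuilds Box{T} from it,
# which is where the apply_type call comes from.
make_static(::Type{Box{T}}, n::Int) where {T} = Box{T}(n)

# Argument-value spelling: the type object passed by the caller is bound to B
# and reused directly, so no static parameter is needed.
make_bound(B::Type{<:Box}, n::Int) = B(n)

# For a concrete type both spellings construct the same value.
make_static(Box{Float64}, 3) === make_bound(Box{Float64}, 3)
```

The second spelling mirrors the (f::Type{<:Val2})() signature used in the Val2 hack earlier in the thread.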

I'm not able to follow all of the details, but is the conclusion that the best approach for defining a faster custom value type right now is:

@eval struct Val2{x}
  (f::Type{<:Val2})() = $(Expr(:new, :f))
end

while we wait for a native fix?

Yes that should do it.
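
Putting that definition to use might look like the following sketch. The struct is the one quoted in the question above; the describe methods and the shape names are hypothetical, and the behaviour has only been reported in this thread for the Julia versions discussed here.

```julia
# Definition as quoted in the thread: the constructor binds the type argument
# as f and passes it straight to new, without a static parameter.
@eval struct Val2{x}
    (f::Type{<:Val2})() = $(Expr(:new, :f))
end

# Hypothetical methods dispatching on instances of the custom value type.
describe(::Val2{:circle}) = "round"
describe(::Val2{:square}) = "four corners"

# Build the instance from a runtime string, then dispatch on it.
describe_shape(name::String) = describe(Val2{Symbol(name)}())

describe_shape("circle")
```

Since new errors when given a non-concrete type, as noted earlier in the thread, this shorter definition does not need the manual isconcretetype check from the first version of the hack.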

On master this takes 105ns and 235ns, so it seems to be largely fixed.

Probably https://github.com/JuliaLang/julia/pull/35384 (and thus by extension https://github.com/JuliaLang/julia/pull/35983), though that lowering change would still simplify the IR by removing the extra dispatch step, and make this 40% faster.

I'll look into the lowering change. I think the unknown static parameter will still prevent inlining (the signature can be written without static parameters, but the user might not write it that way of course), but we could fix that (at least in the case where the static parameters are not used in the body).
