In eliminating invalidations from AbstractString
methods I added type-asserts like these. I assumed that these would be elided in cases where the type was known concretely, but that assumption appears to be wrong. Here's a demo:
julia> code_typed(sizeof, (SubString{String},))
1-element Vector{Any}:
CodeInfo(
1 โ %1 = Base.getfield(s, :ncodeunits)::Int64
โ invoke Core.TypeVar(Symbol("#s68")::Symbol, Union{UInt16, UInt32, UInt8}::Any)::TypeVar
โ invoke Core.TypeVar(Symbol("#s68")::Symbol, Union{UInt16, UInt32, UInt8}::Any)::Core.Compiler.PartialTypeVar(var"#s68"<:Union{UInt16, UInt32, UInt8}, true, true)
โ %4 = Base.mul_int(%1, 1)::Int64
โโโ return %4
) => Int64
You can see that statements 2 and 3 aren't used by any subsequent code, yet they are still there. This is further confirmed by
julia> code_llvm(sizeof, (SubString{String},))
; @ strings/basic.jl:177 within `sizeof'
define i64 @julia_sizeof_3256({ {}*, i64, i64 }* nocapture nonnull readonly align 8 dereferenceable(24)) {
top:
%1 = alloca [2 x {}*], align 8
%.sub = getelementptr inbounds [2 x {}*], [2 x {}*]* %1, i64 0, i64 0
; โ @ strings/substring.jl:64 within `ncodeunits'
; โโ @ Base.jl:33 within `getproperty'
%2 = getelementptr inbounds { {}*, i64, i64 }, { {}*, i64, i64 }* %0, i64 0, i32 2
; โโ
; โ @ strings/substring.jl:65 within `codeunit'
store {}* inttoptr (i64 140505618344944 to {}*), {}** %.sub, align 8
%3 = getelementptr inbounds [2 x {}*], [2 x {}*]* %1, i64 0, i64 1
store {}* inttoptr (i64 140505642590608 to {}*), {}** %3, align 8
%4 = call nonnull {}* @j1_TypeVar_3258({}* inttoptr (i64 140505697665616 to {}*), {}** nonnull %.sub, i32 2)
; โ
store {}* inttoptr (i64 140505618344944 to {}*), {}** %.sub, align 8
store {}* inttoptr (i64 140505642590800 to {}*), {}** %3, align 8
%5 = call nonnull {}* @j1_TypeVar_3259({}* inttoptr (i64 140505697665616 to {}*), {}** nonnull %.sub, i32 2)
; โ @ int.jl:88 within `*'
%6 = load i64, i64* %2, align 8
; โ
ret i64 %6
}
julia> code_native(sizeof, (SubString{String},))
.text
; โ @ basic.jl:177 within `sizeof'
pushq %r15
pushq %r14
pushq %r13
pushq %r12
pushq %rbx
subq $16, %rsp
movq %rdi, %rbx
movabsq $140505618344944, %r12 # imm = 0x7FCA03786BF0
; โโ @ substring.jl:65 within `codeunit'
movq %r12, (%rsp)
movabsq $140505639939952, %rax # imm = 0x7FCA04C1EF70
movq %rax, 8(%rsp)
movabsq $TypeVar, %r13
movabsq $jl_system_image_data, %r14
movq %rsp, %r15
movq %r14, %rdi
movq %r15, %rsi
movl $2, %edx
callq *%r13
; โโ
movq %r12, (%rsp)
movabsq $140505639940112, %rax # imm = 0x7FCA04C1F010
movq %rax, 8(%rsp)
movq %r14, %rdi
movq %r15, %rsi
movl $2, %edx
callq *%r13
; โโ @ int.jl:88 within `*'
movq 16(%rbx), %rax
; โโ
addq $16, %rsp
popq %rbx
popq %r12
popq %r13
popq %r14
popq %r15
retq
nopw %cs:(%rax,%rax)
nop
; โ
and especially by Profile lines like
1โ โ โ โ โ โ โ 19 @Base/strings/basic.jl:177; sizeof
10โ โ โ โ โ โ โ 10 @Base/boot.jl:366; TypeVar(n::Symbol, ub::Any)
1โ โ โ โ โ โ โ 8 @Base/strings/substring.jl:65; codeunit
7โ โ โ โ โ โ โ 7 @Base/boot.jl:366; TypeVar(n::Symbol, ub::Any)
Contrast with:
julia> fastsizeof(s::AbstractString) = ncodeunits(s) * sizeof(fastcodeunit(s))
fastsizeof (generic function with 1 method)
julia> fastcodeunit(s::SubString) = codeunit(s.string)
fastcodeunit (generic function with 1 method)
julia> code_typed(fastsizeof, (SubString{String},))
1-element Vector{Any}:
CodeInfo(
1 โ %1 = Base.getfield(s, :ncodeunits)::Int64
โ %2 = Base.mul_int(%1, 1)::Int64
โโโ return %2
) => Int64
julia> code_llvm(fastsizeof, (SubString{String},))
; @ REPL[30]:1 within `fastsizeof'
define i64 @julia_fastsizeof_3457({ {}*, i64, i64 }* nocapture nonnull readonly align 8 dereferenceable(24)) {
top:
; โ @ strings/substring.jl:64 within `ncodeunits'
; โโ @ Base.jl:33 within `getproperty'
%1 = getelementptr inbounds { {}*, i64, i64 }, { {}*, i64, i64 }* %0, i64 0, i32 2
; โโ
; โ @ int.jl:88 within `*'
%2 = load i64, i64* %1, align 8
; โ
ret i64 %2
}
julia> code_native(fastsizeof, (SubString{String},))
.text
; โ @ REPL[30]:1 within `fastsizeof'
; โโ @ int.jl:88 within `*'
movq 16(%rdi), %rax
; โโ
retq
nopw %cs:(%rax,%rax)
nop
; โ
So, it looks like I screwed up the performance of string processing. :cry: Apologies!
However, rather than backing those out it seems the best way to fix this would be in the optimizer? I've tentatively tagged this issue as such.
Possibly-related (though it may be different, i.e., hitting a limit in Union size or something):
julia> f() = (x = rand(); x < 0.25 ? RGB{N0f8}(1, 0, 0) :
x < 0.50 ? BGR{N0f8}(1, 0, 0) :
x < 0.75 ? HSL(0.0f0, 1.0f0, 0.5f0) :
HSL(0.3, 0.8, 0.2))::Colorant
f (generic function with 1 method)
julia> code_typed(f)
1-element Vector{Any}:
CodeInfo(
1 โโ %1 = $(Expr(:foreigncall, :(:jl_threadid), Int16, svec(), 0, :(:ccall)))::Int16
โ %2 = Base.sext_int(Int64, %1)::Int64
โ %3 = Base.add_int(%2, 1)::Int64
โ %4 = invoke Random.default_rng(%3::Int64)::Random.MersenneTwister
โ %5 = Base.getfield(%4, :idxF)::Int64
โ %6 = Random.MT_CACHE_F::Int64
โ %7 = (%5 === %6)::Bool
โโโโ goto #3 if not %7
2 โโ %9 = $(Expr(:gc_preserve_begin, :(%4)))
โ %10 = Base.getfield(%4, :state)::Random.DSFMT.DSFMT_state
โ %11 = Base.getfield(%4, :vals)::Vector{Float64}
โ %12 = $(Expr(:foreigncall, :(:jl_array_ptr), Ptr{Float64}, svec(Any), 0, :(:ccall), :(%11)))::Ptr{Float64}
โ %13 = Base.getfield(%4, :vals)::Vector{Float64}
โ %14 = Base.arraylen(%13)::Int64
โ invoke Random.dsfmt_fill_array_close1_open2!(%10::Random.DSFMT.DSFMT_state, %12::Ptr{Float64}, %14::Int64)::Any
โ $(Expr(:gc_preserve_end, :(%9)))
โโโโ Base.setfield!(%4, :idxF, 0)::Int64
3 โโ goto #4
4 โโ %19 = Base.getfield(%4, :vals)::Vector{Float64}
โ %20 = Base.getfield(%4, :idxF)::Int64
โ %21 = Base.add_int(%20, 1)::Int64
โ Base.setfield!(%4, :idxF, %21)::Int64
โ %23 = Base.arrayref(false, %19, %21)::Float64
โโโโ goto #5
5 โโ goto #6
6 โโ %26 = Base.sub_float(%23, 1.0)::Float64
โโโโ goto #7
7 โโ goto #8
8 โโ goto #9
9 โโ %30 = Base.lt_float(%26, 0.25)::Bool
โโโโ goto #11 if not %30
10 โ goto #18
11 โ %33 = Base.lt_float(%26, 0.5)::Bool
โโโโ goto #13 if not %33
12 โ goto #17
13 โ %36 = Base.lt_float(%26, 0.75)::Bool
โโโโ goto #15 if not %36
14 โ goto #16
15 โ nothing::Nothing
16 โ %40 = ฯ (#14 => $(QuoteNode(HSL{Float32}(0.0f0,1.0f0,0.5f0))), #15 => $(QuoteNode(HSL{Float64}(0.3,0.8,0.2))))::Union{HSL{Float32}, HSL{Float64}}
17 โ %41 = ฯ (#12 => $(QuoteNode(BGR{N0f8}(1.0,0.0,0.0))), #16 => %40)::Union{BGR{N0f8}, HSL}
18 โ %42 = ฯ (#10 => $(QuoteNode(RGB{N0f8}(1.0,0.0,0.0))), #17 => %41)::Any
โ Core.typeassert(%42, Main.Colorant)::Colorant
โ %44 = ฯ (%42, Colorant)
โโโโ return %44
) => Colorant
There's no actual need to execute that type-assert (though inference should keep it in preference to Any
), though this one doesn't surprise me as much as the first one.
CC @kimikage
It's not really a DCE issue. The compiler couldn't prove that those functions don't throw. I'm also surprised it didn't inline the Core.TypeVar
, which is potentially what is inhibiting the nothrow analysis.
Thanks for the clarification! Thoughts on whether this can be straightforwardly fixed or whether we need to back out those changes?
I don't think it should be that bad to fix. There's basically two ways to go about it. The quick way would be to figure out why the Core.TypeVar isn't inlining and tweaking whatever is preventing that such that our nothrow model for _typevar
can kick in. The other would be for inference to start annotating effect-free information on CodeInfos, like we do pure already. I don't think it's terribly hard, but I might not get to it for a little bit.
Thanks for the info. I don't think it's a rush but we should think about it before releasing 1.6. I'll put a milestone on this issue just so we don't forget. Like I said, if it proves to be the expedient solution I'm 100% willing to revert those changes; it's nice to have things perform reasonably in the face of poor inference information, but not at the cost of good performance when inference works well.
A better way to write types like this is Union{Type{UInt8},Type{UInt16},Type{UInt32}}
, which should work around this issue.
Original issue is fixed, but to me it seems useful to leave this open for future compiler improvements. I've taken off the 1.6 milestone, though.
Most helpful comment
A better way to write types like this is
Union{Type{UInt8},Type{UInt16},Type{UInt32}}
, which should work around this issue.