I just noticed in my class yesterday that @code_warntype
has become nearly unreadable in 1.0. Consider the function
mysum(a) = let s=0; for x in a; s+=x; end; s; end
which is type-unstable for summing a floating-point array. In Julia 0.6, @code_warntype mysum([1.0])
outputs:
Variables:
  #self# <optimized out>
  a::Array{Float64,1}
  x::Float64
  #temp#@_4::Int64
  s::Union{Float64, Int64}
  #temp#@_6::Core.MethodInstance
  #temp#@_7::Float64

Body:
  begin
      s::Union{Float64, Int64} = 0 # line 1:
      #temp#@_4::Int64 = 1
      4:
      unless (Base.not_int)((#temp#@_4::Int64 === (Base.add_int)((Base.arraylen)(a::Array{Float64,1})::Int64, 1)::Int64)::Bool)::Bool goto 29
      SSAValue(2) = (Base.arrayref)(a::Array{Float64,1}, #temp#@_4::Int64)::Float64
      SSAValue(3) = (Base.add_int)(#temp#@_4::Int64, 1)::Int64
      x::Float64 = SSAValue(2)
      #temp#@_4::Int64 = SSAValue(3) # line 1:
      unless (s::Union{Float64, Int64} isa Int64)::Bool goto 14
      #temp#@_6::Core.MethodInstance = MethodInstance for +(::Int64, ::Float64)
      goto 23
      14:
      unless (s::Union{Float64, Int64} isa Float64)::Bool goto 18
      #temp#@_6::Core.MethodInstance = MethodInstance for +(::Float64, ::Float64)
      goto 23
      18:
      goto 20
      20:
      #temp#@_7::Float64 = (s::Union{Float64, Int64} + x::Float64)::Float64
      goto 25
      23:
      #temp#@_7::Float64 = $(Expr(:invoke, :(#temp#@_6), :(Main.+), :(s), :(x)))
      25:
      s::Union{Float64, Int64} = #temp#@_7::Float64
      27:
      goto 4
      29: # line 1:
      return s::Union{Float64, Int64}
  end::Union{Float64, Int64}
which clearly indicates at the top that the s
variable is type-unstable.
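(As an aside for readers following along — this is my illustration, not part of the original report: the conventional fix is to initialize the accumulator from the array's element type, so s keeps a single concrete type.)

```julia
# Hedged sketch: `mysum_stable` is a hypothetical name. Starting the
# accumulator at zero(eltype(a)) instead of the Int literal 0 means `s`
# is Float64 on every iteration, so inference sees only one type.
mysum_stable(a) = let s = zero(eltype(a))
    for x in a
        s += x
    end
    s
end

mysum_stable([1.0, 2.0, 3.0])
```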
In contrast, in Julia 1.0 we get:
Body::Union{Float64, Int64}
1 1 ── %1  = (Base.arraylen)(a)::Int64                                │╻╷╷ iterate
  │    %2  = (Base.sle_int)(0, %1)::Bool                              ││╻╷ iterate
  │    %3  = (Base.bitcast)(UInt64, %1)::UInt64                       │││╻ <
  │    %4  = (Base.ult_int)(0x0000000000000000, %3)::Bool             │││╻ <
  │    %5  = (Base.and_int)(%2, %4)::Bool                             ││││╻ &
  └───       goto #3 if not %5                                        │││
  2 ── %7  = (Base.arrayref)(false, a, 1)::Float64                    │││╻ getindex
  └───       goto #4                                                  │││
  3 ──       goto #4                                                  │││
  4 ┄─ %10 = φ (#2 => false, #3 => true)::Bool                        ││
  │    %11 = φ (#2 => %7)::Float64                                    ││
  │    %12 = φ (#2 => 2)::Int64                                       ││
  └───       goto #5                                                  ││
  5 ── %14 = (Base.not_int)(%10)::Bool                                │
  └───       goto #16 if not %14                                      │
  6 ┄─ %16 = φ (#5 => %11, #15 => %46)::Float64                       │
  │    %17 = φ (#5 => %12, #15 => %47)::Int64                         │
  │    %18 = φ (#5 => 0, #15 => %32)::Union{Float64, Int64}
  │    %19 = (isa)(%18, Float64)::Bool                                │
  └───       goto #8 if not %19                                       │
  7 ── %21 = π (%18, Float64)                                         │
  │    %22 = (Base.add_float)(%21, %16)::Float64                      │╻ +
  └───       goto #11                                                 │
  8 ── %24 = (isa)(%18, Int64)::Bool                                  │
  └───       goto #10 if not %24                                      │
  9 ── %26 = π (%18, Int64)                                           │
  │    %27 = (Base.sitofp)(Float64, %26)::Float64                     ││╻╷╷╷ promote
  │    %28 = (Base.add_float)(%27, %16)::Float64                      ││╻ +
  └───       goto #11                                                 │
  10 ─       (Core.throw)(ErrorException("fatal error in type inference (type bound)"))
  └───       $(Expr(:unreachable))                                    │
  11 ┄ %32 = φ (#7 => %22, #9 => %28)::Float64                        │
  │    %33 = (Base.bitcast)(UInt64, %17)::UInt64                      ││╻ rem
  │    %34 = π (0x0000000000000001, UInt64)                           ││╻ -
  │    %35 = (Base.sub_int)(%33, %34)::UInt64                         │││╻ -
  │    %36 = (Base.arraylen)(a)::Int64                                ││╻ length
  │    %37 = (Base.sle_int)(0, %36)::Bool                             │││╻╷ >=
  │    %38 = (Base.bitcast)(UInt64, %36)::UInt64                      ││││╻ reinterpret
  │    %39 = (Base.ult_int)(%35, %38)::Bool                           │││╻ <
  │    %40 = (Base.and_int)(%37, %39)::Bool                           │││╻ &
  └───       goto #13 if not %40                                      ││
  12 ─ %42 = (Base.arrayref)(false, a, %17)::Float64                  ││╻ getindex
  │    %43 = (Base.add_int)(%17, 1)::Int64                            ││╻ +
  └───       goto #14                                                 ││
  13 ─       goto #14                                                 ││
  14 ┄ %46 = φ (#12 => %42)::Float64                                  │
  │    %47 = φ (#12 => %43)::Int64                                    │
  │    %48 = φ (#12 => false, #13 => true)::Bool                      │
  │    %49 = (Base.not_int)(%48)::Bool                                │
  └───       goto #16 if not %49                                      │
  15 ─       goto #6                                                  │
  16 ┄ %52 = φ (#14 => %32, #5 => 0)::Union{Float64, Int64}
  └───       return %52
in which the type instability is attached to a temporary φ variable. Basically, the output contains no useful information for a mere mortal programmer except for the first line, Body::Union{Float64, Int64}, which indicates that there is a type instability but not where it comes from.
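(An aside of my own, not from the thread: even without reading any IR, the inferred return type alone confirms the instability.)

```julia
# Query the inferred return type(s) directly instead of printing IR.
mysum(a) = let s = 0; for x in a; s += x; end; s; end

rts = Base.return_types(mysum, Tuple{Vector{Float64}})
# A non-concrete result such as Union{Float64, Int64} signals the
# instability: Int64 comes from the empty-array case, Float64 from the loop.
```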
Note that the @code_warntype
documentation was updated in #27033, but I'm not sure when the output actually changed.
Ref https://discourse.julialang.org/t/readability-of-code-warntype/15073.
I believe the issue is the same as with debugging any optimized code. As soon as the compiler becomes powerful enough to completely break down local variables, the information about which original variable holds which value is in general lost. It should be possible to keep track of some part of it, but given the experience of debugging an optimized build, I don't have high hopes that all of the information can be preserved with full optimization on....
I'm of the recent opinion that we should use the optimize=false
flag for code_warntype. That's the closest form that we have to reflect to the user what they gave as source code, and thus also probably the most actionable (and least influenced by whatever optimizations we happen to have implemented at any given point in time).
Totally agree --- code_warntype is about types after all. It's not called code_warn_inlining_and_other_stuff :)
Can we also restore the summary at the top about the types inferred for each variable?
code_warntype is about types after all
But what about type (in)stability that is affected by the optimization? After all, people use this to fix performance issues, or at least type instabilities, when they run code with optimization on. It would be really confusing if they end up fixing type instabilities that don't matter or aren't fixable (coming from the iterator protocol, for example).
I second Steven in that the summary of "user variables" and their inferred types at the top was very helpful.
I will be running an intro to Julia workshop for engineers next week and keep going back and forth about using v0.7/1.0
just because of this. I want to tell them "look what 0.7/1.0 can do", but I can't show them :)
I will leave the discussion/decisions on what information can be salvaged and presented to the user to the experts; I am sure this conversation will get way over my head very fast ;)
Just to be clear, I fully agree that the new format is less readable, and presenting more information would be nice. The new format is comparable to debugging an optimized build or reading LLVM IR, neither of which is easy.
However, I strongly think it's much more important to present correct and relevant information. Showing the result of inference in a non-default setting doesn't seem very useful, especially when that actually has a major impact on the type info in very common cases. For example,
julia> function f(n::Int)
           s = 0
           for i in 1:n
               s += i
           end
           s
       end
f (generic function with 1 method)
julia> code_typed(f, Tuple{Int}; optimize=true)
1-element Array{Any,1}:
CodeInfo(
3 1 ── %1  = (Base.sle_int)(1, n)::Bool                               │╻╷╷╷╷ Colon
  │          (Base.sub_int)(n, 1)::Int64                              ││╻ Type
  │    %3  = (Base.ifelse)(%1, n, 0)::Int64                           │││┃ unitrange_last
  │    %4  = (Base.slt_int)(%3, 1)::Bool                              ││╻╷╷ isempty
  └───       goto #3 if not %4                                        ││
  2 ──       goto #4                                                  ││
  3 ──       goto #4                                                  ││
  4 ┄─ %8  = φ (#2 => true, #3 => false)::Bool                        │
  │    %9  = φ (#3 => 1)::Int64                                       │
  │    %10 = φ (#3 => 1)::Int64                                       │
  │    %11 = (Base.not_int)(%8)::Bool                                 │
  └───       goto #10 if not %11                                      │
  5 ┄─ %13 = φ (#4 => 0, #9 => %16)::Int64                            │
  │    %14 = φ (#4 => %9, #9 => %22)::Int64                           │
  │    %15 = φ (#4 => %10, #9 => %23)::Int64                          │
4 │    %16 = (Base.add_int)(%13, %14)::Int64                          │╻ +
  │    %17 = (%15 === %3)::Bool                                       ││╻ ==
  └───       goto #7 if not %17                                       ││
  6 ──       goto #8                                                  ││
  7 ── %20 = (Base.add_int)(%15, 1)::Int64                            ││╻ +
  └───       goto #8                                                  │╻ iterate
  8 ┄─ %22 = φ (#7 => %20)::Int64                                     │
  │    %23 = φ (#7 => %20)::Int64                                     │
  │    %24 = φ (#6 => true, #7 => false)::Bool                        │
  │    %25 = (Base.not_int)(%24)::Bool                                │
  └───       goto #10 if not %25                                      │
  9 ──       goto #5                                                  │
6 10 ┄ %28 = φ (#8 => %16, #4 => 0)::Int64                            │
  └───       return %28                                               │
) => Int64
julia> code_typed(f, Tuple{Int}; optimize=false)
1-element Array{Any,1}:
CodeInfo(
2 1 ─       (s = 0)::Const(0, false)                                  │
3 │   %2  = (1:n)::UnitRange{Int64}                                   │
  │         (#temp# = (Base.iterate)(%2))::Union{Nothing, Tuple{Int64,Int64}}
  │   %4  = (#temp# === nothing)::Bool                                │
  │   %5  = (Base.not_int)(%4)::Bool                                  │
  └──       goto #4 if not %5                                         │
  2 ─ %7  = #temp#::Tuple{Int64,Int64}::Tuple{Int64,Int64}            │
  │         (i = (Core.getfield)(%7, 1))::Int64                       │
  │   %9  = (Core.getfield)(%7, 2)::Int64                             │
4 │         (s = s + i)::Int64                                        │
  │         (#temp# = (Base.iterate)(%2, %9))::Union{Nothing, Tuple{Int64,Int64}}
  │   %12 = (#temp# === nothing)::Bool                                │
  │   %13 = (Base.not_int)(%12)::Bool                                 │
  └──       goto #4 if not %13                                        │
  3 ─       goto #2                                                   │
6 4 ─       return s                                                  │
) => Int64
Note the type instability in the second printout, which is irrelevant and not fixable.
But what about type (in)stability that is affected by the optimization?
I'm fully resigned to the fact that @code_warntype
has always been mostly incapable of connecting to performance information, such as a tool like @profile
would provide. The concept of "type instability", and the assumption that it always negatively affects performance, has always been erroneous. But "some models are useful".
Maybe we need something like code_warnperf
that shows the optimized form and just highlights dynamic calls in red? I think it would be useful to see both that, and code_warntype with just type information on un-optimized code.
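A rough sketch of what such a tool could do (my own illustration — `nonconcrete_ssa` is a hypothetical name, and it relies on the internal, unexported ssavaluetypes field of CodeInfo, so treat it as an assumption, not an API):

```julia
# Hypothetical `code_warnperf`-style helper: list SSA statements in the
# optimized IR whose inferred type is abstract -- the candidates for
# dynamic call sites that a red highlight would flag.
function nonconcrete_ssa(f, types)
    ci, rt = first(code_typed(f, types; optimize=true))
    hits = [i => t for (i, t) in enumerate(ci.ssavaluetypes)
            if t isa Type && t !== Union{} && !Base.isconcretetype(t)]
    return hits, rt
end

g(n) = (s = 0; for i in 1:n; s += i; end; s)   # type-stable example
hits, rt = nonconcrete_ssa(g, Tuple{Int})
```

For a type-stable function like g this should flag nothing; for the mysum example it would flag the Union-typed φ values.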
While we're at it, could we also show whether variables are boxed / generally cause allocations or live on the stack? This is often much more relevant than whether the variables are type-stable, since many unions are optimized by now.
Whether a type-instability is problematic is not entirely obvious from the types, e.g.
julia> t1 = Tuple{Union{Missing, Int64},Int64}; t2 = Union{Tuple{Missing,Int64}, Tuple{Int64,Int64}}; t1==t2
true
julia> Base.isbitsunion(t2), sizeof(t2), Base.isbitsunion(t1)
(true, 16, false)
julia> sizeof(t1)
ERROR: argument is an abstract type; size is indeterminate
Now, is Vector{t1}==Vector{t2}
? Will this be stored inline? If a variable has type t1
, will it cause allocations? What if it has type t2
?
(yes, Vector{t1}==Vector{t2}
and elements will not be stored inline; no clue how to find out about the layout of SSA-values that are inferred with one of these types; @code_warntype
should tell me)
The concept of "type instability", and the assumption that it always negatively affects performance, has always been erroneous. But "some models are useful".
Exactly. That connection is not 100% correct, but it is nevertheless the useful one. It's wrong to say bad performance == type instability, but there is a real and fundamental correlation, and the correlation is a useful one. I don't doubt there are real and relevant cases where introducing type instability leads to better performance, but that usually (always) comes from improvements in other areas. With everything else the same, it's impossible for a type-unstable version of the code to run faster than the type-stable version (I know this is not well defined, but I'm referring here to when the runtime values are all of the same types in the two versions yet inference worked in one and not the other for some reason). Moreover, I'll be shocked if either https://github.com/JuliaLang/julia/issues/29287#issue-362237858 or https://github.com/JuliaLang/julia/issues/29287#issuecomment-423240191 is not about performance.
such as a tool like
@profile
would provide
Static analysis tools exist for good reason. There are countless examples where the execution time of a function is important yet hard to profile (hard-to-construct conditions/inputs, low runtime but latency-critical, etc.). I'm by no means claiming that one should only use code_warntype,
or be happy as long as there's no type instability, but type stability is still a good place to start in many cases, as long as only the relevant ones are shown.
I think it would be useful to see both that, and code_warntype with just type information on un-optimized code.
My question is: what is this useful for? I mean, I've used it in specific cases to debug compiler-related issues, and maybe as a cool thing to demonstrate, but I've never found it useful for anything else. I'll claim that most people use code_warntype
as a first-order performance debugging tool, and for that the unoptimized version isn't useful at all.
While we're at it, could we also show whether variables are boxed / generally cause allocations or live on the stack?
Please use a different issue.
My question is that what is this useful for?
When a type instability causes a performance problem, that will basically always appear as a dynamic call site. That will help you distinguish things like Union{Nothing,T}
with no performance impact from problematic type instabilities.
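To illustrate the distinction (my own example, not from the thread):

```julia
# A small Union return like Union{Nothing, Int64} is union-split by the
# compiler and usually carries no meaningful cost:
findidx(v, x) = findfirst(==(x), v)

# A container with abstract element type, by contrast, forces a dynamic
# call to + on every iteration -- the kind of site worth highlighting:
function sum_any(v::Vector{Any})
    s = 0.0
    for x in v
        s += x
    end
    s
end
```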
One thing I want to point out though:
julia> x = rand(1000000)
julia> @time mysum(x) # 0.6
0.034002 seconds (3.00 M allocations: 45.777 MiB, 3.20% gc time)
julia> @time mysum(x) # 1.0
0.005035 seconds (5 allocations: 176 bytes)
What I mean to say is that the fact that the variable s
is type unstable is no longer particularly relevant to the performance characteristics of the function. I understand people find it useful as a mental model, but e.g. things like:
function foo(x)
    x = f(x)
end
are totally fine now from a performance perspective, so in some sense the fact that you have a type-unstable phi (i.e. a loop-carried instability) is much more relevant to performance these days than the fact that that phi used to be called s
at some point. I'm not quite sure what the answer here is, but I suspect it looks different than just "turn off optimizations". Perhaps a way to ask the optimizer where a particular call or phi came from.
When a type instability causes a performance problem, that will basically always appear as a dynamic call site
Yes, I understand that part. My question is that what is the unoptimized output good for.
Perhaps a way to ask the optimizer where a particular call or phi came from.
That sounds better than dumping all the info into the output, especially since there is necessarily a lot of guesswork involved...
My question is that what is the unoptimized output good for.
Tracing where type problems came from --- e.g. finding out whether an instability leading to a dynamic call came from a loop, a data structure access, or a function call. You can do that with the optimized IR as well, but it might be easier to see unoptimized.
You can do that with the optimized IR as well, but it might be easier to see unoptimized.
How is it easier if it's wrong? I mean, dynamic calls in the unoptimized IR basically means nothing.
This is what I wrote:
Maybe we need something like
code_warnperf
that shows the optimized form and just highlights dynamic calls in red?
This is what I wrote:
Maybe we need something like code_warnperf that shows the optimized form and just highlights dynamic calls in red?
Yes? I know? I have no problem with that, and I agree highlighting dynamic calls in the current code_warntype output seems like a good improvement. It doesn't fully solve the original issue of figuring out how to fix problems, which is why I didn't expand on it (it will certainly help though).
But what I'm asking about is the other change you supported for code_warntype:
and code_warntype with just type information on un-optimized code.
i.e. what is the use of this version that just prints the un-optimized code, which basically has nothing to do with any performance.
i.e. what is the use of this version that just prints the un-optimized code, which basically has nothing to do with any performance.
I think it has more than nothing to do with performance. It seems generally useful to see your code, but annotated with the type information we have. True, you can't use it directly for performance (e.g. with a simple rule like "abstract type = slow"), but you can learn to read the tea leaves and derive useful info from it.
but you can learn to read the tea leaves and derive useful info from it.
Like what? I asked because I really couldn't think of anything that's of general interest. What is so useful in that for general users that justifies changing the behavior of an existing function? I would certainly be OK with making optimize
an option to whatever macro/function, or at least giving this unoptimized version a different name.
FWIW, there's also nothing to warn
about in it, so I believe just code_typed
with optimize=false
should be enough. Moreover, it doesn't really solve the original issue. I believe what people actually want is a version that presents performance-related info in a more readable way. Unrelated info in a more readable way isn't a fix for that.
For reference, code_typed
with optimize=false
for the mysum
example yields:
julia> code_typed(mysum, Tuple{Vector{Float64}}, optimize=false)
1-element Array{Any,1}:
CodeInfo(
1 1 ─       (s = 0)::Const(0, false)                                  │
  │   %2  = a::Array{Float64,1}                                       │
  │         (#temp# = (Base.iterate)(%2))::Union{Nothing, Tuple{Float64,Int64}}
  │   %4  = (#temp# === nothing)::Bool                                │
  │   %5  = (Base.not_int)(%4)::Bool                                  │
  └──       goto #4 if not %5                                         │
  2 ─ %7  = #temp#::Tuple{Float64,Int64}::Tuple{Float64,Int64}        │
  │         (x = (Core.getfield)(%7, 1))::Float64                     │
  │   %9  = (Core.getfield)(%7, 2)::Int64                             │
  │         (s = s + x)::Float64                                      │
  │         (#temp# = (Base.iterate)(%2, %9))::Union{Nothing, Tuple{Float64,Int64}}
  │   %12 = (#temp# === nothing)::Bool                                │
  │   %13 = (Base.not_int)(%12)::Bool                                 │
  └──       goto #4 if not %13                                        │
  3 ─       goto #2                                                   │
  4 ─       return s                                                  │
) => Union{Float64, Int64}
This is definitely more readable, but it doesn't clearly indicate that s
is inferred as Union{Float64,Int64}
.
It would be nice to just have a list of user-defined variables (not compiler temporaries) and the types inferred for them, with non-concrete types highlighted.
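A minimal sketch of such a summary (my own illustration — `vartypes` is a hypothetical name, and it leans on the internal slotnames/slottypes fields of CodeInfo, which are not a stable API):

```julia
# List user-visible variables and their inferred types from the
# unoptimized typed code; names starting with '#' are compiler
# temporaries (#self#, #temp#, ...) and are skipped.
function vartypes(f, types)
    ci, _ = first(code_typed(f, types; optimize=false))
    [name => t for (name, t) in zip(ci.slotnames, ci.slottypes)
     if !startswith(String(name), "#")]
end

mysum(a) = let s = 0; for x in a; s += x; end; s; end
vartypes(mysum, Tuple{Vector{Float64}})
# expected to include s => Union{Float64, Int64}, the variable to highlight
```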
Not to derail the conversation, but my system produces something else. Am I missing something?
julia> mysum(a) = let s=0; for x in a; s+=x; end; s; end
mysum (generic function with 1 method)
julia> @code_warntype mysum(1.0)
Body::Float64
1 1 ─      goto #4 if not true                                        │
  2 ─ %2 = (Base.add_float)(0.0, a)::Float64                          │╻╷ +
  └──      goto #4 if not false                                       │
  3 ─      nothing                                                    │
  4 ┄ %5 = φ (#3 => %2, #2 => %2, #1 => 0)::Union{Float64, Int64}     │
  │   %6 = π (%5, Float64)                                            │
  └──      return %6
Whatever the reason that I am getting this output, the point still remains that it's hard to see what's going on. I actually don't know if my function is type stable or not. I have a Body::Float64
which suggests it is, but line 4 has φ (#3 => %2, #2 => %2, #1 => 0)::Union{Float64, Int64}
so I don't know. I assume there are some optimizations going on in my case.
mysum([1.0])
Since it seems this issue is going to be around for a while, is it possible that someone can explain the @code_warntype
output of the OP line by line (here or in the documentation) such that people like me can get a hint of what is happening?