This seems to have started recently
Error During Test at /tmp/julia/share/julia/test/numbers.jl:2329
Got exception outside of a @test
return type Int128 does not match inferred return type Any
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] top-level scope at /tmp/julia/share/julia/test/numbers.jl:2372
[3] top-level scope at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
[4] top-level scope at /tmp/julia/share/julia/test/numbers.jl:2330
[5] include at ./boot.jl:317 [inlined]
[6] include_relative(::Module, ::String) at ./loading.jl:1038
[7] include at ./sysimg.jl:29 [inlined]
[8] include(::String) at /tmp/julia/share/julia/test/testdefs.jl:13
[9] top-level scope at /tmp/julia/share/julia/test/testdefs.jl:22
[10] top-level scope at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
[11] top-level scope at /tmp/julia/share/julia/test/testdefs.jl:21
[12] top-level scope at util.jl:289
[13] top-level scope at /tmp/julia/share/julia/test/testdefs.jl:19
[14] eval at ./boot.jl:319 [inlined]
[15] #runtests#3(::UInt128, ::Function, ::String, ::String, ::Bool) at /tmp/julia/share/julia/test/testdefs.jl:25
[16] #runtests at ./none:0 [inlined] (repeats 2 times)
[17] (::getfield(Distributed, Symbol("##112#114")){Distributed.CallMsg{:call_fetch}})() at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v1.1/Distributed/src/process_messages.jl:269
[18] run_work_thunk(::getfield(Distributed, Symbol("##112#114")){Distributed.CallMsg{:call_fetch}}, ::Bool) at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v1.1/Distributed/src/process_messages.jl:56
[19] macro expansion at /home/travis/build/JuliaLang/julia/usr/share/julia/stdlib/v1.1/Distributed/src/process_messages.jl:269 [inlined]
[20] (::getfield(Distributed, Symbol("##111#113"))
I just got another of these.
This is reproducible for me on an i686 build from a 64 bit ubuntu 18.04 host. The bug reduces to:
julia> Base.return_types(%, (Int8, UInt128))
1-element Array{Any,1}:
Any
Well, it makes absolutely no sense, but reverting e2f7b2ffd fixes this for me locally.
More precisely, reverting only the following line seems to fix the problem, but I don't understand why.
diff --git a/base/array.jl b/base/array.jl
index 9e8da6c384..74823710ab 100644
--- a/base/array.jl
+++ b/base/array.jl
@@ -272,7 +272,6 @@ function copyto!(dest::Array{T}, doffs::Integer, src::Array{T}, soffs::Integer,
throw(BoundsError())
end
unsafe_copyto!(dest, doffs, src, soffs, n)
- return dest
end
# Outlining this because otherwise a catastrophic inference slowdown
Edit: having woken my machine up from sleep state, this one line change no longer "fixes" the problem. Which makes sense if there's something nasty and intermittent going on here. Uninitialized memory read or some such thing, perhaps.
It looks like my machine has fallen out of whatever state it was in which allowed me to reproduce this.
What I did when reproducing was to build 099979931c with ARCH=i686
(on ubuntu 18.04), which gave the return_types
failure as stated. Then to revert various recent commits and measure the bug with Revise
until I found a change which "fixed" the problem.
This was yesterday, but the build of 099979931c no longer exhibits the problem today. I'll chalk that up to it being intermittent (I don't think it's just me loosing my mind :-) ).
Got another window of reproducibility; it looks like the cached inferred code for rem(::UInt128, ::UInt128)
has a return type of Any
, which may possibly be the underlying cause though I'm not sure.
Applying the following patch:
diff --git a/base/compiler/typeinfer.jl b/base/compiler/typeinfer.jl
index 01fb88b024..e1912b71da 100644
--- a/base/compiler/typeinfer.jl
+++ b/base/compiler/typeinfer.jl
@@ -594,6 +594,11 @@ function typeinf_type(method::Method, @nospecialize(atypes), sparams::SimpleVect
inf = code.inferred
if !isa(inf, CodeInfo) || (inf::CodeInfo).inferred
i == 2 && ccall(:jl_typeinf_end, Cvoid, ())
+ println("typeinf_type code = ", code)
+ println("typeinf_type ", method, " code.rettype = ", code.rettype)
+ println("typeinf_type ", code.inferred)
+ println("typeinf_type ", ccall(:jl_uncompress_ast, Any, (Any, Any), method,
+ code.inferred::Vector{UInt8}))
return code.rettype
end
end
I get the following output:
julia> using Revise; Revise.track(Base.Core.Compiler);
julia> Base.return_types(rem, (UInt128, UInt128))
typeinf_type code = rem(UInt128, UInt128)
typeinf_type Base.rem(...) code.rettype = Any
typeinf_type Array{UInt8, (105,)}[0x08, 0x03, 0x00, 0x00, 0x00, 0x23, 0x73, 0x65, 0x6c, 0x66, 0x23, 0x00, 0x78, 0x00, 0x79, 0x00, 0x16, 0x84, 0xb0, 0xea, 0x09, 0x03, 0x3c, 0x11, 0x02, 0x1d, 0x11, 0x00, 0x05, 0x02, 0x00, 0x09, 0x03, 0x3c, 0x11, 0x02, 0x1d, 0x11, 0x00, 0x05, 0x03, 0x00, 0x2d, 0x1d, 0x85, 0x03, 0x01, 0x03, 0x02, 0x2c, 0x1d, 0x1c, 0xf4, 0x03, 0x03, 0x09, 0x01, 0x3e, 0x03, 0x04, 0x39, 0x16, 0x84, 0xb0, 0xea, 0x3a, 0x3a, 0x3a, 0x3a, 0x3a, 0x16, 0x84, 0xac, 0x26, 0x00, 0x2e, 0x2e, 0x20, 0x85, 0x4d, 0x2b, 0x0a, 0x03, 0xab, 0x16, 0x01, 0xb0, 0xf6, 0x00, 0x00, 0x00, 0x00, 0x00, 0x16, 0x01, 0xae, 0xf6, 0x00, 0x00, 0x00, 0x01, 0x01, 0x01, 0x01, 0x01]
typeinf_type Core.CodeInfo(code=Array{Any, (5,)}[
Expr(:invoke, (::Type{Base.GMP.BigInt})(UInt128), Base.BigInt, Core.SlotNumber(id=2)),
Expr(:invoke, (::Type{Base.GMP.BigInt})(UInt128), Base.BigInt, Core.SlotNumber(id=3)),
Expr(:call, Base.rem, SSAValue(1), SSAValue(2)),
Expr(:call, Base.UInt128, SSAValue(3)),
Expr(:return, SSAValue(4))], codelocs=Array{Int32, (5,)}[1, 1, 1, 1, 1], method_for_inference_limit_heuristics=nothing, ssavaluetypes=Array{Any, (5,)}[
Any,
Any,
Any,
Any,
Any], linetable=Array{Core.LineInfoNode, (1,)}[Core.LineInfoNode(mod=Base, method=:rem, file=Symbol("int.jl"), line=778, inlined_at=0)], ssaflags=Array{UInt8, (5,)}[0x00, 0x00, 0x00, 0x00, 0x00], slotflags=Array{UInt8, (3,)}[0x00, 0x00, 0x00], slotnames=Array{Any, (3,)}[
Symbol("#self#"),
:x,
:y], inferred=true, inlineable=false, propagate_inbounds=false, pure=false)
1-element Array{Any,1}:
Any
Given these BigInt
in here, perhaps this comes down to a weird interaction with the patch in 6fa3e4553
Given these BigInt in here, perhaps this comes down to a weird interaction with the patch in 6fa3e45
This would also explain why it's not seen on 64 bit linux, as we don't use GMP for supporting UInt128
in that case.
Here's another oddity - @code_typed
and Base.return_types
disagree about rem
with UInt128
:
julia> @code_typed rem(UInt128(1), UInt128(1))
CodeInfo(
778 1 β %1 = invoke Base.BigInt(_2::UInt128)::BigInt β
β %2 = invoke Base.BigInt(_3::UInt128)::BigInt β
β %3 = Base.GMP.MPZ.tdiv_r::typeof(tdiv_r) ββ» rem
β %4 = invoke %3(%1::BigInt, %2::BigInt)::BigInt ββ
β %5 = invoke Base.UInt128(%4::BigInt)::UInt128 β
βββ return %5 β
) => UInt128
julia> Base.return_types(rem, (UInt128, UInt128))
1-element Array{Any,1}:
Any
More fun:
$ ./julia
[...]
julia> Base.return_types(rem, (UInt128, UInt128))
1-element Array{Any,1}:
Any
# Now replace `rem` with exactly the same definition, to cause recompilation:
julia> Base.eval(:(
function rem(x::UInt128, y::UInt128)
return UInt128(rem(BigInt(x), BigInt(y)))
end
))
rem (generic function with 136 methods)
julia> Base.return_types(rem, (UInt128, UInt128))
1-element Array{Any,1}:
UInt128
So I guess the incorrect return type for rem
is being read from the sysimg. Later on we get this incorrect version deserialized, but if we override it with the exact same function definition, the next inference run returns the correct result.
The following patch dumps inferred rettype
for rem
as it's cached in jl_set_method_inferred
.
diff --git a/src/gf.c b/src/gf.c
index ae4a1413b7..414a52e369 100644
--- a/src/gf.c
+++ b/src/gf.c
@@ -432,6 +432,13 @@ JL_DLLEXPORT jl_method_instance_t* jl_set_method_inferred(
li->functionObjectsDecls.functionObject = NULL;
li->functionObjectsDecls.specFunctionObject = NULL;
li->rettype = rettype;
+ if (li->def.method->name == jl_symbol("rem")) {
+ jl_printf(JL_STDOUT, "jl_set_method_inferred rem; rettype = ");
+ jl_static_show(JL_STDOUT, li->rettype);
+ jl_printf(JL_STDOUT, " ");
+ jl_static_show(JL_STDOUT, li->specTypes);
+ jl_printf(JL_STDOUT, "\n");
+ }
jl_gc_wb(li, rettype);
li->inferred = inferred;
jl_gc_wb(li, inferred);
It demonstrates that the 32 bit build sometimes infers the wrong type for rem(UInt128, UInt128)
. In particular, applying it and calling
make clean &> /dev/null && make &> build.log && grep 'jl_set_method_inferred.*rettype = .*UInt128, UInt128' build.log
sometimes produces
jl_set_method_inferred rem; rettype = Any Tuple{typeof(Base.rem), UInt128, UInt128}
jl_set_method_inferred rem; rettype = Any Tuple{typeof(Base.rem), UInt128, UInt128}
and other times produces
jl_set_method_inferred rem; rettype = Any Tuple{typeof(Base.rem), UInt128, UInt128}
jl_set_method_inferred rem; rettype = UInt128 Tuple{typeof(Base.rem), UInt128, UInt128}
I'm not sure whether this depends on being inside the sysimg build or not.
Ok, to reproduce this, you need an i686 build, apply the patch above and to run the sysimg build step several times until you see the uninferred version of rem
jl_set_method_inferred rem; rettype = Any Tuple{typeof(Base.rem), UInt128, UInt128}
twice in the log.
Another change which might have caused this to appear - #29795
Nice work tracking this down, @c42f! @Keno, @vtjnash, @JeffBezansonβany idea what's going on?
Thanks, it's a head scratcher that's for sure. I caught the following stack trace from inside jl_set_method_inferred
, just as it sets the bad Any
as the rettype. This is still slightly after the damage is done, but almost at the right location. At a high level it's inside jl_compile_hint
, so just before the sysimg build is completed.
jl_set_method_inferred rem; rettype = Any Tuple{typeof(Base.rem), UInt128, UInt128}
rec_backtrace at /home/tcfoster/src/julia-i686/src/stackwalk.c:94
jlbacktrace2 at /home/tcfoster/src/julia-i686/src/stackwalk.c:536
jl_set_method_inferred at /home/tcfoster/src/julia-i686/src/gf.c:441
cache_result at ./compiler/typeinfer.jl:133
typeinf at ./compiler/typeinfer.jl:61
typeinf_edge at ./compiler/typeinfer.jl:499 [inlined]
abstract_call_method at ./compiler/abstractinterpretation.jl:363
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_apply at ./compiler/abstractinterpretation.jl:518
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:566
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1120
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method_with_const_args at ./compiler/abstractinterpretation.jl:198
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:107
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method_with_const_args at ./compiler/abstractinterpretation.jl:198
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:107
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1120
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_edge at ./compiler/typeinfer.jl:499
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_method at ./compiler/abstractinterpretation.jl:363
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:85
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_call at ./compiler/abstractinterpretation.jl:775
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval_call at ./compiler/abstractinterpretation.jl:804
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
abstract_eval at ./compiler/abstractinterpretation.jl:889
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_local at ./compiler/abstractinterpretation.jl:1134
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1190
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf at ./compiler/typeinfer.jl:15
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_ext at ./compiler/typeinfer.jl:574
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
typeinf_ext at ./compiler/typeinfer.jl:611
jl_apply_generic at /home/tcfoster/src/julia-i686/src/gf.c:2233
jl_apply at /home/tcfoster/src/julia-i686/src/julia.h:1571 [inlined]
jl_type_infer at /home/tcfoster/src/julia-i686/src/gf.c:277
jl_compile_hint at /home/tcfoster/src/julia-i686/src/gf.c:1964
jl_compile_specializations at /home/tcfoster/src/julia-i686/src/precompile.c:378
jl_precompile at /home/tcfoster/src/julia-i686/src/precompile.c:388 [inlined]
jl_write_compiler_output at /home/tcfoster/src/julia-i686/src/precompile.c:34
jl_atexit_hook at /home/tcfoster/src/julia-i686/src/init.c:222
main at /home/tcfoster/src/julia-i686/ui/repl.c:222
__libc_start_main at /lib/i386-linux-gnu/libc.so.6 (unknown line)
_start at /home/tcfoster/src/julia-i686/usr/bin/julia (unknown line)
This is no longer what I'd describe as "intermittent"βit seems to happen quite consistently now.
I knew I remembered something like this. Finally I've found it: #25804 exposed a similar bug. The "solution" was a simple workaround: c002de7447183deef5ac8860fa79c8ad1311b9d2. Maybe we can just do the same for rem
?
I think I have a handle on this one: On 32bit systems, rem(::UInt128, ::Untt128)
and div(::UInt128, ::Untt128)
convert to BigInt
, have the actual computation done, and convert back to UInt128
. However, that leads to the following recursion:
BigInt(::UInt128)
β ndigits(::UInt128, base=2)
β ndigits0z(::UInt128, 2)
β ndigits0zpb(::UInt128, 2)
β div(::UInt128, 2)
β promotion β div(::UInt128, ::UInt128)
β BigInt(::UInt128)
It should be noted that ndigits0zpb
has a special case for b==2
, so the call-chain at run-time is not recursive, but apparently constant-propagation doesn't resolve this at compile-time.
To verify, I've added
function BigInt(x::UInt128)
x == 0 && return BigInt(Culong(0))
nd = sizeof(x)<<3 - leading_zeros(x) # instead of ndigits(x, base=2)
z = MPZ.realloc2(nd)
s = sign(x)
s == -1 && (x = -x)
x = unsigned(x)
size = 0
limbnbits = sizeof(Limb) << 3
while nd > 0
size += 1
unsafe_store!(z.d, x % Limb, size)
x >>>= limbnbits
nd -= limbnbits
end
z.size = s*size
z
end
which is a copy of BigInt(x::Integer)
except for replacing the call to ndigits
. At least locally, that makes the test failure go away.
Not sure how to properly fix this, though. Options include:
div(::UInt128, ::UInt128)
natively instead of calling out to its BigInt
version.@martinholters Nice! That sounds like part of the puzzle and means the problem started happening with the change at https://github.com/JuliaLang/julia/commit/6fa3e45531a68fae34f31214cbf7e008298995d5#diff-c39f3bbf0cb1c2381243b42647d9ff1dR287
Though it doesn't explain the inconsistency of sometimes getting Any
and other times UInt128
inferred for rem(::UInt128, UInt128)
.
The symptoms are mitigated by https://github.com/JuliaLang/julia/pull/30032 but the underlying inference issue remains.
Thinking about this again, my analysis above doesn't explain anything. If the arguments during recursion don't change, inference is happy with it. Only if the _same_ method gets called with _different_ argument types, problems may appear. I.e. inference may bail out (infer as Any
), but where that happens depends on where the recursion loop was entered, so results may vary. I still suspect this is the behavior we're seeing here, but I have no idea where the offending recursion might sneak in.
I ran valgrind against the offending sysimg compile step. In one run, the following log occurred
$ valgrind --tool=memcheck --leak-check=full /home/chris/dev/julia-i686/usr/bin/julia -g1 -O0 -C "pentium4" --output-ji /home/chris/dev/julia-i686/usr/lib/julia/sys.ji.tmp --startup-file=no --warn-overwrite=yes --sysimage /home/chris/dev/julia-i686/usr/lib/julia/corecompiler.ji sysimg.jl
[...]
Dates ββββββββββ 52.722144 seconds
DelimitedFiles β 1.952113 seconds
==1125== Conditional jump or move depends on uninitialised value(s)
==1125== at 0x4B51B62: ???
==1125==
Random βββββββββ 14.559922 seconds
UUIDs ββββββββββ 0.259599 seconds
Future βββββββββ 0.083137 seconds
LinearAlgebra ββ406.806756 seconds
[...]
==1125== Use --track-origins=yes to see where uninitialised values come from
Which is extremely suspicious on the face of it, but repeated reruns with --track-origins=yes
failed to reproduce or give any further clues, so perhaps it's a false positive.
@martinholters I think I've got a repro of this outside of bootstrap. Try the following:
# Force recompilation
julia> Base.eval(:(rem(x::UInt128, y::UInt128) = UInt128(rem(BigInt(x), BigInt(y)))))
rem (generic function with 136 methods)
# Compile directly gives the right result
julia> Base.return_types(rem, (UInt128, UInt128))
1-element Array{Any,1}:
UInt128
# Now force recompilation, entering inference via `parse`:
julia> Base.eval(:(rem(x::UInt128, y::UInt128) = UInt128(rem(BigInt(x), BigInt(y)))))
julia> parse(UInt128, "1")
julia> Base.return_types(rem, (UInt128, UInt128))
1-element Array{Any,1}:
Any
I think the reason this is inconsistent from one sysimg compile to the next is that the compilation inside jl_compile_specializations
is done in the order of the bindings in the modules. These bindings are stored in a ptrhash hash table, so the ordering of bindings isn't stable as it depends on the addresses which happen to be allocated to the bound objects.
In particular, in test runs I found that rem
can be compiled due to at least any of parse
, @uint128_str
ndigits
or Base._base
. The first two of these result in the wrong inferred type of Any
, while the second two result in UInt128
being inferred.
Any time there is inconsistency in the handling of cycles during inference we are likely to have this inconsistent behavior due to the inconsistent ordering coming from jl_foreach_reachable_mtable
.
If we sorted the list of specializations m
before traversing it at https://github.com/JuliaLang/julia/blob/9a7c79ce56874f58650810f22e94893554f67f88/src/precompile.c#L375 we might at least make the compilation reproducible.
That's a really great repro, and helped narrow this down quickly (spoiler: it wasn't where I expected). Usually it's enough to apply this diff to spot where inference diverges from my expectations:
diff --git a/base/compiler/abstractinterpretation.jl b/base/compiler/abstractinterpretation.jl
index e30b5496b0..65c1d6a2b0 100644
--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -300,6 +300,11 @@ function abstract_call_method(method::Method, @nospecialize(sig), sparams::Simpl
newsig = limit_type_size(sig, comparison, sv.linfo.specTypes, sv.params.TUPLE_COMPLEXITY_LIMIT_DEPTH, spec_len)
if newsig !== sig
+ println()
+ println(sig)
+ println(newsig)
+ print_callstack(sv)
+ println()
# continue inference, but note that we've limited parameter complexity
# on this call (to ensure convergence), so that we don't cache this result
if call_result_unused(sv)
But I should perhaps have attempt to record (or repeat?) this debug session, since this didn't turn up anything, I had to go hunting further. Next I took a look at what was going into the cache:
diff --git a/base/compiler/typeinfer.jl b/base/compiler/typeinfer.jl
index 01fb88b024..b26860d341 100644
--- a/base/compiler/typeinfer.jl
+++ b/base/compiler/typeinfer.jl
@@ -28,6 +27,15 @@ function typeinf(frame::InferenceState)
# empty!(frames)
min_valid = frame.min_valid
max_valid = frame.max_valid
+ if cached && frame.parent !== nothing
+ println()
+ for caller in results
+ println(caller.argtypes, " => ", caller.result)
+ # println(caller.src)
+ end
+ print_callstack(frame)
+ println()
+ end
if cached || frame.parent !== nothing
for caller in results
opt = caller.src
This was suspiciously pointing the finger at something permitting results to go into the cache that were clearly bogus, and had the structure of a self-recursive call on a constant (these are marked [uncached] [limited]
)βwhich is supposed to get rejected by inferenceβgetting added to the cache.
If we sorted the list of specializations m
Yes. While we try to make inference fairly predictable and stable(*), sorting the precompile lists there by something stable (such as Method min_world when iterating over bindings) might help with faster isolation of similar problems.
(*) In particular, we want it to have the property that any particular call to inference (e.g. code_typed or Base.return_types) should not return a _wider_ result than expected due to cache effects. We're just fine with it giving unpredictably _narrower_ answers based on stuff that happened to be done in the past, and just we're just fine with it giving entirely unpredictable answers when running recursively, as part of converging inference (up to the limit that it shouldn't widen the final type given as the answer, but that does mean that the set of possible optimizations and inlining afterwards is expected to be unpredictable in general).
Great! I was hoping someone with knowledge of inference would step in and fix this. Thanks to Martin as well for giving just the hint I needed to repro it.
Yes. While we try to make inference fairly predictable and stable(*), sorting the precompile lists there by something stable (such as Method min_world when iterating over bindings) might help with faster isolation of similar problems.
How about sorting m
using the same method as jl_resort_type_cache
? Here's a PR for that: https://github.com/JuliaLang/julia/pull/30095
Most helpful comment
That's a really great repro, and helped narrow this down quickly (spoiler: it wasn't where I expected). Usually it's enough to apply this diff to spot where inference diverges from my expectations:
But I should perhaps have attempt to record (or repeat?) this debug session, since this didn't turn up anything, I had to go hunting further. Next I took a look at what was going into the cache:
This was suspiciously pointing the finger at something permitting results to go into the cache that were clearly bogus, and had the structure of a self-recursive call on a constant (these are marked
[uncached] [limited]
)βwhich is supposed to get rejected by inferenceβgetting added to the cache.Yes. While we try to make inference fairly predictable and stable(*), sorting the precompile lists there by something stable (such as Method min_world when iterating over bindings) might help with faster isolation of similar problems.
(*) In particular, we want it to have the property that any particular call to inference (e.g. code_typed or Base.return_types) should not return a _wider_ result than expected due to cache effects. We're just fine with it giving unpredictably _narrower_ answers based on stuff that happened to be done in the past, and just we're just fine with it giving entirely unpredictable answers when running recursively, as part of converging inference (up to the limit that it shouldn't widen the final type given as the answer, but that does mean that the set of possible optimizations and inlining afterwards is expected to be unpredictable in general).