Julia: Non-deterministic allocations in function call

Created on 25 Jun 2018  ยท  9Comments  ยท  Source: JuliaLang/julia

I've run into a very annoying problem whereby a certain package I'm writing would be fast on a fresh REPL, showing few allocations, and them on a different fresh REPL, it would be very very slow, with a huge number of allocations... without touching the code! This is an effort to isolate the problem into a MWE (not sure if it can be made even smaller). To reproduce the issue you will likely need to try several times. For me the behaviour is highly erratic.

So: the following module elsa, when developed in ./julia/dev, produces non-deterministic allocations when running test() in current 0.7-beta (I'm not sure when this began)

module elsa
    function test()  
        for i in 1:10_000
            foo!([0.0], 1, 1, 1, 1, rand(), 1, 1, (1, 1))   
        end   
    end 
    @noinline function foo!(vec, x1, x2, x3, x4, x5, x6, x7, (i, j))
        vec[i] = i 
        return nothing 
    end 
end

First run in a fresh REPL

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0-beta.7 (2018-06-25 01:11 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 0978251ac5 (0 days old master)
|__/                   |  x86_64-apple-darwin17.6.0

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
[ Info: Recompiling stale cache file /Users/pablo/.julia/compiled/v0.7/elsa/Jexn.ji for module elsa
  0.034960 seconds (122.36 k allocations: 5.917 MiB)
  0.007761 seconds (40.00 k allocations: 1.679 MiB, 69.58% gc time)
  0.001392 seconds (40.00 k allocations: 1.679 MiB)

Subsequent runs in a fresh REPL (no recompile)

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.025599 seconds (107.65 k allocations: 5.603 MiB)
  0.000618 seconds (20.00 k allocations: 1.068 MiB)
  0.000663 seconds (20.00 k allocations: 1.068 MiB)

Note the change in execution time and allocations.

Sometimes, on a subsequent run, we again get 40k allocations

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.029740 seconds (122.36 k allocations: 5.917 MiB)
  0.001366 seconds (40.00 k allocations: 1.679 MiB)
  0.001490 seconds (40.00 k allocations: 1.679 MiB)

If one modifies the code with e.g. some whitespace to force a recompile, I can trigger a change of allocations much more frequently.

Fresh REPL after adding whitespace to force recompile

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
[ Info: Recompiling stale cache file /Users/pablo/.julia/compiled/v0.7/elsa/Jexn.ji for module elsa
  0.025894 seconds (107.69 k allocations: 5.596 MiB)
  0.000819 seconds (20.00 k allocations: 1.068 MiB)
  0.005586 seconds (20.00 k allocations: 1.068 MiB, 86.66% gc time)

Second run in a fresh REPL (no recompile)

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.062763 seconds (122.37 k allocations: 5.923 MiB, 53.05% gc time)
  0.001380 seconds (40.00 k allocations: 1.679 MiB)
  0.001364 seconds (40.00 k allocations: 1.679 MiB)

Third run

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.021969 seconds (107.69 k allocations: 5.596 MiB)
  0.000398 seconds (20.00 k allocations: 1.068 MiB)
  0.000440 seconds (20.00 k allocations: 1.068 MiB)

You get the idea.

Incidentally, if one runs the module straight at the REPL, I never see the problem, and allocations are just 10k (half of the minimum above)

julia> module elsa
           function test()  
               for i in 1:10_000
                   foo!([0.0], 1, 1, 1, 1, rand(), 1, 1, (1, 1))   
               end   
           end 
           @noinline function foo!(vec, x1, x2, x3, x4, x5, x6, x7, (i, j))
               vec[i] = i 
               return nothing 
           end 
       end 
Main.elsa

julia> @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.023286 seconds (90.55 k allocations: 5.052 MiB)
  0.000304 seconds (10.00 k allocations: 937.656 KiB)
  0.000333 seconds (10.00 k allocations: 937.656 KiB)

X-ref: https://github.com/JuliaLang/julia/issues/25900 (I'm not sure it is related, though, as the using elsa itself is deterministic in this example)

julia> versioninfo()
Julia Version 0.7.0-beta.7
Commit 0978251ac5 (2018-06-25 01:11 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin17.6.0)
  CPU: Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
inference precompile regression

Most helpful comment

May be fixed (at least the inference failure problems) by 4562ae2

All 9 comments

Seems to be a type inference issue that has to do with precompilation.

10k allocations (no precompile):

  20 โ”€       invoke Elsa.foo!(%11::Array{Float64,1}, 1::Int64, 1::Int64, 1::Int64, 1::Int64, %42::Float64, 1::Int64, 1::Int64, (1, 1)::Tuple{Int64,Int64})

Every argument inferred

20k allocations (after first precompile)

  20 โ”€       invoke Elsa.foo!(%11::Any, 1::Any, 1::Int64, 1::Int64, 1::Int64, %42::Float64, 1::Int64, 1::Int64, (1, 1)::Tuple{Int64,Int64})

Some arguments not inferred

40k allocations (recompile):

  20 โ”€       invoke Elsa.foo!(%11::Array{Float64,1}, 1::Int64, 1::Int64, 1::Int64, 1::Any, %42::Float64, 1::Int64, 1::Any, (1, 1)::Any)

Some other arguments not inferred...

Very nice MWE

Even something even smaller like

test() = foo!([0.0], 1)
@noinline foo!(vec, i) = vec[i] = i

can show this:

julia> @code_warntype Elsa.test()
Body::Int64
6 1 โ”€ %1 = invoke Base.vect(0.0::Float64)::Array{Float64,1}                                                                                                                                                     โ”‚
  โ”‚   %2 = invoke Elsa.foo!(%1::Array{Float64,1}, 1::Any)::Int64                                                                                                                                                โ”‚
  โ””โ”€โ”€      return %2 

note, 1::Any

Nice! Thanks for looking into it @KristofferC. So it's got to do with inference, but do you know why it fails randomly? It does seem to be related to the tuple destructuring.

I don't think it has to do anything with the tuple, since I can get the ::Any without any tuples (see my last example).

Also, cannot reproduce on 0.6.

Uhm, I do see how your ultraminimalistic example fails to be inferred, but it doesn't seem to allocate randomly for me. It always behaves the same in that regard.

julia> using elsa; @time elsa.test(); @time elsa.test(); @time elsa.test();
  0.016478 seconds (62.47 k allocations: 3.216 MiB)
  0.000004 seconds (5 allocations: 256 bytes)
  0.000002 seconds (5 allocations: 256 bytes)

Do you see the inference failure with your MWE always, or randomly?

It's a bit random, sometimes the first argument fails to infer aswell

julia> @code_warntype Elsa.test()
Body::Int64
6 1 โ”€ %1 = invoke Base.vect(0.0::Float64)::Array{Float64,1}                                                                                                                                                     โ”‚
  โ”‚   %2 = invoke Elsa.foo!(%1::Any, 1::Any)::Int64                                                                                                                                                             โ”‚
  โ””โ”€โ”€      return %2 

cc @vtjnash

May be fixed (at least the inference failure problems) by 4562ae2

Amazing, absolutely genius @vtjnash. Completely solved AFAICT! Not only the MWE, but the real world case too, it seems.

It may be too complicated to explain briefly, but did you understand the root cause (the non-deterministic behaviour in particular)?

Yeah, I saw some uninitialized memory when doing some unrelated work, and traced it back to here.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

StefanKarpinski picture StefanKarpinski  ยท  3Comments

yurivish picture yurivish  ยท  3Comments

m-j-w picture m-j-w  ยท  3Comments

arshpreetsingh picture arshpreetsingh  ยท  3Comments

manor picture manor  ยท  3Comments