julia> function foo(xs)
           count = 0
           @inbounds for i in 1:length(xs)
               if isnan(xs[i])
                   count += 1
               end
           end
           return count
       end
foo (generic function with 1 method)
julia> vals = rand(10_000);
julia> @time foo(vals)
0.009762 seconds (17.66 k allocations: 960.671 KiB)
0
julia> @time foo(vals)
0.000019 seconds
0
julia> vals[:] .= NaN;
julia> @time foo(vals)
0.000016 seconds (1 allocation: 16 bytes)
10000
julia> vals[:] .= 0;
julia> @time foo(vals)
0.000018 seconds
0
julia> vals = rand(10);
julia> @time foo(vals)
0.000003 seconds
0
julia> vals[:] .= NaN;
julia> @time foo(vals)
0.000003 seconds
10
julia> versioninfo()
Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, skylake)
Environment:
JULIA_NUM_THREADS = 12
I came across this odd behaviour today when trying to work out where this 16 byte allocation was coming from. Whether or not the function allocates seems to be something to do with the length, which is perhaps reasonable (in the generated LLVM there is a scalar/vector switch based on the length of the vector). It also seems to be affected by the contents of the vector - the latter is more surprising to me.
The behaviour doesn't change when the @inbounds is removed, despite that preventing vector instruction generation.
I also see this behaviour on Julia Version 1.5.0-rc1.0.
It would be great to hear a sensible explanation!
Don't use @time to do this. You are measuring the allocation of the return value due to the caller.
The same behaviour can be observed with @btime. Also, shouldn't the allocation of the return value be independent of the input values?
Please ask further questions on discourse.julialang.org. This is not a bug.
> The same behaviour can be observed with @btime.
Most likely because you aren't using @btime correctly.
> Also, shouldn't the allocation of the return value be independent of the input values?
No.
A few things to keep in mind:
The first time you run some code, it has to be compiled, so @time measures compilation as well, which includes allocation, often quite a bit of it.
When a function is called like this (from global scope), its return value ends up on the heap, which means that it has to be allocated. In general, even allocation-free code called this way will show an allocation or two for boxing the return value. When the same function is used in a context where the return value can be kept in registers or on the stack, or stored into some other structure, no additional allocation needs to occur.
There is a cache of heap-allocated small integer values, so some values will already be allocated and returning them does not trigger allocation, whereas other values will not already be allocated and will need allocation. That's what's happening here: 0-511 are in the integer cache whereas 512 and larger must be allocated.
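One way to check the points above is to measure from inside a function, where the integer result never has to be boxed for the caller. The following is a minimal sketch, not something posted in the thread: `count_nans` mirrors `foo` from the report, and `bytes_allocated` is a hypothetical helper name. It uses `@allocated` from Base and makes a warm-up call so that compilation is not part of the measurement.

```julia
# Same NaN-counting logic as `foo` in the original report.
function count_nans(xs)
    count = 0
    @inbounds for i in eachindex(xs)
        if isnan(xs[i])
            count += 1
        end
    end
    return count
end

# Hypothetical helper (not from the thread): measure the bytes allocated by
# the call from inside a function, so the Int result is not handed back to
# global scope and does not need to be boxed.
function bytes_allocated(xs)
    count_nans(xs)                    # warm-up call, keeps compilation out of the measurement
    return @allocated count_nans(xs)
end

# Case 1: the count is 0, a value inside the small-integer cache.
println(bytes_allocated(zeros(10_000)))

# Case 2: the count is 10000, a value outside the small-integer cache.
println(bytes_allocated(fill(NaN, 10_000)))
```

If the explanation above holds, both calls should report the same number of bytes regardless of the vector's contents, because the 16-byte allocation seen with @time belongs to boxing the returned value at global scope and not to the loop itself.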
@yuyichao, please be kinder to people posting issues. Yes, Discourse is where questions should be asked, but it's not always clear whether something is a real issue or not. The intention here is clearly to raise a possible problem which is greatly appreciated. We can and should explain that this is not a problem politely and gracefully.
@StefanKarpinski thanks for the patient explanation! I'll know what to look for next time.