This issue came about when I discovered that `sort!` on a small vector of integers was slower than usual; further inspection shows that `extrema` is now 65% slower for a vector of 500 integers, as a result of bounds checks in each iteration. I would not dare to fix this by adding `@inbounds` to the loop of `extrema`, since the function is defined for iterators in general (why is it `extrema`'s responsibility to signal that the internals of the iterator will do inbounds things?). I'd say an iterator over an array should guarantee to stay inbounds and not require its call site to provide `@inbounds`.
But the issue is quite subtle. The function `iterate(A::Array, ::Int)` does something that comes close to bounds checking:

```julia
function iterate(A::Array, i=1)
    @_propagate_inbounds_meta # why require the parent function to provide inbounds
    i >= length(A) + 1 ? nothing : (A[i], i+1) # when we here (kinda) assert we are inbounds?
end
```

but it's not entirely bounds checking, because in principle `i` could overflow, so to be safe an actual check is required to guarantee `i > 0`.
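To make the overflow point concrete: the full state domain (`1 <= i <= length(A)`) can be checked with a single unsigned comparison, so the lower bound need not cost an extra branch. This is a sketch only, under that assumption; `my_iterate` is a hypothetical stand-in, not Base's actual definition.

```julia
# Hypothetical sketch: checking the whole state domain (1 <= i <= length(A))
# with one comparison, by reinterpreting the index as unsigned.
# `my_iterate` is a made-up name, not the real Base.iterate.
function my_iterate(A::Array, i::Int=1)
    # (i - 1) % UInt is < length(A) only when 1 <= i <= length(A):
    # a negative (or overflowed) i wraps around to a huge unsigned value.
    (i - 1) % UInt < length(A) % UInt || return nothing
    return (@inbounds A[i]), i + 1
end
```

With this shape, the `@inbounds` lives inside the iterator itself, justified by the check on the line above it, rather than being demanded from every caller.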
~~The issue is a result of #27079 (more robust iteration over vectors), but in fact it cheats a bit by requiring the user to always provide `@inbounds`.~~ Edit (sorry!): without `@inbounds`, `for x in xs` would do bounds checking already.
Possible solutions:

- `for x in Mutable(xs)` that allows modifying `xs` on the go; `for x in xs` is then immutable iteration by convention and can be fast even without `@inbounds`.
- `extrema(::AbstractArray)` that uses `@inbounds` (meh)
- `@inbounds` in `extrema` and document that it does that
- `checkbounds(Bool, ::AbstractArray, ::Int)`, used rather than the `i >= length(A) + 1` condition, and somehow tell the compiler `i` cannot overflow, so that the `i > 0` check can be optimized away.

Edit: seems like the new iteration protocol is enough to fix things.
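The `checkbounds`-based option could look roughly like this. It uses the real `checkbounds(Bool, A, i)` from Base (which returns `true` iff `i` is a valid index, covering both `i < 1` and `i > length(A)` in one call), but `checked_iterate` is a hypothetical name for illustration:

```julia
# Hypothetical sketch of the checkbounds-based variant from the list above.
# checkbounds(Bool, A, i) does the full validity test in one call.
function checked_iterate(A::Array, i::Int=1)
    checkbounds(Bool, A, i) || return nothing
    return (@inbounds A[i]), i + 1
end
```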
This is quite closely related to https://github.com/JuliaLang/julia/pull/21402, but indeed things are different now that we only have one method — and don't need to assume that the caller did the start/next/done dance correctly.
> extrema is now 65% slower for a vector of 500 integers

Slower than v0.6.x?
65% slower on 0.7 when comparing bounds checking enabled vs disabled. Full report:

```julia
julia> @benchmark extrema(v) setup = (v = rand(Int, 1000))
```

- 884.1 ns on 0.6.2
- 1.222 μs on 0.7.0-alpha.0
- 422.8 ns with the proposed fix of #27386 (i.e. no bounds checking)
I have to agree with @haampie's general point here... generic code should be able to be just as fast as concretely-typed code, and the interfaces available for use by functions like `extrema(::Any)` should be clear, fast and safe. (IMO we should try to avoid as much as possible the situation where a concrete version of a perfectly valid generic method is added, with extra `@inbounds` added for speed.)
Have we discussed having an `unsafe_iterate` (used when lowering `for` loops) with a default fallback to `iterate`, which was something already suggested for `unsafe_next` in #15291 (and before)? The alternative seems to be asserting that all implementations of `@inbounds iterate(...)` shouldn't be able to crash if the user doesn't mess with the iteration state (and have lowering do this for us?). Tricky, tricky...
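For concreteness, the `unsafe_iterate`-with-fallback idea could be sketched as follows. Both names and method bodies are hypothetical; nothing like this exists in Base:

```julia
# Hypothetical sketch of an unsafe_iterate protocol with a safe fallback.
# Generic iterators work unchanged, with full checks:
unsafe_iterate(itr, state...) = iterate(itr, state...)

# Arrays opt in: only the upper bound is checked, trusting the caller
# (i.e. the lowered `for` loop) never to pass a negative or stale state.
unsafe_iterate(A::Array, i::Int=1) =
    i > length(A) ? nothing : ((@inbounds A[i]), i + 1)
```

The appeal is that only lowering would call `unsafe_iterate`, so the trust boundary stays inside the compiler; the cost is a second, parallel protocol that implementors can get wrong.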
I think we can do this without an `unsafe_iterate`. What would that gain us beyond Harmen's proposal in https://github.com/JuliaLang/julia/issues/20469#issuecomment-396083994?
I guess the idea of `unsafe_iterate` is that `for x in X` would be lowered to it, which would allow getting rid of bounds checks? But maybe we can achieve the same result by lowering `for x in X` to `@inbounds iterate(...)`?
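For reference, under the new protocol `for x in X` already lowers to plain `iterate` calls, so either proposal amounts to swapping the callee (or adding `@inbounds`) in this expansion, written out here by hand:

```julia
# Hand-written equivalent of what `for x in X; body; end` lowers to.
X = [10, 20, 30]
acc = Int[]                    # collects the loop body's effect, for illustration
next = iterate(X)
while next !== nothing
    (x, state) = next
    push!(acc, x)              # the loop body
    next = iterate(X, state)
end
```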
But doesn't the new iteration protocol give the iteration implementors enough information that they can safely add the `@inbounds` themselves without any additional context from the caller? No longer do we have to trust that the caller has called `done` — `iterate` itself does that check. The only thing it doesn't protect us from is someone manually constructing and passing an invalid state that happens to pass the check but is still out of bounds.
The PR over at #27386 demonstrates (or hopefully will demonstrate once Nanosoldier is feeling better) that it's possible to check the entire domain of the state without incurring a performance penalty.
Now, it's harder (impossible?) to check the entire domain with abstract iterator implementations, but https://github.com/JuliaLang/julia/issues/20469#issuecomment-396083994 is arguing that the iterator state is sufficiently complicated that it would require the caller to specifically craft an invalid state such that it passes the "done" check but still yields an out of bounds access. I was initially inclined to agree, but as I was writing this it dawned on me that the easiest way to mess this up would be by iterating over two arrays of different lengths at the same time and inadvertently mixing up the states. Is there another approach we could take here?
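The mixed-state hazard needs no malicious crafting — just two loops over arrays of different lengths and a swapped variable. A sketch of the footgun, which is harmless today only because `iterate` still bounds-checks:

```julia
# Iterating two arrays of different lengths and mixing up the states.
a = [1, 2, 3, 4, 5]
b = [10, 20]

(xa, sa) = iterate(a)          # sa == 2
(xb, sb) = iterate(b)          # sb == 2
(xa, sa) = iterate(a, sa)      # sa == 3
(xa, sa) = iterate(a, sa)      # sa == 4

# Accidentally feeding a's state to b: with the check in place this
# returns nothing, but with @inbounds baked into iterate it would
# read past the end of b.
iterate(b, sa)
```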
I'd really love it if we could avoid either kludge (that is, either `unsafe_iterate` or `@inbounds iterate`).
> inadvertently mixing up the states

I did not think of that, and it looks pretty hard to avoid, unfortunately.
> The only thing it doesn't protect us from is someone manually constructing and passing an invalid state that happens to pass the check but is still out of bounds.

Exactly. AFAICT the previous policy is that if no one calls something called `unsafe_xxx`, uses `ccall`, or adds any `@inbounds`, Julia will basically never crash (i.e. segfault).
I am, however, happy to agree to ditch that policy - but it's worth discussing.
(To be honest, the fact that `done` and `next` are now together is only a minor mitigation of the fact that users can mess iteration up, since as mentioned they can still pass whatever state they want.)
> I'd really love it if we could avoid either kludge (that is, either `unsafe_iterate` or `@inbounds iterate`).
Rather silly idea: if we simply renamed `iterate` to `unsafe_iterate` then we are basically done (ugly, I know, but brutally honest).