Julia: Short-circuiting of all on generators

Created on 28 Oct 2016 · 8Comments · Source: JuliaLang/julia

This was touched upon in #14782 (related: #11774), but there's no open issue for it. Currently,

all(x->(print(x); x<5), 1:9)
> 12345
> false

# but

all((print(x); x<5) for x in 1:9)
> 123456789
> false

breaking

Source

cstjean

Most helpful comment

I think any and all should always short circuit. I don't understand why they call mapreduce_sc_impl and reduce; something to do with vectorization maybe? If there are special cases like BitArray that don't want short-circuiting (and where you can't tell the difference) those can be separate narrowly-applicable methods.

JeffBezanson on 28 Oct 2016

👍3

All 8 comments

JeffBezanson on 28 Oct 2016

👍3

My memory of it is foggy, but I think that the reason why it doesn't short-circuit always is to keep things like this from happening:

julia> all(Any[true,false,1]) # if short-circuiting, returns false, if not, errors out on 1

And other non-boolean hijinks.
So it only short-circuits on collections with eltype(A)===Bool (actually, it used to not even let other types of collections to be used, but we've grown soft in our demands and now they just don't short-circuit)

My original implementation wouldn't even have allowed generators as arguments, but things have changed in this land since then, so maybe it's time to take another look at this decision... ...?

fcard on 29 Oct 2016

Whether to short-circuit is a semantic choice. It should be done consistently, otherwise it's really hard to know when to expect it. My vote is also to always short-circuit.

My memory of it is foggy, but I think that the reason why it doesn't short-circuit always is to keep things like this from happening

all(a->a, Any[true, true, true, true, false, 10, 23]) is false at the moment, and doesn't trigger an error. all(Any[true, true, true, true, false, 10, 23]) is an error.

cstjean on 29 Oct 2016

@cstjean that's because f in all(f, A) is expected to always return Bool and errors otherwise, so all assumes it's safe to short-circuit. I had some complex runtime type inference mechanism to check that the function was indeed a predicate, but it was scrapped, (thankfully, in retrospect. Function types would serve this purpose better) so we're just relying on your good faith.

So, passing a->a to all/any is meant to be undefined behavior. Maybe. The conversation about this started (https://github.com/JuliaLang/julia/issues/11750#issuecomment-113249332) but I am not sure it concluded :P

fcard on 29 Oct 2016

Keep in mind that until https://github.com/JuliaLang/julia/commit/06c93ce8a5184d8e5463d0812e2fd1d2a0747f45 any/all _always_ short-circuited, or would not start at all and throw an error. That behavior made sense back then because there were no generators and the recommendation was always to use either all(pred, Coll) or all(BoolColl). Now, with generators, it might make sense to either special-case them or remove this safety check.

fcard on 29 Oct 2016

Yes @fcard I think you're right about the type-based motivation for this. Looking at it again it doesn't seem important to me. There's a lot to be said for specifying the semantics of operations as obvious, simple definitions like

function all(itr)
    for x in itr
        !x && return false
    end
    return true
end

If you have a mixed-type array, the behavior is not surprising since it just corresponds to this obvious definition.

JeffBezanson on 29 Oct 2016

👍2

@JeffBezanson Right! Now that I think about it I think the reason for the complexity of the original code was to reuse the optimizations already in place and avoid some other performance pitfalls, (mostly slow anonymous functions) and the resulting semantics could've been seen as a plus since they encouraged better use of the short-circuiting behavior (e.g. the deprecation caught several uses of all(map(f, A))), but now that we have generators and can write all(f(x) for x in A), plus a lot of the performance concerns that existed then don't anymore, the definitions can now probably be simplified to their obvious implementations.

tl;dr I agree :P

fcard on 29 Oct 2016

👍1

Since this change would be breaking, should we target it for 0.6?