This was touched upon in #14782 (related: #11774), but there's no open issue for it. Currently,
all(x->(print(x); x<5), 1:9)
> 12345
> false
# but
all((print(x); x<5) for x in 1:9)
> 123456789
> false
I think `any` and `all` should always short circuit. I don't understand why they call `mapreduce_sc_impl` and `reduce`; something to do with vectorization maybe? If there are special cases like BitArray that don't want short-circuiting (and where you can't tell the difference) those can be separate narrowly-applicable methods.
My memory of it is foggy, but I think that the reason why it doesn't short-circuit always is to keep things like this from happening:
julia> all(Any[true,false,1]) # if short-circuiting, returns false, if not, errors out on 1
And other non-boolean hijinks.
So it only short-circuits on collections with `eltype(A) === Bool` (actually, it used to not even allow other types of collections, but we've grown soft in our demands and now they just don't short-circuit).
My original implementation wouldn't even have allowed generators as arguments, but things have changed in this land since then, so maybe it's time to take another look at this decision...?
Whether to short-circuit is a semantic choice. It should be done consistently, otherwise it's really hard to know when to expect it. My vote is also to always short-circuit.
> My memory of it is foggy, but I think that the reason why it doesn't short-circuit always is to keep things like this from happening

`all(a->a, Any[true, true, true, true, false, 10, 23])` is `false` at the moment, and doesn't trigger an error. `all(Any[true, true, true, true, false, 10, 23])` is an error.
@cstjean that's because `f` in `all(f, A)` is expected to always return `Bool` and errors otherwise, so `all` assumes it's safe to short-circuit. I had some complex runtime type inference mechanism to check that the function was indeed a predicate, but it was scrapped (thankfully, in retrospect; function types would serve this purpose better), so we're just relying on your good faith.
So, passing `a->a` to `all`/`any` is meant to be undefined behavior. Maybe. The conversation about this started (https://github.com/JuliaLang/julia/issues/11750#issuecomment-113249332) but I am not sure it concluded :P
Keep in mind that until https://github.com/JuliaLang/julia/commit/06c93ce8a5184d8e5463d0812e2fd1d2a0747f45 `any`/`all` _always_ short-circuited, or would not start at all and throw an error. That behavior made sense back then because there were no generators and the recommendation was always to use either `all(pred, Coll)` or `all(BoolColl)`. Now, with generators, it might make sense to either special-case them or remove this safety check.
Yes @fcard I think you're right about the type-based motivation for this. Looking at it again it doesn't seem important to me. There's a lot to be said for specifying the semantics of operations as obvious, simple definitions like
function all(itr)
    for x in itr
        !x && return false
    end
    return true
end
If you have a mixed-type array, the behavior is not surprising since it just corresponds to this obvious definition.
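For illustration, here is a minimal sketch of how that same obvious definition might extend to `any` and to the predicate forms. The names `my_all` and `my_any` are hypothetical, chosen so that nothing in Base is overwritten; this is not the actual Base implementation.

```julia
# Hypothetical names (my_all, my_any); a sketch, not the Base implementation.

# Return false at the first element for which f(x) is false.
function my_all(f, itr)
    for x in itr
        f(x) || return false   # a non-Bool f(x) raises a TypeError here
    end
    return true
end

# Return true at the first element for which f(x) is true.
function my_any(f, itr)
    for x in itr
        f(x) && return true
    end
    return false
end

# Single-argument forms test the elements themselves.
my_all(itr) = my_all(identity, itr)
my_any(itr) = my_any(identity, itr)

# Both the predicate form and the generator form stop at the first failure.
seen_pred = Int[]
my_all(x -> (push!(seen_pred, x); x < 5), 1:9)      # visits 1 through 5 only

seen_gen = Int[]
my_all((push!(seen_gen, x); x < 5) for x in 1:9)    # also visits 1 through 5 only

# On a mixed-type array the loop returns before reaching the non-Bool element.
mixed_result = my_all(Any[true, false, 1])          # false
```

Under this sketch the generator and predicate forms behave identically, and a mixed-type array only errors if iteration actually reaches a non-Bool element.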
@JeffBezanson Right! Now that I think about it, I think the reason for the complexity of the original code was to reuse the optimizations already in place and avoid some other performance pitfalls (mostly slow anonymous functions), and the resulting semantics could've been seen as a plus since they encouraged better use of the short-circuiting behavior (e.g. the deprecation caught several uses of `all(map(f, A))`). But now that we have generators and can write `all(f(x) for x in A)`, and a lot of the performance concerns that existed then no longer apply, the definitions can probably be simplified to their obvious implementations.
tl;dr I agree :P
Since this change would be breaking, should we target it for 0.6?