sum and mean can both take a function argument to apply to elements before summing or averaging, and this is documented:
?sum
sum(f, itr)
Sum the results of calling function f on each element of itr.
?mean
mean(f::Function, v)
Apply the function f to each element of v and take the mean.
But ?prod doesn't show the method that takes a function argument.
Now that we have generators, maybe we could just deprecate those methods. sum(f(i) for i in itr) should be just as fast, and prod(f(i) for i in itr) works as is.
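For reference, a minimal sketch of the two spellings side by side. It assumes a recent Julia, where mean lives in the Statistics standard library (at the time of this thread it was in Base); the range and the choice of abs2 are arbitrary illustrations.

```julia
using Statistics  # provides `mean` on Julia 0.7 and later (assumption: modern Julia)

itr = 1:4

# Function-first methods: apply abs2 to each element, then reduce.
s1 = sum(abs2, itr)
p1 = prod(abs2, itr)
m1 = mean(abs2, itr)

# Generator equivalents: the mapping is written inline.
s2 = sum(abs2(i) for i in itr)
p2 = prod(abs2(i) for i in itr)
m2 = mean(abs2(i) for i in itr)

println((s1, s2))  # (30, 30)
println((p1, p2))  # (576, 576)
println((m1, m2))  # (7.5, 7.5)
```

Both forms give the same values here; whether they are equally fast is the claim made above and is not measured by this sketch.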
+1000 to removing these kinds of pre-composed generator + function patterns
Function-first does allow do-block notation. Is that doable with generators?
I find the generator syntax more readable than a do-block with function-first.
+1 for generators. In the rare occasions where your function definition is so big that a do block is useful, I think it would be okay to refer to mapreduce.
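To make the do-block point concrete, here is a small sketch. The data and the body of the block are made up for illustration; the only thing it relies on is that a do block is passed as the first argument, which is why it works with function-first methods and with mapreduce(f, op, itr).

```julia
words = ["alpha", "be", "gamma"]  # illustrative data

# do block with the function-first sum method
total = sum(words) do w
    n = length(w)
    n > 2 ? n : 0   # count only words longer than two characters
end

# the same multi-line function handed to mapreduce, with + as the reducing operator
total_mr = mapreduce(+, words) do w
    n = length(w)
    n > 2 ? n : 0
end

println(total)     # 10
println(total_mr)  # 10
```

A generator has no slot for a do block, so a multi-line body would need a named helper function or the mapreduce form.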
prod actually does take a function argument. On master (just a couple days old):
julia> methods(prod)
# 6 methods for generic function "prod":
prod(x::Tuple{}) at tuple.jl:222
prod(x::Tuple{Any,Vararg{Any,N<:Any}}) at tuple.jl:223
prod(f::Function, A::AbstractArray, region) at reducedim.jl:291
prod(f::Union{DataType,Function}, a) at reduce.jl:400
prod(A::AbstractArray, region) at reducedim.jl:293
prod(a) at reduce.jl:407
The third and fourth methods listed are the ones you want.
@ararslan The issue is that it's not documented.
I changed the title to reflect that concern. Care to make a PR to fix this?
Deprecation in favor of generators seems like the better approach here.
Why can't both methods exist?
They can. I have to say that writing things like sum(length, arrays) is pretty nice. Nicer than sum(length(a) for a in arrays), so maybe just keep this and document it.
Ok, to flip sides again: the definition of reducer(transform, itr) methods for reducers is completely mechanical, which is exactly the sort of situation that we introduced f.(v) syntax to avoid, i.e. rather than call @vectorize_1arg and @vectorize_2arg on functions that support vectorization, let any function support vectorization. The only thing that's better about this situation is that unlike vectorization, there's a fairly clear answer to the question "which functions get these methods": the answer is all reducers. While sum(length(a) for a in arrays) is pretty concise, I still can't think of a way to write this generically for any reducer that is as nice as sum(length, arrays). So maybe just define this method for all the built in reducers, which technically makes them map-reducers.
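A rough sketch of what "completely mechanical" means here. The names mysum, myprod, reduce_with and reduce_with_gen are hypothetical and are not how Base defines anything; they only show that each function-first method is a mapreduce with a fixed operator, and what the generic call looks like in each style.

```julia
# Hypothetical reducers: each function-first method is just mapreduce with a fixed operator.
mysum(f, itr)  = mapreduce(f, +, itr)
myprod(f, itr) = mapreduce(f, *, itr)

# Generic over any reducer that has a function-first method (hypothetical helper).
reduce_with(reducer, f, itr) = reducer(f, itr)

# The same thing spelled with a generator (hypothetical helper).
reduce_with_gen(reducer, f, itr) = reducer(f(x) for x in itr)

arrays = [[1, 2, 3], [4, 5], [6]]

println(mysum(length, arrays))                     # 6
println(myprod(length, arrays))                    # 6
println(reduce_with(maximum, length, arrays))      # 3
println(reduce_with_gen(maximum, length, arrays))  # 3
```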
@StefanKarpinski What about sum(length.(arrays))? This is probably inefficient due to the intermediate array. Maybe one could have a notation like f:(v), which returns a generator instead of a full array like f.(v)? Or else f.(v) by default returns a generator, which must be explicitly converted to an array by a call to collect.
Yes, I considered that but it's inefficient since it materializes the array of lengths. The f:(v) notation isn't available since it already means the same thing as f:v, i.e. construct a range from f to v with unit step. Introducing laziness doesn't seem helpful here. I'm somewhat inclined to just define (and document) the map-reduce methods for all reducers since it is a fixed set of functions, at least in Base.
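A short sketch of the two points above, with made-up data: the dot call builds a full array before summing while the generator does not, and f:(v) already parses as the range f:v.

```julia
arrays = [[1, 2, 3], [4, 5], [6]]  # illustrative data

lens_eager = length.(arrays)              # broadcast: allocates a Vector of lengths
lens_lazy  = (length(a) for a in arrays)  # generator: lengths computed on demand

println(lens_eager isa Vector{Int})     # true
println(lens_lazy isa Base.Generator)   # true
println(sum(lens_eager) == sum(lens_lazy) == sum(length, arrays))  # true

# f:(v) is ordinary range syntax, so it cannot be reused for a lazy map.
f, v = 2, 5
println(f:(v) == f:v)    # true
println(collect(f:(v)))  # [2, 3, 4, 5]
```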
Sorry, : was a poor example (I forgot ranges). Maybe another punctuation is available for lazy map? I created a new issue to discuss this: https://github.com/JuliaLang/julia/issues/19198.
Ref. discussion around https://github.com/JuliaLang/julia/issues/16285#issuecomment-237320513 (generalizing compact broadcast syntax to compact higher-order function syntax, relevant given the behavior mentioned above is mapreduce). Best!