sum and mean both can take a function argument to apply to elements before summing or averaging, and this is documented:
?sum
sum(f, itr)
Sum the results of calling function f on each element of itr.
?mean
mean(f::Function, v)
Apply the function f to each element of v and take the mean.
But ?prod doesn't show the method that takes a function argument.
Now that we have generators, maybe we could just deprecate those methods. sum(f(i) for i in itr) should be just as fast, and prod(f(i) for i in itr) works as is.
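For reference, here is a minimal sketch of the two spellings being compared. It assumes a recent Julia 1.x, where mean lives in the Statistics standard library (it was in Base when this thread was written). The data `itr` and the transform `f` are made up for illustration.

```julia
using Statistics  # assumption: Julia 1.x, where mean is provided by Statistics

itr = 1:4         # hypothetical sample data
f(x) = x^2        # hypothetical transform

# Function-first methods
s1 = sum(f, itr)
p1 = prod(f, itr)
m1 = mean(f, itr)

# Generator equivalents
s2 = sum(f(i) for i in itr)
p2 = prod(f(i) for i in itr)
m2 = mean(f(i) for i in itr)

# Both spellings agree on the result
@assert s1 == s2
@assert p1 == p2
@assert m1 == m2
```

Whether the generator form is "just as fast" depends on the Julia version and the reducer, so that is worth benchmarking rather than assuming.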
+1000 to removing these kinds of pre-composed generator + function patterns
Function-first does allow do-block notation; is that doable with generators?
I find the generator syntax more readable than a do-block with function-first.
+1 for generators. On the rare occasions where your function definition is so big that a do block is useful, I think it would be okay to refer to mapreduce.
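To make the do-block question concrete, here is a rough sketch of the three options: the function-first method with a do block, the mapreduce fallback suggested above, and a generator whose body is a begin block. The `words` data and the transform are hypothetical.

```julia
words = ["apple", "fig", "banana"]  # hypothetical data

# do-block form: the block becomes the first (function) argument of sum
total_do = sum(words) do w
    n = length(w)
    isodd(n) ? n : 0   # count only odd-length words
end

# Same computation via mapreduce; the do block is again the first argument,
# so this is mapreduce(f, +, words)
total_mr = mapreduce(+, words) do w
    n = length(w)
    isodd(n) ? n : 0
end

# Generator form with a multi-line body wrapped in begin ... end
total_gen = sum(begin
    n = length(w)
    isodd(n) ? n : 0
end for w in words)

@assert total_do == total_mr == total_gen
```

So a multi-statement body is possible with a generator, though the begin block is arguably less pleasant to read than the do block.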
prod actually does take a function argument. On master (just a couple days old):
julia> methods(prod)
# 6 methods for generic function "prod":
prod(x::Tuple{}) at tuple.jl:222
prod(x::Tuple{Any,Vararg{Any,N<:Any}}) at tuple.jl:223
prod(f::Function, A::AbstractArray, region) at reducedim.jl:291
prod(f::Union{DataType,Function}, a) at reduce.jl:400
prod(A::AbstractArray, region) at reducedim.jl:293
prod(a) at reduce.jl:407
The third and fourth methods listed are the ones you want.
@ararslan The issue is that it's not documented.
I changed the title to reflect that concern. Care to make a PR to fix this?
Deprecation in favor of generators seems like the better approach here.
Why can't both methods exist?
They can. I have to say that writing things like sum(length, arrays) is pretty nice. Nicer than sum(length(a) for a in arrays), so maybe just keep this and document it.
Ok, to flip sides again: the definition of reducer(transform, itr) methods for reducers is completely mechanical, which is exactly the sort of situation that we introduced f.(v) syntax to avoid, i.e. rather than call @vectorize_1arg and @vectorize_2arg on functions that support vectorization, let any function support vectorization. The only thing that's better about this situation is that, unlike vectorization, there's a fairly clear answer to the question "which functions get these methods": the answer is all reducers. While sum(length(a) for a in arrays) is pretty concise, I still can't think of a way to write this generically for any reducer that is as nice as sum(length, arrays). So maybe just define this method for all the built-in reducers, which technically makes them map-reducers.
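As a sketch of what "completely mechanical" means here: given a reducer and its binary operator, the transform-taking method is the same one-liner every time. The names `mysum` and `myprod` below are hypothetical stand-ins chosen to avoid touching Base; this is not how Base actually defines its reducers.

```julia
# Hypothetical reducers, named to avoid clashing with Base.sum / Base.prod
mysum(itr) = reduce(+, itr)
myprod(itr) = reduce(*, itr)

# The map-reduce method is generated identically for each reducer
for (name, op) in ((:mysum, :+), (:myprod, :*))
    @eval $name(f, itr) = mapreduce(f, $op, itr)
end

arrays = [[1, 2], [3]]               # hypothetical data
total_len = mysum(length, arrays)    # sums the lengths of the inner arrays
shifted = myprod(x -> x + 1, 1:3)    # multiplies 2, 3 and 4
```

The loop only covers reducers whose operator is known up front, which matches the point that the set of reducers in Base is fixed.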
@StefanKarpinski What about sum(length.(arrays))? This is probably inefficient due to the intermediate array. Maybe one could have a notation like f:(v), which returns a generator instead of a full array like f.(v)? Or else f.(v) by default returns a generator, which must be explicitly converted to an array by a call to collect.
Yes, I considered that but it's inefficient since it materializes the array of lengths. The f:(v) notation isn't available since it already means the same thing as f:v, i.e. construct a range from f to v with unit step. Introducing laziness doesn't seem helpful here. I'm somewhat inclined to just define (and document) the map-reduce methods for all reducers since it is a fixed set of functions, at least in Base.
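A small sketch of the three spellings discussed, with made-up data. All give the same answer; the difference is that the broadcast version builds a temporary array of lengths before summing. The last lines illustrate why f:(v) is already taken.

```julia
arrays = [[1, 2, 3], [4, 5], [6]]    # hypothetical data

a = sum(length.(arrays))             # broadcast first: allocates a temporary Vector of lengths
b = sum(length, arrays)              # map-reduce method: no intermediate array
c = sum(length(x) for x in arrays)   # generator: also lazy

@assert a == b == c

# f:(v) already parses as the range f:v, so it cannot mean "lazy map"
lo, hi = 1, 3                        # hypothetical endpoints
@assert lo:(hi) == lo:hi
```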
Sorry, : was a poor example (I forgot ranges). Maybe another punctuation is available for lazy map? I created a new issue to discuss this: https://github.com/JuliaLang/julia/issues/19198.
Ref. discussion around https://github.com/JuliaLang/julia/issues/16285#issuecomment-237320513 (generalizing compact broadcast syntax to compact higher-order function syntax, relevant given the behavior mentioned above is mapreduce). Best!