Julia: array reductions (sum, mean, etc.) and dropping dimensions

Created on 27 May 2016 · 48Comments · Source: JuliaLang/julia

Following this discussion on julia-users: https://groups.google.com/forum/#!topic/julia-users/V-meqUZtb7k

I find the fact that sum(A,1) and A[1,:] have different shapes in 0.5 counter intuitive. I am not aware of other languages for which the equivalent expressions have different shapes. It leads to unexpected results when computing e.g. A[1,:] ./ sum(A,1). At least in the case of sum, it seems like sum(A,1) should be equivalent to A[1,:] + A[2,:] + A[3,:] + ...

However, as @timholy pointed out, this behavior can be useful for broadcasting with expressions like A ./ sum(A,1).

For me, having the default slicing and array reduction behavior be consistent seems the natural choice, meaning that sum(A,1) should drop the first dimension. But I understand the attractiveness of not dropping dimensions for broadcasting. What are others' thoughts?

arrays

Source

ak168421

👍2

Most helpful comment

Back when we were brainstorming this, it was before dims was a keyword argument. Now it seems much more obvious to me that a nice solution here could be the use of a separate and parallel squeeze keyword argument. That is:

sum(A, dims=1) -> returns a 1xNxPx… object
sum(A, squeeze=1) -> returns a NxPx… object
sum(A, dims=1, squeeze=2) -> error for simplicity's sake

My solution in #23500 never really sat well with me, so that's why I haven't pushed on it. This is — at least in this brainstorming stage — much more compelling to me.

(Via https://github.com/JuliaLang/julia/pull/27608#issuecomment-398084051, brought up there due to the possibility of a surface syntax that'd ease these woes)

mbauman on 18 Jun 2018

👍11

All 48 comments

I agree with you. Perhaps the default behavior could be to drop (for consistency with A[1,:]) and have a keyword argument on reduction functions to keep dims.

BobPortmann on 27 May 2016

Having a keyword argument affect the type (i.e. number of dimensions) of the result would make this function type unstable.

StefanKarpinski on 27 May 2016

The way to keep the dimensions is to use a range, i.e. A[1:1,:].

andreasnoack on 27 May 2016

I am aware that A[1:1,:] does not drop dimensions, but it seems clear that the "default" behavior now is dropping dimensions APL-style with A[1,:], and, intuitively, it seems that the "default" behavior of sum should be compatible.

ak168421 on 27 May 2016

My comment was a reply to @BobPortmann's comment, not to the general issue.

andreasnoack on 27 May 2016

I think this was settled in https://github.com/JuliaLang/julia/pull/13612

Edit: "settled" is a bit too strong

mschauer on 27 May 2016

I am aware of the way A[1:1,:] vs A[1,:] works and think it is nice. My comment was about the reduction functions, which can be seen as inconsistent to slicing with a scalar. There was much discussion in #3262 but that was before the scalar dimensions dropping occurred. I thought perhaps that sentiment may have changed since then but it appears not.

BobPortmann on 27 May 2016

18271 is the strongest argument I've seen yet in favor of dropping dimensions from reductions. It's a pity we don't have an analog of `A[1:1, :]` to indicate that a dimension should be kept. What if we introduced `NoDrop`, and used `mean(A, NoDrop(1))` and `mean(A, NoDrop(1:2))`?

Alternatively we could switch the reduction functions so that they _required_ a boolean tuple for dims, i.e., mean(A, (true,false)) and mean(A, (true,true)) rather than mean(A, 1) and mean(A, 1:2). (true= "do reduce over the dimension in this slot".) Then we could also specify that the dimension be retained with mean(A, (true:true,false)) and mean(A, (true:true, true:true)), a direct analogy to the getindex syntax. I'm a little less excited about this variant, though.

timholy on 29 Aug 2016

This seems not-fully-decided, so moving to 1.0.

JeffBezanson on 4 Jan 2017

+1 for making reductions and getindex consistent.

tknopp on 3 Feb 2017

Discussion on Discourse: https://discourse.julialang.org/t/sum-squeeze-as-one-function/2471

StefanKarpinski on 7 Mar 2017

FWIW, just got bitten by this: I was expecting something like sum(A.*x',2) to be the same type as A*x, and it wasn't.

antoine-levitt on 20 May 2017

Half-baked idea prompted by https://github.com/JuliaLang/julia/pull/22000#issuecomment-304445583: we could pass reduction functions through the indexing syntax in order to specify which dimensions get reduced. E.g., b[sum, :, sum] == sum(b, (1,3)). This would intuitively drop dimensions.

mbauman on 30 May 2017

👍9

That's an appealing approach, which reminds me of the X_{i+} notation for denoting column sums.

nalimilan on 30 May 2017

Huh, that's a pretty cool syntax idea.

StefanKarpinski on 30 May 2017

i guess it's a dumb question, but why don't we differ the behaviour in the same way as we do for array indexing? sum(A, 3) will drop while sum(A, 3:3) won't drop the dimension

Edit: rethinking my idea showed me why: for indexing colon style is used per dimension while for sum the colon style is used across dimensions

Edit2: https://github.com/JuliaLang/julia/issues/16606#issuecomment-304924412 sounds indeed very clever

rapus95 on 9 Jul 2017

I have proposed, elsewhere squeeze(f, A, n) = squeeze(f(A, n), n) so that you can write squeeze(sum, A, 2) instead of squeeze(sum(A, 2), 2).

StefanKarpinski on 20 Jul 2017

👍4 ❤2

Since we're unlikely to change this in 1.0 and adding squeeze is non-breaking, this is no longer release-blocking.

StefanKarpinski on 12 Oct 2017

Closed by #23500.

StefanKarpinski on 12 Oct 2017

Oops, I thought that had been merged already.

StefanKarpinski on 12 Oct 2017

I frequently find doing

squeeze(mapslices(f, A, dims), dims)

ie with dims repeated, eg for fs which return a scalar. I think is (loosely) related to this issue, so I am commenting here, but I can open a new one if necessary. I have no good suggestion for the syntax.

tpapp on 2 Dec 2017

👍3

The suggestion I've made a few times for this is squeeze(f, A, dims); this was implemented by @mbauman in https://github.com/JuliaLang/julia/pull/23500 but then stalled out on some API concerns that I confess I'm not quite groking.

StefanKarpinski on 4 Dec 2017

Removing from the milestone since the suggested squeeze methods can be added at any time.

StefanKarpinski on 15 Dec 2017

sum(A, dims=1) -> returns a 1xNxPx… object
sum(A, squeeze=1) -> returns a NxPx… object
sum(A, dims=1, squeeze=2) -> error for simplicity's sake

My solution in #23500 never really sat well with me, so that's why I haven't pushed on it. This is — at least in this brainstorming stage — much more compelling to me.

(Via https://github.com/JuliaLang/julia/pull/27608#issuecomment-398084051, brought up there due to the possibility of a surface syntax that'd ease these woes)

mbauman on 18 Jun 2018

👍11

Just to clarify, the squeeze = ... keyword argument would be equivalent to calling squeeze on the result, just with the indices remapped accordingly (depending on dim?)

Or squeeze itself would affect the reduction?

tpapp on 18 Jun 2018

+1 to please provide some keyword argument for sum to not return an n-dimensional array if it isn't wanted so that the ugly syntax sum(..)[:] can be avoided.

aterenin on 6 Apr 2019

👍2

@aterenin: I recommend not to use sum(..)[:] which will create a copy but rather use vec(sum(..)). I think this is clean syntax and don't think using a keyword argument makes this cleaner.

tknopp on 6 Apr 2019

👍1

Thanks for the tip! That syntax is definitely cleaner by Julia standards, though sum(..) |> vec is perhaps even cleaner because it avoids contributing to parentheses hell. Hopefully better pipes get added in one day as the language develops.

aterenin on 6 Apr 2019

Now we have dims as a fairly ubiquitous keyword, and users often write code with repeated dims (e.g. https://github.com/JuliaLang/julia/issues/16606#issuecomment-348700496), is there any reason not to have e.g. sum(X, dims=i, dropdims=true) as nicer syntax for dropdims(sum(X, dims=i), dims=i)?

nickrobinson251 on 8 Aug 2019

I had proposed dropdims(sum, X, dims=i) a while ago, which seems even cleaner to me.

StefanKarpinski on 8 Aug 2019

I had proposed dropdims(sum, X, dims=i) a while ago, which seems even cleaner to me.

I don't feel strongly about one syntax over the other :)

Overall I think I slightly prefer sum(X, drop=true, kwargs...) to dropdims(sum, X, kwargs...) for a couple reasons

we may want the sum of some function f: sum(f, X, kwargs...) - not sure how that'd work in the dropdims(sum, X, ...) case?
to my eye, it best to have "the main thing you want" be the function call, i.e. the sum (mean, std, etc.) not the dropdims e.g. X̄ = mean(X, kwargs...) reads nicer _to me_ than X̄ = dropdims(mean, X, kwargs...)

on the other hand, it does add yet another keyword, and e.g. mean(f, X) was certainly the right choice (in my opinion) over something like mean(X, transform=f)

nickrobinson251 on 8 Aug 2019

👍1

There's a big implementation difference:

sum(X, dims=i, drop=true) requires adding a drop keyword to every single reducer.
dropdims(sum, X, dims=i) is one completely generic method

The implementation of my proposal is this method:

dropdims(f::Callable, args...; dims) = dropdims(f(args..., dims=dims), dims=dims)

julia> A = rand(0:9, 3, 4)
3×4 Array{Int64,2}:
 3  2  6  8
 6  1  4  2
 3  0  7  6

julia> dropdims(sum, A, dims=1)
4-element Array{Int64,1}:
 12
  3
 17
 16

julia> dropdims(maximum, A, dims=2)
3-element Array{Int64,1}:
 8
 6
 7

StefanKarpinski on 8 Aug 2019

👍5

If you want to support keywords, then you can do this:

dropdims(f::Callable, args...; dims, kwargs...) =
    dropdims(f(args...; kwargs..., dims=dims), dims=dims)

StefanKarpinski on 8 Aug 2019

👍4

Note that this method will work on all reducers everywhere. If we want all reducers to support a drop keyword, that will require everyone who has ever implemented a reducer to add support for it. It also makes reducers type unstable since the return type depends on a keyword argument.

StefanKarpinski on 8 Aug 2019

👍3

Nice! And just for completeness (answering my own question above), the dropdims version of e.g. sum(abs2, A, dims=1) would be

julia> dropdims(sum, abs2, A, dims=1)
4-element Array{Int64,1}:
  54
   5
 101
 104

nickrobinson251 on 8 Aug 2019

I agree that it looks a bit stranger to write, but I think this approach is the most composable one I've seen for this particular functionality.

StefanKarpinski on 9 Aug 2019

👍6

Problem with dropdims(sum, abs2, A, dims=1)
is it removes my ability to pass a do-block to the sum

eg I can't write:

dropdims(sum, A, dims=1) do x
    if x>0
        log(x)
    else
        0
    end
end

An option I don't love would be a macro.

@dropdims sum(abs2, A, dims=1) which would become

_dims=1
dropdims(sum(abs2, A, dims=_dims), dims=_dims)
end

The original argument against the kwarg is moot now, since that will constant-fold out now days.

oxinabox on 9 Aug 2019

The argument against the keyword form is not that keyword arguments are slow (when that applied, dims also wasn't a keyword, so I don't think that was every the problem), it's that you have to add that keyword to every single reducer everywhere. So we can do this in Base but any custom reducers out there will not have support for the same pattern. It's just weird to have to implement the same completely identical functionality over and over again, even though there's a completely composable solution that requires no code changes at all, just a new function.

Also, no one has addressed the type stability argument: changing return type based on a keyword argument is not generally a good idea.

StefanKarpinski on 13 Aug 2019

Yeah, I think the arguments for dropdims(f, args...; dims, kwargs...) are pretty good. I think @oxinabox was just being greedy (which is good!), and pointing out that we can currently use a do-block to sum

sum(A; dims=1) do x
    x > 0 ? log(x) : x
end

and in some ways in would be nice to be able to write (something like)

@dropdims sum(A; dims=1) do x
    x > 0 ? log(x) : x
end

rather than e.g.

dropdims(sum, x -> x > 0 ? log(x) : x, A, dims=1)

g(x) = x > 0 ? log(x) : x
dropdims(sum, g, A, dims=1)

But all are an approvement over the current

dropdims(
    sum(X; dims=1) do x
        x > 0 ? log(x) : x
    end,
    dims=1
)

nickrobinson251 on 13 Aug 2019

In #32860 we are discussing possible syntaxes for reduce/foldl/foreach. One idea for supporting multidimensional reduce is to add a syntax (say)

a[., ., :]

which is lowered to

Axed{(1, 2)}(a)

where

struct Axed{dims, T}
    x::T
end

(probably it should be Axed <: AbstractArray)

This way,

dropdims(sum(a[., ., :]))

can implement a type-stable version of

dropdims(sum(a; dims=(1, 2)); dims=(1, 2))

tkf on 13 Aug 2019

🎉1 👍1

The argument that adding drop=true to all methods would be a pain also applies to things like sum(A, squeeze=1) from https://github.com/JuliaLang/julia/issues/16606#issuecomment-398086850 , unfortunately.

Here's a quick attempt at a macro which allows sum(A; dims=2) do ... end, to try it out:

using MacroTools

macro dropdims(ex) # doesn't handle reduce(...; dims=..., init=...) etc.
    MacroTools.postwalk(ex) do x
        if @capture(x, red_(args__, dims=d_)) || @capture(x, red_(args__; dims=d_)) 
            :( dropdims($x; dims=$d) )
        elseif @capture(x, dropdims(red_(args__, dims=d1_); dims=d2_) do z_ body_ end) || 
               @capture(x, dropdims(red_(args__; dims=d1_); dims=d2_) do z_ body_ end)
            :( dropdims($red($z -> $body, $(args...); dims=$d1); dims=$d2) )
        else
            x
        end
    end |> esc
end

@dropdims begin
    a = sum(ones(3,7), dims=2)
    b = sum(10 .* randn(2,10); dims=2) do x                      
        trunc(Int, x)
    end
    (a isa Vector, b isa Vector) # (true, true)
end

mcabbott on 14 Aug 2019

Thinking about "dims syntax" https://github.com/JuliaLang/julia/issues/16606#issuecomment-520983042 a bit more, sum(a[., ., :]) should probably return a lazy object so that y .= sum(a[., ., :]) does not have to allocate anything. Note that copyto!(::Array{N}, ::Axed{dims, <:Array{M}}) where N === M - length(dims) can automatically drop dimensions.

tkf on 17 Aug 2019

If this a[., ., :] object is something like what eachslice makes, then I suspect reductions should always drop dimensions, like prod.(eachcol(x)) does now. Although more efficiently, without breaking y .= prod.(eachcol(x)). Notice that prod(eachcol(x)) is an error.

mcabbott on 17 Aug 2019

If this a[., ., :] object is something like what eachslice makes

I should have clarified that it is not the case. My model of Axed is just "an array paired with dims" and not an iterator-of-iterators. I'd even define simple forwarding methods like:

ndims(a::Axed) = ndims(a.x)
size(a::Axed) = size(a.x)
getindex(a::Axed, I...) = getindex(a.x, I...)

Axed acts as a type-stable way to mark dimensions that can be dropped when there is no ambiguity (i.e., N === M - length(dims)).

tkf on 17 Aug 2019

Ah, I didn't realise you were thinking of Axed as behaving just as label, while I was thinking of it being like Slices (views of the array). Both ways allow something like dropdims(sum(A, dims=(2,4)), dims=(2,4)) to be written more visually. Are you picturing also using it for other things, perhaps like this?

sum(p .* log.(p[:,.] ./ q)

Here log needs to act on the scalars of Axed, which under Slices you'd have to write more like sum((p .* log.(p ./ q))[:,.])

In the other thread I suggested that the Slices view might allow all sorts of generalised mapslices! operations to be done neatly:

@views b[:, .] .= f.(a[:, .])  # foreach((x,y) -> x .= f(y), eachcol(b), eachcol(a))
c[:, .] := f.(a[:, .])         # c = mapslices(f, a, dims=1)

mcabbott on 17 Aug 2019

I implemented a function ReduceDims.along to demonstrate the idea. It's somewhat inspired by bramtayl's JuliennedArrays.jl and @mcabbott's comments. The syntax is something like

sum(along(x, &, &, :))

which is equivalent to sum(x; dims=(1, 2)). [1] Here is a demo

julia> using ReduceDims

julia> x = ones(2, 3, 4);

julia> y = sum(along(x, &, :, :))  # lazy sum(x; dims=1)
mapreduce(identity, add_sum, along(::Array{Float64,3}, (1,)))

julia> copy(y) isa Array{Float64, 3}
true

julia> dropdims(y) isa Array{Float64, 2}  # type stable
true

julia> zeros(1, 3, 4) .= y;  # non-allocating

julia> zeros(3, 4) .= y;  # automatically drop dims

With #31020, you can also fuse it with broadcasting:

julia> using LazyArrays: @~  # need LazyArrays until #19198 is resolved

julia> y = sum(along(@~(x .+ 1), :, &, :))
mapreduce(identity, add_sum, along(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3},Tuple{Base.OneTo{Int64},Base.OneTo{Int64},Base.OneTo{Int64}},typeof(+),Tuple{Array{Float64,3},Int64}}, (2,)))

julia> zeros(1, 3, 4) .= y;

[1] There is also + to indicate "repeat previous marker" (like regexp):

along(x::Array{<:Any, 2}, &, +, :) === along(x, &, :)
along(x::Array{<:Any, 3}, &, +, :) === along(x, &, &, :)
along(x::Array{<:Any, 4}, &, +, :) === along(x, &, &, &, :)
along(x::Array{<:Any, 2}, :, +, &) === along(x, :, &)
along(x::Array{<:Any, 3}, :, +, &) === along(x, :, :, &)
along(x::Array{<:Any, 4}, :, +, &) === along(x, :, :, :, &)

tkf on 18 Aug 2019

@mcabbott I think something like sum((p .* log.(p ./ q))[:,.]) should work with dims-as-label approach. In fact y .= sum(along(@~(p .* log.(p ./ q)), :, &)) should already work.

The reason behind dims-as-label approach is that this also support non-dropping API as well. IIUC, your nested approach would drop dimensions unless you explicitly mark the non-dropping axes in the destination (as in@views b[:, .] .= f.(a[:, .]))?

But I agree that iterator-of-iterators approach is quite useful especially because it lets you use functions that are not aware of dims. I think it's useful to have both while sharing the "axes specifier" syntax. For example, it could be implemented using syntax like eachslice(x, &, &, :).

tkf on 18 Aug 2019

Can we do something about this? Would be great if we can replace dropdims(sum(A; dims=1); dims=1) that I have all over my codebase with something less ugly.