Julia: Make `cumsum(A)` compute a summed-area table

Created on 11 Mar 2018  Â·  8Comments  Â·  Source: JuliaLang/julia

The idea has been discussed in a few places (e.g. https://github.com/JuliaLang/julia/issues/20041#issuecomment-275938676) to make cumsum(A) return a summed-area table for multidimensional arrays.

The 1D case is already a summed-area table, since that's just the cumulative sum along the one dimension.

Happily, the appropriate deprecation for cumsum — deprecating it for AbstractArrays with no dims argument — is already in place, so this is just a tracking issue for the idea.

cc: @simonbyrne

arrays good first issue

Most helpful comment

More than just cumsum — cumprod, accumulate and maybe diff should be considered at the same time, too. Any others?

All 8 comments

More than just cumsum — cumprod, accumulate and maybe diff should be considered at the same time, too. Any others?

We should at least support it when dims is passed a tuple.

@yurivish Something like this?

A = [31 2;12 26;13 17]

import Base.cumsum

function cumsum(A::AbstractArray, dims::Tuple{Int})
    out = copy(A)
    for i in dims
        cumsum!(out, out, i)
    end
    return out
end

cumsum(A, (1,2))
#=
3×2 Array{Int64,2}:
 31   33
 43   71
 56  101
=#

#Expected behavior is consistent
cumsum(A, (1,2)) == cumsum(cumsum(A, 1), 2)
#true

#Compatibility with the basic definition is guaranteed
cumsum(A, 1) == cumsum(A, (1,))
#true
cumsum(A, 2) == cumsum(A, (2,))
#true

It works even for subsets of 1:ndims(A) just as well

A = cat(3, A, A, A)
cumsum(A, (1,2))
#=3×2×3 Array{Int64,3}:
[:, :, 1] =
 31   33
 43   71
 56  101

[:, :, 2] =
 31   33
 43   71
 56  101

[:, :, 3] =
 31   33
 43   71
 56  101=#

I can work this into a PR. Just need some time to get acquainted to JuliaLang PR policies, so devs can assign me this if you want.

GitHub only allows assignment to members of an organization, but you can certainly work on this if you like. Keep in mind that one will want to compute the summed area table in an order which is cache-friendly (i.e. compute along columns first for dense arrays).

I think that definition of cumsum is nice because

  • it generalizes well to accumulate because it defines precisely what should happen for a non-commutative function argument
  • it even has a defined behavior for dimensions occurring more than once in dims.

Yes, it does seem like a nice definition. If I'm understanding correctly, it's compatible with computing a summed area table as well, assuming that the default is cumsum(A, dims=1:ndims(A)). With perfectly commutative arithmetic, the order of summation doesn't matter so we can optimize it to choose the best order for summing the dimensions, but now that I'm thinking about it based on this observation you're doing the same set of summation operations no matter what so maybe that doesn't actually matter.

Hi, If the order of summation doesn't matter then I think simply calculating the cumulative sum will be good, right? If nobody is working on this I would like to contribute.
P.S.: I am new to Julia(motivated by the concept of this lang) so any beginners advice?

No, I would say just try it and ask for help on the #my-first-pr channel on julialang.slack.com if you need help with anything.

Was this page helpful?
0 / 5 - 0 ratings