Julia: cumsum inconvenient for RowVector

Created on 15 Jan 2017  Β·  48Comments  Β·  Source: JuliaLang/julia

It feels like a RowVector should act one-dimensionally in the following:

julia> w = [1, 2, 3]'
1Γ—3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w)
1Γ—3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

julia> accumulate(+, w)
1Γ—3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

cf. https://github.com/JuliaLang/julia/issues/4774#issuecomment-262118551

arrays decision

Most helpful comment

The primary problem here is that scan functions default to dim=1 when – I would argue that's wrong and inconsistent with sum which sums the entire array when called without a dimension argument. We departed from Matab's default dimension behavior for sum et al. for a reason.

Instead we could define scan functions to do something that's useful for row matrices or column matrices and consistent for both. I think a useful definition which fits the bill is to e.g. have cumsum compute this for matrices:

julia> [sum(A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)]
3Γ—4 Array{Float64,2}:
 0.470239  0.960547  1.77296  2.7498
 1.03861   2.18006   3.8101   5.68694
 1.16206   2.33594   4.02084  6.01088

With this behavior cumsum(x) for a vector, column matrix, row matrix or row vector are all useful and consistent and do what you want.

All 48 comments

So I just add a special case for a row vector, overriding the default?

Note that cumsum, accumulate, and accumulate! actually take a dimension argument, and when set to 2 (1 is the default), all work as expected on a RowVector. I think the current behavior makes sense, given that the dimension of accumulation is in all cases being provided, and a row vector by definition has a sense of multidimensionality. (Edit: Just saw Stefan's comment about how 1 should not be the default--seems reasonable.)

If we did decide to treat it as one-dimensional for these functions, we'd probably want to do the same for a larger group of functions to ensure consistency.

Isn't the point that it should be treated as one-dimensional when possible? As reflected in its name, [Row]Vector.

That wasn't my understanding given the inheritance structure:

julia> RowVector <: AbstractVector
false

julia> RowVector <: AbstractMatrix
true

Despite the inheritance structure, a RowVector is intuitively
1-dimensional. I vote that we treat it as such for cumsum and related
functions.

On Sat, Jan 14, 2017 at 9:50 PM, John Myles White notifications@github.com
wrote:

That wasn't my understanding giving the inheritance structure:

julia> RowVector <: AbstractVector
false

julia> RowVector <: AbstractMatrix
true

β€”
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/JuliaLang/julia/issues/20041#issuecomment-272675554,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AANoEhG7BM3aliKhfnzzrDBTCTt0Rk2Fks5rSbOXgaJpZM4LjzCg
.

What (i.e. which methods) is (are) actually inherited by the "inheritance structure"?
As far as I understand, the fact that RowVector <: AbstractMatrix is more a technical implementation detail than a philosophical position.

So treat 1D row vector as column vector for cumsum? Which other functions can be modify like this?

Given

julia> size([1, 2, 3]')
(1,3)

I'd say these are not acting 1-dimensional at all. There's a row orientation behavior here.

in any number of discussions around mit, we felt cumsum of a row vector should be a row vector.
The philosophy is that a row vector might act like a matrix for broadcast and hcat/vcat but otherwise it's one dimensional. Not sure what large set of functions have to be thought about. Certain scan like functions though.

At one point I wanted to argue against row vectors acting two dimensionally even for broadcast, hcat and vcat, but after listening to Steven Johnson, I couldn't find a strong argument.

So far my list of the triple status of a Julia row vector is something like this:

  • A row vector does the linear algebra thing for star and transpose
  • A row vector does the 2d container thing for hcat, vcat, and broadcast
  • A row vector does the 1d thing for scan like functions

My opinions (this week):
I don't love size returning (1,3) by the way.
and I think I'd really love for A[1,:] to return a row vector only when A is 2d (apl indexing otherwise)
but I can imagine that's perhaps not in the cards.

If we want to change this, we should be thorough and do it on all dimension-reducing functions that take a dimension argument. Has anyone collated a list of all of those? Having to special-case all of them rather than dispatching to the AbstractArray behavior would increase the maintenance burden and number of methods we need to write for RowVector, and every package that implements dimension-reducing argument functions of its own would also have to write additional methods for RowVector if the generic fallbacks are not consistent.

Collecting such a list might be a good opportunity for some rationalization and to decide whether we want to continue the Matlab-ism of having dimensions as a positional argument in many places (which leads to a number of ambiguities and inconsistent APIs), or make reduction calculation along specific dimensions more explicit via a keyword argument - assuming the performance would be comparable.

Notice that accumulate is not dimension reducing. For mapreducesim like functions, I think the only option is to use treat RowVector as a matrix. I agree that we should try to define clearly which functions should be 1-d like on RowVectors. Shape preserving (independently of value) functions might be a criterion. The number of such functions should hopefully not grow too large. Otherwise RowVector <: AbstractMatrix might be more problematic than expected. I don't expect that to be the case, though.

In preparing the PR, I was leaving all the behavior of vec.' similar to now except for what * and .' does afterwards (as well as the copy-vs-view semantic). So for almost all functions, it is just a matrix, and that was intentional.

So that said, I'm not surprised by the by the behavior reported in the OP.

This doesn't mean we shouldn't aim for improvement in the future. But given that "soft" feature-freeze was 18 days ago, and what is the current implemented behavior is the minimally breaking change to get the linear algebra properties desired, it was my assumption that further breaking changes had to wait until post v0.6. If this is not the consensus, we should decide that ASAP. But if we rush we risk that the result may be somewhat incoherent.

The primary problem here is that scan functions default to dim=1 when – I would argue that's wrong and inconsistent with sum which sums the entire array when called without a dimension argument. We departed from Matab's default dimension behavior for sum et al. for a reason.

Instead we could define scan functions to do something that's useful for row matrices or column matrices and consistent for both. I think a useful definition which fits the bill is to e.g. have cumsum compute this for matrices:

julia> [sum(A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)]
3Γ—4 Array{Float64,2}:
 0.470239  0.960547  1.77296  2.7498
 1.03861   2.18006   3.8101   5.68694
 1.16206   2.33594   4.02084  6.01088

With this behavior cumsum(x) for a vector, column matrix, row matrix or row vector are all useful and consistent and do what you want.

Yes, I was thinking the same thing, actually!

Note that this computation is independent of memory layout modulo floating-point associativity. I.e. it would be independent if computations were exact, but we'll want to do the computation in memory order, scanning along the first column and then scanning along the second column, adding the corresponding element of the first column, etc. Regardless, I think that's good enough.

Also known as Summed Area Table and Integral Image.

Also, if we want to generalize this to accumulate (which sounds plausible?), the order would need to be specified in case someone passes a non-associative function.

Right, if the operation is non-associative (or commutative) then does the formula on the wiki page Gunnar posted hold still?

image

Continuing the thought of generalizing this idea to accumulate(op, A), I'd suggest it should mimic [reduce(op, A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)] as per https://github.com/JuliaLang/julia/issues/20041#issuecomment-275938676 (and likewise for N≠2 dimensions), noting that reduce makes no guarantees regarding associativity. However, above formula could obviously not be applied as it would need the inverse of op. But unless I'm mistaken, it should still be possible to evaluate the result in O(length(A)) time and O(1) extra memory in addition to the output, right?

Resolving #23018 first would make this better perhaps?

I believe this can now be closed? Current behaviour is as follows:

julia> w = [1, 2, 3]'
1Γ—3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w)
β”Œ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
β”‚   caller = top-level scope
β”” @ Core :0
1Γ—3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w, 1)
1Γ—3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w, 2)
1Γ—3 Adjoint{Int64,Array{Int64,1}}:
 1  3  6

I think what needs to be done for the 1.0 milestone is done, but the full proposed behavior isn't implemented, so maybe just move it off the milestone?

Sure. Is the deprecation / removal of RowVector on the milestone?

Yes, in #5332 and related :).

Just to be sure: is cumsum([1,2,3]') going to be an error or equal to [1,3,6]'?

I imagine I misunderstand your question, so let's iterate :).

My understanding: If we remove but do not replace the deprecated single-argument accumulate-like methods in 1.0, then cumsum([1,2,3]') should yield a MethodError. Alternatively, between 0.7 and 1.0 we can introduce an accumulate specialization for Adjoint/Transpose vectors that behaves as described in https://github.com/JuliaLang/julia/issues/19451#issuecomment-327247222 's second list item, in which case cumsum([1,2,3]') should yield [1,3,6]'. I do not know what the working plan is. Thoughts? :)

I'm fine with deprecating and adding full functionality latere – just checking πŸ‘

This is deprecated, so moving off the milestone. In 1.x we can implement the summed area table interpretation if we want.

This thread's concerns might still apply to Adjoint/Transpose; I haven't thought it through. Perhaps label triage to make certain?

They all seem to be deprecated so I think we're fine for now:

julia> cumsum(rand(3)')
β”Œ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
β”‚   caller = top-level scope
β”” @ Core :0
1Γ—3 Adjoint{Float64,Array{Float64,1}}:
 0.635384  0.89566  0.875083

julia> cumsum(Adjoint(rand(3)))
β”Œ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
β”‚   caller = top-level scope
β”” @ Core :0
1Γ—3 Adjoint{Float64,Array{Float64,1}}:
 0.58616  0.56026  0.552915

julia> cumsum(Transpose(rand(3)))
β”Œ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
β”‚   caller = top-level scope
β”” @ Core :0
1Γ—3 Transpose{Float64,Array{Float64,1}}:
 0.380938  0.0475339  0.624499

Are there any other functions like cumsum that we should consider here? diff?

Yes, that's a good point: diff, cumprod (seems already ok), accumulate (already an error for everything but vectors). So I think that diff is the only one still left with the undesirable behavior:

julia> diff([1,2,3]')
0Γ—3 Array{Int64,2}

whereas it should behave more like this:

julia> diff([1,2,3]', 2)
1Γ—2 Array{Int64,2}:
 1  1

however, I think that should probably produce a row vector when the input is a row vector.

I think that should probably produce a row vector when the input is a row vector.

Adding to the list :). Though IIRC deprecating diff was on the table recently?

Deprecating it entirely? But it seems like such a basic, useful function. Note that n-d diff can compute the inverse operation of the partial sums table proposed for cumsum which would have the pleasant property that diff(cumsum(A)) == A[1:end-1,...,1:end-1] would always be true. Of course, it might be nicer to have a design where you don't have to slice off the last of each dimension in order for that identity to hold.

IIRC both deprecating diff in favor of a more precise name and deprecating diff entirely were discussed. IIRC the primary motivating concern was the name diff's generality combined with parties' desire to use that name for other purposes outside base. Best!

I'm rather surprised to find that diff is only used as the Base.diff() function _once_ in all of base. It's used as a different binding 107 times. I'd also be sad to see it go β€” but I've often seen it abused as a derivative, in which case off-by-one (or off-by-half) mistakes and dimension mismatches are extremely common. E.g., the property is that diff(cumsum(A)) == A[2:end,…,2:end], not 1:end-1.

All that said, my preference would be to fix this behavior and let it be.

I'm rather surprised to find that diff is only used as the Base.diff() function once in all of base. It's used as a different binding 107 times. I'd also be sad to see it go β€” but I've often seen it abused as a derivative, in which case off-by-one (or off-by-half) mistakes and dimension mismatches are extremely common. E.g., the property is that diff(cumsum(A)) == A[2:end,…,2:end], not 1:end-1.

These observations seem like a strong argument for at least a more appropriate / specific name?

decumulate(-, A)?

As an instance of decumulate[!](op, [y,] x::AbstractVector) and decumulate[!](op, [B,] A, dim)? Intriguing :).

Of course, one might just want to call decumulate by the name diff and allow passing an operator as a generalization :)

One might, but that 107:1 ratio... πŸ˜„

diff exists in R, Numpy and MATLAB, so I'd rather keep that name.

diff exists in R, Numpy and MATLAB, so I'd rather keep that name.

The evidence strongly suggests that this is something Julia can do better :).

The evidence strongly suggests that this is something Julia can do better :)

We're going to start calling you "Sacha, breaker of eggs" soon... 😝

We're going to start calling you "Sacha, breaker of eggs" soon... 😝

Regrettably I am ignorant of the reference; enlighten me? :)

Is Julia an omelette? πŸ˜„

Was this page helpful?
0 / 5 - 0 ratings

Related issues

TotalVerb picture TotalVerb  Β·  3Comments

iamed2 picture iamed2  Β·  3Comments

i-apellaniz picture i-apellaniz  Β·  3Comments

Keno picture Keno  Β·  3Comments

omus picture omus  Β·  3Comments