Julia: cumsum inconvenient for RowVector

Created on 15 Jan 2017 · 48 Comments · Source: JuliaLang/julia

It feels like a RowVector should act one-dimensionally in the following:

julia> w = [1, 2, 3]'
1×3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w)
1×3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

julia> accumulate(+, w)
1×3 RowVector{Int64,Array{Int64,1}}:
 1  2  3

cf. https://github.com/JuliaLang/julia/issues/4774#issuecomment-262118551

Labels: arrays, decision


All 48 comments

So I just add a special case for a row vector, overriding the default?

Note that cumsum, accumulate, and accumulate! actually take a dimension argument, and when set to 2 (1 is the default), all work as expected on a RowVector. I think the current behavior makes sense, given that the dimension of accumulation is in all cases being provided, and a row vector by definition has a sense of multidimensionality. (Edit: Just saw Stefan's comment about how 1 should not be the default--seems reasonable.)

If we did decide to treat it as one-dimensional for these functions, we'd probably want to do the same for a larger group of functions to ensure consistency.

Isn't the point that it should be treated as one-dimensional when possible? As reflected in its name, [Row]Vector.

That wasn't my understanding given the inheritance structure:

julia> RowVector <: AbstractVector
false

julia> RowVector <: AbstractMatrix
true

Despite the inheritance structure, a RowVector is intuitively
1-dimensional. I vote that we treat it as such for cumsum and related
functions.


What (i.e. which methods) is (are) actually inherited by the "inheritance structure"?
As far as I understand, the fact that RowVector <: AbstractMatrix is more a technical implementation detail than a philosophical position.

So treat a 1-d row vector as a column vector for cumsum? Which other functions could be modified like this?

Given

julia> size([1, 2, 3]')
(1,3)

I'd say these are not acting 1-dimensional at all. There's a row orientation behavior here.

In any number of discussions around MIT, we felt cumsum of a row vector should be a row vector.
The philosophy is that a row vector might act like a matrix for broadcast and hcat/vcat, but otherwise it's one-dimensional. Not sure what large set of functions has to be thought about. Certain scan-like functions, though.

At one point I wanted to argue against row vectors acting two dimensionally even for broadcast, hcat and vcat, but after listening to Steven Johnson, I couldn't find a strong argument.

So far my list of the triple status of a Julia row vector is something like this:

  • A row vector does the linear algebra thing for star and transpose
  • A row vector does the 2d container thing for hcat, vcat, and broadcast
  • A row vector does the 1d thing for scan-like functions
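A quick illustration of the first two items (the third is the proposal under discussion rather than current behavior); a minimal sketch assuming the transposed-vector semantics described above:

```julia
v = [1, 2, 3]
w = v'            # row vector (RowVector in 0.6, Adjoint in later versions)

w * v             # linear-algebra behavior: inner product, the scalar 14
[w; w]            # container behavior: vcat produces a 2×3 matrix
w .+ [1 1 1]      # broadcast behavior: elementwise over a 1×3 shape
```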

My opinions (this week):
I don't love size returning (1,3) by the way.
and I think I'd really love for A[1,:] to return a row vector only when A is 2d (APL indexing otherwise)
but I can imagine that's perhaps not in the cards.

If we want to change this, we should be thorough and do it for all dimension-reducing functions that take a dimension argument. Has anyone collated a list of all of those? Having to special-case all of them rather than dispatching to the AbstractArray behavior would increase the maintenance burden and the number of methods we need to write for RowVector, and every package that implements its own dimension-reducing functions would also have to write additional methods for RowVector if the generic fallbacks are not consistent.

Collecting such a list might be a good opportunity for some rationalization and to decide whether we want to continue the Matlab-ism of having dimensions as a positional argument in many places (which leads to a number of ambiguities and inconsistent APIs), or make reduction calculation along specific dimensions more explicit via a keyword argument - assuming the performance would be comparable.

Notice that accumulate is not dimension-reducing. For mapreducedim-like functions, I think the only option is to treat RowVector as a matrix. I agree that we should try to define clearly which functions should be 1-d-like on RowVectors. Shape-preserving (independently of value) functions might be a criterion. The number of such functions should hopefully not grow too large. Otherwise RowVector <: AbstractMatrix might be more problematic than expected. I don't expect that to be the case, though.

In preparing the PR, I was leaving all the behavior of vec.' similar to now except for what * and .' do afterwards (as well as the copy-vs-view semantics). So for almost all functions, it is just a matrix, and that was intentional.

So that said, I'm not surprised by the behavior reported in the OP.

This doesn't mean we shouldn't aim for improvement in the future. But given that the "soft" feature freeze was 18 days ago, and the currently implemented behavior is the minimally breaking change to get the desired linear algebra properties, it was my assumption that further breaking changes had to wait until post v0.6. If this is not the consensus, we should decide that ASAP. But if we rush, we risk that the result may be somewhat incoherent.

The primary problem here is that scan functions default to dim=1 – I would argue that's wrong and inconsistent with sum, which sums the entire array when called without a dimension argument. We departed from Matlab's default dimension behavior for sum et al. for a reason.

Instead we could define scan functions to do something that's useful for row matrices or column matrices and consistent for both. I think a useful definition which fits the bill is to e.g. have cumsum compute this for matrices:

julia> [sum(A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)]
3×4 Array{Float64,2}:
 0.470239  0.960547  1.77296  2.7498
 1.03861   2.18006   3.8101   5.68694
 1.16206   2.33594   4.02084  6.01088

With this behavior cumsum(x) for a vector, column matrix, row matrix or row vector are all useful and consistent and do what you want.

Yes, I was thinking the same thing, actually!

Note that this computation is independent of memory layout modulo floating-point associativity. I.e. it would be independent if computations were exact, but we'll want to do the computation in memory order, scanning along the first column and then scanning along the second column, adding the corresponding element of the first column, etc. Regardless, I think that's good enough.
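A minimal sketch of that memory-order evaluation (the name sat and the two-pass structure are illustrative assumptions, not Base API): scan down each column first, then fold the running column totals in left to right.

```julia
# Summed-area-table sketch: S[i,j] == sum(A[1:i, 1:j]), computed in
# memory order (column-major) rather than by the naive comprehension.
function sat(A::AbstractMatrix)
    m, n = size(A)
    S = copy(A)
    for j in 1:n, i in 2:m
        S[i, j] += S[i-1, j]       # cumulative sum down each column
    end
    for j in 2:n, i in 1:m
        S[i, j] += S[i, j-1]       # add the prefix from the column to the left
    end
    return S
end
```

Up to floating-point associativity this matches `[sum(A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)]`, e.g. `sat([1 2; 3 4])` gives `[1 3; 4 10]`.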

Also known as Summed Area Table and Integral Image.

Also, if we want to generalize this to accumulate (which sounds plausible?), the order would need to be specified in case someone passes a non-associative function.

Right, if the operation is non-associative (or non-commutative), does the formula on the wiki page Gunnar posted still hold?

Continuing the thought of generalizing this idea to accumulate(op, A), I'd suggest it should mimic [reduce(op, A[1:i,1:j]) for i=1:size(A,1), j=1:size(A,2)] as per https://github.com/JuliaLang/julia/issues/20041#issuecomment-275938676 (and likewise for N≠2 dimensions), noting that reduce makes no guarantees regarding associativity. However, the above formula could obviously not be applied, as it would need the inverse of op. But unless I'm mistaken, it should still be possible to evaluate the result in O(length(A)) time and O(1) extra memory in addition to the output, right?
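Assuming the reduction order is unspecified (which reduce permits), the O(length(A))-time evaluation asked about above could be sketched as two scan passes (rectaccumulate is a hypothetical name, not a proposed API):

```julia
# Hypothetical sketch: R[i,j] is op folded over the rectangle A[1:i, 1:j],
# computed with one op application per element per pass. The grouping of
# applications is unspecified, as with `reduce`.
function rectaccumulate(op, A::AbstractMatrix)
    m, n = size(A)
    S = copy(A)
    for j in 1:n, i in 2:m
        S[i, j] = op(S[i-1, j], S[i, j])   # scan down each column
    end
    for j in 2:n, i in 1:m
        S[i, j] = op(S[i, j-1], S[i, j])   # combine column prefixes left to right
    end
    return S
end
```

With op = + this reproduces the summed area table; with non-invertible ops like max it still works in one sweep, e.g. `rectaccumulate(max, [1 5; 3 2])` gives `[1 5; 3 5]`.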

Resolving #23018 first would make this better perhaps?

I believe this can now be closed? Current behaviour is as follows:

julia> w = [1, 2, 3]'
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w)
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w, 1)
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  2  3

julia> cumsum(w, 2)
1×3 Adjoint{Int64,Array{Int64,1}}:
 1  3  6

I think what needs to be done for the 1.0 milestone is done, but the full proposed behavior isn't implemented, so maybe just move it off the milestone?

Sure. Is the deprecation / removal of RowVector on the milestone?

Yes, in #5332 and related :).

Just to be sure: is cumsum([1,2,3]') going to be an error or equal to [1,3,6]'?

I imagine I misunderstand your question, so let's iterate :).

My understanding: If we remove but do not replace the deprecated single-argument accumulate-like methods in 1.0, then cumsum([1,2,3]') should yield a MethodError. Alternatively, between 0.7 and 1.0 we can introduce an accumulate specialization for Adjoint/Transpose vectors that behaves as described in https://github.com/JuliaLang/julia/issues/19451#issuecomment-327247222 's second list item, in which case cumsum([1,2,3]') should yield [1,3,6]'. I do not know what the working plan is. Thoughts? :)

I'm fine with deprecating and adding full functionality later – just checking 👍

This is deprecated, so moving off the milestone. In 1.x we can implement the summed area table interpretation if we want.

This thread's concerns might still apply to Adjoint/Transpose; I haven't thought it through. Perhaps label triage to make certain?

They all seem to be deprecated so I think we're fine for now:

julia> cumsum(rand(3)')
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Adjoint{Float64,Array{Float64,1}}:
 0.635384  0.89566  0.875083

julia> cumsum(Adjoint(rand(3)))
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Adjoint{Float64,Array{Float64,1}}:
 0.58616  0.56026  0.552915

julia> cumsum(Transpose(rand(3)))
┌ Warning: `cumsum(A::AbstractArray)` is deprecated, use `cumsum(A, 1)` instead.
│   caller = top-level scope
└ @ Core :0
1×3 Transpose{Float64,Array{Float64,1}}:
 0.380938  0.0475339  0.624499

Are there any other functions like cumsum that we should consider here? diff?

Yes, that's a good point: diff, cumprod (seems already ok), accumulate (already an error for everything but vectors). So I think that diff is the only one still left with the undesirable behavior:

julia> diff([1,2,3]')
0×3 Array{Int64,2}

whereas it should behave more like this:

julia> diff([1,2,3]', 2)
1×2 Array{Int64,2}:
 1  1

however, I think that should probably produce a row vector when the input is a row vector.
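A hypothetical specialization along those lines (the name rowdiff and its placement are illustrative, not proposed Base code):

```julia
# Hypothetical sketch: diff a row vector along its data, keeping the row shape,
# by differencing the underlying vector and transposing back.
rowdiff(w) = diff(vec(w))'
```

For example, `rowdiff([1, 2, 3]')` gives the 1×2 row `[1 1]`, matching the `diff(…, 2)` result above while preserving row-ness.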

I think that should probably produce a row vector when the input is a row vector.

Adding to the list :). Though IIRC deprecating diff was on the table recently?

Deprecating it entirely? But it seems like such a basic, useful function. Note that n-d diff can compute the inverse operation of the partial sums table proposed for cumsum which would have the pleasant property that diff(cumsum(A)) == A[1:end-1,...,1:end-1] would always be true. Of course, it might be nicer to have a design where you don't have to slice off the last of each dimension in order for that identity to hold.

IIRC both deprecating diff in favor of a more precise name and deprecating diff entirely were discussed. IIRC the primary motivating concern was the name diff's generality combined with parties' desire to use that name for other purposes outside base. Best!

I'm rather surprised to find that diff is only used as the Base.diff() function _once_ in all of base. It's used as a different binding 107 times. I'd also be sad to see it go — but I've often seen it abused as a derivative, in which case off-by-one (or off-by-half) mistakes and dimension mismatches are extremely common. E.g., the property is that diff(cumsum(A)) == A[2:end,…,2:end], not 1:end-1.

All that said, my preference would be to fix this behavior and let it be.
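In the 1-d case the corrected identity is easy to check directly:

```julia
a = [5, 7, 11, 13]
cumsum(a)                      # [5, 12, 23, 36]
diff(cumsum(a)) == a[2:end]    # true: differencing partial sums recovers the tail
```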

I'm rather surprised to find that diff is only used as the Base.diff() function once in all of base. It's used as a different binding 107 times. I'd also be sad to see it go — but I've often seen it abused as a derivative, in which case off-by-one (or off-by-half) mistakes and dimension mismatches are extremely common. E.g., the property is that diff(cumsum(A)) == A[2:end,…,2:end], not 1:end-1.

These observations seem like a strong argument for at least a more appropriate / specific name?

decumulate(-, A)?

As an instance of decumulate[!](op, [y,] x::AbstractVector) and decumulate[!](op, [B,] A, dim)? Intriguing :).

Of course, one might just want to call decumulate by the name diff and allow passing an operator as a generalization :)

One might, but that 107:1 ratio... 😄

diff exists in R, Numpy and MATLAB, so I'd rather keep that name.

diff exists in R, Numpy and MATLAB, so I'd rather keep that name.

The evidence strongly suggests that this is something Julia can do better :).

The evidence strongly suggests that this is something Julia can do better :)

We're going to start calling you "Sacha, breaker of eggs" soon... 😝

We're going to start calling you "Sacha, breaker of eggs" soon... 😝

Regrettably I am ignorant of the reference; enlighten me? :)

Is Julia an omelette? 😄
