Julia: Broadcast with SparseArrays and non-constant functions

Created on 12 Sep 2018 · 9 comments · Source: JuliaLang/julia

I'm opening this issue following brief discussions on Slack with @mbauman, @StefanKarpinski, and @MikeInnes.

using LinearAlgebra, Random, SparseArrays
rng = MersenneTwister(123456)
A = sparse(I, 3, 3)
B = broadcast(_->randn(rng), A)

makes B a sparse matrix with entries

3×3 SparseMatrixCSC{Any,Int64} with 9 stored entries:
  [1, 1]  =  1.82299
  [2, 1]  =  -0.620399
  [3, 1]  =  -0.620399
  [1, 2]  =  -0.620399
  [2, 2]  =  1.06094
  [3, 2]  =  -0.620399
  [1, 3]  =  -0.620399
  [2, 3]  =  -0.620399
  [3, 3]  =  0.54373

where a separate value has been sampled for each stored (non-zero) entry of the matrix, but a single value has been sampled and used to populate all of the previously-zero elements. This is apparently a design choice, which can be summarised as assuming that a function broadcast over a sparse array is pure (a sketch of these semantics follows the proposal below). I believe it is reasonable to revisit this choice because

  • The behaviour is inconsistent with dense matrices. This inconsistency leads to (I would argue) undesirable behaviour, such as
using Random
rng, rng′ = MersenneTwister(123456), MersenneTwister(123456)
broadcast(_->randn(rng), Matrix(sparse(I, 3, 3)))
Matrix(broadcast(_->randn(rng′), sparse(I, 3, 3)))

not yielding the same answers, i.e. the result of our operations depends crucially on the order in which we convert between (supposedly) equivalent representations of the same matrix.

  • The behaviour is unintuitive. Put bluntly, I cannot imagine this being what _anyone_ would a priori expect to happen.
  • The strength of any performance-related argument in favour of this behaviour isn't clear to me, as typical usage of a sparse matrix doesn't involve fully populating it.

In short, I would propose changing this behaviour in favour of consistency. (It's possible that the output should automatically be converted to a dense matrix.)
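
To make the current semantics concrete, the observed behaviour is as if broadcast evaluated f once on a representative zero, used that single result for every structurally-zero position, and then evaluated f separately on each stored entry. The following is only an illustrative sketch of those semantics, not SparseArrays' actual implementation, and the helper name broadcast_like_sparse is made up:

using SparseArrays

function broadcast_like_sparse(f, A::SparseMatrixCSC)
    fillvalue = f(zero(eltype(A)))      # evaluated once; assumes f is pure
    B = fill(fillvalue, size(A))        # every structurally-zero position shares it
    for (i, j, v) in zip(findnz(A)...)
        B[i, j] = f(v)                  # stored entries each get their own call
    end
    return sparse(B)
end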



All 9 comments

Thanks for putting this issue together. I would tend to agree — the savings from avoiding the repeated function calls and filling with the same value are likely to be completely swamped by the huge performance cost that comes from blowing up the storage of the sparse array. I would suspect that's why this hasn't garnered much attention; it's really something I'd ideally avoid doing in the first place.

A simpler example is SparseMatrixCSC(I, 3, 3) .+ rand.().
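
Under the current behaviour the same effect can be checked directly: all structurally-zero positions receive one shared sample, while stored entries are sampled separately. A sketch, assuming the fill-value semantics described above:

using LinearAlgebra, Random, SparseArrays
X = sparse(I, 3, 3) .+ rand.()
X[1, 2] == X[2, 1] == X[1, 3]   # true: one sample shared across the zero pattern
X[1, 1] == X[2, 2]              # almost surely false: diagonal entries sampled separately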

CC @Sacha0.

by the huge performance cost that comes from blowing up the storage of the sparse array

And what if it isn't blown up (i.e. if f(0) == 0)?

Ah, right, of course — thanks. We need to assume that f(0) is constant and side-effect free if we want the O(nnz) optimization when the first evaluation of f(0) == 0.

I agree that optimising for f(0) == 0 makes a lot of sense, provided that guarantees regarding the purity of f can be made, as @mbauman points out. We cannot safely assume this in general, though.
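
To make the trade-off concrete, here is a minimal sketch of the fast path under discussion, assuming f is pure (the helper name sparse_broadcast_sketch is made up): when the single trusted evaluation of f(0) is zero, the sparsity pattern is reused and only the stored values are touched, giving O(nnz) work. The non-zero case is shown here as full materialisation for simplicity, which is not exactly what SparseArrays does:

using SparseArrays

function sparse_broadcast_sketch(f, A::SparseMatrixCSC)
    z = f(zero(eltype(A)))              # evaluated once; purity assumed
    if iszero(z)
        # O(nnz) path: reuse A's structure and apply f to the stored values only
        return SparseMatrixCSC(size(A)..., copy(A.colptr), copy(A.rowval),
                               map(f, nonzeros(A)))
    else
        # zeros do not map to zero; shown here as full materialisation for simplicity
        return sparse(map(f, Matrix(A)))
    end
end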

Without that optimization, broadcasting on sparse arrays is pretty much useless. There is an argument for consistency, but a more important argument to me is usability. Making things consistent at the cost of making them unusable is not a good trade-off in my world.

Striving for f(m::SparseArray) == f(Array(m)) is a good idea, but it always needs to be kept in check by the reason we are using sparse data structures in the first place.

As another example:

julia> spzeros(2,2) * fill(NaN, 2)
2-element Array{Float64,1}:
 0.0
 0.0

See also https://github.com/JuliaLang/julia/issues/22733.
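
For contrast, the dense analogue propagates the NaN, since 0.0 * NaN is NaN under IEEE arithmetic; the zero-skipping sparse kernel is what makes the two results differ:

julia> zeros(2, 2) * fill(NaN, 2)
2-element Array{Float64,1}:
 NaN
 NaN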

Yup. We simply don't have any generic mechanism that flags or otherwise detects a "pure" function. One interesting thought would be to ask inference if constant propagation led it to infer a constant zero, but that would still be seriously limiting.

We need to assume that f(0) is constant and side-effect free if we want the O(nnz) optimization when the first evaluation of f(0) == 0.

Without that optimization, broadcasting on sparse arrays is pretty much useless.

We simply don't have any generic mechanism that flags or otherwise detects a "pure" function.

^ These :).

Okay, I take your point regarding usability for the case that f(0) == 0. It is a shame that we don't have a mechanism for determining purity straightforwardly...

Could this assumption regarding purity please be highlighted in both SparseArrays and broadcast's documentation / docstrings? If there's an overhaul planned then maybe now is a good time to do it?

Could this assumption regarding purity please be highlighted in both SparseArrays and broadcast's documentation / docstrings?

Pull requests to make this more explicit would be welcome. You're in a good position to write them, since you know what users in a similar situation to yours might be looking for.

