Julia: aliasing and sparse broadcast

Created on 3 May 2017 · 19Comments · Source: JuliaLang/julia

Sparse broadcast does not yet check whether the input arguments alias the output arguments, causing e.g.

julia> versioninfo()
Julia Version 0.7.0-DEV.16
Commit 5f296f3 (2017-05-03 18:24 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin15.6.0)
  CPU: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, ivybridge)

julia> A = sprandn(10, 10, 0.1)
10×10 SparseMatrixCSC{Float64,Int64} with 12 stored entries:
  [3 ,  1]  =  -0.270259
  [1 ,  3]  =  0.555567
  [3 ,  3]  =  -0.360022
  [9 ,  3]  =  0.567621
  [10,  5]  =  0.638384
  [4 ,  6]  =  1.75078
  [5 ,  6]  =  0.138524
  [1 ,  8]  =  1.1261
  [4 ,  9]  =  1.3175
  [3 , 10]  =  1.72912
  [7 , 10]  =  1.27993
  [8 , 10]  =  -0.422809

julia> b = randn(10);

julia> broadcast!(/, A, A, b)
10×10 SparseMatrixCSC{Float64,Int64} with 100 stored entries:
  [1 ,  1]  =  NaN
  [2 ,  1]  =  NaN
  [3 ,  1]  =  NaN
  [4 ,  1]  =  NaN
  [5 ,  1]  =  NaN
  [6 ,  1]  =  NaN
  [7 ,  1]  =  NaN
  [8 ,  1]  =  NaN
  [9 ,  1]  =  NaN
  [10,  1]  =  NaN
  [1 ,  2]  =  NaN
  [2 ,  2]  =  NaN
  [3 ,  2]  =  NaN
  [4 ,  2]  =  NaN
  [5 ,  2]  =  NaN
  [6 ,  2]  =  NaN
  [7 ,  2]  =  NaN
  [8 ,  2]  =  NaN
  ⋮
  [2 ,  9]  =  NaN
  [3 ,  9]  =  NaN
  [4 ,  9]  =  NaN
  [5 ,  9]  =  NaN
  [6 ,  9]  =  NaN
  [7 ,  9]  =  NaN
  [8 ,  9]  =  NaN
  [9 ,  9]  =  NaN
  [10,  9]  =  NaN
  [1 , 10]  =  NaN
  [2 , 10]  =  NaN
  [3 , 10]  =  NaN
  [4 , 10]  =  NaN
  [5 , 10]  =  NaN
  [6 , 10]  =  NaN
  [7 , 10]  =  NaN
  [8 , 10]  =  NaN
  [9 , 10]  =  NaN
  [10, 10]  =  NaN

Best!

broadcast bug sparse

Source

Sacha0

Most helpful comment

Since it's a corruption bug, it can be fixed at any time, even if it was in Base, but it doesn't sound very serious to me to release the 1.0 version of a technical computing language with such a dangerous bug.

nalimilan on 23 Jan 2018

👍5

All 19 comments

Maybe for now it should just throw an error? Seems like this would be difficult to make work unless the sparsity pattern is not changed by the broadcast.

stevengj on 4 May 2017

Seems like this would be difficult to make work unless the sparsity pattern is not changed by the broadcast.

Agreed, strict in place operation with input-output aliasing doesn't seem possible in general.

Maybe for now it should just throw an error?

My thinking as well. Making copies of the input arguments that alias the output argument would be the alternative / a potential future improvement. Best!

Sacha0 on 4 May 2017

This should really throw an error instead of silently giving the wrong result.

jebej on 19 Jun 2017

It would be nice to do something to prevent this bug from happening in 1.0. It gives a very bad impression of the language's safety. Making a copy of the input when it's === to the output would at least protect from corruption.

nalimilan on 18 Jan 2018

👍2

I think error is the better way to do safety here than copying. One normally uses .= to avoid copies, so it would lead to some hard to debug allocations when you assume that the operation won't copy. On the other hand, an error that says "you can't do aliasing here" is pretty clear and tells the programmer exactly what they need to change.

ChrisRackauckas on 22 Jan 2018

👍1

The problem is, the person who wrote the .= line (possibly deep in a call stack) might not have anticipated to get a sparse matrix, so if we raise an error the user who passed a sparse matrix wouldn't have any choice other than making it dense, which isn't generally possible (or very costly).

nalimilan on 22 Jan 2018

Yes, an implementation that does a potentially unnecessary allocation seems way more useful than one that just errors. (But yea, the latter is still better than one that silently gives wrong results.)

martinholters on 22 Jan 2018

Does this have to block 1.0 given that sparse is now in stdlib?

ViralBShah on 23 Jan 2018

nalimilan on 23 Jan 2018

👍5

I don't really see what the controversy here is. Why can't we check for aliasing and make a copy when we detect it? Surely making a copy is better than getting complete garbage? If possible, we can improve the performance in the future, but for now giving a correct answer seems like a no-brainer.

StefanKarpinski on 23 Jan 2018

👍2

Seems like there is no controversy. Just needs to get done.

ViralBShah on 23 Jan 2018

Ok, since this is a bug, it's not a feature freeze blocker, but we really should get it done for 1.0.

StefanKarpinski on 23 Jan 2018

Much thanks Matt! 🎉 :)

Sacha0 on 27 Feb 2018

👍2 🎉1

So does a .= b now perform a copy if the two arrays alias? I'm confused by what the effects of this are on slightly different cases than the OP.

ChrisRackauckas on 27 Feb 2018

Yes, that's correct, with the slight caveat that it will _not_ create a copy in the case where a === b, which is relevant (and important) for a .+= 2. In that case — and that case alone — we know that the iteration order within broadcast will not cause any spooky action at a distance.

If, however, b is a view of a (or any parts of a), then b could re-arrange the indices, changing the access pattern, and causing corruption.

mbauman on 27 Feb 2018

What is the difference now between using broadcast with sparse arrays vs not using it?
Using dots allows fusion with regular arrays, but that doesn't seem easily doable with sparse arrays given the sparsity.

jebej on 27 Feb 2018

That's an awfully broad question. I'm not sure I fully understand it. But sparse arrays do have specialized implementations of broadcasting that tries very hard to exploit the sparsity, and they do fuse. In-place assignment is only a big win if you already have an output array with the expected output sparsity pattern.

mbauman on 27 Feb 2018

My question was mostly about fusion.

So something like A .+ 3.*B .- sin.(C)./2 will fuse and use less memory than without the dot. If so that's pretty cool :)

In-place assignment is only a big win if you already have an output array with the expected output sparsity pattern

Right, and this seems hard to predict without knowing the input in advance.

jebej on 27 Feb 2018

To avoid allocation, knowing the result's storage size suffices; the pattern isn't strictly necessary. Though I am not certain under what conditions you would know the result's storage size but not its pattern :).

Sacha0 on 27 Feb 2018

Was this page helpful?

0 / 5 - 0 ratings