Julia: SparseMatrix and SparseArray don't consitently define zero elements and allowed element types

Created on 26 Nov 2017 · 12Comments · Source: JuliaLang/julia

When I first worked with sparse matrices, I was convinced, that the element type of those objects were AbstractFloat types. I am easily convinced, that all Number types make sense as well. Nevertheless there there seems to be a sloppy idea of which element types are actually intended.

The constructors of sparse arrays accept any element type. But you can call certain functions on sparse arrays only, if the element type supports certain methods. So the element types are de facto restricted by a kind of duck typing. The methods, which I refer to, include != 0, == 0, != zero, == zero, isequal, iszero for the question, if an element to be processed is zero or not.
I see, that all those alternatives lead to the same result for Numbers (with the exception of == and isequal, which deserves a separate consideration in the case of ±0.0).
But if I want to use a different element type Tv, I have to support this type with a consistent behaviour for all of those methods. That is far more inconvenient than just have to define zero(::Type{Tv}), what would be sufficient in the case of a consistent use within the SparseArrays module.

Using iszero(x) would be a proper choice. It falls back to x == zero(typeof(x)). Comparison with 0 is good for if x isa Number, but leads into a blind alley, for other types, because all elements would be stored silently (even if zero is defined).

Here is a list of the current implementation (v0.7 2017-11-24) which illustrates the diversity in handling zeros:
How (and where) is defined, if an entry is considered a zero?

check expression | source | function | comments
---------------- | ------ | -------- | --------
v != 0 | sparsevector.jl:387 | convert | from dense
v != 0 | sparsevector.jl:292 | setindex! | but only, if index is new
v != 0 | sparsematrix.jl:2324 | setindex! | (but only, if index is new)
v != zero(Tv) | sparsematrix.jl:2598 | setindex!
nothing | sparseivector.jl:751 | getindex(x,::AbstractUnitange) | new sparse vector constructed, but no check
x != 0 | sparsevector.jl:1975 | dropzeros!
x != 0 | sparsematrix.jl:1227 | dropzeros!
x != 0 | sparsevector.jl:724 | findnz
x != 0 | sparsematrix.jl:1303 | findnz
x != 0 | sparsematrix.jl:1786 | findz
x != 0 | sparsematrix.jl:3189 | is_hermsym
x != 0 | sparsematrix.jl:3227 | istriu
x != 0 | sparsematrix.jl:3246 | istril
x != 0 | sparsematrix.jl:1564,1567,1574,1577 | ==
convert(eltype(A), x) != zero(eltype(A)) | sparsematrix.jl:2033 | Base.fill!
v != zero(v) | sparsevector.jl:1590,:1689 | A_mul_B!(::Number, ...)
v == zero(T) | sparsematrix.jl:1641 | Base._mapreduce
isequal(f(zero(Tv)), zero(Tv) | sparsematrix.jl:1744 | _mapreducecols!
iszero(A.nzval) | sparsematrix.jl:1513 | Base.iszero
iszero(x) sparsematrix.jl:1520 | Base.isone
isequal(v, zero(T)) | sparsematrix.jl:3472 | Base.hash
x == 0 | higherorderfns.jl:139 | _iszero
Base.iszero(x::Number) | higherorderfns.jl:140 | _iszero
Base.iszero(x::AbstractArray) | higherorderfns.jl:141 | _iszero

How is zero element created?

expression | source | function | comments
---------- | ------ | -------- | --------
zero(eltype(A)) | sparsevector.jl:688 | find(p, A)
zero(Tv) | sparsevector:751 | getindex(x,::Integer)
zero(T) | sparsematrix:1890,1892 | getindex(x,::Integer)
zero(Tv) | sparsematrix.jl:2594,2598,2616,2623 | setindex!
zero(eltype(S)) | sparsematrix.jl:73 | count
zero(eltype(S)) | sparsematrix.jl:1262 | find
zeros(Tv, m, n) | sparsematrix.jl:377 | convert | (to dense - if sparse array has no structural zeros, not used)
zero(T) | sparsematrix.jl:1598,1620,1629,1631,1639 | Base._mapreduce
zero(T) | sparsematrix.jl:1708 | Base._mapreducedim!
zero(T) | sparsematrix.jl:1743 | _mapreducecols!
zero(T) | sparsematrix.jl:1810 | _findr
zeros(Tv,m) | sparsematrix.jl:3373 | sortSparseMatrixCSC!
zeros(eltype(A)) | higherorderfns.jl:142,143 | _zeros_eltypes

Inconsitencies

setindex!(A, -0.0, ind...) behavior depends on ind is structural zero or not

Example where the result of getindex depends on the pre-history (before the last setindex!):

julia> B = spzeros(2)
2-element SparseVector{Float64,Int64} with 0 stored entries
julia> B[1] = 1;
julia> B[1] = -0.0; B[2] = -0.0;
julia> B[1], B[2]
(-0.0, 0.0)

Proposal:

Change in sparse/*jl all the checks if x is zero by the call iszero(x).

Amend the documentation with a hint, that the element types of SparseArrays need to define the `zero(::Type{T}) method.
Change the values of all AbstractFloat zero values -0.0 to +0.0 of the same type, before they are stored into nzvals.
Change the documentation about getindex(::AbstractSparseArray{T},...) to mention, that all returned zeros have positive signs for T<:Number.
Define a fallback method Base.zero(x) = zero(typeof(x)) and Base.zero(::Type{T}) where T = error("..."). That would be in accordance to current doc of zero.

sparse

Source

KlausC

❤1

Most helpful comment

@yurivish I made a PR on your problem. Now returning:

julia> sparse([MyType(true), MyType(false)])
2-element SparseVector{MyType,Int64} with 1 stored entry:
  [1]  =  MyType(true)

Is it ok for you ?

Lecrapouille on 21 Sep 2019

👍2 ❤1

All 12 comments

Could you give a code snippet as an example where these "inconsistencies" lead to an inconsistent result?

KristofferC on 26 Nov 2017

The issue needs a better title! Would be nice to have consistency.

ViralBShah on 26 Nov 2017

Another code snippet, demonstrating the need (v0.6.1 and v0.7-DEV):

julia> using Base.Dates

julia> S = sparse([ Second(0); Second(1)])
2-element SparseVector{Base.Dates.Second,Int64} with 2 stored entries:
  [1]  =  0 seconds
  [2]  =  1 second
julia> zero(Second(1))
0 seconds
julia> iszero(Second(0))
true
julia> Second(0) == 0
false

Generally it is not possible to take advantage of the storage economy of sparse arrays for dimensionful or unitful element types.

KlausC on 26 Nov 2017

Ref https://github.com/JuliaLang/julia/pull/9325

KristofferC on 26 Nov 2017

@KristofferC : I see that this issue has bee discussed 3 years ago. The main argument then, not to change anything (as I undestood), was performance.
I cannot verify that with the current version of Julia. The timing of x == 0 and iszero(x) does not seem to differ for x isa Union{Float64,BigInt,BigFloat}.
Maybe it is time now to review the arguments.

KlausC on 26 Nov 2017

I ran into this while giving a demonstration of Julia to a friend. We had a custom struct type with zero and iszero defined and wanted to make a sparse matrix out of a dense matrix containing a custom type.

Due to the != 0 check, there was no way for us to communicate our zero value to the sparse matrix code, and the zeros were thus represented explicitly in the sparse matrix:

julia> struct MyType
           val::Bool
       end

julia> Base.zero(::Type{MyType}) = MyType(false)

julia> Base.iszero(x::MyType) = x.val == false

julia> sparse([MyType(true), MyType(false)])
2-element SparseVector{MyType,Int64} with 2 stored entries:
  [1]  =  MyType(true)
  [2]  =  MyType(false)

yurivish on 13 Jun 2018

👍1

I had same issue, is there a progress on this issue ? What are the challenges ?

ekinakyurek on 26 Oct 2018

In best case, changing some x == 0 to iszero(x).

KristofferC on 26 Oct 2018

@yurivish I made a PR on your problem. Now returning:

julia> sparse([MyType(true), MyType(false)])
2-element SparseVector{MyType,Int64} with 1 stored entry:
  [1]  =  MyType(true)

Is it ok for you ?

Lecrapouille on 21 Sep 2019

👍2 ❤1

Thanks! That is the behavior I think is right in this case.

yurivish on 21 Sep 2019

@KlausC do you think my commit has a chance to be merged?

Lecrapouille on 5 Feb 2020

@Lecrapouille, which PR do you mean exactly? You know I have been trying to clean-up some of the inconsistencies in SparseMatrixCSC for a while, but found unexpected resistance. So my answer to this question is not important. You should ask people with permissions to merge.

The bugs you point out in issue https://github.com/JuliaLang/julia/issues/33332 are still not fixed. I support your fixes mentioned there, but lost overview of which is the PR providing the fix.

BTW, the example you mention here: https://github.com/JuliaLang/julia/issues/24790#issuecomment-533782773 seems to work for me without your PR merged. (not in v"1.2" but in v"1.5.0")

KlausC on 6 Feb 2020

Was this page helpful?

0 / 5 - 0 ratings