When I first worked with sparse matrices, I was convinced, that the element type of those objects were AbstractFloat types. I am easily convinced, that all Number types make sense as well. Nevertheless there there seems to be a sloppy idea of which element types are actually intended.
The constructors of sparse arrays accept any element type. But you can call certain functions on sparse arrays only, if the element type supports certain methods. So the element types are de facto restricted by a kind of duck typing. The methods, which I refer to, include != 0, == 0, != zero, == zero, isequal, iszero for the question, if an element to be processed is zero or not.
I see, that all those alternatives lead to the same result for Numbers (with the exception of == and isequal, which deserves a separate consideration in the case of 卤0.0).
But if I want to use a different element type Tv, I have to support this type with a consistent behaviour for all of those methods. That is far more inconvenient than just have to define zero(::Type{Tv}), what would be sufficient in the case of a consistent use within the SparseArrays module.
Using iszero(x) would be a proper choice. It falls back to x == zero(typeof(x)). Comparison with 0 is good for if x isa Number, but leads into a blind alley, for other types, because all elements would be stored silently (even if zero is defined).
Here is a list of the current implementation (v0.7 2017-11-24) which illustrates the diversity in handling zeros:
How (and where) is defined, if an entry is considered a zero?
check expression | source | function | comments
---------------- | ------ | -------- | --------
v != 0 | sparsevector.jl:387 | convert | from dense
v != 0 | sparsevector.jl:292 | setindex! | but only, if index is new
v != 0 | sparsematrix.jl:2324 | setindex! | (but only, if index is new)
v != zero(Tv) | sparsematrix.jl:2598 | setindex!
nothing | sparseivector.jl:751 | getindex(x,::AbstractUnitange) | new sparse vector constructed, but no check
x != 0 | sparsevector.jl:1975 | dropzeros!
x != 0 | sparsematrix.jl:1227 | dropzeros!
x != 0 | sparsevector.jl:724 | findnz
x != 0 | sparsematrix.jl:1303 | findnz
x != 0 | sparsematrix.jl:1786 | findz
x != 0 | sparsematrix.jl:3189 | is_hermsym
x != 0 | sparsematrix.jl:3227 | istriu
x != 0 | sparsematrix.jl:3246 | istril
x != 0 | sparsematrix.jl:1564,1567,1574,1577 | ==
convert(eltype(A), x) != zero(eltype(A)) | sparsematrix.jl:2033 | Base.fill!
v != zero(v) | sparsevector.jl:1590,:1689 | A_mul_B!(::Number, ...)
v == zero(T) | sparsematrix.jl:1641 | Base._mapreduce
isequal(f(zero(Tv)), zero(Tv) | sparsematrix.jl:1744 | _mapreducecols!
iszero(A.nzval) | sparsematrix.jl:1513 | Base.iszero
iszero(x) sparsematrix.jl:1520 | Base.isone
isequal(v, zero(T)) | sparsematrix.jl:3472 | Base.hash
x == 0 | higherorderfns.jl:139 | _iszero
Base.iszero(x::Number) | higherorderfns.jl:140 | _iszero
Base.iszero(x::AbstractArray) | higherorderfns.jl:141 | _iszero
How is zero element created?
expression | source | function | comments
---------- | ------ | -------- | --------
zero(eltype(A)) | sparsevector.jl:688 | find(p, A)
zero(Tv) | sparsevector:751 | getindex(x,::Integer)
zero(T) | sparsematrix:1890,1892 | getindex(x,::Integer)
zero(Tv) | sparsematrix.jl:2594,2598,2616,2623 | setindex!
zero(eltype(S)) | sparsematrix.jl:73 | count
zero(eltype(S)) | sparsematrix.jl:1262 | find
zeros(Tv, m, n) | sparsematrix.jl:377 | convert | (to dense - if sparse array has no structural zeros, not used)
zero(T) | sparsematrix.jl:1598,1620,1629,1631,1639 | Base._mapreduce
zero(T) | sparsematrix.jl:1708 | Base._mapreducedim!
zero(T) | sparsematrix.jl:1743 | _mapreducecols!
zero(T) | sparsematrix.jl:1810 | _findr
zeros(Tv,m) | sparsematrix.jl:3373 | sortSparseMatrixCSC!
zeros(eltype(A)) | higherorderfns.jl:142,143 | _zeros_eltypes
Inconsitencies
setindex!(A, -0.0, ind...) behavior depends on ind is structural zero or not
Example where the result of getindex depends on the pre-history (before the last setindex!):
julia> B = spzeros(2)
2-element SparseVector{Float64,Int64} with 0 stored entries
julia> B[1] = 1;
julia> B[1] = -0.0; B[2] = -0.0;
julia> B[1], B[2]
(-0.0, 0.0)
Proposal:
iszero(x).Amend the documentation with a hint, that the element types of SparseArrays need to define the `zero(::Type{T}) method.
Change the values of all AbstractFloat zero values -0.0 to +0.0 of the same type, before they are stored into nzvals.
Change the documentation about getindex(::AbstractSparseArray{T},...) to mention, that all returned zeros have positive signs for T<:Number.
Define a fallback method Base.zero(x) = zero(typeof(x)) and Base.zero(::Type{T}) where T = error("..."). That would be in accordance to current doc of zero.
Could you give a code snippet as an example where these "inconsistencies" lead to an inconsistent result?
The issue needs a better title! Would be nice to have consistency.
Another code snippet, demonstrating the need (v0.6.1 and v0.7-DEV):
julia> using Base.Dates
julia> S = sparse([ Second(0); Second(1)])
2-element SparseVector{Base.Dates.Second,Int64} with 2 stored entries:
[1] = 0 seconds
[2] = 1 second
julia> zero(Second(1))
0 seconds
julia> iszero(Second(0))
true
julia> Second(0) == 0
false
Generally it is not possible to take advantage of the storage economy of sparse arrays for dimensionful or unitful element types.
@KristofferC : I see that this issue has bee discussed 3 years ago. The main argument then, not to change anything (as I undestood), was performance.
I cannot verify that with the current version of Julia. The timing of x == 0 and iszero(x) does not seem to differ for x isa Union{Float64,BigInt,BigFloat}.
Maybe it is time now to review the arguments.
I ran into this while giving a demonstration of Julia to a friend. We had a custom struct type with zero and iszero defined and wanted to make a sparse matrix out of a dense matrix containing a custom type.
Due to the != 0 check, there was no way for us to communicate our zero value to the sparse matrix code, and the zeros were thus represented explicitly in the sparse matrix:
julia> struct MyType
val::Bool
end
julia> Base.zero(::Type{MyType}) = MyType(false)
julia> Base.iszero(x::MyType) = x.val == false
julia> sparse([MyType(true), MyType(false)])
2-element SparseVector{MyType,Int64} with 2 stored entries:
[1] = MyType(true)
[2] = MyType(false)
I had same issue, is there a progress on this issue ? What are the challenges ?
In best case, changing some x == 0 to iszero(x).
@yurivish I made a PR on your problem. Now returning:
julia> sparse([MyType(true), MyType(false)])
2-element SparseVector{MyType,Int64} with 1 stored entry:
[1] = MyType(true)
Is it ok for you ?
Thanks! That is the behavior I think is right in this case.
@KlausC do you think my commit has a chance to be merged?
@Lecrapouille, which PR do you mean exactly? You know I have been trying to clean-up some of the inconsistencies in SparseMatrixCSC for a while, but found unexpected resistance. So my answer to this question is not important. You should ask people with permissions to merge.
The bugs you point out in issue https://github.com/JuliaLang/julia/issues/33332 are still not fixed. I support your fixes mentioned there, but lost overview of which is the PR providing the fix.
BTW, the example you mention here: https://github.com/JuliaLang/julia/issues/24790#issuecomment-533782773 seems to work for me without your PR merged. (not in v"1.2" but in v"1.5.0")
Most helpful comment
@yurivish I made a PR on your problem. Now returning:
Is it ok for you ?