(This issue is a follow-up to #34000 #28854 #31303)

The docstring for `zero(x)` says:

> Get the additive identity element for the type of x

However, defining this additive identity precisely seems to be confusing, especially for union types, and in the presence of type promotion rules. The rest of this issue is dedicated to exploring more precisely what the semantics of `zero(T)` are (to help figure out what exactly the semantics should be).
`zero(::Union)` is undefined when it could be defined. Intuitively, one might expect that the presence of multiple zeros might complicate the definition. However, type promotion seems to allow for a uniquely defined result. If we take the definition above seriously, the statements

```julia
julia> Float16(0.0) + Float32(1.0) === Float32(1.0) # works for Float32
true

julia> Float16(0.0) + Float16(1.0) === Float16(1.0) # and for Float16
true
```

would argue that `zero(Union{Float16,Float32}) === Float16(0.0)` (it is currently undefined).
`Int(0)` is also a valid choice for `zero(Union{Float16,Float32})`, since

```julia
julia> 0 + Float16(1.0) === Float16(1.0) # works because of type promotion
true
```

However, the statement above would also imply that `Int(0)` is a valid choice for `zero(Float16)`. We can exclude this value by requiring `zero()` to be endomorphic, i.e. `zero(::T) :: T`, and by extension `zero(T) :: T`.
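The endomorphic requirement can be checked directly; a small illustration using only Base functions:

```julia
# zero(T) is endomorphic for leaf types: the result stays in T.
@assert zero(Float16) isa Float16
@assert zero(Float16(2.5)) === Float16(0.0)

# Int(0) satisfies the identity law under promotion, but fails endomorphism,
# so it is excluded as a candidate for zero(Float16):
@assert Int(0) + Float16(1.0) === Float16(1.0)  # identity via promotion
@assert !(Int(0) isa Float16)                   # but not endomorphic
```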
`Int(0)` appears to be the result of `zero(T)` whenever `T <: Union{Missing,Number,Complex}`, even though it is not always the unique (or even correct) additive identity in `T`:

```julia
julia> zero(Union{Int,Complex})
0

julia> zero(Union{Real,Complex}) # due to type promotion, see #2
0

julia> 0 + Complex{Bool}(0) === Complex{Bool}(0) # Counterexample: actually Complex{Int}(0), so Int(0) is not an additive identity over Complex
false

julia> Int(0) + Int16(0) === Int(0) # Counterexample: Int16 <: Real, but adding Int(0) promotes the result to Int
true

julia> Int(0) + false === Int(0) # Counterexample: Bool <: Real, but adding Int(0) promotes the result to Int
true
```
Proposal: `zero(T)` for non-union types should be defined as the unique additive identity such that

a. `zero(T)::T`, i.e. `zero()` is an endomorphism, and
b. the identity `zero(T) + zero(T) === zero(T)` is satisfied for all `T`.

`zero(Union{T,S}) === zero(T)` when `zero(T) + zero(S) === zero(S)`, taking into account type promotion rules. If no unique endomorphic zero exists, then `zero(T)` should be undefined, except for `zero(Bool) === false` as a special case.
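The union rule above could be sketched as a helper function (`union_zero` is a hypothetical name for illustration, not a proposed Base function):

```julia
# Hypothetical sketch of the proposed rule for zero(Union{T,S}):
# pick whichever leaf zero is absorbed by the other under promotion.
function union_zero(::Type{T}, ::Type{S}) where {T,S}
    zT, zS = zero(T), zero(S)
    zT + zS === zS && return zT  # zT acts as the identity on S
    zS + zT === zT && return zS  # symmetric case
    error("no unique endomorphic zero for Union{$T,$S}")
end

union_zero(Float16, Float32)  # Float16(0.0), as in the proposal
union_zero(Float16, Int16)    # Int16(0), as in the proposal
```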
This definition would:

- preserve the endomorphic property for leaf types, like `zero(Float16) === Float16(0.0)`;
- preserve the result `zero(Bool) === false`, to be consistent with the current arithmetic semantics of `Bool`;
- preserve the semantics of #28854 for `zero(Missing) === missing`;
- define `zero(Union{Float16,Float32,Float64})` to be `Float16(0.0)` (currently undefined);
- define `zero(Union{Float16,Int16})` to be `Int16(0)` (currently undefined);
- change `zero(Real)` to be `false` (and similarly in Case 3 above; currently `Int(0)`); arguably a bug fix;
- change `zero(Union{Bool,Int})` to be `false` (currently `Int(0)`); arguably a bug fix.

The two cases that would change under the proposal above would be preserved if we defined `zero(Bool) === Int(0)`, which is arguably the actual arithmetic semantics being described. It might be less of a breaking change to pick `Int(0)` over `false`.
Would it make sense to define a zero singleton that just adds to everything the way we want it to, or would that break type stability?

I hope this does not add too much noise to the discussion, but a similar need (of a "symbolic zero" object `z`, such that `a + z === a` for all `a`) was expressed in https://github.com/FluxML/Zygote.jl/issues/275. I imagine this would be more useful now than in the past, as the type instability is not so bad for performance due to small-union optimizations.

> Would it make sense to define a zero singleton that just adds to everything the way we want it to

That's already essentially what `false` is.
As it happens I was just thinking about this here: https://github.com/FluxML/Zygote.jl/issues/329#issuecomment-561296023

I think it would make sense to have an analogue to `LinearAlgebra.I`, perhaps `LinearAlgebra.Z` (which I guess we can't export until 2.0), which has the right properties of zero, just like `UniformScaling`. Call it `UniformZero` or whatever.
@StefanKarpinski

> That's already essentially what false is.

Maybe for `Number`s, but not anything else:

```julia
julia> false + [1 0; 0 1]
ERROR: MethodError: no method matching +(::Bool, ::Array{Int64,2})
```

Also, as @MikeInnes has pointed out, you can't dispatch on `false`, which is sometimes desirable.
Another use case I ran into is defining `zero(::Nothing) = nothing`, which is permissible if one considers that `Nothing` has only one element (`nothing`), so the trivial algebra can be defined over it. It turned out to be useful in an edge case with the method

```julia
ForwardDiff.partials(x, i...) = zero(x)
```
I'd rather not define any arithmetic operation on `nothing`. Its main raison d'être is that it throws an error as soon as you try to do something with it, which helps catch bugs. We have `missing` when you need propagation.
> Would it make sense to define a zero singleton that just adds to everything the way we want it to

> That's already essentially what `false` is.

So, this suggests that, when writing code that is supposed to be generic over numbers, we should always use `false` instead of `0`? 🧐

Yes, that's essentially what it suggests. And `true` for the generic unit value.
But doesn't using `false` to mean "the additive identity" (`0`) and `true` as "the additive unit value" (`1`) seem like we're overloading the meaning of what are traditionally just Boolean values?

I think it seems more sensible if you think of `Bool` as equivalent to `UInt1`; boolean operations are just special cases of the bitwise operations.
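The practical difference shows up with narrow integer types, where a literal `0` or `1` widens the result but `false`/`true` do not; a quick check using only Base arithmetic:

```julia
x = Int8(5)

# Adding the Int literal 1 promotes the result to Int...
@assert x + 1 === 6          # 6::Int, not Int8(6)

# ...while true/false are promotion-neutral units:
@assert x + true === Int8(6)
@assert x + false === Int8(5)
@assert x * true === Int8(5)
```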
Even if it can be justified, it still seems strange to me to write `a += true` in order to add 1 to `a` in a generic way.

I don't disagree; what would you prefer to write?

We already have a better generic 1: `LinearAlgebra.I`.
```julia
julia> using LinearAlgebra

julia> x = 5f0 + im; x + I
6.0f0 + 1.0f0im

julia> rand(5, 5) + I
5×5 Array{Float64,2}:
 1.78862   0.772261  0.661623  0.913183  0.390289
 0.366553  1.78299   0.545654  0.433197  0.36714
 0.969424  0.646091  1.02947   0.68133   0.284309
 0.14118   0.273732  0.127835  1.35712   0.424081
 0.180714  0.120144  0.712769  0.548235  1.18592
```
I think moving `UniformScaling` to `Base` and adding an analogous `UniformZero` would be a nice solution.

Note that `I` imposes some restrictions on shape, e.g. `UniformScaling(0) + rand(3,4)` is an error. And its type doesn't contain its value, unlike the `Zero()` singleton linked above.
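The shape restriction is easy to demonstrate; a quick check with the `LinearAlgebra` standard library:

```julia
using LinearAlgebra

# UniformScaling combines with square matrices of any size...
A = rand(3, 3)
@assert A + I == A + Matrix(1.0I, 3, 3)

# ...but errors for non-square shapes, where a shape-agnostic Zero() would not:
err = try
    rand(3, 4) + UniformScaling(0)
catch e
    e
end
@assert err isa DimensionMismatch
```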
Taking into account the discussion above (and trying to link it back to the original issue):

Regarding the zero singleton (for number types and ignoring matrices), my preference would be to simply use `0` and `1` for the additive identity and additive unit value, respectively. This is currently problematic, of course, since these are `Int`s, and standard promotion rules apply (hence the existence of `zero(T)`, `one(T)` and `oneunit(T)`, and the original proposal by @jiahao).

Some proposals to address this:

- Set `zero() = false` and `one() = true`, and use `zero()` and `one()`.
- Create `Zero()` and `One()` (and maybe `Oneunit()`) singletons, as proposed above.

In either case, fix the issues raised by @jiahao in the original issue.
From a user's perspective, there isn't much difference between these.

An additional nicety would be to have the compiler convert actual `0`s and `1`s to `zero()`, `one()`, or `oneunit()` if it can determine that the corresponding `0` or `1` is being added to or multiplied with something other than an `Int`. Would this be too hard, fragile, or magical?
If you are going with singleton zero and one, I think it is worth considering supporting arbitrary monoids, as doing so does not require much more effort. I did this in InitialValues.jl and it simplified code doing reduction a lot. The idea is to define something like

```julia
abstract type Monoid <: Function end

struct Id{F}
    op::F
end

struct Add <: Monoid end
const + = Add()

struct Mul <: Monoid end
const * = Mul()

(::F)(::Id{F}, x) where {F <: Monoid} = x
(::F)(x, ::Id{F}) where {F <: Monoid} = x
(::F)(x::Id{F}, ::Id{F}) where {F <: Monoid} = x
```

although Julia does not seem to support code like the last three lines. But you can use a macro to automate it (and defining structs like `Add`).
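A variant that Julia does accept dispatches on each concrete monoid type instead of a typevar in call position; a minimal sketch (the names `Add`/`Id` are taken from the snippet above, not from InitialValues.jl itself):

```julia
abstract type Monoid <: Function end
struct Id{F} end                 # identity element, tagged by its operation

struct Add <: Monoid end
(::Add)(x, y) = x + y
(::Add)(::Id{Add}, x) = x        # Id{Add} is a left identity...
(::Add)(x, ::Id{Add}) = x        # ...and a right identity
(::Add)(::Id{Add}, ::Id{Add}) = Id{Add}()

add = Add()
add(1, 2)           # 3
add(Id{Add}(), 3)   # 3
add(3, Id{Add}())   # 3
```

A macro, as suggested above, could stamp out these four methods for each monoid.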
> Regarding the zero singleton (for number types and ignoring matrices), my preference would be to simply use `0` and `1` for the additive identity and additive unit value, respectively.

One possibility for making code concise with those constants is to use Unicode:

```julia
const 𝟘 = Zero()
const 𝟙 = One()
```
@jiahao You also need to think about the case where there is no endomorphic zero, such as for `Date`. Note that `zero(x::Date)` is defined to be `Day(0)` in https://github.com/JuliaLang/julia/pull/35554
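Assuming the behavior from that PR, the non-endomorphic case looks like this:

```julia
using Dates

d = Date(2020, 1, 1)
z = zero(d)

# zero(::Date) returns a Day, not a Date -- so it is not endomorphic...
@assert z == Day(0)
@assert !(z isa Date)

# ...but it is still an additive identity for Date:
@assert d + z == d
```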
xref: https://github.com/JuliaLang/julia/issues/18367#issuecomment-245133418 (and onward)
I think it would be very cool if we had true singletons for one and zero, and maybe for +/- infinity, too.

I recently came across a need for this with samples and weights. In some use cases the samples are unweighted, in others not, and I want to write some generic code that can handle both scenarios. So it would be nice to have `weights = fill(One(), n)` use zero memory, and for operations like `all(isone, weights)` to run in zero time.

Also, since we already have singletons for things like `π` and `ℯ`, we could have `sin(π) == 𝟘`, `log(ℯ) == 𝟙`, `log(𝟘) == MinusInf()`, and so on (if that would be considered not too breaking).
As I noted elsewhere (https://github.com/JuliaLang/julia/issues/36103#issuecomment-642894606), adding a `Zero()` singleton type would tend to lead to type-unstable code, since the main reason to use it seems to be for accumulation where you aren't sure if the collection is empty.

I would rather see its main use in generic code, where you'd use an array of `One` in one scenario and an array of some numeric type in another (e.g. my example with statistical weights). But which of the two it is would be known at dispatch time, and the code would be type stable. I wouldn't use it for accumulation, etc.
Note that `UnitWeights` in StatsBase is essentially the array of ones that you describe.

Yes, but constructs like `UnitWeights` don't allow for a single-element representation (which can then later be combined into an array again). It's actually one thing I really dislike about the StatsBase weights implementation: there is no way to have single elements, so I can never represent a single variate with its weight without losing what kind of weight it is. It also leads to behavior like `broadcast(identity, uweights(10)) isa Array{Int64}`. I prefer the way Unitful does this: tagging values explicitly, not arrays (but implicitly, of course).

But I don't want to focus too much on a specific use case here. I think having zero-memory-cost arrays of ones and zeros that can preserve their property under broadcasting (e.g. `fill(𝟘, 10) .* rand(10) isa Array{Zero}`), possibly with special optimizations (one could consider things like `fill(𝟙, 10) .* A === A`), would be very useful for generic algorithms that produce optimal code.

StaticNumbers.jl provides some of this, but it's quite heavyweight (it has to be) and also can't define things like `sin(π) = 𝟘` without pirating Base. And zero, one, and +/- infinity are just special, so I think there's a good case to be made for having singletons for them as part of Julia itself.
A nice idea for `One` from @YingboMa (on Slack) would be to do this:

```julia
_strides(A::DenseArray) = (One(), Base.strides(A)[2:end]...)
```

Perhaps combined with #30432, as a lighter-weight alternative to ArrayLayouts.jl.
> As I noted elsewhere (#36103 (comment)) adding a `Zero()` singleton type would tend to lead to type-unstable code, since the main reason to use it seems to be for accumulation where you aren't sure if the collection is empty.

@stevengj `Base` now has a singleton `_InitialValue`, which is like the singleton identity element but "untyped" (in the sense that it is not aware of the reducing function `op`; so, somewhat less type-safe). This generates very efficient code thanks to union-splitting, and it's possible to improve it much more if you use the tail-call function-barrier pattern #34293 (which is like a forced union-split). This is something you need to have anyway to implement something like `collect` and `map` based on a mutate-or-widen strategy. So, I think singleton identity elements actually play nicely with well-written generic code in Julia.
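The pattern can be illustrated without touching Base internals; a sketch with its own singleton (only the name `_InitialValue` in Base is real, the `InitVal` below is a stand-in defined here):

```julia
struct InitVal end  # stand-in for Base._InitialValue

# Fold step: the first element replaces the sentinel; afterwards we reduce.
step(::InitVal, x) = x
step(acc, x) = acc + x

function mysum(xs)
    acc = InitVal()
    for x in xs
        acc = step(acc, x)  # acc::Union{InitVal,T} is union-split by the compiler
    end
    return acc
end

mysum([1, 2, 3])  # 6
mysum(Int[])      # InitVal(): the caller decides what an empty sum means
```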
Will `_InitialValue` become part of the official API at some point, if it works out well?

Yeah, it'd be nice to turn this into an API. But, for defining `foldl` etc., I think it'd be much better to provide an overload API like `_foldl_impl` so that the implementers do not need to care about initial values in the first place. (But then, if we are adding this API, it'd be better to add the full transducer-compatible `foldl` API...)
I have a feeling these are really two separate things:

- A kind of generic neutral element, for things like reductions, etc. It's not always really a zero, semantically.
- Singleton types for special values like one, zero, and possibly +/- infinity.

My intention was to share that there is enough usage experience supporting the usefulness of singleton identity elements. It just happened to be reductions (although that's not a coincidence; that's where you really need to take "zeros" seriously).

For other special values like `oneunit`, infinity, etc., I think we need to accumulate usage experience before considering adding them to `Base`. So, somebody has to write a package and use it in the wild; e.g., make sure it works well with FillArrays.jl. (Yes, it'd be nice if `sin(π)` worked as-is. But you can always wrap `π` etc.)
Oh, sorry if my statement sounded critical; it wasn't meant to be. I absolutely do agree with you, and I think having singleton identity elements would be extremely useful! I was just trying to disentangle them from the mathematical one and zero singletons, since I had a feeling that some doubts expressed here about having `Zero()` were that it wouldn't be a good identity element.

> So, somebody has to write a package and use it in the wild

Yes, I was actually thinking about putting together a "OnesAndZeros.jl" or "SpecialNumbers.jl" as a proof of concept. I'm not sure yet how much is possible without committing type piracy with regard to Base, though. But maybe quite a bit. :-) Maybe it can have a `commit_base_type_piracy()` function that enables some extra functionality. :-)

As for `sin(π) == 𝟘`, I'm not sure how many real-world use cases would actually profit from it, performance-wise. But it just would be so elegant...
> I was just trying to disentangle them from the mathematical one and zero singletons, since I had a feeling that some doubts expressed here about having `Zero()` were that it wouldn't be a good identity element.

Got it. Thanks for clarifying this. My worry is that the same "doubt" could be brought up even for generic special singleton uses other than the initial values of reductions. If it's in a common API, it _could_ end up in the state variables of loops. Of course, this may not happen so frequently, because the special singletons are mostly for dispatch (IIUC). Having something like "SpecialNumbers.jl" in the ecosystem would be great for understanding this.

Ok, I'll go ahead with that, then, hopefully soon.
I hate to suggest broadening scope, but given this issue and the massive number of trig+pi functions (sinpi, cospi, etc.) we are starting to get, should this issue instead be about taking symbolic numbers seriously, adding a `Symbolic` type which subtypes `Number` and is similar to (and possibly deprecates for 2.0) the current `Irrational` numbers? Then we could make a non-symbolic number times a symbolic number yield a special type, so that `sinpi(3)` could be written as the, imo, more elegant `sin(3*pi)`.
I think it makes sense to explore the design of singleton/partially-typed numbers here.

By the way, the topic in the OP is still important but unfortunately not much discussed. I think it's "orthogonal" to singleton numbers. @jiahao How about re-posting the OP as a new issue? Something similar is required for, e.g., #31635. It'd be nice to have a dedicated place to discuss this.

> By the way, the topic in the OP is still important but unfortunately not much discussed. I think it's "orthogonal" to singleton numbers.

I fully agree!
@perrutquist just pointed me to Zeros.jl, so my ideas for a SpecialNumbers.jl are largely realized already! Zeros.jl even includes a `Zeros.@pirate Base`, akin to the `commit_base_type_piracy()` I had been speculating about.

It's really cool to see that

```julia
using Zeros, BenchmarkTools

A = fill(One(), 10^6)
@benchmark all(isone, $A)
```

gives a mean run time of 1 ns, even though Zeros.jl doesn't include a specialized method for that. Kudos to @perrutquist and the Julia compiler team!
It would be weird if it took time to do a computation on something that takes no memory to store 😁

```julia
julia> sizeof(A)
0
```

It would; still, it's cool to see how Julia optimizes the for-loop in `Base._all(isone, A, :)` away completely. :-) Even `count(isone, A)` runs in 1 ns. I didn't realize that Julia could turn a counting for-loop into a predictable value. Or is that LLVM itself?
> It would be weird if it took time to do a computation on something that takes no memory to store

It's not quite true, though. :-)

```julia
using Zeros, BenchmarkTools

A = fill(One(), 10^6)
@benchmark sum($A)
```

takes 10 μs. But `foldl(+, A, init = 0)` and `mapreduce(identity, +, A, init = 0)` do run in 1 ns.

It's strange: `Base.sum(A; dims = :)` takes 10 μs, but `mapreduce(identity, +, A, init = 0, dims = :)` takes 1 ns. `mapreduce(identity, +, $A, init = 0, dims = 1)` takes 10 μs, though, like `sum`. But doesn't `sum(A)` use `mapreduce` with `dims = :` internally?