Julia: should isequal(0.0, -0.0) == true?

Created on 13 Sep 2016 · 41 Comments · Source: JuliaLang/julia

Came up in https://github.com/JuliaLang/julia/issues/9381 but deserves its own issue: it's been proposed that positive and negative zero should hash and compare as equal, but that has an unfortunate interaction with sorting and stability.

breaking decision

StefanKarpinski

All 41 comments

Does it help or hinder to consider that in fact isequal might be two different ideas? Consider:

my_is_equivalent(2+2,4) is true (post evaluation)
my_is_the_same_as(2+2,4) is false (prior to evaluation)

It is perhaps unfortunate that multiplication propagates the negative for reals: 0.0 * -0.0 evaluates to -0.0; it is ok for integers.

With respect to 0.0 and -0.0 it is a similar problem to that of "is noon AM or PM?" In fact it is neither; however, we often label noon PM because 11:59 AM, 12:00 AM, 12:01 PM makes less sense than 12:00 PM in that context.

colbec on 15 Sep 2016

I like this question.

# v0.5

nan1 = Inf/Inf;  nan2 = -0.0*Inf;  pz = 0.0;  nz = -0.0;

nan1==nan2, nan1===nan2, isequal(nan1,nan2)
# (false, true, true)
pz==nz, pz===nz, isequal(pz,nz)
# (true, false, false)

# for a, b = NaN, NaN and for a, b = 0.0, -0.0
# (==)(a,b) = !(===)(a,b)  so  a == b  iff a and b differ

with (0.0 == -0.0) === true,
only NaNs have the property that equalsness is given with nonidentity
and inequalsness is given with identity

That is reason-about-able for entities that are NotNumber (excluded from numerosity).
And well avoided for entities that are of numerousness or of absent numerousness.

I favor that Julia specify isequal( 0.0, -0.0 ) is true.

JeffreySarnoff on 16 Sep 2016

@JeffreySarnoff Do you then agree that sorting of numbers doesn't distinguish between positive and negative zero any more, and that hashing returns the same value, and that dictionaries can hold only one key for zero, etc.?

eschnett on 16 Sep 2016

@eschnett no.

branch-cut sanity carries utility and pervades reliability more than does zeroing the zeros

pz = 0.0; nz = -pz;

showcompact( (atan2(nz,nz), atan2(nz, pz), atan2(pz, pz), atan2(pz, nz)) )
# ( -3.14159,-0.0, 0.0, 3.14159 )

object_id( pz ) != object_id( nz ) # now, and my forward preference
with this, object_id dicts hold distinct keys for 0.0 and -0.0

JeffreySarnoff on 16 Sep 2016

@eschnett One cannot sort handedness nor parity, one may group like handednesses and same parity.

This is not a question:

taitjai

Which color precedes?

JeffreySarnoff on 16 Sep 2016

White.

KristofferC on 16 Sep 2016

The lower White or the left White?

JeffreySarnoff on 16 Sep 2016

That's a tricky one. Will have to come back to you on that.

KristofferC on 17 Sep 2016

@JeffreySarnoff Did you answer my question?

Let me rephrase things:

What should issorted([0.0, -0.0]) return?
Should hash(0.0) == hash(-0.0)?
Can Dict dictionaries hold different values for keys 0.0 and -0.0?

eschnett on 17 Sep 2016
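
For reference, the three questions above can be checked directly. This is a minimal sketch of the behaviour when isequal(0.0, -0.0) is false, assuming a current Julia 1.x release; the variable names are illustrative only.

```julia
# Sorting follows isless, which places -0.0 strictly before 0.0.
sorted_np = issorted([-0.0, 0.0])      # true
sorted_pn = issorted([0.0, -0.0])      # false

# The documented contract runs one way only:
# isequal(x, y) implies hash(x) == hash(y).
contract_holds = !isequal(0, 0.0) || hash(0) == hash(0.0)   # true

# Dict compares keys with isequal, so the two zeros are separate keys.
d = Dict(0.0 => "positive zero", -0.0 => "negative zero")
n_keys = length(d)                     # 2
```

If isequal(0.0, -0.0) were true, the contract would force the two hashes to agree and the Dict above would end up with a single entry.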

@eschnett alright

Ought isequal( 0.0, -0.0 ) be true? Yes.

Must object_id( 0.0 ) differ from object_id( -0.0 )? Yes.

Must some Dict dictionaries hold _different_ values for keys 0.0 and -0.0? Yes.
May other Dict dictionaries hold _the same_ values for keys 0.0 and -0.0? Yes.
Should we allow Dict dicts to hold _the same_ values for keys 0.0 and -0.0? Yes.

isequal(x,y) must imply that hash(x) == hash(y). (the docs)

Should hash(0.0) == hash(-0.0)? Yes.

JeffreySarnoff on 17 Sep 2016

Must some Dict dictionaries hold different values for keys 0.0 and -0.0? Yes.

That's fine for ObjectIdDict, but otherwise contradicts isequal(0.0,-0.0) being true. Unless we add a fourth standard equality predicate.

JeffBezanson on 17 Sep 2016
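
The distinction between the two dictionary kinds can be sketched as follows. This assumes Julia 1.x, where ObjectIdDict is spelled IdDict and object_id is spelled objectid; the labels are made up for illustration.

```julia
# Dict compares keys with isequal; IdDict compares them with ===.
by_isequal  = Dict{Any,String}()
by_identity = IdDict{Any,String}()

for (key, label) in ((0.0, "positive zero"), (-0.0, "negative zero"), (0, "integer zero"))
    by_isequal[key]  = label
    by_identity[key] = label
end

# isequal(0, 0.0) is true, so the integer zero lands on the 0.0 entry.
n_isequal = length(by_isequal)      # 2
# All three keys are distinguishable by ===, so IdDict keeps three entries.
n_identity = length(by_identity)    # 3
```

Whatever isequal decides for the zeros, the identity-keyed dictionary keeps them apart, which is why it cannot settle the question for Dict itself.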

@JeffBezanson you intuited my intent. As long as ObjectIdDict is available, the _Must_ some _Dict dictionaries_ statement is satisfied, and happiness suffuses logic.

JeffreySarnoff on 17 Sep 2016

Dict is a type in Julia; if you use Dict to create a dictionary, you will not get an ObjectIdDict. Thus my expression "Dict dictionary", as I did not want to discuss ObjectIdDict.

eschnett on 18 Sep 2016

Here's a crazier but more general idea: might it be possible to rename the current Dict type

immutable HashDict{K,V,Hft,Eft} <: Associative{K,V}
    hash::Hft
    isequal::Eft
    ...  # as present
end

then define

typealias Dict{K,V} HashDict{K,V,typeof(hash),typeof(isequal)}

and thus allow users to supply their own hash and equality functions, kind of like how C++ does it?

TotalVerb on 18 Sep 2016

Yes, it's possible and works like you surmised. Here's a complete demo implementation I had toyed with a bit a couple months ago: https://github.com/JuliaLang/julia/commit/fce28b4a9e3e59dc3c9ab566b77d7bf63b6d7d91

vtjnash on 19 Sep 2016

I don't really want to do that – I think it's a case of serious over-parameterization of the type. If you want to use custom hashing for your dict, you should probably call a transformation function on your keys before inserting them; it's much simpler and easier to reason about. It also doesn't address this problem since we still need to provide an isequal function that works well.

StefanKarpinski on 19 Sep 2016
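
A rough sketch of the "transform the keys before inserting" approach, assuming Julia 1.x syntax. NormalizedDict and merge_zeros are hypothetical names invented for this illustration and are not part of Base or of the linked commit.

```julia
# Hypothetical wrapper: applies `keyfn` to every key before touching the inner Dict.
struct NormalizedDict{K,V,F}
    keyfn::F
    data::Dict{K,V}
end

# Convenience constructor that starts from an empty Dict.
NormalizedDict{K,V}(keyfn) where {K,V} =
    NormalizedDict{K,V,typeof(keyfn)}(keyfn, Dict{K,V}())

Base.getindex(d::NormalizedDict, key) = d.data[d.keyfn(key)]
Base.setindex!(d::NormalizedDict, value, key) = (d.data[d.keyfn(key)] = value; d)
Base.length(d::NormalizedDict) = length(d.data)

# Key transformation that collapses -0.0 onto 0.0 and leaves other values alone.
merge_zeros(x) = x == 0 ? abs(x) : x

d = NormalizedDict{Float64,String}(merge_zeros)
d[-0.0] = "zero"
d[0.0]  = "still zero"    # overwrites the same entry
```

The wrapper leaves isequal and hash untouched, so it does not answer how isequal itself should treat the zeros.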

One parameter could be removed by keeping only the hash function, and computing the isequal function from it. This would also prevent using a hash function with the incorrect isequal companion function.

TotalVerb on 20 Sep 2016

Ok, but this is totally tangential since however Dict works, we still need to design the way the isequal function works in the first place. Saying "you can use whatever equality predicate you want" isn't solving anything – it's just sweeping the problem under the proverbial rug.

StefanKarpinski on 20 Sep 2016

@StefanKarpinski Point understood. The Dict issue is orthogonal to what the best behaviour for isequal should be. I lean towards making 0.0 and -0.0 equal and hash equal — this seems far less surprising mathematically. Does sorting really require isless and isequal to be consistent? I believe many other languages make do with only a single comparison predicate.

TotalVerb on 20 Sep 2016

Also, if things as different as 0 and 0.0 are going to be considered equal, then 0 and -0.0 ought to be also. Especially since -0.0 is a better "exact zero" than 0.0 is in IEEE floating point arithmetic.

julia> isequal(0, 0.0)
true

julia> isequal(0, -0.0)
false

TotalVerb on 20 Sep 2016

Especially since -0.0 is a better "exact zero" than 0.0 is in IEEE floating point arithmetic

How so?

JeffBezanson on 20 Sep 2016

-0.0 is a better neutral element for summations. We have -0.0 + 0.0 = 0.0.

Perhaps it's an exaggeration to say that -0.0 is a better exact zero, since -1.0 + 1.0 = 0.0. But -0.0 is just as zero as 0.0 is, and shouldn't be !isequal to integer zero.

TotalVerb on 20 Sep 2016
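
The neutral-element claim can be checked with ===, which distinguishes the two zeros. A small sketch, assuming default round-to-nearest arithmetic; the sample values are arbitrary.

```julia
samples = (0.0, -0.0, 1.5, -Inf)

# Adding -0.0 returns every sample unchanged, sign of zero included.
neg_zero_is_neutral = all(x -> -0.0 + x === x, samples)   # true

# Adding 0.0 turns -0.0 into 0.0, so it is not neutral under ===.
pos_zero_is_neutral = all(x -> 0.0 + x === x, samples)    # false

# Exact cancellation produces positive zero.
cancellation = -1.0 + 1.0                                 # 0.0
```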

Consider that (-0.0)+(-0.0)-(-0.0)==0.0. To me this indicates that -0.0 really should equal 0.0, as otherwise addition that really should be commutative isn't. I would similarly suggest that hash(-0.0)==hash(0.0) and that dicts should treat them the same way.

oscardssmith on 20 Sep 2016

My current thinking is that we should _not_ make this change. Here's a run-down:

0.0 and -0.0 give different results for some well-defined, perfectly valid operations such as 1/x and many complex operations involving branch cuts. So they're meaningfully different.
If a certain application wants to equate 0.0 and -0.0 for use as dict keys, it's easy to use x==0 ? abs(x) : x as the key. But if isequal equates the values and you want to distinguish them, that's harder to get.
If we're going to keep both == and isequal, we might as well make them consistent with IEEE equals and totalOrder, respectively. Keeping isequal just for NaNs is not a huge win. If the plan involved getting rid of isequal completely, it might be worth a bit of ugliness.

JeffBezanson on 20 Sep 2016

👍2
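
The first point of the run-down, illustrated with a minimal sketch. It assumes Julia 1.x, where the two-argument atan replaces atan2.

```julia
pz, nz = 0.0, -0.0

same_by_ieee_equals = pz == nz              # true
same_by_isequal     = isequal(pz, nz)       # false

# Reciprocals land on opposite infinities.
recip = (1 / pz, 1 / nz)                    # (Inf, -Inf)

# On the negative real axis the sign of zero picks the side of the branch cut.
angles = (atan(pz, -1.0), atan(nz, -1.0))   # roughly (pi, -pi)

# isless places -0.0 before 0.0, as totalOrder does.
order = (isless(nz, pz), isless(pz, nz))    # (true, false)
```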

this indicates that -0.0 really should equal 0.0, as otherwise addition that really should be commutative isn't

That is arguably handled by ==.

JeffBezanson on 20 Sep 2016

My point is just that so many seeming tautologies applied to -0.0 result in 0.0, which could cause some really confusing results (although I guess that is mainly because using floats as dict keys is an absolutely horrible idea).

oscardssmith on 20 Sep 2016

Nevertheless, it is still confusing that isequal(0, 0.0) but not isequal(0, -0.0). If equality must be transitive, then perhaps !isequal(0, 0.0) is the better option.

TotalVerb on 20 Sep 2016

it is still confusing that isequal(0, 0.0) but not isequal(0, -0.0)

Yes, I can kind of see that, but once you accept that isequal distinguishes negative zero it makes sense. 0 certainly isn't a "negative zero", but in fact could be with a signed-magnitude integer type, so this doesn't bother me so much.

JeffBezanson on 20 Sep 2016

but consider that isequal(-0, -0.0) is false.

oscardssmith on 20 Sep 2016

But -0 === 0; the expression used to compute a value is immaterial.

JeffBezanson on 20 Sep 2016
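
A quick check of the integer case and of transitivity over the three zeros. This is a sketch under the existing definitions; the tuple name is illustrative.

```julia
# Integers have a single zero: negating the literal changes nothing.
int_zeros_identical = (-0 === 0)                                   # true

pairs = (isequal(0, 0.0), isequal(0, -0.0), isequal(0.0, -0.0))    # (true, false, false)

# isequal stays transitive: 0 and 0.0 form one class, -0.0 is alone in another.
zero_values = (0, 0.0, -0.0)
transitive = all(!(isequal(a, b) && isequal(b, c)) || isequal(a, c)
                 for a in zero_values, b in zero_values, c in zero_values)   # true
```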

I like this question.

0.0 and -0.0 give different results for some well-defined, perfectly valid operations such as 1/x and many complex operations involving branch cuts. So they're meaningfully different.

Yes, and that is a [meaningful operational] difference without a [measure theoretic] distinction. That they are different hither and the same yon is meaningfuller.

    - they differ as ObjectIDentifiable symbols
    - additively, they are the same
    - multiplicatively, they differ
    - they are the same under linear ordering
      (without nonzero measure, parity is as orientation -- intrinsically unorderable)

If a certain application wants to equate 0.0 and -0.0 for use as dict keys, it's easy to use x==0 ? abs(x) : x as the key. But if isequal equates the values and you want to distinguish them, that's harder to get.

Reading that standing on my head: the branch cuts and the multiplicative resolution of odd signedness are handled for almost all applications that are not writing a floating point math library. The likelihood is that members of the Julia community, and users of Julia products for their own purposes, consider them an incomplete sameness rather than not completely different. Relying on the coherently consistent branch cuts that are available with 0.0, -0.0 requires only that !(0.0 === -0.0), irrespective of the truth value isequal(0.0, -0.0) yields. As one is of object non-identity and isequal(0.0, -0.0) is of the indistinguishability of lattice-free measures zero, they need [qua ought] not agree.

If we're going to keep both == and isequal, we might as well make them consistent with IEEE equals and totalOrder, respectively. ...

Doing so assigns truth to each of these expressions

  (0.0 == -0.0),  !( 0.0 === -0.0 ),  isequal(0.0, -0.0)

We are in agreement.

JeffreySarnoff on 25 Sep 2016

@TotalVerb: Does sorting really require isless and isequal to be consistent?

Yes, since otherwise if you change from a hash-based dictionary data structure to a tree-based one, you'll get different behavior, which is awful. Not having isless and isequal be consistent would be pretty bad and whether you have another name for it or not, it implicitly introduces yet another ordering.

I believe many other languages make do with only a single comparison predicate.

Lots of languages only have a few names for their orderings. But they usually have a lot of implicit orderings created by the rampant inconsistencies between the different orderings that they actually exhibit. There's a reason that equality is the first place to look for "wat" behavior in languages.

Given @JeffBezanson's arguments and the fact that no clever solutions to the sorting issues raised by isequal(-0.0, 0.0) have been forthcoming, I'm inclined to agree that we shouldn't do this.

StefanKarpinski on 27 Sep 2016

👍1
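
The consistency requirement can be stated as a trichotomy: for any two values, exactly one of isless(a, b), isless(b, a), isequal(a, b) holds. A sketch of that check, together with a hash-based and a sort-based count of distinct values, assuming Julia 1.x; the data are arbitrary.

```julia
vals = [0.0, -0.0, NaN, 1.0, -1.0, Inf, -Inf]

# Exactly one of the three relations holds for every pair.
trichotomy = all(isless(a, b) + isless(b, a) + isequal(a, b) == 1
                 for a in vals, b in vals)                   # true

data = [0.0, -0.0, 0.0, NaN, NaN, 1.0]

# Hash-based: Set deduplicates with hash and isequal.
hash_count = length(Set(data))                               # 4

# Order-based: sort with isless, then count strict increases between neighbours.
sorted = sort(data)
order_count = 1 + count(i -> isless(sorted[i-1], sorted[i]), 2:length(sorted))   # 4
```

Because isless and isequal agree, both approaches see the same four distinct values: -0.0, 0.0, 1.0 and NaN.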

That argument makes sense to me.

TotalVerb on 27 Sep 2016

Actually, the sorting issue makes some sense to me as well.

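# Redefine isequal and isless for the IEEE float types using Base intrinsics.
import Base: unbox, slt_int, eq_float, isless, isequal


for (T,I) in [(:Float64, :Int64), 
              (:Float32, :Int32), 
              (:Float16, :Int16)]
    @eval begin
        # Equality by floating-point value, so 0.0 and -0.0 compare equal.
        isequal(a::$T, b::$T) = eq_float(unbox($T,a), unbox($T,b))
        # Order by the bit pattern read as a signed integer of the same width.
        function isless(a::$T, b::$T)
            ia = reinterpret($I, a)
            ib = reinterpret($I, b)
            # With both sign bits set, larger magnitude means smaller value: swap.
            ifelse( signbit(ia) & signbit(ib), 
                    slt_int(ib,ia), slt_int(ia,ib) )
        end
    end
end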


floatvec = [-Inf,-100.0,-1.0,-0.5,-0.0,0.0,0.5,1.0,100.0,Inf];
floatvec == sort(floatvec)  # true
sort([NaN, Inf, -Inf])'     # [-Inf, Inf, NaN] (as now)

issorted([ -0.0, 0.0 ]), issorted([ 0.0, -0.0 ]) # (true, false)

-0.0 === 0.0, 0.0 === -0.0              # (false, false)

-0.0 == 0.0, 0.0 == -0.0                # (true, true)
-0.0 < 0.0,  0.0 < -0.0                 # (false, false)
-0.0 <= 0.0,  0.0 <= -0.0               # (true, true)

isequal(-0.0, 0.0), isequal(0.0, -0.0)  # (true, true)
isless(-0.0, 0.0), isless(0.0, -0.0)    # (true, false)

that's all -- thanks for the active discussion

JeffreySarnoff on 28 Sep 2016
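
For readers on Julia 1.x, where the intrinsics above are not used this way, the bit-pattern ordering can be sketched as a standalone function. bits_isless is a hypothetical name, it covers Float64 only, and it does not override anything in Base.

```julia
# Hypothetical standalone version of the bit-pattern ordering (Float64 only).
function bits_isless(a::Float64, b::Float64)
    ia = reinterpret(Int64, a)
    ib = reinterpret(Int64, b)
    # With both sign bits set, a larger magnitude means a smaller value, so swap.
    return (ia < 0) & (ib < 0) ? ib < ia : ia < ib
end

floatvec = [-Inf, -100.0, -1.0, -0.5, -0.0, 0.0, 0.5, 1.0, 100.0, Inf]

# On these values the ordering matches Base's isless.
agrees_with_isless = all(bits_isless(a, b) == isless(a, b)
                         for a in floatvec, b in floatvec)       # true

# Paired with a value-based equality, -0.0 is both "less than" and "equal to" 0.0.
both_hold = bits_isless(-0.0, 0.0) && (-0.0 == 0.0)              # true
```

The last line is the same combination reported in the output above: isequal(-0.0, 0.0) is true while isless(-0.0, 0.0) is also true. That may be read against the consistency requirement raised earlier in the thread, though the thread itself does not say so.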

Seems like this is resolved in favor of preserving the current behavior isequal(+0.0, -0.0) == false, in which case this issue can be closed.

stevengj on 1 Oct 2016

This seems resolved, and (for once) conveniently in a way that requires no action.

StefanKarpinski on 27 Oct 2016

👍1

OK. For my own edification: what was the drawback to the implementation I posted in response to consistent sorting of negatives before nonnegatives?
