Julia: Should true == 1?

Created on 21 Apr 2016 · 16 Comments · Source: JuliaLang/julia

@vtjnash recommended I write a bug report after encountering this behavior and looking into workarounds.

At present:

true == 1
# true

This seems fairly jarring, given that the Bool type is otherwise much more restrictive than in languages such as Python, where implicit promotion or deference to parent methods for comparison is typical. For example, this:

if 1
    println("Ok!")
end
# ERROR: TypeError: non-boolean (Int64) used in boolean context

This would also suggest to me (though not in any formal sense) that the earlier equality test should raise a TypeError as well.
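To make the asymmetry concrete, here is a minimal sketch, assuming Julia 1.x. An Int in a boolean context throws, while an explicit conversion with Bool(...) is accepted only for values that are exactly 0 or 1. The variable names are illustrative only.

```julia
x = 1

# Using an Int where a Bool is required throws a TypeError at runtime.
cond_err = try
    x ? "yes" : "no"
catch e
    e
end
println(cond_err isa TypeError)

# Explicit conversion works, but only for values equal to 0 or 1.
println(Bool(1))
println(Bool(0))

# Any other integer cannot be represented as a Bool.
conv_err = try
    Bool(2)
catch e
    e
end
println(conv_err isa InexactError)
```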

In this case I want to be able to put both 1 and true in a Set of type Any:

Set([1, true, "true"])
# Set(Any["true",true])

But due to the hash/equality semantics, only one will make it in. For my use case, a serialization library that sends and receives sets from other languages (whose Boolean equality/hash semantics are heterogeneous), I could fall back on an alternative implementation of Set backed by an ObjectIdDict. Still, I wanted to check whether this is the desired behavior or whether it falls out of some implementation detail.

benkamphaus
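The identity-keyed workaround mentioned above could look roughly like the following. This is a sketch under assumptions, not code from the serialization library: it targets Julia 1.x, where ObjectIdDict is named IdDict, and the type name IdentitySet is hypothetical.

```julia
# Hypothetical helper: a set keyed on object identity (===)
# instead of isequal/hash, so 1, true and 1.0 stay distinct.
struct IdentitySet
    items::IdDict{Any,Nothing}
    IdentitySet() = new(IdDict{Any,Nothing}())
end

# Build an IdentitySet from any iterable.
function IdentitySet(itr)
    s = IdentitySet()
    for x in itr
        push!(s, x)
    end
    return s
end

Base.push!(s::IdentitySet, x) = (s.items[x] = nothing; s)
Base.in(x, s::IdentitySet) = haskey(s.items, x)
Base.length(s::IdentitySet) = length(s.items)

id_set = IdentitySet(Any[1, true, "true"])
println(length(id_set))   # all three values are kept
println(true in id_set)
println(1.0 in id_set)    # 1.0 was never added and is not identical to 1
```

The trade-off is that lookups also become identity-based, so membership tests with a value of a different numeric type (such as 1.0 for 1) no longer match.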

Most helpful comment

It comes from the fact that Bool is essentially just UInt1 – a one-bit unsigned integer. The true == 1 part doesn't really bother me. The principle is that it's ok to use a boolean where a number is expected, but it's not ok to use a non-boolean where a boolean is expected. This is often handy, e.g. when counting the number of values that are true using sum. The Set business is more concerning. We could make Bool a non-numeric type, but that would have other consequences, e.g. it would no longer be possible to have im = Complex(false,true), although we've discussed recently whether that's a good idea.

StefanKarpinski on 21 Apr 2016

👍5
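A short sketch of the "boolean where a number is expected" principle, assuming Julia 1.x. The array contents are made up for illustration.

```julia
flags = [true, false, true, true]

# Bools add as 0/1, so summing counts the true entries.
n_true = sum(flags)
println(n_true)

# count is the more explicit spelling of the same idea.
println(count(flags))

# Bool sits inside the numeric type hierarchy.
println(Bool <: Integer)

# The imaginary unit is a Complex{Bool}.
println(im === Complex(false, true))
```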

All 16 comments

In a set you'll also run into problems if you want to add 1 and 1.0 at the same time. You should be able to use isequal (i.e. ===) instead of == to compare values in a set.

How does Set handle NaN? Technically, NaN != NaN, and it should thus keep all of them?

eschnett on 21 Apr 2016
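For reference, ==, isequal and === are three distinct predicates in Julia, and Set relies on isequal together with hash. A minimal sketch of how they differ on the values discussed here, assuming Julia 1.x:

```julia
# Numeric equality across types
println(1 == 1.0)           # true
println(isequal(1, 1.0))    # true: isequal falls back to == for these values
println(1 === 1.0)          # false: different types, so not identical

# NaN is where == and isequal disagree
println(NaN == NaN)         # false (IEEE semantics)
println(isequal(NaN, NaN))  # true
println(NaN === NaN)        # true (same bit pattern)

# Because Set uses isequal, both pairs collapse to a single element.
mixed_set = Set(Any[1, 1.0])
nan_set = Set([NaN, NaN])
println(length(mixed_set))
println(length(nan_set))
```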

@eschnett Just to be clear: since the Set implementation is a hash set, backed by a Dict's keys with all values set to Void, the important detail is how each value hashes:

julia> Set([NaN, NaN])
Set([NaN])

julia> hash(NaN)
0x15d7d083d04ecb90

julia> hash(1)
0x02011ce34bce797f

julia> hash(true)
0x02011ce34bce797f

The Set behavior is the concerning part for me as well, @StefanKarpinski: it will require a workaround that hurts usability whenever this edge case is encountered. Maybe I should revise the issue title to reflect Set as the focus? The question about equality came out of the discussion on IRC.

benkamphaus on 21 Apr 2016
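The collision follows from the contract that values which are isequal must also hash identically. A small sketch, assuming Julia 1.x; the concrete hash values vary between Julia versions, so only their equality is checked here.

```julia
# isequal values are required to share a hash.
println(isequal(true, 1) && hash(true) == hash(1))
println(isequal(1, 1.0) && hash(1) == hash(1.0))

# As a result, a Set{Any} cannot hold both 1 and true.
any_set = Set{Any}()
push!(any_set, 1)
push!(any_set, true)
push!(any_set, "true")
println(length(any_set))   # the string plus one of 1/true
```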

I was confused; of course, isequal(NaN, NaN) is true.

eschnett on 21 Apr 2016

Note that Char has a similar issue, and it also breaks transitivity of equality with non-integer types: 32.0 != ' ' == 32.

"The true == 1 part doesn't really bother me. The principle is that it's ok to use a boolean where a number is expected, but it's not ok to use a non-boolean where a boolean is expected."

In comparing bool == foo, I think it's arguably expected that foo::Bool.

vtjnash on 21 Apr 2016

Yeah, I think it's even more surprising with Char in the Set context:

> Set([32, ' '])
# Set(Any[' '])

Probably more plausibly encountered in a scenario like:

> x = " foo"[1]
# ' '
> Set([32, x])
# Set(Any[' '])

benkamphaus on 21 Apr 2016

😕1

Since Char isn't a subtype of Number (unlike Bool) I agree ' ' == 32 doesn't seem right.

JeffBezanson on 21 Apr 2016

👍2

The char vs number thing is definitely a bug left over from when Char was a kind of integer. I'm working on a fix.

StefanKarpinski on 21 Apr 2016

The Char vs. Int comparison thing is surprisingly annoying to fix – we assume that you can compare integers and chars all over the place. I've almost got it done, but it makes me wonder about the change.

StefanKarpinski on 22 Apr 2016

I think it's still the right way to go. Unlike e.g. for real and integer numbers, there is no commonly used abstraction that makes characters a subset of integers. ASCII and UTF-8 are common, but they specify an encoding, not an equivalence. It's easy enough to convert between characters and integers, and if a certain piece of code requires this too often, then I wonder whether there's an abstraction missing.

Most other languages (Python, Fortran, Mathematica) don't allow such comparisons either. C and its descendants seem to be the exception.

eschnett on 22 Apr 2016

"there is no commonly used abstraction that makes characters a subset of integers" .... Unicode?

stevengj on 22 Apr 2016

In any case, they don't need to be isequal, which would fix the Set issue.

simonster on 22 Apr 2016

Indeed Unicode offers a standard mapping from integer to char, but comparison between these types is still quite confusing. +1 for deprecating it, and requiring people to write Char(32) == ' ' when that's really what they want.

nalimilan on 22 Apr 2016
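A sketch of the explicit-conversion style suggested above, assuming Julia 1.x, where comparing a Char with an integer no longer returns true. The variable names are illustrative.

```julia
space_char = " foo"[1]

# Convert explicitly in whichever direction is meant.
println(Char(32) == space_char)       # compare as characters
println(Int(space_char) == 32)        # compare as integers
println(codepoint(space_char))        # the Unicode code point as a UInt32

# Without a conversion, a Char and an Int are simply unequal on Julia 1.x,
# so both survive in a Set.
println(space_char == 32)
char_set = Set(Any[32, space_char])
println(length(char_set))
```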

I don't know... what languages have true character types that don't allow such comparisons? Python doesn't have a character type per se, it only has length-1 strings, and Fortran is similar if I understand it correctly. I don't know about Mathematica, but Mathematica is not particularly well-known for its strength in string processing.

In any case, I feel like discussion of Char == Integer should be in a separate issue.

stevengj on 22 Apr 2016

"it would no longer be possible to have im = Complex(false,true), although we've discussed recently whether that's a good idea."

Assuming that change is a good idea, IMO it should still allow [U]Int +, -, * Bool and Bool +, -, * [U]Int, with true working as 1 and false working as 0, by overloading those signatures (they are so handy at times).
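For illustration, this is how the mixed arithmetic behaves today, assuming Julia 1.x where Bool is still a subtype of Integer. It shows current behavior only, not the proposed overloads.

```julia
# true acts as 1 and false as 0 in mixed integer arithmetic.
sum_val  = 3 + true
diff_val = 10 - true
prod_val = 7 * false
println((sum_val, diff_val, prod_val))

# The result takes the type of the non-Bool operand.
unsigned_sum = 0x03 + true
println(typeof(unsigned_sum))

# Bool + Bool is widened to Int rather than staying Bool.
println(typeof(true + true))
```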