Julia: Introduce `UInt1` type to replace misuse of `Bool`

Created on 6 Sep 2016  Â·  67Comments  Â·  Source: JuliaLang/julia

This will solve a whole bunch of problems.

  • UInt1 is a lot more intuitive than Bool as a identity element for promote_type.
  • true + true === 2 won't be inconsistent with every other unsigned type; we can have UInt1(1) + UInt1(1) === UInt1(0) as expected.
  • Bool can be made non-numeric. (cc @StefanKarpinski)

I think the current system is really not ideal; it's worse than a method pun: it's punning a type for two very different things.

decision help wanted

Most helpful comment

I oppose the idea of having two different modules. Base should provide a sensible default, not excessive customizability.

All 67 comments

If we decide to commit to UInt1 as an identity element for promote_type, then a lot of advantages fall out. For instance, sum([]) and prod([]) could return a usable result.

sum([]) still can't return a usable result, because UInt1 is only an usable as an identity element for promotion of numeric types.

sum([]) should return UInt1(0), and prod([]) should return UInt1(1). Because most Julia code is type-stable, it is easy to forget that some code cannot be made type-stable.

Here's an example: map(length, []) returns Any[], because methods of length can return anything. So prod(map(length, [])) currently fails. This is not the most efficient code (mapreduce works better) but demonstrates a case where the current situation of bailing out because there's no one(Any) is just not appropriate.

Because most Julia code is type-stable, it is easy to forget that some code cannot be made type-stable.

Then in the interest of maintaining as much type stability as possible, wouldn't it make more sense to continue to error on sum([]) and prod([]) rather than returning something of a different type?

sum(::Vector{Any}) is already type-unstable, and it couldn't be any way else. The only time where it fails is when the vector is empty, which is a very annoying inconsistency.

Am I understand this right?
This would be a large change to the behaviour of addition of Bools.

I suspect making true + true == false will break a lot of code.
For example
I often calculate the accuracy of a binary classifier by:

mean(output .== ground_truth)

However this is probably bad.
It doesn't work in most languages.

I am in-favor of breaking lots of code, to get the language right, prior to hitting v1.0

@ararslan I cannot stress enough that Vector{Any} is an important case that should be dealt with properly, because this is commonly what is produced whenever code can't be made perfectly type-stable. As it stands, to find the sum of elements in a Vector{Any}, the code must be isempty(xs) ? 1 : sum(xs). This is not just a theoretical concern; it comes up in practice in extremely common areas.

@oxinabox I propose that Bool be made non-numeric, and that UInt1 be introduced to model Fâ‚‚. Note that the common notational meaning of + and * on Bool is | and & respectively; these already have symbols in Julia and so defining + and * on Bool is not necessary and can be removed.

To calculate the accuracy of a binary classifier, you can just convert the Bools to Float64 first: mean(Float64(x) for x in output .== ground_truth).

For @oxinabox's specific case there, you can pass a function/type constructor as the first argument to mean (same goes for sum and prod), so you can do mean(Float64, output .== ground_truth).

+1 for UInt1 - it sounds really useful to have Z2. What will a bit in a BitArray be, though? Will we need BoolArray?

I also kind-of like having true + true == 2 but I agree it is kind-of disgusting at the same time.

As it stands, to find the sum of elements in a Vector{Any}, the code must be isempty(xs) ? 1 : sum(xs)

I assume that is meant to be a zero, not a one. We could specialize sum(::AbstractVector{Any}) etc to do exactly this. Or even better, why not just define a method for zero(Any) and one(Any)? For instance, if zero(::Type{Any}) = 1 (or 1.0, maybe) then zero(Any)::Any so it's all very consistent.

(As an aside, in other contexts I've also been wanting a singleton One() and Zero(), which might be useful here? For a simple example. consider Float64(Zero) (meaning roughly convert(Float64, Zero())) instead of zero(Float64). We _don't_ have pi(Float64) but we do have Float64(pi) - because it makes sense that way!).

sum([]) should return UInt1(0)

@TotalVerb, the problem is that UInt1(0) is the additive identity only for numbers, whereas sum (by design) works for anything with a + operation (and ideally zero too). There is simply no satisfactory way to define sum for Any[], because there is no additive identity that works for any conceivable type defining +.

This is why zero(Any) is undefined.

because there is no additive identity that works for any conceivable type defining +.

Maybe this plays nicer with my suggestion of Zero() such that MyType(::Zero) should be defined along with +. We can have Any(::Zero) = Zero() and sum([]) = Zero() and if you try to add that result to anything else that supports + then it would all work out (we could have +(x, ::Zero) = x, etc, already defined in Base). It won't always be type stable, but it might always be "logically stable".

@andyferris, honestly, it seems like an opaque Zero() will confuse a typical user, it won't work with lots of methods (imagine a user doing sin(sum(A)) and passing A=[]), and it adds one more obscure method to the list of things a type needs to implement. Anyone who understands Zero() can surely use T[] instead of [].

How often does one take the sum of non-numbers? Nobody is actually writing sum([]). This is about handling the empty case sanely, in 90% of cases at least, which is a strict improvement over an error, which handles 0% of cases.

How often does one take the sum of non-numbers?

I would venture a guess that it's rather common amongst people coming to Julia from R. That said, I don't think sum(::Vector{Bool}), akin to Lyndon's example, is what should actually be used in Julia; that's why we have count.

I think another important question we should ask is: how often does one take the sum of Array{Any}? IMHO it might even make more sense to have _all_ reduction operations fail on Array{Any}, not just the empty case, perhaps saying something like "Cannot reduce an array of type Any. Convert the elements of the array to a common type first." I think that may be annoying in some cases, but folks will end up with type-stable code as a result.

If UInt1 becomes a thing, I think we'll have to think carefully about deprecation so as not to suddenly and mysteriously break things that relied on these kinds of operations on Bools that we didn't foresee.

@tkelman, it's not that uncommon to sum arrays of arrays of numbers, for example. And note that we throw an error for sum(Vector{Int}[]) too; it would be weird if _loosening_ the array type made it "work" (but probably not give what you want).

Giving something that is _definitely wrong_ 5% of the time is worse than giving an error 100% of the time, in my opinion.

@ararslan, Julia is a still a dynamic language. In many other contexts, we allow objects of type Any and determine method applicability at runtime. I don't see why sum of non-empty Any[...] arrays shouldn't be the same — if + works, it should work.

Anyway, this discussion is a bit derailed. Whether a UInt1 type makes sense is independent of whether zero(::Type{Any}) = UInt1(0), and the empty case of sum can be (and has been) discussed elsewhere.

See also e.g. #8092, #12461, #6994, #6069 for discussions of sum and the empty-array case.

@stevengj It is precisely because all those are closed that I'm discussing this here. Furthermore, I'm reminded of

julia> map(() -> 2)
2

which is a generalization to the empty case that has a similar objection—it is an incorrect generalization in some cases; perhaps I would have expected [2] or [2, 2, 2]. But yet in this case, we do not throw an error, because returning the simplest possible result is sane in many situations.

Also, I disagree that this discussion is irrelevant to UInt1. The existence of a zero or a one that can promote upward to any numeric type is a strong endorsement of returning those zeros or ones as a degenerate base case for certain operations. The current situation with Bool is unfortunate, and it is understandable why sum([]) should not return false, for a variety of reasons including user bewilderment.

A naïve implementation of the Flatten iterator type demonstrates why a sane default is really not such a bad thing.

immutable Flatten
    xs
end
length(itr::Flatten) = sum(length, itr.xs)  # seems reasonable enough?

Then length(Flatten([])) breaks down.

The map case is very different, because it is a question of dispatch on the argument signature, not on the values.

I disagree that it is very different. It is just as conceivable to imagine map(f, xs...) as it is to imagine sum(xs).

I'm not convinced that we need or want UInt1. If people want Z2 modular arithmetic, it seems like a modular arithmetic package would be a better way to get it.

But we need a type that can be promoted upward to any other type. Currently we're using Bool for this, but UInt1 makes more sense.

What do we need it for? There's not a lot of use cases that aren't inherently type unstable.

For instance, im = Complex(false, true), ε = DualNumber(false, true), etc.

Yeah, I'm not entirely sure those are a good idea anymore. Having an Imaginary type may be better.

I disagree. Should we have three Quaternion types, one for each axis?

Yeah, I'm not entirely sure those are a good idea anymore. Having an Imaginary type may be better.

I disagree. Should we have three Quaternion types, one for each axis?

Sorry to beat the same drum, but we only need two special types for all these cases and more:

zero = Zero()
one = One()
# (::Zero)(::Type{Float64}) = 0.0 (just saying that we can replace the function zero() entirely with call on singleton zero)
im = Complex(zero, one)
ε = DualNumber(zero, one)
q1 = Quaternion(zero, one, zero, zero)
q2 = Quaternion(zero, zero, one, zero)
q3 = Quaternion(zero, zero, zero, one)

I agree that it is potentially confusing to introduce these singletons, and it would be worryingly difficult to get it neat and tidy (e.g. an imaginary number Complex(zero, 1.23) would need two type parameters, Complex{T1,T2}), but I am suggesting these here as a useful alternative to UInt1 if you just want some number which promotes correctly to other numbers.

@andyferris That is indeed a good idea. If we have these singletons, then I wouldn't mind if UInt1 were defined in a separate package.

However, I like Complex{T} better than Complex{T,T}.

Making Complex{T1,T2} would possibly be the most breaking part. E.g. code that has Matrix{Complex{Float64}} will be a matrix of abstract types - so maybe not "breaking" but certainly slowing everything down by an order of magnitude (run-time as well as programmers having to type Complex{Float64, Float64} and Quaternion(Float64, Float64, Float64, Float64}.

But maybe more flexible?

Since these are likely constant-folded at compile-time, Complex{Union{Zero, One}} is likely just fine for performance.

having to type Complex{Float64, Float64}

Unless the language supported default arguments on type parameters, like immutable Complex{T1 <: Number, T2 <: Number = T1}.

Since these are likely constant-folded at compile-time, Complex{Union{Zero, One}} is likely just fine for performance.

Won't work for Imaginary, though.

We simply don't need an Imaginary type; Complex does just fine.

The definition

const im = Complex{Union{typeof(zero), typeof(one)}}(zero, one)

should work.

We could even reuse the functions zero and one for these new numbers.

We simply don't need an Imaginary type; Complex does just fine.

Well it does "just" fine, but Imaginary is a performance enhancement. Consider z * im with complex z. This calculation does two useless multiplications and two useless additions by 0 (or false). Have an imaginary type would effectively double the speed of this calculation.

Furthermore, having zero and one would remove the other useless multiplication by 1 (or true), making it even faster. @StefanKarpinski for these reasons I see these as complementary features/optimizations (i.e. removes the desire for Complex{T1,T2}).

We could even reuse the functions zero and one for these new numbers.

That _is_ interesting. But we have to type typeof(zero) sometimes...

If we must have this Imaginary{T} type, I would prefer it not have its own name, but be Complex{typeof(zero), T}. In any event, I disagree that im should have Imaginary type — it seems far more useful to have it be Complex.

Furthermore, having zero and one would remove the other useless multiplication by 1 (or true),

I also wonder what we can do with constant propagation in this space, e.g. if multiplying by const(1) could be made a no-op in inference or something. Not sure how to make this user-extensible, though.

If we must have this Imaginary{T} type, I would prefer it not have its own name, but be Complex{typeof(zero), T}. In any event, I disagree that im should have Imaginary type — it seems far more useful to have it be Complex.

Not that I completely disagree, but I'm _guessing_ that people would take a new type over huge breakage. But we haven't reached version 1.0 yet, so who knows?

I would also rather target having the optimizer constant-fold the multiplication by im than distinguishing its type. This seems simple enough if * is inlined and the return value of false * x is a compile-time constant. In fact, I wonder if it is not done already.

An Imaginary type is simply a major disaster for usability. Consider the vector [1+0im, 1im, -1+0im, -1im] which would suddenly have type Any. Sure, this can be fixed with 0+1im, but that is just unnecessary typing.

For what it's worth,

julia> f(x) = x * im
f (generic function with 1 method)

julia> @code_native f(0+0im)
    .text
Filename: REPL[25]
    pushq   %rbp
    movq    %rsp, %rbp
Source line: 1
    movabsq $140696697704352, %rax  # imm = 0x7FF680B00FA0
    movb    (%rax), %cl
    movb    1(%rax), %r9b
    xorl    %r8d, %r8d
    andb    $1, %cl
    movq    (%rsi), %rdx
    movq    8(%rsi), %rsi
    movq    %rsi, %rax
    cmoveq  %r8, %rax
    testb   %cl, %cl
    movq    %rdx, %rcx
    cmoveq  %r8, %rcx
    andb    $1, %r9b
    cmoveq  %r8, %rsi
    subq    %rsi, %rcx
    testb   %r9b, %r9b
    cmoveq  %r8, %rdx
    addq    %rax, %rdx
    movq    %rcx, (%rdi)
    movq    %rdx, 8(%rdi)
    movq    %rdi, %rax
    popq    %rbp
    retq
    nopw    %cs:(%rax,%rax)

My x86 assembly is rusty but I don't see a mulq there. I suspect the optimization you're looking for is already done.

As a footnote, Prof. Kahan advocates using an Imaginary type, but purely for reasons of numerical accuracy.

If it is necessary for numerical accuracy, then I could support an Imaginary type. But nevertheless, we still need a zero and one value, unless we want to define 3 types for Quaternion and 7 types for Octonian.

My x86 assembly is rusty but I don't see a mulq there. I suspect the optimization you're looking for is already done.

Wow that is really long! It just has to reorder two numbers and apply minus to one of them. I've been staring at _StaticArrays_ assembly a lot and 2-vector code is typically much, much shorter.

I think that LLVM does a lot of the optimizations for multiplying by 1, etc, not inference/codegen.

Consider the vector [1+0im, 1im, -1+0im, -1im]

Doesn't this run through promote_type and thus become Vector{Complex{Int}}? But, yes I imagine there are other just as annoying corner cases here with a very similar problem.

For reference:

julia> f(z) = Complex(-z.im, z.re)
f (generic function with 1 method)

julia> @code_native f(0+0im)
        .text
Filename: REPL[2]
        pushq   %rbp
        movq    %rsp, %rbp
Source line: 1
        xorl    %eax, %eax
        subq    8(%rdx), %rax
        movq    (%rdx), %rdx
        movq    %rax, (%rcx)
        movq    %rdx, 8(%rcx)
        movq    %rcx, %rax
        popq    %rbp
        retq
        nopl    (%rax)

Really not optimal.

This is an optimization that should be possible, so if it is not optimal, that should be fixed.

@andyferris I've figured out the problem: the compiler won't constant-fold im. This works:

julia> macro im()
           :(Complex(false, true))
       end
@im (macro with 1 method)

julia> f(x) = x * @im
f (generic function with 1 method)

julia> @code_llvm f(0+0im)

define void @julia_f_71105(%Complex.68* noalias sret, %Complex.68*) #0 {
top:
  %2 = getelementptr inbounds %Complex.68, %Complex.68* %1, i64 0, i32 1
  %3 = getelementptr inbounds %Complex.68, %Complex.68* %1, i64 0, i32 0
  %4 = load i64, i64* %2, align 8
  %5 = sub i64 0, %4
  %6 = load i64, i64* %3, align 8
  %.sroa.0.0..sroa_idx = getelementptr inbounds %Complex.68, %Complex.68* %0, i64 0, i32 0
  store i64 %5, i64* %.sroa.0.0..sroa_idx, align 8
  %.sroa.2.0..sroa_idx1 = getelementptr inbounds %Complex.68, %Complex.68* %0, i64 0, i32 1
  store i64 %6, i64* %.sroa.2.0..sroa_idx1, align 8
  ret void
}

julia> @code_native f(0+0im)
    .text
Filename: REPL[16]
    pushq   %rbp
    movq    %rsp, %rbp
Source line: 1
    xorl    %eax, %eax
    subq    8(%rsi), %rax
    movq    (%rsi), %rcx
    movq    %rax, (%rdi)
    movq    %rcx, 8(%rdi)
    movq    %rdi, %rax
    popq    %rbp
    retq
    nopl    (%rax)

Due to the problem being an inability to constant-fold im, I suspect Imaginary will do nothing to fix this performance issue.

OK, thanks @TotalVerb. If that is reliable, maybe some of the problems disappear (but not necessarily all of the things Imaginary could do).

The only thing that Imaginary can do that Complex with constant folding can't is produce a more precise result, and Prof. Kahan asserts that there are situations where that can be the case. This can justify an Imaginary type, but I don't think it is appropriate to set im = Imaginary(true) — I think those cases where numeric accuracy are important can explicitly be annotated as Imaginary(x).

Also, even with an Imaginary type, and even if we set im = Imaginary(one), we still need either one or UInt1(1).

I spend the last day writing a function in LLVM IR that operates on 2bits (packed in 64bits). This meant working with <32 x i2>. LLVM supports arbitrary bit lengths for integer and it would be nice to have support for them in Julia too. (Even of only through bitstype 2 UInt2). Maybe we could relax the restriction on the size in bitstype N UIntNthat it needs to be a multiply of 8? These types make little sense standalone, but as part of vectors they simplify working with packed data.

I am experimenting right now with relaxing the requirement on bitstype so that we might support odd types.

It's not necessary to change how bitstypes work. You can instead implement n-bit integers by storing them in m bits, where m>=n, rounded up to the next byte. I've done this in SmallInts.

My motivation there was to test algorithms that use integers while being able to test the algorithms for all possible inputs. Using integers with fewer bits makes it easier (faster) to iterate over all of them, especially if an algorithm expects multiple arguments.

(Apologies for not remembering that I already implemented a Uint1 type there.)

Notwithstanding the technical points that have been raised, I think the big conceptual issue remains: Is a Bool (or an element of BitArray for that matter) meant to represent a number or a logical value?

If a Bool is a number then its two values should be 0 and 1 (not true and false) and these should have the usual meaning and behavior of 0 and 1 in arithmetic contexts, e.g. 1+1 promotes to 2. In this interpretation the logical operations (|, &, $, ~) are foreign. On the other hand if a Bool represents a logical value, then the two values should be true and false and the logical operations are the appropriate ones to implement; in this case the arithmetic operators are puns with ambiguous interpretation.

I think it would be preferable to have numeric types and logical types treated rather distinctly, e.g. to deprecate support of arithmetic operators (+,*, etc.) and arithmetic functions (e.g. sum, sin) for logical values in favor of logical operations. In those cases where the user wants to treat an instance of one type as an instance of the other, I don't think it will be awkward to require explicit conversions. It will certainly be better for code clarity. And, as @StefanKarpinski suggested, if people really want to do algebra over GF(2), that can be achieved by importing a package that implements +,* for Bool (or UInt1 or some other type yet to be proposed.) Putting such behavior in a package rather than in Base will at least force people to think about what behavior they want, instead of silently surprising them with a default behavior they may not expect.

I agree with most of your points. But |, &, $, and ~ are supported on all integer types, so I would argue they are not foreign to UInt1.

Also, I really think GF(2) is too important to leave out of Base. It's the most natural definition of im and similar.

@TotalVerb Yes, |, &, $, and ~ are supported for integer types, but they are foreign in the sense that they are imparting an additional structure on integers, namely a base-2 representation.

I agree that GF(2) is very important and should be provided in the standard distribution. But the question is, should this be the _default_ behavior of Bools? I think both true + true == 2 and true + true == false are reasonable interprations, and I think many people will prefer one behavior will many will prefer the other.

An approach I think would be satisfactory for most people would be the following: Have the standard distribution come with two modules, with descriptive names such as BoolAsInt and BoolAsGF2, providing the two different behaviors for arithmetic operations on Bools. Neither module would be used by default, and an expression such as true + true would issue an error along with a suggestion to either convert the bool values to numbers, or to use one of the two modules to implement the desired behavior. With this approach, anybody can easily obtain either desired behavior (even different behaviors in different parts of a project) and there is no danger of unexpected behavior.

I oppose the idea of having two different modules. Base should provide a sensible default, not excessive customizability.

@TotalVerb I agree that having a sensible default would be preferable, provided it doesn't cause confusion or make it too awkward to achieve the other behavior when desired. Especially in this case, since both behaviors seem about equally reasonable and useful. As someone fairly new to Julia, I can think of only four ways to support both behaviors:

  1. Provide a definitive behavior of +, *, etc. for Bool, and provide functions with different names to implement the other behavior(s).
  2. Provide a default behavior of +, *, etc. for Bool along with a module that can be imported to override the default behavior (likely to cause all sorts of confusion)
  3. Provide two Bool-like types, on which + and * behave differently (also confusing and awkward)
  4. Provide no default behavior for Bool arithmetic, and have the user choose the desired behavior (if needed) on a per-namespace basis by using the appropriate module

Each option has its advantages and disadvantages, but 1 and 4 seem to me to be the least undesirable.

My opinion is that any time you have two sensible behaviours for a type, and are tempted to allow the user to _choose_ behaviour, then that type is a type-pun: it's serving the purpose of two types. Hence I'm more comfortable with option 3.

My opinion is that any time you have two sensible behaviours for a type, and are tempted to allow the user to choose behaviour, then that type is a type-pun: it's serving the purpose of two types

That's a fair point. So, suppose there is a boolean type for GF(2) arithmetic. Then would it be useful to have a second type for "regular" arithmetic, if it just ends up getting promoting in most operations? Or even a third type with logical (as opposed to algebraic) semantics?

What about

zero(::Type{Any}) = UInt1(0)*I
one(::Type{Any}) = UInt1(1)*I

That is return a UniformScaling, which should behave like zero/one for all types.

@dlfivefifty, UniformScaling is a multiplicative identity. This doesn't make sense if a type implements a group under + and not *.

Right, but it will support the most common case of Number and Array. Then any other group (I'm not aware of any in Base) would through an error that +(::MyGroup,::UniformScaling) is not defined.

I guess only square Matrices as rand(10,3) + 0*I errors out.... nevermind

I think it's safe to say we decided not to do this for v1.0

Was this page helpful?
0 / 5 - 0 ratings