Maybe this should be divided into several different issues, but here is a compilation of bugs with implicit conversions that should work, mostly concerning unsigned types:
import unsigned

proc uintConvTest() =
  var b:byte = 5 # the only line here that works.
  var y:int = b + 2 # Error: type mismatch: got (Byte) but expected 'int'
  var u:uint32 = 2147483648 # Error: type mismatch: got (int64) but expected 'uint32'
  assert u > 0 # Error: type mismatch: got (int literal(0), uint32)
  assert u == 2147483648 # Error: type mismatch: got (uint32, int64)

proc intConvTest() =
  let x = 5000'i8
  echo 5000'i8, " == ", x # 5000 == -120

intConvTest()
uintConvTest()
Unsigned ints are also not accepted for indexing into arrays. At least types where uintT.max < int.max should be accepted.
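For illustration, a minimal sketch of the indexing that is currently rejected (my example, not code from the thread):

var arr: array[0..3, int]
let i: uint8 = 2
echo arr[i] # currently a type mismatch; safe in principle, since uint8.max < int.max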
Fixing these issues is surprisingly problematic:
Most of them can be fixed by allowing implicit conversion from uint to int, at least in most cases. I tried doing this by adding the following to unsigned.nim:
when sizeof(int) > 1:
  converter toInt*(x: uint8): int = cast[int](x)
when sizeof(int) > 2:
  converter toInt*(x: uint16): int = cast[int](x)
when sizeof(int) > 4:
  converter toInt*(x: uint32): int = cast[int](x)

converter toInt16*(x: uint8): int16 = cast[int16](x)
converter toInt32*(x: uint8): int32 = cast[int32](x)
converter toInt32*(x: uint16): int32 = cast[int32](x)
converter toInt64*(x: uint8): int64 = cast[int64](x)
converter toInt64*(x: uint16): int64 = cast[int64](x)
converter toInt64*(x: uint32): int64 = cast[int64](x)
The problem is that this caused $ to stop working for uint, because the compiler suddenly had three different versions of $ to select from (int, int64 and uint64) and didn't know which one to pick.
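To illustrate (a minimal sketch of the ambiguity, assuming the converters above are in scope):

var x: uint32 = 5
echo $x # ambiguous: x can now become int (toInt), int64 (toInt64) or widen to uint64,
        # so three versions of $ match equally well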
var y:int = b + 2 # Error: type mismatch: got (Byte) but expected 'int'
This line could be fixed by adding an implicit conversion from uint to int in the compiler.
var u:uint32 = 2147483648 # Error: type mismatch: got (int64) but expected 'uint32'
This line is tricky to fix, because the compiler represents large numbers as int64 literals. You would need to allow implicit conversion from int64 to uint32, but that's unsafe in the general case. On the other hand, you already need to use 'u64 for large uint64 literals, so maybe we should just accept that you need 'u32 for large uint32 literals as well.
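For comparison, the suffix workaround (confirmed later in this thread to work):

var u: uint32 = 2147483648'u32 # the 'u32 suffix types the literal as uint32 directly
assert u == 2147483648'u32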
assert u > 0 # Error: type mismatch: got (int literal(0), uint32)
assert u == 2147483648 # Error: type mismatch: got (uint32, int64)
These lines could be fixed by implementing comparisons between unsigned and signed ints, which should be possible and safe.
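A sketch of how such comparisons could be implemented safely (my illustration, not existing library code):

proc `<`*(x: int64, y: uint32): bool =
  # a negative signed value is smaller than any unsigned value;
  # otherwise uint32 always fits losslessly in int64, so compare there
  x < 0 or x < int64(y)

proc `==`*(x: int64, y: uint32): bool =
  x >= 0 and x == int64(y)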
let x = 5000'i8
echo 5000'i8, " == ", x # 5000 == -120
I'm not sure why you'd want this behaviour? Or is this something that has already been fixed?
I looked at doing implicit conversion of uint to int in the compiler, and I got the same problem as when I added the converters.
But I think there must be some special check in the compiler related to ints that helps it select the right version of $ (or similar).
I'm not sure why you'd want this behaviour? Or is this something that has already been fixed?
It seems to have been fixed. It used to print the string in the comment.
I looked at doing implicit conversion of uint to int in the compiler, and I got the same problem as when I added the converters.
Yeah, I hit the same problem (the $ not knowing what to do) when I tried to fix it in the compiler. But there must be a way to fix it!
I managed to fix it so that implicit conversion worked correctly, and $ did not get confused. See the commit in my fork of Nim.
But now the issue is that it causes the wrong version of e.g. shl to be selected. When you type foo shl 3, where foo is uint16, it casts foo to int and uses the int version of shl. This causes existing code to break: if you try to assign the result back to a uint16, you get a compile error.
I'm not quite sure how to proceed, but I'll think about it.
Does the following work?
proc `shl`*[T, U: distinct SomeInteger](x: T, y: U): T {.inline, noSideEffect.} = x shl T(y)
proc `shr`*[T, U: distinct SomeInteger](x: T, y: U): T {.inline, noSideEffect.} = x shr T(y)
proc `or`*[T, U: distinct SomeInteger](x: T, y: U): T {.inline, noSideEffect.} = x or T(y)
proc `and`*[T, U: distinct SomeInteger](x: T, y: U): T {.inline, noSideEffect.} = x and T(y)
proc `xor`*[T, U: distinct SomeInteger](x: T, y: U): T {.inline, noSideEffect.} = x xor T(y)
proc `div`*[T, U: distinct SomeInteger](x: T, y: U): T {.inline, noSideEffect.} = x div T(y)
Or does it still give the same wrong conversion?
I'll check, but I think the solution must be some kind of priority scheme for which conversions to perform: literal -> SomeInteger should be high priority, SomeInteger -> int should be mid priority (to resolve $), and uintX -> intY should be low priority.
Maybe this priority number could be encoded as a pragma on the converters, and maybe this would allow the funky int conversion rules to be expressed in the manual and system.nim in a clear way, rather than being compiler magic.
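Purely hypothetical syntax, just to make the idea concrete (no such pragma exists in Nim):

converter toInt*(x: uint32): int {.conversionPriority: 1.} = cast[int](x)
  # hypothetical: 1 = low priority, only tried when no higher-priority conversion applies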
Now that your last pull request about uint64 literals has been accepted, the compiler spits out the error with the confused $.
In my code above, I think you should use cast[T](y) for or and xor, because many of the conversions may be rejected as invalid by the compiler.
Have you talked with Araq about those priority numbers for converters? It seems like a sane system if it can be made to work for all combinations of types in complex expressions.
ReneSac: Yeah, I messed up: I pushed a commit with the implicit conversion to my fork of Nim, not realising that it got included in my pull request for uint64 literals.
I haven't discussed my thoughts about the type conversion priority stuff with Araq. I've just tried to understand what's already there first, but it's a bit complicated :p
The foo'u8 xor 5 problem is an issue with integer literals. It would be solved if the _integer literal_ defaulted to the smallest integer type that can hold its value, and was widened from there as needed by the rules you made.
The only place you really need an _integer literal_ to directly become an _int_ is in duck-typed assignments like var x = 1, right? Or am I forgetting something?
This already works for floating point types:
var a: float32 = 3.0 # float32
var b = 4.0 # float
var c = a * 2.0 # float32
var d = b * c # float
It should be possible to do the same for integers too.
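The integer analogue would then look like this (the desired behaviour under this proposal, not what the compiler does today):

var a: uint8 = 3 # uint8
var b = 4        # int (the duck-typed default)
var c = a * 2    # would stay uint8, with the literal adapting like 2.0 does above
var d = b * c    # would widen to int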
The shl and shr should be fixed using [T: SomeUnsignedInt, U: SomeInteger](x: T, y: U): T (above I forgot about the distinction between signed and unsigned shifts, but it would be OK because the ones defined in system.nim are more specific). Maybe instead of SomeInteger we could use Natural, but I'm not sure how cumbersome that might be. Negative shifts are possible but most likely not what you expect, and probably an error; I'm not sure whether trying to catch this possible error statically would be too annoying. And maybe there is some code depending on the behaviour of negative shifts...
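Spelled out, that suggestion would look like this (a sketch, reusing the inline/noSideEffect style of the earlier proposal):

proc `shl`*[T: SomeUnsignedInt, U: SomeInteger](x: T, y: U): T {.inline, noSideEffect.} =
  x shl T(y) # resolves to the concrete unsigned shl, which is more specific
proc `shr`*[T: SomeUnsignedInt, U: SomeInteger](x: T, y: U): T {.inline, noSideEffect.} =
  x shr T(y)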
The operations or, xor and and should ideally cast the operand with the smaller width to the wider one, and not find the least common type between them, which can be wider than both when mixing signed and unsigned. This could be done with lots of overloads in the function definitions, and I mean lots, but it is the easiest way.
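For example (my illustration of the widening rule):

proc `xor`*(x: uint8, y: int16): int16 {.inline, noSideEffect.} =
  # uint8 always fits in int16, so converting the smaller operand is lossless
  int16(x) xor y

For operands of the same width but different signedness, a cast would be needed instead, as noted above.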
For the rest of the operations, the corrected rules of widening to a common type should do the right thing. Well, you might argue that for division, for example, one could keep the smaller type, but I don't think that warrants creating an exception.
No need for a sophisticated conversion priority system, I think.
I tried the skyfex fork. I still had trouble with the following code snippet:
import unsigned
var x = 2.uint8
var shi = 5
x = x > 5
echo x
gives
/tmp/aporia/a5.nim(4, 6) Error: type mismatch: got (int literal(5), uint8)
but expected one of:
system.<(x: TEnum, y: TEnum): bool
system.<(x: string, y: string): bool
system.<(x: char, y: char): bool
system.<(x: set[T], y: set[T]): bool
system.<(x: bool, y: bool): bool
system.<(x: ref T, y: ref T): bool
system.<(x: ptr T, y: ptr T): bool
system.<(x: pointer, y: pointer): bool
system.<(x: int64, y: int64): bool
system.<(x: float32, y: float32): bool
system.<(x: int16, y: int16): bool
system.<(x: ordinal[T]): T
system.<(x: int, y: int): bool
system.<(x: T, y: T): bool
system.<(x: int32, y: int32): bool
system.<(x: int8, y: int8): bool
system.<(x: float, y: float): bool
unsigned.<(x: T, y: T): bool
I tried a quick and dirty correction, but got:
"Error: ambiguous call; both system.xor(x: int64, y: int64): int64 and unsigned.xor(x: int64, y: uint64): int64 match for: (int64, int)". Not unlike the $ errors that you said you corrected in your branch.
I made the fixes I proposed to the functions of the unsigned module. I'm not sure how to make a pull request to the skyfex fork, so here is my version: https://gist.github.com/ReneSac/caabcdf4f6d835ece56d
I'm not sure if only those overloads are sufficient for and, or and xor.
Btw, the branch with implicit uint -> int conversion is now back up on:
I compiled your uintconv branch. It really runs everything from uintConvTest() save for the large literal (simply appending 'u32' makes it work), and it also correctly indexes arrays with uint8, for example. YAY!
As for the shl problem:
let foo: uint16 = 256
let bar: uint16 = foo shl 3
I tried my fixed unsigned.nim above, with the following shl definition:
proc `shl`*[T: SomeUnsignedInt, U: SomeInteger](x: T, y: U): T {.magic: "ShlI", noSideEffect.}
  ## computes the `shift left` operation of `x` and `y`.
But the compiler insisted on using the signed integer version of shl. Then I tried the precise overload:
proc `shl`*(x: uint16, y: int): uint16 {.magic: "ShlI", noSideEffect.}
And that works and fixes the problem.
Now the question is why the generic/typeclass overload didn't work as it should.
Testing, I found two additional minor problems that are not showstoppers (they are already a problem currently, so not really regressions):
The following fails because the result is eagerly converted to int, instead of to the smallest integer type that can hold the value.
let a: uint8 = 2
let b: int8 = 3
let c: int16 = a + b
echo c
If I declare b as int16 it works.
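In the meantime, an explicit conversion works around it (my workaround, not a fix):

let c: int16 = int16(a) + int16(b)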
Also:
let x: int64 = 0xFF'i8
let y: uint32 = 0xFF
let z: int64 = y and x
echo z
Gives: Error: ambiguous call; both system.and(x: int64, y: int64): int64 and unsigned.and(x: uint64, y: int64): uint64 match for: (uint32, int64).
If I replace the third line by: let z:uint64 = y and 0xF_FFFF_FFFF it gives almost the same error (but now with int lit).
Additional, more precise overloads for and fix that, as I feared would be needed.
So, the question is: should I write those tons of overloads, or is a more fundamental fix possible, or even needed? I may also write them now, just to get all this unsigned stuff working, and later someone can make a proper fix and remove those workarounds.
Hi
Yeah, I found similar things.
The root of the problem now is that the unsigned functions are defined using a typeclass (SomeUnsignedInt). That causes conversions to uint to be deprioritized.
The thing is, if we were to change the definitions of the signed ints to use a typeclass as well, I think we'd see similar issues. So this is something that Nim will have to resolve in the future.
But for the moment, I think the solution is just to define all the unsigned operations using concrete types.
I changed unsigned.nim to not use typeclasses, and pushed it to my uintconv branch.
With this I finally don't get any regressions. ReneSac, do you want to test it a bit and see if you notice anything unexpected?
Ok, many problems...
Nim code now behaves differently depending on whether you import unsigned or not. If you don't, many unsigned operations will continue to work, but now produce signed numbers of larger width as results. I'm not sure how desirable that is... I guess this means unsigned must really be moved to system.
Here $ still gets confused when trying to print a uint32. Wasn't that solved? Maybe I made some mistake in the version I got from git?
And finally, there is still the problem with bitwise operations between signed and unsigned integers that I described before. The result of a xor between two numbers of the same width, but one of them signed, is a signed number of double the width. This is undesirable. Such an operation would fail in a previous Nim version.
I tried the lots-of-overloads method, but no go:
proc `and`*[T: SomeUnsignedInt](x, y: T): T {.magic: "BitandI", noSideEffect.}
proc `and`*(x: uint8, y: int8): uint8 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: int8, y: uint8): int8 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: uint16, y: int16): uint16 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: uint16, y: int8): uint16 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: int16, y: uint16): int16 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: int8, y: uint16): int16 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: uint32, y: int32): uint32 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: uint32, y: int16): uint32 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: uint32, y: int8): uint32 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: int32, y: uint32): int32 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: int8, y: uint32): int32 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: int16, y: uint32): int32 {.magic: "BitandI", noSideEffect.}
proc `and`*(x: uint64, y: int64): uint64 {.magic: "BitandI64", noSideEffect.}
proc `and`*(x: uint64, y: int32): uint64 {.magic: "BitandI64", noSideEffect.}
proc `and`*(x: uint64, y: int16): uint64 {.magic: "BitandI64", noSideEffect.}
proc `and`*(x: uint64, y: int8): uint64 {.magic: "BitandI64", noSideEffect.}
proc `and`*(x: int64, y: uint64): int64 {.magic: "BitandI64", noSideEffect.}
proc `and`*(x: int8, y: uint64): int64 {.magic: "BitandI64", noSideEffect.}
proc `and`*(x: int16, y: uint64): int64 {.magic: "BitandI64", noSideEffect.}
proc `and`*(x: int32, y: uint64): int64 {.magic: "BitandI64", noSideEffect.}
gives
# lib/pure/net.nim(1131, 69) Error: ambiguous call; both unsigned.and(x: uint32, y: int16): uint32
# and unsigned.and(x: uint32, y: int32): uint32 match for: (uint32, int literal(255))
If you don't, many unsigned operations will continue to work, but now produce signed numbers of larger width as results.
I would argue that this is a desirable property. Many C libraries use unsigned integers. It is nice to be able to use those values and have them automatically converted to signed numbers, even when you use them directly in an expression.
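For example (a sketch of the convenience being argued for, assuming the implicit uint -> int conversion):

let n: uint32 = 42 # imagine this value came from a C library binding
let total = n + 10 # just works, widening n to int instead of requiring int(n) + 10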
Here $ still gets confused when trying to print a uint32. Wasn't that solved? Maybe I made some mistake in the version I got from git?
Hmm, I thought so. I'll take a look.
The result of a xor between two numbers of the same width, but one of them signed, is a signed number of double the width. This is undesirable. Such an operation would fail in a previous Nim version.
Yeah, unfortunately there's no elegant way around that, I think. It is a bit of an odd corner case though.
So, does $ for uint32 fail on your side too?
There is a solution to at least some of these problems provided by the Go programming language. In Go, constant integers are by default computed in arbitrary precision, meaning they can never overflow or underflow. As soon as they get assigned to a variable of a certain type, the compiler checks whether the value is in the range of the underlying type.
Here it is explained pretty well why they made that design decision: https://blog.golang.org/constants
That's what Nim already does. http://nim-lang.org/docs/manual.html#types-pre-defined-integer-types
Maybe you should also read Nim's documentation.
I am pretty sure I read that documentation, and because you linked to it I read it again, and I could again not find the part that says that constants are treated like they are treated in Go. Just to give an example, Go does not do the same as Nim:
var u:uint32 = 2147483648 # Error: type mismatch: got (int64) but expected 'uint32'
In Go this line would be:
var u uint32 = 2147483648
And here it just works without problems; I had to change it to produce an error message. And you can see that the error message is structurally different:
var u uint32 = 21474836480 // constant 21474836480 overflows uint32
The constant is not of any special type like int64 or uint64; it is just a constant of value 21474836480, and only when assigned to a variable of a certain type does the compiler check whether the value is in the range of the underlying type.
var u uint32 = (-17 + 33) * 0.125
This is totally valid Go code, because the result of the expression on the right is exactly 2, which is unarguably in the range of a uint32.
I tried lots and lots of different things in the meantime, and the current rules are as good as it can get. More complex rules also produce weird corner cases. Maybe for Nim v2 I'll switch to C#'s integer promotion rules. I now dislike Nim's "integer literals are special" rule, which is the cause of many of these problems. Go's rules are even worse IMO and just don't fit systems programming. Go internally uses arbitrary-precision integers at compile time that are not available at runtime. That's a PITA when you can run most code at compile time, like Nim supports.
Now, with more experience in the Nim programming language, I have to agree that "integer literals are special" is not that great to work with. It works well in the Go programming language, but Go has no compile-time metaprogramming, there are only constants, and this implicit conversion might be OK in that context.
I am not the biggest fan of implicit type conversions, but Go constants are not only an implicit type conversion, they are also a hidden type that is only available at compile time. At the moment I think that literals should always be of a specific sized type (without a suffix, of type _int_). Arbitrary-precision literals could be added on top of the existing literals, but then their type should also be _ArbitraryPrecisionInteger_ (or _BigInt_ or however you want to call it), and not _int_. This type should then also be available at runtime like any other type. I think with this approach the weird corner cases can go away, and things would become a bit more explicit and less magic.
My latest thought is a special compile-time implicit conversion between all number types: when the value is known at compile time, an implicit conversion from int to uint8, say, could be supported safely.
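For example (the desired behaviour under this idea, not current semantics):

const n = 200
var a: uint8 = n # OK under the proposal: 200 is known at compile time to fit in 0..255
# var b: uint8 = 300 would still be a compile-time error: the value is out of range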