Rust: narrowing_rem

Created on 30 May 2020  路  13Comments  路  Source: rust-lang/rust

To remove some unsafe "as" casts and keep the code safe (lossless) and nice, sometimes I'd like to use a rem+cast. So what do you think about adding this to the stdlib?

trait NarrowRem<Out> {
    fn narrowing_rem(&self, den: Out) -> Out;
}

impl NarrowRem<u8> for u16 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u16::from(den)) as u8 }
}
impl NarrowRem<u8> for u32 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u32::from(den)) as u8 }
}
impl NarrowRem<u16> for u32 {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % u32::from(den)) as u16 }
}
impl NarrowRem<u8> for u64 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u64::from(den)) as u8 }
}
impl NarrowRem<u16> for u64 {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % u64::from(den)) as u16 }
}
impl NarrowRem<u32> for u64 {
    fn narrowing_rem(&self, den: u32) -> u32 { (*self % u64::from(den)) as u32 }
}
impl NarrowRem<u8> for u128 {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % u128::from(den)) as u8 }
}
impl NarrowRem<u16> for u128 {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % u128::from(den)) as u16 }
}
impl NarrowRem<u32> for u128 {
    fn narrowing_rem(&self, den: u32) -> u32 { (*self % u128::from(den)) as u32 }
}
impl NarrowRem<u64> for u128 {
    fn narrowing_rem(&self, den: u64) -> u64 { (*self % u128::from(den)) as u64 }
}

impl NarrowRem<i8> for i16 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i16::from(den)) as i8 }
}
impl NarrowRem<i8> for i32 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i32::from(den)) as i8 }
}
impl NarrowRem<i16> for i32 {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % i32::from(den)) as i16 }
}
impl NarrowRem<i8> for i64 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i64::from(den)) as i8 }
}
impl NarrowRem<i16> for i64 {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % i64::from(den)) as i16 }
}
impl NarrowRem<i32> for i64 {
    fn narrowing_rem(&self, den: i32) -> i32 { (*self % i64::from(den)) as i32 }
}
impl NarrowRem<i8> for i128 {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % i128::from(den)) as i8 }
}
impl NarrowRem<i16> for i128 {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % i128::from(den)) as i16 }
}
impl NarrowRem<i32> for i128 {
    fn narrowing_rem(&self, den: i32) -> i32 { (*self % i128::from(den)) as i32 }
}
impl NarrowRem<i64> for i128 {
    fn narrowing_rem(&self, den: i64) -> i64 { (*self % i128::from(den)) as i64 }
}

impl NarrowRem<u8> for usize {
    fn narrowing_rem(&self, den: u8) -> u8 { (*self % usize::from(den)) as u8 }
}
impl NarrowRem<u16> for usize {
    fn narrowing_rem(&self, den: u16) -> u16 { (*self % usize::from(den)) as u16 }
}

impl NarrowRem<i8> for isize {
    fn narrowing_rem(&self, den: i8) -> i8 { (*self % isize::from(den)) as i8 }
}
impl NarrowRem<i16> for isize {
    fn narrowing_rem(&self, den: i16) -> i16 { (*self % isize::from(den)) as i16 }
}

An example usage:

fn main() {
    let _x: u8 = 152_u64.narrowing_rem(51_u8);
}

D language performs this lossless cast operation automatically and transparently (while it doesn't perform lossy casts silently):

uint foo(in ulong x) { // Error: cannot implicitly convert
    return x;
}
uint bar(in ulong x) { // OK
    return x % 1000;
}
void main() {}
C-feature-request T-libs

Most helpful comment

There is nothing wrong with as in user code

It's just a matter of how you define "unsafe". If you use a narrow definition of memory safety, then you're right. But Rust designers have understood that's a insufficient definition. A cast from floating point to integral value was undefined behaviour. We have patched this problem, leaving a escape hatch (to_int_unchecked). The narrowing cast between two integral values using "as" is well defined, it's not undefined behaviour, but if you write lot of numerical code in Rust you slowly understand that current "as" casts are unsafe and they lead to problems in C code. In Rust there's try_from to avoid this risk (but there isn't a function like try_from that becomes an "as" in release mode only). So a wide coding experience in C/Rust and other languages shows you that reducing the number of naked "as" casts in your user code is a good idea to avoid bugs. That's why I have suggested this narrowing rem that's a hopefully sufficiently common case that's safe (also handled by the D language type system).

All 13 comments

Why not the following?

let _x: u8 = (152_u64 % 51_u64) as u8;

Because it contains a naked "as" in user code.

There is nothing wrong with as in user code, it's not unsafe to use. Furthermore, you probably know that the the RHS fits in the type you will be casting after, although I can see how your proposed implementation would guarantee this to hold.

There is nothing wrong with as in user code

It's just a matter of how you define "unsafe". If you use a narrow definition of memory safety, then you're right. But Rust designers have understood that's a insufficient definition. A cast from floating point to integral value was undefined behaviour. We have patched this problem, leaving a escape hatch (to_int_unchecked). The narrowing cast between two integral values using "as" is well defined, it's not undefined behaviour, but if you write lot of numerical code in Rust you slowly understand that current "as" casts are unsafe and they lead to problems in C code. In Rust there's try_from to avoid this risk (but there isn't a function like try_from that becomes an "as" in release mode only). So a wide coding experience in C/Rust and other languages shows you that reducing the number of naked "as" casts in your user code is a good idea to avoid bugs. That's why I have suggested this narrowing rem that's a hopefully sufficiently common case that's safe (also handled by the D language type system).

FWIW I agree that as should be avoided -- in Miri we are now using TryFrom everywhere which makes me feel much better as I know we don't silently truncate anything.

However @Elinvynia I am a bit surprised to see the "typesystem" label here. This is just a libs addition not affecting the type system, isn't it?

From my understanding of this request, I thought that it having to do with "type conversions" would be a good reason to include that label, what would be more appropriate here?

in Miri we are now using TryFrom everywhere which makes me feel much better as I know we don't silently truncate anything

But TryFrom in some performance-sensitive places is heavy. So in my code I use a short to!{} macro that performs the TryFrom in debug mode and uses "as" in release mode. And another little macro for FP=>Int casts, that uses "as" in debug mode and to_int_unchecked in release mode.

But TryFrom in some performance-sensitive places is heavy. So in my code I use a short to!{} macro that performs the TryFrom in debug mode and uses "as" in release mode.

Yeah I can see how that's useful. For Miri, arithmetic is not our limiting factor in terms of performance. In fact I'd prefer our additions etc. to be overflow-checked in release mode as well.

And another little macro for FP=>Int casts, that uses "as" in debug mode and to_int_unchecked in release mode.

oO okay that's UB when anything goes wrong (i.e., this is an unsafe-to-use macro), I would not accept that in my own codebases without benchmarks for each individual invocation of the macro. ;) But of course, everyone makes their own decisions for these trade-offs.

Btw, IIRC people were looking for feedback on cases where the as float-to-int casts are measurably costing performance -- that should then likely be optimized better so people do not have to resort to unsafe code. See https://github.com/rust-lang/rust/pull/71269#issuecomment-615537137.

I would not accept that in my own codebases without benchmarks for each individual invocation of the macro. ;)

Yes, I use this second fto!{} macro only in special cases, and after benchmarking.

In the meantime I've found about a dozen usage points for this narrowing_rem in my codebase.

The code could be simplified a bit using Self:

fn narrowing_rem(&self, den: u8) -> u8 { (*self % Self::from(den)) as u8 }

But now I'm trying to use a (possibly overengineered) more complex version:

#![feature(core_intrinsics, const_fn, const_panic)]

use core::intrinsics::assume;
use std::num::{NonZeroU8, NonZeroU16};


trait HasNonZero {
    type NonZero;
    fn from_nonzero(v: Self::NonZero) -> Self;
}
impl HasNonZero for u8 {
    type NonZero = NonZeroU8;
    fn from_nonzero(v: Self::NonZero) -> Self { v.get() }
}
impl HasNonZero for u16 {
    type NonZero = NonZeroU16;
    fn from_nonzero(v: Self::NonZero) -> Self { v.get() }
}
//...

//- - - - - - - - - - - - - - - - - - - -

// To be used at compile-time, it raises a panic. TODO until Option::unwrap becomes const.

const fn nonzero_u8(x: u8) -> NonZeroU8 {
    match x {
        0 => panic!("nonzero: x == 0"),
        _ => unsafe { NonZeroU8::new_unchecked(x) },
    }
}

const fn nonzero_u16(x: u16) -> NonZeroU16 {
    match x {
        0 => panic!("nonzero: x == 0"),
        _ => unsafe { NonZeroU16::new_unchecked(x) },
    }
}
//...

//- - - - - - - - - - - - - - - - - - - -

trait NarrowRem<Out> where Out: HasNonZero {
    fn narrowing_rem(&self, den: Out::NonZero) -> Out;
}
impl NarrowRem<u8> for u32 {
    #[inline(always)]
    fn narrowing_rem(&self, den: <u8 as HasNonZero>::NonZero) -> u8 {
        let divisor = Self::from(u8::from_nonzero(den));
        unsafe { assume(divisor != 0); } // TODO, will be unnecessary.
        (*self % divisor) as u8
    }
}
impl NarrowRem<u8> for usize {
    #[inline(always)]
    fn narrowing_rem(&self, den: <u8 as HasNonZero>::NonZero) -> u8 {
        let divisor = Self::from(u8::from_nonzero(den));
        unsafe { assume(divisor != 0); } // TODO, will be unnecessary.
        (*self % divisor) as u8
    }
}
//...

nonzero_uX are now useless, we can define NonZeroUX::new().unwrap() in constants.

Was this page helpful?
0 / 5 - 0 ratings