Rfcs: Pre-RFC: Unchecked arithmetic

Created on 31 Jul 2018 · 17Comments · Source: rust-lang/rfcs

On IRC, we recently discussed the speed of addition. I think Rust currently has no way to say “I know this cannot overflow, so just optimize and UB if it actually can”.

Hence the idea of adding unchecked_add, etc., that would be unsafe functions. (addition of an Unchecked<u*> type is left to later assessment)

Specific list of functions that'd be available as unchecked_* is the same set of functions that are currently available as checked_*, overflowing_* and wrapping_* forms:

unchecked_add
unchecked_sub
unchecked_mul
unchecked_div
unchecked_div_euc
unchecked_rem
unchecked_mod_euc
unchecked_neg
unchecked_shl
unchecked_shr
unchecked_pow

Is there any major roadblock/risks that you think I should be aware of before starting to write this RFC? Do you like this idea or not?

For the potential for improvement, even though it's hard to pinpoint the exact reason why, using signed arithmetic instead of unsigned arithmetic in C here made the code run 10% faster -- this is likely an equivalent gain we could have using unchecked variants instead of overflowing (or equivalent) ones. :) (not even counting debug builds, where most of the huge gain could already be had by using eg. overflowing_*, even though it wouldn't really be semantically correct)

Note: this is the same idea as https://github.com/rust-lang/rust/pull/52205 , except it'd be available to user code

A-arithmetic A-unsafe T-lang T-libs

Source

Ekleog

👍9

Most helpful comment

Not yet, what has been tried is this:

pub unsafe fn foo(a: u32, b: u32) -> u32 {
    a.checked_add(b).unwrap_or(std::mem::uninitialized())
}

(playground)

and this

pub unsafe fn foo(a: u32, b: u32) -> u32 {
    if let Some(x) = a.checked_add(b) {
        x
    } else {
        std::hint::unreachable_unchecked()
    }
}

(playground)

Both of which fail to generate add nuw in LLVM IR, and generate add instead.

I can now confirm the same result with

pub unsafe fn foo(a: u32, b: u32) -> u32 {
    let (result, overflowed) = a.overflowing_add(b);
    if overflowed {
        std::hint::unreachable_unchecked();
    } else {
        result
    }
}

(playground)

That said, the example you're giving does look like a reasonable “default cross-platform” implementation for unchecked_* (though I'd likely have used the trick with checked_* instead, given it'd also work nicely with unchecked_div being UB on division-by-0)

BTW, rustc appears to generate most-likely optimal IR for the example using checked_div and unreachable_unchecked (but not the other examples, due to the jump to panic for the overflowing_add example, and I don't really know why for the mem::uninitialized case), apart from maybe questions about idivl / divl as per the linked stackoverflow answer above, which could be investigated further during implementation (but I'd guess idivl couldn't be used here anyway due to the semantics of unsigned division)

Ekleog on 31 Jul 2018

❤2 👍2

All 17 comments

I don’t know how much context you reviewed when discussing this matter, but it would be a good thing for any RFC here to start by reviewing the text of RFC 560, PR: https://github.com/rust-lang/rfcs/pull/560 and the discussion thread linked on its PR, and perhaps the other discussions that preceded it

pnkfelix on 31 Jul 2018

Notably #146 was an important predecessor to #560 that I think tried to incorporate more allowance of undefined behavior, in a scoped manner.

pnkfelix on 31 Jul 2018

Thank you for the pointers! I must say I didn't read in detail all the discussions, but looking at everything that mentions “unsafe” (assuming this would lead to the messages relevant for the current discussion, given I can't imagine such primitives not being “unsafe” and noone mentioning it) and reading about half the other posts to try not to miss any extended discussion, I think that:

#540 looks very centered around what the default operating mode should be, and having wrapping be the default or not.
#540 also debated about returning undefined results in release (instead of the current wrapping behaviour), that were rejected (naturally, can I say a few years later, as that'd be like using mem::undefined() without unsafe)
- The ML thread discussed basically the same things as #540 (and also went to discussing “what is Rust trying to do”, and a bit of scoping the arithmetic type, and then a swift vs rust discussion, considerations on the optimizability of bounds checks… eclectic thread)
#146 was proposing undefined results in release. That's more or less what I'm putting forward now again, except now I'm proposing having it be unsafe (to match mem::uninitialized()).

Overall, I think the possibility of having unsafe helper functions to have performance-optimized has not really been discussed yet, even though surrounding discussions (and especially questions about which behaviour should be the default one) have been numerous :) Then, again, I didn't read everything, so maybe I missed something.

However, these discussions do me feel better about the unchecked_* functions being unsafe.

That said, reading these discussions also makes me wonder which should be the behaviour for preconditions not matched in unchecked_*:

Undefined result
Undefined behaviour

Undefined behaviour is much more violent, and not required for LLVM's add nuw (which only requires undefined result). However, it will likely be required for divisions (to allow reception of SIGFPE as a valid outcome of unchecked division-by-0), or weird architectures where an overflowing addition traps.

Overall, given these functions would be defined as unsafe, I think it'd be better to just have undefined behaviour straightaway, so that all possible optimizations could be done. What do you think about it?

BTW, I've edited the topmost post according to feedback received over IRC, basically removing the parts about a Unchecked<u*> type and adding in a link to a benchmark in C to reduce bikeshed-likelihood.

Ekleog on 31 Jul 2018

I think Rust currently has no way to say “I know this cannot overflow, so just optimize and UB if it actually can”.

Was

let (result, overflowed) = a.overflowing_add(b);
if overflowed { 
    unsafe { unreachable_unchecked() }
} else {
    result
}

discussed or tried?

glaebhoerl on 31 Jul 2018

Not yet, what has been tried is this:

pub unsafe fn foo(a: u32, b: u32) -> u32 {
    a.checked_add(b).unwrap_or(std::mem::uninitialized())
}

(playground)

and this

pub unsafe fn foo(a: u32, b: u32) -> u32 {
    if let Some(x) = a.checked_add(b) {
        x
    } else {
        std::hint::unreachable_unchecked()
    }
}

(playground)

Both of which fail to generate add nuw in LLVM IR, and generate add instead.

I can now confirm the same result with

pub unsafe fn foo(a: u32, b: u32) -> u32 {
    let (result, overflowed) = a.overflowing_add(b);
    if overflowed {
        std::hint::unreachable_unchecked();
    } else {
        result
    }
}

(playground)

Ekleog on 31 Jul 2018

❤2 👍2

add nuw

Why not add nuw nsw ?

AFAIK we can't currently generate the appropriate LLVM-IR in any way, so whether these could actually make a performance difference or not in some situations is basically just speculation unless one modifies rustc to try it out, and those who want to reproduce modify and recompile rustc themselves to do so.

I think we could add these to core::intrinsics as a step 0 first (this needs to be done anyways to implement the integer methods). That way we could all try this out on nightly in godbolt and see how much of an impact do these have.

Once we have more information we can reconsider adding the methods to the integers, the Unchecked wrapper (I'd call it UnsafeArithmetic<T> though), etc.

Worst case these will just hang in core::intrinsics forever, but these might still be useful while debugging code-gen / hacking on rustc / hacking on LLVM, so I'd start there.

gnzlbg on 2 Aug 2018

Indeed, it's all speculative for Rust, but in C, this StackOverflow question appears to state that integer arithmetic (ie. something similar to the unchecked arithmetic proposed here) does perform 10% faster in at least one benchmark :)

Then, I agree with you that having intrinsics first would likely be first step. Let's wait for the follow-up of https://github.com/rust-lang/rust/pull/52205 for the time being, then, and the result of the Range optimization could also answer the question of whether that's a useful optimization :) (“could”, because if it's a good optimization then it'll have answered the question, but if it optimizes nothing, unchecked multiplications / divisions, esp. on ARM targets, appear like much more optimize-able operations to me)

Ekleog on 2 Aug 2018

the result of the Range optimization could also answer the question of whether that's a useful optimization

These intrinsics not being useful for Range does not imply that they are never useful. I've commented on that PR to see if it can be re-activated. I think these intrinsics are useful for experimentation, debugging performance issues, etc. even if they never make it to a stable API.

gnzlbg on 2 Aug 2018

Why not add nuw nsw ?

@gnzlbg Because there's no obvious type for which to have the methods do that. My PR does nowrap_add::<u32> as add nuw and nowrap_add::<i32> as add nsw. Do you have a situation in which you were wanting the combination of both overflow flags?

scottmcm on 3 Aug 2018

I've written a note about this topic:
https://users.rust-lang.org/t/high-performance-operations/19739/2

leonardo-m on 22 Aug 2018

The intrinsics for unchecked_add, unchecked_sub and unchecked_mul are now on nightly.

lcnr on 5 Jun 2019

🎉3

Also for unchecked_shl and unchecked_shr.

RustyYato on 5 Jun 2019

any plans to migrate those to stable ?

eloraiby on 9 Jul 2020

What about unchecked_neg?

A1-Triard on 10 Sep 2020

@A1-Triard There's no neg instruction in llvm, so unchecked_neg(x) is just unchecked_sub(0, x).

scottmcm on 11 Sep 2020

🚀1

@A1-Triard There's no neg instruction in llvm, so unchecked_neg(x) is just unchecked_sub(0, x).

Still, I think, it is not a reason to not having unchecked_neg

A1-Triard on 19 Sep 2020

👍1

@A1-Triard It's a reason not to have it _as an intrinsic_, which are the only things that exist right now.

Whether there should be an unchecked_neg in whatever gets stabilized is a different conversation, but since there's no stabilization-track APIs for this right now, that's now what's happened.

Last I heard there was potential concern with stabilizing things here, so anyone wanting to see further movement will need to do some experimentation in nightly to prove the value of these in real code.

scottmcm on 11 Nov 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

The compiler should be warned when some misleading code is being used

3442853561 · 3Comments

Support implicit conversion of temporary T to Option<T>

mqudsi · 3Comments

Rename .remove() to .take() in most collection types

mqudsi · 3Comments

Limiting `upstream crates may add new impl of trait in future versions`

p-avital · 3Comments

Related facilities for "non-ASCII identifiers"

3442853561 · 3Comments