Rust: Tracking issue for `mem::unreachable`

Created on 9 Aug 2017  Â·  74Comments  Â·  Source: rust-lang/rust

B-unstable C-tracking-issue T-libs final-comment-period

Most helpful comment

This does not belong in mem, and it should not have the same name as the unreachable!() macro.

Why not just call it std::intrinsics::undefined_behaviour()?

All 74 comments

Bikeshed: I find it surprising that safe unreachable!() and unsafe unreachable() do very different things. Also, this seems to have nothing to do with memory, so mem is an odd location for it.

Another possible name/location for discussion: raw::assume_unreachable(). Would require changing the raw module's mandate. We could also leave it under intrinsics, but stabilize it (you can use std::intrinsics::transmute; today).

This does not belong in mem, and it should not have the same name as the unreachable!() macro.

Why not just call it std::intrinsics::undefined_behaviour()?

There's a suggestion calling it unchecked_unreachable.

I like @notriddle's suggestion. Does what it says on the tin.

std::raw::eat_my_laundry()

(Sorry, couldn't help myself)

std::raw::eat_my_laundry()

Except that, in most cases, it won't consume your clothes. Your computer probably doesn't even have the necessary hardware to do anything to your laundry. Undefined behaviour may eat your laundry, because it may do anything; the specification places no constraints on what the result of invoking this intrinsic is. A name like eat_my_laundry fails to express this. Of course, I'm probably taking the suggestion too seriously. 😆

Unlike eat_my_laundry, though, naming the function undefined_behavior is actually a suggestion; it has the advantage of being one less name that people have to remember. And it's not like invoking this intrinsic may invoke UB; invoking this intrinsic is (at least from a specification and documentation POV) exactly the same as invoking UB any other way. There's no conceptual difference, so I'm not sure why there should be a naming difference.

@notriddle Yep, I agree that undefined_behavior is the best suggestion so far. Not sure which module it should fit in though.

(Context for those who came to this community after Rust 1.0: Before 1.0 there was a disclaimer on Rust's homepage saying something like it wasn't production ready yet, and could in theory do anything, including eating your laundry. I don't remember the exact wording.)

I would say std itself. Since undefined_behavior is a basic concept, not less common than Box or other common concepts.

@earthengine Box is located at std::boxed::Box, not std::Box. std as a module contains no types or functions. (and there's no way this function will enter prelude.)

Then make it inside std::intrinsics.

Consider that another intrinsic, abort, is very similar in behaviour and would probably live in the same place as unreachable.

I propose that we simply stabilise the intrinsics module and keep the functions in there unstable, stabilising only unreachable. I cannot see a single better location for this function, unless we invent another module with identical meaning (think: builtins).

This would need some thinking on how to deal with the currently stable intrinsics in there:

  • copy;
  • copy_nonoverlapping;
  • drop_in_place;
  • transmute;
  • write_bytes;

Would std::abstract be a workable module name? Making it something like std::abstract::assume_unreachable or std::abstract::undefined_behavior. After all, the distinguishing feature of this function is that it has no definition in machine code.

(I thought about calling it abstract_machine, but that's too long.),

For the currently-stable things in intrinsics: Since they're all exported elsewhere with (nearly?) identical signatures, just deprecate them with a message suggesting the ptr or mem ones instead?

Although they are "stable" they cannot be used as such because importing
the intrinsics module trips the instability check – so a deprecation
doesn't seem necessary to me. At least some of them were made stable as an
implementation detail.

On Aug 27, 2017 1:15 AM, "scottmcm" notifications@github.com wrote:

For the currently-stable things in intrinsics: Since they're all exported
elsewhere with (nearly?) identical signatures, just deprecate them with a
message suggesting the ptr or mem ones instead?

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rust/issues/43751#issuecomment-325164830,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AApc0mwJl8q9HWNO03KHl1ymOItRg7FDks5scJj1gaJpZM4OxZfV
.

It doesn't actually trip the stability check, you can import them from intrinsics if you want even though they are re-exported elsewhere. So we could avoid creating another module.

Do likely/unlikely also belong in the same category, that category that doesn't have a home in a current submodule?

likely/unlikely: #26179

std::hint::{likely, unlikely, unreachable, assume, breakpoint, prefetch_read_data}? (Maybe reexport std::sync::atomic::compiler_fence here as well?)

(compiler_fence is not a "hint", it has semantic impact)

assume and unreachable are also not hints, they also have impact.

I agree. Especially since their impact for being used incorrectly is simply UB.

std::builtin or std::builtins would be a bit silly, right? We have a lot of things that are "built in". Yet it seems to be the best name I can think of.

  • std::builtin::unreachable()

Or that means likely/unlikely and unreachable/assume shouldn't be grouped into the same module after all (if you want a meaningful name for the module).

I'm actually kinda okay with unreachable and assume being in a "hints" module. They're optimizer hints. Misuse can introduce UB, but otherwise they don't affect semantics either. I could understand not wanting to mix these subtly different meanings in one module, but I don't think it's inherently misleading to call them "hints", especially considering the existing "optimizer hint" terminology.

After some discussion on IRC and re-reading the earlier discussion, I actually don't see any reason to not stabilize std::intrinsics (it's already effectively stable) and put/leave all these functions in there.

Is impossible() a more distinguishable name?

I would _really_ love a std::hints module. The only reason anyone would use std::intrinsics::undefined_behaviour (my personal favourite of the names suggested) over unreachable! is to make it more optimisable, and the suggestions like likely/unlikely and assume would also be really useful for HPC code. The difference between possibly-UB and just-a-hint is enforced by unsafe, assume and undefined_behaviour would be unsafe whereas likely and unlikely would be safe. You could also include wrappers around madvise in this module, although now we're off-topic.

I still think it would be valuable to identify exactly how this function differs from unreachable!(), and why you would ever want to use it. It'd be for power-users, sure, but still useful.

A sort of TL;DR is:

unreachable!() is just a panic!. This function actually affects the semantics of how your code is optimized.

Put another way, the macro is "if this code path gets hit, panic." This function is "I'm telling you that this never happens and so please assume it does not".

The unreachable!() macro

This code

match an_option {
  Some(x) => use(x),
  None => unreachable!(),
}

should get optimized into something like this:

if an_option.is_none() {
  panic!("unreachable");
} else {
  use(an_option.<Some>); // imaginary pseudo-syntax for accessing the Some contents of an option without matching on it
}

The intrinsics::undefined_behavior() function

This code

match an_option {
  Some(x) => use(x),
  None => unreachable!(),
}

should get optimized into something like this:

use(an_option.<Some>); // imaginary pseudo-syntax for accessing the Some contents of an option without matching on it

Notice how, unlike the unreachable macro, it doesn't bother checking if the option is None or not. It just assumes it's Some.

So, instead of:

if foo.is_none() {
    return Err("was none");
};
let foo = foo.unwrap();

(which I sometimes use to avoid an extra indent), you could do:

if foo.is_none() {
    return Err("was none");
};
let foo = if let Some(foo) = foo { foo } else { intrinsics::undefined_behavior() };

And the benefit here would presumably be higher performance (in theory) + smaller code footprint? Have any specific use-cases been proposed for this function?

And the benefit here would presumably be higher performance (in theory) + smaller code footprint?

Yup!

Have any specific use-cases been proposed for this function?

The only time I've personally used it is when writing interrupt handlers for an OS.

I firmly second the suggestion of renaming this function to impossible, if only because we know from experience that having two items with the same name but different purposes (mem::drop and Drop::drop) is a teachability nightmare.

Don't care whether this ends up in a new instrinsics module or not, though if the inline macro ever gets properly implemented and namespaced I suppose it could comfortably live in a hints module...

I'm strongly in favor of undefined_behavior.

I like undefined_behavior.

The intrinsics module now contains a lot of stuff like atomic_cxchgweak_acqrel_failrelaxed, which makes its docs hard to navigate. I think it makes sense for this module to remain more of an implementation detail than intended for public consumption. I don’t have a strong preference for another location, but std::hint sounds alright.

So in order to get things moving, let’s propose:

  • Rename this function to undefined_behavior
  • Move to a new (stable) core::hint module that is re-exported as std::hint
  • Stabilize

@rfcbot fcp merge

(Feel free to keep discussing if you feel strongly about naming.)

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams:

  • [x] @Kimundi
  • [x] @SimonSapin
  • [x] @alexcrichton
  • [ ] @aturon
  • [x] @dtolnay
  • [x] @sfackler
  • [ ] @withoutboats

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

I am on board with stabilizing and I wouldn't want to debate this too much, but just registering a preference for unreachable_unchecked rather than undefined_behavior.

If my code optimizes poorly and I am searching for a way to eliminate a codepath that I know is unreachable, I am going to search for how to identify a codepath as unreachable, not "how do I trigger undefined behavior?".

I believe the [...]_unchecked naming is already strongly associated with undefined behavior in the case that the unchecked assumption is violated.

I’m fine with unreachable_unchecked too.

I like unreachable_unchecked - it definitely seems like it reads more clearly with what it's intended to do.

Awhile back I actually implemented a macro for a project that deferenced a null pointer to get undefined behavior to ensure certain optimizations happened, and I called it unreachable_unchecked!() too.

FWIW, the code was doing deserialization, and I had a trusted input version that assumed the input was well-formed. So I used unreachable_unchecked!() as the default branch of my match expression to allow the optimizer to assume incorrect input was impossible (similar to how enum tags work really).

:bell: This is now entering its final comment period, as per the review above. :bell:

The final comment period is now complete.

Are we stabilizing this as core::hint::undefined_behavior() or core::hint::unreachable_unchecked()?

It will be core::hint::unreachable_unchecked.

It would be nice if there was a way for this to panic in debug builds but use unreachable_unchecked in release builds.

There's a crate that does that. I don't think it's a good policy in general
for --release to remove safety checks.

On Tue, Apr 17, 2018 at 11:48 PM, Who? Me?! notifications@github.com
wrote:

It would be nice if there was a way for this to panic in debug builds but
use unreachable_unchecked in release builds.

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rust/issues/43751#issuecomment-382248365,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAC3nykD25Psuxxt3-7240CYqZZUewy2ks5tprePgaJpZM4OxZfV
.

It would be nice if there was a way for this to panic in debug builds but use unreachable_unchecked in release builds.

This is similar to the idea of panicking in debug builds when the index passed to get_unchecked is out of bounds. The reason why that idea didn't work out is because get_unchecked is often used to legitimately access stuff outside the bounds. See https://github.com/rust-lang/rust/issues/36976.

But in case of unreachable_unchecked it'd be nice to get a panic rather than UB in debug mode. This is just for easier debugging, not for safety. There are no drawbacks to going that route, as far as I can tell.

There's a crate that does that. I don't think it's a good policy in general for --release to remove safety checks.

I think it is the opposite, right?

If one uses unreachable!(), I think that should _always_ be checked (debug or release), but if one uses unreachable_unchecked(), then we are _adding_ (not removing) a safety check. And as @stjepang mentioned, it is really more for ease of debugging than safety.

I think that something that has unchecked in the name should not be checked, even in debug mode. If "checked in debug mode only" functionality is to be added, it should be a separate API (and so not really relevant to this tracking issue).

@SimonSapin I feel like that would make me _never_ use unreachable_unchecked.

You are welcome to use or not use whatever APIs you want.

I mean, it's supposed to have a deterrent effect, as it should only be used
in really provably unreachable code, so... yeah.

On Wed, Apr 18, 2018 at 12:35 PM, Steven Fackler notifications@github.com
wrote:

You are welcome to use or not use whatever APIs you want.

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rust/issues/43751#issuecomment-382449652,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAC3n0dKOv8TqshD52G17jRIeP_OEzz4ks5tp2tQgaJpZM4OxZfV
.

@mark-i-m Double negation is hard and I can’t tell what your "that" refers to.

To clarify, what I mean is this: I generally prefer to leave asserts on in production, but there are a few asserts that I like to turn off in production. For those few asserts, I still want them _on_ when I'm debugging. I consider unreachable_unchecked to be like that.

While I'm debugging, if I don't want to risk UB. When I deploy, I'm confident enough that I'm willing to not have the assert. If I can't make that distinction cleanly, I'm unlike to ever use unreachable_unchecked. (I guess I could always manually do some sort of cfg, but that's ugly)...

It sounds like you want to use https://crates.io/crates/debug_unreachable, and maybe there’s a case to be made to include it in the standard library but it shouldn’t be under the name unreachable_unchecked and probably not in this issue.

… however just copying that code into libstd or libcore probably won’t work since the standard library is always compiled in release mode.

@SimonSapin That's fair. Thanks :)

I think @mark-i-m has raised a great point and we need to be more clear about what use cases we envision for unreachable_unchecked.

I feel like that would make me never use unreachable_unchecked.

I sympathize with this and I struggle to think of reasonable use cases where I would prefer the current unconditionally unchecked unreachable over something like debug_unreachable. @SimonSapin or @sfackler do you have such a use case in mind?

How would that be implemented?

Possibly the same way CStr::from_ptr is implemented -- we just reserve the right to implement it a particular way when that becomes possible.

We can do literally anything we want when undefined behavior happens so that's not a thing we need to explicitly guarantee.

45920 causes a trap to be generated from unreachable anyway, so you're already going to abort if an unreachable_unchecked is reached.

Then it's just a difference between "unconditionally unchecked is the correct behavior for a function with this name, and if we want a debug checked version later we call it something else or recommend a thirdparty crate" vs "we wish we could check in debug mode but the standard library is not set up to be able to do that -- maybe later." You and Simon seem to have been supporting the former but maybe mark-i-m and I would prefer the latter.

Is the trap as easily debuggable as a panic with stacktrace?

Abort doesn't print a stack trace but I believe you can catch it in a
debugger.

On Wed, Apr 18, 2018 at 4:14 PM, David Tolnay notifications@github.com
wrote:

Then it's just a difference between "unconditionally unchecked is the
correct behavior for a function with this name, and if we want a debug
checked version later we call it something else or recommend a thirdparty
crate" vs "we wish we could check in debug mode but the standard library is
not set up to be able to do that -- maybe later." You and Simon seem to
have been supporting the former but maybe mark-i-m and I would prefer the
latter.

Is the trap as easily debuggable as a panic with stacktrace?

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rust/issues/43751#issuecomment-382514392,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAC3n_tCywRhtTIgSv7qKr2b20Z3_YIAks5tp56RgaJpZM4OxZfV
.

Note that #45920 only kicks in at codegen time, optimizations based on the UB can still cause arbitrarily bad or confusing consequences. You might not even ever hit the unreachable -- if it is conditional, for example, the branch with the "unreachable" in it might be eliminated entirely.

One reason for not panicking specifically is that it unwinds the stack, and the surrounding unsafe code might not be equipped to handle that.

What if it immediately aborted and printed a stack trace in debug mode?

Is the trap as easily debuggable as a panic with stacktrace?

It's not quite as nice. A trap will cause the program to be killed on a standard Linux setup. If you are running in GDB, then GDB will pause and you can use bt to get a stacktrace. However, if it just happens when you are running (especially if it is non-deterministic), then you will just get killed by the OS without warning.

What if it immediately aborted and printed a stack trace in debug mode?

I think I would be ok with pretty much anything, as long as the program halts and some debugging info is printed.

Well if it's defined to trap it's not undefined, is it? And those "clever
UB optimizations" shouldn't happen in debug mode.

On Wed, Apr 18, 2018 at 4:25 PM, Robin Kruppe notifications@github.com
wrote:

Note that #45920 https://github.com/rust-lang/rust/pull/45920 only
kicks in at codegen time, optimizations based on the UB can still cause
arbitrarily bad or confusing consequences. You might not even ever hit the
unreachable -- if it is conditional, for example, the branch with the
"unreachable" in it might be eliminated entirely.

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rust/issues/43751#issuecomment-382517536,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAC3n89B_KPoilfOw6CqLzu442ctFS97ks5tp6EjgaJpZM4OxZfV
.

I think that I was wrong, and that debug_assertions would do the right thing in libcore since it’s not a function but a macro, so (if I understand things correctly) cfg!(…) in its expansion would be evaluated in the context of the call site.

However I think that for the past three years the crate also hasn’t been doing what everyone thinks it’s doing, since it uses cfg!(ndebug) which isn’t set by Cargo anymore: https://github.com/rust-lang/cargo/pull/1444

I sympathize with this and I struggle to think of reasonable use cases where I would prefer the current unconditionally unchecked unreachable over something like debug_unreachable.

<[T]>::get_unchecked and every other *_unchecked API are unconditionally unchecked. Why do we have those?

@durka

Well if it's defined to trap it's not undefined, is it? And those "clever UB optimizations" shouldn't happen in debug mode.

It's undefined, period, codegen can just be configured is slightly mitigating the impact of any unreachables that make it to codegen. And re: "shouldn't happen in debug mode" -- pretty much all existing runtime checks can be configured independently of optimization level (-Coverflow-checks, debug_assertions) and this should too. Moreover, it appears @sfackler was talking more generally than "just" about debug mode. I just want to be clear about what #45920 does and does not achieve -- it's a mitigation, it doesn't tame any UB nor its consequences.

cfg!(…) in its expansion would be evaluated in the context of the call site.

Yes but the call site is in libcore, which has already been compiled, so it won't do what you expect. That's why there are ~ridiculous~ ingenious hacks to make overflow-checks-in-debug-mode work right.

Well if it's defined to trap it's not undefined, is it? And those "clever UB optimizations" shouldn't happen in debug mode.

Undefined doesn't necessarily have to mean unpredictable. Technically, *(int*)0 is UB, but in practice, most OSes cause it to segfault predictably. UB simply means we don't make any promises. We can still make a best effort, though. On the other hand, it also means we don't have any promises about what happens in debug mode either, just best effort. This makes it seem acceptable to panic in debug mode but not in release mode (assuming we can get it to work).

Technically, (int)0 is UB, but in practice, most OSes cause it to segfault predictably.

This is not true anymore if you factor compilers into the equation. Actual real-world programs don't segfault reliably when the programmer might think that it'd reliably execute the *(int *)0.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

SharplEr picture SharplEr  Â·  3Comments

tikue picture tikue  Â·  3Comments

Robbepop picture Robbepop  Â·  3Comments

zhendongsu picture zhendongsu  Â·  3Comments

dwrensha picture dwrensha  Â·  3Comments