Rust: Functions with uninhabited return values codegen trap instead of unreachable

Created on 8 Apr 2019  路  19Comments  路  Source: rust-lang/rust

With https://github.com/rust-lang/rust/pull/59639, I would expect the following two functions to generate the same LLVM IR:

#[derive(Clone, Copy)]
pub enum EmptyEnum {}

pub fn empty(x: &EmptyEnum) -> EmptyEnum {
    *x
}

pub fn unreach(x: &EmptyEnum) -> EmptyEnum {
    unsafe { std::hint::unreachable_unchecked() }
}

However, they do not:

; playground::empty
; Function Attrs: noreturn nounwind nonlazybind uwtable
define void @_ZN10playground5empty17he3f416ff39576b93E(%EmptyEnum* noalias nocapture nonnull readonly align 1 %x) unnamed_addr #0 {
start:
  tail call void @llvm.trap()
  unreachable
}

; playground::unreach
; Function Attrs: norecurse noreturn nounwind nonlazybind readnone uwtable
define void @_ZN10playground7unreach17h416ea08edc51ffc9E(%EmptyEnum* noalias nocapture nonnull readonly align 1 %x) unnamed_addr #1 {
start:
  unreachable
}

Namely, empty triggers a well-defined trap while unreach causes UB.

That this makes a difference can be seen when adding:

pub fn test_unreach(x: bool) -> i32 {
    if x {
        42
    } else {
        unsafe { unreach(&*(8 as *const EmptyEnum)) };
        13
    }
}

pub fn test_empty(x: bool) -> i32 {
    if x {
        42
    } else {
        unsafe { empty(&*(8 as *const EmptyEnum)) };
        13
    }
}

The first becomes

; playground::test_unreach
; Function Attrs: norecurse nounwind nonlazybind readnone uwtable
define i32 @_ZN10playground12test_unreach17he4cbbc8d69597b0eE(i1 zeroext %x) unnamed_addr #2 {
start:
  ret i32 42
}

but the second becomes

; playground::test_empty
; Function Attrs: nounwind nonlazybind uwtable
define i32 @_ZN10playground10test_empty17hcb53970308f4f532E(i1 zeroext %x) unnamed_addr #3 {
start:
  br i1 %x, label %bb1, label %bb2

bb1:                                              ; preds = %start
  ret i32 42

bb2:                                              ; preds = %start
  tail call void @llvm.trap() #5
  unreachable
}

because the second branch does not actually cause UB.

Cc @eddyb @cuviper

A-codegen T-compiler

Most helpful comment

@cuviper

That's somewhat circular. It's only never executed if it's pruned as dead code

No, it is never executed period -- because only UB programs can ever get there, and we do not consider UB programs. There is no circularity.

In debug mode, there is a real path where this could be executed.

That path is just as real as the path to an if where a bool has value 2. And yet we codegen conditionals and other match assuming such paths do not exist. We do not consider UB paths.

I'd prefer it if we had an example that wasn't obviously doing something wrong.

I don't understand?

I mean that it's not supposed to be possible to create an uninhabited value, so we want to assume it never happens. But if you called test_empty(false) in debug mode, that code would be reached, so the unreachable is incorrect in practice. Same for test_unreach(false) though, so maybe this isn't a useful distinction.

This is exactly what UB is for: the compiler may assume it does not happen, and the programmer writing unsafe code must make sure that that is truly the case.

@comex

To the extent deciding how strictly to exploit UB is a tradeoff between performance and developer ergonomics, it's helpful to know that LLVM will trap rather than literally emitting no instructions and falling through into whatever chunk of code comes next, something that both GCC and Clang do by default on Linux :)

LLVM might trap. Or it might do anything else. That's what I mean by "unreliable lint". See test_unreach above for a case where it does not trap.
I'm not saying it's not useful, I am all for keeping it; I am just saying that this in no way means that unreachable is any less dangerous or more predictable in terms of what happens if such code is ever reached.

All 19 comments

Note also that we use TrapUnreachable, so unreachable will be selected to a trap instruction during codegen, if it persists until then. Given that, I don't think an explicit @llvm.trap() makes a lot of sense here.

@nikic we use trap in other places to ensure that no UB happens. The example above shows that even with TrapUnreachable, there is a huge difference in terms of how optimizations treat trap vs unreachable.

TrapUnreachable from what you describe has the effect of an unreliable lint, but cannot be relied on to actually detect cases where programs run into unreachable. Hence I do not think having or not having that setting should affect our own codegen choices at all.

We can change the call void @llvm.trap() ; unreachable to just unreachable, if we really want this to behave just like hint::unreachable_unchecked().

I'm not certain that we should though, since your empty example is purely safe code, and your uncheck requires unsafe for the hint. In other words, if we change this, you've invented a safe way to write an otherwise unsafe construct. (Although it does require unsafe to call empty with a faked &Empty).

One note about going to just unreachable -- here's your debug IR:

; issue59793::empty
; Function Attrs: noreturn nonlazybind uwtable
define void @_ZN10issue597935empty17h4ff3ba3a79e34872E(%EmptyEnum* noalias nonnull readonly align 1) unnamed_addr #1 !dbg !13 {
start:
  %x = alloca %EmptyEnum*, align 8
  store %EmptyEnum* %0, %EmptyEnum** %x, align 8
  call void @llvm.dbg.declare(metadata %EmptyEnum** %x, metadata !22, metadata !DIExpression()), !dbg !23
  call void @llvm.trap(), !dbg !24
  unreachable, !dbg !24
}

; issue59793::unreach
; Function Attrs: noreturn nonlazybind uwtable
define void @_ZN10issue597937unreach17h2cd2e01f0a9a6888E(%EmptyEnum* noalias nonnull readonly align 1) unnamed_addr #1 !dbg !25 {
start:
  %x = alloca %EmptyEnum*, align 8
  store %EmptyEnum* %0, %EmptyEnum** %x, align 8
  call void @llvm.dbg.declare(metadata %EmptyEnum** %x, metadata !26, metadata !DIExpression()), !dbg !27
; call core::hint::unreachable_unchecked
  call void @_ZN4core4hint21unreachable_unchecked17h592d1c4b48415e36E(), !dbg !28
  unreachable, !dbg !28
}

If I comment out the trap, opt -lint complains:

Unusual: unreachable immediately preceded by instruction without side effects
  unreachable, !dbg !16

This doesn't happen without debuginfo, but think of the rustc -g -O case like distros use.

&*(8 as *const EmptyEnum)

It seems to me that this is probably a good place to signal UB / unreachable, no? If we could mark such pointer reads as illegal, having manifested an uninhabited value, then I think everything that follows could be pruned as you want.

I think it would be more interesting to find bad codegen which didn't do anything dubious like this.

For the examples I tested in https://github.com/rust-lang/rust/pull/59639#issuecomment-480928320, the code which might deal with uninhabited values is truly dead, as they come from using an infallible Result<T, !>, but never creating an Err at all.

I'm not certain that we should though, since your empty example is purely safe code, and your uncheck requires unsafe for the hint. In other words, if we change this, you've invented a safe way to write an otherwise unsafe construct.

Sure, that sounds about right? Just like reference types provide a safe way to write an otherwise unsafe construct (pointer dereference), we can do similar things based on uninhabited return types.

We do agree that it would be correct to put an unreachable here, right? Whether we want that or not is a trade-off between better optimizations in safe code vs. less things going wrong when unsafe code authors break the rules.

If I comment out the trap, opt -lint complains:

That's some overeager linting right there. :/ Seems like LLVM wants an analysis that "goes backward" and erases unreachable code. I think LLVM is just being silly here though.

It seems to me that this is probably a good place to signal UB / unreachable, no? If we could mark such pointer reads as illegal, having manifested an uninhabited value, then I think everything that follows could be pruned as you want.

Well, this is where I'd advise caution for now. ;) We never do a pointer read here, we just create a reference to an uninhabited type. Whether such a reference (when never used, or only used to create a place but never to create an rvalue) is insta-UB is an open question, part of which is discussed at https://github.com/rust-lang/unsafe-code-guidelines/issues/77.

Cc @rust-lang/wg-unsafe-code-guidelines

I'm not certain that we should though, since your empty example is purely safe code, and your uncheck requires unsafe for the hint. In other words, if we change this, you've invented a safe way to write an otherwise unsafe construct.

Sure, that sounds about right? Just like reference types provide a safe way to write an otherwise unsafe construct (pointer dereference), we can do similar things based on uninhabited return types.

But unlike pointer dereference, this is an unsafe construct that's always UB.

We do agree that it would be _correct_ to put an unreachable here, right? Whether we want that or not is a trade-off between better optimizations in safe code vs. less things going wrong when unsafe code authors break the rules.

If I may continue hedging, I guess I do agree that it's _semantically_ correct. We know at a high level that these values can't exist, so it's only "reachable" by misbehaving unsafe code. So the question is whether we should do anything if this happens.

Whether such a reference (when never used, or only used to create a place but never to create an rvalue) is insta-UB is an open question

I see the point. If we were talking about something like &bool, I would not be advocating trapping invalid values every time we read it. I feel like we could make stronger assertions about uninhabited types, since we know there's never a valid allocation, never mind a valid initialization, but I haven't thought deeply about the implications.

But I'm aware we do actually create allocations to allow partial initialization, per #49298, so... :man_shrugging:

But unlike pointer dereference, this is an unsafe construct that's always UB.

This is mirrored by the fact that the safe code we are talking about is never executed. I don't see an issue.

We do codegen for safe code all the time under the assumption that all surrounding unsafe code behaves appropriately. We add noalias annotations to references, aligned annotations to pointers, range annotations to bool and enums, we generate jump tables for match only covering legal enum discriminants, and so on. This is just another instance. The fact that the set of cases where safe code can legally run through this code is empty does not make this special in any interesting way.

I guess I do agree that it's semantically correct.

"semantically" as opposed to... what?

Returning from empty requires manifesting a (fully initialized) value of uninhabited type. That is about as clearly UB as UB gets, so we can tell LLVM likewise. Of course we have to make sure this does not subvert our safety guarantees, but in this case it does not.

This is mirrored by the fact that the safe code we are talking about is never executed.

That's somewhat circular. It's only never executed if it's pruned as dead code, which we want the optimizer to do on the basis of UB. In debug mode, there is a real path where this could be executed.

I'd prefer it if we had an example that wasn't obviously doing something wrong. I'll see if I can figure out how to match any of the recent increase in compiler ud2 to this change...

I guess I do agree that it's semantically correct.

"semantically" as opposed to... what?

I mean that it's not supposed to be possible to create an uninhabited value, so we want to assume it never happens. But if you called test_empty(false) in debug mode, that code would be reached, so the unreachable is incorrect in practice. Same for test_unreach(false) though, so maybe this isn't a useful distinction.

If this were real code where we could wave it off and say that we know these are never called with false, then hopefully we would also be inlining enough to see that fact, at least in cases where the code is tight enough that branches leading to this trap are noticeable.

TrapUnreachable from what you describe has the effect of an unreliable lint, but cannot be relied on to actually detect cases where programs run into unreachable. Hence I do not think having or not having that setting should affect our own codegen choices at all.

It doesn't affect what is correct, but it affects ergonomics. To the extent deciding how strictly to exploit UB is a tradeoff between performance and developer ergonomics, it's helpful to know that LLVM will trap rather than literally emitting no instructions and falling through into whatever chunk of code comes next, something that both GCC and Clang do by default on Linux :)

@cuviper

That's somewhat circular. It's only never executed if it's pruned as dead code

No, it is never executed period -- because only UB programs can ever get there, and we do not consider UB programs. There is no circularity.

In debug mode, there is a real path where this could be executed.

That path is just as real as the path to an if where a bool has value 2. And yet we codegen conditionals and other match assuming such paths do not exist. We do not consider UB paths.

I'd prefer it if we had an example that wasn't obviously doing something wrong.

I don't understand?

I mean that it's not supposed to be possible to create an uninhabited value, so we want to assume it never happens. But if you called test_empty(false) in debug mode, that code would be reached, so the unreachable is incorrect in practice. Same for test_unreach(false) though, so maybe this isn't a useful distinction.

This is exactly what UB is for: the compiler may assume it does not happen, and the programmer writing unsafe code must make sure that that is truly the case.

@comex

To the extent deciding how strictly to exploit UB is a tradeoff between performance and developer ergonomics, it's helpful to know that LLVM will trap rather than literally emitting no instructions and falling through into whatever chunk of code comes next, something that both GCC and Clang do by default on Linux :)

LLVM might trap. Or it might do anything else. That's what I mean by "unreliable lint". See test_unreach above for a case where it does not trap.
I'm not saying it's not useful, I am all for keeping it; I am just saying that this in no way means that unreachable is any less dangerous or more predictable in terms of what happens if such code is ever reached.

I assumed https://github.com/rust-lang/rust/pull/59639 was about generic functions when codegen ends up seeing a return statement but the choice of type parameters made it have an uninhabited return type.
Otherwise there wouldn't even be a return statement.

I'll see if I can figure out how to match any of the recent increase in compiler ud2 to this change.

Might be related: the asm for the playground link in the OP reveals pointless ud2's:

playground::empty: # @playground::empty
# %bb.0:
    ud2
    ud2 // unnecessary

playground::unreach: # @playground::unreach
# %bb.0:
    ud2

@RalfJung

we do not consider UB programs [etc]

OK, fine, maybe I'm trying to be too nice to UB.

I'd prefer it if we had an example that wasn't obviously doing something wrong.

I don't understand?

Your example is directly creating UB in hopes of seeing it optimized away. That's somewhat useful as a minimal test, but it's not a realistic situation in itself. If someone actually had that code and complained about codegen, I'd say they should just prune that unreachable branch themselves.

It would be more compelling to see realistic code that _leads_ to a situation like that, probably using generic types as @eddyb suggests. Something where the code is entirely reasonable for some types, but with uninhabited types would have dead code that we want pruned. If a trap survives here, that's something to care about.

It would be more compelling to see realistic code that leads to a situation like that, probably using generic types as @eddyb suggests.

That's fair. It's just much harder to come up with that.^^

Might be related: the asm for the playground link in the OP reveals pointless ud2's:

That seems to be it. I just tried comparing compiler builds of the exact same source, only with and without the trap, and all 3 new ud2 in std are accounted for here:

@@ -82417,9 +82417,10 @@ Disassembly of section .text:

 000000000008ea80 <<libc::unix::notbsd::timezone as core::clone::Clone>::clone>:
    8ea80:  0f 0b                   ud2    
-   8ea82:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
-   8ea89:  00 00 00 
-   8ea8c:  0f 1f 40 00             nopl   0x0(%rax)
+   8ea82:  0f 0b                   ud2    
+   8ea84:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
+   8ea8b:  00 00 00 
+   8ea8e:  66 90                   xchg   %ax,%ax

 000000000008ea90 <libc::unix::notbsd::CMSG_ALIGN>:
    8ea90:  48 8d 47 07             lea    0x7(%rdi),%rax
@@ -83112,9 +83113,10 @@ Disassembly of section .text:

 000000000008f280 <<libc::unix::notbsd::linux::fpos64_t as core::clone::Clone>::clone>:
    8f280:  0f 0b                   ud2    
-   8f282:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
-   8f289:  00 00 00 
-   8f28c:  0f 1f 40 00             nopl   0x0(%rax)
+   8f282:  0f 0b                   ud2    
+   8f284:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
+   8f28b:  00 00 00 
+   8f28e:  66 90                   xchg   %ax,%ax

 000000000008f290 <<libc::unix::notbsd::linux::rlimit64 as core::clone::Clone>::clone>:
    8f290:  48 8b 07                mov    (%rdi),%rax
@@ -83517,9 +83519,10 @@ Disassembly of section .text:

 000000000008f6f0 <<libc::unix::DIR as core::clone::Clone>::clone>:
    8f6f0:  0f 0b                   ud2    
-   8f6f2:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
-   8f6f9:  00 00 00 
-   8f6fc:  0f 1f 40 00             nopl   0x0(%rax)
+   8f6f2:  0f 0b                   ud2    
+   8f6f4:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
+   8f6fb:  00 00 00 
+   8f6fe:  66 90                   xchg   %ax,%ax

 000000000008f700 <<libc::unix::rusage as core::clone::Clone>::clone>:
    8f700:  53                      push   %rbx

The other changed instructions are just different sized NOPs for function alignment.

All 3 of those types are empty enums. It's not clear to me why they should implement Clone and Copy at all, but I found this was added in libc#1217 commit f3684584c99b.

The same 3 appear in a couple other libraries, but otherwise I see no effect at all on the entire compiler, having the trap or just a bare unreachable. This could also be construed as an argument to go ahead and remove the trap, but the lint in https://github.com/rust-lang/rust/issues/59793#issuecomment-480937376 is annoying.

Source:

void Lint::visitUnreachableInst(UnreachableInst &I) {
  // This isn't undefined behavior, it's merely suspicious.
  Assert(&I == &I.getParent()->front() ||
             std::prev(I.getIterator())->mayHaveSideEffects(),
         "Unusual: unreachable immediately preceded by instruction without "
         "side effects",
         &I);
}

One lint avoidance is to br to a new basic block that's just the unreachable. That seems silly, but it's also the reason why hint::unreachable_unchecked() doesn't trigger anything.

LLVM _might_ trap. Or it might do anything else. That's what I mean by "unreliable lint". See test_unreach above for a case where it does not trap.
I'm not saying it's not useful, I am all for keeping it; I am just saying that this in no way means that unreachable is any less dangerous or more predictable in terms of what happens if such code is ever reached.

(I know. My wording could be improved, I was trying to say it would trap in cases where it would otherwise do the specific other thing I mentioned.)

This recently came up again in https://github.com/rust-lang/rust/pull/67054.

Was this page helpful?
0 / 5 - 0 ratings