Rust: Unreachable code optimization failure when matching on Rust enum

Created on 11 Oct 2020  路  7Comments  路  Source: rust-lang/rust

In this example:

#[derive(Copy, Clone, Eq, PartialEq)]
pub enum Variant {
    Zero, 
    One,
    Two,
}

#[inline]
fn unreachable() {
    println!("Impossible");
}

extern {
    fn exf1();
    fn exf2();
}

#[no_mangle]
pub static mut GLOBAL: Variant = Variant::Zero;

pub unsafe fn test() {
    let g = GLOBAL;
    if g != Variant::Zero {
        match g {
            Variant::One => exf1(),
            Variant::Two => exf2(),
            Variant::Zero => unreachable(),
        }
    }
}

the unreachable() branch is not removed. Adding any #[repr(n)] but not #[repr(C)] to Variant or making reachable branches call only one of external functions allows the optimization.

There's no such issue with a similar C example using clang so it's likely something on Rust side.

Example links:
Rust https://rust.godbolt.org/z/9oh6sP
C https://godbolt.org/z/Tnbfx6

A-codegen A-mir-opt C-enhancement I-slow T-compiler

All 7 comments

This is weird; LLVM has plenty enough information to be able to do this from what's emitted.

The IR it currently produces: https://rust.godbolt.org/z/GWacEc

define void @_ZN7example4test17h9c70bf34953e7145E() unnamed_addr #0 !dbg !6 {
start:
  %_2.i = alloca %"std::fmt::Arguments", align 8
  %0 = load i8, i8* getelementptr inbounds (<{ [1 x i8] }>, <{ [1 x i8] }>* @GLOBAL, i64 0, i32 0, i64 0), align 1, !dbg !10, !range !11
  %_10.i.i.not = icmp eq i8 %0, 0, !dbg !12
  br i1 %_10.i.i.not, label %bb8, label %bb3, !dbg !22

bb3:                                              ; preds = %start
  %_6 = zext i8 %0 to i64, !dbg !23
  switch i64 %_6, label %bb5 [
    i64 0, label %bb4
    i64 1, label %bb6
    i64 2, label %bb7
  ], !dbg !23

That clearly has sufficient information to know that bb4 is unreachable, but it's not taking advantage of it.

Curiosity: why is rustc insisting on zexting to i64 for the discriminant comparison? That's something rust is doing; it's there even in -O0 https://rust.godbolt.org/z/a1nxzx

It gets even weirder if you put Variant::Zero as the first pattern in the match, notice in particular the two consecutive je https://rust.godbolt.org/z/3K933z

This looks like the same problem as #73031. LLVM can't see through the zext in some cases.

rustc extends to i64 because the default discriminant type is isize, even though the memory representation is usually smaller.

Although this can be fixed in LLVM, it may be worthwhile to try making the discr type the same as the memory repr, i.e. do this https://github.com/rust-lang/rust/blob/8cc82ee340ed96099680ec1165cf5e192d658d0f/compiler/rustc_middle/src/ty/layout.rs#L74-L78 earlier, so this fn https://github.com/rust-lang/rust/blob/8cc82ee340ed96099680ec1165cf5e192d658d0f/compiler/rustc_middle/src/ty/mod.rs#L2290-L2292 defaults to the smallest type that will fit instead of isize. (I _think_ this would be backwards compatible, you can still as-cast to any integer type, I'm not sure if the underlying discriminant type is exposed anywhere.)

So is the fix here to codegen it using <T as DiscriminantKind>::Discriminant instead of i64?

<T as DiscriminantKind>::Discriminant is i64, which is the problem: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=3f5f5c4a34d07152653c173a7e3407e4

FYI, #74215 has some discussion about the trade-offs of changing DiscriminantKind::Discriminant to be the smallest possible integer type.

I've also found that a not insignificant amount of hashing in the compiler is being done on discriminant values, so there is a potential compile time improvement in doing this too, though I don't think that should be a primary motivation for the change.

Hmm, there's nothing that would force that type and the HIR->MIR desugaring to use the same internal construct value, though, right? Like we could change the MIR Rvalue::Discriminant to return the same width as it's actually stored, and the discriminant_value intrinsic to [sz]ext to whatever DiscriminantKind wants?

That way LLVM would just see a switch on the loaded value, not an extended one. (So this might still happen if the if g != Variant::Zero { check were replaced with mem::Discriminant one, but that would be a different problem.)

Was this page helpful?
0 / 5 - 0 ratings