Rust: unwrap branches are not optimized away.

Created on 17 Apr 2020  Â·  8Comments  Â·  Source: rust-lang/rust

When creating a structure which has a variant type, and that this type is wrapped as an Option<T>, the unwrap branch cannot be removed by LLVM, even after proving that the only path reachable is the Some(…) case and never the None case.

I tried this code:

https://github.com/mozilla-spidermonkey/jsparagus/blob/330722aaa4f7c6f39422b7daae7084ea8cbcbead/crates/parser/src/queue_stack.rs#L141-L153

Calling the previous function as:

assert!(self.top != 0);
let v = self.pop().unwrap();

This hint LLVM that the self.top == 0 branch does not need to be generated. However, the unwrap condition remains as the None value is folded within the variant and that LLVM does not know about it.

I think the Rust compiler should give range analysis information that the None value would never be part of the variant, and as such let LLVM remove the unwrapping condition from the generated code.

Meta

Tested with both rustc --version --verbose:

rustc 1.42.0 (b8cedc004 2020-03-09)
binary: rustc
commit-hash: b8cedc00407a4c56a3bda1ed605c6fc166655447
commit-date: 2020-03-09
host: x86_64-unknown-linux-gnu
release: 1.42.0
LLVM version: 9.0

and

rustc 1.44.0-nightly (7f3df5772 2020-04-16)
binary: rustc
commit-hash: 7f3df5772439eee1c512ed2eb540beef1124d236
commit-date: 2020-04-16
host: x86_64-unknown-linux-gnu
release: 1.44.0-nightly
LLVM version: 9.0
A-LLVM C-bug E-needs-mcve I-slow T-compiler

Most helpful comment

Clang gets this right with the -fstrict-enums option which attaches range metadata to the enum load from the the Vec.

e.g.

#include <stdlib.h>
enum Foo {
    A,
    B
};

struct Vec {
    size_t len;
    Foo *data;
};

static int pop(Vec &v) {
    if (v.len) {
        return v.data[v.len-1];
    }
    return (int)Foo::B + 1;
}

Foo foo(Vec &v) {
    if (v.len == 0) { exit(1); }
    int k = pop(v);
    if (k > 1) {
        exit(3);
    }
    return (Foo)k;
}

with -O2 compiles to something similar to the Rust code:

foo(Vec&):                            # @foo(Vec&)
        pushq   %rax
        movq    (%rdi), %rax
        testq   %rax, %rax
        je      .LBB0_3
        movq    8(%rdi), %rcx
        movl    -4(%rcx,%rax,4), %eax
        cmpl    $2, %eax
        jge     .LBB0_4
        popq    %rcx
        retq
.LBB0_3:
        movl    $1, %edi
        callq   exit
.LBB0_4:
        movl    $3, %edi
        callq   exit

but with -O2 -fstrict-enums compiles to the desired:

foo(Vec&):                            # @foo(Vec&)
        pushq   %rax
        movq    (%rdi), %rax
        testq   %rax, %rax
        je      .LBB0_2
        movq    8(%rdi), %rcx
        movl    -4(%rcx,%rax,4), %eax
        popq    %rcx
        retq
.LBB0_2:
        movl    $1, %edi
        callq   exit

It seems like it would be reasonable for Rust to always act like -fstrict-enums

All 8 comments

In the trivial case, https://rust.godbolt.org/z/3TaPU2, this doesn't seem to reproduce, so I'm going to go ahead and mark as E-needs-mcve. I suspect that this might be a case of insufficient inlining or something like that.

I can reproduce it with the following test case: ( https://rust.godbolt.org/z/t_9kCR )

use std::vec::Vec;

pub enum Foo {
    First(usize),
    Second(usize),
}

pub fn foo(mut v: Vec<Foo>) -> Foo {
    assert!(v.len() == 1);
    v.pop().unwrap()
}

In the generated code we see the branch for the assertion, which remove the None case from pop. However, the unwrap condition is still present, because the None case is mixed with the discriminant type of Foo, which lead the generated code to load the the discriminant, and compare it against the None value (= 2), instead of removing the branch.

        mov     r14, qword ptr [rdi]
        cmp     r14, 2
        je      .LBB6_3

I will also note that without the assertion, the case which generate the None branch is folded with the unwrap case and panic. However, the Some branch still has the extra unwrap test, despite the fact that it would never fail.

Clang gets this right with the -fstrict-enums option which attaches range metadata to the enum load from the the Vec.

e.g.

#include <stdlib.h>
enum Foo {
    A,
    B
};

struct Vec {
    size_t len;
    Foo *data;
};

static int pop(Vec &v) {
    if (v.len) {
        return v.data[v.len-1];
    }
    return (int)Foo::B + 1;
}

Foo foo(Vec &v) {
    if (v.len == 0) { exit(1); }
    int k = pop(v);
    if (k > 1) {
        exit(3);
    }
    return (Foo)k;
}

with -O2 compiles to something similar to the Rust code:

foo(Vec&):                            # @foo(Vec&)
        pushq   %rax
        movq    (%rdi), %rax
        testq   %rax, %rax
        je      .LBB0_3
        movq    8(%rdi), %rcx
        movl    -4(%rcx,%rax,4), %eax
        cmpl    $2, %eax
        jge     .LBB0_4
        popq    %rcx
        retq
.LBB0_3:
        movl    $1, %edi
        callq   exit
.LBB0_4:
        movl    $3, %edi
        callq   exit

but with -O2 -fstrict-enums compiles to the desired:

foo(Vec&):                            # @foo(Vec&)
        pushq   %rax
        movq    (%rdi), %rax
        testq   %rax, %rax
        je      .LBB0_2
        movq    8(%rdi), %rcx
        movl    -4(%rcx,%rax,4), %eax
        popq    %rcx
        retq
.LBB0_2:
        movl    $1, %edi
        callq   exit

It seems like it would be reasonable for Rust to always act like -fstrict-enums

rustc already adds range metadata to the load of the discriminant. It seems it gets lost at some point during optimizations.

Hmm so it does. It seems like LLVM is pretty fragile here.

#include <stdlib.h>
enum Foo {
    A,
    B,
    C,
};

Foo foo(Foo* data) {
    Foo k = data[1];
    if (k > Foo::C) {
        exit(3);
    }
    return k;
}

produces the following with -fstrict-enums

foo(Foo*):                            # @foo(Foo*)
        pushq   %rax
        movl    4(%rdi), %eax
        cmpl    $3, %eax
        je      .LBB0_2
        popq    %rcx
        retq
.LBB0_2:
        movl    $3, %edi
        callq   exit

Interestingly enough the je becomes jge without -fstrict-enums so LLVM has some idea what's going on.

So it turns out that https://bugs.llvm.org/show_bug.cgi?id=46279 was me making an incorrect assumption about enums in C++. I looked more closely at the rust issue and filed #73258 which probably needs to be fixed before making progress on this one.

Was this page helpful?
0 / 5 - 0 ratings