> rustc --version -v
rustc 1.28.0-nightly (e3bf634e0 2018-06-28)
binary: rustc
commit-hash: e3bf634e060bc2f8665878288bcea02008ca346e
commit-date: 2018-06-28
host: i686-pc-windows-msvc
release: 1.28.0-nightly
LLVM version: 6.0
The following program panics when compiled with optimizations unless -C codegen-units=1 is used.
struct S([u32; 2]);

fn main() {
    let x = vec![S([1, 2])];
    let mut it = x.iter();
    if let Some(_) = it.next_back() {
        assert!(it.next_back().is_none());
    }
}
I bisected the regression to somewhere between nightlies rustc 1.24.0-nightly (0077d128d 2017-12-14) (good) and rustc 1.24.0-nightly (77efd6800 2017-12-15) (bad).
I saw two differences in the LLVM IR between the good and bad versions:
(Disclaimer: I can't rule out that I missed something else and this is just a red herring.)
https://gist.github.com/neivv/c7e536269885d3b3fac8ae1a93086920
The bad one has struct definitions as
%S = type { [0 x i32], [2 x i32], [0 x i32] }
%"alloc::vec::Vec<S>" = type { [0 x i32], { i32*, i32 }, [0 x i32], i32, [0 x i32] }
and the good one has
%S = type { [0 x i8], [2 x i32], [0 x i8] }
%"alloc::vec::Vec<S>" = type { [0 x i8], { i32*, i32 }, [0 x i8], i32, [0 x i8] }
The bad IR also has an additional load instruction in the assertion's panic branch:
https://gist.github.com/neivv/c7e536269885d3b3fac8ae1a93086920#file-bad-ll-L496
cc @eddyb
So, I think this is an LLVM issue.
I tried to examine how the LLVM IR is optimized, and one transformation is quite suspicious:
https://gist.github.com/neivv/daf93ad90c0d475c3abb119b1d01c584#file-neat-ll-L136
(The gist has main before and after the suspicious pass.)
It's comparing the vec's length to 0xe000_0001; comparing to 0x1 would be correct, but somehow the three highest bits of the constant are set.
Interestingly, multiplying 0xe000_0001 by sizeof(S), which is 8, wraps around in 32-bit arithmetic to the same value as 0x1 * sizeof(S).
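To make that wrap-around concrete, here is a small Rust sketch (mine, not from the issue) showing that the bogus comparison constant and the correct one become indistinguishable once each is multiplied by the element size in 32-bit arithmetic:

```rust
fn main() {
    let bogus: u32 = 0xe000_0001; // constant produced by the bad optimization
    let correct: u32 = 0x1;       // constant the transform should have produced
    let size_of_s: u32 = 8;       // size_of::<S>() for S([u32; 2])

    // 0xe000_0001 * 8 = 0x7_0000_0008, which truncates to 0x8 in 32 bits,
    // exactly the same byte count as 0x1 * 8.
    assert_eq!(
        bogus.wrapping_mul(size_of_s),
        correct.wrapping_mul(size_of_s)
    );
    println!("{:#x}", bogus.wrapping_mul(size_of_s)); // prints 0x8
}
```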
I don't know if there's anything that would allow LLVM to do that transformation intentionally, but I don't really have any experience with LLVM either.
So I tried to reduce this to something that would reproduce, but after modifying that IR into something opt accepts and running it through opt -instcombine, it just does the expected transform, comparing the slice length to 0x1. And since this bug doesn't occur with -C codegen-units=1, rustc --emit llvm-ir doesn't seem to produce anything that reproduces the bug either.
I've not examined the IR in detail, but that does sound suspiciously like something is deeply broken in LLVM. It might be a miscompilation in our build, a misuse of the APIs when running passes in-process, or a wild write somewhere messing up data structures.
But since that sort of bug is very difficult to diagnose, it would be good to make sure whether we really can't reproduce it outside of rustc. For starters, you could add -print-module-scope to the LLVM arguments alongside -print-after-all. That will make it print the entire module even after function passes such as SimplifyCFG and InstCombine, and should give you something you can feed into vanilla LLVM 6.0's opt unmodified.
Okay, I've got an LLVM-only reproduction =)
https://gist.github.com/neivv/ae449690d80e0aad853251caa5a3863b#file-thing-ll
Running this through opt -S -instcombine causes the odd transformation to length == 0xe000_0001 (-536870911), at least with the opt that is part of Rust's current LLVM.
I haven't yet tried to reduce from there, or checked whether anything in the input module is actually suspicious.
Figured it out.
I haven't had a chance to test this hypothesis, but I noticed the following while looking at how InstCombine works, and I'm confident this is it:
https://github.com/llvm-mirror/llvm/blob/ed10960040a110b16e0415cc5ec320247384f721/lib/Transforms/InstCombine/InstCombineCompares.cpp#L528
In this issue, the code is comparing roughly (S *)slice_start == (S *)slice_start + slice_len - 1.
This function extracts the constant byte offset, -8, and divides it by sizeof(S), which is 8. However, before that, when compiling for a 32-bit target, the value is masked with PtrSizeMask (0xFFFFFFFF). That discards the sign bit of the int64 and gives +4294967288, which divided by 8 is the same 536870911 that ends up in the result.
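The faulty arithmetic can be mimicked in plain Rust (a sketch of the computation, not the actual InstCombine code): masking the signed 64-bit byte offset down to the 32-bit pointer width before the division turns -8 into a large positive value, and the sign is lost.

```rust
fn main() {
    let byte_offset: i64 = -8; // constant offset from `getelementptr inbounds %S, %S* %end, i32 -1`
    let size_of_s: i64 = 8;    // size of %S on this target
    let ptr_size_mask: i64 = 0xFFFF_FFFF; // PtrSizeMask for a 32-bit target

    // Buggy order of operations: mask first, then divide.
    let masked = byte_offset & ptr_size_mask; // +4294967288, sign bit discarded
    let bad_index = masked / size_of_s;       // 536870911 == 0x1FFF_FFFF
    assert_eq!(bad_index, 536870911);

    // The intended element index is simply the signed division.
    let good_index = byte_offset / size_of_s; // -1
    assert_eq!(good_index, -1);
}
```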
Here's also the furthest I was able to reduce the IR while still reproducing the issue.
; ModuleID = 'tmp1-317d481089b8c8fe83113de504472633.rs'
source_filename = "tmp1-317d481089b8c8fe83113de504472633.rs"
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"
%S = type { [0 x i32], [2 x i32], [0 x i32] }
; Function Attrs: uwtable
define internal void @_ZN3tmp4main17hcd108164c8e55929E() unnamed_addr {
  %1 = call { [0 x %S]*, i32 } @"_ZN68_$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..deref..Deref$GT$5deref17h968a5bfabac70869E"()
  %2 = extractvalue { [0 x %S]*, i32 } %1, 0
  %3 = extractvalue { [0 x %S]*, i32 } %1, 1
  %4 = getelementptr inbounds [0 x %S], [0 x %S]* %2, i32 0, i32 0, i32 0, i32 0
  %5 = getelementptr inbounds [0 x %S], [0 x %S]* %2, i32 0, i32 %3, i32 0, i32 0
  %6 = bitcast i32* %4 to %S*
  %7 = bitcast i32* %5 to %S*
  %8 = getelementptr inbounds %S, %S* %7, i32 -1
  %9 = bitcast %S* %8 to i32*
  %10 = bitcast i32* %9 to %S*
  %11 = icmp eq %S* %10, %6
  br i1 %11, label %12, label %13

; <label>:12:                                     ; preds = %0
  ret void

; <label>:13:                                     ; preds = %0
  ret void
}
declare hidden { [0 x %S]*, i32 } @"_ZN68_$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..deref..Deref$GT$5deref17h968a5bfabac70869E"() unnamed_addr
Incredible work! 👏
I've managed to reduce your test case a bit further. As should be expected from the root cause you identified, changing the pointer size to 64 bit makes the issue disappear. Trying to simplify the types and GEPs further runs into the problem that other (simpler and correct) GEP optimizations fire first.
target datalayout = "p:32:32"
%S = type { [2 x i32] }
define i1 @foo([0 x %S]* %p, i32 %n) {
  %start.cast = bitcast [0 x %S]* %p to %S*
  %end = getelementptr inbounds [0 x %S], [0 x %S]* %p, i32 0, i32 %n, i32 0, i32 0
  %end.cast = bitcast i32* %end to %S*
  %last = getelementptr inbounds %S, %S* %end.cast, i32 -1
  %cmp = icmp eq %S* %last, %start.cast
  ret i1 %cmp
}
Someone with a http://bugs.llvm.org/ account should report this issue upstream (I should really request one soon, but, well, I'm not doing that now).