Rust: Small SIMD test fails with --release but passes without

Created on 22 Apr 2018  路  17Comments  路  Source: rust-lang/rust

Full repro here:
https://github.com/danielrh/simd_playground run with cargo test --release
failing test is here:

#![feature(stdsimd)]
mod test {
#[test]
fn baseline() {
use std::simd::*;
   let symbol = 2i16;
   let inc = 1i16;
   let data = i16x16::new(4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64);
   let one_to_16 = i16x16::new(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16);
   let increment_v = i16x16::splat(inc);
   let mask_v = unsafe {
            ::std::arch::x86_64::_mm256_cmpgt_epi16(::std::arch::x86_64::__m256i::from_bits(one_to_16),
                                                   ::std::arch::x86_64::__m256i::from_bits(i16x16::splat(i16::from(symbol))))
    };
    let output = data + (increment_v & i16x16::from_bits(mask_v));
    let mut xfinal = [0i16; 16];
    output.store_unaligned(&mut xfinal);
    assert_eq!(xfinal, [4, 8, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65]);
}
}

using:
```rustc 1.27.0-nightly (ac3c2288f 2018-04-18)
binary: rustc
commit-hash: ac3c2288f9f9d977acb46406ba60033d65165a7b
commit-date: 2018-04-18
host: x86_64-apple-darwin
release: 1.27.0-nightly
LLVM version: 6.0

on OSX 10.12.6 (16G1314)

and also on linux

rustc --verbose --version
rustc 1.27.0-nightly (ac3c2288f 2018-04-18)
binary: rustc
commit-hash: ac3c2288f9f9d977acb46406ba60033d65165a7b
commit-date: 2018-04-18
host: x86_64-unknown-linux-gnu
release: 1.27.0-nightly
LLVM version: 6.0
```

I was unable to reproduce the above problem by making a simple main with the above function, so it could be something to do with the build options.

One more note: the same exact test and code have been working for months with stdsimd 0.0.3 and 0.0.4 crate. I couldn't get that crate to build with nightly 1.27.0, so I couldn't see if it still worked.

A-LLVM A-simd C-bug T-compiler

Most helpful comment

Fails with nightly-2019-01-26 == rustc 1.33.0-nightly (bf669d1e3 2019-01-25).
Passes with nightly-2019-01-27 == rustc 1.33.0-nightly (20c2cba61 2019-01-26).

This is consistent with the fix being in the LLVM update of #57675, so yeah, I think we can close this.

All 17 comments

Are you compiling with the avx2 target feature enabled?

Great question: I can avoid the bug by specifying RUSTFLAGS="-C target-cpu=core-avx-i" cargo test --release as the build command line

But in the past, llvm has been able to polyfill the instructions down to SSE2 with stdsimd-0.0.4 and therefore I didn't have to specify any flag at all.

I've noticed that asking LLVM to polyfill the instructions from avx2 down to core-avx-i actually improves performance on most available AVX2 hardware unless your instructions are sufficiently dense. This is because the AVX2 instructions downclock the chip for some time, and so it's much better to keep the full clock speed, but then keep the AVX2 code around for newer chips like skylake.

I really loved writing AVX2 intrinsics and having the compiler match them to my desired architecture...having 4 code paths (avx2, avx and SSE4.2 and SSE2) to match my desired targets is a significant support burden, so from my perspective, the old behavior with the SSE2 polyfill was ideal.

This... really isn't what those intrinsics are for. Sometimes the path of least resistance for the compiler is to treat some intrinsics as a generic operation that can be lowered to other instruction sets as well, but that is not at all guaranteed. If you want something portable across different SIMD instruction sets, you should use the (in-development) portable SIMD types, not AVX2 intrinsics.

Good suggestion about the portable simd types: I was able to make this helper function which seems to translate into the avx2 intrinsic when needed. Would this kind of thing be worth providing for all portable SIMD types?

#[inline(always)]
fn cmp_gt_i16x16(lhs: i16x16, rhs: i16x16) -> i16x16 {
    let lz = rhs - lhs;
    let sign_bit = lz & i16x16::splat(-32768);
    sign_bit >> 15
}

I do, however, think that this should either error out in the development build, or preferably yield a compiler error (or at least SIGILL) instead of providing wrong arithmetic results in release. Also, I suspect this is a recent LLVM bug in their polyfill...and may still be worth correcting

This is an issue with opt-level 3 specifically and I believe is a bug inside of LLVM. The problem is that we're passing all arguments by reference (the SIMD arguments) and LLVM is accidentally promoting them to by-value which is known to produce bugs.

Specifically LLVM's Promote 'by reference' arguments to scalars on SCC pass is promoting pass-by-reference to pass-by-value, which is invalid in the sense of how we're expecting to use these functions.

I've opened an upstream LLVM bug at https://bugs.llvm.org/show_bug.cgi?id=37358

fn baseline() {
    let data = i16x16::new(4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64);
    let one_to_16 = i16x16::new(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16);
    let output = one_to_16.gt(i16x16::splat(2i16)).select(data + 1i16, data);
    // note: if the mask is often false for all lanes you could guard the select
    // behind an `if mask.any() { ... }`
    assert_eq!(
        output,
        i16x16::new(4, 8, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65)
    );
}
  • @alexcrichton I've pinged a couple of more people on that LLVM bug, it would be nice to get this fixed for LLVM 7. The code in the OP uses unsafe to call an AVX function, but this is safe (defined behavior) if the code path is only reached in a CPU with AVX. Therefore, this bug is turning defined behavior into undefined behavior, and all of this in stable Rust, so we should probably mark this with I-unsound.

Is there a way to tell which from the I-unsound issues affect stable Rust only? There are a lot of them, and many affect only nightly Rust, but it is hard to tell them apart.

Thanks for the extra pings @gnzlbg! Let's see how that plays out...

@gnzlbg Your playground does not work anymore :(

2 | use std::simd::i16x16;
  |          ^^^^ Could not find `simd` in `std`

@hellow554 you need to use the packed_simd crate, std::simd is (hopefully temporarily) not part of std anymore.

So it appears that this won't be fixed in LLVM any time soon, and AFAICT this is not something we can easily warn about in the Rust side of things for the time being :/

I find this bug unfortunate, as I'm trying to do safe wrappers (called fearless_simd), but this basically makes that approach unworkable. If the bug won't be fixed soon, maybe we should document the danger zone.

I read the llvm issue. It's interesting that this bug has persisted so long without getting triggered; it's evidence that the way people use C++ and Rust are quite different in spite of the similar approaches to zero-cost abstractions etc.

I'm hoping that I woke up on the right side of the bed this morning as after reading #55059 I was struck with inspiration about how we might solve this, manifested in https://github.com/rust-lang/rust/pull/55073. If others are familiar with LLVM review on that would be greatly appreciated!

I'm posting a revert for the fix in https://github.com/rust-lang/rust/pull/55281 because I don't think the fix was quite right (causing segfaults for me). LLVM, however, in the meantime should have an official fix, so this should hopefully get closed out in the near future once that lands.

Does this still reproduce with nightly? If not we can close this.

Fails with nightly-2019-01-26 == rustc 1.33.0-nightly (bf669d1e3 2019-01-25).
Passes with nightly-2019-01-27 == rustc 1.33.0-nightly (20c2cba61 2019-01-26).

This is consistent with the fix being in the LLVM update of #57675, so yeah, I think we can close this.

Is there any reason this is not yet closed?

Was this page helpful?
0 / 5 - 0 ratings