Rust: `is_x86_feature_detected!("avx512f")` fails to build on beta and nightly

Created on 6 Feb 2020  路  38Comments  路  Source: rust-lang/rust

The following program builds and runs on stable but fails to build on nightly:

fn main() {
    dbg!(is_x86_feature_detected!("avx512f"));
}

Here's the build error on rustc 1.43.0-nightly (58b834344 2020-02-05):

error[E0658]: use of unstable library feature 'stdsimd'
 --> src/main.rs:2:10
  |
2 |     dbg!(is_x86_feature_detected!("avx512f"));
  |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: for more information, see https://github.com/rust-lang/rust/issues/27731
  = help: add `#![feature(stdsimd)]` to the crate attributes to enable
  = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)

But note that if I replace "avx512f" with "avx2", the error goes away.


I-nominated P-high T-lang T-libs disposition-merge finished-final-comment-period regression-from-stable-to-nightly

Most helpful comment

I think we should assume, for practical purposes, that there are no regressed crates until there's _positive_ empirical evidence for something.

To me, the phrase "We should assume there are no regressed crates [until there's empirical evidence otherwise]" describes a model that is, in my opinion, too willing to accept breakage unnecessarily, and is not in line with "if you are using the stable release of Rust, you will never dread upgrading to the next release"

(But maybe I'm just someone who does think that the "absence of evidence" maxim is worth raising from time to time.)

All 38 comments

cc @gnzlbg

Here's the output of cargo-bisect-rustc:

searched nightlies: from nightly-2020-01-02 to nightly-2020-02-10
regressed nightly: nightly-2020-02-06
searched commits: from https://github.com/rust-lang/rust/commit/c9290dceee2cb6b882b26ec6e294560e51ef0853 to https://github.com/rust-lang/rust/commit/58b834344fc7b9185e7a50db1ff24e5eb07dae5e
regressed commit: https://github.com/rust-lang/rust/commit/eda1a7adfcf6de70afa4ca0a6f709ed0e507516a

That points to https://github.com/rust-lang/rust/pull/68755, which is a large merge of stdarch. When I bisect stdarch, I find the error was introduced in https://github.com/rust-lang/stdarch/commit/cb6fa6800159e062ccb38c1179d69f19e1dde272, possibly intentionally. Is it the case that AVX-512 feature detection was never intended to be stabilized? Or maybe it got caught up in a larger group of unstable features? Prior to this regression, it did in fact work :)

A quick search through https://github.com/search?q=is_x86_feature_detected%21%28%22avx512f%22%29&type=Code turns up at least one other crate besides BLAKE3 that fails to build on nightly for this reason:

```
git clone https://github.com/PoC-Consortium/scavenger
cd scavenger
cargo check --features=simd # succeeds
cargo +nightly check --features=simd # fails

@Mark-Simulacrum, this might be a hi-pri regression. (Though of course I'm biased cause it affects my crate.) If @gnzlbg isn't available to evaluate it, could we pull in someone else?

cc @rust-lang/libs

To be honest I'm not even entirely sure where this macro is implemented...

Caused by https://github.com/rust-lang/stdarch/pull/739, which started enforcing stability on individual features.

It looks like a crater run was done in September, but the blake3 and b3sum crates were published in January. The blake3 crate relies on std to _detect_ AVX-512 support, but the actual AVX-512 implementation code is in C or assembly. Also the scavenger crate I mentioned above hides its use of AVX-512 detection behind its simd feature, so I don't think crater would've caught it.

Seems like there was no discussion done of whether this would break folks (or at least I can't find it). Clearly this is "permitted" under a strict interpretation of our stability rules, though is not documented on the macro at all; the docs list avx512f with no warning that it is unstable. That at least sounds like a bug.

I guess someone (libs team?) needs to make a call on whether this is a bug, and whether we should fix it.

Considering that this broke two crates on stable, I think we should just go ahead and stabilize the avx512 features. I don't think there is any particular reason for holding back on them.

(nominating to try to ensure T-libs resolves this in some manner before this hits stable.)

Even if we don't want to stabilize the intrinsics themselves for whatever reason, it seems fine to allow detection of avx512 support via the macro.

Could someone on T-libs fcp a stabilization for the following x86 features, which have been destabilized in nightly:

  • mmx
  • sse4a
  • avx512f
  • avx512cd
  • avx512er
  • avx512pf
  • avx512bw
  • avx512dq
  • avx512vl
  • avx512ifma
  • avx512vbmi
  • avx512vpopcntdq
  • tbm

https://doc.rust-lang.org/nightly/std/macro.is_x86_feature_detected.html#supported-arguments

Once we get the ball rolling on that, I can work on trying to fix whatever the root cause of this destabilization is.

(Or, if we don't want to stabilize all of them, and want to say "oh well breaking change", then that would be helpful too).

Also see #60109: there was an approved FCP to stabilize adx, tbm and sse4a but it got closed because the PR author was inactive.

According to https://github.com/rust-lang/stdarch/issues/310, it seems that a lot of AVX512 intrinsics are missing from std::arch. As such I don't think it really makes sense to stabilize the AVX512 feature.

I think the best approach would be to stabilize avx512 only for is_x86_feature_detected, since it was already (accidentally) stable and code is relying on it.

Any thoughts @rust-lang/libs @gnzlbg? I will write up a PR for stdarch if there are no objections.

It seems to me that stabilizing avx512 just as a parameter to is_x86_feature_detected should be fine? As in, I don't think there's any hazard of us never stabilizing it.

I'd be more skeptical of blanket stabilizing a bunch of parameters to is_x86_feature_detected though. For example, we might want to never stabilize anything to do with mmx.

it seems that a lot of AVX512 intrinsics are missing from std::arch. As such I don't think it really makes sense to stabilize the AVX512 feature.

Definitely agree here.

To be clear, stable today has the list in https://github.com/rust-lang/rust/issues/68905#issuecomment-589098209 stabilized, which includes mmx -- we can technically say "woops" I guess and revert that to being unstable, but that seems not great. (Though if there are good reasons to do so, then perhaps we should.)

There are good reasons to forbid MMX. It's the only instruction from the list above which can cause misoptimization (breaking otherwise sound code).
It's removal is discussed in https://github.com/rust-lang/stdarch/issues/823

Is it important to get fixes in before the next beta is tagged in a couple weeks?

It would be good, but I think it's not critical. We can always backport.

I've opened https://github.com/rust-lang/stdarch/pull/842 as a first stab at an upstream fix. That one only re-stabilizes the AVX-512 features (that were stable before), leaving mmx, sse4a, and tbm as they are. It sounds like we specifically don't want to include mmx, though I'm not sure what to do with the other two.

It sounds like we specifically don't want to include mmx, though I'm not sure what to do with the other two.

Upstream topic about sse4a and tbm: https://github.com/rust-lang/stdarch/issues/668

We discussed this in the language team meeting. We agreed that we should stabilize processor feature detection even if we don't have intrinsics for a given feature. So, in particular, we should re-stabilize support in is_x86_feature_detected! for any feature we know how to detect, even those we don't have stable intrinsics for. (That includes mmx, even though it doesn't sound like anyone is likely to care about it.)

@rfcbot poll lang Re-stabilize is_x86_feature_detected for any processor feature we know how to detect?

Team member @joshtriplett has asked teams: T-lang, for consensus on:

Re-stabilize is_x86_feature_detected for any processor feature we know how to detect?

  • [x] @Centril
  • [x] @cramertj
  • [x] @eddyb
  • [x] @joshtriplett
  • [ ] @nikomatsakis
  • [x] @pnkfelix
  • [x] @scottmcm
  • [ ] @withoutboats

The following workaround can be used to detect AVX512F and AVX512VL (which is what BLAKE3 uses) without depending on any unstable features:

// Detection code based on the std_detect crate
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn avx512f_avx512vl_detected() -> bool {
    use std::arch::x86_64::*;

    // Check for avx2 to ensure all needed CPUID features are available.
    if !is_x86_feature_detected!("avx2") {
        return false;
    }

    unsafe {
        // Check that the OS supports AVX512
        let xcr0 = _xgetbv(0);
        let os_avx512_support = xcr0 & 224 == 224;
        if !os_avx512_support {
            return false;
        }

        // Check that the CPU supports AVX512F and AVX512VL
        let CpuidResult { ebx, .. } = __cpuid(7);
        let avx512f = ebx & 1 << 16 != 0;
        let avx512vl = ebx & 1 << 31 != 0;
        avx512f && avx512vl
    }
}

Considering that these features were accidentally stabilized and that only 2 crates are depending on them I would propose that:

  • We do not stabilize the avx512 detection flags.
  • BLAKE3 and scavenger switch to using the above detection function.

We have no concrete evidence it's just two crates though, right?

I myself was pretty convinced by @joshtriplett's argument in the lang team meeting that there's not really any reason to not stabilize detection for anything that we can reliably detect; the intrinsics themselves are a different story.

IIRC, one key benefit of the macro is that it globally caches the results, whereas your function would need additional work to do so, I think?

We have no concrete evidence it's just two crates though, right?

(Not that this changes my view per the poll above, but I think we should not consider "absence of evidence is not evidence of absence" as a good argument. I think we should assume, for practical purposes, that there are no regressed crates until there's positive empirical evidence for something.)

I would avoid any kind of handcrafted cpudetection code scattered around.

(I have a patch pending in rav1e waiting for the avx512 cpu detection support to stabilize and probably other projects are in a similar situation)

Here's another affected crate: https://github.com/solana-labs/solana/commit/60877f9ba48c2c20cb90b522a5bb690d7989d85a

I've reconsidered my position: since cfg(target_feature = "X") is already stable for all feature flags (stable or unstable), I don't think it makes sense to block the same flag names for runtime feature detection.

I think we should assume, for practical purposes, that there are no regressed crates until there's _positive_ empirical evidence for something.

To me, the phrase "We should assume there are no regressed crates [until there's empirical evidence otherwise]" describes a model that is, in my opinion, too willing to accept breakage unnecessarily, and is not in line with "if you are using the stable release of Rust, you will never dread upgrading to the next release"

(But maybe I'm just someone who does think that the "absence of evidence" maxim is worth raising from time to time.)

triage: After spending a while explaining this bug to @spastorino , I figured I should take the plunge and mark this as P-high.

(I'm not 100% sure if it is a release blocker per se, but my interpretation of the comments above is that we would prefer to either land a fix for this before beta is cut or, failing that, prioritize a backport to the beta branch, so that the bug itself never hits stable.)

OK, let's try to make some progress on this. I'd like to propose stabilizing the following features for feature detection only:

  • mmx
  • sse4a
  • tbm
  • avx512f
  • avx512cd
  • avx512er
  • avx512pf
  • avx512bw
  • avx512dq
  • avx512vl

The SSE4a and TBM features were already scheduled to be stabilized in #60109, but the PR got abandoned. Only a subset of the AVX512 features are listed here because there is some uncertainty with the newer ones as to whether they should contain an underscore: avx512vbmi vs avx512_vbmi. Finally MMX, while not supported by LLVM, is a well known feature name.

@rfcbot fcp merge

Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:

  • [x] @Amanieu
  • [x] @Centril
  • [x] @KodrAus
  • [x] @SimonSapin
  • [x] @cramertj
  • [x] @dtolnay
  • [x] @joshtriplett
  • [x] @nikomatsakis
  • [x] @pnkfelix
  • [x] @scottmcm
  • [x] @sfackler
  • [ ] @withoutboats

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

https://github.com/rust-lang/stdarch/pull/842 is now up to speed with @Amanieu's proposal above.

:bell: This is now entering its final comment period, as per the review above. :bell:

As of the 1.42 release today, cargo +beta install b3sum is failing. This fix will need to be backported.

@rustbot assign @Amanieu

PR to update the submodule is here: #70151

It will need to be backported to beta.

Was this page helpful?
0 / 5 - 0 ratings