Rust: "discriminant_value" intrinsic returns u64, cannot fit all discriminants

Created on 28 Mar 2020  Â·  27Comments  Â·  Source: rust-lang/rust

Ever since we had enums with repr(u128)/repr(i128), the type of the discriminant_value intrinsic was no longer adequate: it returns a u64, which simply cannot fit all discriminant values.

@eddyb says the type of the intrinsic should be adjusted. This will affect both its codegen and Miri implementation.


This issue has been assigned to @lcnr via this comment.


A-intrinsics C-bug T-compiler T-lang T-libs

Most helpful comment

I wouldn't have any sympathy for people transmuting to u64.

If you want to be slick, you could make std::mem::Discriminant\

All 27 comments

cc @rust-lang/libs I wonder if we're safe to change std::mem::discriminant.
We might need a crater run as I suspect there are people who transmute the result to u64.

A crater run would be fine, but we should not block a change on people doing things like that IMO.

I wouldn't have any sympathy for people transmuting to u64.

If you want to be slick, you could make std::mem::Discriminant\

Would the following work?

Add a (potentially hidden) permanently unstable trait

trait DiscriminantKind {
    // the traits are required by `mem::Discriminant`
    type Discriminant: Clone + Copy + Debug + Eq + PartialEq + Hash + Send + Sync + Unpin;
}

which is automatically implemented for every type and change intrinsics::discriminant to return DiscriminantKind::Discriminant instead.

mem::Discriminant should now be able to use <T as DiscriminantKind>::Discriminant as its only field.

This allows for mem::Discriminant to be smaller in most cases while still being fairly simple to implement afaict.

The Discriminant type was originally made opaque due to exactly this issue IIRC (and a couple other reasons). If only there was a way to prohibit transmuting it, or mark the type Unsized...


@lcnr There’s no point in making the intrinsic hide its implementation details, as the intrinsics are inherently _unstable_.

The reason why I considered hiding it is to not pollute rustdoc by adding this (for the user irrelevant) impl to every type.

@rustbot claim

Rather than perma-unstable, I think the trait could be private to libcore. (The Discriminant struct doesn’t need a trait bound since the trait needs to be implemented for all types anyway: for example std::mem::discriminant(&true) compiles today.)

An associated type in a trait is required because that’s how the compiler supports having a field of different types in struct Discriminant<T> { … } depending on T, but these are implementation details that don’t need to be visible in any public API.

@SimonSapin While it is not required as a bound for Discriminant, it is required for the return type of intrinsics::discriminant_value which is public

intrinsics::discriminant_value which is public

We can make the intrinsic private, they're not really meant to be used directly anyway.

Couldn’t the return type of intrinsics::discriminant_value be changed to Discriminant?

This might not even require a change to the compiler side of the intrinsic, if we add #[repr(transparent)] to Discriminant

I’m not sure what’s exactly the point of changing the intrinsic. Intrinsics are inherently unstable, have changed in the past multiple times and people are expected to not rely on them being stable. It is an implementation detail.

What for are we trying to make our own lives harder by thinking about it at all?

We can make the intrinsic private, they're not really meant to be used directly anyway.

Not sure there’s any difference as people can just (unstably) extern it if they want.

Yeah, I would change the intrinsic in whatever way makes the implementation easier.

While implementing #70705 I realized that this breaks PartialEq for repr128 enums:

#![feature(core_intrinsics, repr128)]

use std::intrinsics::discriminant_value;

#[derive(PartialEq, Debug)]
#[repr(i128)]
enum Test {
    A = 0,
    B = u64::max_value() as i128 + 1, 
}

fn main() {
    println!("{}, {}", discriminant_value(&Test::A), discriminant_value(&Test::B));
    assert_eq!(Test::A, Test::B);
}

playground

memory unsafety :heart:

#![feature(repr128, arbitrary_enum_discriminant)]

#[derive(PartialEq, Debug)]
#[repr(i128)]
enum Test {
    A(Box<u64>) = 0,
    B(usize) = u64::max_value() as i128 + 1, 
}

fn main() {
    assert_eq!(Test::A(Box::new(2)), Test::B(0));
    //~^ Illegal instruction (core dumped)
}

At least #[repr(i128)] seems to still be unstable, I didn't realize that was the case.
I guess fixing this brings it closer to stabilization?

@lcnr to clarify, when you say

I realized that this breaks PartialEq for repr128 enums:

you meant that the failure to represent 128-bit discriminants breaks PartialEq, right? That is, it's not that the approach of using an associated type of a magic trait breaks PartialEq, but rather the underlying bug.

Yes, in the current state PartialEq can lead to undefined behavior as discriminant_value truncates 128 bit discriminants. #70705 correctly compares the untruncated 128 bit values and fixes this issue.

#![derive(PartialEq)]
enum Foo {
    A(u8),
    B,
}

can be thought of as

impl PartialEq for Foo {
    fn eq(&self, rhs: &Self) -> bool {
        use core::intrinsics::discriminant_value;
        if discriminant_value(self) != discriminant_value(rhs) {
            return false;
        }

        match (self, rhs) {
            (Foo::A(l), Foo::A(r)) => l == r,
            (Foo::B, Foo::B) => true,
            _ => unsafe { core::hint::unreachable_unchecked() },
        }
    }
}

The output of --pretty expanded for https://github.com/rust-lang/rust/issues/70509#issuecomment-609980977 includes:

impl ::core::cmp::PartialEq for Test {
    #[inline]
    fn eq(&self, other: &Test) -> bool {
        {
            let __self_vi =
                unsafe { ::core::intrinsics::discriminant_value(&*self) } as
                    i128;
            let __arg_1_vi =
                unsafe { ::core::intrinsics::discriminant_value(&*other) } as
                    i128;
            if true && __self_vi == __arg_1_vi {
                match (&*self, &*other) {
                    (&Test::A(ref __self_0), &Test::A(ref __arg_1_0)) =>
                    (*__self_0) == (*__arg_1_0),
                    (&Test::B(ref __self_0), &Test::B(ref __arg_1_0)) =>
                    (*__self_0) == (*__arg_1_0),
                    _ => unsafe { ::core::intrinsics::unreachable() }
                }
            } else { false }
        }
    }

My understanding is that in current Nightly, intrinsics::discriminant_value returns a u64 so u64::max_value() as i128 + 1 is incorrectly truncated to zero. Thus __self_vi == __arg_1_vi is true even though the two values have different a variant, so control flow would go to the last branch of match but with intrinsics::unreachable we get UB.

I believe https://github.com/rust-lang/rust/pull/70705 would fix this.

One note would be that, eventually, it seems like we could stabilize the DiscriminantKind trait, though I would never permit user-provided implementations. It'd be nice to work towards a "minimal set of lang items" that we can use as the basis for the rest of the stdlib.

Although I guess that's an orthogonal thing -- i.e., in user code, Discriminant<T> is the stable way to access the "discriminant for this type". And the lang item itself is specific to how our stdlib impl works but not user exposed. Ok, never mind what I just wrote, except to note that exposing associated types of "magic traits" like T::Discriminant might be a nice option in the future.

So what is needed to unblock this? This PR is blocking some Miri cleanup of read_discriminant (that would be awkward to do right now because of the bug this fixes).

@RalfJung by "this PR" you mean https://github.com/rust-lang/rust/pull/70705 ?

It seems like we're mostly good to go forward, there aren't (afaict) many public-facing implications of this beyond correctly handling larger discriminants.

Sorry, I meant to post that in the PR https://github.com/rust-lang/rust/pull/70705, yes.

Was this page helpful?
0 / 5 - 0 ratings