The current terminology is that the (optionally explicitly specified) numerical values associated with enum variants are called "discriminants", while the representation may use "tags".
By default (that is, unless e.g. #[repr(iN)] / #[repr(uN)] is specified), discriminants have the isize type and tags usually take the smallest signed/unsigned primitive integer that fits all the values.
All types (not just enums) have a "discriminant", which is always 0u8 for all non-enum types.
For enums, the discriminant is computed from the memory representation, which specifically for unoptimized tagged unions is done by casting the tag to the discriminant type.
In the general case, the original discriminant values do not need to be used in-memory.
However, some of the older code in the compiler mixes up the two terms and can cause confusion.
Ideally, for enums, we would use "discr(iminant)" for the value associated with each variant, and "tag" for its encoding in the memory representation of tagged unions.
cc @oli-obk @nikomatsakis
馃憤
I've been busy teaching myself how Rust enums are represented in memory, and I think I know enough to attempt to fix this issue. I'll still need guidance along the way.
I've been struggling to find a good definition of _tagged unions_ and how they relate to enums in Rust. From what I know so far, a tagged union is how an enum is represented in memory by Rust. Is this correct?
From what I know so far, a tagged union is how an enum is represented in memory by Rust. Is this correct?
Mostly. This is how an enum is represented that has the tagged layout. https://github.com/rust-lang/rust/blob/master/src/librustc/ty/layout.rs#L820
So Option<&Foo> is not represented that way.
@samWson Wikipedia has a page on them, although they focus on the high-level concept (like Rust's enums) whereas I'm using the term here to mean one of the possible representation choices, specifically one which involves an explicit "tag" and an "union" (in the C sense) of variant data.
See the C/C++ example under https://en.wikipedia.org/wiki/Tagged_union#1970s_&_1980s for how you might "manually" implement a tagged union in a language without the higher-level concept.
Mostly. This is how an enum is represented that has the tagged layout. https://github.com/rust-lang/rust/blob/master/src/librustc/ty/layout.rs#L820
@oli-obk Going by the original issue, is the enum quoted above an example of the code that needs to be changed to use the correct terminology?
For example, the code at the quoted link will become:
/// General-case enums: for each case there is a struct, and they all have
/// all space reserved for the tag, and their first field starts
/// at a non-0 offset, after where the tag would go.
Tagged {
tag: Scalar,
variants: Vec<LayoutDetails>,
},
Likewise, any uses of type Tagged throughout the file will be changed too.
Is this cleanup of terminology what the issue is trying to solve?
Yes, that's exactly the cleanup meant here
Great. I'll get started on the cleanup then.
Is there any code outside of src/librustc/ty/layout.rs that will need to be updated to take into account the changes needed by this issue? I haven't been able to find any references to ty::layout::Variants::Tagged outside of layout.rs myself. Likewise, I haven't found any tests that would be affected by these changes either. If there are any tests I need to be aware of please let me know.
@samWson try searching for Tagged, there should be a bunch of uses in both miri and rustc_trans.
EDIT: as for tests, there shouldn't be any changes, as long as you're just doing renamings.
Pull request is passing CI. Please review and let me know if there are more changes required. #50309
Most helpful comment
@samWson Wikipedia has a page on them, although they focus on the high-level concept (like Rust's
enums) whereas I'm using the term here to mean one of the possible representation choices, specifically one which involves an explicit "tag" and an "union" (in the C sense) of variant data.See the C/C++ example under https://en.wikipedia.org/wiki/Tagged_union#1970s_&_1980s for how you might "manually" implement a tagged union in a language without the higher-level concept.