Rust: Tracking issue for RFC 1758: Specify `repr(transparent)`

Created on 3 Jul 2017  路  55Comments  路  Source: rust-lang/rust

RFC

This is a tracking issue for the RFC "Specify repr(transparent)" (rust-lang/rfcs#1758).

Steps:

B-RFC-approved C-tracking-issue T-lang disposition-merge finished-final-comment-period

Most helpful comment

The other way (&T to &UnsafeCell) is definitely not okay, since this is a use-after-free.

Well yes, you're casting &T to &mut T in here. That's not legal if the &T didn't point into an UnsafeCell to begin with.

(It's also really dangerous to use transmute this way because the lifetime inferred for the result has nothing to do with the lifetime of the argument, but that's not the problem here. Usually transmute should be wrapped into a helper function to make sure the argument and return types are what you think they are.)

Do you have any thoughts on whether transmuting Box> to Box would be unsound?

Actually I think Box<Cell<T>> to Box<T> and vice versa are safe conversions, in the sense that the following two functions would be sound to add to libstd:

fn cell_to_box<T>(x: Box<Cell<T>>) -> Box<T> {
  unsafe { mem::transmute(x) }
}
fn box_to_cell<T>(x: Box<T>) -> Box<Cell<T>> {
  unsafe { mem::transmute(x) }
}

After all, owning a Box<T> excludes the possibility of any kind of reference pointing into the box. And the same is true for &mut Box<Cell<T>> and &mut Box<T>; and even &mut T and &mut Cell<T> (or really, wrapping any number of &mut and Box around the Cell).

EDIT: Confirmed, I proved this for Box<Cell<T>> and &mut Cell<T>.

As for Arc... I'll have to think about this more, and will answer on SO.

All 55 comments

Mentoring instructions unavailable. This is sadly not that easy, as we have too much code that still relies on assumptions that such types do not exist. I have an in-progress branch which requires implementing the necessary support for "newtype unpacking" but no time frame for completion.

Is making types like Cell repr(transparent) (which e.g. https://github.com/rust-lang/rfcs/pull/1789 relies on) part of the implementation of the RFC or would that require its own RFC?

@RalfJung For a change at that level, a PR approved by the libs team suffices.

The RFC makes contradictory statements about the interaction with repr(align(N)): The Motivation section says

Given this attribute delegates all representation concerns, no other repr attribute should be present on the type. This means the following definitions are illegal:

#[repr(transparent, align = "128")]
struct BogusAlign(f64);

#[repr(transparent, packed)]
struct BogusPacked(f64);

... while the Detailed Design section says:

This new representation cannot be used with any other representation attribute but alignment, to be able to specify a transparent wrapper with additional alignment constraints:

#[repr(transparent, align = "128")]
struct OverAligned(f64); // Behaves as a bare f64 with 128 bits alignment.

#[repr(C, transparent)]
struct BogusRepr(f64); // Nonsensical, repr cannot be C and transparent.

AFAICT the RFC initially prohibited repr(transparent, align), then in the discussion some people asked for it to be allowed, but later the rationale for that was called into question and there was no consensus or (explicit) decision. So it's hard to see which alternative was intended to be the final state by the RFC author and the lang team.

I'd disallow it since it significantly complicates the implementation IMO. Is there any motivation for allowing the combination?

Discussion is mostly https://github.com/rust-lang/rfcs/pull/1758#issuecomment-270223812 plus the immediate replies. The motivation is not immediately clear to me, just that a nested wrapper type with align(N) and no transparent is not equivalent to a single type with repr(align(N), transparent). Perhaps @briansmith or @nox could elaborate?

@nox tells me on IRC they have no use case for it. Let's amend the RFC to consistently reject repr(transparent, align). If someone does find a use case, they can deal with the implementation effort and amend the RFC again.

Sorry, it's been forever since we had that conversation. Maybe I'm overlooking something, but I think #[repr(transparent, align=16)] would be very common. For example:

#[repr(transparent, align=16)]
struct Aes256Key([u8; 32]);

#[repr(transparent, align=16)]
struct AesNonce([u8; 16]);

// Keep this in sync with the C definition in aes.h
#[repr(C)]
struct Aes256Context {
    key: Aes256Key,
    ...
    nonce: AesNonce,
    ...
}

extern aes256Encrypt(context: &mut Aes256Context, ...);

I'm not sure how one would express this otherwise.

Is there any ABI where you need #[repr(transparent)] in conjunction with an aggregate type (e.g. a non-transparent struct, or in your case, an array)?
AFAIK, none of the ABIs we support today has such a distinction, so #[repr(transparent)] wouldn't really have to do anything to be correct in those cases, but maybe I'm wrong?

EDIT: Also, are you ever passing or returning Aes256Key or AesNonce by value to/from a function? That's the only time #[repr(transparent)] actually does anything.
The calling convention part of the ABI is orthogonal to the memory layout, and the latter is identical between a struct with one field and the field itself, pretty much everywhere.

Is there any ABI where you need #[repr(transparent)] in conjunction with an aggregate type (e.g. a non-transparent struct, or in your case, an array)?

Consider:

typedef uint8_t AES256Key[32];
typedef uint8_t AES128Key[16];
typedef uint8_t AESNonce[16];

struct Aes256Context {
    alignas(16) AES256Key key;
    alignas(16) AES256Nonce nonce;
};

Notice in particular that in this C code AES128Key and AES128Nonce are both aliases for uint8_t[16] and so they can be used (unfortunately) interchangeably. In my Rust code, I want to use the newtype pattern to make AES128Key and AES128Nonce non-interchangeable types. AFAICT, this requires using #repr[transparenent)], e.g. #repr[transparenent)] struct AES128Key([u8; 16]). In this codebase it is also required (for performance and/or correctness reasons, depending on platform) that everything be 128-bit aligned because of assumptions made by the hand-coded assembly code that uses these structures.

AFAIK, none of the ABIs we support today has such a distinction, so #[repr(transparent)] wouldn't really have to do anything to be correct in those cases, but maybe I'm wrong?

EDIT: Also, are you ever passing or returning Aes256Key or AesNonce by value to/from a function? That's the only time #[repr(transparent)] actually does anything.

Is that true? Maybe I'm misunderstanding the purpose of #[repr(transparent)], but I think that #[repr(transparent)] is supposed to do more than change how things are passed by value. Consider:

struct Outer1 {
    struct Wrapper value1;
    struct Wrapper value2; 
};

struct Wrapper {
    uint8_t value[12];
};

struct Outer2 {
    uint8_t value1[12];
    uint8_t value2[12];
};

Are these structures guaranteed, by C, to have the same memory layout? I don't remember the details but I think the answer is "no". And that's why #repr(transparent) matters for more than by-value argument passing semantics; it allows us to make a Wrapper type in our Rust bindings to that is transparent w.r.t. C semantics.

The calling convention part of the ABI is orthogonal to the memory layout, and the latter is identical between a struct with one field and the field itself, pretty much everywhere.

I think a struct with one field, like Wrapper above, is allowed to have additional padding that isn't necessarily added when there is no such struct. For example, in my example, I think it's allowed for offsetof(Outer1, value2) == 16 && offsetof(Outer2, value2) == 12.

@briansmith That would imply Wrapper has non-byte alignment and we don't currently support any architecture where that is the case, are you aware of any out there?

EDIT: also, it looks like the correct translation of the C code would be #[repr(align(16))] on fields, which we could reasonably add support for.

I just found it in the C99 spec: 6.7.2.1.15 says "There may be unnamed padding at the end of a structure or union."

@briansmith That would imply Wrapper has non-byte alignment and we don't support any architecture where that is the case, are you aware of any out there?

I don't know. However, let me ask you what really matters: Are we willing to guarantee in Rust that #[repr(c)] struct Foo(Bar) has the same memory layout as Bar? AFAICT, no, we can't, because some C platform might make them different, which is why we're specing #[repr(transparent)].

That would imply Wrapper has non-byte alignment

Just to further clarify, that's not actually implied. A C compiler is allowed to pad every struct to an even word size or whatever size it wants, according to 6.7.2.1.15.

which is why we're specing #[repr(transparent)]

Not really, #[repr(transparent)] is originally solely for wrapping integer and floating-point scalars in calls without changing their ABI properties, and the only reason that isn't statically enforce{d,able} is because there's some desire to involve generics like NonZero and UnsafeCell.

A C compiler is allowed to pad

Yes, but that's impossible in both LLVM and Rust without changing the alignment of the struct, and while there is a way to do this in the "data layout" string, no LLVM-supported platform actually sets it above 1 byte for ABI alignment, only for "preferred alignment" (which can't add padding, and it's only a hint for the outermost allocation alignment, where it can increase runtime performance).

There could be a bug here that perhaps clang does the padding itself on some platform, and we're unaware of it. That would have drastic consequences on the simplicity of the implementation.

But either way, the ABI for that platform will have to have spec'd it for that to be a valid implementation choice - it's not like we interop with the C standard, but rather with common C implementations - size_t == uintptr_t is a good example of such a choice we've made.

@eddyb It's also the case that #[repr(align)] will only work for struct. In C alignas isn't restricted to struct and much code (in particular, code I interface with) uses alignas on non-struct types, in particular arrays. So, given a C struct definition that cannot be changed (practically), like this, what would be the right way to write the Rust binding that's guaranteed to work on all platforms?:

struct Foo {
    alignas(16) key[32];
    alignas(16) nonce[16];
};

@briansmith Sorry I edited https://github.com/rust-lang/rust/issues/43036#issuecomment-352282807 instead of adding another comment.

the correct translation of the C code would be #[repr(align(16))] on fields, which we could reasonably add support for.

I should mention that this has come up before, and I believe @nikomatsakis wants it as well.

the correct translation of the C code would be #[repr(align(16))] on fields,

I'm not sure I agree. There's a lot of reasons to want to have fully transparent newtypes. For example, in my case I would like to protect the individual bytes of a key or nonce from being read or written, and I don't want a Nonceor Key to be interchangeable with a general [u8; 32] or [u8; 16].

Also, when #[repr(align)] was designed alignment of fields and not just structs was of course brought up multiple times and was rejected (IIUC). So it feels like we're running in circles on that issue.

One thing I should say is that it's probably doable to increase the alignment of a newtype in the current implementation, but not lower it (i.e. #[repr(transparent, packed)] would be illegal).

However, it's not clear how that would interact with call ABIs, especially if LLVM is responsible for placing scalar arguments onto the stack, we'd have to do a bit more investigation into it.

which is why we're specing #[repr(transparent)]

Not really, #[repr(transparent)] is originally solely for wrapping integer and floating-point scalars in calls without changing their ABI properties, and the only reason that isn't statically enforce{d,able} is because there's some desire to involve generics like NonZero and UnsafeCell.

For me it has nothing to do with calling conventions; I just want to use atomics in #[repr(C)] structs. I'll never pass those structs by value; they'll be on the heap where Rust and C can mutate them concurrently.

@willmo That's again something where you don't really need to wait for #[repr(transparent)], as most ABIs (all we support) use the same memory layout for one-field structs (out of sheer simplicity).

What you can't do at all right now with a struct is pass it by value with the ABI calling convention properties of its field, when it contains one scalar (integer, floating-point, or pointer) field.

@eddyb I realize it doesn't affect the semantics in my case, but it would silence the improper_ctypes warning that I still get.

One thing I should say is that it's probably doable to increase the alignment of a newtype in the current implementation, but not lower it (i.e. #[repr(transparent, packed)] would be illegal).

I think #[repr(transparent, packed)] struct Foo(Bar); would be a no-op. In fact, I would expect #[repr(packed)] struct Foo(Bar);, with no transparent involved, to be a no-op too; is it not?

#[repr(packed)] struct F32(f32); is still an aggregate for ABI purposes, and thus is passed differently from plain f32 or repr(transparent)] struct F32(f32). That's, like, the whole motivation for repr(transparent)!

@briansmith #[repr(packed)] sets the alignment to 1 which isn't compatible with any primitive other than bytes themselves (u8 / i8) and it would require quite some LLVM gymnastics to be.

What are the remaining pieces of work or open questions that are preventing this from being stabilized?

Is there a reason this shouldn't be automatic in most cases?

It is automatically done as an optimisation, for #[repr(Rust)] types, which shouldn't be used in FFI.

@glandium #[repr(transparent)] does nothing to layout, if you look at the implementation.
It just whitelists the struct for the FFI lint but relies on the default "newtype unpacking" optimization.

So, if I wrap a #[repr(C)] type in a Cell, UnsafeCell, or a wrapper around those, they're automatically transparent?

@glandium But then you don't get a guarantee - also, if your own type isn't transparent (#[repr(C)] is specifically "anti-transparent" by default), wrapping it in anything won't make it transparent.

@rfcbot fcp merge

I propose that we stabilize #[repr(transparent)]. This narrowly scoped feature, implemented in this PR, is an often-ignored-or-forgotten feature that is necessary in order to make two different types ABI-compatible across FFI.

Prior to implementation, there was discussion of how repr(transparent) would interact with repr(align). The combination of these two features has been made into an error, so we're free to decide on a behavior here in the future should a need arise.

There has been little-to-no iteration or discussion about changes to the feature since its initial implementation.

Team member @cramertj has proposed to merge this. The next step is review by the rest of the tagged teams:

  • [x] @aturon
  • [x] @cramertj
  • [x] @eddyb
  • [x] @joshtriplett
  • [ ] @nikomatsakis
  • [x] @nrc
  • [x] @pnkfelix
  • [x] @scottmcm
  • [x] @withoutboats

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Should marking some std types as transparent be part of the stabilization?

@glandium I don't think it needs to be-- I think that std types marked as #![repr(transparent)] should be insta-stable-ly considered as ABI compatible. if there are specific types you want marked, please open a PR or an issue!

If struct A(B) is #[repr(transparent)], is it necessarily safe to transmute a &A to &B? What about smart pointers, e.g. Arc<A> -> Arc<B>?

When A is UnsafeCell<B> (which seems to be a motivating example), you would be transmuting from a type with UnsafeCell in it to a type without. I imagine the compiler might be allowed to make assumptions about the (non-)aliasing of the &B that don't hold because it's "actually" an &UnsafeCell<B>.

On the other hand, if you can't safely turn a pointer to A into a pointer to B, the use of #[repr(transparent)] for FFI is (as far as I know) limited to passing newtypes by-value. And if UnsafeCell (and Cell) isn't transparent, then it doesn't seem to be applicable to FFI, and this can't be used to implement rust-lang/rfcs#1789 either.

This came up on Stack Overflow with respect to UnsafeCell.

@trentj This RFC was written because we want to make UnsafeCell<T> be #[repr(transparent)].

From an unsafe code guidelines perspective, casting &UnsafeCell<T>& to &T should be okay if there is no &mut currently live anywhere that points into this UnsafeCell.

@RalfJung That makes sense. The other way (&T to &UnsafeCell<T>) is definitely not okay, since this is a use-after-free.

Do you have any thoughts on whether transmuting Box<UnsafeCell<T>> to Box<T> would be unsound? Or other smart pointers (Arc being relevant to the SO question where this came up)? (Again assuming there are no live &mut pointers to the inner T.)

The other way (&T to &UnsafeCell) is definitely not okay, since this is a use-after-free.

Well yes, you're casting &T to &mut T in here. That's not legal if the &T didn't point into an UnsafeCell to begin with.

(It's also really dangerous to use transmute this way because the lifetime inferred for the result has nothing to do with the lifetime of the argument, but that's not the problem here. Usually transmute should be wrapped into a helper function to make sure the argument and return types are what you think they are.)

Do you have any thoughts on whether transmuting Box> to Box would be unsound?

Actually I think Box<Cell<T>> to Box<T> and vice versa are safe conversions, in the sense that the following two functions would be sound to add to libstd:

fn cell_to_box<T>(x: Box<Cell<T>>) -> Box<T> {
  unsafe { mem::transmute(x) }
}
fn box_to_cell<T>(x: Box<T>) -> Box<Cell<T>> {
  unsafe { mem::transmute(x) }
}

After all, owning a Box<T> excludes the possibility of any kind of reference pointing into the box. And the same is true for &mut Box<Cell<T>> and &mut Box<T>; and even &mut T and &mut Cell<T> (or really, wrapping any number of &mut and Box around the Cell).

EDIT: Confirmed, I proved this for Box<Cell<T>> and &mut Cell<T>.

As for Arc... I'll have to think about this more, and will answer on SO.

If struct A(B) is #[repr(transparent)], is it necessarily safe to transmute a &A to &B?

This specific case doen鈥檛 need transmute. Modulo privacy, you can take a reference to a field: let a: &A = /* ... */; let b = &a.0;. I think it鈥檚 the other way around that鈥檚 more interesting: "adding" a transparent wrapper when you only have a reference to begin with.

From an unsafe code guidelines perspective, casting &UnsafeCell<T>& to &T should be okay if there is no &mut currently live anywhere that points into this UnsafeCell.

(I know you know this, this is just to provide an example.) Yes, in fact &*x.borrow() on a RefCell does exactly this.

The other way (&T to &UnsafeCell<T>) is definitely not okay, since this is a use-after-free.

The given example is not ok because it uses Cell, which has additional invariants and safe APIs.

With UnsafeCell, the main API to access the wrapped value is unsafe.

Either way, cells are very special and come with a set of rules on how to safely use them. The question of what casts/transmutes are made acceptable by #[repr(transparent)] is also interesting for more "boring" wrapper types like struct PixelSize(u32); and e.g. casting &[u32] to &[PixelSize].

Thanks for that answer, @RalfJung!

I got a little mixed up between Cell and UnsafeCell probably because I was thinking too much about the &mut T -> &Cell<T> conversion.

I'm still processing the implications a little. As I currently understand it, #[repr(transparent)] is intended to suggest something like reference compatibility. The only special thing about UnsafeCell in this respect is that its interface is unsafe, so you're still on the hook to guarantee its invariants should you choose to use that interface (and indeed, there's not much point in UnsafeCell if you don't).

By this argument, Cell should not be #[repr(transparent)] because transmuting a reference to it -- which is intended to work for transparent types -- would violate Cell's own invariants.

As I currently understand it, #[repr(transparent)] is intended to suggest something like reference compatibility.

I would put it differently. repr(transparent) just affects the layout part of my considerations over at SO.

The only special thing about UnsafeCell in this respect is that its interface is unsafe

No; what's special about UnsafeCell is that it interacts with shared references to permit interior mutability. That's why there are aliasing considerations.

By this argument, Cell should not be #[repr(transparent)] because transmuting a reference to it -- which is intended to work for transparent types -- would violate Cell's own invariants.

Even with repr(transparent), the transmute is (and will be) still unsafe. So making Cell or really any newtype repr(transparent) is fine and i no way grants permission for outside code to just transmute stuff around and expect that to make sense.

It's just that without repr(transparent), doing the transmute is essentially immediate UB (because the layout could differ).

repr(transparent) isn't about memory layout at all! You can ensure layout compatibility of a wrapper types with the wrapped type with repr(C), and we could easily -- and should, IMO -- specify this for repr(Rust) types as well. We might also want to state this for particular types like UnsafeCell where privacy means users can't know/rely on anything about the fields.

repr(transparent) is exclusively about argument/return value ABI compatibility when passing by-value, which you can't get any other way. This is quite explicit in both the RFC and the docs.

Ah, sorry. I was sloppy in my terminology and also forgot about the ABI part. I meant, in general, "how values of that type are represented as bytes in the machine", be it when passed around as function arguments or when stored in memory. Memory layout and ABI are part of this, but they are closely related in my mind -- and they are separate from other aspects of the semantics, like noalias.

repr(transparent) is exclusively about argument/return value ABI compatibility when passing by-value, which you can't get any other way. This is quite explicit in both the RFC and the docs.

That's how I originally understood it, but UnsafeCell isn't terribly useful by value, right? new and into_inner are both safe and part of UnsafeCell's public interface. So when would it be meaningful to pass or return UnsafeCell<T> via FFI, that just T wouldn't work?

The RFC says

This also means that UnsafeCell cannot be soundly used in place of a bare T in FFI context, which might be necessary to signal to the Rust side of things that this T value may unexpectedly be mutated.

But moving an UnsafeCell<T> invalidates any existing pointers anyway. So if it can't be placed behind a pointer of some kind, "this T value may unexpectedly be mutated" seems like a meaningless signal.

Is there some example of an UnsafeCell<T> being used (by value) in an FFI context where #[repr(transparent)] would help?

I think you're right and this is a mistake in the RFC text. The feature is still useful for other newtypes appearing by-value in FFI signatures, though. I briefly talked to @nox on IRC and they still think UnsafeCell should be transparent, but I'll leave it to them to make that case here.

There is no case to make. It's just that if you want an UnsafeCell<T> field somewhere and pass that to C, you need UnsafeCell<T> be repr(C) or repr(transparent) or the FFI lint will complain. Given repr(C) does more than repr(transparent), the latter should be used IMO.

Given repr(C) does more than repr(transparent)

Shouldn't this be "does less"?

No. Does more. A C wrapper around a float is not the same as a bare float. A transparent wrapper is the same as a bare float.

Ah, you meant as in "changes what the type does". I was thinking in terms of the guarantees provided; repr(transparent) guarantees more because you also get ABI compatibility.

:bell: This is now entering its final comment period, as per the review above. :bell:

The final comment period, with a disposition to merge, as per the review above, is now complete.

Oops, this should have been closed by https://github.com/rust-lang/rust/pull/51562.

Could I ask here for some eyes on a suggestion to add #[repr(transparent)] to Box<T> and whether there are any obvious downsides? Thanks! https://github.com/rust-lang/rust/issues/52976

Was this page helpful?
0 / 5 - 0 ratings