This is a tracking issue for the RFC "an operator to take a raw reference" (rust-lang/rfcs#2582).
Steps:
Unresolved questions:
&[mut] <place> as *[mut|const] ?T
in the surface syntax but had a method call inserted, thus manifesting a reference (with the associated guarantees). The lint as described would not fire because the reference actually gets used as such (being passed to deref). However, what would the lint suggest to do instead? There just is no way to write this code without creating a reference.
Implementation history:
I'll try to get a PR open for this soon.
I would suggest splitting the implementation work up into phases to make each part thoroughly reviewed and tested. However, think of this list as a bunch of tasks that need to be done at some point.
&raw [mut | const] $expr in the parser and AST.
unused_parens lint works alright.
Make safe_packed_borrows a hard error when the feature gate is active, and make this the case as well for unsafe equivalents.
...but it seems @matthewjasper already has a PR heh.
Make safe_packed_borrows a hard error when the feature gate is active, and make this the case as well for unsafe equivalents.
I'm not sure that this is a good idea. This would make it impossible to gradually migrate a codebase over to using raw references - once the feature gate is enabled, all references to packed structs must be fixed at once in order to get the crate building again.
I wonder why not &raw $expr but &raw const $expr. Is there a discussion about this?
@moshg Yes, this was discussed. See here (and the following comments) for an initial discussion. And here (and the following comments) for a signoff from grammar-wg. And here for an example that &raw (expr) can be ambiguous.
I think it might be worth amending the RFC to state why &raw const $expr was chosen over &raw $expr.
The RFC thread noted a couple of times (including in the FCP comment) that &raw expr without the const conflicts with existing syntax and would break real code. https://github.com/rust-lang/rfcs/pull/2582#issuecomment-465515012 goes into some detail. My understanding is that it is still possible to migrate to &raw expr with breaking changes in an edition, and I didn't see much explicit discussion of that, but I think everyone agrees that there's enough urgency here that we definitely prefer making &raw const expr a thing over blocking any solution on waiting for another edition.
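To make the conflict concrete, here is a minimal sketch of code where `raw` is an ordinary identifier, so `&raw(...)` and `&raw` already have a meaning as plain borrows. The function and variable names are made up for illustration.

```rust
// `raw` is not a reserved word, so it can name a function...
fn raw(x: i32) -> i32 {
    x + 1
}

// ...and `&raw(41)` is then a shared borrow of the call's result.
pub fn reference_to_call_result() -> i32 {
    let r: &i32 = &raw(41);
    *r
}

// Likewise `&raw` can simply borrow a local variable called `raw`.
pub fn reference_to_variable() -> i32 {
    let raw = 7;
    let r: &i32 = &raw;
    *r
}
```

Requiring `const` or `mut` after `raw` keeps both of these parsing as they do today, since a keyword can never follow an identifier in that position.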
@mjbshaw @lxrec
Thank you for your explanations! I now understand why &raw $expr is a breaking change.
I think it might be worth amending the RFC to state why &raw const $expr was chosen over &raw $expr.
@RalfJung I'm addressing the points you've made in https://github.com/rust-lang/rfcs/pull/2582#issuecomment-539906767
promotion being confusing here seems mostly orthogonal to raw-vs-ref, doesn't it?
While it would be helpful to have at least a warning for a promotion in the &mut <rvalue> case, disabling it is a backward incompatible change to the language. In the &raw mut <rvalue> case it can be done from the beginning.
If no promotion happens with raw ptrs, what would you expect to happen instead?
I would expect &raw mut <rvalue> to result in a compilation error, as &<rvalue> does in C.
I would expect &raw mut (1+2) to be promoted the same way &mut (1+2) is; not doing that seems even more confusing and very error-prone
I have an impression that silently allowing such promotions is more error prone, as it violates the principle of least surprise.
trait AddTwo {
fn add_two(&mut self);
}
impl AddTwo for u32 {
fn add_two(&mut self) { *self += 2; }
}
const N: u32 = 0;
// somewhere else
assert_eq!(N, 0);
// N += 2; // doesn't compile
N.add_two(); // compiles with no warnings
assert_eq!(N, 2); // fails
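A compilable variant of the snippet above may help show where the surprise comes from. Every mention of a `const` item yields a fresh value, so the `&mut self` receiver borrows a temporary rather than `N` itself. The function name `const_is_copied` is only for illustration, and recent compilers may emit a warning on the first call.

```rust
trait AddTwo {
    fn add_two(&mut self);
}

impl AddTwo for u32 {
    fn add_two(&mut self) {
        *self += 2;
    }
}

const N: u32 = 0;

pub fn const_is_copied() -> (u32, u32) {
    // Mutates a temporary copy of N; the constant itself is untouched.
    N.add_two();

    // The same operation on an explicit local is observable.
    let mut local = N;
    local.add_two();

    (N, local)
}
```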
promotion being confusing here seems mostly orthogonal to raw-vs-ref, doesn't it?
yes, promotion has nothing to do with it as far as I can tell.
I would expect &raw mut <rvalue> to result in compilation error, as &<rvalue> does in C.
I agree. We should error on any &raw place where place
Deref projections (*) or Index ([i]) projections
since we still want to permit &raw *foo() I would assume. Or more concretely &raw foo().bar when foo() returns a reference.
If "erroring if not" has too many negations (i confused myself in the process of writing this), here's a version of what I think we should permit (and only that):
We should permit &raw place only if place
Deref projections or Index projections
While it would be helpful to have at least a warning for a promotion in &mut <rvalue> case, disabling it is a backward incompatible change to the language. In &raw mut <rvalue> case it can be done from the beginning.
Note that not promoting here means this becomes a very short-lived temporary and the code has UB (or fails to compile, when only safe references are used)! Is that really what you want? ptr as T is not a place like ptr is and cannot be used as such; in particular it cannot be mutated.
I have an impression that silently allowing such promotions is more error prone, as it violates the principle of least surprise.
I have a hard time imagining that silently causing UB is less of a surprise than promoting...
N.add_two(); // compiles with no warnings
You are conflating some things here; what you are seeing here are implicit deref coercions. Promotion is not involved in this example.
We should permit &raw place only if place
Oh I see, you want to cause some static errors. Yeah I can imagine that there are reasonable things we could do here.
You are conflating some things here; what you are seeing here are implicit deref coercions. Promotion is not involved in this example.
There's a bit of a misunderstanding. I used the wrong term. When I said "rvalue promotion", I had "creation of a temporary memory location for an rvalue" in mind, not "promotion of a temporary to 'static".
@red75prime oh I see. Well currently, with promotion, we do not create a temporary memory location, but just a global one. But I suppose your desired behavior is more like what @oli-obk described, where we have static compiler errors?
I am not sure to what extent we can reliably produce those though. But MIR construction should know when it builds temporaries, so couldn't it refuse to do that when &raw is involved? Cc @matthewjasper
Is there any plan to eventually add &raw pointer support to pattern matching? It could enable getting rid of various forms of deliberate UB in abomonation.
@HeroicKatora actually wrote an RFC for that, though it was originally meant as an alternative to &raw: https://github.com/rust-lang/rfcs/pull/2666
I'd be fine with it being interpreted as complementary, and can rewrite portions of it as necessary :)
I can't recall if this was discussed in the RFC (and failed to find any comments on it), but can <place> be a pointer (to allow pointer-to-field computations)? Example:
struct Struct {
field: f32,
}
let base_ptr = core::mem::MaybeUninit::<Struct>::uninit().as_ptr();
let field_ptr = &raw const base_ptr.field; // Is this allowed by this RFC?
Currently field_ptr has to be initialized with unsafe { &(*base_ptr).field as *const _ }. Using unsafe { &raw const (*base_ptr).field } is a bit better, but avoiding the *base_ptr dereference entirely would be ideal.
I assume the RFC doesn't permit this (since base_ptr.field isn't a place), but I just wanted to confirm (to be clear: I'm not trying to advocate for this RFC to do that if it doesn't already; I would like to draft a separate RFC exploring ways to avoid the *base_ptr dereference).
I can't recall if this was discussed in the RFC (and failed to find any comments on it), but can <place> be a pointer (to allow pointer-to-field computations)?
That question is ill-typed. A pointer is a kind of value, which is a class of things separate from places.
let field_ptr = &raw const base_ptr.field; // Is this allowed by this RFC?
No, because there is no deref coercion for raw ptrs. But &raw const (*base_ptr).field is allowed, because (*base_ptr).field is a place.
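As a sketch of the allowed form, the following initializes a struct field by field through raw pointers without ever creating a reference to uninitialized data. It uses `std::ptr::addr_of_mut!`, the stable library macro form of `&raw mut`; the struct layout and the function name `init_field` are assumptions for illustration.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct Struct {
    tag: u8,
    field: f32,
}

pub fn init_field() -> f32 {
    let mut slot = MaybeUninit::<Struct>::uninit();
    let base_ptr: *mut Struct = slot.as_mut_ptr();
    // SAFETY: base_ptr points to live, properly aligned stack memory, so the
    // place (*base_ptr).field is in bounds. No reference is created, and both
    // fields are written before assume_init is called.
    unsafe {
        let field_ptr: *mut f32 = addr_of_mut!((*base_ptr).field);
        field_ptr.write(1.5);
        addr_of_mut!((*base_ptr).tag).write(7);
        slot.assume_init().field
    }
}
```

The explicit `(*base_ptr)` is still required because raw pointers get no implicit dereference on field access.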
avoiding the *base_ptr dereference entirely would be ideal.
That has nothing to do with places though, that's just a deref coercion. When you write similar things with references, they work because the compiler adds the * for you -- and not because of anything having to do with places.
since base_ptr.field isn't a place
It's not even well-typed.
I would like to draft a separate RFC exploring ways to avoid the *base_ptr dereference).
Agreed, the current syntax is bad -- but mostly because of the parentheses and the use of a prefix operator. I think we should have postfix-deref, e.g. base_ptr.*.field. That has been discussed before... somewhere...
That question is ill-typed. A pointer is a kind of value, which is a class of things separate from places.
You're right. Sorry for my poor choice of words.
let field_ptr = &raw const base_ptr.field; // Is this allowed by this RFC?
No
Thanks for answering my question. I'll follow up to your other points in a new thread on internals.rust-lang.org.
Small nit: a deref coercion is &T to &U via deref(s). Dot access has its own implicit deref.
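The two mechanisms named in this nit can be told apart with a small sketch (function names are illustrative only):

```rust
// Deref coercion: a `&String` is converted to `&str` at a coercion site,
// here the typed `let` binding.
pub fn coercion(s: &String) -> usize {
    let t: &str = s;
    t.len()
}

// Dot access: method lookup dereferences the receiver as often as needed
// (`&Box<String>` -> `Box<String>` -> `String` -> `str`) to find `len`.
pub fn autoderef(b: &Box<String>) -> usize {
    b.len()
}
```

Neither mechanism applies to raw pointers, which is why `base_ptr.field` is rejected while `(*base_ptr).field` is accepted.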
Hey, the RFC document doesn't link to this tracking issue; it links to https://github.com/rust-lang/rust/issues/ .
So, wait, would let field_ptr = &raw const (*base_ptr).field; work without unsafe with this RFC? Or would it still need to be wrapped in an unsafe {} block due to the "dereference" in (*base_ptr)?
given that people try to write offset calculation code using null pointers, that expression shouldn't produce a real deref, and it should then actually work in safe code. wrapping_offset is safe, and offset is only unsafe because you _could_ go out of bounds, but a field pointer will always be in-bounds so there's not that danger.
No, it requires unsafe. The place expression does not access the memory base_ptr points at, but it is (in effect for the current implementation) an offset operation and as such a dangling base_ptr causes UB.
Even if we take the separate & not-even-formally-proposed step of dropping the "must point to an allocated object" requirement for field projections to retroactively make offsetof implementations using null pointers defined, there's still the separate matter of wraparound in the address calculation (e.g. if base_ptr as usize == usize::MAX). Such wraparound is currently UB as well (it enables some nice optimizations) and does not affect any offsetof implementation, so it would be an even tougher sell to drop that UB.
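For contrast, wrapping pointer arithmetic is a safe operation on any raw pointer, including null or dangling ones, because it makes no in-bounds promise. A minimal sketch, with a hypothetical helper name:

```rust
// Safe for any `base`: wrapping_add only computes an address. The result may
// not be dereferenced unless it happens to point into a live allocation that
// `base` was derived from.
pub fn wrapping_field_addr(base: *const u8, offset: usize) -> *const u8 {
    base.wrapping_add(offset)
}
```

A field projection such as `(*base_ptr).field` carries the stricter in-bounds requirement described above, so it cannot be substituted for this freely.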
ahhhh I agree that you're correct but I don't like it.
In terms of the reference, &raw const (*base_ptr).field still hits the clause saying
Dereferencing (using the * operator on) a dangling or unaligned raw pointer.
This RFC does not actually change that list of UB. It just makes it so that one can avoid the clause in the reference that refers to &[mut] types.
That said, I just realized that the reference declares &raw const (*packed_field).subfield UB if packed_field is an unaligned raw pointer (e.g. to a field of a packed struct). That is probably not what we want...
The reason I phrased that clause as referring to "dereferencing" as opposed to actual memory accesses is precisely @roblabla's question.
Looks like we need two clauses?
- Dereferencing (using the * operator on) a dangling raw pointer.
- Reading from or writing to an unaligned raw pointer. (This refers only to *ptr reads and *ptr = val writes; raw pointer methods have their own rules which are spelled out in their documentation.)
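The parenthetical about raw pointer methods can be illustrated with `read_unaligned`, which is documented to accept unaligned pointers, whereas a plain `*ptr` read would fall under the second clause. The helper below is a hypothetical example, not code from the thread.

```rust
// Reads a native-endian u32 starting at byte index `at`, which may be an
// unaligned address.
pub fn read_u32_at(bytes: &[u8], at: usize) -> u32 {
    assert!(at + 4 <= bytes.len());
    let p = bytes[at..].as_ptr() as *const u32;
    // SAFETY: the four bytes starting at `p` lie inside `bytes` (checked
    // above), and read_unaligned has no alignment requirement.
    unsafe { p.read_unaligned() }
}
```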
Should we also add raw ref bindings in patterns? For implementing offset_of!, https://internals.rust-lang.org/t/discussion-on-offset-of/7440/2 recommends using a struct pattern instead of a field access expression, in order to protect against accidentally accessing the field of another struct through Deref:
let u = $crate::mem::MaybeUninit::<$Struct>::uninitialized();
let &$Struct { $field: ref f, .. } = unsafe { &*u.as_ptr() };
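The hazard the struct pattern guards against can be sketched as follows. The types `Inner` and `Wrapper` are invented for illustration: a field access expression silently resolves through `Deref`, while a pattern naming the same field on the wrapper is rejected at compile time.

```rust
use std::ops::Deref;

struct Inner {
    field: u32,
}

struct Wrapper {
    inner: Inner,
}

impl Deref for Wrapper {
    type Target = Inner;
    fn deref(&self) -> &Inner {
        &self.inner
    }
}

pub fn through_deref() -> u32 {
    let w = Wrapper { inner: Inner { field: 9 } };
    // Compiles: `field` is found on Inner via Deref, not on Wrapper itself.
    // An offset computed from such an expression would describe Inner.
    w.field
    // By contrast, `let Wrapper { field, .. } = w;` would not compile,
    // because Wrapper has no field named `field`.
}
```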
The current implementation does not prevent &raw const foo().as_str() where fn foo() -> String as far as I can tell. Repeating my post from https://github.com/rust-lang/rust/pull/64588#discussion_r357969199
The check for &raw const 2 not being permitted is done on the HIR, and already mentions the downside to doing it on the HIR:
The check could be implemented on the MIR by bailing out if any Rvalue::AddressOf's Place is a temporary. Although that will probably start failing &raw const *&*local (reborrowing a local and then reading from it). I think it would be best if we somehow reuse the logic that causes
fn foo() -> String {
unimplemented!()
}
fn main() {
let x = foo().as_str();
println!("{}", x);
}
to emit
error[E0716]: temporary value dropped while borrowed
--> src/main.rs:6:13
|
6 | let x = foo().as_str();
| ^^^^^ - temporary value is freed at the end of this statement
| |
| creates a temporary which is freed while still in use
7 | println!("{}", x);
| - borrow later used here
|
= note: consider using a `let` binding to create a longer lived value
I believe we should block stabilization on resolving this in a nonfragile way.
The initial implementation PRs (#64588, #66671) for this are now merged and will be available from the next nightly.
The current implementation does not prevent &raw const foo().as_str() where fn foo() -> String as far as I can tell.
Is this fully implemented? Several PRs have been merged, but the "Implement the RFC" box is still unchecked.
The RFC proposes a lint that is unimplemented. The raw borrow operation is usable in nightly.
For the sake of end-user documentation, I'd like a clarification on this loose note at the end of the RFC:
Lowering of casts. Currently, mut_ref as *mut _ has a reborrow inserted, i.e., it gets lowered to &mut *mut_ref as *mut _. It seems like a good idea to lower this to &raw mut *mut_ref instead to avoid any effects the reborrow might have in terms of permitted aliasing. This has the side-effect of being able to entirely remove reference-to-pointer-casts from the MIR; that conversion would be done by a "raw reborrow" instead (which is consistent with the pointer-to-reference situation).
I believe this is in reference to a historical spookiness in the language that mutable_ref as *mut as *const is actually semantically different from mutable_ref as *const -- the former creating a pointer which is legal to cast to a *mut and write to, while the latter doesn't (roughly speaking). I was under the impression that this distinction was miserably Actually Important. Is that no longer the orthodoxy among the UCG folks?
I believe this is in reference to a historical spookiness in the language that mutable_ref as *mut as *const is actually semantically different from mutable_ref as *const
No.
This is in reference to the fact that MIR lowering of mut_ref as *mut _ actually generated &mut *mut_ref as *mut _. Since &mut is a very meaningful operation for Stacked Borrows, replacing that by &raw mut *mut_ref is a huge difference. See this PR; the shr_and_raw test failed with a Stacked Borrows violation before this change.
The mutable_ref as *const _ situation is still as described in https://github.com/rust-lang/rust/issues/56604#issuecomment-477954315. As in, right now the raw pointer type used for the initial cast from a reference is important and determines the mutability of the created pointer. If you do reference as *const _, you get a read-only pointer, even if reference: &mut _. Casting from one raw pointer type to another and back doesn't do anything, though. Whether that is desirable or not is an open question to me, as is how to fix it if we declare it undesirable.
So I've been thinking about this. I've started to lean back towards "we should just add the &raw mut|const operator", even though it's not ultimately the "end user" experience I think we want. More generally, I think we ought to be looking to make steady, incremental progress on unsafe code guidelines and fixing cases (like taking the address of fields in packed structs, or perhaps https://github.com/rust-lang/rust/issues/55005) that have no real "safe path" in Rust code today.
I think there's definitely room to do a "re-think" of the raw pointer types and operations in Rust, in order to make writing unsafe code in Rust more ergonomic and correct, but I suspect that we should not block progress on such work. It may be that we deprecate some of the more piecemeal solutions or older strategies in the future, but that seems ok.
@nikomatsakis That's fine for me, too. :D So what would be the next step along that line? File a stabilization PR for &raw and FCP it? Or is it too soon for that?
I also agree &raw is a healthy improvement over the status quo that should be made available as soon as possible.
In particular I like that this tiny step includes the clunkiness of &raw const for two reasons:
Being slightly longer means folks are still subtly encouraged to make &raw mut and avoid the aforementioned const provenance issues. (and folks who want the variance properties of const pointers probably also want to use NonNull, so those folks are already funneled into creating mutable ptrs)
It leaves us ample room to introduce (in a later edition) a hypothetical unified raw pointer that just uses unqualified &raw. (Not something I'm super hopeful about these days but I like that this door isn't closed)
We discussed this in the most recent @rust-lang/lang meeting and I believe we had some sense that while everyone would like to move forward promptly, we would overall prefer a macro like raw_ref!(path) rather than the &raw syntax. This macro would have to be "well known" to the compiler in some way -- I'll defer the discussion of how to implement that to a Zulip topic, since it doesn't seem that important.
We didn't discuss in the meeting whether to have raw_ref! and raw_mut_ref! or to have something like raw_ref!(mut path).
The idea here would be to make the core capability available in some reasonable form for now, with the expectation that we may explore an alternative solution in the future.
This doesn't address @Gankra's point about the "const provenance" issues, which I guess refers to this comment that @RalfJung highlighted. I'll ask questions in that issue, I suppose, but it seems like a somewhat orthogonal issue to me.
we would overall prefer a macro like raw_ref!(path) rather than the &raw syntax. This macro would have to be "well known" to the compiler in some way -- I'll defer the discussion of how to implement that to a Zulip topic, since it doesn't seem that important.
Why would it have to be well-known? It could be implemented with the unstable syntax and allow_internal_unstable?
I think the _idea_ is that the raw_ref! macro would be a "compiler builtin", so it doesn't expand to any real code that you could ever write, even on Nightly. It just expands directly to the MIR operation.
That would certainly be a possible implementation, but I do not see why that would be better than what I proposed.
It would prevent ossification of an "in progress" nightly syntax, particularly on crates that want to use this new syntax but also target stable through cfg flags. It also makes it easier to write custom diagnostics for incorrect cases. That being said, beyond those two points the difference is minimal.
I can see the diagnostics argument. But I think feature flags are a well-tried mechanism against "ossification" of unstable syntax.
I put up a PR adding the macros in https://github.com/rust-lang/rust/pull/72279. These are just simple wrappers around the unstable syntax; we can always change that implementation detail later if needed.
One thing we've discussed in the past but I think not documented at any length is a larger reworking of unsafe pointers which this could be collapsed into. The basic concept, called "unsafe references," was this:
We would introduce a new reference type (let's say &unsafe T and &unsafe mut T, syntax to be determined), which would be like raw pointers but with a more ergonomic and safe API. One particular feature would be that they would be nonnull, and nullable pointers would be represented with an Option just like all nullable references are. This would give a modicum of additional type safety and avoid the situation we have where structs often should be using NonNull to get the niche optimization, but don't because NonNull is unwieldy to work with.
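The niche optimization mentioned here can be observed with today's types. This sketch only measures sizes of existing standard library types; it says nothing about how the proposed `&unsafe T` would be implemented.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Returns the sizes of a raw pointer, an optional non-null pointer,
// and an optional raw pointer, in that order.
pub fn niche_sizes() -> (usize, usize, usize) {
    (
        size_of::<*mut u32>(),
        // NonNull can never be null, so None can reuse the null bit pattern.
        size_of::<Option<NonNull<u32>>>(),
        // A raw pointer may be null, so Option needs room for a discriminant.
        size_of::<Option<*mut u32>>(),
    )
}
```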
We could also rework the casting system to make it a bit more type safe (such as distinguishing between mutability casts and referent type casts), maybe find a way to make assignment less error prone (instead of implicitly dropping the previous value which may have been invalid for drop), etc. In other words, these types would be reworked raw pointers so that they are a bit less error prone.
And these types would degrade into raw pointers for backwards compatibility of course.
Such reference types would also have a constructor which would subsume this RFC, since they would not be guaranteed to be valid or aligned.
Anyway, no one is working on actually developing this feature. But I like stabilizing the raw_ref macro in the short term to give us time to consider a more comprehensive change like the one I've described here in the long term.
I'm nominating this for @rust-lang/lang discussion -- in particular, I think I am ready to propose stabilizing the raw-ref macro, and I think the rest of us are. Maybe we should solicit a stabilization report?
We're just past the paper deadline so I have some time. ;) Let me know if/how I can help with that report.
The desire to get the feature out to the masses makes sense to me, but I haven't seen it (the macro) used or experimented with anywhere out in the wild, let alone libstd, where we have just 2(?) uses of the underlying &raw syntax and AFAICT no uses of the macro.
Intuitively I would've expected more use of this feature, at least within libcore/liballoc/libstd, even if just to give us more confidence in its implementation, but maybe I'm just overestimating the use-cases for the raw reference operator?
I assume you are referring to the uses introduced by https://github.com/rust-lang/rust/pull/73845. This BTree PR also makes crucial use of &raw: https://github.com/rust-lang/rust/pull/73971. I have not made a thorough audit of libstd to determine where else it is needed; that is way more work than I have time for, I am afraid.
Outside of libstd, memoffset uses it, behind a nightly flag (https://github.com/Gilnaa/memoffset/pull/43), to provide -- for the first time! -- a sound offset_of!. This in turn is used all over the place.
A crater experiment found 80 cases (around 50 of those crates.io crates) where references are created to fields of packed structs. These all represent UB; some of those probably need &raw to be fixed.
(Small note: There's already a sound offset_of! macro available in the ecosystem, and on Stable, but the limitation there is that you need to have an instance of the type already created to use the macro)
(Yes I should have said something like "a sound unrestricted offset_of!" or so, sorry.)
(actually that macro can be UB on packed structs too, because that's just a "bug" that all of rust is susceptible to)
Cool stuff. In that case I've no concerns.
@RalfJung I think the report would include a short history of the feature and relevant PRs, some details about how it's in use, and links to a few tests or examples. Basically all info that can be scraped from the last few comments in this tracking issue, I imagine. =)
I'm on vacation now for a week; I can try to draft something once I am back.
@RalfJung no rush <3
I posted the stabilization report in the tracking issue for the feature I'd like to see stabilized: https://github.com/rust-lang/rust/issues/73394#issuecomment-664379508.
I don't understand why &raw const (*(0usize as *const SomeStruct)).some_field is not UB despite dereferencing an invalid (zero-valued) pointer.
It is UB. Why do you think it is not?
If it is UB even with &raw, then why do this RFC and the &raw construct exist? It does not make any sense.
This is described in the RFC: https://github.com/rust-lang/rfcs/pull/2582.
@RalfJung Re-reading the RFC, I do think it makes sense to switch to just plain getelementptr when raw refs are involved, and we can always switch to a new nowrap variant if LLVM adds one.
IIRC this came up as a future possibility during the RFC. The problem is that this means the semantics of * differ depending on the syntactic context, which seems rather tricky and could be surprising to programmers -- in particular when that context is introduced implicitly by the compiler (through auto-ref).
I've read the RFC. There are many interesting things in there, especially about getelementptr inbounds.
So, even with this new feature there is no way to calculate a field offset, right?
So, even with this new feature there is no way to calculate a field offset, right?
No, this feature is enough to implement offset_of!. In fact, https://github.com/Gilnaa/memoffset/ already does so if you set the unstable_raw feature. (Note that this has been discussed here in this very thread before. The discussion here is not very long yet, please read it before posting. :)
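A rough sketch of how such a macro can be built on the raw borrow operation, in the spirit of the approach described in this thread. This is not memoffset's actual code: the macro name `offset_of_sketch!` and the `Header` struct are made up, and `std::ptr::addr_of!` stands in for `&raw const`.

```rust
// Hypothetical macro: computes the byte offset of `$field` within `$ty`.
macro_rules! offset_of_sketch {
    ($ty:ty, $field:ident) => {{
        // Uninitialized storage gives a pointer into a real allocation
        // without needing a valid value of the type.
        let uninit = std::mem::MaybeUninit::<$ty>::uninit();
        let base: *const $ty = uninit.as_ptr();
        // SAFETY: `base` points to live, aligned storage for `$ty`, so the
        // field place is in bounds. Nothing is read and no reference is made.
        let field = unsafe { std::ptr::addr_of!((*base).$field) };
        (field as usize) - (base as usize)
    }};
}

// repr(C) makes the layout predictable for the example.
#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
    id: u16,
}

pub fn header_offsets() -> (usize, usize, usize) {
    (
        offset_of_sketch!(Header, tag),
        offset_of_sketch!(Header, len),
        offset_of_sketch!(Header, id),
    )
}
```

Unlike the struct-pattern variant quoted earlier in the thread, this sketch does not guard against `$field` resolving through a `Deref` impl; a production macro would need that check as well.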
No, this feature is enough to implement offset_of!.
Through MaybeUninit? I would not call it "enough to implement".
The problem is that this means the semantics of * differ depending on the syntactic context, which seems rather tricky and could be surprising to programmers -- in particular when that context is introduced implicitly by the compiler (through auto-ref).
I'm not sure if by "this" you mean a future possibility, but I think the RFC as-is introduces exactly that problem.
Consider this code with UB:
struct A(u8, i32);
let dangling: *mut A = std::ptr::NonNull::dangling().as_ptr();
let field_ref: &i32 = unsafe { &(*dangling).1 };
The RFC seems to be written on the premise that the problematic part of this code is the reference-creating operator &, and proposes a new operator that can be used instead. However that does not match my mental model at all. My understanding of raw pointers in current stable Rust is that the dereference operator * is the only source of unsafety / potential UB. Using it requires the raw pointer to point to valid memory. Indeed, copying a field without using the reference-creating operator causes just as much UB:
struct A(u8, i32);
let dangling: *mut A = std::ptr::NonNull::dangling().as_ptr();
let copied_field: i32 = unsafe { (*dangling).1 };
With this RFC's addition to the language, dereferencing with * a raw pointer to invalid memory may or may not be UB, depending on whether the expression eventually also uses &raw. This makes the unsafety of raw pointers harder to reason about. I'm sorry to only say it this late (the RFC is already accepted), but I feel strongly that this is an important enough flaw that we should consider alternative designs that focus on avoiding the * operator instead of avoiding the & or &mut operator.
If we need to do projection of a raw pointer to a struct to a raw pointer to a field of that struct, could we have an operator doing exactly that? Strawman not fully thought through:
struct A(u8, i32);
let dangling: *mut A = std::ptr::NonNull::dangling().as_ptr();
let field_ptr: *mut i32 = dangling->1;
Aside, from the RFC:
If one wants to avoid creating a reference to uninitialized data (which might or might not become part of the invariant that must be always upheld), it is also currently not possible to create a raw pointer to a field of an uninitialized struct: again,
&mut uninit.field as *mut _
would create an intermediate reference to uninitialized data.
But what is the type of uninit in that expression? Presumably it is not SomeStruct but something like *mut SomeStruct from MaybeUninit::as_mut_ptr, otherwise the example would have UB before we even get to the given code. So that example might be missing an operator for dereferencing a raw pointer: &mut (*uninit).field as *mut _.
With this RFC's addition to the language, dereferencing with * a raw pointer to invalid memory may or may not be UB, depending on whether the expression eventually also uses &raw.
By the way, shouldn't raw pointer field projection always be safe? &raw mut (*dangling).1 requires an unsafe block in current Nightly.
@A1-Triard
Through MaybeUninit? I would not call it "enough to implement".
I mean, it literally is enough to implement it. We have constructive proof of this fact. I am not sure what else you are asking for. Maybe you consider the implementation too complicated, but that is a separate problem -- and a much less pressing one.
@SimonSapin
The RFC seems to be written on the premise that problematic part of this code is the reference-creating operator &, and proposes a new operator that can be used instead. However that does not match my mental model at all. My understanding of raw pointers in current stable Rust is that the dereference operator * is the only source of unsafety / potential UB.
What you say is orthogonal to the RFC. It is true that for raw pointers, * is the only source of UB. The issue is that, without this RFC, it is sometimes impossible to even use raw pointers, so one has to use references, and for those there is much more UB than for raw pointers.
With this RFC's addition to the language, dereferencing with * a raw pointer to invalid memory may or may not be UB, depending on whether the expression eventually also uses &raw.
That is not correct, the RFC says nothing like this. @joshtriplett proposed to make further changes that would have the effect. I am confused as to why you think the RFC does this, when it does not.
This code is UB with and without this RFC:
struct A(u8, i32);
let dangling: *mut A = std::ptr::NonNull::dangling().as_ptr();
let field_ref: &i32 = unsafe { &(*dangling).1 };
And this code (which can only be written after the RFC) is UB:
struct A(u8, i32);
let dangling: *mut A = std::ptr::NonNull::dangling().as_ptr();
let field_ptr: *const i32 = unsafe { &raw const (*dangling).1 };
The reason is the same in both cases -- * was used on a dangling raw pointer. The UB in the first example has nothing to do with the &, and thus this entire issue has nothing to do with the RFC.
By the way, shouldn't raw pointer field projection always be safe? &raw mut (*dangling).1 requires an unsafe block in current Nightly.
Now I am really confused, because this is the exact opposite of what you just asked for! If we want to make this safe, we have to make * special when used below &raw. And you just said you were rather opposed to that.
Now I am really confused, because this is the exact opposite of what you just asked for! If we want to make this safe, we have to make * special when used below &raw. And you just said you were rather opposed to that.
I thought this was what the RFC proposes, so it looks like I deeply misunderstand it :/
I thought this was what the RFC proposes, so it looks like I deeply misunderstand it :/
There is, as far as I can see, not a single instance of the deref operator * in the RFC (except for "future possibilities", where they are implicit though through auto-deref). So no, this is not what the RFC proposes. The "offsetof woes" paragraph in "Future possibilities" mentions such a change could be desirable in the future, but the rest of the RFC is specifically about creating pointers, not about dereferencing them.
@RalfJung, OK, you have convinced me. I will use the memoffset crate.
The RFC seems to be written on the premise that problematic part of this code is the reference-creating operator &, and proposes a new operator that can be used instead. However that does not match my mental model at all. My understanding of raw pointers in current stable Rust is that the dereference operator * is the only source of unsafety / potential UB.
That's not correct; the use case that the RFC immediately addresses has to do with packed structs and not dereferencing raw pointers.
#[repr(packed)]
struct A(u8, i32);
let x: A = A(0, 0);
let r: &A = &x;
// IMMEDIATE UB
let ub: &i32 = &x.1;
This code contains UB, as I understand it, because it creates an unaligned reference to an i32. This despite the fact that it contains no raw pointers at all. Eventually we'd like to make this a compiler error.
The raw ref macros let you get a *mut i32 without constructing an &i32; that avoids the UB, because you never create an unaligned reference. The problem today is that there's no reasonable way to create a raw pointer without first creating a reference, which must be aligned.
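As a sketch of that pattern, the following reads a packed field through a raw pointer. It uses `std::ptr::addr_of!`, the stable library macro form of the raw borrow; the function name is hypothetical.

```rust
use std::ptr::addr_of;

#[repr(packed)]
struct A(u8, i32);

pub fn read_packed_field(x: &A) -> i32 {
    // Takes the field's address directly; no `&i32` is ever created, so the
    // alignment requirement on references never comes into play.
    let p: *const i32 = addr_of!(x.1);
    // SAFETY: `p` points at a live, initialized i32 inside `*x`; it may be
    // unaligned, which read_unaligned explicitly allows.
    unsafe { p.read_unaligned() }
}
```

Note that the read itself still has to go through `read_unaligned`, since a plain `*p` would require alignment.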