Self-referential generators violate LLVM's expectations for `noalias`, because the interior references overlap with the `&mut self` argument to the `Generator::resume` function.
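For concreteness, a minimal hypothetical sketch (not taken from the issue) of the kind of future this is about: a borrow of a local is held across an `.await`, so the reference ends up stored next to its referent inside the compiler-generated state machine, and the `&mut self` passed to `poll`/`resume` overlaps with that stored reference.

```rust
// Illustrative only; any async fn that keeps a borrow of its own local alive
// across a suspension point has this shape.
async fn self_referential() -> i32 {
    let mut x = 0;
    let r = &mut x;               // reference into the future's own state
    std::future::ready(()).await; // suspension point: `x` and `r` both live on
    *r
}
```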
Original issue description: async/await is possibly unsound -- I don't consider myself an authority here, so feel free to edit this top post with relevant information.
The current discussion is in https://github.com/rust-lang/rust/pull/63209
cc @rust-lang/lang @comex @rkruppe @RalfJung
Responding to @RalfJung's post from #63209:
I think this is fundamental. Adding `derive(Debug)` makes the exact content of local variables at any yield point observable. Interesting optimizations based on mutable references being unique rely on the content those references point to _not_ being observable by any other alias.

Those hypothetical optimizations could be disabled in async fns across suspend points, though, couldn't they? I can't think of any case where this would force the compiler to pessimize all code because of the mere possibility of async being involved. Therefore, the question is whether those optimizations being performed on async fns specifically is more valuable than `derive(Debug)` is.

(By the way, I still want async fns to be able to `derive(Clone)`, although that would obviously require pretty severe restrictions on the body of the function.)
I've not weighed in on this but I wanted to leave a few thoughts. Let me know if I'm confused about something.
First off, the conflict here is between the reference to a generator type G and some `&mut` reference T contained within the generator. What we don't have is two `&mut A` and `&mut B` references that overlap where neither A nor B gives any indication of arising from a generator.

This seems important: it means that, in principle, we ought to be able to understand that the `&mut G` reference is somehow special. It is certainly true that the self-referential nature of generators will require extending "stacked borrows" to be more flexible. But I think that generators are something we want to have, and it doesn't seem surprising that our aliasing rules may need to be extended to accommodate and understand them. (If, though, we had a conflict between two random types `&mut A` and `&mut B`, that would seem more worrisome -- though it is likely still true that we want generators regardless.)
Also, of course, this problem isn't really specific to generators but is rather a pattern that emerges with any self-referential setup. Presently, safe Rust has no way to express self-referential structs (apart from generators), but I think it's something we would eventually like to support.
So let's posit that we find a version of stacked borrows that understands self-referential structs. Clearly, this doesn't mean we can add `noalias` to `&mut` references everywhere; we'll either have to extend LLVM or be selective in some other way (as e.g. @comex suggests here).

Does this all sound correct?
I think it's very important for generators to be able to print out their locals in a `Debug` implementation. With all of the effort that's gone into making async functions and futures zero-cost, it would be a shame if they became intrinsically less debuggable than a hand-written state machine.
Those hypothetical optimizations could be disabled in async fns across suspend points, though, couldn't they?
Possibly. I am not sure what the memory would look like though. Specifically...
First off, the conflict here is between the reference to a generator type G and some &mut reference T contained within the generator. What we don't have is two &mut A and &mut B references that overlap where neither A nor B gives any indication of arising from a generator.
... it is not enough, in general, if only "one side" of a conflict is informed. `&mut T` expects everyone else to respect its rules. If we allow `&mut T` and `*mut T` to conflict arbitrarily (because `*mut T` "knows" to expect conflicts), we get basically no optimizations.
Also, of course, this problem isn't really specific to generators but is rather a pattern that emerges with any self-referential setup. Presently, safe Rust has no way to express self-referential structs (apart from generators), but I think it's something we would eventually like to support.
Agreed. I do not have a concrete idea for how to do that, though.
I think it's very important for generators to be able to print out their locals in a Debug implementation. With all of the effort that's gone into making async functions and futures zero-cost, it would be a shame if they became intrinsically less debuggable than a hand-written state machine.
The question is, zero-cost when compared with what? Disabling optimizations to enable debug printing does have a cost: `async fn`s then get optimized less than other functions.
And even if we rule out optimizations across yield points, we'd still require a non-trivial extension to Stacked Borrows to permit debug printing. That extension will likely have cost elsewhere, in terms of code getting optimized less.
In my view, the only solution one could really call zero-cost is the one that disallows debug printing.
Added back the "unsound" label; to my knowledge, the LLVM IR we are currently generating for some safe self-referential futures is UB and thus we got a soundness bug here.
@RalfJung
Added back the "unsound" label; to my knowledge, the LLVM IR we are currently generating for some safe self-referential futures is UB and thus we got a soundness bug here.
Do you mean "with -Zno-alias
"?
... it is not enough, in general, if only "one side" of a conflict is informed.
Indeed. The other thing we have going for us here, I think, is that we can be somewhat selective about what sorts of operations are possible on the generator reference. It's not exactly the same as having a `&mut` reference A to a struct and a (simultaneous) `&mut` reference B to its fields, where you could use A to access the data from B directly. With a generator, you don't get to reach in and inspect or access its state arbitrarily. You just get to pass it around until you "open it", right? (At which point you aren't passing around the generator anymore.)
In other words, at the end of the day, there is a valid-seeming usage pattern that we want to capture, right? In particular:
This also seems to be true for `Debug` impls -- they correspond to opening the generator, inspecting its state, and closing it.
This pattern also seems analogous to a typical `&mut` reborrow, except that the reborrow can be suspended and reactivated. (Looked at in this way, it's not surprising that it would indeed require an extension of the underlying model to express.)
Is this correct or is the pattern we are trying to describe more complex than that?
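A rough sketch (not code from the thread) of the usage pattern being described: the outside only ever moves or reborrows the pinned handle and "opens" the generator through `poll`/`resume`; it never reaches into the state itself.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// The caller passes the pinned handle around and only "opens" it via `poll`;
// it never inspects the state machine's fields directly.
fn drive<F: Future>(mut fut: Pin<&mut F>, cx: &mut Context<'_>) -> Poll<F::Output> {
    fut.as_mut().poll(cx)
}
```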
@RalfJung If I recall, one of the more interesting cases was how you can simultaneously have a `&` reference to a ref-cell and an `&mut` reference to its interior:
```rust
let p = &RefCell::new(22);
let q = p.borrow_mut();
drop(p);
drop(q);
```
This required some sort of special treatment that was focused on `UnsafeCell`, because passing around `p` didn't invalidate `q` in the way that a reference to the interior would typically be invalidated.
It seems like the generator scenario we are describing here is quite similar, no?
@nikomatsakis
If I recall, one of the more interesting cases was how you can simultaneously have a & reference to a ref-cell and an &mut reference to its interior:
See https://github.com/rust-lang/rust/issues/63787.
(I'll come back to your other points when I find some more time.)
In my view, the only solution one could really call zero-cost is the one that disallows debug printing.
There are a couple of possible "solutions" in between, e.g., having all Generators implement `Debug` but having the impls print nothing, e.g., `"Generator(debug-printing is off)"`, unless `-C generator-debug-print` is enabled, which we could enable by default in debug builds, or similar.
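A rough sketch of what such a compiler-generated impl could amount to (purely illustrative: `MyGenerator` is made up, and the real gate would be the hypothetical `-C generator-debug-print` flag rather than the `cfg` used here as a stand-in):

```rust
use std::fmt;

struct MyGenerator {
    resumed_at: u32, // plus whichever locals are live at the suspension point
}

impl fmt::Debug for MyGenerator {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if cfg!(generator_debug_print) {
            f.debug_struct("MyGenerator")
                .field("resumed_at", &self.resumed_at)
                .finish()
        } else {
            f.write_str("Generator(debug-printing is off)")
        }
    }
}
```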
Do you mean "with -Zno-alias"?
`-Zmutable-noalias`, yes. It is my understanding that a soundness bug with that flag is still a soundness bug. We want to change the default of that once LLVM is fixed.
In other words, at the end of the day, there is a valid seeming usage pattern that we want to capture, right? In particular:
The pattern you are describing seems fine, yes. I think it actually already is fine with current Stacked Borrows if the "outer" thing is a raw pointer and never turned into a reference. In some cases this is currently not possible, but it will be when https://github.com/rust-lang/rfcs/pull/2582 lands -- except that the `Pin<&mut T>` unsafe API also uses mutable references, and those might pose problems.
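A small illustrative sketch (not from the thread) of the raw-"outer"-pointer pattern being described, under the assumption that the outer handle is never reborrowed as `&mut`:

```rust
struct Frame { slot: i32 }

fn demo() {
    let mut frame = Frame { slot: 0 };
    let outer: *mut Frame = &mut frame;                   // "outer" handle stays raw
    let inner: &mut i32 = unsafe { &mut (*outer).slot };  // interior mutable reference
    *inner += 1;
    // Copying/passing `outer` around asserts no uniqueness, so it does not
    // invalidate `inner` -- unlike turning `outer` back into a `&mut Frame`.
    let _still_around = outer;
}
```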
This also seems to be true for Debug impls -- they correspond to opening the generator, inspecting its state, and closing it.
They don't, though. At least, not operationally. They use the wrong pointer to access the field pointed to by the reference -- namely, they use the "outer" pointer.
This pattern also seems analogous to a typical &mut reborrow, except that the reborrow can be suspended and reactivated. (Looked at in this way, it's not surprising that indeed it would require an extension of the underlying model to express.)
Why suspend and reactivate? When the generator is closed you said yourself that the pointer may not be used?
I don't think a suspend-and-reactivate method is compatible with `noalias` or with the desired optimizations, unless suspending is an explicit action on the to-be-suspended pointer. I don't know what that action would look like.
So I guess after all I do not agree with your model and probably did not understand it. Until that last paragraph I thought we were in agreement but we actually do not seem to be talking about the same thing.
One point, which may or may not be obvious, is that printing out variables which are mutably borrowed across the suspend point is unsound, even without optimizations.

For example, if the borrowed variable is of type

```rust
struct Annoying(RefCell<Box<i32>>);
```

...then the function could use `RefCell::get_mut` to get a mutable reference to the interior of the `Box` and save it across the suspend point, while the `Debug` impl for `Annoying` could use `RefCell::borrow_mut` to replace it with a new `Box`.
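To spell the hazard out, a hedged sketch (constructed for illustration, not from the thread); `inspect` stands in for what a field-dumping `Debug` impl on the suspended state would effectively do behind the borrow checker's back:

```rust
use std::cell::RefCell;

struct Annoying(RefCell<Box<i32>>);

// Inside the async body: a reference into the Box is obtained via
// `RefCell::get_mut` and held across the suspension point.
async fn body(a: &mut Annoying) -> i32 {
    let inner: &mut i32 = &mut **a.0.get_mut();
    std::future::ready(()).await; // suspend while `inner` is live
    *inner                        // dangles if the Box was replaced meanwhile
}

// What dumping the suspended state could effectively do: replace the Box,
// freeing the allocation that `inner` still points into.
fn inspect(a: &Annoying) {
    *a.0.borrow_mut() = Box::new(0);
}
```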
Addendum on the `derive(Debug)` story: when I said "we can't have it", I meant "we can't debug-print fields that are currently borrowed". All the other fields could still be printed.
We discussed this issue in the @rust-lang/lang meeting yesterday (recording etc. available here, though for this particular issue you'll want to jump to about the 50-minute mark or so). We came to a rough consensus on a solution for how to integrate the current generator design while retaining `noalias` on `&mut`, which I will describe below. Of course, all of this is subject to some caveats, in that stacked borrows and all work on aliasing models is still a work in progress.
In short, the idea is that if you have a `&mut Generator` reference, it would be treated similarly to a `*mut Generator` reference for the purposes of aliasing rules (in stacked borrows terms, `SharedRw`; in LLVM terms, it would not be marked `noalias`). In this way, the fact that the `&mut Generator` (embedded in a `Pin`) co-exists with the `&mut` to some of its fields doesn't cause an aliasing conflict, any more than a `*mut` would.
In other words, the rules for an `&mut T` reference are not the same for all `T`, but depend on whether that `T` is considered "self-referential" -- for now, the only case where this occurs is a generator, but we could extend this in the future. This would allow us to accommodate patterns like rental, which presently may be unsound (we did not discuss rental, owning-ref, or any other crate in any detail, to be clear). If we extended safe Rust with some other form of self-referential struct (which I personally would like to do), then it would presumably leverage the same mechanism.
This is somewhat analogous to the way that a `&UnsafeCell<T>` reference can co-exist with a `&mut T` reference to its interior. That also requires some special treatment in stacked borrows.
We also discussed `Debug` impls. The short version is that Debug impls are not a problem as long as they respect the rules of the borrow checker. That is, if you try to dump the contents of a generator, it should not include the values of variables that were borrowed. This seems fine (and, to me, self-evident, but perhaps not to all). It would of course require some compiler smarts to implement, but nothing particularly beyond the pale.
Finally, we discussed whether this "type-based rule" was more complex than a (hypothetical) `&pin T` would have been. Ralf's response was basically "not really". Or, in more detail, "sort of yes, but mostly because we are trying to be more optimized". In particular, the type-based resolution permits types where only part of the type is self-referential (e.g., a tuple like `(Generator, u32)`). If you had a `&pin (Generator, u32)`, you would apply conservative treatment to the whole structure, whereas with a type-based rule we can be more precise. This also arises around `Cell`, as I noted above; in general we are trying to be more precise, but it's a bit of a pain (and e.g. Ralf is already choosing to approximate around enums so as to keep the specification tractable).
OK, that's the best of my recollection. Feel free @RalfJung or @Centril to correct me -- and of course the recording is available if you'd like the full details.
That is, if you try to dump the contents of a generator, it should not include the values of variables that were borrowed.
Immutably borrowed is fine right?
If the variable is only immutably borrowed, I think it's fine, but this relies on there not being an "intermediate" mutable borrow from which the immutable one was reborrowed.
Immutably borrowed is fine right?
I think the TL;DR would be "only variables you could have accessed in your own code at the point where the await occurred" -- there are maybe some interesting caveats here around unsafe code and raw pointers. i.e., there might be accesses you could've done in safe code that you shouldn't do in unsafe code.
This approach sounds like it would lose a lot of aliasing information that would exist in a synchronous version of the generator, despite the "outside world" (the code with the `Pin<&mut G>`) having no actual access to the state machine's internals.

Would it be possible to recover that aliasing information "inside" the generator, so that e.g. the typical "local variables that I haven't created references to are not aliased" reasoning still works, despite those locals being accessed via the `Pin<&mut Self>`?
One way to model this might look something like: consider the `Pin<&mut Self>` not to exist during generator execution, and instead have `yield` create it and `resume` consume it, making it a sort of collective "reborrow" of any self-references. This reverses the borrow stack, letting generator locals work just like sync function locals.
Extending this to the rental/hand-rolled case might involve treating structs more like stack frames, giving individual fields their own lifetimes/borrow stacks (IIUC, somewhat like existing "borrow splitting"-related behavior). Reminiscent of rental and the above, you would only be able to touch the struct contents inside a closure where the reference to the struct itself is inaccessible.
This approach sounds like it would lose a lot of aliasing information that would exist in a synchronous version of the generator, despite the "outside world" (the code with the Pin<&mut G>) having no actual access to the state machine's internals.
By "synchronous version of the generator", do you mean the non-generator version ("remove async
")?
If so -- the inside of the `async fn` still has all the same aliasing information as the inside of a `fn` would. This works as long as the "outside" does not do anything to access the memory that the "inside" assumes it has for itself. That's exactly what Niko's approach achieves: the `Generator` type would be like a marker to the compiler saying "aliasing hazard ahead, do not touch the inside of this or you are breaking other code's aliasing assumptions".
One way to model this might look something like: consider the Pin<&mut Self> not to exist during generator execution, and instead have yield create it and resume consume it, making it a sort of collective "reborrow" of any self-references.
Interesting proposal. However, I have not understood the problem statement yet.
the Generator type would be like a marker to the compiler saying "aliasing hazard ahead, do not touch the inside of this or you are breaking other code's aliasing assumptions".
Ah, I misread. (Thrown off by the comparison to `UnsafeCell`?) This sounds like it would accomplish the same thing.
I was assuming this would inhibit optimizations inside the generator, as it would have to go through the `Pin<&mut Self>`.
I was assuming this would inhibit optimizations inside the generator as it would have to go through the Pin<&mut Self>.
Hm, good question. I don't know what the generated code looks like. But I don't think the `noalias` story has much to do with this, actually.
Usually optimizations of local variables are not based on aliasing but based on their addresses not having been leaked -- so that they are private to us and nobody else can make any observation about them. Every local variable is a separate "allocated object", so even if you leak one there is nothing leaked about the others.
With generators, the runtime driving the generator does know the base address of the "stack frame", and all local variables are stored in the same allocation. In principle, code could try to detect the layout of that stack-frame (the distances between the variables) at run-time, and that would not be UB as it would be pointer arithmetic inside a single allocation.
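A sketch (illustrative, not from the thread) of what "detecting the layout at run-time" could mean: given pointers to two locals that both live in one generator frame, their relative offset can be computed without UB, because the arithmetic stays inside a single allocated object -- something that is impossible for ordinary locals, which are separate allocations.

```rust
// Assumes `a` and `b` both point into the same generator-frame allocation.
unsafe fn probe_layout(a: *const u8, b: *const u8) -> isize {
    unsafe { b.offset_from(a) } // distance between two "locals" in the frame
}
```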
To make this precise (and given the fact that we [plan to] run optimizations on `async fn`s before lowering them), we should really have a proper "abstract machine" for `async fn` that is not in terms of lowering to a state machine. Not sure what that would look like, though, and how it would relate to the rest of the language spec.
Check-in from lang team: I'm removing the I-nominated label. It seems like we've agreed we're going forward, and we have the "general shape" of the solution we expect, though there are details to hammer out.
We should probably implement this (if we're worried about LLVM, that is) as `Pin` stripping `noalias`, in a similar fashion to `Box` adding it (since `struct Box<T>` just contains a raw pointer): https://github.com/rust-lang/rust/blob/702b45e409495a41afcccbe87a251a692b0cefab/src/librustc/ty/layout.rs#L2309-L2316
@eddyb this is actually not what @nikomatsakis and I concluded in our conversation: crates like owning-ref are also affected, and they do not even use pinning. So this is likely more like `UnsafeCell` than like `Box`.
For reference, I ran into this in a context where I had a `Pin<&mut Foo>` and wanted to have concurrent `&` references into an `UnsafeCell` in `Foo`. But this would imply having both a `&mut` and a `&` at the same time.
From a conceptual point of view, I would love to have a type analogous to `UnsafeCell` which "strips"/hole-punches the outer `&mut` borrow, which seems to be roughly what @nikomatsakis suggested.
I personally struggle to see how an `AliasedCell<T>` (let's call it that for the time being) would be able to solve that.

IIUC, the idea behind that `AliasedCell<T>` would be to be able to write the following code without violating Stacked Borrows:

```rust
struct Struct {
    unaliased_1: ...,
    aliased_field: AliasedCell<Field>,
    unaliased_2: ...,
    ...
}

let mut s = Struct { ... };
let at_field: *const Field = s.aliased_field.as_ptr();
stuff(&mut s);
other_stuff(&*at_field); // <- Would violate Stacked Borrows if it weren't for `AliasedCell`
```

Is that right?
Now, the way I see it, such a wrapper would not be _easy to implement_; that is, _basic_ special language support (like we have for `UnsafeCell`) would not suffice. We are back to the problems we had with `Pin`: `stuff()` might overwrite the whole `Struct`, and for that, it currently needs the `&mut` transitivity.

In order for `AliasedCell` to work in that design, all the writes to `&mut`-based places would need special care, in order to transform, for instance, a `mem::swap(&mut s1, &mut s2)` into something like:

```rust
mem::swap(&mut s1, &mut s2);
// becomes, by language magic:
mem::swap(&mut s1.unaliased_1, &mut s2.unaliased_1);
Cell::swap(s1.aliased_field.as_cell_ref(), s2.aliased_field.as_cell_ref());
mem::swap(&mut s1.unaliased_2, &mut s2.unaliased_2);
// ...
```

which I find quite complex.
An alternative would be `Cell::swap(Cell::from_mut(&mut s1), Cell::from_mut(&mut s2))`, but then it would mean that a single `AliasedCell` field in a struct could hinder the ops on the rest of the struct.

Personally, I think that a `&mut`-interacting wrapper ought to _wrap a pointer_, rather than the pointee. That is, something similar to `Pin`. Let's call it `MaybeAliased`: `MaybeAliased<&mut _>` and `MaybeAliased<Box<...>>` _etc._ ought to strip the `noalias` properties of their inner pointers.

It is already necessary to implement such a mechanism to begin with, be it through `MaybeAliased` or `AliasedCell`, and I claim that it is sufficient: crates such as `owning_ref` would "just" need to wrap their `Box`es _& co._ with this wrapper, and `Pin` could use `MaybeAliased` under the hood.
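For concreteness, a purely hypothetical sketch of the shape such a wrapper could take; nothing here exists in std, and the actual `noalias`-stripping would have to come from compiler support, not from the type definition itself:

```rust
// A pointer wrapper, analogous in spirit to `Pin<P>`: it wraps the pointer,
// not the pointee, and the compiler would refrain from emitting `noalias`
// for the wrapped pointer.
struct MaybeAliased<P> {
    ptr: P,
}

impl<'a, T> MaybeAliased<&'a mut T> {
    fn new(ptr: &'a mut T) -> Self {
        MaybeAliased { ptr }
    }

    // Hand out access as a raw pointer so that no fresh `&mut T` uniqueness
    // claim is asserted while other aliases may exist.
    fn as_ptr(&mut self) -> *mut T {
        &mut *self.ptr as *mut T
    }
}
```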
stuff() might overwrite the whole Struct, and for that, it currently needs the &mut transitivity.
An `UnsafeAliasedCell` would only give hints to the compiler about what it can do with it (like `UnsafeCell`) but not provide any safe abstraction around it. In the end, just as you need something like an atomic to safely abstract over `UnsafeCell`, you would need something like `Pin` to safely abstract over `UnsafeAliasedCell`.
I.e. the `UnsafeAliasedCell` would only work as long as you guarantee not to repurpose the memory it is in, for which you normally (but not always) would use `Pin`.
for instance, a mem::swap(&mut s1, &mut s2), into something like:
I don't think an `UnsafeAliasedCell` should affect `mem::swap`. As mentioned in the previous paragraph, I think it should only give out guarantees in situations where the memory is guaranteed not to be re-purposed (e.g. no `mem::swap`). Everything else would make this super complicated, I think.
such a wrapper would not be easy to implement,
I agree fully with it. I'm especially not sure if this might need patches to LLVM. (But I don't know enough of the implementation details to judge this, just guessing that it might maybe be a problem).
`MaybeAliased<&mut _>`

I don't think this would work (well), as:

- It interacts badly with `Pin` and generators/async futures. `Pin<&mut dyn Future>` would now need to become `Pin<MaybeAliased<&mut dyn Future>>` if the future might contain inner self-references. Even if we find a way to fix that for futures, it would still be very prone to leaking internal implementation details into the trait usage. For `Pin` this makes sense, as it constrains the usage of an instance, but for a type already requiring pin, self-referentialness should be an implementation detail, I think.
- It's very error-prone/hard to use. E.g. for a `Pin<MaybeAliased<&mut S>>`, `S` can easily have multiple fields of which some are self-referential but others are not and might even be excluded from the `Pin` contract. So you need a way to go from a `MaybeAliased<&mut S>` to a `&mut S.Field` for any field of `S` which is not self-referential, without ever constructing a `&mut S` at any point. As far as I know, this requires pointer offsets based on field offsets in structs, which at least currently is not the most ergonomic thing to do. (Currently, as far as I know, this requires `repr(C)` structs and the usage of `Layout.extend()` to re-create the layout of the struct, which is error-prone if not combined with a proc-macro to auto-generate the `Layout.extend()` usage.)
Merging it with `Pin` would work, but would not necessarily be the best idea, as not all things needing `Pin` are internally self-referential. Furthermore, many currently safe usages of `get_mut_unchecked` would become unsafe due to the creation of an `&mut` (e.g. when writing a future combinator, we would need to assume that the wrapped future could be self-referential internally, and as such we would no longer be able to use `get_mut_unchecked`, even though it's currently valid to do so).
I hope that I overlooked something, but if not, I fear the only way to fix the soundness of generators without making breaking changes is to either have something like an `UnsafeAliasingCell` (which only gives any guarantees under `Pin`-like circumstances) or a marker trait which makes generators/async futures be treated as if they were wrapped in an imaginary `UnsafeAliasingCell` type.
Maybe I found another alternative, but I don't have the necessary ASM & LLVM knowledge to know if that could even work: go from `&/&mut S` to `&/&mut S.Field` and then pass the result to something like `fn forget_aliasing(&S) -> &S` (and the same for `&mut`, etc.), which somehow forces the compiler to forget about aliasing hints and similar. In a certain way this would be similar to the infamous C++ `std::launder`.

Thanks for the clarifications, @rustonaut. _W.r.t._ most of your remarks, I had indeed the idea to merge that pointer wrapper with `Pin`, since it would otherwise be annoying to manage. But `.get_mut_unchecked()` is a very good (counter-)point! You have convinced me the wrapper needs to be without indirection.