I had heard tell of Freeze
but didn't really know what it was until today. swym
, a hybrid transactional memory library, has an accidental reimplementation of Freeze
using optin_builtin_traits
. Unfortunately optin_builtin_traits
is the only feature keeping swym
on nightly.
The ticket that removed Freeze
doesn't have much of an explanation for why it was removed so I'm assuming it was a lack of motivating use cases.
swym::tcell::TCell::borrow
returns snapshots of data - shallow memcpy
s - that are guaranteed to not be torn, and be valid for the duration of the transaction. Those snapshots hold on to the lifetime of the TCell
in order to act like a true reference, without blocking updates to the TCell
from other threads. Other threads promise to not mutate the value that had its snapshot taken until the transaction has finished, but are permitted to move the value in memory.
This works great for a lot of types, but fails miserably when UnsafeCell
s are directly stored in the type.
let x = TCell::new(Mutex::new("hello there".to_owned()));
// .. inside a transaction
let shallow_copy = x.borrow(tx, Default::default())?;
// locking a shallow copy of a lock... is not really a lock at all!
*shallow_copy.lock().unwrap() = "uh oh".to_owned();
Even if Mutex
internally had a pointer to the "actual" mutex data structure, the above example would still be broken because the String
is deallocated through the shallow copy. The String
contained in the TCell
would point to freed memory.
Note that having TCell::borrow
require Sync
would still allow the above broken example to compile.
If swym::tcell::TCell::borrow
could require Freeze
then this would not be an issue as the Mutex
type is definitely not Freeze
.
pub(crate) unsafe auto trait Freeze {}
impl<T: ?Sized> !Freeze for UnsafeCell<T> {}
unsafe impl<T: ?Sized> Freeze for PhantomData<T> {}
unsafe impl<T: ?Sized> Freeze for *const T {}
unsafe impl<T: ?Sized> Freeze for *mut T {}
unsafe impl<T: ?Sized> Freeze for &T {}
unsafe impl<T: ?Sized> Freeze for &mut T {}
Shallow immutability is all that is required for TCell::borrow
to work. Sync
is only necessary to make TCell
Sync
.
TCell<String>
- should be permitted.TCell<Mutex<String>>
- should be forbidden.TCell<Box<Mutex<String>>>
- should be permitted.MyFreeze
was correct when it was written. Everytime the author of MyType
updates their dependency on other_crate
they must recheck that OtherType
still has no direct interior mutability or risk unsoundness.struct MyType { value: other_crate::OtherType }
unsafe impl MyFreeze for MyType {}
Add a T: Copy
bound on TCell::<T>::borrow
. This would definitely work but leaves a lot of types out.
Wait for OIBITs to stabilize (assuming it will be stabilized).
Have TCell
store a Box<T>
internally, and only work with heap allocated data where interior mutability is of no concern. This would be pretty effective, and if the type is small enough and Copy
, the Box
could be elided. While not as good as stabilizing Freeze
, I think this is the best alternative.
cc @rust-lang/lang and possibly @rust-lang/libs
I'm sorta interested in this direction. However, Freeze
is also a lang item used by other parts of the language so we need to tread carefully here and possibly consider exposing a version of Freeze
that isn't a lang item.
I would also be interested in seeing more use cases to justify the exposure of Freeze
.
Note that this would make it an api-breaking change to add interior mutability to a type.
Adding interior mutability into a type is already an API breaking change.
Even adding trait objects into a type is an API breaking change.
struct S {
i: usize,
//j: std::cell::Cell<usize>,
//k: Box<dyn std::error::Error>,
}
fn f(s: &S) {
let _ = std::panic::catch_unwind(|| {
let _s = s;
});
}
fn main() {}
I would also be interested in seeing more use cases to justify the exposure of Freeze.
This is essentially the same use case, but crossbeam-epoch
could add a non-heap Atomic
style type to the API. e.g. A Seq<T>
type which internally protects the contained value with a seqlock. On write, Seq
defer
s the drop of the overwritten value. On read, Seq
returns a shallow copy which borrows the lifetime of both the Guard
and the Seq
. Seq
would not have to require Copy
, but could instead only require Freeze
.
Note that this would make it an api-breaking change to add interior mutability to a type.
Fair point! This is already the case for Cell
/RefCell
which are !Sync
. Freeze
would add Mutex
/RwLock
/Atomic*
/etc as API breakers.
Adding interior mutability into a type is already an API breaking change.
@dtolnay We might want to add a PhantomUnfrozen
to give users the ability to not guarantee Freeze
then? (Inspired by PhantomPinned
)
@Centril That's equivalent to PhantomData<Cell<()>>
.
Cc https://github.com/rust-lang/rust/issues/25053 -- using Copy
as a workaround here would obviously break if UnsafeCell
were made Copy
. (Which might provide additional motivation for exposing Freeze
, in order to be able to express the intended requirement directly rather than by fortuitous coincidence.)
Add a T: Copy bound on TCell::
::borrow. This would definitely work but leaves a lot of types out.
We'd very much appreciate if you wouldn't do that. This would paint us into a corner that is impossible to get out of, where the legitimate usecases for a Copy
type with interior mutability would be forever excluded from Rust.
@glaebhoerl Ahh, I didn't realize there was a discussion around that. It's not clear to me that UnsafeCell: Copy
would cause unsafety in swym
, but it would definitely break the reference semantics of borrow's return type.
That's equivalent to PhantomData
|
For that to work, the proposed version of Freeze
should not have the PhantomData
impl core currently has. core
would certainly wanna keep their definition of Freeze
as it's used for deciding whether to place static
s in read-only memory or writable memory and possibly for other optimizations.
Note that Cell<()>
is zero-sized so you can just use it without PhantomData
around it.
To opt out of Freeze
, using Mutex<()>
/ PhantomData<Mutex<()>>
instead avoids opting out of Sync
"by accident".
I think that having an explicitely named PhantomXXX
(_e.g._, PhantomUnfrozen
, but also PhantomUnsynced
, PhantomThreadBound
), even when it is just a type
alias or a new type wrapper around PhantomData
, helps improve the readability of the code: currently code with occurrences of PhantomData
very often have a comment next to it to explain what the purpose of the PhantomData
is. See this thread regarding the usage of PhantomData for (in)variance
A separate PhantomUnfrozen
type is required. AFAIK there is no Sync
ZST with interior mutability.
PhantomData<Mutex<()>>
/PhantomData<AtomicUsize>
would have the correct opt out behavior, but requires removing the blanket PhantomData
Freeze
impl. This is a bad idea due to widespread usage of PhantomData
for owned pointers. Types like Vec<Cell<u8>>
would not be Freeze
- even though they should be.
You can make your own on stable Rust with a wrapper around UnsafeCell<()>
that you then unsafe impl Sync
on.
Note that we reserve the right to consider everything outside a literal UnsafeCell
to be frozen so this shouldn't be used for unsafe tricks, only "unpromising" Freeze
on a type.
Okay, seeing this issue has convinced me that we really need to push for either opt-in Copy, or Copy for UnsafeCell, or the CopyOwn thing. Because we cannot just wait and expect people not tor rely on people abusing the number of things Copy excludes for other purposes than declaring things to be (implicitly) byte-copyable. People already abuse it for making sure a type has a no-op Drop impl, which is its own headache that is further exacerbated by the fact that UnsafeCell can't implement Copy.
I would also be interested in seeing more use cases to justify the exposure of
Freeze
.
Mem-mapping is a use-case I have for Freeze, specifically to prevent accidentally implementing some traits on types with interior mutability. Though given how much unsafe code is involved in that use-case anyway it'd be just a "belt-and-suspenders" measure.
It's also a use-case where exposing the actual compiler Freeze
type would be annoying: in my usecase a small subset of types have both interior mutability and implement mem-mapping, with care taken to only actually mutate "dirty" values that are heap allocated. So those types would need Freeze
impl's, which is probably inappropriate if they actually affect code generation.
@petertodd
It's also a use-case where exposing the actual compiler Freeze type would be annoying: in my usecase a small subset of types have both interior mutability and implement mem-mapping, with care taken to only actually mutate "dirty" values that are heap allocated. So those types would need Freeze impl's, which is probably inappropriate if they actually affect code generation.
I don't entirely follow. Note that Freeze
only talks about the value itself, not other memory it points to. So, for example, Box<RefCell<T>>
is Freeze
.
On January 1, 2020 12:42:22 PM EST, Ralf Jung notifications@github.com wrote:
@petertodd
It's also a use-case where exposing the actual compiler Freeze type
would be annoying: in my usecase a small subset of types have both
interior mutability and implement mem-mapping, with care taken to only
actually mutate "dirty" values that are heap allocated. So those types
would need Freeze impl's, which is probably inappropriate if they
actually affect code generation.I don't entirely follow. Note that
Freeze
only talks about the value
itself, not other memory it points to. So, for example,
Box<RefCell<T>>
isFreeze
.
I'm talking about values in memmapped files.
To summarize, I have a trait, ValidateBlob, that validates that a blob of bytes are structurally valid for the type implementing it ([repr(C)], unaligned types, etc). It's designed to work with read-only mappings鹿, so interior mutability will simply segfault虏. Freeze in that context would be a useful lint. But ValidateBlob is of course unsafe, so it'd just be a lint.
1) I'm both willing to live with corruption caused by unintentional modifications to the backing file, and in this use case since the backend is meant to be append only, you can use the append-only bit to make that problem much less of an issue. It's basically a database after all. You can also play games with memmapping modes to get similar results, with different trade-offs.
2) Intentional mutation is via copy-on-write, so all intentional mutation will be in heap/stack allocated memory.
I've come to realize that there could be a need for a trait similar to Freeze
, but with different semantics. I may do an RFC out of it, but I think this is a good place to start discussing it, since it may relate to what some people think Freeze
is for.
Cell::read_with
impl<T : ?Sized> Cell<T> {
fn read_with<R, F> (self: &'_ Self, f: for<'any> fn(&'any T) -> R) -> R
FnOnce
closure by an env-less closure to dismiss the case where the closure itself captures a &'_ Self
(discussing the right bounds on the closure is not the point of this post). This seems like a desirable API, if it was possible to offer it in a sound manner.
It would allow, for instance, to read the discriminant of a Cell<Option<String>>
.
Currently, the only way to read it is to temporarily replace
with another value:
fn is_some (cell: &'_ Cell<Option<String>>) -> bool
{
let prev = cell.replace(None);
let ret = prev.is_some();
cell.set(prev);
ret
}
instead of:
fn is_some(cell: &'_ Cell<Option<String>>) -> bool
{
cell.read_with(Option::is_some)
}
Now, using exactly the suggested function API is unsound: Playground, to be run with MIRI.
Basically, here is the offending code:
struct Rec /* = */ (
pub
Rc<Cell<Option<Rec>>>,
);
fn evil ()
{
let this = Rec::new(); // Creates a cycle (thanks to shared mutability)
this.0.read_with(|inner: &Option<Rec>| {
// given the type of inner, it is illegal to mutate
// the pointee (the Option), for instance, its
// discriminant is known to be immutable.
if let &Some(ref this) = inner {
this.0.set(None);
// Woops, we mutated an immutable discriminant: UB
}
})
}
So the issue with Cell::with
is visible: by having the "inner" type have interior mutability and it recursively pointing to the "outer" reference, we get to have a shared reference &X
witness mutation of its pointee X
, which is UB.
So, one (conservative) way of avoiding such issue with compile-time checks is to prevent the first aforementioned necessary condition: let's have a way in the type level to express that a type cannot possibly have interior / shared / aliased mutability, _i.e._, that &T
asserts that the pointee T
is _transitively_ immutable.
Now, this may seem like what the Freeze
trait is for, but this is not the case: Freeze
is more of a low-level / implementation-detail-based trait: it is just telling if the _direct_ / first layer of a type T
is immutable when shared. This is indeed a sufficient condition to have a static
occupy read-only memory, or, more generally, have a value of type T
be backed up by read-only memory (_e.g._, mmap
ed memory) if we ensure only &T
accesses are possible, and T : Freeze
.
In the previous example, for instance, Option<Rec>
is Freeze
! Indeed, auto-traits lead to it being Freeze
when Rec
is, and the first / direct layer of Rec
is that of a pointer, that in and on itself has no interior mutability. It would thus be sound to have a static _: Rec
be laid out in read-only memory, _i.e._, Rec : Freeze
.
More concretely, Freeze
has the following impl
s (note the lack of : Freeze
bound on the generic <T>
ype parameter): https://github.com/rust-lang/rust/blob/766fba3fdca8fe075c370a63d2f76f8407483af5/src/libcore/marker.rs#L683-L687
That is, indirection is automatically Freeze
, thus only a direct UnsafeCell
is able to break Freeze
-ness (and ZSTs, or at least PhantomData
, are always Freeze
too, it seems. ([T; 0]
is crying in a corner btw))
Hence the need for _another_ auto trait, if we are to have Cell::with
. I will from now on call it with a long name describing its semantics: TransitivelyImmutableWhenAliased
pub
unsafe
auto trait TransitivelyImmutableWhenAliased {}
impl<T : ?Sized> !TransitivelyImmutableWhenAliased for UnsafeCell<T> {}
// Same as with `Send` and `Sync`, we conservatively drop the trait when raw pointers are involved,
// in case the pointers are type-erase
// (_e.g._, a `*const ()` representing a `*const UnsafeCell<Something>`)
impl<T : ?Sized> !TransitivelyImmutableWhenAliased for *const T {}
impl<T : ?Sized> !TransitivelyImmutableWhenAliased for *mut T {}
unsafe impl<T : ?Sized> TransitivelyImmutableWhenAliased for Box<T> where
T : TransitivelyImmutableWhenAliased,
{}
impl<T : ?Sized> Cell<T> {
fn with<R, F> (self: &'_ Self, f: for<'any> fn(&'any T) -> R) -> R
where
T : TransitivelyImmutableWhenAliased,
{A}Rc
?I don't really know what to do with {A,}Rc<impl TransitivelyImmutableWhenAliased + ?Sized>
. @RalfJung once wrote something about this topic in another comment I cannot find. It was something regarding the "public-ness" of Cell
: while it is true that RcBox
and ArcInner
have counters with aliased mutability, these counters are just a private implementation detail and have nothing to do with the type they wrap. Hence {A,}Rc<impl TransitivelyImmutableWhenAliased + ?Sized>
would be a sound type to use with Cell::with
. However, it would be a lie to say that {A,}Rc<impl TransitivelyImmutableWhenAliased + ?Sized>
is transitively immutable (when aliased), since these counters are not.
If this TransitivelyImmutableWhenAliased
trait is to be used only with Cell::read_with
, then we could find a different more apt name and have the aforementioned {A}Rc
implement the trait. But maybe there are other use-cases where true immutability is needed for soundness, and having these mutable counters would break it (we could, for instance, imagine something along the lines of someone deriving some kind of Sync
-ness from TransitivelyImmutableWhenAliased
-ness, which would obviously be unsound with Rc
Isn't Cell::with
generally unsound because it allows you to violate aliasing rules?
let this = Cell::new(1u8);
this.with(|inner: &u8| {
// This changes the value of inner despite it being behind an immutable reference.
// This is a violation of the aliasing rules.
this.set(2);
});
This is why the key rule in Cell
is that it never exposes a reference to the inner value.
You're completely right, I don't know how I can have forgotten to think about the environment captured by a closure 馃槄.
Not all hope is lost, though, because it is still possible to add trait bounds restriction on the closure itself. We can discuss this once I suggest the pre-RFC to the internals (I don't want to carry this issue too off-topic with my suggestion, I just wanted to submit the idea that there are "different kind of Freeze
-like traits", and this looked like a good place to talk about it).
I'll edit the above post and replace impl FnOnce...
by a fn..
to (temporarily) "dismiss" the issue of the env captured by the closure (but I think that similarly to the Sync
<-> Send
interaction, a marker trait lost on &impl !TransitivelyImmutableWhenAliased
and required on the with
closure would solve this).
Not all hope is lost, though, because it is still possible to add trait bounds restriction on the closure itself.
Closure captures aren't the only way the closure might get access to another reference to the Cell, it might be reachable from a global. Modifying the last counter-example (omitting the verbose initialization):
static THIS: RwLock<Cell<i32>>;
THIS.read().unwrap().with(|inner| {
THIS.read().unwrap().set(2);
})
Oh I somehow imagined that Sync
would prevent such example, my bad. Damn I felt like this idea could have worked 馃槥
This means I got no use-case for TransitivelyImmutableWhenAliased
(I should have given this more thought).
Oh I somehow imagined that Sync would prevent such example
Actually, I wasn't that far off: not only does ::std::
's RwLock
require Sync
on the wrappee in order to be Sync
itself, but, more generally (_i.e._, taking into accounts things like ::lock_api::ReentrantMutex
), the primitive Cell::read_with
could require that the Cell
definition in and on itself not be Send
(on top of it not being Sync
) so that it cannot be held in a global variable.
a !Send + !Sync
cannot be held in a global variable, not even when protected by a lock, otherwise code like the following would be legal:
lazy_static! {
static ref THIS: RwLock<Cell<
&Cell<i32> // something not Send
>> = RwLock::new(Cell::new(
Box::leak(Box::new(Cell::new(0)))
));
}
fn main ()
{
let first_ref: &Cell<i32> = THIS.read().unwrap().get();
::std::thread::spawn(|| {
let second_ref: &Cell<i32> = THIS.read().unwrap().get();
loop { second_ref.set(0); }
});
loop { first_ref.set(!0); }
}
So, even if my suggestion is no longer applicable to ::core::cell::Cell
, it would be interesting to have another definition (an external crate I guess) that would impl !Send
for that definition of "Cell
" (let's call this definition NonSendCell
).
!Send
to avoid being accessible from within a global does seem hackish, though.In that scenario, the (auto) trait TransitivelyImmutableWhenAliased
would be useful, so my previous post and suggestion remains (partially) valid.
Making it !Send
doesn't help, just replace the previous counter-example by a thread-local variable.
There is no way to avoid communication of a caller with the closure, I'm afraid. You'd need something like an effect system.
There is no way to avoid communication of a caller with the closure, I'm afraid. You'd need something like an effect system.
I guess it is now too late to have an auto-trait for global-safety that some types could opt out of, is it not?
I wouldn't even know what that trait would mean (what is the proof obligation that goes along with it?). But also at this point it's probably better to stop the digression in this thread. ;)
Just submitted an RFC.
@petertodd I am not sure I totally understand your use case, so I only mentioned it in passing.
Most helpful comment
We'd very much appreciate if you wouldn't do that. This would paint us into a corner that is impossible to get out of, where the legitimate usecases for a
Copy
type with interior mutability would be forever excluded from Rust.