Rust: Request: Make `std::marker::Freeze` pub again

Created on 10 May 2019  路  27Comments  路  Source: rust-lang/rust

I had heard tell of Freeze but didn't really know what it was until today. swym, a hybrid transactional memory library, has an accidental reimplementation of Freeze using optin_builtin_traits. Unfortunately optin_builtin_traits is the only feature keeping swym on nightly.

The ticket that removed Freeze doesn't have much of an explanation for why it was removed so I'm assuming it was a lack of motivating use cases.

Use Case

swym::tcell::TCell::borrow returns snapshots of data - shallow memcpys - that are guaranteed to not be torn, and be valid for the duration of the transaction. Those snapshots hold on to the lifetime of the TCell in order to act like a true reference, without blocking updates to the TCell from other threads. Other threads promise to not mutate the value that had its snapshot taken until the transaction has finished, but are permitted to move the value in memory.

This works great for a lot of types, but fails miserably when UnsafeCells are directly stored in the type.

let x = TCell::new(Mutex::new("hello there".to_owned()));

// ..  inside a transaction
let shallow_copy = x.borrow(tx, Default::default())?;
// locking a shallow copy of a lock... is not really a lock at all!
*shallow_copy.lock().unwrap() = "uh oh".to_owned();

Even if Mutex internally had a pointer to the "actual" mutex data structure, the above example would still be broken because the String is deallocated through the shallow copy. The String contained in the TCell would point to freed memory.

Note that having TCell::borrow require Sync would still allow the above broken example to compile.

Freeze

If swym::tcell::TCell::borrow could require Freeze then this would not be an issue as the Mutex type is definitely not Freeze.

pub(crate) unsafe auto trait Freeze {}

impl<T: ?Sized> !Freeze for UnsafeCell<T> {}
unsafe impl<T: ?Sized> Freeze for PhantomData<T> {}
unsafe impl<T: ?Sized> Freeze for *const T {}
unsafe impl<T: ?Sized> Freeze for *mut T {}
unsafe impl<T: ?Sized> Freeze for &T {}
unsafe impl<T: ?Sized> Freeze for &mut T {}

Shallow immutability is all that is required for TCell::borrow to work. Sync is only necessary to make TCell Sync.

  • TCell<String> - should be permitted.
  • TCell<Mutex<String>> - should be forbidden.
  • TCell<Box<Mutex<String>>> - should be permitted.

Alternatives

  • A manually implemented marker trait _could_ work, but is actually very dangerous in practice. In the below example, assume that the impl of MyFreeze was correct when it was written. Everytime the author of MyType updates their dependency on other_crate they must recheck that OtherType still has no direct interior mutability or risk unsoundness.
struct MyType { value: other_crate::OtherType }
unsafe impl MyFreeze for MyType {}
  • Add a T: Copy bound on TCell::<T>::borrow. This would definitely work but leaves a lot of types out.

  • Wait for OIBITs to stabilize (assuming it will be stabilized).

  • Have TCell store a Box<T> internally, and only work with heap allocated data where interior mutability is of no concern. This would be pretty effective, and if the type is small enough and Copy, the Box could be elided. While not as good as stabilizing Freeze, I think this is the best alternative.

C-feature-request T-lang needs-rfc

Most helpful comment

Add a T: Copy bound on TCell::::borrow. This would definitely work but leaves a lot of types out.

We'd very much appreciate if you wouldn't do that. This would paint us into a corner that is impossible to get out of, where the legitimate usecases for a Copy type with interior mutability would be forever excluded from Rust.

All 27 comments

cc @rust-lang/lang and possibly @rust-lang/libs

I'm sorta interested in this direction. However, Freeze is also a lang item used by other parts of the language so we need to tread carefully here and possibly consider exposing a version of Freeze that isn't a lang item.

I would also be interested in seeing more use cases to justify the exposure of Freeze.

Note that this would make it an api-breaking change to add interior mutability to a type.

Adding interior mutability into a type is already an API breaking change.

Even adding trait objects into a type is an API breaking change.

struct S {
    i: usize,
    //j: std::cell::Cell<usize>,
    //k: Box<dyn std::error::Error>,
}

fn f(s: &S) {
    let _ = std::panic::catch_unwind(|| {
        let _s = s;
    });
}

fn main() {}

I would also be interested in seeing more use cases to justify the exposure of Freeze.

This is essentially the same use case, but crossbeam-epoch could add a non-heap Atomic style type to the API. e.g. A Seq<T> type which internally protects the contained value with a seqlock. On write, Seq defers the drop of the overwritten value. On read, Seq returns a shallow copy which borrows the lifetime of both the Guard and the Seq. Seq would not have to require Copy, but could instead only require Freeze.

Note that this would make it an api-breaking change to add interior mutability to a type.

Fair point! This is already the case for Cell/RefCell which are !Sync. Freeze would add Mutex/RwLock/Atomic*/etc as API breakers.

Adding interior mutability into a type is already an API breaking change.

@dtolnay We might want to add a PhantomUnfrozen to give users the ability to not guarantee Freeze then? (Inspired by PhantomPinned)

@Centril That's equivalent to PhantomData<Cell<()>>.

Cc https://github.com/rust-lang/rust/issues/25053 -- using Copy as a workaround here would obviously break if UnsafeCell were made Copy. (Which might provide additional motivation for exposing Freeze, in order to be able to express the intended requirement directly rather than by fortuitous coincidence.)

Add a T: Copy bound on TCell::::borrow. This would definitely work but leaves a lot of types out.

We'd very much appreciate if you wouldn't do that. This would paint us into a corner that is impossible to get out of, where the legitimate usecases for a Copy type with interior mutability would be forever excluded from Rust.

@glaebhoerl Ahh, I didn't realize there was a discussion around that. It's not clear to me that UnsafeCell: Copy would cause unsafety in swym, but it would definitely break the reference semantics of borrow's return type.

That's equivalent to PhantomData

For that to work, the proposed version of Freeze should not have the PhantomData impl core currently has. core would certainly wanna keep their definition of Freeze as it's used for deciding whether to place statics in read-only memory or writable memory and possibly for other optimizations.

Note that Cell<()> is zero-sized so you can just use it without PhantomData around it.

To opt out of Freeze, using Mutex<()> / PhantomData<Mutex<()>> instead avoids opting out of Sync "by accident".


aside

I think that having an explicitely named PhantomXXX (_e.g._, PhantomUnfrozen, but also PhantomUnsynced, PhantomThreadBound), even when it is just a type alias or a new type wrapper around PhantomData, helps improve the readability of the code: currently code with occurrences of PhantomData very often have a comment next to it to explain what the purpose of the PhantomData is. See this thread regarding the usage of PhantomData for (in)variance

A separate PhantomUnfrozen type is required. AFAIK there is no Sync ZST with interior mutability.

PhantomData<Mutex<()>>/PhantomData<AtomicUsize> would have the correct opt out behavior, but requires removing the blanket PhantomData Freeze impl. This is a bad idea due to widespread usage of PhantomData for owned pointers. Types like Vec<Cell<u8>> would not be Freeze - even though they should be.

You can make your own on stable Rust with a wrapper around UnsafeCell<()> that you then unsafe impl Sync on.

Note that we reserve the right to consider everything outside a literal UnsafeCell to be frozen so this shouldn't be used for unsafe tricks, only "unpromising" Freeze on a type.

Okay, seeing this issue has convinced me that we really need to push for either opt-in Copy, or Copy for UnsafeCell, or the CopyOwn thing. Because we cannot just wait and expect people not tor rely on people abusing the number of things Copy excludes for other purposes than declaring things to be (implicitly) byte-copyable. People already abuse it for making sure a type has a no-op Drop impl, which is its own headache that is further exacerbated by the fact that UnsafeCell can't implement Copy.

I would also be interested in seeing more use cases to justify the exposure of Freeze.

Mem-mapping is a use-case I have for Freeze, specifically to prevent accidentally implementing some traits on types with interior mutability. Though given how much unsafe code is involved in that use-case anyway it'd be just a "belt-and-suspenders" measure.

It's also a use-case where exposing the actual compiler Freeze type would be annoying: in my usecase a small subset of types have both interior mutability and implement mem-mapping, with care taken to only actually mutate "dirty" values that are heap allocated. So those types would need Freeze impl's, which is probably inappropriate if they actually affect code generation.

@petertodd

It's also a use-case where exposing the actual compiler Freeze type would be annoying: in my usecase a small subset of types have both interior mutability and implement mem-mapping, with care taken to only actually mutate "dirty" values that are heap allocated. So those types would need Freeze impl's, which is probably inappropriate if they actually affect code generation.

I don't entirely follow. Note that Freeze only talks about the value itself, not other memory it points to. So, for example, Box<RefCell<T>> is Freeze.

On January 1, 2020 12:42:22 PM EST, Ralf Jung notifications@github.com wrote:

@petertodd

It's also a use-case where exposing the actual compiler Freeze type
would be annoying: in my usecase a small subset of types have both
interior mutability and implement mem-mapping, with care taken to only
actually mutate "dirty" values that are heap allocated. So those types
would need Freeze impl's, which is probably inappropriate if they
actually affect code generation.

I don't entirely follow. Note that Freeze only talks about the value
itself, not other memory it points to. So, for example,
Box<RefCell<T>> is Freeze.

I'm talking about values in memmapped files.

To summarize, I have a trait, ValidateBlob, that validates that a blob of bytes are structurally valid for the type implementing it ([repr(C)], unaligned types, etc). It's designed to work with read-only mappings鹿, so interior mutability will simply segfault虏. Freeze in that context would be a useful lint. But ValidateBlob is of course unsafe, so it'd just be a lint.

1) I'm both willing to live with corruption caused by unintentional modifications to the backing file, and in this use case since the backend is meant to be append only, you can use the append-only bit to make that problem much less of an issue. It's basically a database after all. You can also play games with memmapping modes to get similar results, with different trade-offs.

2) Intentional mutation is via copy-on-write, so all intentional mutation will be in heap/stack allocated memory.

Aside-ish

I've come to realize that there could be a need for a trait similar to Freeze, but with different semantics. I may do an RFC out of it, but I think this is a good place to start discussing it, since it may relate to what some people think Freeze is for.

Use case: Cell::read_with

impl<T : ?Sized> Cell<T> {
    fn read_with<R, F> (self: &'_ Self, f: for<'any> fn(&'any T) -> R) -> R
  • EDIT: replaced FnOnce closure by an env-less closure to dismiss the case where the closure itself captures a &'_ Self (discussing the right bounds on the closure is not the point of this post).

This seems like a desirable API, if it was possible to offer it in a sound manner.

  • It would allow, for instance, to read the discriminant of a Cell<Option<String>>.

    Currently, the only way to read it is to temporarily replace with another value:

    fn is_some (cell: &'_ Cell<Option<String>>) -> bool
    {
        let prev = cell.replace(None);
        let ret = prev.is_some();
        cell.set(prev);
        ret
    }
    

    instead of:

    fn is_some(cell: &'_ Cell<Option<String>>) -> bool
    {
        cell.read_with(Option::is_some)
    }
    

Now, using exactly the suggested function API is unsound: Playground, to be run with MIRI.

Basically, here is the offending code:

struct Rec /* = */ (
    pub
    Rc<Cell<Option<Rec>>>,
);

fn evil ()
{
    let this = Rec::new(); // Creates a cycle (thanks to shared mutability)
    this.0.read_with(|inner: &Option<Rec>| {
        // given the type of inner, it is illegal to mutate
        // the pointee (the Option), for instance, its
        // discriminant is known to be immutable.
        if let &Some(ref this) = inner {
            this.0.set(None);
            // Woops, we mutated an immutable discriminant: UB
        }
    })
}

So the issue with Cell::with is visible: by having the "inner" type have interior mutability and it recursively pointing to the "outer" reference, we get to have a shared reference &X witness mutation of its pointee X, which is UB.

So, one (conservative) way of avoiding such issue with compile-time checks is to prevent the first aforementioned necessary condition: let's have a way in the type level to express that a type cannot possibly have interior / shared / aliased mutability, _i.e._, that &T asserts that the pointee T is _transitively_ immutable.

  • Now, this may seem like what the Freeze trait is for, but this is not the case: Freeze is more of a low-level / implementation-detail-based trait: it is just telling if the _direct_ / first layer of a type T is immutable when shared. This is indeed a sufficient condition to have a static occupy read-only memory, or, more generally, have a value of type T be backed up by read-only memory (_e.g._, mmaped memory) if we ensure only &T accesses are possible, and T : Freeze.

    In the previous example, for instance, Option<Rec> is Freeze! Indeed, auto-traits lead to it being Freeze when Rec is, and the first / direct layer of Rec is that of a pointer, that in and on itself has no interior mutability. It would thus be sound to have a static _: Rec be laid out in read-only memory, _i.e._, Rec : Freeze.

    More concretely, Freeze has the following impls (note the lack of : Freeze bound on the generic <T>ype parameter): https://github.com/rust-lang/rust/blob/766fba3fdca8fe075c370a63d2f76f8407483af5/src/libcore/marker.rs#L683-L687

    That is, indirection is automatically Freeze, thus only a direct UnsafeCell is able to break Freeze-ness (and ZSTs, or at least PhantomData, are always Freeze too, it seems. ([T; 0] is crying in a corner btw))

Hence the need for _another_ auto trait, if we are to have Cell::with. I will from now on call it with a long name describing its semantics: TransitivelyImmutableWhenAliased

pub
unsafe
auto trait TransitivelyImmutableWhenAliased {}

impl<T : ?Sized> !TransitivelyImmutableWhenAliased for UnsafeCell<T> {}

// Same as with `Send` and `Sync`, we conservatively drop the trait when raw pointers are involved,
// in case the pointers are type-erase
// (_e.g._, a `*const ()` representing a `*const UnsafeCell<Something>`)
impl<T : ?Sized> !TransitivelyImmutableWhenAliased for *const T {}
impl<T : ?Sized> !TransitivelyImmutableWhenAliased for *mut T {}

unsafe impl<T : ?Sized> TransitivelyImmutableWhenAliased for Box<T> where
    T : TransitivelyImmutableWhenAliased,
{}

impl<T : ?Sized> Cell<T> {
    fn with<R, F> (self: &'_ Self, f: for<'any> fn(&'any T) -> R) -> R
    where
        T : TransitivelyImmutableWhenAliased,

What about {A}Rc ?

I don't really know what to do with {A,}Rc<impl TransitivelyImmutableWhenAliased + ?Sized>. @RalfJung once wrote something about this topic in another comment I cannot find. It was something regarding the "public-ness" of Cell: while it is true that RcBox and ArcInner have counters with aliased mutability, these counters are just a private implementation detail and have nothing to do with the type they wrap. Hence {A,}Rc<impl TransitivelyImmutableWhenAliased + ?Sized> would be a sound type to use with Cell::with. However, it would be a lie to say that {A,}Rc<impl TransitivelyImmutableWhenAliased + ?Sized> is transitively immutable (when aliased), since these counters are not.

If this TransitivelyImmutableWhenAliased trait is to be used only with Cell::read_with, then we could find a different more apt name and have the aforementioned {A}Rc implement the trait. But maybe there are other use-cases where true immutability is needed for soundness, and having these mutable counters would break it (we could, for instance, imagine something along the lines of someone deriving some kind of Sync-ness from TransitivelyImmutableWhenAliased-ness, which would obviously be unsound with Rc

Isn't Cell::with generally unsound because it allows you to violate aliasing rules?

let this = Cell::new(1u8);
this.with(|inner: &u8| {
    // This changes the value of inner despite it being behind an immutable reference.
    // This is a violation of the aliasing rules.
    this.set(2);
});

This is why the key rule in Cell is that it never exposes a reference to the inner value.

You're completely right, I don't know how I can have forgotten to think about the environment captured by a closure 馃槄.

Not all hope is lost, though, because it is still possible to add trait bounds restriction on the closure itself. We can discuss this once I suggest the pre-RFC to the internals (I don't want to carry this issue too off-topic with my suggestion, I just wanted to submit the idea that there are "different kind of Freeze-like traits", and this looked like a good place to talk about it).
I'll edit the above post and replace impl FnOnce... by a fn.. to (temporarily) "dismiss" the issue of the env captured by the closure (but I think that similarly to the Sync <-> Send interaction, a marker trait lost on &impl !TransitivelyImmutableWhenAliased and required on the with closure would solve this).

Not all hope is lost, though, because it is still possible to add trait bounds restriction on the closure itself.

Closure captures aren't the only way the closure might get access to another reference to the Cell, it might be reachable from a global. Modifying the last counter-example (omitting the verbose initialization):

static THIS: RwLock<Cell<i32>>;
THIS.read().unwrap().with(|inner| {
    THIS.read().unwrap().set(2);
})

Oh I somehow imagined that Sync would prevent such example, my bad. Damn I felt like this idea could have worked 馃槥
This means I got no use-case for TransitivelyImmutableWhenAliased (I should have given this more thought).

Oh I somehow imagined that Sync would prevent such example

Actually, I wasn't that far off: not only does ::std::'s RwLock require Sync on the wrappee in order to be Sync itself, but, more generally (_i.e._, taking into accounts things like ::lock_api::ReentrantMutex), the primitive Cell::read_with could require that the Cell definition in and on itself not be Send (on top of it not being Sync) so that it cannot be held in a global variable.

  • a !Send + !Sync cannot be held in a global variable, not even when protected by a lock, otherwise code like the following would be legal:

    lazy_static! {
        static ref THIS: RwLock<Cell<
            &Cell<i32> // something not Send
        >> = RwLock::new(Cell::new(
            Box::leak(Box::new(Cell::new(0)))
        ));
    }
    
    fn main ()
    {
        let first_ref: &Cell<i32> = THIS.read().unwrap().get();
        ::std::thread::spawn(|| {
            let second_ref: &Cell<i32> = THIS.read().unwrap().get();
            loop { second_ref.set(0); }
        });
        loop { first_ref.set(!0); }
    }
    

So, even if my suggestion is no longer applicable to ::core::cell::Cell, it would be interesting to have another definition (an external crate I guess) that would impl !Send for that definition of "Cell" (let's call this definition NonSendCell).

  • implementing !Send to avoid being accessible from within a global does seem hackish, though.

In that scenario, the (auto) trait TransitivelyImmutableWhenAliased would be useful, so my previous post and suggestion remains (partially) valid.

Making it !Send doesn't help, just replace the previous counter-example by a thread-local variable.

There is no way to avoid communication of a caller with the closure, I'm afraid. You'd need something like an effect system.

There is no way to avoid communication of a caller with the closure, I'm afraid. You'd need something like an effect system.

I guess it is now too late to have an auto-trait for global-safety that some types could opt out of, is it not?

I wouldn't even know what that trait would mean (what is the proof obligation that goes along with it?). But also at this point it's probably better to stop the digression in this thread. ;)

Just submitted an RFC.

@petertodd I am not sure I totally understand your use case, so I only mentioned it in passing.

Was this page helpful?
0 / 5 - 0 ratings