Rust: std::thread::JoinGuard (and scoped) are unsound because of reference cycles

Created on 10 Apr 2015 · 60Comments · Source: rust-lang/rust

You can use a reference cycle to leak a JoinGuard and then the scoped thread can access freed memory:

use std::thread;
use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering::SeqCst;
use std::rc::Rc;
use std::cell::RefCell;

struct Evil<'a> {
    link: RefCell<Option<Rc<Rc<Evil<'a>>>>>,
    arm: thread::JoinGuard<'a, ()>
}

// This reliably observes an immutable reference mutating.
fn bad(g: &AtomicBool, v: &u64) {
    let jg = thread::scoped(move || {
        while !g.load(SeqCst) { }; // Wait for the reference to change
        println!("{}", *v);        // Observe it
        g.store(false, SeqCst);    // Safely exit without crashing
    });
    let e = Rc::new(Evil {
        link: RefCell::new(None),
        arm: jg
    });
    *e.link.borrow_mut() = Some(Rc::new(e.clone())); // Create a cycle
}

#[inline(never)] // Prevent DSE
fn helper(v: &mut Result<u64, &'static str>) {
    let g = AtomicBool::new(false); // Used as a barrier to ensure reliable execution
    if let &mut Ok(ref v) = v { bad(&g, &v) };
    *v = Err("foo");

    g.store(true, SeqCst);
    while g.load(SeqCst) {};
}

fn main() {
    helper(&mut Ok(4));
}

P-medium

Source

arielb1

Most helpful comment

[...] the blog post about "Fearless Concurrency" in Rust mentions std::thread::scoped, which is now deprecated as a result of this issue. Is it possible for someone to update the post? It is misleading for novices (like me).

The API is coming back, so this is mostly a temporary state.

Guys, do you realise how ridiculous this looks? Let's see:

"Hi, I read about this amazing scoped threads feature in Rust, but... I can't actually find it? What's the deal?"

"Oh, that... The thing is... [wiggle squirm] We actually dropped it like half a year ago... But it's just temporary! We have ideas for a replacement! In fact, if you spend half a day wading through discussions on issues and pull requests, perhaps you will ultimately stumble upon a crate far away called Crossbeam that indeed implements a viable replacement, stuffed in along with various other experimental features... And surely something along these lines will someday somehow end up in the standard library again in some form (even though nothing actually seems to be happening regarding that right now)... So you see, it's all temporary! No need to even mention it in existing literature!"

"Riiiiiight.... [slowly backing away] You know, I think I'll actually look for some other language... But you kids keep having fun!"

antrik on 30 Oct 2015

👍3

All 60 comments

Seems pretty bad, nominating.

sfackler on 10 Apr 2015

Sigh. Good point! This seems very similar to the dropck rules, I suspect we can address it in a similar fashion to how we addressed Arena. But I have to have a bit more caffeine and try to work it through to make sure I'm correct. :)

cc @pnkfelix

UPDATE: Spelled out what I meant a bit more.

nikomatsakis on 10 Apr 2015

@sfackler

I don't think this is _so_ bad, because I can't really think of this happening _accidentally_. I think the best way to fix this would be to add a Leak OIBIT and make Rc/Arc require it. We may want to make it a by-default bound (like Sized), through, to prevent massive annotation burden.

arielb1 on 10 Apr 2015

triage: P-backcompat-libs (1.0)

I definitely think we need to address this for 1.0. I'm still not sure the best way to do it.

nikomatsakis on 10 Apr 2015

From a pragmatic perspective, I like the idea of ?Leak like ?Sized, but I'm not sure it's really justifiable since (1) it's not a language-level feature like Sized is, (2) we don't have builtin support for Leak, (3) it puts a lot of emphasis on Rc / Arc, since those are (IIRC) the only Rust constructs that can leak anything in the first place. Making Leak an OIBIT and opting out for JoinGuard seems sufficient to me; Arc already requires two trait bounds on its generic type parameter and I've never really seen anyone complain about it, so I suspect the annotation burden isn't too severe.

pythonesque on 14 Apr 2015

This doesn't seem like innate unsoundness so much as a way to turn one form of unsoundness (the ability to cause a leak) into a worse form of unsoundness (the ability to cause a crash).

Have there been any previous efforts to eliminate the uncollectability of reference cycles?

joshtriplett on 15 Apr 2015

https://github.com/python/cpython/blob/master/Modules/gcmodule.c

dbpokorny on 15 Apr 2015

😄3

@joshtriplett raw memory leaks explicitly do not qualify as unsound, or at least, "unsafe", under Rust's use of the term, as explicitly discussed here:

http://doc.rust-lang.org/1.0.0-beta/reference.html#behaviour-not-considered-unsafe

There is further detailed discussion on an internals thread here:

http://internals.rust-lang.org/t/are-memory-leaks-considered-to-violate-memory-safety/1674

I point this out merely to establish that is not a gradient; it is a firm boolean condition, marking a line beyond which one risks undefined behavior.

In any case, a garbage collector is part of our future goals for Rust. That, when used, will allow cycles using Gc<T> to be reclaimed. But it won't resolve the problem in this issue, because garbage collectors reclaim storage at unpredictable times, and thus are not suitable for RAII guards.

pnkfelix on 15 Apr 2015

@reem and @nikomatsakis proposed different solutions for solving this on IRC

@reem's solution: make Rc and Arc require 'static and specify that leaking memory for non-'static data is unsafe. (Optional: create unsafe variants for Rc and Arc that allow non-'static content) (logs)

@nikomatsakis's solution: create a new scoped() API that is tolerent to memory leaks (logs, prototype type signatures)

theemathas on 15 Apr 2015

I changed my mind about the "make Rc require 'static" solution after @nikomatsakis demonstrated compelling use cases for Rc of non-'static data and reminded me that it would break code like fn x<T>(...) -> Rc<T>.

reem on 15 Apr 2015

Note that a hypothetical Gc type would naturally require 'static, since the destructor can run at any time in the future. Rc is very similar in this sense, which is what inspired me to say we should just bound by 'static, but pragmatism might have to prevail here, since it's useful to Rc non-'static data.

reem on 15 Apr 2015

@reem (well there might be some other bound we come up with in the future that is more flexible than 'static for Gc's purposes, at least if the collector ends up being thread-local, as originally planned for @T ... something that encodes that the data does not outlive the task ...)

pnkfelix on 15 Apr 2015

Well making Rc require T: 'static would be really bad (as in, librustc would require a pretty deep refactoring).

I think we do want to introduce Leak someday, but that would be a breaking change. We may have to settle for an ST-like with_scope API.

arielb1 on 15 Apr 2015

After thinking about this for a bit, I think that dealing with Rc/'static/Leak may be a bit of a red herring. Upon investigating, we actually have a few known sources of memory leaks today in the standard library and the compiler itself, all of which can lead to the same form of memory unsafety as exposed through Rc:

Failing destructors will leak local variables (example unsafety)
Vec does not handle panicking destructors well (example unsafety). Note that this may apply to other collections as well.

Each of these can definitely be classified as a bug in its own right, but it goes to show that a targeted solution at Rc/Arc may not totally fit the bill as it's often pretty convenient to just assume it's ok that destructors do not run.

alexcrichton on 15 Apr 2015

Yeah I'm a bit confused that std::thread::JoinGuard is getting the blame here. It seems to me that the real problem is that Rc+RefCell enables writing mem::forget in safe code:

fn safe_forget<T>(data: T) {
    use std::rc::Rc;
    use std::cell::RefCell;

    struct Leak<T> {
        cycle: RefCell<Option<Rc<Rc<Leak<T>>>>>,
        data: T,
    }

    let e = Rc::new(Leak {
        cycle: RefCell::new(None),
        data: data,
    });
    *e.cycle.borrow_mut() = Some(Rc::new(e.clone())); // Create a cycle
}

fn main() {
    struct Foo<'a>(&'a mut i32);

    impl<'a> Drop for Foo<'a> {
        fn drop(&mut self) {
            *self.0 += 1;
        }
    }

    let mut data = 0;

    {
        let foo = Foo(&mut data);
        safe_forget(foo);
    }

    data += 1; // super make sure there's no outstanding borrows

    println!("{:?}", data); // prints 1, should print 2
}

Planned features like Vec::drain_range rely on the fact that the destructor of the iterator will be called before the parent Vec becomes accessible again for _memory safety_. (rough sketch: data in the middle of the Vec is moved out (becoming logically uninitialized) and elements are _only_ back-shifted in the iterator's destructor).

Gankra on 15 Apr 2015

👍1

I understand that memory leaks are not currently considered unsound. I'm suggesting the same thing @Gankro is, that we _should_ prevent writing mem::forget in safe code, using Rc or otherwise.

Would it work to somehow prevent an Rc from being put in an Rc, by requiring a particular type class for things put in Rc, and not providing that type class for Rc itself? (That would then prevent building a cyclic data structure with Rc, which seems like a feature. Such structures could still be built with arenas or similar.)

joshtriplett on 15 Apr 2015

Or alternatively, perhaps just JoinGuard needs to be prevented from being put in an Rc (directly or indirectly). Anything used for RAII needs to not be.

joshtriplett on 15 Apr 2015

@alexcrichton These are all probably resolvable with a Leak trait the compiler knows about... for example, we could forbid putting types with !Leak in locals within destructors (this would imply that we should use ?Leak like ?Sized, assuming Leak by default for type parameters on types with Drop impls, and that this was assumed by default for destructors, which I think is fine since Leak doesn't violate parametricity in ways that interfere with dropck). This would also solve the Vec case, since Vec would simply not accept Leak types (in the future, with "not" specialization, we could then provide two versions of the Vec destructor, solving the issue without reducing efficiency for other types).

Edit: Rereading the bug, this isn't really true. I think we actually can't solve this generally without some form of nothrow (which I want anyway, but clearly isn't going to come to Rust before 1.0 :( ). None of this would be an issue with abort on fail within a destructor...

@joshtriplett There is no way to do this in general without disallowing interior mutability within Rc (expanding Rust's ability to determine type safety for reference cycles is desirable in its own right, of course). The solution I'm in favor of is ensuring (via a trait Leak) that you could explicitly opt out of the ability to mem::forget a particular type, like JoinGuard (but there are many others for which this might be appropriate).

pythonesque on 15 Apr 2015

mem::forget is unsafe; using that on an RAII type is a footgun, but not one that needs fixing. However it shouldn't be possible to implement mem::forget in purely safe code.

So, rather than introducing traits to prevent putting JoinGuard in Rc, why not introduce traits to prevent putting Rc in Rc? (And since in general a type like Rc that can have reference loops would necessarily involve unsafe in the middle, it's the responsibility of that type to say so and prevent nesting such types.)

Then, even if you have interior mutability, you still couldn't create a reference loop without also using an unsafe cast.

As for the trait itself, has there been any work on negative lifetime bounds? As in, "this object has lifetime 'a, so it cannot be referenced from something that'll live longer than 'a, and thus any arena you stuff a pointer to it into cannot have an overall arena lifetime longer than 'a".

joshtriplett on 15 Apr 2015

why not introduce traits to prevent putting Rc in Rc?

Because that would make it impossible to implement many useful data structures in safe rust code. I'd imagine such a change would horribly break pretty much any program with non-trivial Rc usage, and leave them with using unsafe all over the place as the only fix that doesn't involve re-writing the whole program.

I think the proper solution is to admit that we won't be able to prevent memory leaks without massively reducing the expressiveness of safe rust, and thus make mem::forget a safe function to indicate that skipping destructor calls is always a possibility. (remember, there are currently also ways to implement mem::forget in safe rust without using Rc)

I think Rust can do without RAII guards where skipping the destructor is memory-unsafe -- closures are a perfectly good alternative, and there aren't that many types affected. (plenty of languages lacking RAII use closures for similar purposes)

dgrunwald on 15 Apr 2015

Because that would make it impossible to implement many useful data structures in safe rust code.

Not necessarily; such data structures could use arena allocation, right?

remember, there are currently also ways to implement mem::forget in safe rust without using Rc

Interesting; such as?

But in any case, adding traits that prevent putting JoinGuard in Rc will still work; it just seems unfortunate that Rust cannot actually guarantee memory reclamation even for programs written entirely using safe code.

joshtriplett on 15 Apr 2015

Discussing on IRC several of the compiler devs are of the opinion that mem::forget is unsafe largely as a warning, and not that it should be possible to trigger memory unsafety. Evidently there has never been a safe guarantee that destructors would be run when a type is believed to be dropped (e.g. its borrows expire).

Gankra on 15 Apr 2015

Not necessarily; such data structures could use arena allocation, right?

persistent binary trees need to be able to share data with previous versions, but you'd still want to free memory used by old versions that you don't need to access anymore. The easiest implementation by far is to use Rc for the references to child nodes.

Interesting; such as?

You can send values to another thread that runs an infinite loop. (OK, this is restricted to T: Send+'static, but it's still a memory leak in safe code)

dgrunwald on 15 Apr 2015

I would be interested in hearing if it's possible to implement forget for non-'static data, not using Rc, and only safe code (including safe APIs in std). All the alternatives I've heard so far either exploit a bug in panicking during destructors or require 'static - this actually makes a lot of sense conceptually, since 'static data is by definition safe to keep around for arbitrary regions of the code.

reem on 15 Apr 2015

I'm not aware of a forget implementation not using Rc that works for non-'static using the standard libraries. (but similar smart pointer types in other libraries probably will have the same problem)

Bugs regardings panics are equivalent to requiring 'static since you can't catch panics for non-'static data. (you could with the first version of thread::scoped, but that leads to unsafety in other ways)

It seems to me that the problem here involves 4 things happening at the same time:

Rc containing another Rc
Rc with interior mutability
Rc contains non-'static data
The non-'static data has a Drop impl

Without 1. or 2., you can't construct cycles (?). Without 3. or 4., leaks should be harmless (?). So there might be a possible approach here that doesn't break much existing code, if we forbid the use of Rc only when all 4 points apply.

I don't think we can add "prevent only certain kinds of memory leaks" to the 1.0 goals without delaying the release. On the other hand, releasing 1.0 with an unstable or heavily restricted Rc is certainly not an option. Once we release 1.0 with the ability to write a safe mem::forget, it'll always be possible to write a safe mem::forget. (though we still could invent !Leak types that cannot be used with mem::forget)

dgrunwald on 16 Apr 2015

There is also:

5. The Drop impl *must* be run or memory safety can be violated.

This is not true of many Drop impls. Banning 5 is basically the Leak solution.

reem on 16 Apr 2015

I have created an RFC spurred on by this topic, which proposes admitting that memory leaks are safe behavior in Rust. (consequently admitting that scoped was the source of unsafety here).

alexcrichton on 16 Apr 2015

Retriaging as P-medium. The immediate threat to 1.0 has subsided since this API is unstable.

brson on 24 Apr 2015

It's funny but I always expected this to be the case. Creating a strong ref-cycle shouldn't call the destructor, even if the cycle goes out of scope. But it should not be safe to do this.

It should not be safe to create or leak ref-cycles. It probably shouldn't even be safe to have a RefCell to an Rc.

alexchandel on 27 Apr 2015

Edit: https://github.com/rust-lang/rfcs/pull/1094 Oops, guess I was not the first one to think of this.

What are the situations where non-'static Rc is useful?

Here is some rough speculation: Say that the interface was tweaked so that to create a non-'static Rc, one must specify a "extra-strong reference". The idea is that when the extra-strong reference is dropped, the whole graph of Rcs associated with it is destroyed regardless of reference count. I hope using a sort of "strictly outlives" constraint, it can be set up so that the only reference to the RCs will be cycles at the point where extra-strong is dropped, so there are no dangling pointers. The same lifetime constraints would prevent putting the extra-strong reference in the Rc graph it owns. Thus we recover a guarantee of no leaks.

I don't claim to have found some cycle-breaking panacea--this different restricts the use of non-static RC. But are the restrictions much more over than what is already imposed by the the borrow checker on non-'static Rc? Do they interfere with the known use cases?

On second thought, this is vaguely like getting the scope proxy ideom from https://github.com/rust-lang/rfcs/pull/1084 and using it to ensure clean up of non-'static'Rc` graphs instead forked threads.

Ericson2314 on 29 Apr 2015

This was deprecated from the standard library and will soon be removed, so closing.

alexcrichton on 12 Aug 2015

I understand why thread::scoped was removed. However having a thread scoped to a context - e.g. the lifetime of a class is really useful. I run my head against launching a thread that runs a Box<SomeTrait>::run(self). Box<SomeTrait> is not Send + 'static, but Send + 'a.

thread::scoped is a very natural way of expressing this use case. What is the current recommended way of implementing this? Is it even possible without unsafe with thread::spawn expecting 'static?

SirVer on 14 Aug 2015

@SirVer Scoped threads are very useful. They will be back as soon as there is a safe way to implement (and use) their API in Rust.

I know the crate simple_parallel has implemented some similar concepts, but that crate is just an experiment(?)

bluss on 14 Aug 2015

@bluss My argument is that they are not only useful, but essential: Right now the thread API requires Send + 'static for everything and some features can just not be expressed in this. So far it was possible to opt in to thread::scoped on nightly - now, some paradigms are just not expressible in safe rust anymore.

I argue thread::scoped should not have been removed until an alternative design was in place.

For the record, this is what I have trouble with expressing without scoped: https://github.com/sirver/switchboard, especially https://github.com/SirVer/Switchboard/blob/master/src/client.rs#L240.

SirVer on 14 Aug 2015

They were removed yesterday in PR https://github.com/rust-lang/rust/pull/27684 due to being deprecated.

Rust cannot keep around features that break memory safety. It will come back when someone figures out (and bugfixes the compiler) a memory safe API for it.

bluss on 14 Aug 2015

There should be a 3rd party crate that can bridge the gap for you (with an unsafe-marked API), but I don't know what the best alternative is.

bluss on 14 Aug 2015

Users wanting this risky behavior had to explicitly enable the feature, after being confronted with a message basically saying "this might be incredibly dangerous. Read the issue". I don't see a problem with this. Lots of people were using scoped in a way that was entirely fine, and it seems like a net negative to entirely remove the capacity for users to make that call, especially given that implementing scoped using std seems a lot more annoying than implementing it from within std (maybe this has stopped being true - last time I tried, I failed miserably because I couldn't mem::transmute away all the static requirements).

bfops on 14 Aug 2015

A third party implementation of scoped should be completely possible to implement.

sfackler on 14 Aug 2015

This is as close as I've gotten. Hopefully somebody cleverer than myself can figure it out. It would definitely seem strange to not be able to use a combination of thread::spawn and mem::transmute to implement scoped.

bfops on 14 Aug 2015

I'd look at thread-scoped and simple_parallel. I haven't evaluated them myself. I guess they can only get better with use / feedback.

bluss on 14 Aug 2015

With simple_parallel, I did this - all in safe Rust. It's rather
inefficient, though... with unsafe code, one could easily make this better.
https://gist.github.com/anonymous/00778b17c01a5ee00963

RalfJung on 14 Aug 2015

Ah cool, I don't think thread-scoped existed last time I searched. At a quick glance, it looks like it adds a T + 'static requirement and a level of indirection in the spawn.

bfops on 14 Aug 2015

@RalfJung with std::iter::once and once(x).chain(once(y)), you have a two element iterator :)

bluss on 14 Aug 2015

The thread-scoped crate is an adequate replacement. It defines ::scoped as unsafe, but I guess that is a reasonable definition for now. I am really hoping the standard library comes up with a better solution allows for scoped threads in the near future.

SirVer on 17 Aug 2015

Hi all.
I don't know if I should open a separate issue about this but: the blog post about "Fearless Concurrency" in Rust mentions std::thread::scoped, which is now deprecated as a result of this issue. Is it possible for someone to update the post? It is misleading for novices (like me).

nicola-gigante on 5 Sep 2015

The API is coming back, so this is mostly a temporary state.

Manishearth on 5 Sep 2015

Interesting! Can you point me to the place (if any) where people are discussing how to safely implement it?

nicola-gigante on 5 Sep 2015

I can't find it right now, but there already were some working scoped threads designs posted on the RfC. Not sure what the timeline for bringing this back is.

Manishearth on 5 Sep 2015

https://github.com/rust-lang/rfcs/pull/1084

Manishearth on 5 Sep 2015

Crate crossbeam implements scoped threads.

bluss on 5 Sep 2015

[...] the blog post about "Fearless Concurrency" in Rust mentions std::thread::scoped, which is now deprecated as a result of this issue. Is it possible for someone to update the post? It is misleading for novices (like me).

The API is coming back, so this is mostly a temporary state.

Guys, do you realise how ridiculous this looks? Let's see:

"Hi, I read about this amazing scoped threads feature in Rust, but... I can't actually find it? What's the deal?"

"Riiiiiight.... [slowly backing away] You know, I think I'll actually look for some other language... But you kids keep having fun!"

antrik on 30 Oct 2015

👍3

You don't need crossbeam. This is an old thread, the state of scoped threads has moved past this since then.

https://crates.io/crates/scoped_threadpool gives you the functionality. That's it. It's in an external crate which works and is sound, there's no pressing need to include it in the standard library.

Manishearth on 30 Oct 2015

@antrik please try to be a bit more substantial and less sarcastic with your criticism. This kind of comment isn't particularly welcome here.

steveklabnik on 30 Oct 2015

You don't need crossbeam. [...] https://crates.io/crates/scoped_threadpool gives you the functionality.

Pools are nice; yet crossbeam::Scope seems a more straightforward replacement for the old functionality?...

Anyway, debating this actually serves to demonstrate the existing confusion and lack of communication regarding this situation; creating an unacceptable story for any newcomer. thread::scoped() is prominently featured in a major piece of Rust advocacy (the mentioned blog post), without any hint that it's gone, or where to look for replacements; nor am I aware of any other clear communication regarding this, that people looking for thread::scoped() are likely to find -- and nobody seems to be willing to even acknowledge that there is a communication problem...

If my post wasn't substantial in illustrating how this feels to people outside the bubble, I really don't know what would be.

antrik on 31 Oct 2015

Sure, that's a straightforward thing to propose. You could have just done that. Your initial comment didn't complain about the communication error, it just ranted unconstructively without anything really helpful.

@alexcrichton could you update the blog post with a note about scoped threads being moved out of the standard library to scoped_threadpool and Crossbeam?

Manishearth on 31 Oct 2015

https://github.com/rust-lang/blog.rust-lang.org/pull/74

Manishearth on 31 Oct 2015

Thanks.

I was frustrated that the last person who suggested updating the post was brushed off, so I tried to illustrate why this is serious problem... Sorry that I failed to make my motivation clear.

Note though that updating this blog post only solves part of the problem. The Why Rust? "report" for example only vaguely hints at changes; and there might be other mentions out there as well -- so it would be helpful to have some kind of "landing page" for people looking for thread::scoped(). (Ideally right in the standard library documentation where it used to live -- though I guess that might be technically impossible after it has been dropped for good?...)

antrik on 31 Oct 2015

We could add a dummy documentation page or something. Or just add a note in the thread module. Not sure what to do in this case, @steveklabnik

As for Why Rust I think the author was informed about this later, not sure if he changed anything.

Manishearth on 31 Oct 2015

I'm not aware of anyone being brushed off, but I think editing the post is a great idea. Merged.

I wouldn't be a fan of adding dummy doc pages.

Sent from my iPhone

On Oct 30, 2015, at 20:50, Manish Goregaokar [email protected] wrote:

We could add a dummy documentation page or something. Or just add a note in the thread module. Not sure what to do in this case, @steveklabnik

As for Why Rust I think the author was informed about this later, not sure if he changed anything.

―
Reply to this email directly or view it on GitHub.

steveklabnik on 31 Oct 2015

I'll update the Why Rust? report. (I used both scoped_threadpool and crossbeam::scope in my presentation at OSCON Amsterdam last week; they're great.)

(In the future, please feel free to contact me directly about these things, rather than just mention me obliquely!)