Consider the following Rust code:
extern crate rayon;

use std::mem;
use std::sync::{Arc, Mutex};

use rayon::join;

fn main() {
    let a1 = Arc::new(Mutex::new(0));
    let mut a2 = a1.clone();
    join(
        || {
            {
                let mut guard = a1.lock().unwrap();
                *guard += 1;
                // Leak the guard: the mutex stays locked forever.
                mem::forget(guard);
            }
            // Release-decrement the strong count.
            drop(a1);
        },
        || {
            // Spin until we observe a strong count of 1.
            loop {
                match Arc::get_mut(&mut a2) {
                    None => (),
                    Some(m) => {
                        // Mutate the contents without taking the
                        // (still locked) mutex.
                        *m.get_mut().unwrap() += 1;
                        return;
                    }
                }
            }
        },
    );
}
The first thread acquires the lock, modifies the variable, and then drops its Arc without unlocking (that's the point of the mem::forget).
The second thread waits until the first thread decrements the strong count by dropping its Arc, and then uses get_mut to access the contents without taking the lock (at that point, the mutex is still locked).
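For reference on that last point: Mutex::get_mut hands out a mutable reference to the contents without involving the lock at all, because holding a &mut Mutex<T> is supposed to prove exclusive access. A minimal standalone sketch of that behaviour:

    use std::sync::Mutex;

    fn main() {
        let mut m = Mutex::new(0u32);
        // No locking happens here: &mut Mutex<T> is taken as proof of
        // exclusivity, so get_mut returns &mut T directly (it only
        // checks for poisoning).
        *m.get_mut().unwrap() += 1;
        assert_eq!(*m.lock().unwrap(), 1);
    }

In the example above, Arc::get_mut is what is supposed to establish that exclusivity, and that is exactly where the synchronization is missing.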
My claim is that there is a race between the two accesses to the contents of the mutex. The only way the two accesses could be in a happens-before relationship would be for Arc::drop and Arc::get_mut to establish it. However, even though Arc::drop does use a release write, Arc::get_mut only performs a relaxed read of the strong counter (via is_unique).
The fix is to use an acquire read in is_unique. I do not expect a significant performance penalty here, since is_unique already contains several release/acquire accesses (of the weak count).
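To illustrate the proposed change, here is a minimal sketch of the uniqueness check on the strong counter (illustrative only; the names mirror the discussion above, not necessarily the exact std source):

    use std::sync::atomic::{AtomicUsize, Ordering};

    // Buggy shape: a relaxed load never synchronizes with the release
    // decrement performed by Arc::drop, so no happens-before edge exists:
    //
    //     strong.load(Ordering::Relaxed) == 1
    //
    // Fixed shape: an acquire load pairs with that release write,
    // ordering everything the dropping thread did before the decrement
    // before our subsequent non-atomic access to the contents.
    fn strong_count_is_one(strong: &AtomicUsize) -> bool {
        strong.load(Ordering::Acquire) == 1
    }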
CC @RalfJung
EDIT
As @RalfJung noted, we do not actually need to leak memory to exploit this bug (hence this is not another instance of the Leakpocalypse). The following piece of code exhibits the same problem:
extern crate rayon;

use std::sync::Arc;

use rayon::join;

fn main() {
    let a1 = Arc::new(0);
    let mut a2 = a1.clone();
    join(
        || {
            // Non-atomic read of the contents, then a release decrement
            // of the strong count when the Arc is dropped.
            let _: u32 = *a1;
            drop(a1);
        },
        || {
            // Spin until get_mut succeeds, then write non-atomically.
            loop {
                match Arc::get_mut(&mut a2) {
                    None => {}
                    Some(m) => {
                        *m = 1u32;
                        return;
                    }
                }
            }
        },
    );
}
Actually, the race also exists without the Mutex: Arc<u32> also triggers the bug, but then this is a race between a read in the first thread and a write in the second thread.
That bug was discovered while trying to formally prove the soundness of Arc in the RustBelt project. Another bug found by formal verification :)
> Actually, the race also exists without the Mutex: Arc<u32> also triggers the bug, but then this is a race between a read in the first thread and a write in the second thread.
I should emphasize that this new version of the bug does not use mem::forget. So this is not another instance of the Leakpocalypse.
> That bug was discovered while trying to formally prove the soundness of Arc in the RustBelt project.
More detail on this: the bug was found while Hai (@hans89), Derek, and I were trying to formally verify Arc. This may be confusing, because I previously claimed that we (including @jhjourdan at the time) had already verified Arc; the answer is that the previous verification assumed sequential consistency, whereas @jhjourdan et al. are now doing the verification with a much weaker memory model.
@jhjourdan Can you update the original issue with the new version that doesn't use mem::forget? Leakpocalypse is what my mind immediately jumped to upon seeing that example.
@bstrie here's the code:
extern crate rayon;

use std::sync::Arc;

use rayon::join;

fn main() {
    let a1 = Arc::new(0);
    let mut a2 = a1.clone();
    join(
        || {
            let _: u32 = *a1;
            drop(a1);
        },
        || {
            loop {
                match Arc::get_mut(&mut a2) {
                    None => {}
                    Some(m) => {
                        *m = 1u32;
                        return;
                    }
                }
            }
        },
    );
}
The read and the write satisfy the C11 definition of a data race: two accesses not ordered by happens-before, at least one of which is a write, and at least one of which is non-atomic.
(It is debatable whether this should be a race because it seems no compiler optimization actually invalidates said code. But that's a different question.)
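To make the missing happens-before edge concrete, here is a distilled, self-contained version of the same pattern using raw atomics (illustrative; Shared and its strong field stand in for Arc's internals):

    use std::cell::UnsafeCell;
    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::Arc;
    use std::thread;

    // Stand-in for ArcInner: a counter plus non-atomically accessed data.
    struct Shared {
        strong: AtomicUsize,
        data: UnsafeCell<u32>,
    }
    unsafe impl Sync for Shared {}

    fn main() {
        let s = Arc::new(Shared {
            strong: AtomicUsize::new(2),
            data: UnsafeCell::new(0),
        });
        let s2 = s.clone();
        let t = thread::spawn(move || {
            let _x = unsafe { *s2.data.get() };        // non-atomic read
            s2.strong.fetch_sub(1, Ordering::Release); // release decrement, as in Arc::drop
        });
        // A relaxed load can observe the count reaching 1 without
        // synchronizing with the release decrement above...
        while s.strong.load(Ordering::Relaxed) != 1 {}
        // ...so this non-atomic write is not ordered after the read:
        // a data race by the C11 definition.
        unsafe { *s.data.get() = 1; }
        t.join().unwrap();
    }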