Consider the following Rust code:
extern crate rayon;

use std::mem;
use std::sync::{Arc, Mutex};

use rayon::join;

fn main() {
    let a1 = Arc::new(Mutex::new(0));
    let mut a2 = a1.clone();
    join(
        || {
            {
                let mut guard = a1.lock().unwrap();
                *guard += 1;
                // Leak the guard: the mutex stays locked forever.
                mem::forget(guard);
            }
            // Release-decrement the strong count.
            drop(a1);
        },
        || {
            // Spin until we observe a strong count of 1.
            loop {
                match Arc::get_mut(&mut a2) {
                    None => (),
                    Some(m) => {
                        // Mutate the contents without taking the
                        // (still locked) mutex.
                        *m.get_mut().unwrap() += 1;
                        return;
                    }
                }
            }
        },
    );
}
The first thread acquires the lock, modifies the variable, and then drops its Arc without unlocking (that's the point of the mem::forget).
The second thread waits until the first thread decrements the strong count by dropping its Arc, and then uses get_mut to access the contents without taking the lock (at that point, the mutex is still locked).
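For reference on that last point: Mutex::get_mut hands out a mutable reference to the contents without involving the lock at all, because holding a &mut Mutex<T> is supposed to prove exclusive access. A minimal standalone sketch of that behaviour:

    use std::sync::Mutex;

    fn main() {
        let mut m = Mutex::new(0u32);
        // No locking happens here: &mut Mutex<T> is taken as proof of
        // exclusivity, so get_mut returns &mut T directly (it only
        // checks for poisoning).
        *m.get_mut().unwrap() += 1;
        assert_eq!(*m.lock().unwrap(), 1);
    }

In the example above, Arc::get_mut is what is supposed to establish that exclusivity, and that is exactly where the synchronization is missing.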
My claim is that there is a race between the two accesses to the contents of the mutex. The only way the two accesses could be in a happens-before relationship would be for Arc::drop and Arc::get_mut to establish it. However, even though Arc::drop does use a release write, Arc::get_mut only performs a relaxed read of the strong counter (via is_unique).
The fix is to use an acquire read in is_unique. I do not expect a significant performance penalty here, since is_unique already contains several release/acquire accesses (of the weak count).
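To illustrate the proposed change, here is a minimal sketch of the uniqueness check on the strong counter (illustrative only; the names mirror the discussion above, not necessarily the exact std source):

    use std::sync::atomic::{AtomicUsize, Ordering};

    // Buggy shape: a relaxed load never synchronizes with the release
    // decrement performed by Arc::drop, so no happens-before edge exists:
    //
    //     strong.load(Ordering::Relaxed) == 1
    //
    // Fixed shape: an acquire load pairs with that release write,
    // ordering everything the dropping thread did before the decrement
    // before our subsequent non-atomic access to the contents.
    fn strong_count_is_one(strong: &AtomicUsize) -> bool {
        strong.load(Ordering::Acquire) == 1
    }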
CC @RalfJung
EDIT
As @RalfJung noted, we do not actually need to leak memory to exploit this bug (hence this is not another instance of the Leakpocalypse). The following piece of code exhibits the same problem:
extern crate rayon;

use std::sync::Arc;

use rayon::join;

fn main() {
    let a1 = Arc::new(0);
    let mut a2 = a1.clone();
    join(
        || {
            // Non-atomic read of the contents, then a release decrement
            // of the strong count when the Arc is dropped.
            let _: u32 = *a1;
            drop(a1);
        },
        || {
            // Spin until get_mut succeeds, then write non-atomically.
            loop {
                match Arc::get_mut(&mut a2) {
                    None => {}
                    Some(m) => {
                        *m = 1u32;
                        return;
                    }
                }
            }
        },
    );
}
Actually, the race also exists without the Mutex: Arc<u32> also triggers the bug, but then this is a race between a read in the first thread and a write in the second thread.
That bug was discovered while trying to formally prove the soundness of Arc in the RustBelt project. Another bug found by formal verification :)
> Actually, the race also exists without the Mutex: Arc<u32> also triggers the bug, but then this is a race between a read in the first thread and a write in the second thread.
I should emphasize that this new version of the bug does not use mem::forget. So this is not another instance of the Leakpocalypse.
> That bug was discovered while trying to formally prove the soundness of Arc in the RustBelt project.
More detail on this: the bug was found while Hai (@hans89), Derek, and I were trying to formally verify Arc. This may be confusing, because I previously claimed that we (including @jhjourdan at the time) had already verified Arc; the answer is that the previous verification assumed sequential consistency, whereas @jhjourdan et al. are now doing the verification with a much weaker memory model.
@jhjourdan Can you update the original issue with the new version that doesn't use mem::forget? Leakpocalypse is what my mind immediately jumped to upon seeing that example.
@bstrie here's the code:
extern crate rayon;

use std::sync::Arc;

use rayon::join;

fn main() {
    let a1 = Arc::new(0);
    let mut a2 = a1.clone();
    join(
        || {
            let _: u32 = *a1;
            drop(a1);
        },
        || {
            loop {
                match Arc::get_mut(&mut a2) {
                    None => {}
                    Some(m) => {
                        *m = 1u32;
                        return;
                    }
                }
            }
        },
    );
}
The read and the write satisfy the C11 definition of a data race: two accesses not ordered by happens-before, at least one of which is a write, and at least one of which is non-atomic.
(It is debatable whether this should be a race because it seems no compiler optimization actually invalidates said code. But that's a different question.)
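To make the missing happens-before edge concrete, here is a distilled, self-contained version of the same pattern using raw atomics (illustrative; Shared and its strong field stand in for Arc's internals):

    use std::cell::UnsafeCell;
    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::Arc;
    use std::thread;

    // Stand-in for ArcInner: a counter plus non-atomically accessed data.
    struct Shared {
        strong: AtomicUsize,
        data: UnsafeCell<u32>,
    }
    unsafe impl Sync for Shared {}

    fn main() {
        let s = Arc::new(Shared {
            strong: AtomicUsize::new(2),
            data: UnsafeCell::new(0),
        });
        let s2 = s.clone();
        let t = thread::spawn(move || {
            let _x = unsafe { *s2.data.get() };        // non-atomic read
            s2.strong.fetch_sub(1, Ordering::Release); // release decrement, as in Arc::drop
        });
        // A relaxed load can observe the count reaching 1 without
        // synchronizing with the release decrement above...
        while s.strong.load(Ordering::Relaxed) != 1 {}
        // ...so this non-atomic write is not ordered after the read:
        // a data race by the C11 definition.
        unsafe { *s.data.get() = 1; }
        t.join().unwrap();
    }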