The before_exec method should be marked unsafe because after a fork from a multi-threaded process, POSIX only allows async-signal-safe functions to be called. Otherwise, the result is undefined behavior.
The only way to enforce this in Rust is to mark the function as unsafe.
So to be clear, this should not compile:
use std::process::Command;
use std::os::unix::process::CommandExt;

fn main() {
    let status = Command::new("sh")
        .arg("-c")
        .arg("echo world")
        .before_exec(|| { println!("hello"); Ok(()) })
        .status().unwrap();
    println!("status: {}", status);
}
Could you clarify the unsafety that can happen here? We chose to stabilize this function on the grounds that we couldn't come up with any memory safety reasons this would cause problems, just deadlocks and such.
It's probably best to treat anything that is undefined according to POSIX as a memory safety violation. Once POSIX says that you cannot assume anything about the process's behavior, you need to assume that memory safety is violated as well.
I'll try to give two examples why this facility is unsafe in practice. I have to argue based on the glibc implementation, but we can hopefully assume that it is a valid implementation, matching the POSIX requirements in play here.
First, assume that you have a PRNG for use in cryptography written in Rust. This PRNG uses locks or thread-local variables to protect its internal state. It does not have fork protection because there is no way to implement that in Rust. This PRNG will return the same sequence of bytes in the before_exec callback and in the parent thread, violating the expectations associated with a PRNG. Maybe this does not count as a memory safety violation, but I think it is still a problem for PRNGs.
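To make this concrete, here is a minimal sketch of how the duplicated state shows up. The "PRNG" here is a stand-in (next_byte and its constant are made up for illustration, not a real generator), and it assumes a toolchain where before_exec is still callable from safe code:

use std::cell::Cell;
use std::process::Command;
use std::os::unix::process::CommandExt;

thread_local! {
    // Stand-in for PRNG state with no fork protection.
    static STATE: Cell<u64> = Cell::new(0x1234_5678);
}

fn next_byte() -> u8 {
    STATE.with(|s| {
        // A trivial LCG step; a real PRNG would be cryptographic, but the fork
        // problem is the same: the state is just memory, and fork duplicates it.
        let x = s.get().wrapping_mul(6364136223846793005).wrapping_add(1);
        s.set(x);
        (x >> 56) as u8
    })
}

fn main() {
    Command::new("true")
        // The forked child sees a copy of the parent's PRNG state...
        .before_exec(|| { eprintln!("child:  {}", next_byte()); Ok(()) })
        .status().unwrap();
    // ...so the parent's next call yields the very same byte the child already used.
    eprintln!("parent: {}", next_byte());
}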
For the second problem, we look at the implementation of recursive mutexes in libpthread. They embed a PID/TID, a small numeric value which gets reused over time. Let's look at this sequence of events:

1. A thread acquires a recursive mutex.
2. Another thread runs an external command with a before_exec callback.
3. The before_exec callback invokes another external command with another before_exec callback.
4. The second before_exec callback spawns a new thread.
5. The TID of that thread happens to match the TID of the thread in step 1.
6. The new thread attempts to acquire the recursive mutex.

The attempt in step 6 will succeed because libpthread incorrectly assumes that the same thread tried to acquire the mutex. But the process was copied while the thread in step 1 was holding the mutex, so the data protected by it may not be consistent, and the thread in steps 5 and 6 may access intermediate results which are not supposed to be visible. This is a thread safety violation, and hence a memory safety violation.
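To illustrate the failure mode, here is a toy model of a TID-keyed recursive lock (not libpthread's actual code) showing where the recycled TID slips through:

use std::sync::atomic::{AtomicU64, Ordering::SeqCst};

// 0 means "unlocked"; otherwise the value is the owner's TID.
pub struct ToyRecursiveMutex {
    owner_tid: AtomicU64,
    depth: AtomicU64,
}

impl ToyRecursiveMutex {
    pub fn try_lock(&self, my_tid: u64) -> bool {
        match self.owner_tid.compare_exchange(0, my_tid, SeqCst, SeqCst) {
            // Uncontended case: we become the owner.
            Ok(_) => { self.depth.store(1, SeqCst); true }
            // "Recursive" case: the owner field matches our TID, so the lock is
            // granted immediately. After a fork, a freshly spawned thread whose TID
            // happens to equal the old owner's TID passes this check even though the
            // real owner only ever existed in the parent process.
            Err(owner) if owner == my_tid => { self.depth.fetch_add(1, SeqCst); true }
            Err(_) => false,
        }
    }
}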
I think the second example at least is quite compelling.
That'd do it!
This is not marked as unsafe today and to do so would be backwards incompatible; however, a quick search on GitHub didn't seem to turn up any code that uses it.
The libs team discussed this a few weeks ago at the last triage, and the conclusion was that we'd prefer to see concrete evidence (e.g. a proof-of-concept) showing the memory unsafety here. At that point we'll consider remediations, but for now we're fine with just adding some documentation that "care should be taken".
In general, anything involving files may not behave as expected.
For example, a concrete memory safety issue would be a memory-mapped allocator that assumes the memory-mapped file can't be accessed by multiple processes (e.g., because the file was created with O_TMPFILE on Linux, or because exclusive access is enforced through some other means, such as by contract).
Other shenanigans include:
Not to denigrate theoretical concerns (which is to say, just because one hasn't yet demonstrated that this violates memory safety doesn't mean that we're going to reject this out-of-hand, and it doesn't mean that it won't eventually become a severe bug somehow), but I second Alex's desire to see a working proof-of-concept in code.
Some prior art
Feels similar to me.
I honestly don't have time to write a convincing PoC, but this breaks all wayland implementations (assuming memory corruption counts as a safety bug) because wayland clients and servers communicate via memory-mapped files. Any attempt (accidental or otherwise) to update a wayland buffer from within a before_exec callback might corrupt the shared buffer. Now, this can be mitigated by taking a global lock and then checking the PID before writing to memory-mapped buffers, but that's not something one wants to do every frame.
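Not a wayland PoC, but a rough sketch of the mechanism, using an anonymous MAP_SHARED mapping as a stand-in for the shared buffer (this assumes the libc crate and a toolchain where before_exec is still safe to call):

use std::process::Command;
use std::os::unix::process::CommandExt;

fn main() {
    // Parent and forked child see the *same* physical pages through this mapping.
    let buf = unsafe {
        libc::mmap(std::ptr::null_mut(), 4096,
                   libc::PROT_READ | libc::PROT_WRITE,
                   libc::MAP_SHARED | libc::MAP_ANONYMOUS, -1, 0)
    } as usize;
    Command::new("true")
        .before_exec(move || {
            // The child scribbles over the buffer the parent believes it owns exclusively.
            unsafe { *(buf as *mut u8) = 1; }
            Ok(())
        })
        .status().unwrap();
    // The parent's "exclusive" buffer has been modified behind its back.
    assert_eq!(unsafe { *(buf as *const u8) }, 1);
}

In the wayland case the unsafe mmap handling is encapsulated inside the library; the caller of before_exec never writes an unsafe block themselves, which is the point of the complaint.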
@steveklabnik
Those aren't quite the same as before_exec (a reason to make an exception). Unlike those cases, this affects a fundamental Rust concept: move/copy semantics. Without before_exec (the ability to fork without unsafe code), it's impossible to (safely) copy a value of a type that doesn't implement Copy. With before_exec, this guarantee is relaxed by allowing copies (in safe code) as long as those copies reside in different processes. This prevents programs from relying on any form of "one-constructor-call -> at-most-one-value" guarantee.
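A minimal sketch of that relaxation (Token and consume are made-up names for illustration, and this again assumes before_exec is callable from safe code): a value meant to be consumed at most once ends up being consumed in both processes.

use std::process::Command;
use std::os::unix::process::CommandExt;
use std::sync::{Arc, Mutex};

struct Token; // neither Copy nor Clone: "one constructor call -> at most one value"

impl Token {
    fn consume(self) {
        println!("token consumed");
    }
}

fn main() {
    let token = Arc::new(Mutex::new(Some(Token)));
    let token2 = Arc::clone(&token);
    Command::new("true")
        // The forked child takes the token out of the Mutex and consumes it...
        .before_exec(move || { token2.lock().unwrap().take().unwrap().consume(); Ok(()) })
        .status().unwrap();
    // ...while the parent still holds its own copy of the whole process state,
    // so the "same" token can be consumed a second time, all in safe code.
    token.lock().unwrap().take().unwrap().consume();
}

The pipe example later in this thread does the same thing to an OS-level resource rather than a plain value.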
As before_exec can be used to emulate fork, these threads also apply:
@alexcrichton I like to see all soundness issues with an associated priority tag, does your comment above imply that this is considered P-low? Alternatively, if you don't consider this a soundness issue at all, would you like to remove the I-unsound label?
Ah yes the libs team decided this was basically P-medium, so I will tag as such.
Even if there were no way to trigger memory unsafety today, at any time a change to libpthreads or libc or the operating system in the future could cause memory unsafety where there was previously none. Accordingly, I don't think blocking the change on a PoC makes sense, and further I think requiring somebody to make a PoC before the change is made would be a waste of those people's time. Any time we trigger undefined behavior, memory unsafety should just be assumed.
BTW, has anyone considered that on macOS, if an application is using libdispatch, it must not call any libdispatch code after fork? I don't have a PoC now, but I think it wouldn't be difficult to create one.
I concur with the arguments here that this is a different beast than remove_var and set_var. Thus far we have chosen to always interpret specified undefined behavior as memory unsafety, demonstrable or otherwise. It would be a unique exception to our policy of "all memory unsafety may be traced back to an unsafe block" to add "(...unless using std::os::unix::process::CommandExt::before_exec)". The longer we wait, the more painful this will be. I propose that we begin emitting a warning ASAP with a plan to mark this unsafe at a specified future date.
With some OS help one can always cause memory unsafety:
#![feature(getpid)]

fn hacky_ptr_write<T>(ptr: &T, value: u32) {
    std::process::Command::new("gdb").arg("-batch").arg("-quiet")
        .arg("-pid").arg(format!("{}", std::process::id())).arg("-ex")
        .arg(format!("set {{unsigned long}}{:p}={}", ptr, value))
        .output().unwrap();
}

fn main() {
    let q = &mut Box::new(55);
    hacky_ptr_write(&q, 0);
    *q = Box::new(44); // Segmentation fault
}
But this does not mean Command should be unsafe.
@vi
The problem here is that the type-based assumption that a non-Copy type can't be copied is broken (even though the copies end up in different processes). Any code relying on this assumption is potentially incorrect and any unsafe code relying on it is potentially memory unsafe. However, given the "different process" constraint, this bug necessarily involves the operating system.
Yes, you can usually shoot yourself in the foot using the OS†. However, you generally have to explicitly request this behavior. In this case, you can take two apparently safe and independent APIs (e.g., wayland and before_exec) and combine them to corrupt memory.
†It's actually harder than you might think as long as you aren't root. Most modern Linux distros, at least, don't allow non-root processes to debug (or access/modify the memory of) non-child processes.
I expect there to be some unsafe{} inside that Wayland code. Can one reproduce the unsafety without using non-std unsafe{} code?

Does the question boil down to "Should every unsafe{} block also expect forks, or should the fork itself just be unsafe"?
non-Copy type can't be copied
During fork() I expect it to stay on the same virtual address, so that it is as if not copied. !Copy from Rust perspective does not mean non-copyable from OS kernel perspective.
I expect there to be some unsafe{} inside that Wayland code.
Yes. It memory maps a shared buffer.
Can one reproduce the unsafety without using non-std unsafe{} code?
Not unless you were to move the shared mmap functionality from wayland into std. It's not really about where the code lives but what are valid assumptions and what are not (although any assumptions made by std are assumed to be valid).
Does the question boil down to "Should every unsafe{} block also expect forks or should the fork itself just be be unsafe"?
Pretty much, yes. If we do say "this API is fine", we should also provide a fork function (the same way we made mem::forget safe).
During fork() I expect it to stay on the same virtual address, so that it is as if not copied. !Copy from Rust perspective does not mean non-copyable from OS kernel perspective.
From an outside perspective, the value is copied and:
Will be dropped twice.
Will share any os-level objects (e.g. file descriptors).
Just what is typically needed. Consider a C program:
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>

int main() {
    int fd = open("myfile.txt", O_RDONLY);
    void* mem = malloc(300);
    if (fork()) {
        close(fd); // first "drop" of fd
        free(mem); // first "drop" of mem
    } else {
        close(fd); // second "drop" of fd
        free(mem); // second "drop" of mem
    }
}
There are two frees, but it's not that "double free" error. Likewise, file descriptors should be closed twice after a fork.
Unsafety of fork may be like unsafety of FromRawFd - not directly messing with memory, just confusing other parts of the system with surprises, not causing unsafety from within Rust, but causing the operating system to behave unsafely for the Rust program. And it's workaroundable by std::fs::File::open("/proc/self/fd/0") or something like that.
The relationship between processes after fork() is IPC. Shared memory buffers are also a sort of IPC. Triggering a program crash by using IPC is unsafety of the program...

What exactly happens in Wayland if an evil peer (or even an evil Wayland server) deliberately corrupts our buffer? Shouldn't a Rust program distrust memory-mapped buffers that are writable from outside anyway? I expect storing pointers inside a memory-mapped file is unsafe anyways. And if no pointers get stored there, then it would be wrong, but safe, behaviour.
Not unless you were to move the shared mmap functionality from wayland into std.
What function? map? But it's unsafe anyways. Also, it gives you [u8], and those bytes can jump around unpredictably because it's IPC and somebody may be interfering. Storing safety-important things in that [u8] is your decision, and that's where the wrong assumption ("when I get a mapping, it should be safe to work with") kicks in.
There's also this comment from the implementation of Command:
// Currently we try hard to ensure that the call to `.exec()` doesn't
// actually allocate any memory. While many platforms try to ensure that
// memory allocation works after a fork in a multithreaded process, it's
// been observed to be buggy and somewhat unreliable, so we do our best to
// just not do it at all!
If this is true it seems pretty clear that this function should be considered unsafe. If it's not true then there's a whole lot of unnecessary and complex unsafe code in the standard library.
Here is an example of an API that successfully maintains an invariant (on an OS pipe), but that gets broken by before_exec:
// cargo-deps: os_pipe
extern crate os_pipe;

mod use_once_pipe {
    //! A rather strange module that creates a pipe and makes sure that
    //! we only ever send a single byte on that pipe.
    use std::io::{Read, Write};
    use os_pipe::{pipe, PipeReader, PipeWriter};

    pub struct Reader(PipeReader);
    impl Reader {
        /// Will block until `write` gets called!
        pub fn read(mut self) -> Option<u8> {
            let mut res = vec![];
            self.0.read_to_end(&mut res).unwrap();
            assert!(res.len() <= 1, "Our contract got violated! Read {:?}.", res);
            res.get(0).map(|b| *b)
        }
    }

    pub struct Writer(PipeWriter);
    impl Writer {
        pub fn write(mut self, byte: u8) {
            self.0.write(&[byte]).unwrap();
        }
    }

    pub fn create() -> (Reader, Writer) {
        let (reader, writer) = pipe().unwrap();
        (Reader(reader), Writer(writer))
    }
}

use std::process::Command;
use std::os::unix::process::CommandExt;
use std::sync::{Mutex, Arc};

fn main() {
    // Small demonstration of use_once_pipe.
    {
        let (r, w) = use_once_pipe::create();
        w.write(42);
        //w.write(23); // not possible because `write` consumes `self`
        assert_eq!(r.read(), Some(42));
    }
    // Now let's break it.
    {
        let (r, w) = use_once_pipe::create();
        let w = Arc::new(Mutex::new(Some(w)));
        let w2 = Arc::clone(&w);
        Command::new("true")
            .before_exec(move || Ok(w2.lock().unwrap().take().unwrap().write(42)))
            .status().unwrap();
        w.lock().unwrap().take().unwrap().write(23);
        r.read();
    }
}
When I run this, I get
thread 'main' panicked at 'Our contract got violated! Read [42, 23].', before_exec.rs:17:13
note: Run with `RUST_BACKTRACE=1` for a backtrace.
@alexcrichton If I replace that assert!(res.len() <= 1) by something that causes UB, you have an example of before_exec introducing memory unsafety.
@RalfJung that seems akin to saying File::open is unsafe because you can File::open("/proc/self/mem") and arbitrarily read/write memory. I don't think I'd personally agree that example shows any memory unsafety.
@alexcrichton I'm afraid I disagree. Strongly. I think these cases are miles apart. I am also somewhat disappointed that we are now playing these definitional games where I come up with counterexamples and you just define them away. :/
The reasons I see a huge difference between /proc/self/mem and this case are as follows:

- /proc/self/mem is a somewhat special platform-specific API. It works on Linux only. Admittedly other platforms likely have something similar, but this is still in strong contrast to my example, which relies entirely on standard POSIX interfaces.
- Defending against /proc/self/mem would essentially entail making File::open unsafe, which is a very commonly used API, and the vast majority of ways to use it are perfectly safe. That is hardly realistic. (If there was a good way to detect if the file we open is such a strange all-of-memory file, and we could make File::open reject opening such files, I'd propose that immediately.) In contrast, defending against this bug would just mean making before_exec unsafe, which is perfectly possible I would argue (and which nobody has argued against).
- Most importantly though, I think we should consider the trade-off here. We can make either before_exec safe, and declare everything unsafe that would be broken by it, or vice versa. There is no "not taking a decision": if we change nothing, we decide for the former. (After all, this is something we can realistically fix, unlike /proc/self/mem.) So on the one hand, we have a single rather obscure Unix-specific method doing something that is generally considered very dangerous when programming against POSIX (namely, executing arbitrary code between fork and exec), and on the other hand, we have every single abstraction relying on having unique control over a file descriptor. Whether this be some network socket that wants to make sure a certain signal is only sent to another host once, or a serialization/deserialization library used across a pipe because there might or might not be a process boundary while also maintaining uniqueness of some type instances, pretty much anything that uses Rust's ownership to enforce a discipline on external data could be broken by this.
I usually go around telling people that what makes Rust great is not just that it tracks ownership of memory for you, but that you can use the ownership facilities to track ownership of other, user-defined resources that Rust has no idea about, providing safe interfaces where that would not be possible with a less expressive type system and avoiding correctness errors that would otherwise be hard to detect statically. One of my favorite examples is a software I once started to write (unfortunately never got finished) that interacts with some external sensors, actuators and LEDs and where I used ownership to make sure that every LED has exactly one party in charge, so that we do not get strange flickering because two threads decide they will use the same LED to signal their status. Isn't it great how many places there are where ownership helps us provide better guarantees, or any guarantees at all?
It seems like in the future, every time I say that, I will have to add "oh, and you better make sure that your program does not call before_exec, because that can be used to break all of these guarantees". before_exec is essentially "fork in safe code", and fork duplicates the entire process state, putting it at inherent conflict with any reasoning about uniqueness. Making before_exec safe sends the message that yes, you can use Rust to track ownership within your process, but if you actually want to do something with the outside world, you had better not rely on it. Of course this won't stop anyone; people will still write these abstractions because nobody will think of before_exec, and then their code will have one entirely unnecessary subtle soundness hole. That is so sad :(
I am adding the lang team, because this concerns fundamental limits of safe abstractions and unsafe code.
FWIW, I find @RalfJung's arguments quite persuasive.
One of my favorite examples is a software I once started to write (unfortunately never got finished) that interacts with some external sensors, actuators and LEDs and where I used ownership to make sure that every LED has exactly one party in charge, so that we do not get strange flickering because two threads decide they will use the same LED to signal their status.
My understanding is that @japaric actually implemented this in his RTFM project. I also had the same idea before knowing about it. I find it strange to break this idea just to keep some obscure function safe.
@RalfJung you're talking specifically about file descriptors and inheritance, though, which is a system-level concern that we really can't provide many guarantees about.
Let's say that we did want to provide a guarantee that you can control file descriptors and guarantee unique access to file descriptors. That has consequences, like declaring more of these system-level operations unsafe. This can have a huge impact we don't at all have a handle on, as there are likely a good deal of abstractions in the wild that rely on system behavior one way or another, and they would likely need close analysis if we want to say "all Rust programs need to provide a new guarantee by default".

I also think it's disingenuous to paint this as a picture of "oh the bad person is trying to destroy Rust's guarantees by forcing this one obscure thing to be safe". The actual situation is that we have a function added to the standard library long ago as safe that has been stable for quite some time. We had many debates about whether to make it unsafe during stabilization, and concluded we could find no memory-unsafety aspect and hence left it safe. We're not trying to act maliciously and break the community; we're doing the best we can (as everyone else is) and want Rust to be just as safe as everyone else does.
I also think it's disingenous to paint this as a picture of "oh the bad person is trying to destroy Rust's guarantees by forcing this one obscure thing to be safe".
I overreacted. I am sorry.
we have a function added to the standard library long ago as safe and has been stable for quite some time. We had many debates about whether to make it unsafe during stabilization, and concluded we could find no memory unsafety aspect and hence left it safe
I was not aware of discussion happening before stabilization, it did not come up in this thread so far (or I missed it). Is it possible to read this somewhere?
there are likely a good deal of abstractions in the wild that rely on system behavior one way or another, and would likely need close analysis if we want to say "all Rust programs need to provide a new guarantee by default".
This goes both ways though: there are likely also a good deal of abstractions that rely on Rust's ownership and non-duplication guarantees across process boundaries, and those would need close analysis if we want to say "Rust's guarantees entirely end at the process boundary". That does seem like a surprising limitation.
You're talking specifically about file descriptors and inheritance, though, which is a system-level concern that we really can't provide many guarantees about.
I realized file descriptors are a red herring. The entire process state gets duplicated, affecting all external communication. Here is another example, this time no file descriptors get duplicated (just as a sketch this time):
Let's say we have a type Led that represents a single LED:

struct Led(usize); // the usize is some GPIO pin or I²C address or so

impl Led {
    fn on(&mut self) {
        // send GPIO/I²C messages to turn on LED
    }
    fn off(&mut self) {
        // send GPIO/I²C messages to turn off LED
    }
}
The purpose is to make sure that no two parties think they both exclusively control the LED, and confuse the user. (This might be an actuator controlling a servo if you want something where the failure mode is physical destruction. But that's a bit too dramatic even for my taste. ;)
But now I can again do:
fn foo(led: Led) {
    let m1 = Arc::new(Mutex::new(Some(led)));
    let m2 = Arc::clone(&m1);
    Command::new("true")
        .before_exec(move || Ok(m2.lock().unwrap().take().unwrap().on()))
        .status().unwrap();
    m1.lock().unwrap().take().unwrap().off();
}
The on and the off race here, subverting the guarantee that Led otherwise manages to uphold. No file descriptors get duplicated.
That's why I said this is really about resources external to the process: If we allow safely duplicating the entire process state, we cannot use ownership in Rust to control access to resources that live outside the current process.
And it doesn't have to be embedded either. Another situation I can imagine is a network library using some non-Copy type to make sure some message (a reply or so) is only sent once. With a safe fork, I can send the reply twice. Or a crypto library that has some token to let you use something only once (a pre-generated nonce for some operation, or so) could now have that token be used twice. Basically, the moment the thing to protect against is any form of communication with the outside world, a safe fork breaks any safe API you might otherwise come up with in Rust.
Every platform other than Linux is now "memory unsafe" by default because of non-atomic CLOEXEC. We do our best to atomically create file descriptors and set CLOEXEC, but that isn't always possible, so handles to file descriptors can leak into child processes. Once they're leaked, you no longer have a static guarantee that only your process has access to them.
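For anyone unfamiliar with the race being described, the non-atomic pattern looks roughly like this (a sketch using the libc crate directly; error handling omitted):

use std::ffi::CString;

fn open_cloexec(path: &str) -> libc::c_int {
    let p = CString::new(path).unwrap();
    unsafe {
        let fd = libc::open(p.as_ptr(), libc::O_RDONLY);
        // If another thread forks and execs right here, `fd` has already been
        // created without FD_CLOEXEC and silently leaks into the child process.
        libc::fcntl(fd, libc::F_SETFD, libc::FD_CLOEXEC);
        fd
    }
}

With O_CLOEXEC (or SOCK_CLOEXEC, pipe2, and friends) the flag is set atomically at creation time, which is exactly the part that is missing on some platforms.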
I had no idea non-Linux unixes do not have an atomic CLOEXEC. That is sad :(

However, since creating a File from a file descriptor is unsafe, the new process cannot actually do anything with those extra file descriptors that it got leaked, right? Also, CLOEXEC changes behavior on exec, not fork. So I do not see how it breaks my pipe example.
Would it be possible to acquire a lock across opening a file and setting CLOEXEC, and take the same lock across a fork? Then we could be sure that in the forked-off process, all file descriptors (well, at least the ones we have any say about) have CLOEXEC set. I do not know if such a lock would be prohibitive in terms of performance, though.
To me at least @RalfJung has made compelling arguments for why before_exec should be unsafe, so I think we should at the very least try a crater run with before_exec as unsafe fn and see what the extent of the breakage is, and then depending on the outcome change it outright or use a compatibility-warning period to ease crates into this.
What safety warnings should it carry then? Shall the nix crate's fork also be unsafe then?
fork should definitely be consistent with before_exec.
I can prepare a PR that marks this unsafe, so that we can crater it to get some more data.
@Centril Is this something that should be nominated for lang and libs team discussion? There'd be an FCP anyway if we decide to go with it.
Actually we're not even getting the compiler built... jobserver fails:
error[E0133]: call to unsafe function is unsafe and requires unsafe function or block
--> /home/r/.cargo/registry/src/github.com-1ecc6299db9ec823/jobserver-0.1.11/src/lib.rs:566:13
|
566 | / cmd.before_exec(move || {
567 | | set_cloexec(read, false)?;
568 | | set_cloexec(write, false)?;
569 | | Ok(())
570 | | });
| |______________^ call to unsafe function
|
= note: consult the function's documentation for information on how to avoid undefined behavior
@RalfJung It would be nicer to have some data re. breakage to go on before discussing actions to take in-meeting.
Seems like I have to submit a PR to jobserver then to add a currently-unnecessary unsafe block so that we can proceed? Or is there a better way?
Nominating for discussion on next T-lang meeting which is on the 29th.
If @RalfJung and @alexcrichton can participate that'd be good.
We can discuss this in a broader context, but in no way can we switch before_exec to being an unsafe function. The only way forward for this, if we decide to implement it, will be to deprecate the existing method and figure out a new name.
I haven't had time to respond to @RalfJung's previous comment, nor has practically anyone else from the libs team weighed in here.
I'm not personally thrilled about debating the minutiae here myself, and on skimming this thread awhile back, it's already explained that POSIX basically defines arbitrary code after fork as UB. Regardless of anything else, that's likely enough motivation to start deprecation/renaming.
@alexcrichton
If we don't make before_exec into an unsafe fn, then the soundness hole remains in the type system -- deprecation is not enough imo. I think it's entirely within our rights per RFC 1122 to make this change, especially given that the impact seems to be quite nano-sized in terms of root regressions, in a way that does not affect their public APIs (I'd like to run crater to be sure, but the sourcegraph query seems promising). It's even more nano-sized than I initially thought; the only crate that sourcegraph found a regression in was jobserver, because the mention in duct was in a comment.
Is fork(2) supposed to be usable in single-threaded with-std Rust programs by itself, not as part of a fork-exec combo?

Maybe there should be two forks: one that is safe but checks that we are currently single-threaded, and one that is unsafe and just forks?
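A rough sketch of what the "safe only if single-threaded" variant could look like (Linux-only, using /proc/self/task and the libc crate; the check is inherently best-effort, and the function name is made up):

use std::io;

pub fn fork_if_single_threaded() -> io::Result<libc::pid_t> {
    // On Linux, each thread of the current process shows up as an entry here.
    let threads = std::fs::read_dir("/proc/self/task")?.count();
    if threads != 1 {
        return Err(io::Error::new(io::ErrorKind::Other, "process is multi-threaded"));
    }
    let pid = unsafe { libc::fork() };
    if pid < 0 {
        Err(io::Error::last_os_error())
    } else {
        Ok(pid)
    }
}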
@Centril no, we are not within our rights with that RFC to break the entire ecosystem in one go. We allow breaking changes that have minimal impact, and I'm pretty certain this does not have minimal impact.
Here's a small list of crates we're breaking with this sort of change that I've found locally:
nix - every historical version is now declared unsafe
jobserver
alacritty
All of those are relatively high profile candidates, meaning this is not a trivial "just flag it as unsafe and go ahead".
Crater can provide better visibility here, and using that does not require merging changes to jobserver to figure it out.
using that does not require merging changes to jobserver to figure it out.
How can I get a crater run if rustc doesn't build because it depends on jobserver?
We chose to stabilize this function on the grounds that we couldn't come up with any memory safety reasons this would cause problems, just deadlocks and such.
I think this approach should be re-evaluated going forwards. Whilst "being able to directly cause memory unsafety" should of course imply that a function should be unsafe, I don't think the inability to do that is reason that a function should necessarily be stabilised as a safe function.
Firstly, there are features which are independently safe, but cause unsafety when used together. With the above approach, this will end up being decided just by whichever feature gets there first, whereas it would be better to make such choices explicitly, i.e. we want X to be safe at the cost of Y being unsafe.
Secondly, it may be hard (as is the case here) to properly evaluate whether something is indeed safe or not, and stabilising it as an unsafe function would be more conservative. Obviously we want to avoid doing this too, but in certain cases where there is reasonable doubt (there was a lot of evidence at the time that there might be memory unsafety here), and particularly on platform-specific and infrequently used functionality, either delaying stabilisation or stabilising it as unsafe would be preferable.
@alexcrichton
@Centril no, we are not within our rights with that RFC to break the entire ecosystem in one go. We allow breaking changes that have minimal impact, which I know pretty certain this does not have minimal impact.
I think we should measure the impact on the ecosystem, but given the results from sourcegraph it strikes me as hyperbole to state that this would "break __the entire ecosystem__ in one go" (emphasis mine).
Twice I have read RFC 1122 now and I cannot find any paragraph nor sentence which states that breakage due to soundness fixes has to be minimal. The RFC mainly talks about mitigation strategies depending on impact, severity, exploitability, etc. If you think I'm wrong about this please show me why.
A migration strategy consisting of changing key crates beforehand, plus a compatibility warning over say 2 release cycles that tells users to wrap .before_exec(..) in unsafe { ... }, is both simple and a mechanism where affected users have a clear recourse.
Here's a small list of crates we're breaking with this sort of change that I've found locally
nix - every historical version is now declared unsafe
Well, historical versions yes; we can at most go and introduce new patch versions 0.8.2, 0.9.1, 0.10.1, and 0.11.1, but we cannot do anything more than that, and particularly we cannot deal with crates that have pinned versions of nix other than with warnings. Furthermore, I checked nix upstream and did not find any mentions of before_exec in the code at all.
jobserver

As I noted before, this has one case of before_exec.
alacritty

This has 2 mentions of before_exec: https://github.com/jwilm/alacritty/search?q=before_exec&unscoped_q=before_exec. The crate has zero reverse dependencies on crates.io though, so the fix is fairly local.
All of those are relatively high profile candidates, meaning this is not a trivial "just flag it as unsafe and go ahead".
OK; so we take a more gradual approach with a lint / warning stating that it should be wrapped in unsafe { .. }, and in 2-3 months we change it to unsafe fn. I don't have a problem with waiting a bit.
@RalfJung you can use [patch] to depend on a git repository for a try build.
@Centril I'm not really interested in hyper-analyzing all the use cases here. Crater is required for a change like this anyway, so I'm not going to deal with speculating what crater says until it's run.
What would cause a breakage in Nix? I can't find any uses of before_exec in any version of Nix going back to 0.1.2. As an aside, Nix _did_ have a very similar problem in its tests, caused not by using before_exec but by calling fork directly and running code before exec. As the OP notes, we had to fix it by only using async-signal-safe functions in those situations.
Furthermore, I checked nix upstream and did not find any mentions of before_exec in the code at all.
The nix crate just exposes fork (the source of before_exec's problems) itself as a safe function. The documentation does not mention any memory crash-worthy issues:
Safety
In a multithreaded program, only async-signal-safe functions like pause and _exit may be called by the child (the parent isn't restricted). Note that memory allocation may not be async-signal-safe and thus must be prevented.
Those functions are only a small subset of your operating system's API, so special care must be taken to only invoke code you can control and audit.
I'm not sure how "async-signal-safe" and Rust's "safe" are connected.
Whilst "being able to directly cause memory unsafety" should of course imply that a function should be unsafe, I don't think the inability to do that is reason that a function should necessarily be stabilised as a safe function.
As far as I remember, there is some notion against marking functions as unsafe just because they feel unsafe, or because they are bad for security, or to deter users from using them. Each unsafe function's documentation should ideally contain the proof of unsafety and concrete instructions on how to use it without causing UB.

Examples of such "unsafe-esque" safe functions are mem::forget and TlsConnectionBuilder::danger_accept_invalid_certs.
As far as I remember, there is some notion against marking functions unsafe "just because of it feel unsafe" or because of it is bad for security or to deter users from using it. Each unsafe function's documentation should ideally contain the proof of unsafety and concrete instructions of how to use it without causing UB.
This is the wrong way round: functions which call unsafe functions but are themselves safe should have a "proof" (not necessarily formal, but at least some reasoning) of why they are safe. Not just "we couldn't find any unsafety".
@vi
nix crate just exposes fork (the source of before_exec's problems) itself as a safe function. Documentation does not mention any memory crash-worthy issues:

Making before_exec an unsafe fn doesn't directly impact nix as a matter of semver, due to fork being safe in the crate; it does however indicate that nix might be doing something fishy. @RalfJung knows more about this subject than me, so I'll leave that to him.
There is definitely a policy against marking functions that are known to be sound but otherwise dangerous as unsafe. It's also true that functions exposed as safe should be proven (or, in the absence of a formalism for all of Rust in which such a proof can be carried out, convincingly argued) to be sound. Neither quite matches what @Diggsey is talking about: no proof that it's sound and no non-UB footguns motivating unsafe. I don't think there is an existing policy about that.
you can use [patch] to depend on a git repository for a try build.
I've never used [patch] before, will see what I can do.
there is some notion against marking functions as unsafe just because of it feels unsafe or because of it is bad for security or to deter users from using it.
True. However:
there are features which are independently safe, but cause unsafety when used together
Namely, my pipe example (helpfully folded away by GitHub) is safe in a universe without a safe fork, and fork is safe in a universe that does not allow my pipe example. Making fork (and hence before_exec) unsafe would not just be a "lint" because it is "bad for security"; it would actually come with a proper proof obligation for the caller: you must not use this to "use up" or even touch the same exclusive resource in both parent and child process. (Basically, shared references are okay, mutable references are not. My pipe example acquires a mutable reference from a shared reference through Mutex::lock; that is the unsound part.)

I do not think there can be a general policy about what to do in such cases; we must decide on a case-by-case basis.
Making before_exec an unsafe fn doesn't directly impact nix as a matter of semver due to fork being safe in the crate; it does however indicate that nix might be doing something fishy.
If we decide before_exec to be unsafe, that can only go together with deciding that we make some amount of guarantees for using ownership to track out-of-process resources -- promising that things like my pipe example are sound. Hence the justification for fork to be safe (which contains an unsafe block, so it must manually justify its safety) would no longer hold.

So, yes, this would immediately imply that a safe fork is a soundness hole in a library.
@RalfJung
If we decide before_exec to be unsafe, that can only go together with deciding that we make some amount of guarantees for using ownership to track out-of-process resources -- promising that things like my pipe example are sound. Hence the justification for fork to be safe (which contains an unsafe block, so it must manually justify its safety) would no longer hold. So, yes, this would immediately imply that a safe fork is a soundness hole in a library.
Oh sure, the crate would have to change eventually; but it does not imply that the code would break immediately in the sense that cargo check no longer works, which is what I was trying to communicate (poorly!). :)
pipe example is safe in a universe without a safe fork
The example looks like it trusts sockets to return some sane data. I'm not sure if OS sockets and other objects (apart from the memory manager) are in scope of Rust's safety rules. from_raw_fd being unsafe may set a precedent that file descriptors are in scope of the safety rules, although there are workarounds (like /proc/self/fd/... or /proc/self/mem), so maybe there is some consensus that tricky file paths are not in scope of the safety rules.

Another tricky point is shared memory. Is all shared memory inherently unpredictable (i.e. one can't store any references there or rely on any invariants about the content of shared memory), or are safe facades OK and tools that break those safe facades unsafe?

Would a hypothetical new OS function that adjusts the monotonic clock, allowing it to jump back in time (which is currently documented not to happen), be safe or unsafe according to Rust?

There should be some explicit policy of which operating system features are in scope of the safety rules and which are not. For each feature in scope, there should be a description of what invariants unsafe-using code shall preserve (e.g. "don't close or duplicate file descriptors owned by Rust code").
So crater found 32 root regressions. Looks like if we decide we want to take a stance for tracking ownership of external resources, we'd have to go the gradual-deprecation route.
Thanks for gathering the data @RalfJung! This is a nominated issue for the next T-libs meeting where we can try to discuss and discover a new name here. Do others have suggestions for what to call this method alternatively? Some ideas I might have are:
before_exec2
pre_exec
after_fork
Nominating for discussion on next T-lang meeting which is on the 29th.
If @RalfJung and @alexcrichton can participate that'd be good.
Ralf couldn't participate on the 29th.
My preference is to deprecate the method temporarily with:
#[deprecated =
"This method will become unsafe because $reason.\
Check whether what you are doing is safe and then wrap the call in `unsafe { ... }`."]
And then we can let, say, 3-6 months pass while we file fixing PRs; after that we make before_exec into unsafe fn.
I don't mind having an extra method, but I would like before_exec to eventually become unsafe fn.

As for naming, all of the mentioned candidates seem fine; another potential name could be before_exec_unchecked.
This was discussed briefly with the libs team during today's triage, and there was no objection to moving forward with a deprecation. The leading contender for a name was before_exec2, which clearly indicates it's "basically the same" as the prior method modulo a few small details (in this case the safety).
Moving this issue forward largely just needs someone to champion the PR and implementation to push it through!
Not my call to make but I really do not like the idea of just incrementing the count on methods. This made a lot of code in Python just very ugly to begin with.
If you want to find some better names, here are similar APIs in other languages: the function with the same behavior in Python is called preexec_fn for subprocesses; in Boost it's called on_exec_setup. In glib the closest equivalent is GSpawnChildSetupFunc (setup_func).
If editions could be used to solve this in the future, that would be a huge benefit. (I realise this is difficult given that libstd has to only be compiled once, but perhaps there is a way to attach some metadata to the function signature so that the compiler can do the mapping).
eg. an attribute like:
#[rename("before_exec", edition=2018)]
fn before_exec2(...)
@Diggsey ostensibly you could do that with "edition visibility", e.g. pub(>= 2018) and pub(< 2018) as an unstable mechanism. I'd like to use some mechanism like that to say e.g. pub(< 2021) fn uninitialized<T>() -> T.
The leading contender for a name was before_exec2 which clearly indicates it's "basically the same" as the prior method modulo a few small details (in this case the safety).
This makes sense for someone coming from today's Rust, but in a future where before_exec is not used at all any more, many people will be confused by what the 2 means. Basically, before_exec2 means we will always have to explain the name as "successor of a now-deprecated function".
We can have "soft unsafe" operations which only produce a (deny by default) lint that states that one should be wrapping it in unsafe
, but also mention that the only reason it's a lint is because of backwards compatibility.
Implementing such a scheme in the unsafe checker is quite simple. I am volunteering to write a PR if such a change is desired (or if we just want to see how such a change would look).
"soft unsafe"
Like what happens when taking a reference of a field of a repr(packed) struct?
I think there's definitely enough valid pushback to not call it before_exec2! I like pre_exec or maybe on_exec_setup myself perhaps? (I don't feel too strongly one way or another.)
@oli-obk I think I'd personally prefer to see a different function, because there's also the matter of creating a function pointer to this method, which today is safe and creates a safe function pointer, but which afterwards would ideally need to be safe and create an unsafe function pointer, and would in practice have to unsafely create a safe function pointer.
The language team agreed that before_exec should be unsafe, and leaves the details of a transition plan to the libs team.

Speaking only for myself: I like pre_exec.
I do feel like there is a meta question lurking here of unsafe composition and how we should manage it. For now, it seems good to err on the side of caution where possible, though, and avoid thorny questions -- but -- as has been amply demonstrated here -- it's sort of hard to tell what the limits ought to be on what unsafe code can and cannot do when it comes to e.g. external resources. I wonder if at some point we're going to want to try and allow unsafely implemented libraries to declare more precisely the kinds of things they rely on other libraries not to do.
Here is a simple example showing memory unsafety in the presence of fork:
#![feature(thread_id_value)]
use std::sync::atomic::{AtomicU64, AtomicPtr};
use std::sync::atomic::Ordering::SeqCst;
use std::{thread, ptr};

pub fn f() {
    static I: AtomicU64 = AtomicU64::new(0);
    static D: AtomicPtr<i32> = AtomicPtr::new(ptr::null_mut());
    let data = Box::leak(Box::new(0));
    let tid = thread::current().id().as_u64().get();
    match I.load(SeqCst) {
        0 => {
            I.store(tid, SeqCst);
            /*
             * Assumption for safety: No call to `f` can have the same `tid` while we're
             * in the critical section that spans this comment.
             */
            D.store(data, SeqCst);
        },
        n if n == tid => {
            /* D has been set to a valid pointer because of the assumption above */
            let _v = unsafe { *D.load(SeqCst) };
        },
        _ => { },
    }
}