This is a tracking issue for the RFC "unsafe blocks in unsafe fn" (rust-lang/rfcs#2585).
The feature gate for the issue is `#![feature(unsafe_block_in_unsafe_fn)]`.
Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is, however, not meant for large-scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.
Open question: do we want `cargo fix` to be able to do something about this warning before making it even warn-by-default? And how precise should it be? cc @RalfJung
A relevant note towards the implementation:
I'll try to work on the implementation!
If I understand https://boats.gitlab.io/blog/post/ringbahn/ correctly, that API would also benefit greatly from not treating the body of an `unsafe fn` as `unsafe {}`: there is a trait with an `unsafe fn`, and

> It’s important to understand what an unsafe method means: unsafe methods are unsafe to call - meaning that callers must guarantee additional invariants when calling them. However, they are safe to implement (as long as you don’t do anything unsafe inside of them). So when a user implements the Event trait, they are given additional guarantees about how this method will be called that normally they could not assume.
@withoutboats did I understand this correctly?
@RalfJung I don't think it's a particularly new situation. Basically, the guarantees you are getting from it being unsafe are the guarantees you need to be able to call these unsafe methods on the `SubmissionQueueEvent` that is passed as an argument.
In other words, an `Event` implementation is probably going to use one unsafe operation (calling one of these methods), which it can prove is correct based on the invariant that the trait requires the caller to maintain. This isn't really different from any other unsafe code.
What I'm trying to say with the blog post is that the Event trait hasn't introduced a new invariant for you to maintain; it's introduced a new invariant that I maintain, so it's easier for you to call that unsafe method you need to call to get any work done.
This is just how unsafe trait methods work in general, so it's nothing new. But I agree that unsafe trait methods (where the invariant contract is specified separately from the implementer) are a compelling example of where unsafe blocks in unsafe fns seem beneficial.
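The call-side/implementation-side split described above can be sketched concretely. The trait and method names below are hypothetical stand-ins, not ringbahn's actual API:

```rust
// Hypothetical sketch: an `unsafe fn` in a trait is unsafe to *call*
// (callers promise extra invariants) but safe to *implement*.
trait Event {
    /// # Safety
    /// The caller guarantees `buf` remains valid for `len` bytes
    /// until the event completes.
    unsafe fn prepare(&mut self, buf: *const u8, len: usize);
}

struct NopEvent;

impl Event for NopEvent {
    unsafe fn prepare(&mut self, _buf: *const u8, _len: usize) {
        // No unsafe operation in this body: the implementation merely
        // *receives* the caller's guarantee and happens not to need it.
    }
}

fn main() {
    let data = [0u8; 4];
    let mut event = NopEvent;
    // SAFETY: `data` outlives this (no-op) call.
    unsafe { event.prepare(data.as_ptr(), data.len()) };
}
```

Under today's rules the whole body of `prepare` is an implicit unsafe block, even though nothing in it is unsafe; that is exactly the asymmetry this issue is about.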
> What I'm trying to say with the blog post is that the Event trait hasn't introduced a new invariant for you to maintain; it's introduced a new invariant that I maintain, so it's easier for you to call that unsafe method you need to call to get any work done.
Is there a realistic safe implementation of that method that ignores the extra invariant provided by the caller, or is the expectation that most/all implementations will have to do something `unsafe` justified by that invariant?
Yeah, I believe some kinds of events should be safe to prepare, because they only send `Copy` data into the kernel and so they don't need to worry about any kind of memory safety invariant.
Step 1 can now be checked since #71862 landed.
I want to give an experience report. I've been writing a lot of unsafe code recently with ringbahn, and I've been thinking about how this change would have impacted my experience and the correctness of my code. Unfortunately, my thoughts are not optimistic.
My experience makes me think a big downside of this change has been overlooked, and that the upside is not so great. The goals of this change I think would be better served by a separate linting infrastructure.
First, I think this change would dramatically impact user flow when writing new and unfinished unsafe code. During the prototype, experiment, build-out phase of creating new code, I am often unsure where exactly the function boundaries should go, what code will be reusable, what code should be abstracted, what code needs to be inlined, etc. As a result, I am frequently moving code in and out of functions. By marking those functions unsafe as I construct them, the code remains in an unsafe scope right now. But this change would require me to set up new unsafe scopes as I make these sorts of edits, greatly increasing the ceremony. Similar to the arguments I've made in favor of Ok-wrapping, we should minimize the disruption to the fluidity of creating new functions during this phase of editing, and I see this as a major downside.
Second, so far two memory safety bugs have been discovered in ringbahn, and neither of them would have been prevented by this change. In both cases, invariants were not being upheld by safe code (in both cases, interestingly, because of implicit destructors). The bugs this change is designed to catch seem easy to identify and unlikely to occur: accidentally calling an unsafe function that you didn't mean to call. I am doubtful that this would meaningfully improve the quality of unsafe code or the experience of writing it.
The utility of this change seems to be in the later phase of editing, once the code has settled down and is no longer seeing large refactors. At that point, it is ideal for each unsafe operation to be annotated with a comment justifying why the invariants necessary are being upheld. However, this change to the language is a poor match for actually achieving that goal: even if users are required to put unsafe operations in unsafe blocks, nothing actually makes them justify those unsafe operations, or do anything but wrap entire function bodies in unsafe blocks and call it a day (as I often do during prototyping).
For those situations, what would be more appropriate in my opinion is a linter that requires every statement containing an unsafe operation to be annotated with a comment justifying its correctness. This would automatically enforce what is already becoming a standard style for finished unsafe code. It would not require this change at all to complete.
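The annotation style such a linter would enforce might look like this minimal sketch (Clippy's `undocumented_unsafe_blocks` lint enforces something in this spirit):

```rust
// A safe function containing one unsafe operation, with the
// justification the proposed linter would require attached to it.
fn first_three(v: &[u8]) -> &[u8] {
    assert!(v.len() >= 3);
    // SAFETY: the assert above guarantees at least 3 elements,
    // so the range ..3 is in bounds.
    unsafe { v.get_unchecked(..3) }
}

fn main() {
    assert_eq!(first_three(&[1, 2, 3, 4]), &[1, 2, 3]);
}
```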
> even if users are required to put unsafe operations in unsafe blocks, nothing actually makes them justify those unsafe operations, or do anything but wrap entire function bodies in unsafe blocks and call it a day (as I often do during prototyping).
That's no worse than the status quo, where all unsafe functions have an implicit unsafe block around their entire body. To me, the biggest advantage of requiring an explicit unsafe block is that it is much easier to make the unsafe scope smaller (just move lines out of the block) compared to creating a second "safe" inner function (which is far from obvious and requires lots of boilerplate). The problem of annotating each unsafe block seems to be a different problem to me and completely unrelated to this change.
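A rough sketch of what "moving lines out of the block" buys under this change (shown with the `unsafe_op_in_unsafe_fn` lint; on the nightlies discussed in this thread the feature gate was also required):

```rust
#![deny(unsafe_op_in_unsafe_fn)]

/// # Safety
/// `p` must be non-null, aligned, and point to an initialized `i32`.
unsafe fn read_and_double(p: *const i32) -> i32 {
    // SAFETY: the caller upholds this function's contract, so the read is valid.
    let v = unsafe { *p };
    v * 2 // plain safe code stays outside the unsafe block
}

fn main() {
    let x = 21;
    // SAFETY: `&x` is a valid pointer to an initialized i32.
    assert_eq!(unsafe { read_and_double(&x) }, 42);
}
```

Shrinking the scope is now just a matter of moving `v * 2` (and any other safe line) outside the braces, with the compiler checking the result.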
> By marking those functions unsafe as I construct them, the code remains in an unsafe scope right now. But this change would require me to set up new unsafe scopes as I make these sorts of edits, greatly increasing the ceremony.
So your problem is that you need to add an extra `unsafe` block around function bodies when prototyping? Maybe an option to only warn instead of error on missing `unsafe` blocks would be a good solution for this? By enabling this flag during prototyping, you could add the required unsafe blocks later, once you have decided how to structure your code. This option could be limited to `cargo check`, so that an actual compilation with this flag wouldn't be possible.
Also, IDE support for adding required unsafe blocks automatically might help with this too. When prototyping and refactoring, I often need to change the mutability of variables, which I found quite annoying without IDE support. With rust-analyzer however, I can now simply apply a suggestion, which solves the problem for me completely.
> During the prototype, experiment, build-out phase of creating new code, I am often unsure where exactly the function boundaries should go, what code will be reusable, what code should be abstracted, what code needs to be inlined, etc. As a result, I am frequently moving code in and out of functions. By marking those functions unsafe as I construct them, the code remains in an unsafe scope right now. But this change would require me to set up new unsafe scopes as I make these sorts of edits, greatly increasing the ceremony. Similar to the arguments I've made in favor of Ok-wrapping, we should minimize the disruption to the fluidity of creating new functions during this phase of editing, and I see this as a major downside.
I fully buy your argument with `Ok`-wrapping, but am unconvinced of it in the context of unsafe blocks. I think that is because with `Ok`-wrapping, even having that added implicitly in the final product seems like a great state of affairs to me, whereas for `unsafe` it feels like a prototyping-only kludge.
So maybe a better solution to your problem, instead of keeping `unsafe fn` internally unsafe (with not just the practical but also the conceptual issues that brings), would be the ability to have `unsafe` blocks outside functions, so you could wrap an entire module in `unsafe` and just not care about safety for the time being? That seems to better reflect your prototyping intent.
Another alternative would be a more convenient way to regain the ability to mark the entire body of a function as `unsafe`. For sample code, I personally use

```rust
fn main() { unsafe {
} }
```

but rustfmt will turn this into a double indent. We could adjust rustfmt, or we could adopt @Centril's proposal of defining functions with `=`, which would permit

```rust
fn main() = unsafe {
}
```

which seems like minimal effort even during prototyping, while at the same time not mixing up the two different roles of "unsafe" that `unsafe fn` currently fails to distinguish.
Some interesting ideas here, definitely. The idea of allowing unsafe scopes outside of individual function bodies for prototyping especially appeals to me (I'd really like it as some kind of `#![]` pragma, but we can get to that). I agree that there may be a high point on the curve where this change plus some other change can avoid the drawback I've described.
However, I note that neither of you responded directly to my second claim that this is not particularly useful for catching bugs. Both posts contain sort of oblique responses, which I want to look at in particular:
> To me, the biggest advantage of requiring an explicit unsafe block is that it is much easier to make the unsafe scope smaller (just move lines out of the block) compared to creating a second "safe" inner function (which is far from obvious and requires lots of boilerplate).
Why do you want an interior safe scope? If it's to avoid accidentally performing an unsafe op, as I understand it, linters like the one I've described would easily suffice. To reiterate my previous claim: as we all know, it is not only the unsafe ops that can cause memory safety problems inside a module with unsafe code; singling out these ops as the unique points at which extra care is needed gives a false sense of security in my mind.
> (with not just the practical but also the conceptual issues that brings)
I get the sense from talking to you that this conceptual issue is a really big motivator for you. I agree that it is unclear the way that unsafe annotates two different sides of the safety contract. However, I am not as convinced that this change will make a big impact on elucidating the distinction, because we are still using the unsafe keyword in both positions. I think the use of one keyword for both roles (which I think we aren't going to change) is the real stopper here, not the fact that in some cases the keyword plays multiple roles.
> Why do you want an interior safe scope? If it's to avoid accidentally performing an unsafe op, as I understand it, linters like the one I've described would easily suffice.
To me the only difference between safe and unsafe functions is that the latter have additional memory safety related requirements for callers. I see no reason to treat the bodies of safe and unsafe functions differently, so it makes sense to me to use the same mechanism for marking unsafe operations.
Of course we could also create a lint that requires an annotation for each unsafe operation, but this should apply to safe and unsafe functions in the same way. If we created such a lint and at some point decided that this lint is enough to mark unsafe operations, I would see no reason why unsafe blocks shouldn't be made optional in bodies of safe functions too.
> To reiterate my previous claim: as we all know, it is not only the unsafe ops that can cause memory safety problems inside a module with unsafe code; singling out these ops as the unique points at which extra care is needed gives a false sense of security in my mind.
Again, there is no difference between safe and unsafe function bodies to me. Yes, usage of `unsafe` can pollute a whole module, but it is still useful to mark the places where memory unsafety can actually be caused. Otherwise, we could also make `unsafe` blocks in safe functions optional as soon as there is some unsafe used somewhere in the same module, which does not sound like a good idea to me.
> I note that neither of you responded directly to my second claim that this is not particularly useful for catching bugs.
It's as useful as `unsafe` blocks in safe functions. Yes, the actual bugs might be outside unsafe blocks, but `unsafe` blocks are still useful because they mark the places where additional invariants are required. So even if the bug is somewhere else, you have a place where you can start your debugging, e.g. by adding additional assertions or debugger breakpoints.
@withoutboats
> To reiterate my previous claim: as we all know, it is not only the unsafe ops that can cause memory safety problems inside a module with unsafe code; singling out these ops as the unique points at which extra care is needed gives a false sense of security in my mind.
That's true - however, the `unsafe` function being called may have additional safety conditions that are unrelated to the rest of the function. When reviewing code, seeing something like:
```rust
unsafe fn foo() {
    unsafe {
        first_fn();
        second_fn();
    }
    third_fn();
}
```
immediately tells me that `third_fn` cannot cause undefined behavior by itself - it can only do so as a result of interaction with other unsafe code (`first_fn()`, `second_fn()`, or something else).
However, if I instead see something like this:
```rust
unsafe fn foo() {
    first_fn();
    second_fn();
    third_fn();
}
```
It's no longer obvious that `third_fn` is not `unsafe` - I now need to check the function definition to see that it is actually safe (and therefore imposes no additional preconditions).
> what would be more appropriate in my opinion is a linter that requires every statement containing an unsafe operation to be annotated with a comment justifying its correctness.
Would this be an if-and-only-if lint (e.g. it would error if a safe statement has an "unsafe justification" comment)? If not, I think these types of comments could become stale as code is refactored over time. With `unsafe` blocks in `unsafe` functions, moving safe statements outside of `unsafe` blocks will make it visually obvious when comments become out-of-date. Of course, it's still a good idea to have a comment of some kind, but any kind of justification will be in terms of interactions with `unsafe` code in the same module.
Is this in nightly yet? It definitely recognizes the `unsafe_block_in_unsafe_fn` feature, but it still treats unsafe functions as if the entire inside is an `unsafe` block, and even warns that `unsafe` blocks are unnecessary.
@AaronKutch we have testcases that say otherwise, so if you have a concrete example, please report it as a new bug (not in this thread, but feel free to Cc me).
@AaronKutch As specified in the RFC, you need to override the lint level for `unsafe_op_in_unsafe_fn`. Having it `allow`ed just falls back to the previous behavior.
Thank you, I have to add both

```rust
#![feature(unsafe_block_in_unsafe_fn)]
#![deny(unsafe_op_in_unsafe_fn)]
```
> However, I note that neither of you responded directly to my second claim that this is not particularly useful for catching bugs.
Fair. I am not sure how to respond to that -- I don't have any data, just a gut feeling and a number of examples where the `unsafe` scope is IMO too big. You presented an experience report saying that keeping the `unsafe` scope small doesn't help, or at least doesn't help you. Neither is statistically significant, but you have much more experience writing real unsafe code than I do, so there's not much I can say to counter this point.
Maybe it is a matter of coding style? Even if module-level unsafe blocks were a thing, I wouldn't use them. I keep my unsafe blocks as small as possible and try to be extremely aware of what exactly are the unsafe operations in my code. Of course that doesn't mean I can ignore the safe code, but it highlights the places where additional checks (on top of "maintain module invariants") are required.
> Why do you want an interior safe scope? If it's to avoid accidentally performing an unsafe op, as I understand it, linters like the one I've described would easily suffice. To reiterate my previous claim: as we all know, it is not only the unsafe ops that can cause memory safety problems inside a module with unsafe code; singling out these ops as the unique points at which extra care is needed gives a false sense of security in my mind.
The unsafe ops are still special, as they are a place where externally defined invariants must be upheld. In contrast, the rest of the code only needs to maintain my own module-local invariants. So when I audit the code, inside an `unsafe` scope I need to check more things than outside.
Sure, linters can help, but then the workflow for performing unsafe ops becomes strangely decoherent between normal `fn` and `unsafe fn` -- there is IMO no good reason for them to follow different rules, and yet they do.
I also proposed some ideas once to be able to mark struct fields as unsafe, which would make mutating them an unsafe operation. Then `unsafe` tracking would help even with maintaining module invariants. (This could also affect auto-trait derivation, but that is yet another discussion.) That is an orthogonal feature but could be part of a "vision" for making `unsafe` more helpful. Or maybe it's not actually helpful, I am not sure.
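Unsafe fields are not an existing feature; one way to approximate the idea in today's Rust is a private field with an unsafe setter, so that every mutation of the invariant-carrying field is an unsafe operation (an illustrative sketch only, all names hypothetical):

```rust
// Sketch: `init` carries a module invariant, so the only way to
// mutate it is through an `unsafe fn`, mimicking an "unsafe field".
pub struct InitBuf {
    data: Box<[u8]>,
    init: usize, // invariant: init <= data.len()
}

impl InitBuf {
    pub fn new(capacity: usize) -> Self {
        InitBuf { data: vec![0; capacity].into_boxed_slice(), init: 0 }
    }

    /// The prefix of the buffer that is known to be initialized.
    pub fn initialized(&self) -> &[u8] {
        &self.data[..self.init]
    }

    /// # Safety
    /// The caller must guarantee `n <= capacity` and that the first `n`
    /// bytes have actually been written.
    pub unsafe fn set_init(&mut self, n: usize) {
        self.init = n;
    }
}

fn main() {
    let mut buf = InitBuf::new(8);
    // SAFETY: 3 <= 8, and `new` zero-initializes the whole buffer.
    unsafe { buf.set_init(3) };
    assert_eq!(buf.initialized().len(), 3);
}
```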
> I get the sense from talking to you that this conceptual issue is a really big motivator for you.
That is certainly true. I consider this an inconsistency/incoherence in the unsafety system of Rust that I'd love to help fix.
> I agree that it is unclear the way that unsafe annotates two different sides of the safety contract. However, I am not as convinced that this change will make a big impact on elucidating the distinction, because we are still using the unsafe keyword in both positions. I think the use of one keyword for both roles (which I think we aren't going to change) is the real stopper here, not the fact that in some cases the keyword plays multiple roles.
I feel like it still helps, because it means every `unsafe` is one or the other, whereas currently some `unsafe` actually play both roles.
I think it boils down very much to your style of coding. I have found myself on the opposite side of the spectrum compared to @withoutboats a few times recently. When prototyping, I have actively avoided making functions that are unsafe to call `unsafe fn`s, because it's like turning off all safety checks and stops the compiler from helping me. When I prototype, I want to properly see where I do unsafe stuff so I can reason about the invariants I have to uphold. So in order to have the compiler tell me when I do something unsafe, I must avoid `unsafe fn`s, and then I try to remember to slap that `unsafe` on the function once prototyping and `// SAFETY:` documentation are done. And that feels risky, both because I can forget to add the `unsafe` when I'm done, and because after I have added it, it becomes harder to recognize new unsafe operations in that function and to properly reason about and document them.
I love what this RFC is doing.
> I also proposed some ideas once to be able to mark struct fields as unsafe, which would make mutating them an unsafe operation.
Side note, but @RalfJung I am still in favor of this change and would like to see it go through.
It seems like a new RFC would need to be made to elaborate on the motivation for this change at the moment, but I think the use case presented by `ringbahn::Event` is particularly important for developers writing safe code.

While there's little reason to write an unsafe function with a safe body, the implementor of an unsafe trait method doesn't control the signature. If they want to keep to the safe subset of Rust (`forbid(unsafe_code)`), they could write a perfectly safe implementation, maybe with runtime checks. But with the implicit `unsafe` block introduced, it will still need to be manually verified.

There are a few nightly traits that use the same pattern, and it's helpful for 'optimisations' in things like bounds checks. I think forcing developers into `unsafe` Rust is a real shame when the language is otherwise so flexible (or "empowering", per the tagline).
> While there's little reason to write an unsafe function with a safe body

Interesting... This does not match my experience very well — even within my `unsafe fn`s, _most_ of the code is safe, except for _some_ parts that are unsafe.
Agreed! I meant to refer to `unsafe fn`s with no unsafe code in their bodies (and I strongly support proposals like unsafe fields for cases like `Vec::set_len`).
Since I was struggling to explain my reasoning, maybe this example can elaborate on it. With tools like `cargo-geiger` popping up, I think it's quite important that Rust considers this code "safe":
```rust
#![forbid(unsafe_code)]

trait Collection<E> {
    /// # Safety
    /// `index` must be a valid index into this collection
    unsafe fn get_unchecked(&self, index: usize) -> E;
}

struct Bucket<T> {
    value: T,
}

impl<T: Copy> Collection<T> for Bucket<T> {
    unsafe fn get_unchecked(&self, _: usize) -> T {
        self.value
    }
}
```
Afaik, Rust doesn't provide any way to verify that this kind of code is safe atm, due to the implicit block this issue is addressing. This strongly discourages trait authors from including `unsafe` guarantees in their methods, since users will be more hesitant to implement these traits.