Rust: [tracking issue] `union` field access inside `const fn`

Created on 29 Jun 2018 · 51 comments · Source: rust-lang/rust

union field accesses allow arbitrary transmutations inside const fn and are thus marked as unstable for now to allow us more time to think about the semantics of this without blocking the stabilization of const fn on it.

This is behind the const_fn_union feature gate.
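For context, here is a minimal sketch of the operation behind the gate (illustrative names; written so it also runs as an ordinary function): reading a different union field than the one written reinterprets the bytes, i.e. an unchecked transmute.

```rust
// Illustrative union: reading `bytes` after writing `int` reinterprets
// the integer's bytes (an unchecked transmute).
union Transmute {
    int: u32,
    bytes: [u8; 4],
}

const fn to_le_bytes(x: u32) -> [u8; 4] {
    // The union field access below is the operation behind the
    // `const_fn_union` gate discussed in this issue.
    unsafe { Transmute { int: x.to_le() }.bytes }
}

fn main() {
    // `to_le()` first normalizes the byte order, so this holds on any target.
    assert_eq!(to_le_bytes(0x01020304), [4, 3, 2, 1]);
}
```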

Blocked on https://github.com/rust-lang/const-eval/issues/14.

A-const-eval A-const-fn C-tracking-issue T-lang


All 51 comments

Now that we don't permit any new functions from getting promoted, I guess we can lift this restriction? In actual const contexts we check the final value. Although it does mean that users can transmute arbitrary input to safe references and thus dereference them, so we should allow dereferencing raw pointers (via projection) at the same time.
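Concretely, "transmuting arbitrary input to a safe reference" can look like the following hedged sketch (made-up names; sound at runtime only because the integer really is the address of a live `i32`):

```rust
// Hypothetical illustration: a union whose fields overlap an address
// and a reference, so a field read turns a `usize` into a `&i32`.
union Ptr<'a> {
    addr: usize,
    r: &'a i32,
}

fn main() {
    let x = 7i32;
    let addr = &x as *const i32 as usize;
    // Reinterpret the integer as a reference via a union field read.
    // Sound here only because `addr` is the address of a live `i32`.
    let r: &i32 = unsafe { Ptr { addr }.r };
    assert_eq!(*r, 7);
}
```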

cc @RalfJung @Centril opinions on the above comment?

There's still the pesky run-time issue in that we get the ptr-to-int operation if we stabilize this now. So we get unconst operations without having settled whether it is UB for const fn to act non-deterministically at run-time (and I am sure folks will abuse this if we don't settle it). That is, there's still the "education" aspect that is lacking.

However, we might reasonably use this only in the standard library for now, provided that we can at least agree on the determinism angle in lib{core,alloc,std} and review carefully (which didn't work super well re. promotability before, but at least now we know better...).

if we stabilize this now

we will get this problem whenever we stabilize. If we want to prevent it we'd need to introduce "blessed transmutes", but that just moves the complexity elsewhere and locks us into layouts.

We could set it up in a manner where the feature is still unstable but doesn't block stabilizing const fns that use it, but we explicitly opted not to do so. We could hijack #[allow_internal_unstable(const_fn_union)] and make it work for const fns, too.

we will get this problem whenever we stabilize.

Granted, but timing is key here imo -- we'll need to work towards having the "education" available.

We could set it up in a manner where the feature is still unstable but doesn't block stabilizing const fns that use it, but we explicitly opted not to do so.

Yep, and it has served us well but perhaps the pragmatic choice is to make an exception now.

We could hijack #[allow_internal_unstable(const_fn_union)] and make it work for const fns, too.

It'll need to be followed by a comment justifying each use but otherwise it seems reasonable.

ptr-to-int is not itself non-deterministic though. Allocation and pointer comparison to my knowledge are the only non-deterministic operations (and for ptr comparison I have not been able to observe this non-determinism in a single execution).

However, ptr-to-int makes the allocator non-determinism observable, maybe that's what you mean? I suppose you are worried about functions like (maybe using hacks to avoid unstable features):

const fn mk_int() -> usize {
  let v = 42;
  &v as *const _ as usize
}

I think we cannot reasonably make such functions UB -- undefined behavior should be decidable dynamically (e.g. by an interpreter like Miri), but "this function is deterministic" is not decidable. So this can be a safety invariant at best. Good enough for unsafe code to rely on it when interfacing with safe code, but not good enough for the compiler to exploit it.

However, ptr-to-int makes the allocator non-determinism observable, maybe that's what you mean?

Yep.

(maybe using hacks to avoid unstable features):

We intentionally tried to prevent that; would be very unfortunate to find out we failed.

I think we cannot reasonably make such functions UB -- undefined behavior should be decidable dynamically (e.g. by an interpreter like Miri), but "this function is deterministic" is not decidable. So this can be a safety invariant at best. Good enough for unsafe code to rely on it when interfacing with safe code, but not good enough for the compiler to exploit it.

Good points; although it would be unfortunate not to take advantage of CSE for calls to const fn. But at least the type system should be able to assume deterministic execution of const fns. In either case, if it is a safety condition, it will still need good educational material.

Good points; although it would be unfortunate not to take advantage of CSE for calls to const fn

It would be unfortunate indeed, but non-decidable UB would also be unfortunate.

But at least the type system should be able to assume deterministic execution of const fns.

What do you mean by this?

What do you mean by this?

For example:

const fn foo(x: usize) -> usize { /* stuff */ }

fn bar() {
    let n = /* stuff */;
    let x: [u8; dyn foo(n)] = /* stuff */;
    let y: [u8; dyn foo(n)] = x; // OK.
}

For example:

Ah. Well for executing const fn during CTFE I think we can guarantee that they behave deterministically. This does not even require any static checks. The Miri engine just doesn't have a source of non-determinism. (The AllocId of a Pointer is non-deterministic, but it is not observable by the program.)

I think unsafe code should remain unsafe even in const contexts. The things we should explicitly prohibit are: 1) dereferencing any reference or pointer, 2) returning pointers to values created inside a const fn, 3) usage of non-const environment variables. Thus, casting references to pointers, and pointers to integers and back, should be allowed. Ideally, Miri should track pointers even if they are cast to numbers.

UPD sorry for off-topic.

@tema3210 I'm not sure what you mean. This issue is about unions and no one is trying to make using them safe.

Since this issue is holding back a few other issues, what needs to be done to fix it?

Quoting from a Zulip discussion:

We are talking about unsafe code, so we should compare unsafe const-code with unsafe runtime-code. Unsafe runtime-code has certain rules you have to follow, but it is up to you to follow them -- that's the point of unsafe, there is no guaranteed safety. Violating those rules is commonly called "Undefined Behavior". Also see here.

With unsafe const-code, we have the same problem -- but the rules are different! So some code could be perfectly fine unsafe runtime-code but still be wrong unsafe const-code. And the compiler cannot tell you, because it's unsafe.

IOW, the biggest concern here is how to teach people to use unsafe code in const fn in a way that does not break CTFE. The issue is that "just" following the normal rules for unsafe is not enough.

There's some more background on this at https://github.com/rust-lang/const-eval/blob/master/const_safety.md.

Slightly stupid idea: lint about all unconst operations (with the lint specifically explaining each case individually) and require users to write #[allow(unconst_ops)] on functions or items performing such actions.

This may lead to ppl just adding #![allow(unconst_ops)] to the entire crate instead, which is uncool, and I don't think hijacking the lint system to make this one lint behave differently is a good idea.

It could also be an unconditional warning that can be silenced by using #[unconst] on such a function, const item, static item or expression.

Seems strange to me to have a completely different system for unsafe and unconst, when they are conceptually basically the same thing (just for two different languages: full Rust vs const Rust).

Yea, I just noticed we could go even more fine-grained:

#[unconst(raw_ptr_deref)].

But this made me realize that any unsafe operation can likely end up doing unconst things in some cases... so we're back to the root problem that we'd just end up marking everything, so there's no useful distinction between unconst and unsafe.

there's no useful distinction between unconst and unsafe.

I have no idea how you arrive at that conclusion. The properties enforced by the two are fundamentally different -- soundness for full Rust vs soundness for const Rust.

I came to this conclusion because while &x as *const i32 as usize is unconst, it's not unsafe, but transmute::<_, usize>(&x) is unsafe and unconst. I believe that almost all unsafe operations are also unconst, because you can always craft a situation that mirrors an unconst operation.
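To make that distinction concrete, here is a runtime sketch: both lines below observe the same base address, but only the transmute requires `unsafe`.

```rust
use std::mem::transmute;

fn main() {
    let x = 0i32;
    // Safe code: an as-cast chain from reference to integer.
    let a = &x as *const i32 as usize;
    // Unsafe code: the same reinterpretation via transmute.
    let b: usize = unsafe { transmute::<&i32, usize>(&x) };
    // Both expose the allocation's base address.
    assert_eq!(a, b);
}
```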

I understand that by itself &x as *const i32 as usize is not unconst, the fact that loads of uses of that usize are unconst is the problem. So this is also a static vs dynamic checks situation. We already have all the dynamic unconst checks we need, as an unconst op is just an operation that CTFE does not support.

Maybe it's enough to just explain CTFE errors because of unconst behaviour in detail? The only "problem" is really that users of an unconst fn get the error instead of the author of the function, or am I completely overlooking a class of unconst problems here?

I believe that almost all unsafe operations are also unconst, because you can always craft a situation that mirrors an unconst operation.

That's possible. But I don't think it means we should ditch unconst.

What would be really nice is if we could arrange things such that "being const-sound" implies "being runtime-sound". (We already know the other direction does not hold.) That would basically make the requirements of unconst a strict superset of the requirements of unsafe. In essence, we'd never use unsafe in a const fn, we'd use unconst instead, which plays the same role but comes with slightly stricter rules. This avoids having to mix both keywords in the same context. At the same time, it still adds a speed bump to const fn authors that enter these areas, as they'll have to learn about writing unconst instead of unsafe and figure out why, and then hopefully they'll learn about the const-soundness requirements.

If we use unconst as a keyword inside const fn, we must lint on unsafe, telling the user to use unconst. We can't ban unsafe as you can already call unsafe const fns on stable, you just can't do any unconst actions within unsafe blocks or unsafe const fns.

Some concerns that come to mind:

  • It feels a little odd to have unsafe const fn and unconst blocks. I definitely don't want unconst const fn.
  • Just slapping const on a function won't work anymore if it has unsafe blocks. This is a good thing as you stated, but it's also an oddness.
  • Users can already use the word unconst, so we'd need an edition just for that keyword. What do we do with the editions before that?
  • The gains are minimal for such a major change. Ppl will just learn to use unconst instead of unsafe without thinking more about it.

Possible alternatives:

  • unsafe<const> { foo } blocks. Though this is getting into effect territory I think?
  • Don't do unconst, just reuse unsafe. I realize ppl will still screw this up, but I don't know whether unconst will help there. unsafe is being overused, too. Not sure if the nudging you're suggesting will help beyond annoying ppl and making them replace all unsafe inside const fn with unconst. Some mitigation strategies that are not exclusive:

Also does unconst work in all the contexts that unsafe works? For example, would any of these work?

  • unconst trait Foo
  • unconst const fn foo()
  • fn foo() { unconst { ... } }

I think this will be more confusing to people than just saying unsafe has different rules in const contexts and perhaps linting likely mistakes or having clippy run miri or something.

Why does this work fine on stable, but this doesn't? Is there something about constant functions that prevents Scalar::Ptr/Scalar::Bits from working across call boundaries?

The reason it's not allowed in const fn is discussed starting at https://github.com/rust-lang/rust/issues/51909#issuecomment-523209113

TLDR: some functions that are sound at runtime will fail CTFE at compile-time or hypothetically even have different results (we don't know of any cases of the latter though).

@oli-obk those are all great arguments. Maybe having a good section in the nomicon is indeed the best we can do.

And we need such a section anyway, so it's probably a reasonable next step. Unfortunately it'll be a while until I will be able to spend significant time on this, so I hope someone else can get this started. :)

The reason it's not allowed in const fn is discussed starting at #51909 (comment)

TLDR: some functions that are sound at runtime will fail CTFE at compile-time or hypothetically even have different results (we don't know of any cases of the latter though).

I'm still not quite sure what you mean. Scalar::Ptr/Scalar::Bits should prevent non-deterministic behavior, as long as the list of operations that can be done on a Scalar::Ptr without erroring is very short. And if a const function runs differently at runtime, it's because it doesn't cause a miri failure. If a program compiles successfully, its const functions should behave the same at runtime for all inputs that the function has been called with at compile time.

If a program compiles successfully, its const functions should behave the same at runtime for all inputs that the function has been called with at compile time.

That's the root problem. We can either do what you describe and ban all bad operations at evaluation time, or we write down the rules that ppl need to adhere to in order to do those operations safely. If we just ban them, we won't ever reasonably support &[] == &[42] (slice comparison), because that internally compares the addresses of the slices as an early bail-out so it doesn't have to compare the contents: https://github.com/rust-lang/rust/blob/219380d840877de97b0e17674be7212ebc929d51/src/libcore/slice/mod.rs#L5959-L5967

Yes, we could create more intrinsics for just such operations, but that would just push the work to a different place and not allow users to write the same kind of optimizations.
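The fast path in question looks roughly like this (a simplified sketch with illustrative names, not the actual libcore code):

```rust
// Simplified sketch of slice equality with the pointer fast path;
// the address comparison is the operation CTFE cannot evaluate.
fn slices_eq(a: &[i32], b: &[i32]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // Early bail-out: identical base address and length means the
    // slices alias, so the contents must be equal.
    if a.as_ptr() == b.as_ptr() {
        return true;
    }
    a.iter().zip(b).all(|(x, y)| x == y)
}

fn main() {
    let s = [1, 2, 3];
    assert!(slices_eq(&s, &s));          // fast path
    assert!(slices_eq(&s, &[1, 2, 3]));  // element-wise path
    assert!(!slices_eq(&s, &[1, 2, 4]));
}
```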

Thus we would like to allow you to do pointer comparisons, but that operation will be unsafe within const contexts, because you can do such a comparison wrongly. The following example is unsound, because you have a safe function that may give you different results at runtime and at compile time if you invoke it as foo(&42, &42):

const fn foo(a: &i32, b: &i32) -> bool { unsafe { a as *const i32 == b as *const i32 } }

Ok, that makes sense. So for pointer comparison purposes. But it seems like the function foo() itself is deterministic. It's simply the case that the function behaves differently if the arguments passed to it happen to be equal. While foo(&42, &42) may or may not have a == b depending on context, I don't think the compiler guarantees one or the other even at runtime. If necessary though, &42 == &42 could be made true for both contexts (as it is currently true at runtime).

Edit: on second thought, yeah, &42 == &42 would have to be made true for compile time as well.

However, I'm still not sure why this would prevent compile time unions from working.

It seems like the only operations that would have to be bad would be those that modify a Scalar::Ptr and those that allow a Scalar::Ptr and a Scalar::Bits to be compared.

I wrote up some details in https://github.com/rust-lang/nomicon/pull/221

While foo(&42, &42) may or may not have a == b depending on context, I don't think the compiler guarantees one or the other even at runtime. If necessary though, &42 == &42 could be made true for both contexts (as it is currently true at runtime).

Yea, I am miscommunicating this point frequently. Let me try again.

The issue isn't whether runtime chooses a random result here, it's that runtime and compile-time disagree. So let's simplify our example. Assume the following function exists.

const fn foo() -> bool { /* something */ }

Code calling this function can reasonably assume that calling it at compile-time and at runtime will yield the same result, because const fn has no way to access global statics or anything, so calling it twice can't yield different results.

Now if we had a

const fn am_evaluated_at_compile_time() -> bool { /* magic */ }

you could easily break this assumption. Now imagine foo() being used to safeguard some unsafe code. Basically

unsafe {
    [if foo() { ptr::null::<i32>() } else { &42 }, if foo() { &42 } else { ptr::null() }][foo() as usize]
}

Then depending on what parts get done at compile time and at runtime, things fall apart. I realize this is a very artificial example and there are definitely better ones, but I gotta run :see_no_evil:. This kind of disagreement of runtime and compile time is a big worry, especially since right now there's no way for you to violate it, so we'd like to keep it without a good reason not to have it.

Ok. How do unions act differently in compile time and run time?

They don't? Not sure where you got the idea.

I linked to a discussion about this above. I think the main concern is that "well-behaved" const code simply has different rules than to be well-behaved "normal" code. Also see this document.

Accordingly, I am somewhat confused by @oli-obk's focus on differences between CTFE and runtime. Our CTFE checks should already make sure that does not happen. (Smells like an XY problem to me -- some of the reasons const has different rules is that during CTFE we cannot do certain operations, and the reason for that has to do with not knowing the actual base address of an allocation. But that's already two layers down the rabbit hole from the key point, which is that the rules are different.)

Our CTFE checks should already make sure that does not happen.

They do right now, but if we ever want to allow [T]::eq, we'll end up having something in CTFE that will yield different results than runtime. The important thing is that this is not observable from safe code. Unsafe code may observe this internally for optimizations.

And yes, the root problem for this is that we don't know the base address of an allocation, so for anything CTFE related, the base addresses must be irrelevant. An operation that would observe the base addresses is const-unsafe because you can't let safe code ever observe that base address. "Observing a base address" can e.g. be the fact that we'd return false for &42 as *const i32 == 1234500 as *const i32 even if that theoretically may be true at runtime.

They do right now, but if we ever want to allow [T]::eq, we'll end up having something in CTFE that will yield different results than runtime.

I don't think so. Slice equality itself does not actually differ. The only issue is the pointer equality test fast-path, and I've seen a PR that proposed removing it (it seems to be a perf win in some cases and a perf loss in others). If needed we could have a perma-unstable hack to skip the ptr check in CTFE.

Also, I see no connection of this problem to transmutes/unions. The problem you mention is (I think) related to "how can we implement ptr equality binops in CTFE without breaking anything"; that question is fully orthogonal to "how do we teach people to not misuse transmutes in CTFE". Transmutes cause problems even with our current non-implementation of ptr binops, and ptr binops cause problems even without transmute.

The kind of misuse I have in mind is something like

const fn foo(x: usize, y: &i32) -> usize {
  let y: usize = unsafe { transmute(y) };
  x * y
}

This is a sound function as far as runtime Rust is concerned, but it is const-unsound ("unconst") because it can cause a const-failure with const-safe inputs.

Ok. Let's stop focusing on hypothetical differences and just address the part where a function is unsound if it causes an error at compile time while being sound at runtime. I'll adjust the text accordingly.

a function is unsound if it causes an error at compile time while being sound at runtime.

Yeah, basically. Of course compile-time functions also can make stronger assumptions about their inputs (e.g., usize are "proper integers" and not pointer values). So it's not actually about comparing behavior of a function with the same inputs between CTFE and RT. It's about having both of these properties:

  • for all RT-safe inputs, the function behaves in an RT-safe way and returns a RT-safe output (that's normal runtime soundness)
  • for all CTFE-safe inputs, the function behaves in a CTFE-safe way (does not cause "unsupported" or "UB" errors) and returns a CTFE-safe output.

We already have dataflow analysis, so why not use it to permit what is already known to be sound? For example, we could check the usage of allocations to forbid pointer-to-number casts that change the output. So we actually want a much better check of function purity. I don't think we want to forbid use of an entire language primitive because it could cause UB.

Also we may want to make the ops const someday, but i'm not sure which ones.

why not to use it to permit what is already known to be sound?

That's the whole point! We want to permit things that are sound, but there's no way to make it safe in general, just like we can't make transmute safe in general. This entire discussion is about allowing things while trusting the developer to know the const rules. You can still transmute a reference to a usize and back, and that would be sound I think. That doesn't mean transmuting a reference to a usize is sound in general.

All we're trying to say is that the rules are stricter (or there are more rules) at compile time than at runtime, and devs need to be aware of that.

@tema3210 this entire discussion is about unsafe code, which the compiler cannot check for correctness. If it could, it would be safe code. ;) So no static analysis is going to help.

We already have a mechanism for preventing conversions from pointers and references to regular numbers, shown here and here. My question is why that only works when not in a function.

The difference is that const fn can also be executed at runtime. So we need to address what the differences between invoking a const fn at runtime and at compile time are. Ideally there's no difference, but in practice that doesn't work. Thus, we want to define that a const fn must behave the same at runtime and at compile time. This means that if you invoke foo() at runtime and it succeeds, invoking foo() at compile time must succeed, too, and must even return the same value (if foo is a safe const fn).

A constant item is always interpreted at compile-time, so we do not have this duality. Only its final value is used at runtime, and we have checks ensuring that said final value is const sound.
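The duality can be shown with a tiny sketch (illustrative function): the same const fn body is evaluated once by CTFE and once as compiled code, and the two results must agree.

```rust
const fn answer() -> u32 {
    40 + 2
}

// Evaluated at compile time (CTFE).
const A: u32 = answer();

fn main() {
    // The very same function, executed as ordinary machine code.
    let r = answer();
    assert_eq!(A, r); // the two executions must agree
}
```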

@powerboat9 those are easy to check as it's a single constant that we can just evaluate. But to do the same with a function, we'd have to run it with all possible inputs, which clearly is not possible.

I don't really see how unions would behave differently during runtime and during compile time. If a union throws a miri evaluation error, then the code never compiles and thus the function can never be called at runtime.

And if a function could error at compile time but doesn't, then it will be impossible to observe running at compile time and no difference between runtime and compile time behavior can be observed.

TLDR: the examples shown have unspecified behaviour depending on compiler versions and optimization levels, which is the exact same behaviour that the already known set of runtime UB exhibits at compile-time.

It's not about causing UB at runtime due to UB within a constant evaluation (we prevent that via static checks on constants). Let's just consider the regular UB things that the Rustonomicon defines. If you have a function like

const fn foo() -> u32 {
    unsafe {
        let x: usize = mem::uninitialized();
        [5, 6, 7][x]
    }
}

and you evaluate it at compile-time, there's no clear behaviour that can come of this. Either the function

  • returns 5, 6 or 7
  • returns a random value
  • errors during evaluation because the length check failed
  • errors during evaluation because the length check succeeded, but then an OOB value was read
  • errors during evaluation because undefined bytes were compared with something else (current behaviour)

The function will do the same thing no matter how many times and in what circumstances you call it during CTFE. That is, for the same compiler with the same compiler flags. Any change in optimizations or compiler versions can and will change what the function returns at CTFE. Calling it at runtime is UB anyway, so we can ignore that, that has been covered thoroughly by lots of articles.

Now, let's consider the additional UB proposed in this issue:

const fn foo() -> u32 {
    let x = 42;
    let y = &x as *const i32 as usize;
    50000 / y
}

The above is considered UB in CTFE (as suggested in this issue), because you are inspecting a pointer beyond your ability to dereference or offset it. So, what can happen here? The function can either

  • return a random value
  • error during evaluation because the division triggered the division by zero panic
  • error during evaluation because the div-by-zero check wasn't triggered, causing a division by zero, which is UB even at runtime
  • error during the division because the value of y isn't known (current behaviour)
  • error during the conversion to usize or at least at the assignment of y

The argument was that we can make a guaranteed detection of the UB, so it's not a problem. But that's the thing about UB: once you bring optimizations into the mix, even deterministic UB detection will become nondeterministic depending on whatever the optimizations are up to. In the above example the optimizer detecting UB can reasonably replace the entire function by return 0xDEAFBEES, which can still sting because they see and smell you. Jokes aside, you now have a function which behaves erratically depending on compiler version and optimization levels. Sounds a lot like UB, doesn't it?

you now have a function which behaves erratically depending on compiler version and optimization levels. Sounds a lot like UB, doesn't it?

Somewhat. It is much more constrained in that the effects of UB are "returns a weird result from this one const-eval query", and there is no chance of it deleting your files on its own, but that's about it.

return a random value

One possibility here is also "returns an uninitialized value", which is different from picking any fixed value non-deterministically.

Citing what I said on Zulip:

We could make CTFE misbehavior "unspecified behavior". That means we should list what the possible behaviors are, which I guess is "const-eval will return some arbitrary thing and leave arbitrary stuff in the memory that gets interned". But the compiler won't crash or eat your kittens.

@oli-obk basically extended on what that would mean, I think.
This closes one possible option in the Rust CTFE design space: JIT'ing MIR and running it unsandboxed/unchecked during compilation. I don't think we want to do that, but it feels worth pointing out. If we say behavior is "unspecified", we have to say what the possible behaviors are that implementations can pick from, and they have to stay in those bounds. This means e.g. mandatory bounds checks during CTFE evaluation, even when evaluating unsafe code.

Perhaps have Scalar::Uninit for currently uninitialized values?

@powerboat9 I don't know what your comment refers to and what you are suggesting where such a Scalar::Uninit would be used. Please elaborate (ideally with code examples where you think these values would occur).

We have ScalarMaybeUninit to represent potentially uninitialized values. But anyway I don't think these implementation details have anything to do with the discussion around unions. The question is not how to implement union accesses -- they have been implemented years ago -- the question is how to best expose them on stable without causing issues.
