This program:

```rust
fn main() {
    let value: &u8 = unsafe { ::std::mem::zeroed() };
}
```

when compiled with `opt-level` higher than zero generates a SIGILL. Switching to `std::mem::uninitialized` works around the issue.
I've been bitten by this in generic code where I have a field with `ManuallyDrop<T>` (basically I use it as an `Option` with a manually managed null state).
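A minimal sketch of the pattern I mean (hypothetical names; `Slot` is not the real type from my code):

```rust
use std::mem::ManuallyDrop;

// An Option-like container with a manually managed "null" state:
// `value` is only valid while `occupied` is true.
struct Slot<T> {
    occupied: bool,
    value: ManuallyDrop<T>,
}

impl<T> Slot<T> {
    fn empty() -> Slot<T> {
        Slot {
            occupied: false,
            // For a T containing a reference, this line is already UB,
            // which is exactly the footgun this issue is about.
            value: unsafe { std::mem::zeroed() },
        }
    }
}

fn main() {
    let _slot: Slot<&u8> = Slot::empty(); // UB: a zeroed reference
}
```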
This is possibly related to https://github.com/rust-lang/rust/issues/52875
Affected versions:

- rustc 1.29.0-nightly (874dec25e 2018-07-21)
- rustc 1.29.0-nightly (54628c8ea 2018-07-30)
- rustc 1.27.2 (58cc626de 2018-07-18)

Not affected versions:

- rustc 1.26.0 (a77568041 2018-05-07)

This doesn't look architecture-specific, as `x86_64-unknown-linux-gnu`, `mips64-unknown-linux-gnuabi64` and `armv7-unknown-linux-musleabihf` are all affected.
As soon as you instantiate a type (including references) with an invalid value, it immediately causes UB. This is not really actionable (`mem::uninitialized` itself is not very sound; there are proposals to use unions instead).
Yes, this code is UB - references must be non-null.
I can see how things like `mem::uninitialized` and `mem::zeroed` should generally be discouraged in application code, but I find it surprising that a language marketed toward kernel writers is making it increasingly difficult to operate directly on memory. This caused a widely used library of mine to stop working, and I'm having trouble thinking of a fix that doesn't require a major rewrite that breaks backwards compatibility. Isn't the whole point of `unsafe` to isolate undefined behavior when it can't be reasonably avoided?
You can use `*const T` or `*mut T` instead to operate on memory directly; if you've left a value zeroed or uninitialized, it wouldn't be safe to permit access to it: the raw pointer types do that, whereas the reference type (`&`) does not.
If you want to, you can also wrap them in a struct to provide safe access, perhaps panicking if the value hasn't yet been initialized; a sketch of that idea follows.
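A minimal sketch of that wrapper idea (hypothetical names; `Drop` omitted for brevity):

```rust
// Keeps the value behind a raw pointer, so no invalid reference ever
// exists; access panics until the value has been initialized.
struct Lazy<T> {
    ptr: *const T, // null until `set` is called
}

impl<T> Lazy<T> {
    fn new() -> Lazy<T> {
        Lazy { ptr: std::ptr::null() }
    }

    fn set(&mut self, value: Box<T>) {
        self.ptr = Box::into_raw(value);
    }

    fn get(&self) -> &T {
        assert!(!self.ptr.is_null(), "value not yet initialized");
        unsafe { &*self.ptr } // non-null and initialized, so safe to hand out
    }
}

fn main() {
    let mut lazy = Lazy::new();
    lazy.set(Box::new(42u8));
    assert_eq!(*lazy.get(), 42);
}
```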
If you have suggestions as to how to improve ergonomics or clarity here (docs or through language changes) we'd be happy to hear them.
What about when working with C function pointers? I expected these to behave more like raw pointers, but it's causing this same SIGILL. Example:

```rust
let x: unsafe extern "C" fn() = mem::zeroed();
```
You should be using `Option<...>` around the function pointer, and initializing it to `None` (equivalent to `mem::zeroed`, I believe, but safer).
The nomicon notes:

> The most common type that takes advantage of the nullable pointer optimization is `Option<T>`, where `None` corresponds to null. So `Option<extern "C" fn(c_int) -> c_int>` is a correct way to represent a nullable function pointer using the C ABI (corresponding to the C type `int (*)(int)`).
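A small example of what the nomicon describes (`double` is a made-up callback for illustration):

```rust
use std::mem;
use std::os::raw::c_int;

extern "C" fn double(x: c_int) -> c_int {
    x * 2
}

fn main() {
    let cb: Option<extern "C" fn(c_int) -> c_int> = Some(double);
    let nothing: Option<extern "C" fn(c_int) -> c_int> = None;

    // The nullable-pointer optimization: None occupies the all-zero
    // pattern, so the Option is no bigger than the bare fn pointer.
    assert_eq!(
        mem::size_of::<Option<extern "C" fn(c_int) -> c_int>>(),
        mem::size_of::<extern "C" fn(c_int) -> c_int>()
    );

    assert!(nothing.is_none());
    assert_eq!(cb.map(|f| f(21)), Some(42));
}
```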
At the time that I wrote the `x11-dl` crate, it didn't seem reasonable to expect users to check if a function pointer is `None` every time they use one when the crate ensures that they're all non-null (or `Some(_)`) before making them available. I was using `mem::zeroed` and a loop to initialize very large structs full of function pointers as an optimization that improved build times, execution times, and code size, and prevented a stack overflow from occurring on some systems. It would be reasonable if this SIGILL had existed from the start, but it's a breaking change that's very difficult to work around.
One extra thing which I'd like to emphasize here is that this is an incredibly silly footgun when dealing with generic code. Yes, accessing something that is `mem::zeroed` or `mem::uninitialized` should definitely trigger undefined behavior (I don't think anyone is disputing that) and should not be done; however, merely creating such a value without doing anything with it should not give the compiler blanket permission to maliciously turn my program into a single `ud2` instruction.
In fact, if the official stance is that initializing a `&T` using `mem::uninitialized()` is undefined behavior (based on what @ishitatsuyuki said, it might be...?), then this (written in safe Rust) program should trigger undefined behavior:
```rust
use std::collections::HashMap;

fn main() {
    let mut map: HashMap<(), &u32> = HashMap::new();
    map.insert((), &123);
    for (_, v) in map.drain() {
        println!("{}", v);
    }
}
```
Here's how the `Drain` iterator is defined:

```rust
impl<'a, K, V> Iterator for Drain<'a, K, V> {
    // ...
    #[inline]
    fn next(&mut self) -> Option<(SafeHash, K, V)> {
        self.iter.next().map(|raw| {
            unsafe {
                self.table.as_mut().size -= 1;
                let (k, v) = ptr::read(raw.pair());
                (SafeHash { hash: ptr::replace(&mut *raw.hash(), EMPTY_BUCKET) }, k, v)
            }
        })
    }
    // ...
}
```
And here's `ptr::read`:

```rust
#[inline]
#[stable(feature = "rust1", since = "1.0.0")]
pub unsafe fn read<T>(src: *const T) -> T {
    let mut tmp: T = mem::uninitialized();
    copy_nonoverlapping(src, &mut tmp, 1);
    tmp
}
```
So what it's effectively doing is this:

```rust
let (k, v): ((), &'static u32) = mem::uninitialized();
```
In general my assumptions were always:

1. `mem::uninitialized` is safe.
2. `mem::zeroed` is exactly the same as `mem::uninitialized`, except the underlying memory is zeroed.

The docs for `mem::uninitialized` currently do sound like it's fine to use `mem::uninitialized` as long as the value isn't accessed in any way. The docs for `mem::zeroed` are more spartan; they do however suggest a similarity to `mem::uninitialized`. If there are cases where `mem::uninitialized` doesn't trigger UB and `mem::zeroed` does, then I think that definitely should be documented!
I'll reopen this and cc people who are working on uninitialized/zeroed.
cc @RalfJung @rust-lang/wg-unsafe-code-guidelines
`uninitialized` is very different from `zeroed` in this context. A null reference is explicitly a value that doesn't exist, and creating one is undefined behavior.
@sfackler Uninitialized memory is also assumed to be a value that doesn't exist, and both are equally undefined behavior. If anything, zeroed memory is more likely to trigger a SIGSEGV or equivalent on most platforms, which is far better than with uninitialized memory, where this is only the best-case scenario.
More importantly, as @koute mentioned, this zeroed memory is not at any point being accessed in these use cases. It's being overwritten before any use, which should pretty much eliminate any undefined behavior. Regardless, there would be no point in providing `mem::uninitialized` or `mem::zeroed` or `unsafe` at all if such code were forbidden.
Why not just rewrite these functions to:

```rust
unsafe fn uninitialized<T>() -> T {
    panic!("this used to be allowed but isn't anymore because ???")
}

unsafe fn zeroed<T>() -> T {
    panic!("this used to be allowed but isn't anymore because ???")
}
```
> Uninitialized memory is also assumed to be a value that doesn't exist, and both are equally undefined behavior.

No, that is not correct. Uninitialized memory exists all over the place! The memory returned by `malloc` is uninitialized. Uninitialized memory has an undefined value.

> Regardless, there would be no point in providing mem::uninitialized or mem::zeroed or unsafe at all if such code is forbidden.

All such code is not forbidden.
> Uninitialized memory is also assumed to be a value that doesn't exist, and both are equally undefined behavior.
>
> No, that is not correct. Uninitialized memory exists all over the place! The memory returned by malloc is uninitialized. Uninitialized memory has an undefined value.

But `mem::uninitialized` tricks the compiler into believing that an uninitialized value is initialized, which is very much undefined behavior, and is the entire point of having the function in the first place.
> But mem::uninitialized tricks the compiler into believing that an uninitialized value is initialized

There is no tricking going on. The compiler is well aware that the value is uninitialized.
@sfackler, @Daggerbot - while the argument is sound, at the moment we have a significant number of projects for which nightly is unusable - not to mention that there might be pressure to push this change to stable.
Would it be reasonable to revert the change that caused this behaviour and then continue the discussion until a proper solution is found?
> There is no tricking going on. The compiler is well aware that the value is uninitialized.

If this were the case, then the following code would fail to build:

```rust
fn do_a_thing(_: &u8) {}

fn main() {
    unsafe {
        let x: u8 = std::mem::uninitialized();
        do_a_thing(&x);
    }
}
```

But it doesn't, because the compiler thinks `x` is initialized.
As a counter-example:

```rust
fn do_a_thing(_: &u8) {}

fn main() {
    let x: u8;
    do_a_thing(&x);
}
```

This does not build, with an error stating `use of possibly uninitialized 'x'`.
The documentation for `mem::uninitialized` has some extensive discussion about what you can and cannot do with a value returned by it.
The explicit purpose of `mem::uninitialized` is to bypass that check. It is an unsafe function, and the entire purpose of `unsafe` code is to handle things that the compiler cannot. An unsafe block says "trust me, I know what I'm doing". If you don't actually know what you're doing, then that's on you.
All you need to do to make this code valid is to use `mem::uninitialized` rather than `mem::zeroed`:
https://github.com/erlepereira/x11-rs/blob/49aef6df158c9ca38a8450670984f1971686d4ef/x11-dl/src/link.rs#L54
Did you not read the above comments? The value is being replaced with a valid value before it's ever accessed. Roughly equivalent example:

```rust
let mut x: u8 = mem::uninitialized();
x = 0; // not uninitialized anymore, and therefore not really undefined behavior
do_a_thing(&x);
```

Edit: The docs say that `mem::zeroed` is equivalent to `mem::uninitialized` followed by changing all bytes to zero. If this is not what it does, then the docs are unclear, as is the whole purpose of the function.
That example is totally fine and well defined. This, on the other hand, is still UB:

```rust
let mut x: &'static u8 = mem::zeroed();
x = &0;
do_a_thing(&x);
```
`mem::zeroed` is not the same thing as `mem::uninitialized`.
> ```rust
> let mut x: &'static u8 = mem::zeroed();
> x = &0;
> do_a_thing(&x);
> ```

What implementation detail makes `uninitialized` any more acceptable than `zeroed` in this example? Are references anything more than just safe pointers?
@kevinmehall pointed out an explanation:

> If LLVM is told a type is not going to be zero, and code explicitly writes a zero to it, it can assume that code is unreachable, and compile it into an abort to reduce code size.
So it looks like `mem::zeroed()` is not at all equivalent to `mem::uninitialized()` followed by `memset()`, as the docs suggest. My reasoning for using `zeroed` was that if there was a logic error in my code, I would rather read from address 0, which raises SIGSEGV on most (not all, but most) systems, than from an undefined location, where there could (at worst) be malicious third-party code.
Edit: It sounds like this optimization may have been triggered by LLVM if rustc recently switched to a later version. Either the docs should be updated to indicate what `mem::zeroed` really does, or the function should be changed to behave as the docs indicate.
References are defined to always point to a valid location, and in particular that location is never null. `let mut x: &'static u8 = mem::zeroed();` explicitly creates a value that the compiler assumes cannot exist. Similarly, `enum Void {} let x: Void = mem::uninitialized();` is bad - it's creating an instance of a type that has no valid instances.
@rkruppe mentioned on the lang channel of the Rust Discord that I may be wrong about `uninitialized` being valid in this case - can an optimizer select a value of 0 for an undefined value that's tagged as nonzero?
> So it looks like mem::zeroed() is not at all equivalent to mem::uninitialized() followed by memset(), as the docs suggest.

Why can't it be equivalent to that? `memset` is understood by LLVM IIRC, and it would realize that the value has been zeroed.
In most cases, it would appear on the surface that it is equivalent, but in some cases (such as in x11-dl), it gets optimized down to an instruction that raises SIGILL, which should be mentioned in the docs. This may just be an implementation detail in LLVM that was never intended in rustc, but it was the cause of some breaking changes. If I had used `mem::uninitialized` instead and subsequently zeroed the value manually, the signal would not have been raised.
> If I had used mem::uninitialized instead and subsequently zeroed the value manually, the signal would not have been raised.

Why would it not have been raised? We instruct LLVM to generate an abort inside of unreachable code, and code that invokes UB (like zeroing a non-zeroable value) is assumed to be unreachable.
It was because of the implied `memset` that LLVM thought the code was unreachable. If I were to zero or otherwise change the value in a way that LLVM doesn't recognize as a simple "zero this value," and therefore not trigger the optimization, then the SIGILL wouldn't occur.
As you said above, `mem::uninitialized` was the correct choice in this case. I erroneously used `mem::zeroed` as a precaution in case I forgot to initialize any of the struct fields, but no value was intended to be zero by the time it reached the library user.
One of these functions is only required because constructing these enormous structs normally (e.g. `MyStruct { field_a: 1, field_b: 2, ... }`) results in slow compile times (if it succeeds in compiling at all, which it didn't earlier today), slow execution times, and huge generated code. So I had to use either `uninitialized` or `zeroed` and initialize each field in a loop instead.
Before Rust, I primarily used C, where it's common to `memset` structs with lots of fields to zero before setting fields individually, because accidentally dereferencing a null pointer usually fails more gracefully than an uninitialized pointer. Because this is possible (and often recommended) in C, it seems like it should be permitted in unsafe Rust as well, unless the documentation for the functions in question states otherwise.
> It was because of the implied memset that LLVM thought the code was unreachable. If I were to zero or otherwise change the value in a way that LLVM doesn't recognize as a simple "zero this value," and therefore not trigger the optimization, then the SIGILL wouldn't occur.

This is not a guarantee that you can rely on in any way. The Rust UB rules prohibit creating an invalid value, of any type, at any time. Finding a way around this (for instance, by creating a zeroed `[u8; N]` and then casting it to the desired type) is still UB. The fact that rustc or LLVM does not detect it some of the time does not make it any more legal.
The reason for this is things like the following:

```rust
let o = Some(unsafe { std::mem::zeroed::<&i32>() });
if o.is_none() {
    unreachable!("wtf!?")
}
```

In order for the compiler to be allowed to do the null-pointer optimization that results in taking the branch, it needs to be illegal to create the value in this way.
I do agree that the `std::mem::zeroed` documentation is bad here: it implies that such values can be created as long as they are never dropped, and this is false. Calling `std::mem::zeroed::<&T>()` is undefined behaviour. It might be nice if we linted or errored on this, but ultimately the onus is on the unsafe code author to use the tools correctly.
> So it looks like mem::zeroed() is not at all equivalent to mem::uninitialized() followed by memset(), as the docs suggest. My reasoning for using zeroed was that if there was a logic error in my code, I would rather read from address 0, which raises SIGSEGV on most (not all, but most) systems, which is better than an undefined location, where there could (at worst) be malicious third party code.

Reading from address 0 is undefined behavior and anything can happen. It might segfault, or LLVM might notice that this is happening, deduce the code is unreachable and do whatever. LLVM does not guarantee that reading from or writing to 0 triggers a segfault.
I am afraid that `mem::zeroed` is almost as bad a footgun as `mem::uninitialized`. I think both will be taken care of by https://github.com/rust-lang/rfcs/pull/1892, for which FCP was proposed forever ago -- so hopefully I can push it to resolve the remaining questions fairly soon. We also should decide explicitly that we really have some invariants that must be upheld always, even when data is not "touched", and what at least some of these invariants are. I have some plans there as well, but I am not sure when I will get around to writing that RFC (or if an RFC is even the right format). Then we can start to put that information into more places, with the goal that every unsafe code author must be aware of this.
But I do not think there is a bug here. I cannot think of a fix short of not doing enum layout optimizations, which certainly is not an option.
A fix at the compiler level doesn't seem necessary, but a change to the documentation for `mem::zeroed` does:

> This has the same effect as allocating space with `mem::uninitialized` and then zeroing it out. It is useful for FFI sometimes, but should generally be avoided.
>
> There is no guarantee that an all-zero byte-pattern represents a valid value of some type `T`. If `T` has a destructor and the value is destroyed (due to a panic or the end of a scope) before being initialized, then the destructor will run on zeroed data, likely leading to undefined behavior.
>
> See also the documentation for `mem::uninitialized`, which has many of the same caveats.
From this text, it's clear that the function will return a value that may not be valid, but it's not at all clear that the function call will crash your program even if the value is never used. In situations where `mem::uninitialized` works fine, one would expect `mem::zeroed` to work fine as well, especially to people coming from languages like C or C++ where the compiler doesn't randomly insert crashes when you call `memset`.
> a change to the documentation for mem::zeroed does.

Agreed. The documentation should be changed to "this is deprecated." ;)
> In situations where mem::uninitialized works fine, one would expect mem::zeroed to work fine as well,

Just to be clear, that is the case. It will never make your program less defined, or more broken, to literally replace `mem::uninitialized` with `mem::zeroed`.
However, it is the nature of UB that just because your program has UB with `mem::uninitialized`, does not mean it will crash.

> languages like C or C++ where the compiler doesn't randomly insert crashes when you call memset.

Oh, it totally does. Or, worse, it optimizes away things you thought it would be compiling, because it deduced that your program has UB.
It's perfectly reasonable to not rely on undefined behavior, but what alternative does Rust provide when making very large structs? When constructing it normally (`Xlib { ... }`), the compiler ran for several (5-10) minutes before crashing due to running out of memory. We switched to `mem::uninitialized`, which prevented the crash, but I'm getting the impression that the next major release of this library should just do things the C way, because I know that `malloc()` and `free()` will actually work. Or would this be considered undefined behavior for some reason as well?

```rust
let ptr = libc::malloc(mem::size_of::<Xlib>());
if ptr.is_null() { return Fail; }
// do some stuff with ptr
libc::free(ptr);
```
@Daggerbot I am not sure what exactly your use-case is. But essentially, you are fine as long as you keep the uninitialized stuff behind a raw pointer or inside a union. If the data you are putting into the array is `Copy`, you could take the `MaybeUninit` union from https://github.com/rust-lang/rfcs/pull/1892 and implement it in stable today (sketched below).
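For instance, a sketch of such a union for `Copy` data, assuming the shape proposed in RFC 1892 (not the eventual `std` API):

```rust
// The uninitialized bytes live inside the union, so no value of type T
// is claimed to exist until we read `value` out at the end.
union MaybeUninit<T: Copy> {
    uninit: (),
    value: T,
}

fn main() {
    let mut slot: MaybeUninit<[u64; 4]> = MaybeUninit { uninit: () };
    unsafe {
        // Initialize through a raw pointer derived from the union itself,
        // never through a reference to a possibly invalid [u64; 4].
        let p = &mut slot as *mut MaybeUninit<[u64; 4]> as *mut u64;
        for i in 0..4 {
            p.add(i).write(i as u64);
        }
        // Only now read the fully initialized payload out of the union.
        assert_eq!(slot.value[3], 3);
    }
}
```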
If there is non-`Copy` data involved, I am afraid there is no good way on stable today. :/ However, you are probably fine doing `Box::into_raw(Box::new(mem::uninitialized()))`, and then working with that raw pointer and only ever turning it into a reference (and, eventually, a `Box` for deallocation) when initialization is done.
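A sketch of that workaround, with a hypothetical `Payload` type standing in for the real data (note that `mem::uninitialized` has since been deprecated in favor of `MaybeUninit`):

```rust
use std::mem;

struct Payload {
    a: u64,
    b: u64,
}

fn main() {
    unsafe {
        // Move the uninitialized value straight into a heap allocation...
        let raw: *mut Payload = Box::into_raw(Box::new(mem::uninitialized()));
        // ...initialize every field through the raw pointer...
        (*raw).a = 1;
        (*raw).b = 2;
        // ...and only reconstruct the Box once everything is written.
        let done: Box<Payload> = Box::from_raw(raw);
        assert_eq!(done.a + done.b, 3);
    }
}
```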
> When constructing it normally (Xlib { ... }), the compiler ran for several (5-10) minutes before crashing due to running out of memory.

Ouch, how big is that struct?!?? When did you try this the last time? The `[value; N]` case for big `N` has been optimized a lot, I hear.
I don't think that's fully correct, with the rules as written. I do not think you can ever safely deal with uninitialized memory for any type that has illegal representations, because as soon as you create an lvalue to write to, you create an invalid value. Even if you never read from it, only write to it, it's still illegal.
And since Rust gives you no real guarantees about layout, you can't safely do something funky, like load the offsets of each field and write to them individually in raw memory before finally casting to the desired type.
> I don't think that's fully correct

What exactly are you referring to by "that"?
The intention is explicitly that there are "unsafe places" (avoiding the overloaded and confusing term "lvalue"), created from raw pointers, that do not require full validity unless the value is actually read from the place. Also see my example at https://github.com/rust-lang/rfcs/pull/1892#issuecomment-410756170.
Ah, I hadn't realized that was explicitly intended to be allowed. I guess we'll have to tighten the language up in the unsafe WG.
> I am not sure what exactly your use-case is. But essentially, you are fine as long as you keep the uninitialized stuff behind a raw pointer or inside a union. If the data you are putting into the array is `Copy`, you could take the `MaybeUninit` union from rust-lang/rfcs#1892 and implement it in stable today.

In a major update, I'll suggest that we `malloc()` the memory manually instead of returning a huge struct on the stack. I'm not sure what guarantees there are about stack size anyway, and having a multi-KiB return value seems hacky.
> Ouch, how big is that struct?!?? When did you try this the last time? The `[value; N]` case for big `N` has been optimized a lot, I hear.

I thought I remembered one having over 1000 members, but checking again the biggest one is just under 800. All function pointers, and not in nice arrays, unfortunately. They're structs exposing a C API that may or may not be available at runtime. It would be nice to put them in static memory instead, but libX11 crashes if it and GLX are not cleaned up in the correct order.
IIRC, the reason I didn't do something like this in the first place is that some of the C functions are variadic, and there isn't a way to relay their params in Rust or even in C (no `vprintf`-like variants), so exposing the function pointers directly looked like the only way those functions would be available at all.
@RalfJung I just realized: won't requiring full validity prevent some forms of layout optimization, though? Or are we committing to the idea that structs are always going to have disjoint fields (likewise, arrays always going to have disjoint elements), and therefore writing to a struct field or array element is guaranteed to be safe even if the rest of the struct isn't valid?
(And I guess, as a corollary: should such writes be required to go through a union or raw pointer, or are they permitted through an `&mut` as well?)
@alercah I'm not sure what you mean, but per se, source-language-level UB doesn't ever prevent optimizations; it just means the compiler can't apply the source language rules to its IR after it has done the optimization. For layout optimizations specifically, that just means the source program isn't allowed to write e.g. `2` into an `Option<bool>`, but compiler-generated code can do that as long as it tracks that an `Option<bool>` may indeed be represented as the byte `2`.
> Or are we committing to the idea that structs are always going to have disjoint fields (likewise, arrays always going to have disjoint elements)

We don't have any way around that, since you can take pointers to fields/elements and forget that they are placed inside some other type. For example, `(bool, bool)` always has to be represented as two bytes, each of which holds an individual `bool`, since you can create `&tup.0: &bool` and `&tup.1: &bool` and pass them to code that doesn't know anything about `(bool, bool)`; the snippet below illustrates this.
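A tiny illustration of that point:

```rust
// This function could live in another crate: it knows nothing about
// (bool, bool), so each bool it receives must be an independently
// addressable, independently valid value.
fn check(b: &bool) -> bool {
    *b
}

fn main() {
    let tup: (bool, bool) = (true, false);
    assert!(check(&tup.0));
    assert!(!check(&tup.1));
}
```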
Upon thinking about it some more, I am not actually sure exactly what you're proposing in conjunction with your recent internals post, so I'm going to follow up on IRC.
@alercah Which layout optimizations are you thinking of that would not work any more?
I'm not sure that I was thinking about any that currently exist in Rust, which is why I was asking. :)
Well I cannot think of any.^^
Related to the discussion above is my blog post on safety and validity invariants in Rust.
I think we can close this now. The issue has been reopened based on a comment by @koute, which complained that

- `ptr::read` uses `mem::uninitialized`: that has been fixed.
- `mem::uninitialized` and `mem::zeroed` do not bring enough attention to the UB potential here. That has been fixed as well.

The issue also diverged a bit into whether "having" but "not using" a NULL reference should be UB or not. That's just what the semantics are and have been though, and we are about to finally stabilize `MaybeUninit` to account for that.
If someone wants to propose changing the semantics, IMO that warrants a new issue or rather an RFC -- and one that also precisely defines what a "use" is and how we make sure that we can still emit `range` attributes for LLVM wherever possible (amongst other optimizations).
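For reference, a hedged sketch of how `ptr::read` can be written without `mem::uninitialized` (this mirrors the shape of the fix, not the exact libcore code):

```rust
use std::mem::MaybeUninit;
use std::ptr;

// The destination starts out as MaybeUninit, so no value of type T is
// claimed to exist until after the bytes have been copied in.
unsafe fn read<T>(src: *const T) -> T {
    let mut tmp = MaybeUninit::<T>::uninit();
    ptr::copy_nonoverlapping(src, tmp.as_mut_ptr(), 1);
    tmp.assume_init()
}

fn main() {
    let x = 42u32;
    let y = unsafe { read(&x as *const u32) };
    assert_eq!(y, 42);
}
```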
The docs still don't mention that `mem::zeroed` can cause crashes in some cases where `mem::uninitialized` does not. It is reasonable to assume that invalid values should not be operated on, but `mem::zeroed` has been shown to simply raise SIGILL instead of assigning a value, even if it is replaced with a valid value before use, which makes this statement false:

> This has the same effect as allocating space with `mem::uninitialized` and then zeroing it out.

Unless this has changed since I last had that issue.
> that mem::zeroed can cause crashes in some cases where mem::uninitialized does not.

This is not accurate. `mem::zeroed` is always at least as correct as `mem::uninitialized`. It might be that both cases are UB (such as when using them to create a reference) and that the UB manifests in different ways in current versions of LLVM, but that is a detail of the optimization pipeline and can change any time.
`mem::zeroed::<&T>()` and `mem::uninitialized::<&T>()` are the same: both cause immediate UB. Both can cause crashes. Or do whatever else.
We do not intend to guarantee that any UB leads to a crash, so if this is the only remaining issue, that's "not a bug".
> which makes this statement false:

That statement is correct and has been correct for every version of Rust since 1.0.
Both "allocating space with `mem::uninitialized` and then zeroing it out" and "`mem::zeroed`" cause UB when used for a reference, so the effect is the same.
Two programs that both have UB might behave differently when compiled, but again that's pretty much impossible to prevent, and it certainly is "not a bug".
> The docs still don't mention

Did you look at the nightly version of the docs? It says explicitly:

> There is no guarantee that an all-zero byte-pattern represents a valid value of some type T. For example, the all-zero byte-pattern is not a valid value for reference types (&T and &mut T). Using zeroed on such types causes immediate undefined behavior because the Rust compiler assumes that there always is a valid value in a variable it considers initialized.

And then it even gives a concrete example:

```rust
use std::mem;

let _x: &i32 = unsafe { mem::zeroed() }; // Undefined behavior!
```
> It is reasonable to assume that invalid values should not be operated on

The problem with this is how to define "operated on". The only consistent definition that we know so far includes "copy as part of a typed assignment" in "operated on" -- so the following program "operates on" a NULL reference, by storing it in `x`:

```rust
let x: &i32 = unsafe { mem::zeroed() };
```

Why did we include that? Well, for example we must include passing a value as argument to a function in "operated on", because we tell LLVM that a reference passed as an argument is not NULL:

```rust
fn f<T>(x: &T) -> &T { x } // LLVM gets to assume that this is not NULL.

let x: &i32 = f(unsafe { mem::zeroed() }); // So this must be UB.
```

It seems rather strange to include "pass as argument to a function" in "operated on", but not also include "use in an assignment". So we just include them all; that's much easier to define, easier to remember, and easier to make use of in the compiler.
Instead of having to remember which things count as an "operation" and which don't, we just say "any time a value gets used in a typed way, it must be valid for that type". The assignment in `let x = ...` is a typed copy, putting the result of `...` into `x`, so it falls into the scope of this definition.
Looks like it reappeared after https://github.com/rust-lang/rust/pull/62150 in https://github.com/erlepereira/x11-rs/issues/99
@o01eg do you know where in your code the bad reference is being created?
I can only repeat that `mem::zeroed` on a type including a reference is immediate UB, and this is stated explicitly by the docs. Having a SIGILL there is a feature that alerts you of the UB instead of silently exploiting it. You cannot rely on always getting the SIGILL, though.
FWIW, I'd be fine with closing this bug. If someone else on the lang team concurs, let's start an rfcbot close?
I haven't looked deeply into it yet, but now the same thing seems to be happening when calling `mem::uninitialized`, which had previously fixed things.
`mem::uninitialized` codegen changed, so it is possible that the behavior of a program that has UB changed.
You should really not ever call `uninitialized` or `zeroed` on a type that does or might contain a reference as a field. That has never been supported and at best "happened to work".
It can't be just references. The types causing the crash in `x11-dl` have no references, just pointers to C functions and a `*mut c_void`. However, one of the members has a destructor, which is the only reason I can think of that might cause this. Using `mem::zeroed` in the first place, and later `mem::uninitialized`, was a workaround for another bug that seems to occur when large objects are allocated on the stack. Unfortunately, if the official response to breaking changes is "lol not a bug", then there doesn't appear to be any possible workaround (at least not that we've come up with) in the Rust language other than redesigning the library, a major version bump, and forcing everybody who uses `x11-dl` to update to the new version.
> just pointers to C functions

Were those function pointers wrapped in `Option`? If not, then the value zero is a trap representation for them.
With `uninitialized`, it won't matter if `Option` is used. Function pointers are similar to reference types and, to the best of my knowledge, subject to the same requirements, in particular that they must be aligned. I don't think it's valid to create an uninitialized one---you'd need to store them as raw pointers (`*const ()` on platforms that support it, as there are apparently no raw function pointers?) to avoid these issues.
The compiler changing how it compiles code with UB isn't considered a breaking change: there are no guarantees whatsoever about how code with UB will behave. It's possibly a mistake that it's poorly documented that misaligned function pointers are UB (even the current version of the UCG doesn't seem to address it), but that seems to me a docs bug, not a compiler bug, especially given that it is documented that they cannot be null.
Actually, I just thought of something else: even if misaligned function pointers aren't well documented as UB, replacing `zeroed` with `uninitialized` is never going to be correct---the compiler can happily assume that the value returned from `uninitialized` is going to be all-zeroes, which is enough to trigger UB on its own.
While I agree that `mem::zeroed` and `mem::uninitialized` should be avoided, this is a breaking change. Code that used to work fine doesn't work anymore. Code that others are relying on. That's a breaking change. It seems odd that these functions exist in the standard library at all since they're now basically useless. There's no reason they have to crash a program, and they didn't before, but for some reason, somebody decided they now do.
If I had written this library today, I would have known to look for a better solution, but unfortunately I wrote it ~5 years ago or so, just after Rust 1.0 was released. What can possibly be done from my end? Yank the library and tell the users they'll have to write a new one? Redesign and rewrite it and force everyone to rewrite their usage anyway? What will happen next time there's an unnecessary change that breaks people's code, other than wishing I had used a stable language rather than some experimental new language that gets falsely touted as being reliable?
You were doing something you weren't allowed to do. Previously, the compiler didn't notice, but now it does. You're only getting caught now, but your code was incorrect all the way back to Rust 1.0.
That the bug was only encountered now is unfortunate, but compiler changes aren't the only thing that could break this code. It might break if you run it on a different platform. It might break if you compile with different flags, even in an older version of Rust. It might break if you use the code in just the wrong way that nobody has yet, again even in an older version of Rust. The compiler changed in some way which broke the specific examples complained of, but there is not and never was a guarantee that your code was operating correctly. Indeed, it may well only have appeared to be operating correctly but actually had more subtle bugs. It may have been corrupting memory in a way which you didn't notice. It may have led the compiler to, in some cases, optimize away a null pointer check that ought to have been there, causing subtle misbehaviour. Even on versions of the compiler where this "worked", there is no guarantee that it was in fact doing what you thought it was.
You say you wish you had used a "stable language". Would you consider C to be stable? You'd run into the same issues there; nothing to do with language stability. Chris Lattner (who wrote LLVM) wrote a series of blog posts about UB back in 2011 (I linked the second post, because it's the most relevant here). I don't know of any C or C++ compiler that guarantees that code performing undefined behaviour will not change how it behaves: the major ones (clang, GCC, MSVC) certainly do exactly the opposite (except where they have extensions defining behaviour in some cases where the standards would call it UB).
What should you do? You should stop shipping a library that is insecure and buggy. I don't know anything about your code so I can't tell you how to accomplish that. Ideally you can change the types. If they're in the public interface, and you can make appropriate changes without changing your public interface, go do that. Failing all else, make dummy functions that panic when called and initialize to point to them (see the sketch after this comment)---presumably you don't care about what the actual values were, so using trapping dummies is fine. And you'll get an extra benefit of improved security in case something does try to call these values: trapping rather than setting the instruction pointer to an effectively random address.
Edit: and if for some reason that doesn't work, do the only thing you can do: make a breaking change and tell all your users that they have a security bug and need to adjust. It's the right thing to do---so much so that security bugs are an accepted reason to break backwards compatibility by many major libraries and compilers.
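A sketch of the trapping-dummy idea from the comment above (hypothetical names; not the actual x11-dl types):

```rust
// Every slot starts out pointing at a stub that fails loudly, so nothing
// is ever uninitialized and a stray call can't jump to a garbage address.
// (A panic here aborts rather than unwinding across the C ABI, which is
// still a loud, well-defined failure.)
unsafe extern "C" fn missing() -> i32 {
    panic!("function was never loaded");
}

struct Api {
    open: unsafe extern "C" fn() -> i32,
    close: unsafe extern "C" fn() -> i32,
}

impl Api {
    fn new() -> Api {
        // Overwrite these with real addresses as they are loaded;
        // anything left untouched traps instead of being undefined.
        Api { open: missing, close: missing }
    }
}

fn main() {
    let _api = Api::new();
}
```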
`x11-dl` is just a language binding for a C library that exposes the C functions as a struct of function pointers. These structs have a private `lib` field, which is just a `*mut c_void` with a destructor, and public function pointers. Because Rust (at least last time I really used it) can't box values without first passing them through the stack, there were stack overflows at runtime on some systems when loading the libraries, not to mention insane compile times (around 20 min on my old laptop). So as a workaround, the struct was put down uninitialized (previously zeroed), the `lib` field was carefully set, and the function pointers were initialized in a loop.
> Would you consider C to be stable? You'd run into the same issues there; nothing to do with language stability.

Nothing that this library does would be an issue in C or any other systems language that I'm aware of. In fact, I've worked with code many times before in C and C++ that does almost exactly the same thing without issue (like SDL does after it decides whether to load the X11 or Wayland libraries). If Rust is intended to interface with C code, it doesn't seem right that it's even more unpredictable than C when doing so. Unsafe code should mean "you have the power to shoot yourself in the foot," not "we'll go ahead and shoot your foot for you." All the library is trying to do is initialize a big struct, and the compiler keeps coming up with new ways to make this impossible every 6 months.
> Because Rust (at least last time I really used it) can't box values without first passing them through the stack, there were stack overflows at runtime on some systems when loading the libraries, not to mention insane compile times (around 20 min on my old laptop). So as a workaround, the struct was put down uninitialized (previously zeroed), the lib field was carefully set, and the function pointers were initialized in a loop.

Ahh. I can see how you arrived at this. There's currently no good language facility for doing this, other than on a field-by-field basis, especially for function pointers. But there are other workarounds which don't rely on UB. For instance, you could add another struct where each field is just a `*const ()`, initialize the entire thing, and then take a pointer to it and cast it to a pointer to the struct that has the initialized pointer types (see the sketch below). Both structs would need to be `#[repr(C)]`, but then this would be guaranteed to work on any platform that guarantees function-object pointer conversions (which includes POSIX). And it wouldn't rely on UB, so it would be guaranteed to work without breaking.
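A hedged sketch of that suggestion with a hypothetical two-field API; the reinterpretation relies on POSIX's guarantee that function pointers round-trip through object pointers, as noted above:

```rust
#[repr(C)]
struct Api {
    open: unsafe extern "C" fn() -> i32,
    close: unsafe extern "C" fn() -> i32,
}

// Same layout, but every field is an object pointer that can be filled
// in a loop without ever creating an invalid function pointer value.
#[repr(C)]
struct RawApi {
    fields: [*const (); 2],
}

unsafe extern "C" fn stub() -> i32 {
    -1
}

fn main() {
    let fp: unsafe extern "C" fn() -> i32 = stub;
    let mut raw = RawApi { fields: [std::ptr::null(); 2] };
    for f in raw.fields.iter_mut() {
        // In the real crate this would be a dlsym-style lookup.
        *f = fp as *const ();
    }
    // Both structs are #[repr(C)] and the same size, so reinterpreting
    // the fully initialized RawApi as Api is the trick described above.
    let api: Api = unsafe { std::mem::transmute(raw) };
    unsafe { assert_eq!((api.open)(), -1) };
}
```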
> Nothing that this library does would be an issue in C or any other systems language that I'm aware of.

> Unsafe code should mean "you have the power to shoot yourself in the foot," not "we'll go ahead and shoot your foot for you."
I think you're conflating two things here, namely "what is UB" and "what are the consequences of UB". You're complaining that the compiler has gone and shot your foot, but that's not true. You already shot yourself in the foot; it just took a long time for you to feel the pain. The equivalent situation in C, as far as breaking changes go, is not an uninitialized function pointer (C permits that; Rust does not). An equivalent in C might be signed integer overflow: it's not obviously UB, it often happens to work, and most of the bugs that arise are due to the compiler optimizing around it rather than an explicit trap or obvious misbehaviour. Additionally, it's defined behaviour in Rust (although it depends on debug/release mode), so a Rust program can use signed overflow predictably and reliably, while a C program that has it is inherently buggy, regardless of whether it happens to work sometimes.
@Daggerbot

> Unfortunately, if the official response to breaking changes is "lol not a bug", then there doesn't appear to be any possible workaround

> What will happen next time there's an unnecessary change that breaks people's code

You are making several unfair and unfounded assumptions here. First of all, you seem to think this change was unnecessary. I am not sure why you think that; the change that likely triggered this is https://github.com/rust-lang/rust/pull/62150, in which we were able to remove a very crazy intrinsic and define it in plain Rust instead. That's a great improvement in terms of maintainability of the compiler and also the language spec (once such a thing exists). Second, you seem to think that we knew the change in question would have any kind of fall-out. Again, that's not correct; we had no idea. This PR does not change what any of the involved functions actually does: `mem::uninitialized` still leaves memory uninitialized, and `mem::zeroed` still returns zeroed memory. However, it leads to slightly different paths being taken in the optimizer, so if code has UB, its behavior can change. This is unavoidable as a part of compiler development.
And finally, it is also not correct that we don't care when our changes break libraries, even if those libraries have undefined behavior. But if the programmer's response to undefined behavior is "lol not a bug", there is no basis for a constructive discussion.
UB is a contract between the programmer and the compiler (and also, somehow, a contract between the programmer and the compiler developers), and if the programmer does not fulfill their side of the contract, the other party will not fulfill their side of the contract either.
So, with that out of the way, maybe we can have a constructive discussion?
What happens here is the equivalent of a program doing an out-of-bounds memory access. With one compiler version that might "work" and just return some garbage value; with the next compiler version it might lead to a SEGFAULT because now the access points to an unmapped page. It is basically impossible for a compiler to guarantee that incorrect programs (i.e., programs having UB) keep "working". Any time anything is changed in the optimizer or in the implementation of some method used by unsafe code, that can lead to unsafe code with UB "breaking" (I put this in quotes because the code was, in some sense, already broken -- it had UB). I invite you to look at https://github.com/rust-lang/rust/pull/62150 and tell me what's "wrong" about it and how we could even have foreseen that this would have any kind of fall-out.
Note that this kind of thing happens in C all the time. Some program has UB, and a compiler update breaks it. In a language where programmers can cause UB, it is unfortunately practically impossible to guarantee any stability for programs with UB. We would have to basically stop developing the language, as any seemingly harmless change can have pretty much arbitrary effects.

> What can possibly be done from my end?

We could try to work together to find out if and where your code has undefined behavior. If it has no undefined behavior, there is definitely a compiler bug here. If it has undefined behavior, we can try to find out if that can be avoided. If it cannot, that's in some sense a "language bug" -- there probably should be a way to do what you want to do without causing UB, but current Rust just does not support that. Then we can work on trying to fix that language bug.
But given that you seem to be willing to use `mem::zeroed`, I think UB can be avoided. We just have to find where in your library the UB is happening.
> With `uninitialized`, it won't matter if `Option` is used

That's not entirely correct; `Option<fn()>` is still way better than `fn()`. It is still UB in principle, but much less likely to cause miscompilations in practice.
More importantly though, `mem::zeroed` is correct for `Option<fn()>` (see the sketch below). So @Daggerbot, if you have function pointers and can live with `mem::zeroed`, wrapping them in an `Option` should work for you. Could you point to where, in your code, the structs you want to zero-initialize are defined?
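A small sketch of that (hypothetical field names):

```rust
use std::mem;

// With the function pointers wrapped in Option, the all-zero pattern is
// a valid value (every field is None), so zero-initialization is sound.
struct Table {
    connect: Option<unsafe extern "C" fn() -> i32>,
    disconnect: Option<unsafe extern "C" fn() -> i32>,
}

fn main() {
    let table: Table = unsafe { mem::zeroed() };
    assert!(table.connect.is_none());
    assert!(table.disconnect.is_none());
}
```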
> I invite you to look at #62150 and tell me what's "wrong" about it and how we could even have foreseen that this would have any kind of fall-out.

I'm going to take you up on this invitation and say that doing `uninit().assume_init()` is on a rather different level than just leaving a value uninitialized. It's a thin line to walk, but from the perspective of LLVM, initializing something to undef and then overwriting it with something valid will (with some caveats) usually be well-defined. On the other hand, taking an undef value and immediately asserting that it has certain properties (as `assume_init()` will likely do, through the use of return attributes if nothing else) will immediately be optimized to unreachable.
It's still UB on the Rust language level in either case, but it seems like this change goes against how these intrinsics have been used in practice for a long time, and it might make sense to delay this kind of change until they have been deprecated for a while.
> doing uninit().assume_init() is on a rather different level than just leaving a value uninitialized. It's a thin line to walk, but from the perspective of LLVM initializing something to undef and then overwriting with something valid will (with some caveats) be usually well-defined. On the other hand, taking an undef value and immediately asserting that it has certain properties (as assume_init() will likely do through the use of return attributes if nothing else) will immediately be optimized to unreachable.

`assume_init` does not assert initializedness towards LLVM in any way that `mem::uninitialized` would not already do (namely, by having `T` as its return type). So I do not understand in which sense you think this makes a difference for LLVM. The old `mem::uninitialized` literally took an `undef` and immediately returned it at some type `T`, so if that's enough for LLVM to insert a trap, why would anything get worse now?
Are you saying it would help if we, e.g., changed `MaybeUninit::into_inner` (which is used to implement `assume_init`) to do a `ptr::read` instead of accessing the field directly?
Seems like we are having something of a parallel discussion at https://github.com/amethyst/amethyst/issues/1808#issuecomment-513158717.
I am going to close this.