Rust: Show a constant's virtual memory on validation errors

Created on 14 Aug 2018 · 11 Comments · Source: rust-lang/rust

The following code worked up until nightly-2018-07-30-x86_64-unknown-linux-gnu:

pub union Transmute<T:Copy, U:Copy> {
    pub from:T,
    pub to:U,
}

#[derive(Clone,Copy)]
struct Foo(i64);

static FOO:Foo = unsafe { Transmute { from:0 }.to };

fn main() {
}

error[E0080]: this static likely exhibits undefined behavior
 --> src/main.rs:9:1
  |
9 | static FOO:Foo = unsafe { Transmute { from:0 }.to };
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ type validation failed: encountered undefined bytes at .0
  |
  = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rust compiler repository if you believe it should not be considered undefined behavior

error: aborting due to previous error

A-const-eval A-diagnostics C-enhancement E-medium E-mentor P-low T-compiler

All 11 comments

cc @oli-obk

This seems legit! Your 0 there is not typed as 0i64, but instead it infers by default to i32, which means half of your Foo would be undefined.

Ah.. that makes sense. Making the from type explicit does indeed work.
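
To make the size mismatch concrete, here is a minimal sketch of the fix described above. It assumes a current stable toolchain, where reading a union field in a static initializer is allowed; the only change from the original report is the `0i64` literal.

```rust
use std::mem::size_of;

pub union Transmute<T: Copy, U: Copy> {
    pub from: T,
    pub to: U,
}

#[derive(Clone, Copy)]
struct Foo(i64);

// With an explicit `i64` literal, `from` covers all 8 bytes of `Foo`,
// so no byte of the resulting static is left undefined.
static FOO: Foo = unsafe { Transmute { from: 0i64 }.to };

fn main() {
    // An unsuffixed `0` would default to `i32`: 4 bytes against the 8 of `Foo`.
    println!("i32: {} bytes, Foo: {} bytes", size_of::<i32>(), size_of::<Foo>());
    println!("FOO.0 = {}", FOO.0);
}
```

With the unsuffixed literal, the union is still 8 bytes wide but only the first 4 are written, which is exactly what the validation error reports.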

Using the default i32 value does work for non-static types, which is different behavior, but I suppose that's not a real problem since it's unsafe and UB anyway.

Reopening to track the improvement of the diagnostic to show the bit pattern and which bytes were undefined.

The workaround where an untagged union is used to create uninitialized memory in a const fn also seems to be affected by this issue.

union Foo<T> {
    t: T,
    u: (),
}
const unsafe fn uninitialized<T>() -> T {
    Foo { u: () }.t
}

This code no longer compiles, since the t part of the union is intentionally uninitialized.
However, according to the RFC below, uninitialized memory should in future be handled with a new MaybeUninit type.

See also the discussion in https://github.com/rust-lang/rfcs/issues/411 and https://github.com/rust-lang/rust/pull/50150 and https://github.com/rust-lang/rfcs/pull/1892

Yes, you should return a Foo<T> and let the user initialize the t field whenever they are ready, instead of producing an invalid T.

Additionally, your code is not affected unless you use the function to produce a const or static that contains this value.
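
For readers on a current toolchain, the advice above maps onto `std::mem::MaybeUninit`, the type that came out of the RFC linked earlier. The following is only a sketch: the helper name `uninitialized` mirrors the snippet above, and `SLOT` is a hypothetical static added for illustration.

```rust
use std::mem::MaybeUninit;

// Return the wrapper rather than an invalid `T`; the caller decides
// when the value becomes initialized.
const fn uninitialized<T>() -> MaybeUninit<T> {
    MaybeUninit::uninit()
}

// Hypothetical static: storing the wrapper is fine because nothing
// claims that its bytes form a valid `u64`.
#[allow(dead_code)]
static SLOT: MaybeUninit<u64> = uninitialized();

fn main() {
    let mut slot = uninitialized::<u64>();
    slot.write(42);
    // Safety: `slot` was initialized by the `write` call just above.
    let value = unsafe { slot.assume_init() };
    println!("{}", value);
}
```

The design point is the same as in the comment above: the uninitialized state stays visible in the type until the caller explicitly asserts that initialization has happened.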

@oli-obk I would like to take this up.

So... Before we start out on this, we should bikeshed what the output should look like.

cc @rust-lang/wg-diagnostics I'm wondering if it would make sense to render the memory of a constant similar to how hex editors render memory. Then we could dump the rendered version in a virtual file that diagnostics can get spans into. For the example above (where only half of an i64 is initialized), I'm thinking that the virtual file could look like

00 00 00 00 ▓▓ ▓▓ ▓▓ ▓▓

and thus a diagnostic pointing into it would look like

note: this `i64` does not fully consist of defined bytes
  --> FOO.0:1:12
00 00 00 00 ▓▓ ▓▓ ▓▓ ▓▓
            ^^ undefined byte

Then we could dump the rendered version in a virtual file that diagnostics can get spans into.

Hopefully we can bypass needing an actual Span in order to report a good-looking error.

Also keep in mind IDEs can't really show your "virtual hex dump" files, as we don't have a way to export them from the compiler (other than saving them in a temp dir), so it'd be nicer to make the long-form "rendered" message itself contain the hex dump.

Minor bikeshed: ▓ looks too much like a 0 or an 8 to me.
Also, it's wider than a regular monospace character, so ^ wouldn't work.


Alternatives:

00 00 00 00 UU UU UU UU
            ^^ undefined byte

00 00 00 00 uu uu uu uu
            ^^ undefined byte

00 00 00 00 XX XX XX XX
            ^^ undefined byte

00 00 00 00 xx xx xx xx
            ^^ undefined byte

00 00 00 00 ?? ?? ?? ??
            ^^ undefined byte

00 00 00 00 ## ## ## ##
            ^^ undefined byte

00 00 00 00 __ __ __ __
            ^^ undefined byte

00 00 00 00 __ __ __ __
            ^^ undefined byte

I like that one, it's very clear at a glance

Underscores are also the format I used for undefined bytes in the original Miri report and slide deck, so +1 from me. :)
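
As a rough illustration of the underscore format favoured in this thread, the rendering could be sketched as below. This is not the compiler's implementation: the function names and the `Option<u8>` representation of "defined or undefined" bytes are assumptions made for the example.

```rust
/// Render bytes the way a hex editor would; `None` stands for an undefined byte.
fn render_bytes(bytes: &[Option<u8>]) -> String {
    bytes
        .iter()
        .map(|b| match b {
            Some(v) => format!("{:02x}", v),
            None => "__".to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Build the caret line that points at the first undefined byte, if any.
fn underline_first_undefined(bytes: &[Option<u8>]) -> Option<String> {
    let idx = bytes.iter().position(|b| b.is_none())?;
    // Each rendered byte occupies two characters plus one separating space.
    Some(format!("{}^^ undefined byte", " ".repeat(idx * 3)))
}

fn main() {
    // Half-initialized 8-byte value, as in the `Foo(i64)` example.
    let memory: [Option<u8>; 8] =
        [Some(0), Some(0), Some(0), Some(0), None, None, None, None];
    println!("{}", render_bytes(&memory));
    if let Some(line) = underline_first_undefined(&memory) {
        println!("{}", line);
    }
}
```

Because the output is plain ASCII with a fixed three-column stride per byte, the caret line stays aligned in any monospace font, which was the main objection to the wider block glyph.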
