Rust: How to safely initialize an array of Vec with MaybeUninit doc example

Created on 24 Sep 2018 · 18Comments · Source: rust-lang/rust

In this page:
https://doc.rust-lang.org/nightly/std/mem/union.MaybeUninit.html

I suggest to add an equivalent example of "how to safely initialize an array of Vecs" using MaybeUninit instead of uninitialized, if possible:

https://doc.rust-lang.org/nightly/std/mem/fn.uninitialized.html

P-medium T-doc T-libs

Source

leonardo-m

👍2

Most helpful comment

I suggest to add an equivalent example of "how to safely initialize an array of Vecs"

The docs have been expanded to contain such an example:

use std::mem::{self, MaybeUninit};
use std::ptr;

let data = {
    // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
    // safe because the type we are claiming to have initialized here is a
    // bunch of `MaybeUninit`s, which do not require initialization.
    let mut data: [MaybeUninit<Vec<u32>>; 1000] = unsafe {
        MaybeUninit::uninit().assume_init()
    };

    // Dropping a `MaybeUninit` does nothing, so if there is a panic during this loop,
    // we have a memory leak, but there is no memory safety issue.
    for elem in &mut data[..] {
        unsafe { ptr::write(elem.as_mut_ptr(), vec![42]); }
    }

    // Everything is initialized. Transmute the array to the
    // initialized type.
    unsafe { mem::transmute::<_, [Vec<u32>; 1000]>(data) }
};

assert_eq!(&data[0], &[42]);

RalfJung on 26 Jun 2019

👍5 ❤2

All 18 comments

What would the example look like? Something like this?

let arr: [MaybeUninit<Vec<u8>>; 5] = [MaybeUninit::uninitialized(); 5];
for v in &mut arr {
    v.set(vec![]);
}
let arr: [Vec<u8>; 5] = mem::transmute(arr);

cc @RalfJung

cramertj on 25 Sep 2018

As arr array length is better to use 50 instead of 5, because Default:default goes up to about 30.

leonardo-m on 25 Sep 2018

One gotcha is that the example doesn't even work as-is because MaybeUninit isn't Copy:

https://play.rust-lang.org/?gist=1d2d5fa2a49bbbbceb8a3b113587f87a&version=nightly&mode=debug&edition=2015

If MaybeUninit is intended to be compatible with arrays then maybe that was overlooked?

FenrirWolf on 27 Sep 2018

In https://github.com/rust-lang/rust/pull/53508/commits/7c37c6d33ef890f73c16323fe87cdbe403ba2ac8#diff-322c42fd5e8ec8e8f8093261d59a5782R87, @eddyb and @RalfJung mention that MaybeUninit “distributes over products” such as arrays, so you could instead do something like:

let mut arr: MaybeUninit<[Vec<u8>; 5]> = MaybeUninit::uninitialized();
for i in 0..5 {
    ptr::write(&mut arr.get_mut()[i], vec![]);
}
let arr: [Vec<u8>; 5] = arr.into_inner();

...if it is safe to get a pointer or reference to the uninitialized inner value or its fields/elements, as long as this pointer isn't used to read from uninitialized portions of the value. It would only be safe to read from an element after it is initialized, and only safe to read from the whole (e.g. using into_inner) when all the elements are initialized.

Note that the expression arr.get_mut() above creates an &mut T that points to an uninitialized T, so this can only work if we decide that such references are “unsafe but valid” (i.e., not instant UB). Also it may be a bit risky passing this to the safe Index::index function, so a more conservative version might use only raw pointers:

    (arr.as_mut_ptr() as *mut Vec<u8>).offset(i).write(vec![]);

mbrubeck on 27 Sep 2018

👍1

So would this be a valid example of how to initialize an array of Vecs then?

#![feature(maybe_uninit)]

use std::mem::{self, MaybeUninit};

fn main() {
    let mut arr: [MaybeUninit<Vec<u8>>; 50] = {
        // The compiler complains about `Copy` bounds if we try to directly
        // create a `[MaybeUninit<Vec<u8>>; 50]`, so we use the fact that
        // MaybeUninit<[T; N]> == [MaybeUninit<T>; N]  to work around that
        let arr: MaybeUninit<[Vec<u8>; 50]> = MaybeUninit::uninitialized();
        unsafe { mem::transmute(arr) }
    };

    // Initialize the elements one at a time. The `Iterator` trait isn't
    // implemented for arrays over size 32 (integer generics when!?),
    // so let's use good ol' indexing instead.
    for idx in 0..arr.len() {
        arr[idx].set(vec![]);
    }

    // Now that everything is initialized (and they'd better be fully
    // initialized or things WILL blow up here), we can transmute to
    // a fully valid array of Vecs
    let arr: [Vec<u8>; 50] = unsafe { mem::transmute(arr) };

    // `Debug` is also unimplemented for arrays of this size, so let's
    // index-loop again to show our completed work!
    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

https://play.rust-lang.org/?gist=95140594f74290a294a7cea06d37c597&version=nightly&mode=release&edition=2015

FenrirWolf on 27 Sep 2018

Is it worth adding to the Rust std lib a helper macro like this, to simplify the creation of arrays of noncopy values?

https://crates.io/crates/array-init

leonardo-m on 27 Sep 2018

So would this be a valid example of how to initialize an array of Vecs then?

My code from above was meant to be an entire solution, without any transmuting necessary. Here's a complete example program:

use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr = MaybeUninit::uninitialized();
        for i in 0..N {
            (arr.as_mut_ptr() as *mut Vec<u8>).add(i).write(vec![]);
        }
        arr.into_inner()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

mbrubeck on 28 Sep 2018

Ah nice, that's way more concise.

FenrirWolf on 28 Sep 2018

@mbrubeck One extremely subtle point about your code is the Nice to have no transmute, though as *mut Vec<u8> is pretty much the same thing. But yeah, I agree that is correct. Might be nicer to do the arr.as_mut_ptr() as *mut Vec<u8> only once? Also I think arr should have a type ascribed, inference is not entirely trivial here.

use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr: MaybeUninit<[Vec<u8>; N]> = MaybeUninit::uninitialized();
        let arr_ptr = arr.as_mut_ptr() as *mut Vec<u8>; // pointer to 1st element
        for i in 0..N {
            arr_ptr.add(i).write(vec![]);
        }
        arr.into_inner()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

RalfJung on 30 Sep 2018

@mbrubeck One extremely subtle point about your code is the Nice to have no transmute, though as *mut Vec<u8> is pretty much the same thing. But yeah, I agree that is correct. Might be nicer to do the arr.as_mut_ptr() as *mut Vec<u8> only once? Also I think arr should have a type ascribed, inference is not entirely trivial here.
use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr: MaybeUninit<[Vec<u8>; N]> = MaybeUninit::uninitialized();
        let arr_ptr = arr.as_mut_ptr() as *mut Vec<u8>; // pointer to 1st element
        for i in 0..N {
            arr_ptr.add(i).write(vec![]);
        }
        arr.into_inner()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}
I think the ascribed type is also very helpful!

When I try to run this on the Playground I get an error however:

error[E0599]: no method named `into_inner` found for type `std::mem::MaybeUninit<[std::vec::Vec<u8>; 5]>` in the current scope

When I try to run this on my own machine with nightly-1.35.0 I get the same error. Am I missing something?

Ultimately I think we should somehow create a good safe abstraction for this. I passed this to the security workgroup to see if there is something we can come up with.

DevQps on 30 Mar 2019

👍1

The API has changed, the following should work now:

#![feature(maybe_uninit)]
use std::mem::MaybeUninit;

fn main() {
    const N: usize = 5;

    let arr: [Vec<u8>; N] = unsafe {
        let mut arr: MaybeUninit<[Vec<u8>; N]> = MaybeUninit::uninit();
        let arr_ptr = arr.as_mut_ptr() as *mut Vec<u8>; // pointer to 1st element
        for i in 0..N {
            arr_ptr.add(i).write(vec![]);
        }
        arr.assume_init()
    };

    for idx in 0..arr.len() {
        println!("Index {}: {:?}", idx, arr[idx]);
    }
}

But that's basically a hack. Better interaction of MaybeUninit with Box/arrays/slices is definitely on the roadmap, but let's first land the basics.

RalfJung on 30 Mar 2019

Isn't it simpler and safer to turn &mut MaybeUninit<[Vec<u8>; N]> into &mut [MaybeUninit<Vec<u8>>; N] and iterate that normally, instead of iterating with raw pointers?

eddyb on 30 Mar 2019

I am not saying the above is the API we should use.^^

Probably it makes most sense to start with [MaybeUninit<Vec<u8>>; N] created by uninitialized_array![Vec<u8>; N]. But then we need a good way to go from that to [Vec<u8>; N]. We cannot write that as a generic function yet, unfortunately (nor your array pointer conversion function, FWIW).

RalfJung on 30 Mar 2019

So, here's a trick...
&mut MaybeUninit<[T; N]> should coerce to &mut MaybeUninit<[T]> (do we have to solve union / custom DST unresolved questions for this?)

Then you don't need const generics to turn it into &mut [MaybeUninit<T>].

Btw, we should just make [MaybeUninit::uninit(); N] and even [vec![]; N] work in the NLL borrowck, we were waiting for Rust 2018 to ship, but it should be easy now.

We can even bypass const fn promotability for vec![] by special-casing it to use a perma-unstable Vec::EMPTY associated const! (but Vec::new() working would be nice)

cc @oli-obk (all we have to do is promote Rvalue::Repeat's element, where possible, and move the Copy check from typeck to old & new borrowck)

There is an issue open for this, I think, but I can't find it easily right now.

eddyb on 30 Mar 2019

👍4

I suggest to add an equivalent example of "how to safely initialize an array of Vecs"

The docs have been expanded to contain such an example:

use std::mem::{self, MaybeUninit};
use std::ptr;

let data = {
    // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
    // safe because the type we are claiming to have initialized here is a
    // bunch of `MaybeUninit`s, which do not require initialization.
    let mut data: [MaybeUninit<Vec<u32>>; 1000] = unsafe {
        MaybeUninit::uninit().assume_init()
    };

    // Dropping a `MaybeUninit` does nothing, so if there is a panic during this loop,
    // we have a memory leak, but there is no memory safety issue.
    for elem in &mut data[..] {
        unsafe { ptr::write(elem.as_mut_ptr(), vec![42]); }
    }

    // Everything is initialized. Transmute the array to the
    // initialized type.
    unsafe { mem::transmute::<_, [Vec<u32>; 1000]>(data) }
};

assert_eq!(&data[0], &[42]);

RalfJung on 26 Jun 2019

👍5 ❤2

@RalfJung This pattern is a way to avoid unnecessary initialization of the array. And it is crystal clear :)

In a context where unnecessary initialization is not wanted, it is likely that unnecessary moves are also unwanted. Is it somehow guaranteed by the compiler that returning unsafe { mem::transmute::(...) } from the block in this case will not copy all bytes to another location owned by the new variable data? If not, how to ensure that results will not be moved around?

Should this be more explicit in this example of the doc, or is it the wrong place to bring up a move-elision-related topic?

iago-lito on 8 Apr 2020

AFAIK Rust has no guaranteed move elision, and I am afraid I don't know how well the optimizer can elide such moves in practice. This is below the layers of Rust that I am familiar with.

RalfJung on 8 Apr 2020

👍1

@iago-lito I think the feature you are asking for was also requested at https://github.com/rust-lang/rust/issues/61011.

RalfJung on 10 Apr 2020

❤1 👍1

Was this page helpful?

0 / 5 - 0 ratings