This is a tracking issue for the unstable `placement_new_protocol` feature in the standard library, and `placement_in_syntax`/`box_syntax` in the compiler.
(@pnkfelix adds below:)
Things to decide / finalize before stabilization:
- `in PLACE { BLOCK }` vs `PLACE <- EXPR`. (See https://github.com/rust-lang/rfcs/pull/1228)
- `&mut self` vs `self` for `Placer::make_place` (https://github.com/rust-lang/rfcs/issues/1286).
- `box EXPR` part of this? (currently the desugaring doesn't work due to type inference issues).
- `Place` for `InPlace` and `BoxPlace`, or just have the `InPlace` trait independently from any `BoxPlace`.

At least this needs to be decided before this can become stable.
I've adapted this issue from being explicitly focused on just the library aspects of placement new to encompassing both that and the compiler impl (since they're fairly intertwined). I've also been thinking about this a little recently.
General links:
- `in X { Y }` to `X <- Y`
We currently have `Place`, `Placer` & `InPlace`, and `BoxPlace` & `Boxed`. The `Place` trait is trying to abstract out a commonality between `BoxPlace` and `InPlace`: the `pointer` method. The lang team discussed this and decided that this may be unnecessary abstraction, since it's not obvious how important it is to share these details at the trait level (NB. one can still share the code itself if a type impls both `BoxPlace` and `InPlace`, since one can, say, call one `pointer(&mut self) -> *mut T` method from the other).
`Placer` blanket impl

The `Placer`:(`InPlace`+`Place`) relationship is very similar to the `IntoIterator`:`Iterator` one: the former is designed to convert into the latter so that one can use `in` (respectively `for` & iterator adaptors/consumers) with things that aren't directly `InPlace`s (e.g. `in vec { elem }`/`vec <- elem`) or `Iterator`s (e.g. `for _ in &[1, 2, 3]`). `IntoIterator` has a blanket impl
```rust
impl<I> IntoIterator for I where I: Iterator {
    type Item = I::Item;
    type IntoIter = I;
    fn into_iter(self) -> I { self }
}
```
which means that all `Iterator`s can be transparently used in places that expect `IntoIterator`, without the creator of the `Iterator` having to remember or write anything more.
It'd be interesting if a similar thing could happen with `Placer` and `InPlace`, i.e. have:
```rust
impl<Data, P> Placer<Data> for P where P: InPlace<Data> {
    type Place = P;
    fn make_place(self) -> P { self }
}
```
However this hits a coherence error, https://github.com/rust-lang/rust/issues/28881.
(Combined with merging `Place` and `InPlace`, this would mean creating many `Placer`s would only require implementing a single trait, `InPlace`, rather than the 3 it does today.)
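As a rough illustration of what the merged-trait design with a blanket impl could look like, here is a sketch on stable Rust using locally defined stand-in traits. Everything in it is an assumption for illustration: the merged `InPlace` shape, the `emplace` function standing in for `placer <- value`, and the `VecBack` type are hypothetical and not the standard library's API. In isolation the blanket impl compiles; the sketch does not attempt to reproduce the coherence conflict from the linked issue.

```rust
use std::ptr;

// Hypothetical stand-in: `Place` and `InPlace` merged into one trait.
trait InPlace<Data>: Sized {
    type Owner;
    fn pointer(&mut self) -> *mut Data;
    /// Safety: the memory behind `pointer()` must have been initialised.
    unsafe fn finalize(self) -> Self::Owner;
}

// Hypothetical stand-in for the unstable `Placer` trait.
trait Placer<Data>: Sized {
    type Place: InPlace<Data>;
    fn make_place(self) -> Self::Place;
}

// The blanket impl discussed above: every `InPlace` is its own `Placer`.
impl<Data, P: InPlace<Data>> Placer<Data> for P {
    type Place = P;
    fn make_place(self) -> P { self }
}

/// Hypothetical function standing in for `placer <- value`.
/// Note that `value` is passed by value here, so nothing is elided.
fn emplace<Data, P: Placer<Data>>(placer: P, value: Data) -> <P::Place as InPlace<Data>>::Owner {
    let mut place = placer.make_place();
    unsafe {
        ptr::write(place.pointer(), value);
        place.finalize()
    }
}

/// Hypothetical place at the back of a `Vec`; it only implements `InPlace`.
struct VecBack<'a, T> { vec: &'a mut Vec<T> }

impl<'a, T> InPlace<T> for VecBack<'a, T> {
    type Owner = &'a mut T;
    fn pointer(&mut self) -> *mut T {
        self.vec.reserve(1);
        // Points one past the last initialised element, within capacity.
        unsafe { self.vec.as_mut_ptr().add(self.vec.len()) }
    }
    unsafe fn finalize(self) -> &'a mut T {
        let VecBack { vec } = self;
        let len = vec.len();
        unsafe { vec.set_len(len + 1) };
        &mut vec[len]
    }
}
```

With this shape, `emplace(VecBack { vec: &mut v }, 3)` works even though `VecBack` never mentions `Placer`, which is the single-trait ergonomics the comment above is after.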
Handling fallible placement allocations is a little idiosyncratic, but seems possible, by handling failure in `in X { Y }` (aka `X <- Y`) when creating `X` itself, not by having the whole expression return a `Result` to indicate problems.
I find this a little unintuitive for both implementers and users. For implementers, one is basically forced to do the allocation immediately when creating the `Placer`, and then pass through the pointer (i.e. `Placer::make_place` and `Place::pointer` just return an already-created value), which makes the layers of traits seem a bit strange. For users, it feels more natural to have `X <- Y` return `Result<_, _>`, but only when `X` is something that can fail, which possibly requires HKTs to encode (and may not be worth it).
The broad feeling was that this wasn't _that_ strange, and that it won't come up that often (i.e. heavily biased toward OS/embedded development), but that it's very nice that there is some possibility to write this.
This is a version of `Box` that supports placement `in` and allows allocation failures to be recovered from (people may be most interested in the `main`/`func` example at the start, and/or how the protocol is implemented after that; the details of `MyBox` itself at the end aren't so important):
```rust
#![feature(placement_in_syntax, placement_new_protocol)]

fn main() {
    let x: Result<_, _> = MyBox::place().map(|p| in p { 1 }); // ....map(|p| p <- 1)
    // with `try { ... ? ... }` this could also be:
    // let x = try { in MyBox::place()? { 1 } }; // ... try { MyBox::place()? <- 1 };
    println!("{}", *x.unwrap());
    println!("{}", *func(2).unwrap());
}

fn func(val: i32) -> Result<MyBox<i32>, BadAlloc> {
    let mut x = in try!(MyBox::place()) { val }; // ... try!(MyBox::place()) <- val
    *x += 10;
    Ok(x)
}

// implementation of the `in` protocol:

pub struct MyBoxPlace<T> {
    ptr: *mut T
}

#[derive(Debug)]
pub struct BadAlloc;

impl<T> MyBox<T> {
    // The allocation (and therefore the possible failure) happens here,
    // before any placement expression is evaluated.
    pub fn place() -> Result<MyBoxPlace<T>, BadAlloc> {
        let p = unsafe { malloc(mem::size_of::<T>()) };
        if p.is_null() {
            Err(BadAlloc)
        } else {
            Ok(MyBoxPlace { ptr: p as *mut T })
        }
    }
}

impl<T> ops::Placer<T> for MyBoxPlace<T> {
    type Place = Self;
    fn make_place(self) -> Self { self }
}

impl<T> ops::Place<T> for MyBoxPlace<T> {
    fn pointer(&mut self) -> *mut T { self.ptr }
}

impl<T> ops::InPlace<T> for MyBoxPlace<T> {
    type Owner = MyBox<T>;
    unsafe fn finalize(self) -> MyBox<T> {
        let p = self.ptr as *const T;
        // ownership of the allocation moves to `MyBox`, so skip our `Drop`
        mem::forget(self);
        MyBox { ptr: p }
    }
}

// runs only if the place is abandoned before `finalize` (e.g. on panic)
impl<T> Drop for MyBoxPlace<T> {
    fn drop(&mut self) {
        unsafe {
            free(self.ptr as *mut u8);
        }
    }
}

// implementation of the pointer itself

use std::{mem, ops, ptr};

extern {
    fn malloc(x: usize) -> *mut u8;
    fn free(p: *mut u8);
}

/// Custom `Box`
pub struct MyBox<T> {
    ptr: *const T
}

// make `MyBox` behave like a pointer
impl<T> ops::Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.ptr }
    }
}

impl<T> ops::DerefMut for MyBox<T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *(self.ptr as *mut T) }
    }
}
// etc.

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        unsafe {
            // run the destructor of the contents, then release the memory
            drop(ptr::read(self.ptr));
            free(self.ptr as *mut u8);
        }
    }
}
```
How would this handle this use case? This is trying to create a way to store any instance of a trait `T` in a way that they can be copied around, without knowing anything about the concrete type except that it implements the `T` trait and that the concrete type is small enough to fit in the buffer.
```rust
use std::marker::PhantomData;

const MAX_SIZE: usize = 128;

struct StaticBox<U: ?Sized> {
    buffer: [u8; MAX_SIZE],
    // ties the otherwise unused type parameter to the struct
    _marker: PhantomData<U>,
}

impl<U: ?Sized> StaticBox<U> {
    fn as_ref(&self) -> &U {
        // We know an instance of U has been allocated in self.buffer, but
        // how do we get it out?
        unimplemented!();
    }
}

trait T {
    fn new() -> Self where Self: Sized;
    fn do_something(&self);
}

struct A { a: u8 }
impl T for A {
    fn new() -> A { A { a: 1 } }
    fn do_something(&self) { unimplemented!() }
}

struct B { b: [u64; 8] }
impl T for B {
    fn new() -> B { B { b: [0; 8] } }
    fn do_something(&self) { unimplemented!() }
}

fn create_and_return_a_T() -> StaticBox<T> {
    let x: StaticBox<T> = unimplemented!(); // initialize an instance of `A` inside |x|.
    x
}

fn main() {
    let a = create_and_return_a_T();
    let b: &T = a.as_ref();
}
```
@briansmith
I imagine you would want an array of `u64` - for alignment - instead of an array of `u8`. Anyway, I don't think there is a safe way of doing this (do we have a placer impl for raw pointers?)
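For what it's worth, the "how do we get it out?" and alignment parts of the question can be sketched on stable Rust without the placement protocol. The following is only a sketch under stated assumptions: the trait is renamed `Thing` and given a return value so it can be observed, the buffer is 16 `u64` words (128 bytes), and the `cast`/`drop_as` helpers and the stored function pointers are hypothetical names, not anything from the RFC. It relies on `unsafe` internally, and the value still passes through the stack on its way in, so it does not address placement itself.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of, MaybeUninit};
use std::ptr;

trait Thing {
    fn do_something(&self) -> u64;
}

const WORDS: usize = 16; // 16 * 8 bytes = 128 bytes, 8-byte aligned

struct StaticBox {
    buffer: MaybeUninit<[u64; WORDS]>,
    // Rebuilds the fat pointer (data pointer + vtable) for the stored type.
    as_dyn: fn(*const u8) -> *const dyn Thing,
    // Runs the stored type's destructor.
    drop_stored: unsafe fn(*mut u8),
    // Opt out of Send/Sync, since the stored type may not be either.
    _not_send: PhantomData<*mut ()>,
}

fn cast<S: Thing + 'static>(p: *const u8) -> *const dyn Thing {
    p as *const S as *const dyn Thing
}

unsafe fn drop_as<S>(p: *mut u8) {
    unsafe { ptr::drop_in_place(p as *mut S) }
}

impl StaticBox {
    fn new<S: Thing + 'static>(value: S) -> StaticBox {
        assert!(size_of::<S>() <= size_of::<[u64; WORDS]>());
        assert!(align_of::<S>() <= align_of::<u64>());
        let mut buffer = MaybeUninit::<[u64; WORDS]>::uninit();
        // The asserts above guarantee the write fits and is aligned.
        unsafe { ptr::write(buffer.as_mut_ptr() as *mut S, value) };
        StaticBox {
            buffer,
            as_dyn: cast::<S>,
            drop_stored: drop_as::<S>,
            _not_send: PhantomData,
        }
    }

    fn as_ref(&self) -> &dyn Thing {
        unsafe { &*(self.as_dyn)(self.buffer.as_ptr() as *const u8) }
    }
}

impl Drop for StaticBox {
    fn drop(&mut self) {
        unsafe { (self.drop_stored)(self.buffer.as_mut_ptr() as *mut u8) }
    }
}

struct A { a: u8 }
impl Thing for A {
    fn do_something(&self) -> u64 { self.a as u64 }
}

struct B { b: [u64; 8] }
impl Thing for B {
    fn do_something(&self) -> u64 { self.b.iter().sum() }
}
```

Moving a `StaticBox` moves the bytes of the stored value along with it, which is fine for ordinary Rust values since moves are plain byte copies.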
FYI draft of FAQ here https://internals.rust-lang.org/t/placement-nwbi-faq-new-box-in-left-arrow/2789
Q(9). Which stdlib datatypes currently support placement-in?
A. None, currently. :smile:
We are still finalizing the protocol API and have not added Placer support to any of the standard library types.
I still think it's important to have placement insertion for basic collections (`Vec` and `HashMap`) implemented, tested and benchmarked (!) before finalizing the protocol. Fewer chances to do something wrong this way.
In particular we need to make sure that performance doesn't regress on small types (~pointer sized) compared to non-placement insertion (by marking the new code for the exceptional case as cold or something like that?).
@petrochenkov I don't think I have ever suggested stabilizing the protocol before it has been implemented for all the collection types that we can think of. :) (Feel free to point out where I may have misled...)
@petrochenkov but I can see how the answer written there can be misinterpreted.
I'll try to change that specific text; for the most part, I hope that discussion about the FAQ itself can be restricted to that internals thread.
(I started writing the following as a comment on the FAQ thread on internals, before I saw it was partially addressed here, but I'll post it in full anyway...)
So why does the placer protocol need two types/traits? As I understand it, a method like "emplace_back" would normally return basically a wrapper object with a reference to the container in question; Rust would then call `make_place()`, whose implementation would actually reserve space in the container, returning a `Place` which could then be `finalize()`d (or else dropped normally if the expression on the right of the `<-` panicked). But why not cut out the middle operation and have `emplace_back()` itself do the allocation and return a `Place`, which `<-` would accept on the left instead of a `Placer`?
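The order of operations described in this paragraph can be approximated on stable Rust. The sketch below is an assumption-laden emulation, not the compiler's actual desugaring: the three traits are local stand-ins for the unstable ones, `place_with` is a hypothetical function playing the role of `PLACER <- EXPR` (the right-hand side is passed as a closure so it runs after allocation), and `ABANDONED` is a counter added purely to make the "dropped normally on panic" path observable.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts places dropped without being finalized (illustration only).
static ABANDONED: AtomicUsize = AtomicUsize::new(0);

// Local stand-ins for the unstable traits.
trait Place<Data> {
    fn pointer(&mut self) -> *mut Data;
}
trait InPlace<Data>: Place<Data> + Sized {
    type Owner;
    unsafe fn finalize(self) -> Self::Owner;
}
trait Placer<Data>: Sized {
    type Place: InPlace<Data>;
    fn make_place(self) -> Self::Place;
}

struct Heap;
struct HeapPlace<T> { ptr: *mut T }

impl<T> Placer<T> for Heap {
    type Place = HeapPlace<T>;
    fn make_place(self) -> HeapPlace<T> {
        let layout = Layout::new::<T>();
        assert!(layout.size() != 0, "zero-sized types are out of scope here");
        let ptr = unsafe { alloc(layout) } as *mut T;
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        HeapPlace { ptr }
    }
}
impl<T> Place<T> for HeapPlace<T> {
    fn pointer(&mut self) -> *mut T { self.ptr }
}
impl<T> InPlace<T> for HeapPlace<T> {
    type Owner = Box<T>;
    unsafe fn finalize(self) -> Box<T> {
        let ptr = self.ptr;
        std::mem::forget(self); // the Box now owns the allocation
        unsafe { Box::from_raw(ptr) }
    }
}
impl<T> Drop for HeapPlace<T> {
    fn drop(&mut self) {
        ABANDONED.fetch_add(1, Ordering::SeqCst);
        unsafe { dealloc(self.ptr as *mut u8, Layout::new::<T>()) }
    }
}

/// Hypothetical stand-in for `placer <- expr()`.
fn place_with<Data, P, F>(placer: P, expr: F) -> <P::Place as InPlace<Data>>::Owner
where
    P: Placer<Data>,
    F: FnOnce() -> Data,
{
    let mut place = placer.make_place(); // 1. reserve the storage
    let raw = place.pointer();           // 2. find out where to write
    let value = expr();                  // 3. evaluate; a panic here drops `place`
    unsafe {
        ptr::write(raw, value);          // 4. move the value into the storage
        place.finalize()                 // 5. hand over ownership
    }
}
```

Skipping the `Placer` layer, as the comment proposes, would amount to calling `place_with` with something that already is a `Place`, so step 1 would happen in user code instead.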
One drawback would be that global allocators would have to look like `heap() <- foo` rather than `HEAP <- foo`. But the former arguably looks better anyway due to not being in all caps, and more importantly, this removes an important asymmetry when it comes to fallible allocators:
Fallible allocators (i.e. allocators that can fail without panicking) cannot implement `Placer` the expected way, where the `make_place` implementation is what does the actual grunt work of allocation, because of course there is no way to tell the compiler to stop before writing data in. This could be worked around in the `Placer` protocol itself, e.g. by having `make_place` return a `Result<Self::Place, Self::Place::Owner>` (where if it returned `Err(owner)`, the `<-` expression would evaluate to `owner` without evaluating the right hand side), but that would be both complicated and really weird, since the syntax would have no indication that the right hand side could sometimes just not be evaluated.
In lieu of that, such allocators would have to allocate before returning a `Placer`, so the `Placer` would just be a wrapper around a `Place`, e.g. `fn fallible_alloc<T>() -> Result<Placer<T>, ()>`, and then user code would typically look like `try!(fallible_alloc()) <- expr`, which is not bad at all and makes the control flow divergence more explicit.
Which is fine, but means that the ability to use constants as `Placer`s, which, as far as I can tell, is the only advantage of having a separate `Placer` trait, does not work for fallible allocators, creating the aforementioned asymmetry. Since it's not a very big advantage and often results in unnecessary wrapper object juggling anyway, it seems to me more sensible to get rid of it.
...this is the point where I stopped writing, and now I see that the question of allowing Placers to fail has been touched on above, and one additional advantage of having a separate Placer is mentioned - that you can have short forms like `vec <- elem` rather than needing an explicit emplace method (however named). This is something, but I'm not convinced it's worth it personally, so I maintain the conclusion of the last paragraph.
(Sidenote - even if Rust's standard library doesn't care to deal with allocation failure, when it comes to the language itself, Rust's use of explicit option types rather than null pointers should make safety in the presence of allocation failure considerably _easier_ than in C. Just saying.)
@huonw, are the following traits not sufficient to reduce trait proliferation?
```rust
trait Placer<Data: ?Sized> {
    type Place: Place<Data>;
    fn make_place(&mut self) -> Self::Place;
}

unsafe trait Place<Data: ?Sized> {
    type Owner;
    fn pointer(&mut self) -> *mut Data;
    unsafe fn finalize(self) -> Self::Owner;
}

trait Boxer<Data: ?Sized>: Sized {
    type Place: Place<Data, Owner=Self>;
    fn make_place() -> Self::Place;
}

impl<T> Boxer<T> for Box<T> { /* ... */ }
impl<T> Place<T> for IntermediateBox<T> { /* ... */ }
```
Wanted for WebRender.
Should the stabilization of `box_syntax` or the design of placement new affect `box_patterns` (https://github.com/rust-lang/rust/issues/29641)? Wondering if that gate should be tracked in this issue as well.
I'm a little unclear on the expected semantics of expressions for placement and when you can guarantee not touching the stack. The closest I've found to an explanation among the assorted RFCs/discussion is the draft in https://github.com/rust-lang/rfcs/pull/470, section 2 (https://github.com/pnkfelix/rfcs/blob/fsk-placement-box-rfc/text/0000-placement-box.md#section-2-semantics), which says "Evaluate the `<value-expr>` and write the result directly into the backing storage." and the FAQs similarly say "evaluates VALUE into the previously allocated memory (that is, do not put it onto a temporary stack slot)".
The above seems a bit vague and I'm having difficulty predicting what's going to work without blowing up the stack. I've put together a quiz of 4 simple examples, and you have to try and guess which will work in debug and which in release (ideally they'd behave identically, I think? See https://github.com/rust-lang/rfcs/pull/1228#issuecomment-145356539). I'm on Linux, rustc 1.17.0-nightly (be760566c 2017-02-28), ulimit stack limit 8MB:
(edit: most of these now work as of 2018-01-16, but you can make trivial tweaks to make them overflow the stack again)
```rust
// printlns are to prevent optimisation
#![feature(placement_in_syntax)]
#![feature(collection_placement)]
use std::collections::LinkedList;

fn main() {
    // [T; 10*1024*1024]
    let mut ll = LinkedList::new(); // EXAMPLE 1: T = u8
    ll.back_place() <- [0u8; 10*1024*1024];
    println!("{}", ll.front().unwrap()[0]+1);

    let mut ll = LinkedList::new(); // EXAMPLE 2: T = usize
    ll.back_place() <- [0usize; 10*1024*1024];
    println!("{}", ll.front().unwrap()[0]+1);

    // [[[[[T; 10]; 32]; 32]; 32]; 32] // 10*32^4 == 10*1024*1024
    let mut ll = LinkedList::new(); // EXAMPLE 3: T = u8
    ll.back_place() <- [[[[[0u8; 10]; 32]; 32]; 32]; 32];
    println!("{}", ll.front().unwrap()[0][0][0][0][0]+1);

    let mut ll = LinkedList::new(); // EXAMPLE 4: T = usize
    ll.back_place() <- [[[[[0usize; 10]; 32]; 32]; 32]; 32];
    println!("{}", ll.front().unwrap()[0][0][0][0][0]+1);
}
```
Answers: none of them work in debug mode; only 2 (`[usize; 10*1024*1024]`) works in release mode (yes, this contrasts with 1 (`[u8; 10*1024*1024]`), which is a smaller type yet fails).
I'd also like to understand more about how placement-in behaves with more complex value expressions, but given I can't even predict the behaviour given the most trivial constructor for a type, that may be looking ahead a little.
Is there something I'm missing/doing wrong? Or perhaps this is a current known limitation of the (desugaring) implementation that will be fixed? If so, is there a todo I've missed (I can't see a relevant checkbox at the top of this issue)?
(edit: removed `{}` to address comment below, makes no difference)
Putting the expression within a block (`{}`) moves its result, so Rust has to have somewhere to move from, and that ends up being the stack.
Did you try it? Removing the `{}` seems to make no difference whatsoever.
Ah, I guess this explains why it didn't work above (though I'm not sure how it was ever going to work in debug mode if it relies on LLVM optimizing?):
> regarding return-value-optimization: the intention is that the desugaring described in the RFC hopefully presents the code in a manner where LLVM can optimize it accordingly. If we observe that LLVM fails to optimize the code as desired, then we can instead switch from a macro-based implementation to more integrated support within the whole compiler pipeline; but that will hopefully be unnecessary.
My inability to use the placement new feature aside, I've been working on something that would benefit greatly from placement, but the current design seems suboptimal. Consider serde deserialization with a large `T`, no part of which should touch the stack:
linkedlist.back_place() <- bincode::deserialize_from(&mut reader, SizeLimit::Infinite);
Even if RVO was guaranteed for this function, serde recurses to gather the fields so you'd get temporaries on the stack anyway (unless the RVO depth was guaranteed to be effectively infinite!).
When thinking about this I stumbled across this comment which suggests that a lack of familiarity with C++ placement (I have no familiarity) could inhibit understanding of the Rust design. I went and did some reading, but found the C++ implementation to map poorly onto typical Rust code.
Specifically, the approximate Rust equivalent to a C++ constructor has a `&mut self` receiver and you can delete+new fields in-place at your leisure. But Rust construction generally happens via return value rather than mutation, so this placement-new design creates a place (like C++), gets the pointer (like C++)...but then has trouble passing the pointer through for in-place initialisation of each field, so defers to compulsory RVO and says "the result ends up here". Wait, what happens to the fields that get constructed into temporaries as part of creating the result!? They too can be big!
I think there's an opportunity here for Rust placement-new to be faster than C++, simply because C++ has to run the default constructor for object members before entering the constructor body (IIUC), but right now I'm looking at adapting serde to take `*mut T` and use an `offsetof` macro, i.e. unsafely emulating uninit/out pointers. If I imagine a world where uninit/out pointers are available, I think this RFC would look very different which gives me pause.
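To make the "take `*mut T` and write each field through an offset" idea concrete, here is a small sketch on stable Rust. It is not serde's API: the `ReadInPlace` trait, the `read_boxed` helper, the little-endian wire format and the `Header`/`Record` types are all invented for illustration, and `std::ptr::addr_of_mut!` plays the role of the `offsetof` macro. Each level recurses with a pointer to its own field, so no field value is built as a temporary and then moved.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::io::{self, Read};
use std::ptr::addr_of_mut;

const PAYLOAD_LEN: usize = 1 << 20;

/// Hypothetical "out pointer" deserialisation trait.
trait ReadInPlace: Sized {
    /// Safety: `out` must be valid for writes and aligned for `Self`.
    unsafe fn read_in_place(out: *mut Self, reader: &mut dyn Read) -> io::Result<()>;
}

impl ReadInPlace for u32 {
    unsafe fn read_in_place(out: *mut Self, reader: &mut dyn Read) -> io::Result<()> {
        let mut buf = [0u8; 4];
        reader.read_exact(&mut buf)?;
        unsafe { out.write(u32::from_le_bytes(buf)) };
        Ok(())
    }
}

impl ReadInPlace for [u8; PAYLOAD_LEN] {
    unsafe fn read_in_place(out: *mut Self, reader: &mut dyn Read) -> io::Result<()> {
        unsafe {
            let bytes = out as *mut u8;
            // Zero first so that forming a `&mut [u8]` over the memory is sound.
            bytes.write_bytes(0, PAYLOAD_LEN);
            reader.read_exact(std::slice::from_raw_parts_mut(bytes, PAYLOAD_LEN))
        }
    }
}

struct Header { id: u32, tag: u32 }
struct Record { header: Header, payload: [u8; PAYLOAD_LEN] }

impl ReadInPlace for Header {
    unsafe fn read_in_place(out: *mut Self, reader: &mut dyn Read) -> io::Result<()> {
        unsafe {
            u32::read_in_place(addr_of_mut!((*out).id), reader)?;
            u32::read_in_place(addr_of_mut!((*out).tag), reader)
        }
    }
}

impl ReadInPlace for Record {
    unsafe fn read_in_place(out: *mut Self, reader: &mut dyn Read) -> io::Result<()> {
        unsafe {
            Header::read_in_place(addr_of_mut!((*out).header), reader)?;
            <[u8; PAYLOAD_LEN]>::read_in_place(addr_of_mut!((*out).payload), reader)
        }
    }
}

/// Allocates on the heap first, then fills the allocation field by field.
fn read_boxed<T: ReadInPlace>(reader: &mut dyn Read) -> io::Result<Box<T>> {
    let layout = Layout::new::<T>();
    assert!(layout.size() != 0, "zero-sized types are out of scope here");
    unsafe {
        let raw = alloc(layout) as *mut T;
        if raw.is_null() {
            handle_alloc_error(layout);
        }
        match T::read_in_place(raw, reader) {
            Ok(()) => Ok(Box::from_raw(raw)),
            Err(e) => {
                // Only plain-data fields here, so freeing the memory is enough.
                dealloc(raw as *mut u8, layout);
                Err(e)
            }
        }
    }
}
```

The cost is visible: every impl is `unsafe`, and error handling for types with destructors would need to track which fields were already written, which is part of why an ergonomic language-level answer is being asked for.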
Thoughts/corrections gratefully received, particularly if they explain how one could guarantee in-place deserialize under the proposed model.
A fairly extensive number of data structures now implement the placement-in protocol (tracking here: https://github.com/rust-lang/rust/issues/30172)
I'm concerned that the following use of the placer traits lands too close to arbitrary memory writes in safe code (playground link); in particular that the Placer trait returns a raw pointer and the desugaring trusts that pointer.
I'm sorry if this has come up before, I can't find it exactly, at least not with the full context.
(This comment was first posted in another issue, but this is the right place I think.)
IIRC a raw pointer was chosen specifically because the data behind the pointer could be uninitialised. If we had `#[repr(transparent)]` there's an alternative option we could probably use: `&mut MaybeUninit`.
Do you have alternative ideas? Maybe more unsafety annotations?
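A sketch of what that alternative might look like, using today's `std::mem::MaybeUninit`. The `SlotPlace` trait, the `BoxSlot` type and the `place` function are hypothetical names for illustration, not a proposed API. The point is that a safe implementation of `slot` can only hand out a reference to memory it actually has access to, so the write itself needs no trust in a raw pointer; `finalize` remains `unsafe` because it asserts the slot was initialised.

```rust
use std::mem::MaybeUninit;

/// Hypothetical variant of `Place`/`InPlace` that exposes the storage as a
/// reference to possibly-uninitialised memory instead of a raw pointer.
trait SlotPlace<T>: Sized {
    type Owner;
    fn slot(&mut self) -> &mut MaybeUninit<T>;
    /// Safety: the slot must have been initialised.
    unsafe fn finalize(self) -> Self::Owner;
}

/// Hypothetical heap-backed place.
struct BoxSlot<T>(Box<MaybeUninit<T>>);

impl<T> BoxSlot<T> {
    fn new() -> Self {
        BoxSlot(Box::new(MaybeUninit::uninit()))
    }
}

impl<T> SlotPlace<T> for BoxSlot<T> {
    type Owner = Box<T>;
    fn slot(&mut self) -> &mut MaybeUninit<T> {
        &mut self.0
    }
    unsafe fn finalize(self) -> Box<T> {
        // `MaybeUninit<T>` has the same layout as `T`, so the cast is valid
        // once the contents are initialised.
        unsafe { Box::from_raw(Box::into_raw(self.0) as *mut T) }
    }
}

/// Hypothetical stand-in for `place <- value`.
fn place<T, P: SlotPlace<T>>(mut p: P, value: T) -> P::Owner {
    p.slot().write(value); // safe: writing never reads the old contents
    unsafe { p.finalize() } // sound because the line above initialised it
}
```

Note that `value` is still passed by value in this emulation, so the sketch speaks only to the safety question raised above, not to avoiding stack temporaries.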
Would this perhaps be useful here?
```rust
use std::ptr;

pub struct MutOnlyRef<'a, T: 'a> { data: &'a mut T }

impl<'a, T> MutOnlyRef<'a, T> {
    pub fn new(data: &'a mut T) -> Self { Self { data } }
    pub fn set(self, src: T) -> &'a mut T {
        // overwrite without reading or dropping the previous contents
        unsafe { ptr::write(self.data, src); }
        self.data
    }
}
```
Although the `Place<Data>` trait allows `Data: ?Sized`, the current design of the trait prohibits actual use with DSTs. The problem is the `pointer` method:
```rust
pub trait Place<Data>
where
    Data: ?Sized,
{
    fn pointer(&mut self) -> *mut Data;
}
```
If `Data` is a DST, there is no way to produce this `*mut Data` without the metadata from the pointer being copied from. This is because raw fat pointers must always have valid metadata. Additionally, the metadata is needed to ensure the returned pointer has the right size and alignment.
> I'm concerned that the following use of the placer traits lands too close to arbitrary memory writes in safe code (playground link); in particular that the Placer trait returns a raw pointer and the desugaring trusts that pointer.
@bluss, I would have thought we'd make `Place` an `unsafe` trait so implementors know that other code is depending on them giving a pointer to valid (albeit possibly uninitialised) memory and it is their responsibility to uphold that constraint. This seems to be an unspoken assumption with the placement-in protocol.
@Michael-F-Bryan I agree, that should be the solution if no other is suggested. It has been explicitly decided _against_ with a good goal (Make placement new easy to use widely and implement in safe code by composition) but since that fails to meet the requirement of "safe Rust is memory safe", we must go back to that if no other solution is found.
> If Data is a DST, there is no way to produce this *mut Data without the metadata from the pointer being copied from. This is because raw fat pointers must always have valid metadata. Additionally, the metadata is needed to ensure the returned pointer has the right size and alignment.
A potential fix would be to add `size` and `align` parameters to `make_place`, and change the return type of `Place::pointer` to `*mut ()`. The compiler can then create a fat pointer using this as the data pointer.
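For the slice case, the "allocate from size and alignment, get a thin pointer back, then attach the metadata" sequence can be written by hand on stable Rust. This is only a sketch of the idea under stated assumptions: `box_slice_from` is a hypothetical helper, it copies from an existing slice rather than evaluating an expression in place, and it is restricted to `u32` elements to keep it short.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::ptr;

/// Builds a `Box<[u32]>` by allocating from a size/alignment pair and only
/// afterwards turning the thin data pointer into a fat one.
fn box_slice_from(src: &[u32]) -> Box<[u32]> {
    // Size and alignment are known from the value before any place exists.
    let layout = Layout::for_value(src);
    if layout.size() == 0 {
        // Allocating zero bytes is not allowed; an empty boxed slice needs no memory.
        return Vec::new().into_boxed_slice();
    }
    unsafe {
        // The thin data pointer, i.e. what `Place::pointer` would return as `*mut ()`.
        let data = alloc(layout) as *mut ();
        if data.is_null() {
            handle_alloc_error(layout);
        }
        ptr::copy_nonoverlapping(src.as_ptr(), data as *mut u32, src.len());
        // Attach the metadata (here: the length) to form the fat pointer.
        let fat: *mut [u32] = ptr::slice_from_raw_parts_mut(data as *mut u32, src.len());
        Box::from_raw(fat)
    }
}
```

For trait objects the metadata would be a vtable pointer instead of a length, and there is no equally simple stable way to attach it, which is where compiler support would come in.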
PR to change the Placer trait: #47299
I think it's worth starting a discussion about the future of this feature. This tracking issue has been open for about a year and a half and I'm not sure if there's a route to stabilisation when `linkedlist.back_place() <- [0u8; 10*1024*1024]` works in debug mode but `linkedlist.back_place() <- [[0u8; 10*1024*1024]]` does not.
Clearly everyone watching this issue would like a 'working' placement new but maybe we need a more concrete set of guarantees it's meant to fulfill/use cases to provide for before we can evaluate whether it's succeeded and is ready for stabilisation. For example, one criterion I would propose is that it is possible to do fully in-place deserialisation without needing to manufacture pointers to struct fields from offsets.
The limitations of `PLACE <- EXPR` as I see them right now: a fair amount of work has gone into getting the construction of places correct but not so much has gone into making it ergonomic for `EXPR` to make use of the allocated space (I've previously commented that the RFC is a bit vague about this). One interim solution is just stabilising the placer stuff and requiring users to use pointers to actually do anything in place until there's an ergonomic solution, but pointers are horrible to work with.
I personally wouldn't object to taking this back to RFC to figure out how it will work for (say) serde and come up with a new solution. What does everyone else think about the state of placement today?
I'd like to know how many people subscribed to this issue only want to solve the "this code causes a spurious stack overflow in debug mode because it tries to call `Box::new` with a giant array on the stack" problem. Is it possible to stabilise the minimum necessary to get `let x = box 5;` working and heal this papercut, and let the people who are interested in more advanced use-cases discuss the rest? Or am I being naive due to lack of understanding?
@Ketsuban unless we have a complete solution, we end up drawing the line somewhere arbitrary. We can make `box 0u8` work, but what about `box [0u8; 10]`, `box [[0u8; 10]]`, `box SomeStruct { field: [0u8; 10] }`, `box SomeStruct { field: makefield() }`, `box SomeStruct::new()` - which one should we stop at?
That said, if lots of people really do just want to box a huge array I wouldn't discard the idea of a partial stabilisation out of hand (personally I just turn vecs into boxed slices).
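For reference, the "turn vecs into boxed slices" workaround looks roughly like this on stable Rust. The `big_zeroed_array` helper and the constant `N` are made up for the example; the conversion from `Box<[u8]>` to `Box<[u8; N]>` uses the standard library's `TryFrom` impl, which reuses the existing heap allocation.

```rust
use std::convert::TryInto;

const N: usize = 10 * 1024 * 1024;

/// Produces a 10 MiB boxed array without ever building the array on the stack.
fn big_zeroed_array() -> Box<[u8; N]> {
    // `vec!` allocates and zeroes directly on the heap.
    let slice: Box<[u8]> = vec![0u8; N].into_boxed_slice();
    // Converting a boxed slice to a boxed array checks the length and keeps
    // the same allocation.
    let array: Result<Box<[u8; N]>, Box<[u8]>> = slice.try_into();
    match array {
        Ok(a) => a,
        Err(_) => unreachable!("the slice was created with exactly N elements"),
    }
}
```

This covers the "huge array" case only; it does nothing for structs whose fields are large, which is the arbitrary line-drawing problem described above.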
FWIW I grepped crates.io for crates actually using this feature to see how many people would be broken if the feature was removed from nightly and went back to RFC:
# At least one version of these crates enabled the feature (and possibly used it)
allocators
core_collections
hashmap_core
jenga
light_arena
mbox
peg
printpdf
rayon-hash
scoped_allocator
seckey
skew-heap
wbuf
# Just a fork of libcore
avr-libcore
# Look like syntax related crates
clippy_lints
cpp_syn
easy-plugin
futures_await_syn
rfmt
rsyntax
rustfmt
syn
syntex_syntax
unrest_tmp_syn
I'm :-1: to stabilizing `Placer` and `BoxPlace` as-is since the current interface cannot support DST at all, as mentioned above in https://github.com/rust-lang/rust/issues/27779#issuecomment-354464938. The only way to get a DST via `box x` currently is via unsizing or converting from other types like `Vec`. This closes the potential extension of allowing `box [x; dyn n]`.
@kennytm `box [x; dyn n]` definitely looks doable, but I'm curious if you think the same is true for trait objects. For example, how would the following work?
```rust
let rc = Rc::new(5) as Rc<Clone>;
let bx: Box<Clone> = box (*rc).clone();
```
Here, `BoxPlace::make_place()` needs to know the size and alignment of the `Clone` trait object, which it can get from the vtable of the trait object returned from the call to `clone`, but `clone` can't return until it has an `IntermediateBox` to write to, so there's a chicken-or-egg problem.
In this case, the vtable is the same as the `Rc`'s vtable, so what you would do is create a `BoxPlace` using that vtable, and then use it as the return location for clone. I've had some ideas on how to make this more general, but nothing has stuck yet.
@mikeyhew I'm just confused. At least at present, the trait `std::clone::Clone` cannot be made into an object. You didn't propose to change "object safety" rules, did you? Maybe another example can explain your arguments better.
@F001 you're right, and that's sort of what I'm getting at... I'm hoping we can solve the problem of returning DSTs from functions, and thus make `Clone` object-safe.
One idea is to allow functions to return DSTs, as long as the return value is immediately "placed" somewhere, i.e. in a `box` or `<-` expression, like the example above with `Clone`.
The thing is, for the `box (*rc).clone()` expression to work, the executing program has to do things in the following order:
1. call `(*rc).clone()`
2. allocate the `<Rc as Boxed>::Place` for the return value, using its now-known size and alignment
3. write the return value of `(*rc).clone()` in that place

One way of doing this is to have `Clone::clone` allocate the return value onto the stack, and then, before popping the stack frame for `Clone::clone`, allocate the `<Rc as Boxed>::Place`, and copy the return value into it[1].
It would be nice, though, if we could avoid the extra copy and stack allocation by allocating the `<Rc as Boxed>::Place` instead of doing the stack allocation. This would take some serious compiler magic though, as how would the implementation of `Clone::clone` be able to support arbitrary placement types (e.g. `Box`, `Rc`, `SomeDSTArenaType`)?

[1] [...] `memmove`d from the top of the callee's stack frame to just above the top of the caller's stack frame, overwriting the callee's stack frame. Example syntax: `let c: Clone = alloca (*rc).clone();`
Another thing that we should really get to work is creating unsized structs with placement new:
```rust
struct Foo {
    x: i32,
    slice: [u8],
}

fn new(x: i32, slice: Box<[u8]>) -> Box<Foo> {
    box Foo { x, slice: *slice }
}
```
Which is probably a more sought after feature, and definitely simpler to solve, than DST returning functions.
I have created a PR to delete all the placement unstable features at https://github.com/rust-lang/rust/pull/48333, effectively taking placement back to pre-rfc - please contribute there if you have thoughts on the next steps for placement.
FYI: I moved to FCP @aidanhs's proposal to remove this feature.
@nikomatsakis any summary on what was wrong with original idea?
@Kixunil please see https://github.com/rust-lang/rust/pull/48333 -- and also note that one of the conditions for removal is writing up the problems and other issues. =)
A curious case has popped up where code like this causes a segfault:
```rust
#![feature(box_syntax)]

use std::mem;

enum Void {}

struct RcBox<T> {
    _a: usize,
    _b: T,
}

pub unsafe fn bar() {
    mem::forget(box RcBox {
        _a: 1,
        _b: mem::uninitialized::<Void>(),
    });
}
```
but avoiding the use of `box` and instead using `Box::new` fixes the issue. I'm not sure if this code is even supposed to work, though, and it may mean that whatever "fix" is in place for struct literals just hasn't made its way to the `box` keyword yet.
(Note that I believe the plan is to remove placement new, whenever @aidanhs gets a chance to rebase https://github.com/rust-lang/rust/pull/48333)
Though that's a good one to remember for the future. Huh.
I suppose that the call to `uninitialized` is basically considered UB.
Would it be possible to just avoid all the issues brought up with placement syntax and provide a `ptr::write`-like intrinsic that allowed directly writing a static Struct to a pointer as a stop-gap solution? As is, Rust does not really support large structs, because they're always allocated on the stack before writing, resulting in stack overflows. This would fill the basic need, while leaving all the other questions until later.
Placement new is imminently about to be/has been removed as an unstable feature and the RFCs unaccepted. The approved/merged PR is at #48333 and the tracking issues were at #22181 and #27779 (this issue). Note that this does not affect box syntax - there is a new tracking issue for that at #49733.
Find the internals thread where you can discuss this more at https://internals.rust-lang.org/t/removal-of-all-unstable-placement-features/7223. Please add any thoughts there. This is the summary comment.
So why remove placement?
The implementation does not fulfil the design goals
As described in rust-lang/rfcs#470 (referred to by the accepted rust-lang/rfcs#809), the implementation of placement new should:
Add user-defined placement in expression (more succinctly, "an in expression"), an operator analogous to "placement new" in C++. This provides a way for a user to specify (1.) how the backing storage for some datum should be allocated, (2.) that the allocation should be ordered before the evaluation of the datum, and (3.) that the datum should preferably be stored directly into the backing storage (rather than allocating temporary storage on the stack and then copying the datum from the stack into the backing storage).
The summarised goals (from the same RFC text) are to be able to:
Now consider the description of C++ behaviour in https://isocpp.org/wiki/faq/dtors#placement-new and note that during construction, the this pointer will point to the allocated location, so that all fields are assigned directly to the allocated location. It follows that we must provide similar guarantees to achieve goal 3 (be competitive with C++), so the "preferably" in the implementation description is not strong enough - it is actually necessary.
Unfortunately, it is easy to show that Rust does not construct objects directly into the allocation in debug mode. This is an artificially simple case that uses a struct literal rather than the very common Rust pattern of 'return value construction' (most new functions).
It appears that the current implementation cannot be competitive with C++ placement as-is. A new RFC might either propose different guarantees, or describe how the implementation should work given the very different method of construction in Rust (compared to C++). Straw man: "A call to a fn() -> T can be satisfied by a fn(&uninit T) function of the same name (allowing you to assign fields directly in the function body via the uninit reference)".
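To make the straw man more concrete, here is a hedged sketch of what a paired fn() -> T and "fn(&uninit T)" could look like when emulated in today's Rust. There is no &uninit reference type, so the sketch substitutes &mut MaybeUninit<T>; the Point type and the new_in name are hypothetical, and nothing here is resolved automatically by the compiler as the straw man would require.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct Point {
    x: u64,
    y: u64,
}

impl Point {
    // Ordinary 'return value construction': the value is built and then
    // moved (or copied) to wherever the caller wants it.
    fn new() -> Point {
        Point { x: 1, y: 2 }
    }

    // Emulation of the straw man's `fn(&uninit T)`: the caller supplies
    // the destination and every field is written directly into it.
    fn new_in(slot: &mut MaybeUninit<Point>) -> &mut Point {
        let p = slot.as_mut_ptr();
        // SAFETY: `p` points to writable storage for a `Point`, and both
        // fields are initialized before the slot is treated as a `Point`.
        unsafe {
            addr_of_mut!((*p).x).write(1);
            addr_of_mut!((*p).y).write(2);
            slot.assume_init_mut()
        }
    }
}

fn main() {
    let by_value = Point::new();

    // The destination could equally be heap storage owned by a container.
    let mut slot = MaybeUninit::<Point>::uninit();
    let in_place = Point::new_in(&mut slot);

    println!("{} {}", by_value.x + by_value.y, in_place.x + in_place.y);
}
```

The manual version shows the cost of the approach: every constructor needs a second, unsafe body, which is roughly the duplication a new RFC would have to address.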
The functionality of placement is unpredictable
As described by the C++ goals for placement (mentioned above), placement is typically used because you need to have explicit control over the location an object is put at. We saw above that Rust fails in very simple cases, but even if it didn't there is a more general issue - there is no feedback to the user whether placement is actually working. For example, there is no way for a user to tell that linkedlist.back_place() <- [0u8; 10*1024*1024] is placed but linkedlist.back_place() <- [[0u8; 10*1024*1024]] is not.
Effectively, placement as implemented today is a 'slightly-better-effort to place values than normal assignment'. For an API that aims to offer additional control, this unpredictability is a significant problem. A new RFC might either provide clear guidance and documentation on what placement is guaranteed, or require that compilation will fail if a requested placement cannot succeed. Straw man 1: "Placement only works for arrays of bytes. Function calls (e.g. serde or anything with fallible creation) and DSTs will not work". Straw man 2: "If a same-name fn(&uninit T) does not exist for the fn() -> T call being placed, compilation will fail".
Specific unresolved questions
There are a number of specific unresolved questions around the RFC(s), but there has been effectively no design work for about 2 years. These include (some already covered above):
Placer::make_place - [2]
More speculative unresolved questions include:
[0] https://github.com/rust-lang/rfcs/pull/470
[1] https://github.com/rust-lang/rfcs/pull/809#issuecomment-73910414
[2] https://github.com/rust-lang/rfcs/issues/1286
[3] https://github.com/rust-lang/rfcs/issues/1315
[4] https://github.com/rust-lang/rust/issues/27779#issuecomment-146711893
[5] https://github.com/rust-lang/rust/issues/27779#issuecomment-285562402
[6] https://github.com/rust-lang/rust/issues/27779#issuecomment-354464938
[7] https://github.com/rust-lang/rfcs/pull/1228#issuecomment-190825370
[irlo1] https://internals.rust-lang.org/t/placement-nwbi-faq-new-box-in-left-arrow/2789
[irlo2] https://internals.rust-lang.org/t/placement-nwbi-faq-new-box-in-left-arrow/2789/19
[irlo3] https://internals.rust-lang.org/t/lang-team-minutes-feature-status-report-placement-in-and-box/4646
I've opted to list these rather than going into detail, as they're generally covered comprehensively by the corresponding links. A future RFC might examine these points to identify areas to explicitly address, including (in no particular order):
@scottjmaddox I responded to you on the internals thread - https://internals.rust-lang.org/t/removal-of-all-unstable-placement-features/7223/2.
Closing since the unaccepting PR has been merged.
@aidanhs this is the tracking RFC for box_syntax as well, according to the Unstable Book.
@abonander good point. I've created a new tracking issue for just box syntax, updated my summary comment above and made a post on the thread in the internals forum.
@aidanhs You'll need to update the tracking issue number in the Rust source as well.
@kennytm I've filed PR #51066 for this
@aidanhs thank you for the summary! I've finally understood it well.