Changing this to a tracking issue for the feature:
Weak::new()) and T: ?Sized? These seem to be incompatible.Hello
The Arc has the into_raw and from_raw methods. I think it would be technically possible to have the same on Weak. Obviously, as Weak is non-owning, it would be up to the caller to make sure the pointer is not dangling when used.
Would adding these (and maybe as_raw too ‒ one can get a reference out of Arc, but not from Weak) require an RFC, because the handling of these methods might be even a bit more delicate than the ones on Arc, or is this considered small enough for just a pull request & later stabilization?
I've written the arc-swap crate that allows to keep an Arc around but make it point to some other object atomically. It uses the mentioned methods. It would make sense to have weak version of this atomic storage too.
Additionally, the as_raw would make it possible to compare if eg an Arc and Weak point to the same object (which would also come handy in some alternative of the Cache that doesn't hold the object alive needlessly).
Would adding these (and maybe
as_rawtoo ‒ one can get a reference out ofArc, but not fromWeak) require an RFC, because the handling of these methods might be even a bit more delicate than the ones onArc, or is this considered small enough for just a pull request & later stabilization?
A PR is sufficient; the motivation can be justified in there.
@rust-lang/libs is there any opposition to stabilizing at least .as_raw()? It's hard to imagine wanting that function looking any different than it does today, and it's pretty useful (I don't know another way to deduplicate weak pointers to the same object).
When you bring it up, what needs to be done for stabilizing it all? The API follows the one on Arc, so I don't think any of this could look differently.
There's the problem with unsized Ts ‒ currently, they are simply not supported (because of the interaction with dangling & null pointers). But relaxing the restriction later would be backwards compatible change, so I don't think a decision what exactly to do about that has to block stabilization.
Relaxing the restriction of T to ?Sized I don't think is compatible with the current safety requirements. On top of that, I think that as is currently written, it's self-conflicting.
[...]
It takes ownership of one weak count. In case a
nullis passed, a danglingWeakis returned.Safety
The pointer must represent one valid weak count. In other words, it must point to
Twhich is or _was_ managed by an (A)Rcand the weak count of that (A)Rcmust not have reached 0. It is allowed for the strong count to be 0.[...]
The safety requirement requires that the pointer point to a valid ArcInner/RcBox (side note: probably should make those naming consistent). But then it also says that it accepts null? The reason for this is that as_raw/into_raw return null when the strong count is 0 when the weak is fake (self.inner() returns None).
But on top of that, this is wrong (I just realized writing this up)! Because from_raw(into_raw(_)) on a dangling Weak turns it into the singleton dangling Weak, the weak count of the original will never be decreased! The correct behavior is for Weak::into_raw to return the dangling pointer such that Weak::from_raw can recover the original pointer to allocation and decrease the weak count. This then is fully compatible with T: ?Sized as it's doing the exact same as the non-weak version.
As for as_raw... it's "fine" for it to return null as a helpful hint (as dereferencing null is a faster error), but I think it's more important for as_raw(_); forget(_) to be equivalent to into_raw(_).
It's been a while since I wrote it, so I went back to have a look. I think it is slightly different than how you describe it. If I read through it correctly, there are three kinds of weak pointer values:
strong_count() > 0.strong_count() = 0. They've run the destructor on the T, but the RcBox is still there.Weak::new(). They don't have any RcBox allocated for them and their pointer is set to a sentinel value (usize::MAX casted to pointer).The into_raw()/from_raw() return null() only for the last case. There's no weak count to speak of in that case, so nothing is forgotten to be decreased.
The code needs to special-case the sentinel value somehow anyway, it can't just add the offset to it, because it would overflow and that is AFAIK forbidden thing to do with pointers. It could also return the sentinel value (usize::MAX), but null seems to be as good sentinel value as that other one, with the difference users recognize it better. It also has proper alignment for the type, which other code might rely on (by putting flags into the low bits). The downside is that pointers to unsized types don't have null() and if I remember correctly, this was the reason why they are not currently supported.
I can try changing the sentinel value and see if I can add support for unsized types instead.
As for the documentation, if I just remove all notions of null and keep it to the fact the pointer must come from previous Weak and not specify what value is returned in case it is from Weak::new(), would it be better?
Ah, I admit, I read the code wrong.
Ok, I agree returning null for "fake" weak pointers is probably the correct sentinel. (The use of usize::MAX as a sentinel for the inner pointer only works because the RcBox/ArcInner has align(usize) > 1, but the T doesn't have that guarantee, so usize::MAX is a _theoretically_ valid pointer. I think a T pointer within an RcBox/ArcInner is guaranteed to have at least align(inner) because of the alignment of the container, but it'd be better to not rely on that.)
But the nonexistence of ptr::null for ?Sized types isn't really an issue. The reason it doesn't exist is the inability for it to conjure pointer metadata out of thin air. We have that!
I don't know the exact way std would accomplish this, but "all" that needs to be done is to pull the metadata off of the inner pointer, and put it onto a null pointer. (And then do the same thing in from_raw.)
For docs, I'd just specify that the pointer given to Weak::from_raw must have come from Weak::as_raw or Weak::into_raw.
I've tried updating the docs, I hope it's for the better.
But the nonexistence of ptr::null for ?Sized types isn't really an issue. The reason it doesn't exist is the inability for it to conjure pointer metadata out of thin air. We have that!
I don't know the exact way std would accomplish this, but "all" that needs to be done is to pull the metadata off of the inner pointer, and put it onto a null pointer. (And then do the same thing in from_raw.)
It's not the problem with null pointers per se, because Weak::new exists for Sized types only too. It's simply illegal to put null pointer into into_raw for unsized type for that reason. However the sized implementation needs to check for null equality. It would be fine if this turned out to be false every time for unsized types, but as the null() doesn't even exist, it doesn't compile :-(. I think specialization would help here but AFAIK it's not ready to be used yet. Or is it?
But, after clarifying the documentation, is there anything that would prevent adding that specialization/relaxing the bounds in the future in backwards compatible manner?
Hello? I know this is probably not a high priority feature, but it also seems somewhat small and I'd really like to push it over the finish line somehow. Is this waiting on something from me? I believe the raised concerns about docs were hopefully addressed.
@rfcbot fcp merge
Team member @Amanieu has proposed to merge this. The next step is review by the rest of the tagged team members:
Concerns:
as_raw should be as_ptrWeak are represented using nullOnce a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
I would prefer if we made no guarantee as to what pointer value is generated for a dangling Weak. This could allow for a slightly more efficient implementation since we could just blindly apply the offset of usize::MAX to convert to/from a raw pointer.
@rfcbot concern dangling Weak are represented using null
The convention for returning raw pointers to a value is as_ptr. There is no precedent for as_raw in the standard library.
@rfcbot concern as_raw should be as_ptr
@Amanieu Consider not only that other types use as_ptr, but that Arc/Rc already have both into_raw and from_raw. I don't think any type actually has both the "as_ptr" and "into_raw" operations yet:
as_ptr: slice, str, Cell, RefCell, CStr, MaybeUninit, NonNull, Vecinto_ptr: (none)from_ptr: CStras_raw: (unstable) rc::Weak, sync::Weakinto_raw: Box, CString, Rc, Arc, (unstable) rc::Weak, sync::Weakfrom_raw: ExitStatus, ExitStatusExt, Box, CString, Rc, Arc, Waker, (unstable) rc::Weak, sync::WeakI think sticking to _raw methods for the Rc family is more consistent than using as_ptr here. Consider also: does from_raw(as_ptr()) look like it should be allowed? from_raw(as_raw()) is clearly using the same "raw representation".
There's also the fact that I would think "as_ptr" would be &*rc, i.e. deref but to a pointer, not a ref. as_raw is more subtle because it keeps mutable provenance over the location, and as such can be used for from_raw, unlike a deref. (The standard library got this wrong previously!)
Some small-ish concerns:
Arc/Rc should have as_raw/as_ptr to mirror the API surface of Weak. Note that &*arc is _not enough_ for as_raw, see the link above where the standard library got this wrong.
@rfcbot concern api should be mirrored on Arc/Rc as well
I want to make sure that these functions on Weak are forwards-compatible with supporting T: ?Sized in the future. When the requirements on pointer metadata for invalid (non-derefable) pointers are clear, the standard library should be able to manufacture some arbitrary pointer metadata for the dangling Weak case (at least for some subset of T: ?Sized).
@rfcbot concern forwards-compatible to T: ?Sized
I don't know if the rfcbot commands will work for me (they probably won't), but might as well try.
Consider also: does from_raw(as_ptr()) look like it should be allowed? from_raw(as_raw()) is clearly using the same "raw representation".
No it doesn't look like it should be allowed, and that's a good thing! You really should be using into_raw instead of as_ptr for this.
In fact I would be in favor of removing as_ptr + forget as a valid way of constructing a pointer that can be used with from_raw, specifically because of the provenance issues you mentioned earlier. The only valid pointer that can be passed to from_raw is one created by into_raw. I don't think this will affect any practical use cases but do let me know if I'm wrong.
I want to make sure that these functions on Weak are forwards-compatible with supporting T: ?Sized in the future.
It is relatively easy to support T: ?Sized if we drop the guarantee that dangling Weaks return a null pointer. We can just let the pointer dangle in that case.
I would prefer if we made no guarantee as to what pointer value is generated for a dangling Weak
I would prefer to be able to create dangling pointer from null (or, as dangling weak pointers exist, it's reasonable to assume that one will be created if I feed it null). But considering any pointer returned from weak may already be dangling because the count had dropped to 0, I think the other guarantee didn't really exist. I'll go over the docs and see if there are any mentions of guarantees.
Do you think I should also change the implementation, so nobody relies on it by accident?
I don't think this will affect any practical use cases but do let me know if I'm wrong.
I'll have to review my code in arc-swap. I suspect I might actually be using it, but I'm not 100% sure. There are some cases around lock-free algorithms when I need the pointer but I don't know yet if storing it into AtomicPtr will work out or not. So I store the as_ptr and then do forget or into_raw when it's confirmed. I probably could do into_raw first and then, if it fails, do from_raw again to recover.
Do you think I should also change the implementation, so nobody relies on it by accident?
Yes, I think that's a good idea. Note that you'll need to use wrapping_offset instead of offset to support the dangling pointers.
@Amanieu
In fact I would be in favor of removing as_ptr + forget as a valid way of constructing a pointer that can be used with from_raw, specifically because of the provenance issues you mentioned earlier. The only valid pointer that can be passed to from_raw is one created by into_raw. I don't think this will affect any practical use cases but do let me know if I'm wrong.
I use Arc::as_raw (currently spelled Arc​::​into_raw​(​unsafe​ { ptr​::​read​(this) })) rather pervasively to implement ArcBorrow<'_, _>, which is effectively "&T but can be upgraded to Arc<T>." This is "+0" passing for Arc, as opposed to double-indirect passing (&Arc<T>) or "+1" passing (Arc<T>).
I'm _not_ doing as_raw + forget, I'm doing as_raw + clone_raw.
If it can't be used for from_raw, it definitely should be as_ptr instead of as_raw. But I see real use cases (well, mostly just +0 passing) for as_raw, but not much use for as_ptr beyond implementing ptr_eq. (What can you do with a Weak::as_ptr? Not deref it, not without knowing you have a strong ref around somewhere, and you can't check that from the ptr if you can't turn it back into a Weak.)
Fair enough, let's leave the semantics of as_raw as they are right now and just rename it to as_ptr for consistency with the rest of libstd. Since this is clearly something that would be useful for Rc/Arc it would make sense to add a method there as well.
Yes, I think that's a good idea. Note that you'll need to use wrapping_offset instead of offset to support the dangling pointers.
While implementing it, I've noticed one thing. The current implementation will return a pointer with invalid alignment. Is that Ok?
Since this is clearly something that would be useful for Rc/Arc it would make sense to add a method there as well.
Here, as part of this feature?
I think it's fine to return an unaligned pointer (you're not allowed to dereference it anyways). We should document it however.
Here, as part of this feature?
Sure, might as well.
May I ask what happens now? I believe I've addressed the concerns and it was merged, but they are still listed here. How are they removed?
@rfcbot resolve dangling Weak are represented using null
@rfcbot resolve as_raw should be as_ptr
:bell: This is now entering its final comment period, as per the review above. :bell:
Hmm, I don't know about that as_ptr. Forcing the user into using into_ptr is a great lint that would've prevented some bug's I experienced in C++ code had C++ have it. @CAD97 use case is legit, though. Does that one use case justify "removing a lint"?
@Kixunil Can you explain what you mean? It was renamed to as_ptr to hint that it should not be interchangeable with the into_raw. The documentation no longer suggests it should be possible to use as_ptr to re-create it with from_raw and then forget it. So I think your concern was already resolved?
The final comment period, with a disposition to merge, as per the review above, is now complete.
As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.
The RFC will be merged soon.
@vorner I mean not having it at all. :) into_raw is much more useful in my experience. If not for @CAD97 use case, I would believe as_ptr to be completely useless.
There are other use cases. Like, hashing it, equivalence checking, etc.
@vorner aren't those cases better served by using a reference and converting it? Seems more obvious what it does.
What reference do you have in mind? &Weak<T> is not useful for these cases (where you want to check if eg. an atomic pointer still points to the same thing as the Weak you have cached or if you want to recreate a new one). You don't get &T out of a Weak and have to upgrade it to the Arc. But that is both more expensive and might return None ‒ but you might want to know that the Weak you have is still the one you should have even if it is no longer valid. The as_ptr on Arc is there mostly for symmetry.
But I don't really understand your issue. What kind of misuse are you afraid of? as_ptr doesn't seem to allow more foot guns than into_raw ‒ you still get a raw pointer out of that and can do whatever you like with it. You also need to use unsafe to do anything dangerous with it, which should be a lint enough to stop people from doing stupid things by accident.
I assume that when the FPC is over, I can enjoy submitting a stabilization PR.
You don't get
&Tout of aWeak
Ah, yeah, I missed that. I had strong ones in mind mostly. Might be a good reason to have ptr_eq and hash_ptr methods instead, but pointer is definitely more flexible. But the docs should probably explain the pointer is useless for anything else than comparison.
I was mostly wondering about as_ptr being confused with into_ptr in case of strong versions. A sad thing about these kinds of bugs is when combined with multithread environment they can be really hard to debug.
(Personal experience: code sending a pointer from one thread to another waiting on select over a Unix pipe, the sender forgot to ref the pointer. Took years to find the cause of mysterious crashes.)
Might be a good reason to have ptr_eq and hash_ptr methods instead
That wouldn't allow customizing how you hash or against what you compare the pointer (I'd say ptr_eq should take another Weak as a parameter and in case of storing the pointer out-of-bound, you have *const T, maybe).
But the docs should probably explain the pointer is useless for anything else than comparison.
But that is not true. The pointer is fully usable as long as you make sure by some other means that the lifetime of the object is still valid, same as with any other pointer.
Yeah, it's possible to do it by other means, I just can't imagine a real-life situation in which one would have a Weak and know that it's valid without also having Rc/Arc and using a reference to those instead.
That all being said, it's not a huge deal, I guess. Mainly wanted this to be considered.
Most helpful comment
When you bring it up, what needs to be done for stabilizing it all? The API follows the one on
Arc, so I don't think any of this could look differently.There's the problem with unsized
Ts ‒ currently, they are simply not supported (because of the interaction with dangling & null pointers). But relaxing the restriction later would be backwards compatible change, so I don't think a decision what exactly to do about that has to block stabilization.