This tracks the stabilization of BinaryHeap::into_iter_sorted
(binary_heap_into_iter_sorted
) and BinaryHeap::drain_sorted
(binary_heap_drain_sorted
) implemented in https://github.com/rust-lang/rust/pull/65091.
EDIT much later:
into_sorted_vec
) need to be addressed before a future attempt to stabilize.Original feature request:
Currently, both BinaryHeap::into_iter
and BinaryHeap::drain
yield the heap elements in an arbitrary order. This seems counter-intuitive given the nature of a heap. Sadly, we can't simply change their behavior behind the scenes, as we would have to remove the implementation of DoubleEndedIterator
. Should we perhaps add into_ordered_iter
and drain_ordered
? The implementation is pretty straightforward so I'd be happy to submit a PR.
Isn't into_sorted_vec().into_iter()
sufficient?
@clarfon No, because getting the largest k
items from that is O(n lg n + k)
, whereas with into_ordered_iter()
it would be O(k lg n)
.
@clarfon In addition to what @scottmcm said, into_sorted_vec().into_iter()
takes up space proportional to the number of items in the heap because you have to create the Vec
first. That means that you've roughly doubled the amount of memory you need, which could be a concern either when you are on more limited systems, or when you're dealing with very large amounts of data.
@ckaran into_sorted_vec
actually doesn't take up any space at all; it reuses the same buffer.
@clarfon Sorry, I was being unclear. I was thinking that you'd want to continue using the heap afterwards. IIRC, into_sorted_vec()
will consume the collection (which, you're right, is done in place), which means that you need to do a clone on the heap first, which is what I was really commenting on. Sorry about being unclear.
Hi,
I've implemented both API in the local rust tree.
I see some problems in the API and I'm proposing the better version at the bottom of this comment.
Any ideas?
.into_ordered_iter(self)
/// Returns an iterator which retrieves elements in heap order.
/// This method consumes the original heap.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(binary_heap_into_ordered_iter)]
/// use std::collections::BinaryHeap;
/// let heap = BinaryHeap::from(vec![1, 2, 3, 4, 5]);
///
/// assert_eq!(heap.into_ordered_iter().take(2).collect::<Vec<_>>(), vec![5, 4]);
/// ```
#[unstable(feature = "binary_heap_into_ordered_iter", issue = "59278")]
pub fn into_ordered_iter(self) -> IntoOrderedIter<T> {
IntoOrderedIter {
inner: self,
}
}
.drain_ordered(&mut self)
/// Returns an iterator which retrieves elements in heap order.
/// The retrieved elements will be removed from the original heap.
///
/// Note: Unlike other `.drain()` methods, this method removes elements *lazily*.
/// Only the retrieved elements will be removed.
///
/// # Examples
///
/// Basic usage:
///
/// ```
/// #![feature(binary_heap_drain_ordered)]
/// use std::collections::BinaryHeap;
///
/// let mut heap = BinaryHeap::from(vec![1, 2, 3, 4, 5]);
/// assert_eq!(heap.len(), 5);
///
/// let removed = heap.drain_ordered().take(2).collect::<Vec<_>>();
/// assert_eq!(removed, vec![5, 4]);
/// assert_eq!(heap.len(), 3);
///
/// let removed = heap.drain_ordered().take(3).collect::<Vec<_>>();
/// assert_eq!(removed, vec![3, 2, 1]);
/// assert_eq!(heap.len(), 0);
/// ```
#[inline]
#[unstable(feature = "binary_heap_drain_ordered", issue = "59278")]
pub fn drain_ordered(&mut self) -> DrainOrdered<'_, T> {
DrainOrdered {
inner: self,
}
}
BinaryHeap
doc example could be updated after this feature would be stabilized..drain()
inconsistency: The semantics of .drain_ordered()
is lazy. Other all .drain() method immediately removes elements..into_ordered_iter(&mut self)
..into_ordered_iter(self)
and .drain_ordered(&mut self)
. I picked more useful one and changed the name.~.drain_ordered()
method.~-> ~Solves P2 and P3. Partially solves P1 (discoverability).~
Both owning and borrowing versions are useful. I can't return a version that borrows from a local variable, but I can return the version that moves it.
That said, I agree that drop(foo.drain_ordered());
should result in an empty foo
. But that can be accomplished by just clearing in the iterator's drop (assuming it's safe so doesn't need to amplify).
(I'm not worried about discoverability here; people have presumably been taught about the difference in things like Vec
, which also has .into_iter()
and .drain(..)
.)
Thanks @scottmcm for the comment. I updated my comment accordingly.
@jonhoo, are you still working on PR for this?
@sekineh nope, not currently working on a PR
I've updated the issue to reflect that this now tracks the features implemented in https://github.com/rust-lang/rust/pull/65091.
@KodrAus (not sure if you're the right person to ping for this) This should probably get C-tracking
.
I kind of feel like in-order iteration should be the default for a heap. I was surprised to learn that the IntoIterator
impl yields items in arbitrary order.
Also the destructor of DrainSorted
will currently leak items if one of their destructors panics.
Most helpful comment
@clarfon No, because getting the largest
k
items from that isO(n lg n + k)
, whereas withinto_ordered_iter()
it would beO(k lg n)
.