Rust: Add Min-Max Heap to `std::collections`

Created on 2 Sep 2020  路  7Comments  路  Source: rust-lang/rust

I just read a blog post about the Min-Max heap being faster than the Binary Heap on modern processors, while being strictly more powerful since you can pop from both ends: https://probablydance.com/2020/08/31/on-modern-hardware-the-min-max-heap-beats-a-binary-heap/

Given these benefits, and since BinaryHeap is included in std::collections, I wonder if it would make sense to include a MinMaxHeap as well?

A-collections C-feature-request T-libs

Most helpful comment

I wonder if it's possible to do better than min-max heap by forgoing min-max property? Ie, an array can pack three interleaving tetrary max heaps.This give you the same parallel comparisons + four consequtive chilren in a row property.

EDIT: very quick&dirty benchmark did not disprove hypothesis that just making the heap four way makes it faster: https://github.com/matklad/quadheap/blob/91eb33870fc49280dda1e9be1fa904fc93d7364c/src/quad_heap.rs

EDIT EDIT: this is stupid and overly complicated. We can pack just one tetray heap into the array (i -> 4i + 1, 4i + 2, 4i + 3, 4i + 4) and that should be the best impl.

All 7 comments

This can be written as an external crate, has that been done already?

Also, if this really ends up being a more powerful and faster BinaryHeap, but with the same requirements on the user, then BinaryHeap can just be changed to a Min-Max heap internally. No new API needed.

This can be written as an external crate, has that been done already?

https://crates.io/crates/min-max-heap seems to be an existing implementation

Also, if this really ends up being a more powerful and faster BinaryHeap, but with the same requirements on the user, then BinaryHeap can just be changed to a Min-Max heap internally. No new API needed.

Hmm, can it? I thought that Vec::from(BinaryHeap::from(vec)) was defined to return the items in binary heap order, on which someone could potentially be depending. As you've said before, though, it sure would be nice to have .into_iter() be both sorted and double-ended for heaps...

I wonder if it's possible to do better than min-max heap by forgoing min-max property? Ie, an array can pack three interleaving tetrary max heaps.This give you the same parallel comparisons + four consequtive chilren in a row property.

EDIT: very quick&dirty benchmark did not disprove hypothesis that just making the heap four way makes it faster: https://github.com/matklad/quadheap/blob/91eb33870fc49280dda1e9be1fa904fc93d7364c/src/quad_heap.rs

EDIT EDIT: this is stupid and overly complicated. We can pack just one tetray heap into the array (i -> 4i + 1, 4i + 2, 4i + 3, 4i + 4) and that should be the best impl.

Hmm, can it? I thought that Vec::from(BinaryHeap::from(vec)) was defined to return the items in binary heap order, on which someone could potentially be depending.

The current comment for BinaryHeap::into_vec says that it "returns the underlying vector in arbitrary order", while the impl<T> From<BinaryHeap<T>> for Vec<T> just states "Performs the conversion" so I guess it shouldn't be considered a breaking change,

@matklad if you want to benchmark that you should probably remove those bound checks.

We should also consider removing branches from sift_down_range and sift_down_to_bottom. In particular the hole.get(child) <= hole.get(right) condition can be treated arithmetically and the right < end condition can be removed if the length is odd.
This applies to both the current implementation and @matklad's quadheap.

After the conversation about trying to stabilize BinaryHeap::into_iter_sorted ended up making me unhappy with the state of things there (https://github.com/rust-lang/rust/pull/76234#issuecomment-708505210), I'm personally more interested that I was previously in seeing this happen.

Having such a heap would address a bunch of things:

  • https://github.com/rust-lang/rust/issues/58174 talks about how the current one being a max-heap isn't inherently obvious, and while BinaryHeap<Reverse<T>> works great, there's something really nice about just having .pop_min() and .pop_max() for clarity.
  • A new heap could remove the iterate-in-unspecified-order problems of the existing one, and the min-max nature would mean that there'd be the natural .into_iter() for ascending, .into_iter().rev() for descending. (There'd still be Vec::from(...) or maybe some sort of .buffer_unordered() -> &[T] if people want to do something that doesn't care about the ordering.)

In a way it'd be like the VecDeque-vs-Vec pair.

Was this page helpful?
0 / 5 - 0 ratings