Currently PriorityQueue defaults to a max-heap, and if you want to use it as a min-heap you need to wrap your type and implement Ord with the comparison reversed.
But you should be able to use the default ordering and simply specify that you want a min-heap, so we can use it for types like (uint, &str) for example.
The simple approach would be to add new_min_heap and min_heap_with_capacity constructors.
Thoughts?
I would tend to favor composability over duplication like this, so something along the lines of an "invert Ord" wrapper may work well.
I agree, but how would we realize it?
Using a newtype with derived traits?
#[deriving(InvertedOrd)]
struct Key<T>(T);
let mut pq = PriorityQueue::new();
pq.push((Key(2), 10u));
Or have a generic struct which implements the Ord trait?
struct InvertOrd<T>(T);
// Not sure how this would work?
impl<T> Ord for InvertOrd<T> { ... }
let mut pq = PriorityQueue::new();
pq.push((InvertOrd(2i), 10u));
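In current Rust syntax, the generic-struct idea works by implementing Ord and PartialOrd with the arguments swapped. A minimal sketch (the InvertOrd name is just the one proposed above, not a std type):

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Newtype whose ordering is the reverse of the wrapped value's.
#[derive(PartialEq, Eq, Debug)]
struct InvertOrd<T>(T);

impl<T: PartialOrd> PartialOrd for InvertOrd<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        other.0.partial_cmp(&self.0) // note: arguments swapped
    }
}

impl<T: Ord> Ord for InvertOrd<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        other.0.cmp(&self.0)
    }
}

fn main() {
    // std's max-heap now behaves as a min-heap on the wrapped key.
    let mut pq = BinaryHeap::new();
    pq.push((InvertOrd(2), "c"));
    pq.push((InvertOrd(0), "a"));
    pq.push((InvertOrd(1), "b"));
    assert_eq!(pq.pop(), Some((InvertOrd(0), "a")));
}
```

The downside discussed next in the thread still applies: every push has to wrap the key, and every pop has to unwrap it.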
@treeman here's a rough impl for the struct: http://is.gd/UnblpD
Unfortunately, this means the user has to actively wrap/unwrap the value, and be exposed to the internal comparison details. Wouldn't something like this make more sense?
struct PriorityQueue<T> {
    comparator: fn(&T, &T) -> Ordering,
    // other stuff
}

impl<T> PriorityQueue<T> {
    fn with_comparator(comparator: fn(&T, &T) -> Ordering) -> PriorityQueue<T> {
        PriorityQueue { comparator: comparator, /* other stuff */ }
    }
}

fn natural<F: Ord>(a: &F, b: &F) -> Ordering { a.cmp(b) }

fn rev<F: Ord>(a: &F, b: &F) -> Ordering {
    match a.cmp(b) {
        Equal => Equal,
        Less => Greater,
        Greater => Less,
    }
}

impl<T: Ord> PriorityQueue<T> {
    fn natural_ordering() -> PriorityQueue<T> {
        PriorityQueue::with_comparator(natural)
    }
    fn rev_ordering() -> PriorityQueue<T> {
        PriorityQueue::with_comparator(rev)
    }
    fn new() -> PriorityQueue<T> {
        PriorityQueue::natural_ordering()
    }
}
Then a priority queue can be instantiated with whatever comparison logic is desirable, without the need to wrap values.
It would be really slow if it used a function pointer.
You're probably right. I'm not familiar with how Rust can optimize/inline some of these things. Would a closure or proc be better? Is there any way to convince Rust "hey, create a concrete instance of this type, but with this one method's behavior changed", that isn't slow? It just seems to me like ordering the elements should be the responsibility of the structure, not the data itself. I guess we could template the priority queue on a comparator object? Would that be fast enough?
Adding a comparator type parameter would work. It would need to use the closure traits.
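As a sketch of what a comparator type parameter could look like in current Rust (the Compare trait and Heap type here are hypothetical, not std APIs): each comparator is a zero-sized type, so the comparison is statically dispatched and inlinable, unlike a stored function pointer.

```rust
use std::cmp::Ordering;

// Hypothetical comparator trait, dispatched statically via a type parameter.
trait Compare<T> {
    fn compare(&self, a: &T, b: &T) -> Ordering;
}

struct Natural;
impl<T: Ord> Compare<T> for Natural {
    fn compare(&self, a: &T, b: &T) -> Ordering { a.cmp(b) }
}

struct Rev;
impl<T: Ord> Compare<T> for Rev {
    fn compare(&self, a: &T, b: &T) -> Ordering { b.cmp(a) }
}

// A heap parameterized over its comparator type; only push/peek shown.
struct Heap<T, C: Compare<T>> {
    data: Vec<T>,
    cmp: C,
}

impl<T, C: Compare<T>> Heap<T, C> {
    fn with_comparator(cmp: C) -> Self {
        Heap { data: Vec::new(), cmp }
    }

    fn push(&mut self, item: T) {
        self.data.push(item);
        // Sift the new element up; "Greater" per the comparator bubbles to the top.
        let mut i = self.data.len() - 1;
        while i > 0 {
            let parent = (i - 1) / 2;
            if self.cmp.compare(&self.data[i], &self.data[parent]) == Ordering::Greater {
                self.data.swap(i, parent);
                i = parent;
            } else {
                break;
            }
        }
    }

    fn peek(&self) -> Option<&T> { self.data.first() }
}

fn main() {
    // A min-heap is just the same heap with the reversed comparator.
    let mut min = Heap::with_comparator(Rev);
    min.push(3);
    min.push(1);
    min.push(2);
    assert_eq!(min.peek(), Some(&1));
}
```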
@thestinger I hacked together a very quick proof-of-concept, and threw it and std at my priorityqueue bench suite:
test compheap::bench::fill_and_drain_ord_big ... bench: 51640108 ns/iter (+/- 3253218)
test compheap::bench::fill_and_drain_ord_small ... bench: 266878 ns/iter (+/- 31967)
test compheap::bench::fill_and_drain_unord_big ... bench: 35833288 ns/iter (+/- 1339425)
test compheap::bench::fill_and_drain_unord_small ... bench: 213275 ns/iter (+/- 40535)
test compheap::bench::fill_and_pop_ord_big ... bench: 22460836 ns/iter (+/- 2083325)
test compheap::bench::fill_and_pop_ord_small ... bench: 114854 ns/iter (+/- 19998)
test compheap::bench::fill_and_pop_unord_big ... bench: 6600282 ns/iter (+/- 315886)
test compheap::bench::fill_and_pop_unord_small ... bench: 64512 ns/iter (+/- 6894)
test compheap::bench::from_iter_ord_big ... bench: 22290821 ns/iter (+/- 1092106)
test compheap::bench::from_iter_ord_small ... bench: 113892 ns/iter (+/- 10048)
test compheap::bench::from_iter_unord_big ... bench: 6617579 ns/iter (+/- 547505)
test compheap::bench::from_iter_unord_small ... bench: 62940 ns/iter (+/- 8329)
test compheap::bench::mixed_access_ord_big ... bench: 21986290 ns/iter (+/- 799167)
test compheap::bench::mixed_access_ord_small ... bench: 104077 ns/iter (+/- 7628)
test compheap::bench::mixed_access_unord_big ... bench: 18078055 ns/iter (+/- 1066857)
test compheap::bench::mixed_access_unord_small ... bench: 94110 ns/iter (+/- 8743)
test stdheap::bench::fill_and_drain_ord_big ... bench: 47650548 ns/iter (+/- 1200098)
test stdheap::bench::fill_and_drain_ord_small ... bench: 249045 ns/iter (+/- 15685)
test stdheap::bench::fill_and_drain_unord_big ... bench: 33808985 ns/iter (+/- 3187699)
test stdheap::bench::fill_and_drain_unord_small ... bench: 196874 ns/iter (+/- 17739)
test stdheap::bench::fill_and_pop_ord_big ... bench: 20377909 ns/iter (+/- 580098)
test stdheap::bench::fill_and_pop_ord_small ... bench: 110093 ns/iter (+/- 21624)
test stdheap::bench::fill_and_pop_unord_big ... bench: 6178293 ns/iter (+/- 332516)
test stdheap::bench::fill_and_pop_unord_small ... bench: 60956 ns/iter (+/- 4306)
test stdheap::bench::from_iter_ord_big ... bench: 20635709 ns/iter (+/- 1683518)
test stdheap::bench::from_iter_ord_small ... bench: 110586 ns/iter (+/- 27977)
test stdheap::bench::from_iter_unord_big ... bench: 6438810 ns/iter (+/- 1299145)
test stdheap::bench::from_iter_unord_small ... bench: 61043 ns/iter (+/- 7613)
test stdheap::bench::mixed_access_ord_big ... bench: 20329609 ns/iter (+/- 2862116)
test stdheap::bench::mixed_access_ord_small ... bench: 97260 ns/iter (+/- 4522)
test stdheap::bench::mixed_access_unord_big ... bench: 16724153 ns/iter (+/- 739112)
test stdheap::bench::mixed_access_unord_small ... bench: 87819 ns/iter (+/- 7974)
(Note Ord and Unord are actually reversed since I wrote this for minheaps)
There's clearly a performance regression, on the order of ~8%. Is that an acceptable cost for the flexibility gained?
Edit: this is using their natural orderings, in case it wasn't clear.
It would be more than 8% after fixing the existing high performance cost from making the code safe during unwinding. The cost of indirect calls is also lower in a micro-benchmark than in the real world, where the entire program doesn't fit in the cache and there are a lot of branches. It's unclear which optimization level you're using and what you're using to measure the performance anyway.
My benchmarks are here. Tested them with uints. They basically just build an iterator of uints of size 100 or 10000 (anything more just takes _forever_ to bench) with different ordering properties. The individual bench names are fairly self explanatory.
I used the default optimization provided by cargo (and presumably rustc). Is there any access pattern and/or compiler flags you'd like to see? Or do I even need to justify using comparators to gain flexibility?
Edit: Oh I just noticed your note about closure traits. My proof-of-concept just made actual (empty) structs implementing a trait with a cmp(&self, &T, &T) method. Would closure traits provide more efficient options here?
Rust doesn't optimize by default; without passing -O the numbers aren't meaningful.
Would closure traits provide more efficient options here?
It would allow passing a function/closure without defining a type every time.
A stored function pointer will have high overhead and a good implementation with a type parameter would have zero overhead.
Oh wow, I didn't realize how _massive_ the performance gap was between the default and -O:
test compheap::bench::fill_and_drain_ord_big ... bench: 1513563 ns/iter (+/- 223270)
test compheap::bench::fill_and_drain_ord_small ... bench: 9195 ns/iter (+/- 2527)
test compheap::bench::fill_and_drain_unord_big ... bench: 1891539 ns/iter (+/- 136903)
test compheap::bench::fill_and_drain_unord_small ... bench: 8976 ns/iter (+/- 3260)
test compheap::bench::fill_and_pop_ord_big ... bench: 459496 ns/iter (+/- 25370)
test compheap::bench::fill_and_pop_ord_small ... bench: 6011 ns/iter (+/- 656)
test compheap::bench::fill_and_pop_unord_big ... bench: 373232 ns/iter (+/- 42998)
test compheap::bench::fill_and_pop_unord_small ... bench: 6111 ns/iter (+/- 703)
test compheap::bench::from_iter_ord_big ... bench: 459903 ns/iter (+/- 61074)
test compheap::bench::from_iter_ord_small ... bench: 6004 ns/iter (+/- 366)
test compheap::bench::from_iter_unord_big ... bench: 372277 ns/iter (+/- 29313)
test compheap::bench::from_iter_unord_small ... bench: 5873 ns/iter (+/- 1416)
test compheap::bench::mixed_access_ord_big ... bench: 673766 ns/iter (+/- 215907)
test compheap::bench::mixed_access_ord_small ... bench: 3790 ns/iter (+/- 808)
test compheap::bench::mixed_access_unord_big ... bench: 739408 ns/iter (+/- 113694)
test compheap::bench::mixed_access_unord_small ... bench: 4281 ns/iter (+/- 1127)
test stdheap::bench::fill_and_drain_ord_big ... bench: 1402292 ns/iter (+/- 117071)
test stdheap::bench::fill_and_drain_ord_small ... bench: 9844 ns/iter (+/- 2735)
test stdheap::bench::fill_and_drain_unord_big ... bench: 1851510 ns/iter (+/- 224613)
test stdheap::bench::fill_and_drain_unord_small ... bench: 8839 ns/iter (+/- 1712)
test stdheap::bench::fill_and_pop_ord_big ... bench: 461084 ns/iter (+/- 26943)
test stdheap::bench::fill_and_pop_ord_small ... bench: 6025 ns/iter (+/- 878)
test stdheap::bench::fill_and_pop_unord_big ... bench: 377311 ns/iter (+/- 45955)
test stdheap::bench::fill_and_pop_unord_small ... bench: 6176 ns/iter (+/- 987)
test stdheap::bench::from_iter_ord_big ... bench: 462564 ns/iter (+/- 51519)
test stdheap::bench::from_iter_ord_small ... bench: 5851 ns/iter (+/- 853)
test stdheap::bench::from_iter_unord_big ... bench: 370572 ns/iter (+/- 53719)
test stdheap::bench::from_iter_unord_small ... bench: 6070 ns/iter (+/- 1134)
test stdheap::bench::mixed_access_ord_big ... bench: 655789 ns/iter (+/- 98459)
test stdheap::bench::mixed_access_ord_small ... bench: 3678 ns/iter (+/- 403)
test stdheap::bench::mixed_access_unord_big ... bench: 724466 ns/iter (+/- 79852)
test stdheap::bench::mixed_access_unord_small ... bench: 4221 ns/iter (+/- 593)
Percentage-wise, the perf difference is about the same, if not smaller. I'll start making a less hacked-together impl that uses closure traits. I'm guessing one source of overhead might be checking cmp(a, b) == Greater vs just *a > *b.
Sorry, which traits exactly do you consider the "closure" ones?
Hmm, how do we want to combine the notion of specifying comparators with all the other constructors? Should they all require a comparator to be passed to them? Or should we provide _with_comparator variants, and have the normal ones require C: Comparator<T> + Default?
Nice job @Gankro. The interface looks about the same as I initially had imagined.
Personally I would like with_* variants. There is something related in HashMap which has:
fn new()
fn with_capacity()
fn with_hasher()
fn with_capacity_and_hasher()
so I imagine something similar would work here?
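A rough sketch of how those constructor variants could line up, mirroring HashMap (hypothetical names, current Rust syntax):

```rust
use std::cmp::Ordering;

// Hypothetical comparator trait; Natural is the Default comparator.
trait Compare<T> {
    fn compare(&self, a: &T, b: &T) -> Ordering;
}

#[derive(Default)]
struct Natural;

impl<T: Ord> Compare<T> for Natural {
    fn compare(&self, a: &T, b: &T) -> Ordering { a.cmp(b) }
}

struct Heap<T, C> {
    data: Vec<T>,
    cmp: C,
}

// The familiar constructors keep their signatures by fixing C = Natural...
impl<T: Ord> Heap<T, Natural> {
    fn new() -> Self {
        Heap::with_capacity(0)
    }
    fn with_capacity(cap: usize) -> Self {
        Heap::with_capacity_and_comparator(cap, Natural)
    }
}

// ...while with_* variants take an explicit comparator, mirroring
// HashMap's with_hasher / with_capacity_and_hasher.
impl<T, C: Compare<T>> Heap<T, C> {
    fn with_comparator(cmp: C) -> Self {
        Heap::with_capacity_and_comparator(0, cmp)
    }
    fn with_capacity_and_comparator(cap: usize, cmp: C) -> Self {
        Heap { data: Vec::with_capacity(cap), cmp }
    }
}

fn main() {
    let h: Heap<u32, Natural> = Heap::new();
    assert_eq!(h.data.len(), 0);
    let _ = &h.cmp; // silence dead-code warning for the sketch
}
```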
@treeman @thestinger I've got a rough design of PriorityQueue with Comparators. I'd like to discuss design before I work on tests and cleaning up docs. Should I just make a gist or something? Or submit a WIP pull request?
I think a WIP pull request would be fine.
Pull request up at #16047
@Gankro what's the state of this these days?
collect-rs has an experimental design for Comparators. We're baking it out-of-tree for now. Should be back-compat to add later with default generics.
Interested in this as well.
There is now a revord crate, which can be used like this (I believe):
extern crate revord;
use revord::RevOrd;
use std::collections::BinaryHeap;

let mut pq = BinaryHeap::new();
pq.push((RevOrd(1), "foo"));
pq.push((RevOrd(0), "bar"));
pq.push((RevOrd(2), "baz"));
assert_eq!(pq.pop(), Some((RevOrd(0), "bar")));
@steveklabnik Considering https://github.com/rust-lang/rust/pull/40720, maybe this can be closed?
We discussed this during triage today and while we'd like to see this feature it'd likely be done with comparators nowadays, which would require an RFC, so we decided to close this.
Apologies for poking an old issue, but I just ran into a situation where a min-heap would be useful. I'm currently working around it by wrapping all the elements I put on the heap in std::cmp::Reverse, but the (wrap->push), (pop->unwrap) operations feel sort of clumsy. Additionally, if I didn't have to wrap, I could construct the heap with from(vec: Vec<T>), where I'm doing from_iter with a map right now.
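For reference, the std::cmp::Reverse workaround described above looks like this in current Rust, including the wrap-on-the-way-in required to build the heap from existing data:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn main() {
    // BinaryHeap is a max-heap; wrapping keys in Reverse yields a min-heap.
    let mut pq = BinaryHeap::new();
    pq.push(Reverse(3));
    pq.push(Reverse(1));
    pq.push(Reverse(2));

    // Pop yields the smallest value first; unwrap via the tuple field.
    assert_eq!(pq.pop().map(|Reverse(x)| x), Some(1));

    // Building from a Vec requires mapping every element through Reverse,
    // which is the clumsiness the comment above describes.
    let pq2: BinaryHeap<_> = vec![3, 1, 2].into_iter().map(Reverse).collect();
    assert_eq!(pq2.peek(), Some(&Reverse(1)));
}
```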
@alexcrichton's last message mentioned doing this with comparators, which I assume meant redoing the BinaryHeap to use comparators for item comparison, and allowing the user to specify the comparator. I'm not sure where this is though, as I believe comparators are still in a separate crate? (https://github.com/contain-rs/compare)
Is there anything I could do to help work on this?
I have a binary-heap-plus crate that is backward compatible with std's BinaryHeap.
You might find it useful:
https://crates.io/crates/binary-heap-plus
Is there some reason that we can't just use std::cmp::Reverse? Perhaps we can extend it to implement Deref for the interior value.
For min-heap-only support, Reverse is OK.
But there are use cases where you need to compare by an arbitrary order. C++ supports this, of course.
Sure, I mean, it's easy enough to implement an arbitrary Ord and friends. Reverse just gives a convenient shortcut.
Implementing Ord and friends involves duplicated logic (PartialOrd, etc.) and isn't simple enough, in my opinion.
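To illustrate that duplication: giving a hypothetical Task type a heap order based on its deadline alone takes four impls (Ord, PartialOrd, PartialEq, Eq), all of which must agree with each other:

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Hypothetical type: ordered by deadline only, ignoring the payload.
struct Task {
    deadline: u64,
    name: String,
}

impl Ord for Task {
    fn cmp(&self, other: &Self) -> Ordering {
        // Reversed so the earliest deadline becomes the heap's maximum.
        other.deadline.cmp(&self.deadline)
    }
}

// The "friends": each must stay consistent with Ord, hence the boilerplate.
impl PartialOrd for Task {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl PartialEq for Task {
    fn eq(&self, other: &Self) -> bool {
        self.deadline == other.deadline
    }
}

impl Eq for Task {}

fn main() {
    let mut pq = BinaryHeap::new();
    pq.push(Task { deadline: 30, name: "late".into() });
    pq.push(Task { deadline: 10, name: "soon".into() });
    assert_eq!(pq.pop().map(|t| t.name), Some("soon".to_string()));
}
```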
Why was this closed? Was this implemented?