Heap, a.k.a. priority_queue, is a very common data structure in most languages.
|LANG|LIB|
|---|---|
|C++|std::priority_queue, Boost.Heap|
|Python|Lib heapq.py|
|Rust|binary_heap|
|Go|/src/container/heap|
Dijkstra's algorithm for finding the shortest path in weighted graph
Previous work has been done and documented. I'm not going to copy them here because the webpage has a better format.
Here are some difference:
record heap{
//...
record comparator=defaultComparator
proc init(type eltType, record comparator = defaultComparator, param parSafe = false) {}
// Returns an unordered array of elements
proc const toArray(): [] eltType {}
// Returns an ordered array of elements by poping. The heap will be empty after invoking
proc consume(): [] eltType {}
// Iterating the heap
iter these(): eltType {}
// Let pushHeap be overloads of push, not separated procedures.
// Question: Does the `range` overload of `push` make sense to you?
//...
}
// Rename makeHeap to createHeap
pushHeap, popHeap et al: Can't these be just overloads of push and pop?create prefix. So makeHeap should be createHeap instead. range overloads for adding stuff to the heap is interesting. I am not sure if that makes great sense to have it in the interface, though. It feels weird to me, but can't really explain why. (So I am not against it strongly)theseList methods should be added to heap as well. Mostly for the sake of completeness, consistency and ability to do generic programming using different data structures. Mainly count and remove jumped off as "would be interesting to have in the heap" kind of methods.pushHeap, popHeap et al: Can't these be just overloads of push and pop?
@e-kayrakli Yes. I think it should be.
range overloads for adding stuff to the heap is interesting. I am not sure if that makes great sense to have it in the interface, though. It feels weird to me, but can't really explain why. (So I am not against it strongly)
I'm not sure, either. I just thought it would be useful as a shortcut for looping integers and pushing them, which is commonly seen.
I think we need these
This is kind of tricky. Heap as it is can only offer users unordered iteration. It doesn't quite make sense for a user to iterate all the elements in random order. The use case would be very limited.
If a user does want to get all the elements out of the heap, I think in this case toArray would be very useful. I will add that. Also, I don't see heap in other languages has such way to iterate.
As a more general feedback, I am also wondering whether some of the List methods should be added to heap as well. Mostly for the sake of completeness, consistency and ability to do generic programming using different data structures. Mainly count and remove jumped off as "would be interesting to have in the heap" kind of methods.
Yeah, I agree with you. As I mentioned above, toArray would be great to have. However, remove and count are not possible for a heap without great time complexity.
I'm not sure, either. I just thought it would be useful as a shortcut for looping integers and pushing them, which is commonly seen.
I agree that that can be useful. But it may be one of the particular questions that you want to ask when asking for a review. For particular questions, you can always edit your original post above and list some questions that you want your API reviewers to answer.
This is kind of tricky. Heap as it is can only offer users unordered iteration. It doesn't quite make sense for a user to iterate all the elements in random order. The use case would be very limited.
If a user does want to get all the elements out of the heap, I think in this case toArray would be very useful. I will add that. Also, I don't see heap in other languages has such way to iterate.
OK you're right. I was thinking it'd be cool to have a helper that would just iterate the heap in order, but that would mean popping elements out of it. There can still be a helper that does that, though right? heap.consume() or something maybe. I am not thinking very strongly about it, but just an idea.
heap.consume()
It makes sense to have toArray returning unordered elements and consume returning ordered elements. I will update the issue.
I had a similar hesitance as @e-kayrakli to the push variation that took a range. I'd be inclined to not add this until someone requests it, or perhaps to have an overload that took a general loop expression (_iteratorRecord) which would support passing in a loop over a range or something else (but even for that, I might wait until someone requested it).
In terms of these-style iterators, does it matter if the yielded elements are unordered? (where I assume you mean unordered by value?) For example, iterating over a hash table or map also returns unordered keys/values in many cases, yet is still useful (for example, if wanting to compute a histogram of the values). Does toArray() necessarily create a duplicate copy of the heap data? (which could be expensive if the heap were very large).
pushHeap, popHeap et al: Can't these be just overloads of push and pop?
@e-kayrakli It just comes to me, what happens if users do want to push a list into the heap, not the elements of the list?
In terms of these-style iterators, does it matter if the yielded elements are unordered?
@bradcray Not really matters. It's fine as long as the users are aware of that.
Does toArray() necessarily create a duplicate copy of the heap data?
I think so. So we can have both.
It just comes to me, what happens if users do want to push a list into the heap, not the elements of the list?
I would imagine we could distinguish between these cases by looking at the heap type. I.e., if a t is pushed into a heap(t), it's a single element. If a list(t) or "array of t" is pushed into heap(t) then it's multiple elements. So if we're pushing in list(t) and the heap is heap(t), we push multiple elements; if it's heap(list(t)) we push it in as one.
Exactly what Brad said. To give an example, the best way to do that is just adding some overloads with specific types and maybe one that is fully generic (or takes an _iteratorRecord):
proc push(x: eltType) // push a single element
proc push(x: list(eltType)) // push a list
proc push(x: [] eltType) // push an array
proc push(x) // try to iterate x and push all elements one by one?
Arguably, unless we do something specific for list and [] overloads, we can omit them and let the generic one handle it.
Also see proc ref extend versions in List implementation.
I know other languages call a priority queue a heap, but do we have to continue this vague naming and instead call it something like priorityQueue or something like this.
I'm considering deleting the init= from List/Array since there is no way to pass a comparator instance to init=. Also I don't think it's reasonable to implicitly convert a List/Array into a Heap. Users should explicitly call createHeap to do that.
Most helpful comment
Exactly what Brad said. To give an example, the best way to do that is just adding some overloads with specific types and maybe one that is fully generic (or takes an
_iteratorRecord):Arguably, unless we do something specific for
listand[]overloads, we can omit them and let the generic one handle it.Also see
proc ref extendversions inListimplementation.