Rfcs: Idea: Looking up holes in BTreeMaps

Created on 26 Feb 2017  路  1Comment  路  Source: rust-lang/rfcs

Binary trees are useful for more than just key and range lookups. They are also commonly used to do interval lookups ("is x within a set of non-overlapping intervals?"), or neighborhood lookups ("is a value near x present?"). But this requires the ability to look up values that are "near" some key x, not just the value for key x. Currently, there is no way with a BTreeMap to query for the elements on either side of such a "hole". If you're willing to pay some additional complexity and do two lookups, you can use range:

fn neighbors_of<K, V>(map: &BTreeMap<K, V>, key: &K) -> (Option<(&K, &V)>, Option<(&K, &V)>) {
    let before = map.range(Bound::Unbounded, Bound::Included(x));
    let after = map.range(Bound::Included(x), Bound::Unbounded);
    (before.next_back(), after.next())
}

Arguably, the function above should actually return the full Range for both before and after, but then the caller would need to remember to use next_back() on before. But none of this feels particularly ergonomic.

Instead, I think BTreeMap should provide an implementation of neighbors_of that is analogous to the definition above. By being implemented internally, it could be more efficient than the example above, including not having to construct an iterator. This would enable the use-cases describe above without the indirection through two range queries.

A further enhancement could be a BTreeMap::search(), which returns an unbounded DoubleEndedIterator starting at the largest element <= x (which you can't do with range() today). But without a concrete use-case for this, that should probably be considered a secondary concern?

T-libs

Most helpful comment

neighbors_of only returning single values instead of iterators might be a bad idea. Each invocation would be O(log n). So if someone tried to walk the tree that way for m steps they would end up with O(m log n) runtime. search returning two iterators, one for each semi-range, on the other hand would only take O(m + log n) steps and exploit the btree's cache-friendly nature. I think a better name would be split_at returning two subviews of the map

>All comments

neighbors_of only returning single values instead of iterators might be a bad idea. Each invocation would be O(log n). So if someone tried to walk the tree that way for m steps they would end up with O(m log n) runtime. search returning two iterators, one for each semi-range, on the other hand would only take O(m + log n) steps and exploit the btree's cache-friendly nature. I think a better name would be split_at returning two subviews of the map

Was this page helpful?
0 / 5 - 0 ratings

Related issues

torkleyy picture torkleyy  路  3Comments

mqudsi picture mqudsi  路  3Comments

steveklabnik picture steveklabnik  路  4Comments

3442853561 picture 3442853561  路  3Comments

p-avital picture p-avital  路  3Comments