Rust: Tracking issue for sorted collection ranges

Created on 13 Aug 2015  Â·  92Comments  Â·  Source: rust-lang/rust

This covers btree_range and collections_bound. Both try to address the combinatorics over inclusive/exclusive/none bounds on ordered queries. I am significantly unsatisfied with the current solution, which is btree.range(Bound(&K), Bound(&K)). This pushes all the combinatorics into a nasty calling convention. Instead, I believe we would be better served by a build pattern:

for x in btree.range().ge(&min).lt(&max) { .. }

This allows you to avoid importing and dispatching on enum. It also has the advantage of making simpler queries simpler. Compare:

// today (assuming you've totally imported the bound variants)
for x in btree.range(Unbounded, Inclusive(&max)) {

// tomorrow?
for x in btree.range().le(&max) { .. }

This also could potentially address https://github.com/rust-lang/rfcs/pull/1195

let pred = btree.range().le(&max).get();

And can be extended to support drain or remove similarly;

for x in btree.range_mut().gt(&min).drain() { .. }
ler pred = btree.range_mut().le(&max).remove();

Which if desirable can be sugarred to direct methods on btree in the future.

B-unstable E-help-wanted T-libs final-comment-period

Most helpful comment

I need this in Servo. Please stop bikeshedding and stabilize this.

All 92 comments

CC @bluss @apasel422

Will this still allow for efficient implementations of things like just the
predecessor (without iteration)?

On Wednesday, August 12, 2015, Alexis Beingessner [email protected]
wrote:

CC @bluss https://github.com/bluss @apasel422
https://github.com/apasel422

—
Reply to this email directly or view it on GitHub
https://github.com/rust-lang/rust/issues/27787#issuecomment-130498741.

@apasel422 Yes the builder is completely lazy -- get would just extract the one set bound and do lt/le/gt/whatever as appropriate.

I should note the builder API is _considerably_ more hairy internally. Basically requires converting all the static types into dynamic types and then dispatching to the existing impls (for IntoIter) or those proposed in rust-lang/rfcs#1195

Shoutouts to @zentner-kyle for proposing this API. Big improvement.

In practice, with the parent pointer design I anticipate the guts of BTreeMap having the query API just do raw pointers and be an internal impl detail: get_lt(&BTreeMap, &Q) -> Handle<*mut Node>

A range knows how to compute handles from bounds by dispatching to the query API, and then get or into_iter are just munging up handles into a (K, V) pair or Iter(Handle, Handle).

@apasel422 does this sound good to you?

CC @cgaebel @glaebhoerl @jooert

I have made an RFC to this affect: https://github.com/rust-lang/rfcs/pull/1254

Removing the B-unstable tag as I believe these haven't been implemented.

@alexcrichton range and range_mut have been around since 1.0, I think. But they won't be stabilized as-is.

Oh oops there are indeed some unstable APIs tagged with this.

What's the state of this? I'm using btree ranges in production code and would be happy to help as much as I can to stabilize this feature.

I also would like to know the status as I'm building a program which uses ranges and if there is an idiomatic rust solution I'd love to use that over my own implementation.

My understanding is that we're waiting on a new RFC in lieu of rust-lang/rfcs#1254.

How can we contribute to the discussion or at least keep track of the remaining issues?

@naktinis A pre-RFC or general discussion thread on https://internals.rust-lang.org/ would probably work.

cc https://github.com/rust-lang/rust/issues/33197, a case where the APIs currently segfault apparently

It seems to me that the extension Bound provides over Range is very small. I wonder why .range() doesn't take Range instead.

@ticki Range is [start, end), but Bound provides inclusive/exclusive/unbounded on both ends.

It would make sense to make it generic over a trait implemented for both Range and (Bound, Bound)

One important usecase not covered by the Range and Bound proposals is retrieving floor and ceiling entries and iterators. Java's Navigable[Map|Set] provides this, and C++ can do it through lower_bound plus an iterator decrement.

I would suggest using a Cursor-based API for this, like I did in intrusive-collections. In particular the upper_bound and lower_bound methods on RBTree return a cursor that points to the first element satisfying the given bound (which can be Bound::Included or Bound::Excluded). Of course this is in addition to providing the range method which works just like the one in BTreeMap and returns a normal iterator.

While this remains unfixed there's effectively no stable sorted-map in std::collections :-(.

Here's another use-case not handled by range() as it stands: let the key type be (A,B) ordered lexicographically for two ordered types A and B. Unless B has min and max elements (which not all types do) we cannot express bounds for a range which matches (a, _anything_) i.e. a specific value from A.

If instead range bounds were expressed as 'monotone predicates' you could easily handle cases like the one above (which I mention because it's a real requirement for something I want to do).

Alternatively, if you could bisect using a monotone predicate, something like first_true(&self, p: P) -> Option<&K> that would give you all you'd need to find concrete range bounds.

Unless B has min and max elements (which not all types do) we cannot express bounds for a range which matches (a, anything) i.e. a specific value from A.

Actually, I think this is kinda-supported by the current signature, though it's a bit hard to use.

The trick is that the upper and lower bounds are not required to be of type K. They can be independent types Min and Max with the constraint that the key type K can be "borrowed" into the Min/Max type, and the ordering over Min/Max is used. I think you can use (abuse?) this to make Min/Max extend K with artificial values representing any boundary points you want. In your example, Min/Max could be a tuple (A, B') where B' is an enum extending B with artificial min and max values. Then you can implement Borrow on K to inject K into Min/Max.

This would be a bit clearer if BTreeMap was more like the bst crate and was parameterized with an explicit Compare object whose operators don't require the LHS and RHS to have the same type.

@rocallahan threw down the gauntlet on his blog. Maybe we can move this forward.

🔔 This issue is now entering a cycle-long final comment period 🔔

After much deliberation the libs team has come to a conclusion about what to do with these methods. They have gone through much iteration but there are key drawbacks with the current few proposals we have:

  • The methods today, which take two enums, are wildly unergonomic.
  • The methods proposed in https://github.com/rust-lang/rfcs/pull/1254 work as a builder style, but one day we would also like language syntax to express all kinds of ranges which would obsolete these methods.
  • Waiting for language syntax for all kinds of ranges is likely to take quite awhile.

With all that in mind, we've come up with the following steps for stabilizing these APIs:

  • First, the RangeArgument trait will change to return a Bound from each of its methods. This will allow it to express all kinds of ranges that we'd like to specify.
  • Second, the current BTreeMap::range methods will be changed to take T: RangeArgument, similar to Vec::drain today.
  • Next, we'll implement RangeArgument for (Bound<T>, Bound<T>) so all possible bounds are expressible today, even without language syntax just yet.
  • After a cycle, stabilize the BTreeMap::range methods
  • At a future point, also stabilize the RangeArgument trait.

The idea here is that we can roughly have our cake and eat it too. We get nice and ergonomic access to the range methods which are expressive enough for all kinds of queries on the B-Tree types. We also get the ability to express _all_ kinds of ranges at the same time not only for B-Tree collections but also all other collections which take T: RangeArgument. Finally, if and when we do add language syntax for all kinds of ranges, we simply need to implement the RangeArgument trait for the new structs and you'll be able to immediately start using the new syntax for all the queries.

Our current plan is to implement these changes to RangeArgument very soon this cycle with an eye towards stabilizing BTreeMap::range this cycle as well. Note that RangeArgument will likely remain unstable for another cycle or two.

Now with all that out of the way, the libs team is very curious to hear others' thoughts!

I don't find the current syntax "wildly unergonomic", personally, but your plan sounds good.

I have no objection to anything proposed by alexcrichton, but I do still think it would be useful to be able to construct a range from a pair of monotone predicates.

While rocallahan's suggestion works for the case I brought up, it is extra work for the user to provide an extension of the key type, and I'm not sure whether it can be done in complete generality. (Maybe the extension type could add in the monotone predicates, and you could probably provide an incomplete implementation of Ord knowing that it would never be used to compare two predicates.)

The implementation of range does not directly use the endpoints in the bounds: it uses predicates like |x: &K| x > k. So I think that making range (or a similarly-named sibling) work for predicates should make the implementation slightly simpler while increasing the set of use-cases that are handled.

@ogoodman I assume you mean a signature like the following:

impl<K: Ord, V> BTreeMap<K, V> {
    pub fn range<From, To>(&self, from: From, to: To) -> Range<K, V> where
        From: FnMut(&K) -> bool,
        To: FnMut(&K) -> bool;
}

with usage like

let map: BTreeMap<String, i32> = ...

for (k, v) in map.range(|k| k >= "bar", |k| k < "foo") {
    ...
}

Is that right?

@rocallahan hm true, I may have exaggerated a bit :)

@ogoodman do you have an example in mind where some form of range isn't suitable? That is, today's instance of these methods should be equally as expressive as what we're thinking of implementing, but it sounds like you want functionality beyond what's implemented even today?

I commonly use sorted maps to store intervals, but it requires having a floor operation (find greatest key less than or equal to some value). Perhaps the Bound enum could be extended to have Floor and Ceil variants. A sketch of how it would work:

data = { 1, 3, 5, 7, 9 }

data.range(Floor(4), Ceil(6)) => { 3, 5, 7 }
data.range(Ceil(2), Floor(6)) => { 3, 5 } // a Ceil(n) lower bound is equivalent to Included(n) lower bound
                                          // a Floor(n) upper bound is equivalent to Included(n) upper bound

and to be clear about what I mean about storing intervals, consider a data set with the following intervals:

[0, 10)
[20, 25)
[100, 200)

and I frequently want to ask questions like get me the interval (if it exists) which contains the value 22, or give me all intervals which intersect with the interval [22, 125):

data = btree_map!((0, 10), (20, 25), (100, 200));

let contains_22 = data.range(Floor(22), Included(22)).next();
let intersection = data.range(Floor(22), Floor(125));

@danburkert I don't understand, can't you already do this with the existing bounds?

data = btree_map!((0, 10), (20, 25), (100, 200));

let contains_22 = data.range(Included(22), Included(22)).next();
let intersection = data.range(Included(22), Excluded(125));

EDIT: Nevermind, I understand what you're getting at. However what you want in this case is an interval tree, which is a different data structure.

An interval tree is used for overlapping intervals, and is consequently much more complex. What I'm asking for is common, and supported by the Java NavigableMap API and the C++ map API.

@danburkert In that case what you're probably after is a Cursor interface to a BTreeMap. Have a look at my RBTree type in intrusive-collections. The lower_bound and upper_bound methods return a Cursor objects which points to a single element in the tree. You can then move the cursor to the next/previous element and look at them.

There are also _mut versions of these methods which return a CursorMut that allows removing/replacing/inserting elements at the cursor position.

@Amanieu do you have any reasons _not_ to have Floor and Ceil bound variants? I agree that cursors would be also solve my specific usecase, but as far as I can tell they aren't being considered for the standard library.

@danburkert It's just that I don't really understand what you mean by those bounds. How exactly do they differ from Included & Excluded. Also they don't really fit in with the other enum variants. I should probably write an RFC for cursors at some point...

I think my interval example is muddying the waters. The definition of floor and ceiling bounds are:

Floor(n) - a bound inclusive of the greatest element less than or equal to n
Ceil(n) - a bound inclusive of the smallest element greater than or equal to n

There is no way to simulate this with just Inclusive or Exclusive unless the resulting Range can be expanded (similar to a cursor or C++ iterator).

@ogoodman:

(Maybe the extension type could add in the monotone predicates, and you could probably provide an incomplete implementation of Ord knowing that it would never be used to compare two predicates.)

Yes, I'm pretty sure that works and could be packaged into something quite convenient to use (outside std).

@danburkert:
I'm using this interface for intervals myself. I find the greatest element less than or equal to n using range(Unbounded, Included(n)).next_back(), and the smallest element greater than or equal to n using range(Included(n), Unbounded).next().

@apasel422 Yes, that looks about right. I'm still learning rust so I didn't feel in a position to propose a signature.

@alexcrichton My first message here gave an example which is difficult with the current range bounds, namely: given a key type of (A,B) lexicographically ordered, come up with a range which matches a fixed a and any b in B. If B has minimum and maximum elements you can use inclusive range bounds (a, b_min) to (a, b_max) but not all types have min and max elements, e.g. String has no max, and arbitrary precision numbers have neither.

Recently, I've been trying to use this API and ran across a limitation (probably the same as @ogoodman's) in the interaction between range, Borrow and Ord. Basically, given some set BTreeSet<(A, B, C)> it would be nice to be able to (generically) query for all (a, b, *). This isn't currently possible without knowing some C::MIN, C::MAX. So, I'd like to propose an alternative solution: Instead of taking a range that has two bounds, take some opaque type that implements some OrdQuery (excuse the name) trait:

/// Applied some sorted sequence of values of type `T`, `OrdQuery::cmp` should
///  yield zero or more `Ordering::Less`, then zero or more `Ordering::Equal`, then
/// zero or more `Ordering::Greater`.
trait OrdQuery<T> where T: Ord {
    fn cmp(&self, &T) -> Ordering;
}

This means we can support queries like BTreeSet::<(A, B, C)>::query((a, b)) (note, I'm using query instead of range because it better describes the operation) in addition to things like my_set.query(0..100). Users could even define this trait on custom types to allow for custom lookups:

#[derive(Ord, PartialOrd, Eq, PartialEq, Debug)]
struct Person {
    last_name: String.
    first_name: String,
    age: u32,
}

struct ByFamilyName<'a>(&'a str)
struct ByName<'a>(&'a str,&'a str);

trait OrdQuery<Person> for ByFamilyName {
    fn cmp(&self, record: &Person) -> Ordering {
        self.0.cmp(&record.last_name)
    }
}
trait OrdQuery<Person> for ByName {
    fn cmp(&self, record: &Person) -> Ordering {
        (self.1, self.0).cmp(&(&record.last_name, &record.first_name))
    }
}

//...
map.query(ByName("Steven", "Allen"));
map.query(ByFamilyName("Allen"));

Something like:

use std::cmp::Ordering;
use std::ops::{Range, RangeFull};

trait OrderedRange<T: ?Sized> where T: Ord {
    fn range_cmp(&self, other: &T) -> Ordering;
}

impl<'a, T: 'a, A> OrderedRange<A> for &'a T
    where T: OrderedRange<A>, A: Ord
{
    fn range_cmp(&self, other: &A) -> Ordering {
        (&**self).range_cmp(&other)
    }
}

// Small tuple prefixes. Would use a macro to do more...
// Also implement for (&A, ...) so we can form tuples without cloning/copying.

impl<A, B> OrderedRange<(A, B)> for (A,)
    where A: Ord, B: Ord
{
    fn range_cmp(&self, other: &(A, B)) -> Ordering {
        self.0.cmp(&other.0)
    }
}

impl<'a, A, B> OrderedRange<(A, B)> for (&'a A,)
    where A: Ord, B: Ord
{
    fn range_cmp(&self, other: &(A, B)) -> Ordering {
        self.0.cmp(&other.0)
    }
}

impl<A, B, C> OrderedRange<(A, B, C)> for (A,)
    where A: Ord, B: Ord, C: Ord
{
    fn range_cmp(&self, other: &(A, B, C)) -> Ordering {
        self.0.cmp(&other.0)
    }
}

impl<'a, A, B, C> OrderedRange<(A, B, C)> for (&'a A,)
    where A: Ord, B: Ord, C: Ord
{
    fn range_cmp(&self, other: &(A, B, C)) -> Ordering {
        self.0.cmp(&other.0)
    }
}

impl<A, B, C> OrderedRange<(A, B, C)> for (A, B)
    where A: Ord, B: Ord, C: Ord
{
    fn range_cmp(&self, other: &(A, B, C)) -> Ordering {
        (&self.0, &self.1).cmp(&(&other.0, &other.1))
    }
}

impl<'a, A, B, C> OrderedRange<(A, B, C)> for (&'a A, &'a B)
    where A: Ord, B: Ord, C: Ord
{
    fn range_cmp(&self, other: &(A, B, C)) -> Ordering {
        (self.0, self.1).cmp(&(&other.0, &other.1))
    }
}

// Cover the ranges...

impl<T> OrderedRange<T> for Range<T>
    where T: Ord
{
    fn range_cmp(&self, other: &T) -> Ordering {
        if *other <= self.start {
            Ordering::Greater
        } else if *other > self.end {
            Ordering::Less
        } else {
            Ordering::Equal
        }
    }
}

impl<'a, T: 'a> OrderedRange<T> for Range<&'a T>
    where T: Ord
{
    fn range_cmp(&self, other: &T) -> Ordering {
        if *other <= *self.start {
            Ordering::Greater
        } else if *other > *self.end {
            Ordering::Less
        } else {
            Ordering::Equal
        }
    }
}

// Would like to use `Borrow` here but can't (not even specialization helps)...
impl<'a> OrderedRange<String> for Range<&'a str> {
    fn range_cmp(&self, other: &String) -> Ordering {
        if &**other <= self.start {
            Ordering::Greater
        } else if &**other > self.end {
            Ordering::Less
        } else {
            Ordering::Equal
        }
    }
}

impl<T> OrderedRange<T> for RangeFull where T: Ord {
    fn range_cmp(&self, _: &T) -> Ordering {
        Ordering::Equal
    }
}

// And for the other ranges...

// Now, for BTreeSet:

impl<T> BTreeSet<T> where T: Ord {
    fn query<Q: OrderedRange<T>>(&self, range: Q) -> btree_set::Range {
        /* ... */
    }
}

@Stebalien why is https://github.com/rust-lang/rust/issues/27787#issuecomment-231277344 not working for you?

@rocallahan the Borrow trait explicitly forbids this. That is, a.borrow().cmp(b.borrow) == a.cmp(b) for all a and b (the same is true for Hash and Eq). Also, Borrow can only return a reference to values so you can't just "make" things. You _can_ play some games with transmute and newtypes but this is very limited and not recommended.

OK, thanks for pointing that out.

Pulling https://crates.io/crates/compare into the standard library would be another way to fix this.

@rocallahan I don't really see how that helps.

bst::TreeMap is parameterized by a type C which is a subtype of Compare<K, K> _and_ Compare<Min,K>/Compare<Max,K> for the range bounds, and it definitely allows you to use any types Min/Max as long as your C parameter supports them.

So BTreeMap could use the same approach ... if Compare is pulled into the standard library.

I see. However, that's not nearly as flexible.

  1. It doesn't allow one to write custom queries over external maps/sets (the query logic is built-in to the map/set, not the query).
  2. The solution I posted above is significantly more ergonomic: set.query(1..100), set.query(("LastName",)), etc...

Here's a version that attempts to abstract over Borrow (unfortunately, it's a bit messy...). Usage examples at the bottom.

Unfortunately we were unable to get around to implementing these APIs this cycle, so we're going to punt this to the next.

I'm highly against specialized queries as recommended by @Stebalien for range queries. The range API should exclusively depend on Ord for selecting from collections. Queries are fine, but there's an algorithmic cost, since you are adding an implicit filter/sort step to the process, and that should be made explicit.

All that I expect from range is that it can walk the tree from a starting point to an ending point in the same order the tree is constructed in, which is the simplest form of this operation, and requires no knowledge of the structure of the data other than that it implements Ord. Allowing other orderings requires re-sorting the map, an expensive operation.

I'm not saying it's not a good idea to allow other queries through other means, because it will be needed in some cases, but let's keep the default simple, especially since this will simplify implementation, signature, and speed.

I also think it's a little unnecessary to have inclusive/exclusive bounds, since operators like .. already set a precedent for it. Just let BTreeMap<T: Ord>::range(T,T) -> Range<T> suffice.

Actually, following both the .. and ... convention would be a decent way to avoid having to wrap an additional type. Just have a range method and a range_inclusive method, similar to here: Range Expressions.

@OtterCode It's actually a bit more complicated since both the start and end can be either inclusive or exclusive, whereas .. and ... always make the start inclusive.

@Amanieu But why? Inclusiveness in the dot ranges is also arbitrary, but necessary to simplify the interface. We can make the same sorts of decisions here. It's trivial to drop the first element of a range, a single skip is all that's required. It's also not difficult to construct most reasonable objects with a defined Ord to be one less or one more than any useful category.

We have to sacrifice some use cases to prevent the interface from creeping out of control. This should already be implemented and stable. I believe that one of the reasons it isn't, is because the definition keeps expanding, and the goal posts keep moving. Using the same definitions as .. and ... also has the massive benefit of symmetry, keeping the basics of the language consistent, which is a huge relief on the cognitive load.

Admittedly, this would make some operations harder than they need to be, but it would leave us with a great minimal default.

@OtterCode

Queries are fine, but there's an algorithmic cost, since you are adding an implicit filter/sort step to the process, and that should be made explicit.

I'm not sure what you're trying to get at here. These collections are already sorted and range queries don't let you filter anything. The algorithm would be pretty much identical to the current one.

@Stebalien

Well, take the following:

#[derive(Ord)]
struct Name {
    last_name: String,
    first_name: String
}

Notice that the derived Ord will sort first by last name, and second by first name.

Now say that you want to get a query where you don't care what the last name is, you just want to get every first name from Aaron to Bob. To make your range, you have to iterate over every possible last name, extracting the first names that fit inside [Aaron, Bob). You have to skip nodes in the middle, and you can't be sure you're done until you've walked the whole tree, that's a filter operation. Your example doesn't cover that, but what you're offering will lead newcomers to expect that they can ignore parts of an Ord definition at will, without cost. If they are only allowed to ignore parts of Ord in the order of their weight, that will introduce frustration and confusion. Moreover, this ignores the possibility of a more complex implementation of Ord with dependent fields or strange conditions. All we know about the contents of a BTreeMap is that it satisfies Ord, and it really should be left at that.

If an author wants to filter, then reduce, that's what they should do. Those are vastly more intensive operations than simply walking the tree until you bump against a failing leaf, but that's the cost of greater power and flexibility.

The range in "range query" is important. These aren't general purpose filters, they're queries that can select sorted ranges from sorted collections; in other words, they're queries that describe a sorted range of values. It's already possible to do this in rust using some fancy Borrow implementations but it violates Borrow's specification.

That's exactly what I'm arguing for, staying close to the simplest definition of range. I believe that the extended semantics obfuscate, rather than clarify that fact, and they additionally get in the way of getting this feature implemented.

@OtterCode I think this is a problem of naming. Would OrdRange (or OrderedRange) be less confusing?

The problem with just going ahead with a poor/inflexible design is that we can't just fix it later. One way to solve this is to do the same thing we did with RangeArgument: keep the trait unstable and only implement it on simple ranges for now. This would allow the user to query, e.g., a..b and Bound::Inclusive(a)..Bound::Inclusive(b) but not define their own queries.

@Stebalien That's a fair point. It also brings up the question, should we even implement this at all? It can be solved by iter().skip_while().take_while(). Is that enough?

As you say, and as the comment history clearly indicates, there is no single, concise definition of how to express a range in this context. Covering all possible ranges, such as (), (], [), and [], and layering your partial queries on top of that might be counter productive, and I believe it is. It certainly takes the interface farther and farther from anything I might consider using. I don't want to import an enum simply to define bounds on a range.

I still vote strongly for picking one, or maybe two definitions, and leaving it at that, unless we want a whole suite of range tools, in which case, perhaps, it's better to have a separate implementation of Range, and just call my_b_tree.into_range(), and then chain onto that in the same way the iter trait does, as was originally suggested, I believe.

That's a fair point. It also brings up the question, should we even implement this at all? It can be solved by iter().skip_while().take_while(). Is that enough?

No for the reason you mentioned in your first post: that's way too slow especially when the range is small.

As you say, and as the comment history clearly indicates, there is no single, concise definition of how to express a range in this context. Covering all possible ranges, such as (), (], [), and [], and layering your partial queries on top of that might be counter productive, and I believe it is. It certainly takes the interface farther and farther from anything I might consider using. I don't want to import an enum simply to define bounds on a range.

The point of my suggested design is that most users will be able to write my_map.range("first".."until") and be done with it. However, those who need more expressive queries can write them.

I feel like we're talking in circles. Is there another voice willing to weigh in?

A b-tree is a collection sorted by keys. It, like other tree-based data structures, allows item lookup in O(log n) given a key (where n is the number of elements in the collection).

Calling .iter() returns an iterator over all key-value pairs in the collection. Using skip_while and take_while always gets you an O(n) operation since the iterator walks over _all_ pairs, not just the ones in the range (so you'll always walk over the elements before the range you're interested in, just because the iterator happens to start there).

Using "proper" range queries, you can basically do 2 lookups in O(log n) (for the start and end of the range) and yield all elements within (which in itself isn't depending on the size of the collection but the number of elements in the range).

Caveat: I might be getting all of this wrong.

Nothing anyone has said convinces me that https://github.com/rust-lang/rust/issues/27787#issuecomment-232217282 is a bad plan.

I've been writing code using the current range API (as implemented in stable_bst). It feels good. It looks like intervals in mathematics. The Included/Excluded/Unbounded trichotomy is easy to write and easy to read --- much clearer than the equivalent code using the C++ map API, and clearer than having to use explicit successor/predecessor operators on keys to create exclusive bounds (even when that can be done). It's also clearer than using min/max keys to get unbounded endpoints. Simplifying the API semantics further, e.g. by removing control over inclusion/exclusion/unbounded as suggested by @OtterCode, would make my code a lot less readable and more error-prone.

I have been convinced that it would be good to offer a _separate_ API that creates a range whose endpoints are given by query predicates expressed as closures. I don't see the point of trying to merge that into the Bound-based API; that would make the overwhelmingly more common cases harder.

Queries are fine, but there's an algorithmic cost, since you are adding an implicit filter/sort step to the process, and that should be made explicit.

I don't understand. A query predicate, however it's expressed, has the same performance characteristics as a Bound: you perform O(log n) comparison operations to find a position in the collection.

Now say that you want to get a query where you don't care what the last name is, you just want to get every first name from Aaron to Bob.

You just can't do this efficiently in a balanced tree ordered by last-name first. If you need to do this efficiently, you need to create a different tree ordered by first-name first. This is a case where it would be convenient to parameterize the BTreeMap with a Compare trait like stable_bst does. Otherwise you have to convert the values to a new struct with the right implementation of Ord.

Is it out of scope to also add a drain method along with range? drain was included in the original builder-based API, but I haven't seen it mentioned since.

@rocallahan Fair enough, not my preference, but it wouldn't be terribly hard to just wrap a function around the bounds selectors for a particular use case. What's holding this back? The comments just suggest that it wasn't a priority. Is there anything I can do to help, inexperience notwithstanding?

FWIW (having suggested a "pair of predicates" range method) I can't see anything wrong with @Stebalien 's suggestion: a single monotone 3-valued function (provided by a trait implementation) seems like a perfectly reasonable way to solve the problem. (The problem being that sentinel elements might be hard to find or non-existent, even though the range test is straightforward.)

Question: should my_btree_map[start..end] work as a default with the normal Range convention?

I need this in Servo. Please stop bikeshedding and stabilize this.

It needs to be implemented before it can be stabilized. The plan is https://github.com/rust-lang/rust/issues/27787#issuecomment-232217282

I'll put up a PR implementing the plan in the next day or two.

@aturon Is this being worked on? I would be happy to help in what way I can but I don't want to duplicate work if it's already in progress

Now that the changes to the surface API have landed, I think we're ready to consider stabilization.

@rfcbot rfc merge

Grr:

@rfcbot fcp merge

Team member @aturon has proposed to merge this. The next step is review by the rest of the tagged teams:

  • [x] @BurntSushi
  • [x] @Kimundi
  • [x] @alexcrichton
  • [x] @aturon
  • [x] @brson
  • [x] @sfackler

No concerns currently listed.

Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@aturon, to clarify, what APIs are you thinking of stabilizing as part of this? Just BTreeMap::range{,_mut} or also the RangeArgument trait and associated types?

@alexcrichton I think we should start with just the method for now, and let the trait bake a while longer.

In that case you have my checkmark!

There is a small ergonomic issue I came across while writing tests for this relating to taking a range by borrowed key; say you have a map with strings as keys, then it would be nice to just use range syntax to specify the range. Unfortunately this doesn't work, because you would need a Range<str> which is unsized. Changing the bounds required type annotations in the common case. Instead what I did is implement RangeArgument<T> for (Bound<&T>, Bound<&T>), allowing you to do this, but it still requires type annotations.

#![feature(btree_range)]
#![feature(collections_bound)]

use std::collections::btree_map::BTreeMap;
use std::collections::Bound::*;

fn main() {
    let mut map = BTreeMap::new();
    map.insert("key".to_string(), "value".to_string());
    map.insert("lee".to_string(), "walue".to_string());
    map.insert("me".to_string(), "xalue".to_string());
    // let range = map.range("key".."me"); // doesn't work
    // let range = map.range((Included("key"), Excluded("me"))); // type annotations required
    let range = map.range::<str, _>((Included("key"), Excluded("me"))); // finally
    for (key, value) in range {
        println!("({}, {})", key, value);
    }
}

I suppose this is covered by not stabilizing RangeArgument, though?

@djzin We have to be a bit careful: once the range method is stabilized, we're committing to it working for at least all of the arguments allowed by today's definition of RangeArgument. That is, we can change the definition and impls for RangeArgument but only if all uses of range will continue to work.

I propose that #33197 should be fixed before stabilization (unfortunately), but it's being fixed.

:bell: This is now entering its final comment period, as per the review above. :bell:

:bell: This is now entering its final comment period, as per the review above. :bell:

:bell: This is now entering its final comment period, as per the review above. :bell:

The first issue I listed above (not being able to use range syntax) may be fixed at a later time by adding impls such as

impl<'a, T: ?Sized> RangeArgument<T> for Range<&'a T> {
    fn start(&self) -> Bound<&T> { Included(self.start) }
    fn end(&self) -> Bound<&T> { Excluded(self.end) }
}

Then one may use map.range::<str, _>("a".."z").
A similar impl could be provided for &'a mut T if that were a thing.

For some reason, even though there is only one type that can fit here (as far as I can tell), rust still requires type annotations for the str. This seems to me like improvements in rust's type inference down the line may fix this ergonomic issue.

A related problem occurs with RangeFull aka .. - rust cannot derive the type for a String key. In this case there are multiple options, all with the same result. I feel as though this is a non-issue as selecting the whole range is a no-op.

None of these issues affect BTreeMaps with an i64 key (or any key that has no nontrivial implementation of Borrow), and the main ones are solvable in future backwards-compatibly. So I think, as this feature has been in unstable limbo for so long, it's due its chance to shine in stable rust-land :)

The final comment period is now complete.

!?!? REALLY?

@Gankro I was shocked too: https://github.com/solson/miri/pull/155 :)

this feature has been stable since ____. Attribute no longer needed is the best warning in rustc.

wow
On Wed, Mar 29, 2017 at 7:35 PM Scott Olson notifications@github.com
wrote:

@Gankro https://github.com/Gankro I was shocked too: solson/miri@c6a18ce

diff-b4aea3e418ccdb71239b96952d9cddb6L2

https://github.com/solson/miri/pull/156/commits/c6a18cead8885b229f5c7bc5d11cf6ed42d4dc6c#diff-b4aea3e418ccdb71239b96952d9cddb6L2
:)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/rust-lang/rust/issues/27787#issuecomment-290258210,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAH3yVetV_d9Radh_Efyx_wJP4PucsrDks5rqurGgaJpZM4FqpeG
.

Was this page helpful?
0 / 5 - 0 ratings