Rust: Inclusive version of take_while

Created on 28 Jun 2019  路  12Comments  路  Source: rust-lang/rust

Iterator::take_while stops _before_ the first element where the condition evaluates to false. That is the correct behavior given its name, but it makes it difficult to write an iterator that should include the boundary element. In particular consider something like iterating over a histogram, trimming off the tails:

for v in histogram
    .iter_linear(1_000)
    .skip_while(|v| v.quantile() < 0.01)
    .take_while(|v| v.quantile() < 0.99)

This may seem right, but consider an iterator where the elements are such that v.quantile() yields this sequence: [0, 0.02, 0.99]. Here, the bin that spans <0.02-0.99] will be dropped, even though it includes the majority of the samples. What we really want to do is take from the iterator until we have _received_ an element whose v.quantile() > 0.99. To write that with take_while, we need something like:

let mut got_true = true;
for v in h
    .iter_linear(1_000)
    .skip_while(|v| v.quantile() < 0.01)
    .take_while(move |v| {
        if got_true {
            // we've already yielded when condition was true
            return false;
        }
        if v.quantile() > 0.99 {
            // this must be the first time condition returns true
            // we should yield i, and then no more
            got_true = true;
        }
        // we should keep yielding
        true
    })


Which isn't exactly obvious.

So, I propose we add Iterator::take_until_inclusive, which yields elements up to and including the first element for which its argument returns true. The name isn't great, but I'm also struggling to come up with a better one. In some sense, I want the function to communicate that it yields every item it evaluates. Some other possible names if we're willing to invert the boolean: break_once, break_after, end_at, end_once, end_after.

Thoughts?

A-iterators C-feature-request T-libs

Most helpful comment

I created a crate with this functionality: https://github.com/hdevalke/take-until

All 12 comments

How about a take_until?

for v in h
    .iter_linear(1_000)
    .skip_while(|v| v.quantile() < 0.01)
    .take_until(|v| v.quantile() >= 0.99)

@Lonami take_until to me implies that it does _not_ take the first element where the condition is true, which is why I went with take_until_inclusive instead. In some sense, at least to my ears, take_until(f) is the same as take_while(!f).

Oh, I completely missed you mentioned take_until_inclusive already, sorry!

No worries at all, happens to us all! I'd be curious to hear your thoughts on whether the _inclusive bit is necessary though :)

take_to maybe?

How about a generic take_while_then? The signature would be:

fn take_while_then<F, T, I>(self, while: F, then: T) -> TakeWhileThen<Self, F, T, I>
while
    for<'a> F: FnOnce(&'a Self::Item) -> bool,
    T: FnOnce(Self) -> I,
    I: IntoIterator;

The resulting adapter would:

  1. Return the elements of self while while returns true.
  2. Then, return the elements of then(self).

The existing take_while would be equivalent to self.take_while(while, |_| empty()), and take_while_inclusive would be equivalent to self.take_while(while, |x| x.take(1)), although presumably more efficient.

Although I'm not sure how useful this kind of method would be, and it may be better to just have some form of take_while_inclusive by itself, as this is probably a common pattern.

I also needed this recently to decode some protocol where a stop bit is used to indicate the last byte of a field. The other 7 bits are used to encode the data. I could not use the iterator API because of this.

The implementation should be less or more the same as TakeWhile.

next could be implemented as:

    #[inline]
    fn next(&mut self) -> Option<I::Item> {
        if self.flag {
            None
        } else {
            let x = self.iter.next()?;
            if (self.predicate)(&x) {
                self.flag = true;
            }
            Some(x)
        }
    }

This would, however, take all elements until some condition is true, which is not the same as take_while_inclusive.

The BufRead::read_until exists and reads up to inclusive the delimiter. So take_until seems to be a good name.

I created a crate with this functionality: https://github.com/hdevalke/take-until

Could be a good fit for the itertools crate (@hdevalke you could PR your code upstream)

@michaelsproul itertools already contains take_while_ref.

I don't think take_while_ref or peeking_take_while are well suited to this because they require you to assign the iterator to a local variable, and muck around with mutation, which breaks the flow of the iterator chain. The best I could do with them was this, which isn't as nice as take_until.

If you don't mind, I'd be happy to try and get your code into itertools, giving you credit of course! (Co-authored-by on the commit)

@michaelsproul no problem if you want to do it, most of the code I wrote was inspired by the implementation of TakeWhile.

Was this page helpful?
0 / 5 - 0 ratings