Rust: Adding method to `consume` an iterator efficiently

Created on 3 Sep 2019  路  5Comments  路  Source: rust-lang/rust

Currently, mapping methods for iterator, e.g. map and filter are implemented in a lazy manner. Which means they would not be calculated until further consumption.
However, sometimes it can be the case that given an Iterator (whatever it actually is) . we just wanted its side-effect and to simply consume it. for instance :

#[derive(Debug)]
struct Item(i32);

fn take_iter<T : Iterator>(iter: T) {
    // we just want to consume the `iter` here
    // do something else
}

fn main() {
    let ve = vec![Item(1), Item(2), Item(3), Item(1)];

    take_iter(ve.iter().filter(|value| value.0 < 3).map(|value| {
        println!("{:?}", value);
        value
    }));
}

Currently, there're several ways of achieving this goal, for example :

fn take_iter<T : Iterator>(iter: T) {
    iter.for_each(|_| {}); // consumed
     // do something else
}
fn take_iter<T : Iterator>(iter: T) {
    iter.count(); // consumed
    // do something else
}

While they're both usable, it's obvious that these methods are at the cost of efficiency, for_each yields a call to closure which is totally unnecessary, while count finally counting the number of elements in vain.

C-feature-request T-libs

Most helpful comment

for_each yields a call to closure which is totally unnecessary.

That call to a no-op closure will most certainly be optimized out in release builds.

In the case of the code that you've provided using map() for side effects is not really idiomatic. The lazy inspect() combinator or the strict for_each() consumer would be more suited for invoking a println!() on each element. inspect() for when you're trying to observe/debug the evaluation of the lazy iterator and for_each() for when you want to consume the iterator and produce predictable output. I'm not sure which it is in your case. I guess the latter?

In either case, I think the proposed scenario does not deserve a separate method. If you find yourself in need of force-consuming an iterator that produces side effects, that is a strong hint that you should move the effectful operations (like println!()) from closures within the lazy composition of your iterator to the part of the program that consumes the iterator, be that a for_each, a for loop or whatever else.

All 5 comments

I suggest adding a method consume for the Iterator trait, which takes no params and effectively consumes each element in an iterator to preserve their side-effects only. Then the example above could be written as :

fn take_iter<T : Iterator>(iter: T) {
    iter.consume(); // consumed
    // do something else
}

for_each yields a call to closure which is totally unnecessary.

That call to a no-op closure will most certainly be optimized out in release builds.

In the case of the code that you've provided using map() for side effects is not really idiomatic. The lazy inspect() combinator or the strict for_each() consumer would be more suited for invoking a println!() on each element. inspect() for when you're trying to observe/debug the evaluation of the lazy iterator and for_each() for when you want to consume the iterator and produce predictable output. I'm not sure which it is in your case. I guess the latter?

In either case, I think the proposed scenario does not deserve a separate method. If you find yourself in need of force-consuming an iterator that produces side effects, that is a strong hint that you should move the effectful operations (like println!()) from closures within the lazy composition of your iterator to the part of the program that consumes the iterator, be that a for_each, a for loop or whatever else.

Yea, most of the time a for_each() is quiet enough, and it would be a rare case that one would like to preserve the side effect of previous mappings only.
But in case that someone wants to perform an I/O action, or just to swiftly printing some debug output by inspect etc. . It would do no harm that we add this method for good (while indeed, not that necessary).

I recommend the idiom .for_each(drop) because it's clearer; what you're suggesting is to basically make an alias for this. And, like was said, this is usually optimised out.

As @clarfon noted, this has been discussed before (https://github.com/rust-lang/rust/pull/48945), leading to the current suggestion to use .for_each(drop) for this, which is recommended in warnings as well:

https://github.com/rust-lang/rust/blob/b9de4ef89e0e53099a084001b26ec3207c5f8391/src/libcore/iter/traits/iterator.rs#L1476-L1477

As such I'm going to close this.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

nikomatsakis picture nikomatsakis  路  340Comments

Leo1003 picture Leo1003  路  898Comments

cramertj picture cramertj  路  512Comments

nikomatsakis picture nikomatsakis  路  412Comments

nikomatsakis picture nikomatsakis  路  210Comments