Rfcs: `iter.map()` should not guarantee that the closure is executed for all the iterated elements.

Created on 25 Sep 2016 · 16Comments · Source: rust-lang/rfcs

This would allow things like optimizing iter::Map's implementation of skip.

T-libs

Source

ticki

👍2

Most helpful comment

Sorry, "pure", not pure. My point is that most non-trivial functions in rust aren't pure (there's going to be an assert/unwrap/allocation etc. somewhere). However, many of these side effects aren't really important and it would be nice to be able to say "this function has no important/noticeable side effects, you can avoid calling it if you don't use the result". Basically, make this kind of optimization (this RFC issue) opt-in.

Stebalien on 26 Sep 2016

👍3

All 16 comments

It doesn't guarantee it right now, but the implementation is limited, as if it was necessarily true.

ticki on 25 Sep 2016

To be more specific, executed or not for all elements _when_? Since Iterator::map does not call the closure a single time unless something more happens to the iterator.

I associate to special-casing the method count for map which I still believe is a significant breaking change if it would be done so that it does not call the mapping function. Previously discussed in https://github.com/rust-lang/rust/pull/28125

bluss on 26 Sep 2016

Alternatively, we could introduce a notion of purity (this has been discussed in the past but I don't remember where). That is, add a marker trait trait PureFn: Fn {}, automatically infer it for applicable closures/functions, and then add a #[pure] attribute for telling the compiler that a function has no important side effects (even if it _looks_ like it might, e.g., panic, increment some global counter, etc...). This would allow optimizations based on purity. IMO, these optimizations should be done at the type level through specialization, not by the compiler itself.

Unfortunately, 99% of methods in debug mode wouldn't be pure because integer overflow panics in debug mode. This would make debug/release behavior differ significantly (although, arguably, it already does...).

Stebalien on 26 Sep 2016

If the map function is pure then the compiler can already optimize it away.

Amanieu on 26 Sep 2016

👎1

@Stebalien But if it's pure, then LLVM not being able to optimize it is more or less a bug.

eddyb on 26 Sep 2016

Stebalien on 26 Sep 2016

👍3

To defend the cause for some kind of special casing of the counting, the iterator might not be so simple so that a plain .count() without side effects can be reduced to a number (BTreeMap iterators and so on, maybe not even the VecDeque one).

bluss on 26 Sep 2016

@bluss That's different, in that the data structure _may_ have some simpler way to get the value.
Or do you mean that kind of special-casing combined with @Stebalien's "pure"?

eddyb on 26 Sep 2016

sure, the latter.

bluss on 26 Sep 2016

What about putting the guarantee not on map, but on count? That is, document count as "consumes all items from the iterator and returns the count". Because, for better or for worse, that's what its purpose has become. We could add another method that's documented to be specializable.

durka on 28 Sep 2016

@durka Once specialization lands, external crate could actually provide that using ExactSizeIterator:

#![feature(specialization)]

pub trait Count {
    fn count(self) -> usize;
}

impl<I> Count for I where I: Iterator {
    default fn count(self) -> usize {
        Iterator::count(self)
    }
}

impl<I> Count for I where I: ExactSizeIterator {
    fn count(mut self) -> usize {
        ExactSizeIterator::len(&mut self)
    }
}

pub fn len<I: IntoIterator>(iter: I) -> usize {
    Count::count(iter.into_iter())
}

Stebalien on 28 Sep 2016

@durka That's more or less what count already says Iterator::count in rustdoc

bluss on 28 Sep 2016

@Stebalien not sure what that adds over len anyway.

durka on 28 Sep 2016

@durka len only exists on iterators that implement ExactSizeIterator. This way, you can accept any Iterator (or IntoIterator) and get its length.

Stebalien on 28 Sep 2016

There's an intermediate way there too, let (lo, hi) = iter.size_hint(); if Some(lo) == hi { lo } else { iter.count() }

bluss on 28 Sep 2016

Famous example: chain will often have a dynamically exact size hint.

bluss on 28 Sep 2016

Was this page helpful?

0 / 5 - 0 ratings