Rust: std::iter::Chain should implement ExactSizeIterator when possible

Created on 23 Jun 2016  路  12Comments  路  Source: rust-lang/rust

When both iterators of the chain implement ExactSizeIterator, the chain itself should implement that trait as well.

Most helpful comment

This is the third time in 3 years that I have ended up in this issue. Each use case was definitely fine with panicking on overflow.

All 12 comments

Is it just

impl<A, B> ExactSizeIterator for Chain<A, B> where
    A: ExactSizeIterator,
    B: ExactSizeIterator<Item=A::Item>, {}

does this nead tests? if so where and how?

Unfortunately I don't think this is possible because sum of the the len() of the two iterators could overflow usize.

On that note, surely Range<u32> shouldn't implement ExactSizeIterator because Rust hopes to support 16-bit systems? The source code contains this snippet:

// Ranges of u64 and i64 are excluded because they cannot guarantee having
// a length <= usize::MAX, which is required by ExactSizeIterator.
range_exact_iter_impl!(usize u8 u16 u32 isize i8 i16 i32);

Seems vaguely related. Would it make sense for ExactSizeIterator to be generic in the return type of its len operation? The default would remain ExactSizeIterator<Len=usize>.

As @ollie27 noted, this is not possible:

use std::usize;

fn main() {
    let iter = (0..usize::max_value()).chain(0..usize::max_value());
    println!("{:?}", iter.size_hint());
}

It would require changing the contract for ExactSizeIterator, for example:

  • The iterator must report the smaller of: (1) usize::MAX and (2) the exact number of elements that remain in the iterator before it returns None from .next() (or .next_back() where applicable).

I don't think such a change affects how this trait can be used in practice today? It would extend its usability.

Implementation would be: saturating_add. Note that a longer than maximum chain will of course report its true exact length as soon as enough has been consumed to bring the length down into the usize value range.


5th november: One problem, .chain().rposition() and .chain().enumerate().rev() can have totally wrong results.

It doesn't seem like this can happen. Is there any hope here or can this be closed?

I think close.

My rationale would be that ExactSizeIterator was introduced to be able to move rposition from the slice to general iterators, producing element indices counted from the front of the sequence. Thus it's inherently designed to be tied to in-memory sequences.

This doesn't even work if both iterators reside in memory, e.g. on a 32bit platform:

use std::iter;

let x: Vec<u8> = iter::repeat(0).take((1 << 31) + 1).collect();
x.iter().chain(x.iter())

This produces a chain that is longer than 2**32, too large to be represented in a usize.

What I mean is that ExactSizeIterator was designed for slice.iter().rposition().

Well, it sounds like people are in agreement here that this can't happen, unfortunately

On the other hand there are cases where your iterators all fit into usize comfortably and you are also okay with a panic on overflow... for such cases, not having ExactSizeIterator is really painful.

This is the third time in 3 years that I have ended up in this issue. Each use case was definitely fine with panicking on overflow.

Was this page helpful?
0 / 5 - 0 ratings