Crystal: Iterator idempotence

Created on 5 Aug 2019 · 3 Comments · Source: crystal-lang/crystal

The Iterator module inherits a number of methods from Enumerable that exhaust the original iterator when invoked. This can be confusing when first encountered.

For example:

iter = (0...100).each

puts iter.size # => 100 (counting consumes every element)
puts iter.to_a # => [] (the iterator is already exhausted)
puts iter.size # => 0

Similar behaviour also results from calls to #all?, #any?, #count, #none? and #one?, because each of them calls #each internally and so consumes some or all of the original sequence.
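As a rough illustration (a sketch based on how Enumerable builds these methods on top of #each, not output copied from the issue), a predicate such as #any? stops at the first deciding element and leaves the iterator partially consumed, while #count drains it completely:

```crystal
iter = (1..5).each

# #any? stops pulling elements as soon as the block returns true,
# so 1 and 2 are consumed and the rest stay in the iterator.
puts iter.any? { |x| x == 2 } # => true
puts iter.to_a                # => [3, 4, 5]

iter = (1..5).each

# #count has to visit every element, so nothing is left afterwards.
puts iter.count { |x| x.even? } # => 2
puts iter.to_a                  # => []
```

Either way, the second call sees a different sequence from the first, which is the surprise described above.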

There are two possible solutions:

  1. Make note of this behaviour in the docs to help reduce WTFs/minute.
  2. Override these methods within Iterator so that calls become idempotent. This would likely require internally caching generated elements so that subsequent calls to #each yield the originally generated objects. There's an obvious resource cost here, as well as a change to existing API behaviour, so this comes with its own challenges.
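To make the second option concrete, here is a minimal sketch of what such caching could look like as an opt-in wrapper. CachedIterator is a hypothetical class invented for this illustration. It is not part of the standard library and not a proposed implementation. It assumes a single-threaded caller and keeps every element in memory.

```crystal
# Hypothetical wrapper (not in the standard library): records every element
# pulled from the source so that later passes can replay them.
class CachedIterator(T)
  include Enumerable(T)

  def initialize(@source : Iterator(T))
    @cache = [] of T
    @exhausted = false
  end

  def each(& : T ->) : Nil
    # Replay whatever earlier passes already produced.
    @cache.each { |value| yield value }

    # Then keep pulling from the source, recording each element as we go.
    until @exhausted
      value = @source.next
      if value.is_a?(Iterator::Stop)
        @exhausted = true
      else
        @cache << value
        yield value
      end
    end
  end
end

cached = CachedIterator(Int32).new((0...100).each)

puts cached.size      # => 100
puts cached.to_a.size # => 100
puts cached.size      # => 100
```

The cache grows with the source, so wrapping an unbounded iterator and calling #size would never return and would use memory without limit. That is the resource impact mentioned in the list above.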

Happy to put together a PR for either of these, but would be interested in some feedback on the preferred approach.

All 3 comments

Implicitly caching an iterator is not an option as it entirely defeats the purpose of an iterator (which can have a potentially unlimited size). When such behaviour is needed, it should be implemented by deliberate choice, for example by solidifying into an array.

So we can only improve the documentation on what happens when an iterator is re-iterated.
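For reference, the deliberate approach described in this comment might look like the following sketch, where the caller decides to pay for an array once and then queries the array instead of the iterator:

```crystal
iter = (0...100).each

# Explicit, one-off materialisation: the iterator is consumed here.
values = iter.to_a

# The array can be queried any number of times.
puts values.size                 # => 100
puts values.size                 # => 100
puts values.any? { |v| v > 50 }  # => true

# The iterator itself stays exhausted.
puts iter.to_a.empty? # => true
```

When the source is cheap to recreate, as with a range, calling (0...100).each again for each pass is an alternative that avoids holding the elements in memory.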

A possible counterpoint is that full evaluation and caching already take place in calls to Iterator#cycle. But I do completely agree that this is dangerous territory.

#cycle is a single method and caches only for its internal behaviour.
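A small sketch of #cycle in use, for context. The elements are repeated lazily, so the result has to be bounded (here with #first) before it is materialised:

```crystal
# Take seven elements from an endlessly repeating 1, 2, 3 sequence.
repeated = (1..3).each.cycle.first(7).to_a

puts repeated # => [1, 2, 3, 1, 2, 3, 1]
```

Any storage needed for the repetition is internal to #cycle and never changes how the other Iterator methods behave.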
