This will allow dispatching to types that aren't always Base.Generator
Thanks for opening this. I think the same holds for Iterators.filter
, Iterators.flatten
, and Iterators.product
.
Why?
It would be nice to be able to define function Iterators.map(a_function, thing::SpecialTypeOfArray)
to return something other than a Generator
. Of course, we can currently overload the Generator
constructor to make something that's not a Generator
, but it seems to violate the principle of least surprise. @tkf what was your reasoning?
Why do you need it to return something other than a Generator
? A generator just contains the function and iterator, so no information is lost.
@bramtayl That's my thought too.
More specifically, for example, Iterators.map(very_special_function, ::SpecialTypeOfArray)
can return a lazy AbstractArray
if you have a compiler-agnostic way of computing output type of the very_special_function
. Likewise, Iterators.product
can easily return a lazy array, too. Iterators.flatten
can also return an array if the input is an array of static arrays. OTOH, Generator
cannot be an array subtype as it has to be able wrap other things (e.g., Iterators.filter
)
Another usecase is to overload Iterators.map(f, ::MyTable) :: MyLazyTable
. You can then overload show
of MyLazyTable
to preview computed result like Query is doing.
(But maybe the first point should be resolved using traits, as discussed in the ArrayLike PR #34196)
As I understand it right now, that'll easen the implementation of all types of operators which intend to propagate something. For example a caching iterator which automatically caches the outermost iteration call. Even if it's used as the innermost starter. This would work by dispatching on
Iterators.map(f, c::Caching) = cache(f(x) for x in c.iter)
where cache(iter)
creates the caching layer.
Now, (f(x) for x in cache(1:10))
would be the same as cache(f(x) for x in 1:10)
Also, it'd allow custom return types which in turn could implement arbitrary traits. Currently, as I understand it, you can only implement that by either not using the generator syntax or by specializing the trait for ::Generator{myf, myiter}
Edit: We'd even be able to explicitly fuse multiple Generator layers (instead of nesting them):
Iterators.map(f, c::Generator) = ((f鈭榗.f)(x) for x in c.iter)
(Sure that already could be done aswell but it'd fit into the same generic style very well!)
I support multiple dispatch for construction of generic generators based on lowering to Iterators.map
.
A container has a larger API surface than iteration, and if Base.map
can conserve these features I don't know why generators and Iterators.map
wouldn't also try to conserve indexing and other rich and useful behaviors on the lazily-mapped container. The first example that comes to mind is that a generator of arrays or dictionaries could return something indexable - ideally closely adhering to interfaces like AbstractArray
or AbstractDict
or, in my case, Dictionaries.AbstractDictionary
. (There is the issue of it being ElTypeUnknown
, of course...)
Here is another argument: If you want to define a special method for (f(x) for x in xs)
, it is kind of possible to do this with dispatching on Generator{<:MyContainer}
. However, it does not scale well because Generator
can be nested; e.g., dispatching on (g(y) for y in (f(x) for x in xs))
requires Generator{<:Generator{<:MyContainer}}
. Lowering to Iterators.map
helps because it lets you use arbitrary container type.
Why is that:
julia> Iterators.map(x->x^2, 1:10) isa Base.Generator
false
julia> Iterators.map==map
true
I mean, shouldn't Iterators.map
be the lazy variant in the first place? Otherwise lowering the generator syntax to Iterators.map
would be very game changing in the first place since it wouldn't be lazy anymore. So either way, that should be changed even before changing the lowering IMO.
This is because Iterators.map
is implemented in #34352 which just got merged. So, Iterators.map === Base.map
is false
in Julia >= 1.6 and true
in Julia <= 1.5. This PR is what to do after #34352. As you said, (f(x) for x in xs)
should always be lazy.
What are the chances that this will make it into LTS (=1.6 feature freeze)? I think it would be very important to have a generic style and meaning of the Iterators submodule and to be able to hook into its functions in a way that uses context information to eliminate unnecessary work. I.e. the Iterators.<fun>
should only promise that whatever will be returned will be lazy. The current implementation which just points towards the lazy variants of the eager algorithms will become the default/fallback.implementation.
For example, having Iterators.product(CartesianIndices, CartesianIndices)
return something of type CartesianIndices
rather than just a generic ProductIterator
(EDIT: that concrete example might be a bad one since a tuple of CartesianIndex is something else than a single flattened CartesianIndex). Encourage the specialization of functions in the Iterators submodule.
Most helpful comment
Here is another argument: If you want to define a special method for
(f(x) for x in xs)
, it is kind of possible to do this with dispatching onGenerator{<:MyContainer}
. However, it does not scale well becauseGenerator
can be nested; e.g., dispatching on(g(y) for y in (f(x) for x in xs))
requiresGenerator{<:Generator{<:MyContainer}}
. Lowering toIterators.map
helps because it lets you use arbitrary container type.