The canary in the coal mine is `mapreducedims`, which is a combo of at least three semantic operations: mapping, reducing, and slicing along dimensions. Base seems to be slowly collecting lazy implementations: `Base.Generator`, `Iterators.filter`, etc. Off the top of my head, we would additionally want to include `reduce` and `slicedims` as single-table verbs. Going all the way would mean that the code currently in `mapreducedims` would simply be a specialized method of `collect`. This would sweep up #3893. Additionally, we might want to make verbs like `sum` return a lazy `reduce`.
Extending lazy operations to multiple tables seems trickier, but having a lazy `broadcast` might also be useful.
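To make the "lazy verbs, with `collect` as the materialization step" idea concrete, here is a minimal sketch using only lazy pieces that already exist in Base. The data and variable names are made up for illustration, and this is not the proposed design itself, only the pattern it would generalize.

```julia
# Hypothetical data; any array works here.
A = [1 2 3; 4 5 6]

# Base.Generator is the lazy map: nothing is computed until it is iterated.
lazy_squares = (x^2 for x in A)

# Iterators.filter is the lazy filter; it wraps the generator without running it.
lazy_even = Iterators.filter(iseven, lazy_squares)

# Reducing consumes the lazy pipeline without building an intermediate array.
total = reduce(+, lazy_even)

# collect is the point where a lazy object becomes a concrete array.
squares = collect(lazy_squares)
```

Under the proposal, a dimension-wise reduction would be another lazy wrapper of this kind, and the existing `mapreducedims` code would become the `collect` method that materializes it.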
Why do you want `sum` to be lazy?
Also, IIRC `mapreduce` was added for convenience; some people feel it is much easier to read and write than two nested calls.
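For reference, the two spellings being compared look roughly like this (the input vector is made up for illustration):

```julia
x = [1.0, 2.0, 3.0]

# Two nested calls: map builds an intermediate array, then reduce folds it.
nested = reduce(+, map(abs2, x))

# One call: mapreduce applies abs2 and folds with + in a single pass.
fused = mapreduce(abs2, +, x)
```

Both give the sum of squares; the single call also avoids allocating the intermediate array.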
Well, it doesn't have to be, but I think it would (at least at the moment) improve performance, because `mapslices` is currently a lot slower for functions like `sum` than `reducedims` is for `+`.
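A small sketch of the two routes being compared, written with the keyword `dims` syntax of Julia 1.x rather than the function names in use when this thread was written (that substitution is an assumption on my part; the matrix is made up):

```julia
A = [1 2 3; 4 5 6]

# Generic route: apply an arbitrary function to each column slice.
via_mapslices = mapslices(sum, A; dims=1)

# Specialized route: reduce with + along the first dimension.
via_reduce = reduce(+, A; dims=1)

# sum over a dimension is the same reduction under a friendlier name.
via_sum = sum(A; dims=1)
```

All three return the same 1×3 matrix. The difference is in how they get there: `mapslices` has to handle an arbitrary function on each slice, while the dimension-wise reduction knows it is folding with `+`.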
I've been meaning to write up a Julep and proof-of-concept, but I would like to get rid of `mapslices`, `mapreducedims` (and much of the `slice`-specific functionality) in favor of a `Slice` iterator (which is then used with the usual functions).
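As a rough illustration of "a slice iterator used with the usual functions", later Julia releases (1.1 and up) ship `eachrow`, `eachcol`, and `eachslice` in Base. These are not necessarily the `Slice` iterator envisioned here, and the matrix below is made up, but they show the style:

```julia
A = [1 2 3; 4 5 6]

# Map an ordinary function over column slices instead of calling mapslices.
column_sums = map(sum, eachcol(A))

# Comprehensions work over slices too.
row_maxima = [maximum(r) for r in eachrow(A)]

# Map-then-reduce over slices with the generic mapreduce, no dims-specific verb.
total = mapreduce(sum, +, eachslice(A; dims=2))
```

The slicing is expressed once, by the iterator, and everything downstream is an ordinary `map`, comprehension, or `mapreduce`.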
@simonbyrne I am curious: I thought this would need a new trait, `Homogenous()`, which states that the elements of the iterator all have the same shape, to avoid the ugly lookahead I had in #16708.
@simonbyrne I started work on a slicing iterator package. So far it just handles mapslices. I'm still not quite there with performance but I'm not too far off.