from @ssfrr on Slack: `mapslices(f, x; dims=d)` and `map(f, eachslice(x; dims=d))` interpret `dims` in perpendicular ways (`mapslices` expects the dimensions that make up each slice, while `eachslice` takes the dimensions the slices are iterated over). It seems like those two things are similar enough that they should be combined. (for v2)
One way to deprecate this cleanly would be to have better names than just `dims` (`bydims`? `alongdims`?)
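For concreteness, a minimal illustration of the mismatch (the array and `sum` here are only placeholders):

```julia
A = rand(3, 4)

# mapslices: `dims` names the dimensions that form each slice.
mapslices(sum, A; dims=1)           # sums each column, returns a 1×4 Matrix

# eachslice: `dims` names the dimensions that are iterated over.
map(sum, eachslice(A; dims=2))      # also sums each column, returns a 4-element Vector
```

Both calls sum the columns of `A`, yet one says `dims=1` and the other `dims=2`.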
I personally find it hard to remember if `dims` is the dimensions you keep or the dimensions you reduce away. (It's the dimensions you reduce away.) But perhaps we could move away from every single reduction function having to duplicate this machinery and instead do something more generic (like `eachslice`) which also has the option to specify it either way and gives more control over whether to keep the reduced dimensions (current behavior) or drop them entirely.
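For reference, the current Base behavior being described (a minimal example; `sum` stands in for any reduction):

```julia
A = rand(3, 4)

sum(A; dims=1)                      # 1×4 Matrix: dimension 1 is reduced away,
                                    # but kept as a singleton dimension
dropdims(sum(A; dims=1); dims=1)    # 4-element Vector: reduced dimension dropped
```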
(a) any keyword used widely in Base ought to [read the same way] with this and elsewhere without this
(b) positive logic is more familiar and positive conditions are more easily remembered

__either__ `dims` ought be the dimension[s] of focus in all uses:
- `dims` is set to the dimensions to be selected/considered
- `dims` is set to the dimensions to be omitted/deleted

__orelse__ `dims` ought be the dimension[s] utilized:
- `dims` is set to the dimensions to be selected/considered
- `dims` is set to the dimensions to be preserved/retained

My preference is to choose what is more natural in this sense: easy to use and pleasant in use. I'd rather remember one principle than two with respect to using one keyword [__orelse__ above].
I love the way that broadcasting standardized the process of applying scalar operations to collections without defining anything on each operation. It would be great to find a way to generalize that to mapping and reducing over different dimensions.
I've been a long-time fan of introducing `*` (or something similar) as a "glob-like" indexing object: `eachslice(A, :, :, *, :, *)` would return slices over dimensions 1, 2, and 4, iterating over dimensions 3 and 5. This could almost be done by the parser:

`B = [f(slice) for slice in eachslice(A, :, :, *, :, *)]`

could be expanded as

`B = [f(view(A, :, :, i, :, j)) for i in axes(A, 3), j in axes(A, 5)]`

But one could also do this in Julia code.
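A minimal sketch of what "doing this in Julia code" might look like today; `Glob` and `globslices` are made-up names standing in for the proposed `*` indexing object, which is not valid in this position in current syntax:

```julia
struct Glob end   # hypothetical stand-in for the proposed `*`

# One spec per dimension: `:` keeps that dimension inside each slice,
# `Glob()` marks it as a dimension to iterate over.
function globslices(A::AbstractArray, specs...)
    length(specs) == ndims(A) || throw(ArgumentError("need one spec per dimension"))
    iterdims = [d for d in 1:ndims(A) if specs[d] isa Glob]    # e.g. [3, 5]
    itaxes = Tuple(axes(A, d) for d in iterdims)
    # For one point `I` along the iterated dimensions, build the full index
    # tuple: iterated dimensions get an integer, the rest get `:`.
    makeindex(I) = ntuple(d -> (pos = findfirst(isequal(d), iterdims);
                                pos === nothing ? Colon() : I[pos]),
                          ndims(A))
    return (view(A, makeindex(I)...) for I in Iterators.product(itaxes...))
end

A = rand(2, 3, 4, 5, 6)
B = [sum(slice) for slice in globslices(A, :, :, Glob(), :, Glob())]
size(B)   # (4, 6): one result per slice, iterating dimensions 3 and 5
```

Because the generator wraps `Iterators.product`, collecting the comprehension reproduces the shape of the `for i in axes(A, 3), j in axes(A, 5)` expansion above.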
The latter operation, though, is really not that bad. In my own code I tend to avoid use of `mapslices` altogether, except when I need a good demo of a function with poor inference properties (e.g., https://github.com/timholy/ProfileView.jl#usage-and-visual-interpretation).
This could go further, and make `A[*, :]` something like `eachcol(A)` (up to views, generators, etc.); there was a thread discussing this. Would there be advantages to having this done at the parser level?
The only missing part of `mapslices` is then the ability to easily collect all the `f(slice)`s into one big array, if `f` does not return a scalar. `reduce(hcat, B)` works well if `B` is a vector of vectors, but I think you have to do a bit of reshaping to make this work in higher dimensions. And this is quite slow if given a generator, instead of the collected set of slices. Perhaps something like `cat(f, eachslice(A, ...)) == cat(f(x) for x in eachslice(A, ...))` should do this?
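To make the "collect the pieces back into one big array" step concrete, here is roughly the workaround alluded to above (a hedged sketch; `reverse` and `sum` stand in for an `f` that returns a vector, and the final `reshape` is one reading of what "a bit of reshaping" means in practice):

```julia
A = rand(3, 4)
B = [reverse(col) for col in eachcol(A)]    # Vector of 3-element Vectors
reduce(hcat, B)                             # 3×4 Matrix, via the fast hcat path

# In higher dimensions some reshaping is needed, e.g. iterating over dims=3:
C = rand(3, 4, 5)
S = [vec(sum(s; dims=2)) for s in eachslice(C; dims=3)]   # five 3-element Vectors
M = reduce(hcat, S)                                        # 3×5 Matrix
R = reshape(M, 3, 1, 5)   # same shape as mapslices(x -> sum(x; dims=2), C; dims=(1, 2))
```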
Might make sense to use `.` rather than `*` for this, to relate it to broadcasting, but some kind of indexing syntax that could be used to control the iteration axes in `map`- and `reduce`-type operations would be awesome.