Julia: mapslices, map interpret dims differently

Created on 21 Feb 2020 · 7 Comments · Source: JuliaLang/julia

From @ssfrr on Slack: mapslices(f, x; dims=d) and map(f, eachslice(x; dims=d)) interpret dims in perpendicular ways (mapslices expects the dimensions that make up each slice, while eachslice takes the dimensions the slices are iterated over). The two are similar enough that they should be unified. (For v2.)
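To make the mismatch concrete, here is a minimal sketch on a small matrix. The array `A` and the variable names are illustrative only; `mapslices`, `eachslice`, and `map` are the Base functions named above.

```julia
A = [1 3 5;
     2 4 6]                                      # 2×3 matrix

# mapslices: dims names the dimensions that make up each slice.
# With dims=1, f receives each column (a length-2 vector).
col_sums_ms = mapslices(sum, A; dims=1)          # 1×3 matrix

# eachslice: dims names the dimension that is iterated over.
# With dims=1, iteration runs over rows, so f receives each row.
row_sums_es = map(sum, eachslice(A; dims=1))     # 2-element vector

# To hand f the columns via eachslice, pass the other dimension.
col_sums_es = map(sum, eachslice(A; dims=2))     # 3-element vector
```

So the same `dims=1` produces column sums with `mapslices` and row sums with `eachslice`.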

All 7 comments

One way to deprecate this cleanly would be to have better names than just dims (bydims? alongdims?).

I personally find it hard to remember if dims is the dimensions you keep or the dimensions you reduce away. (It's the dimensions you reduce away.) But perhaps we could move away from every single reduction function having to duplicate this machinery and instead do something more generic (like eachslice) which also has the option to specify it either way and gives more control over whether to keep the reduced dimensions (current behavior) or drop them entirely.
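As a small illustration of the current reduction behavior described here, the sketch below uses `sum`; the array and variable names are illustrative only.

```julia
A = [1 3 5;
     2 4 6]                                  # 2×3 matrix

# dims=1 is the dimension that is reduced away, but it is kept with length 1.
s_keep = sum(A; dims=1)                      # 1×3 matrix

# Dropping the reduced dimension entirely takes a second, explicit step.
s_drop = dropdims(s_keep; dims=1)            # 3-element vector

# The slice-based spelling names the dimension that survives (dims=2).
s_each = map(sum, eachslice(A; dims=2))      # 3-element vector
```

The same column sums are reached with `dims=1` in the reduction and `dims=2` in the slice-based form, which is the source of the confusion.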

(a) any keyword used widely in Base ought to

  • carry meaning consistently (it should not carry a given meaning in one place and lack it elsewhere)
  • be used for consonant purposes, in shared role[s]

(b) positive logic is more familiar and positive conditions are more easily remembered

  • negative logic is less immediately apparent to most (some proof systems notwithstanding)

__either__
dims ought be the dimension[s] of focus in all uses

  • for functions that cull or consider, dims is set to the dimensions to be selected/considered
  • for functions that strip or erase, dims is set to the dimensions to be omitted/deleted

__orelse__
dims ought be the dimension[s] utilized

  • for functions that cull or consider, dims is set to the dimensions to be selected/considered
  • for functions that strip or erase, dims is set to the dimensions to be preserved/retained

My preference is to choose what is more natural in this sense: easy to use and pleasant in use.
I'd rather remember one principle than two with respect to using one keyword [__orelse__ above].

I love the way that broadcasting standardized the process of applying scalar operations to collections without defining anything on each operation. It would be great to find a way to generalize that to mapping and reducing over different dimensions.

I've been a long-time fan of introducing * (or something similar) as a "glob-like" indexing object: eachslice(A, :, :, *, :, *) would return slices over dimensions 1, 2, and 4, iterating over dimensions 3 and 5. This could almost be done by the parser:

B = [f(slice) for slice in eachslice(A, :, :, *, :, *)]

could be expanded as

B = [f(view(A, :, :, i, :, j)) for i in axes(A, 3), j in axes(A, 5)]

But one could also do this in Julia code.
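A rough sketch of how this might look in plain Julia follows. Everything named here is hypothetical: `Glob`, `GLOB` (standing in for the proposed `*`), `slice_at`, and `globslices` are not part of Base, and the sketch makes no attempt at type stability.

```julia
# Hypothetical sentinel standing in for the proposed `*` index.
struct Glob end
const GLOB = Glob()

# Build the view of A that fixes the dimensions in iterdims at index I
# and keeps every other dimension whole.
function slice_at(A, iterdims, I)
    full = Any[Colon() for _ in 1:ndims(A)]
    for (k, d) in enumerate(iterdims)
        full[d] = I[k]
    end
    return view(A, full...)
end

# Hypothetical glob-style eachslice: `:` positions stay inside each slice,
# GLOB positions are iterated over.
function globslices(A::AbstractArray, inds::Union{Colon,Glob}...)
    length(inds) == ndims(A) ||
        throw(ArgumentError("expected one index per dimension of A"))
    iterdims = [d for d in 1:ndims(A) if inds[d] isa Glob]
    iteraxes = Tuple(axes(A, d) for d in iterdims)
    return (slice_at(A, iterdims, I) for I in CartesianIndices(iteraxes))
end

A = reshape(collect(1:24), 2, 3, 4)
B = [sum(s) for s in globslices(A, :, GLOB, :)]    # one result per index of dimension 2
C = collect(globslices(A, GLOB, :, GLOB))          # 2×4 array of views along dimension 2
```

Because the generator wraps a `CartesianIndices`, the collected result has the shape of the iterated dimensions, matching the expanded comprehension above.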

The latter operation, though, is really not that bad. In my own code I tend to avoid use of mapslices altogether, except when I need a good demo of a function with poor inference properties (e.g., https://github.com/timholy/ProfileView.jl#usage-and-visual-interpretation).

This could go further and make A[*, :] something like eachcol(A) (up to views, generators, etc.); there was a thread discussing this. Would there be advantages to having this done at the parser level?

The only missing part of mapslices is then the ability to easily collect all the f(slice)s into one big array, if f does not return a scalar. reduce(hcat, B) works well if B is a vector of vectors, but I think you have to do a bit of reshaping to make this work in higher dimensions. And this is quite slow if given a generator, instead of the collected set of slices. Perhaps something like cat(f, eachslice(A, ...)) == cat(f(x) for x in eachslice(A, ...)) should do this?
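The reshaping involved can be sketched as follows. The function `f` and the array `A` are made up for illustration; the comparison against `mapslices` is only there to show that the manual route gives the same array.

```julia
A = reshape(collect(1.0:24.0), 2, 3, 4)
f(v) = [minimum(v), maximum(v), sum(v)]     # illustrative f returning a length-3 vector

# One iterated dimension: B1 is a vector of vectors, so reduce(hcat, B1) is enough.
B1 = [f(view(A, :, 1, k)) for k in axes(A, 3)]
M1 = reduce(hcat, B1)                       # 3×4 matrix

# Two iterated dimensions: B2 is a 3×4 matrix of vectors.
# Flatten it, concatenate, then restore the iteration shape by hand.
B2 = [f(view(A, :, j, k)) for j in axes(A, 2), k in axes(A, 3)]
M2 = reshape(reduce(hcat, vec(B2)), :, size(B2)...)   # 3×3×4 array

same_as_mapslices = M2 == mapslices(f, A; dims=1)
```

On Julia 1.9 or later, Base's `stack` appears to cover this collection step, which may make the manual reshape unnecessary there.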

Might make sense to use . rather than * for this to relate it to broadcasting, but some kind of indexing syntax that could be used to control the iteration axes in map and reduce-type operations would be awesome.
