Julia: rename `indices`?

Created on 24 Aug 2017 · 15 Comments · Source: JuliaLang/julia

I propose renaming this to axes, since it basically returns a description of each axis of an array. The current name makes me think it gives you, well, the indices (which is what keys does).
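To make the distinction concrete, here is a small sketch. It assumes Julia 0.7 or later, where the function is spelled `axes`; on the version discussed in this issue the same call would be `indices(A)`. The variable names are illustrative only.

```julia
A = rand(3, 4)

ax = axes(A)   # one range per dimension: (Base.OneTo(3), Base.OneTo(4))
ks = keys(A)   # every valid index of A, as a CartesianIndices object

first(ks)      # CartesianIndex(1, 1)
length(ks)     # 12, one key per element
```

The first call describes each axis separately, while `keys` enumerates what you can actually index with.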

arrays decision deprecation design

JeffBezanson

👍1

Most helpful comment

I like the idea of switching our terminology to "axes" to refer to what we sometimes call "dimensions." The problem with dimensions is that it means both "number of dimensions" (as in "n-dimensional") and "the dimensions of this room" (aka, size). "axes" seems to clearly imply the former and not the latter. For example, I'd propose that most methods that take an argument called dims or region should probably rename that variable axes.

With AxisArrays, @mbauman has played with making indices return the information that is currently returned by AxisArrays.axes. This makes sense because both return a tuple that is essentially a "broadcast-worthy" description of the complete set of Cartesian indices for the array. However axes returns these in physical units (e.g., mm if you've assigned such units to the axes of the array) whereas indices returns these in computer science units (aka, integers). Both turn out to have their uses, so AFAICT attempts to unify axes and indices have proved problematic. For Base, the advantage of indices is that it somehow seems to imply integer units to me.

I wonder if the distinction you're working towards is the difference between the set of all indices, an iterator to generate all indices, and the "basis vectors" for constructing (via reshape&broadcasting) all possible indices. Given that arrays have rectangular/Cartesian indexing, all valid indices can be constructed from the "basis vectors". Maybe call it indexvectors?
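The "basis vectors" idea can be sketched as follows. This is only an illustration, assuming Julia 0.7 or later (where the per-dimension ranges come from `axes`); `allidx` is a made-up name, and tuples stand in for Cartesian indices.

```julia
A = rand(3, 4)
ax = axes(A)                       # (Base.OneTo(3), Base.OneTo(4))

# Reshape the second range into a row, then broadcast it against the first.
# Every (row, column) combination appears exactly once.
allidx = tuple.(ax[1], reshape(ax[2], 1, :))

size(allidx)                       # (3, 4), same shape as A
allidx[2, 3]                       # (2, 3)
```

The broadcast result matches what you get by iterating over `CartesianIndices(A)`, which is the sense in which the per-axis ranges generate all valid indices.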

timholy on 24 Aug 2017

👍4 ❤2

All 15 comments

There is also eachindex ... personally, I find axes more confusing than indices. We don't use the "axis" terminology anywhere else in Julia as far as I can tell.

stevengj on 24 Aug 2017

👍1

eachindex also doesn't tell you the indices of an array --- it gives you something useful for fast iteration, which can be different from the actual indices.
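A short sketch of that difference, assuming Julia 1.x behaviour for `Array` and for a view built from two ranges (the names `A` and `V` are illustrative):

```julia
A = rand(3, 4)

eachindex(A)            # Base.OneTo(12): linear indices, fastest for an Array
keys(A)                 # the Cartesian indices (1,1) through (3,4)

V = view(A, 1:2, 2:3)   # a view that is not contiguous in memory
eachindex(V)            # Cartesian indices, since linear indexing would be slow here
```

So `eachindex` picks whichever iteration style is efficient for the array type, which need not be the "natural" set of indices.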

JeffBezanson on 24 Aug 2017

However indices(A, i) gives you the indices for dimension i, so that's good.

JeffBezanson on 24 Aug 2017

There might be some more inconsistencies in how properties of arrays are named. I can see that the singular size refers both to an individual size(A,i) and to the collection size(A), but I always find this confusing when, on the next line, I query strides with stride(A,i) or strides(A).
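For reference, the naming asymmetry looks like this on a dense column-major matrix (a minimal illustration):

```julia
A = rand(3, 4)

size(A)        # (3, 4)   singular name, returns the whole tuple
size(A, 2)     # 4        same name for a single dimension

strides(A)     # (1, 3)   plural name for the whole tuple
stride(A, 2)   # 3        singular name for a single dimension
```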

On a more philosophical level, I would say that the current nomenclature for array properties is very much oriented towards arrays representing data on a discretised grid in space(time), i.e. the rank of the tensor is referred to as ndims, typically being 2 (plane), 3 (space) or 4 (spacetime), and indices(A,i) is used for the possible indices (the range of values) that the ith dimension can take, and the length of that range is size(A,i), again referring to the geometrical interpretation of the array.
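The relationship between these three properties can be checked directly. The sketch assumes Julia 0.7 or later, where `indices(A, i)` is spelled `axes(A, i)`:

```julia
A = rand(3, 4)

ndims(A)             # 2: the number of dimensions (the tensor rank)
axes(A, 2)           # Base.OneTo(4): the values the 2nd index can take
length(axes(A, 2))   # 4: the same number that size(A, 2) returns
```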

That use of indices and dimensions is exactly opposite to how tensors are often described and used in physics. A tensor has a number of indices (the rank of the tensor), which are abstract objects (not the actual values they take), and the ith index of a tensor has a certain dimension associated to a vector space in which it lives, which can be spacetime. For example, tensors in classical physics or general relativity can have a number of indices whose dimension is typically 3 (for space) or 4 (for spacetime), but the number of indices itself has nothing to do with space(time) dimensionality. size and ndims are therefore awkward descriptions of these properties. Similarly, in quantum physics, the indices of a tensor take values in some Hilbert space with a certain dimension, and neither those dimensions nor the number of indices have any relation to space(time) dimensionality.

All of this just as a side note, as I am not actually voting to change the current convention.

Jutho on 4 Oct 2017

Would existing users of an axes() function have to rename their versions? (It might be quite common in the plotting/graphics world as well as the array world...)

cormullion on 4 Oct 2017

I made a prototype PR of changing this name at #25057 - it's pretty easy to choose a new name on that branch, as desired.

I'll cross-post a comment I made there:

I feel that the _indices_ of an indexable container would probably be the collection of things you can index with, which is currently what the keys function does. [...]

Going further, I feel there is a bit of discord between "keys" and "indices", since we have keys and haskey for discovering what we can use with getindex and setindex!. I don't see the advantage in forcing users to learn and use multiple words for the same concept. We could consider harmonizing like so:

| Current | "Index" terminology | "Key" terminology |
| --- | --- | --- |
| getindex | getindex | getkey |
| setindex! | setindex! | setkey! |
| keys | indices or index | keys |
| haskey | hasindex | haskey |

Personally, I'd prefer the "index" version.
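As a reminder of how the functions in the "Current" column fit together today, here is a small sketch using only existing Base functions (the containers `d` and `v` are illustrative):

```julia
d = Dict("a" => 1, "b" => 2)
v = [10, 20, 30]

haskey(d, "a")      # true: discovery uses "key" vocabulary
d["a"]              # 1: lookup calls getindex, "index" vocabulary
d["c"] = 3          # assignment calls setindex!

collect(keys(v))    # [1, 2, 3]: arrays answer to keys as well
v[2]                # 20: again getindex
v[1] = 11           # again setindex!
```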

andyferris on 13 Dec 2017

I prefer the "index" terminology as well – "index" sounds right for both arrays and dicts (although it's a bit non-standard for dicts), whereas "key" sounds quite wrong for arrays.

StefanKarpinski on 13 Dec 2017

👍3

Has the name dims been considered? It matches ndims and seems like a more intuitive name for what this function returns: a description of the dimensions of the argument. It would be a little weird that dims(A, i) is plural, but it seems ok to me:

```julia
julia> dims(rand(3, 4))
(Base.OneTo(3), Base.OneTo(4))

julia> dims(rand(3, 4), 1)
Base.OneTo(3)

julia> ndims(rand(3, 4))
2
```

StefanKarpinski on 13 Dec 2017

With strides, it's also strides(A) and stride(A,n) (plural and singular). However, dim sounds like dim(A,n) should just be size(A,n), not the corresponding range, but that's maybe my difference in background, as explained above: https://github.com/JuliaLang/julia/issues/23434#issuecomment-334051258

Jutho on 13 Dec 2017

I had thought of dimranges since it returns a range for each dimension.

However maybe dimindices / dimkeys would be better, since it’s the indices (keys) for each dimension.

andyferris on 14 Dec 2017

FWIW, I prefer axes to dimindices, but I prefer dims to both.

StefanKarpinski on 14 Dec 2017

Just looked into this - dims is really commonly used in Base as a variable name, containing a tuple of integers (the result of size). We'd want to do a bit of a nomenclature switch around if we go with dims here.
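The usage pattern in question looks roughly like this. It is an illustrative sketch, not code taken from Base:

```julia
A = rand(3, 4)

dims = size(A)     # (3, 4): a tuple of integers, not a tuple of ranges
B = zeros(dims)    # constructors accept such a tuple directly
size(B)            # (3, 4)
```

A function named `dims` that returned ranges would clash with this established meaning of the variable name.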

andyferris on 14 Dec 2017