I propose renaming this to `axes`, since it basically returns a description of each axis of an array. The current name makes me think it gives you, well, the indices (which is what `keys` does).
There is also `eachindex`... personally, I find `axes` more confusing than `indices`. We don't use the "axis" terminology anywhere else in Julia as far as I can tell.
`eachindex` also doesn't tell you the indices of an array; it gives you something useful for fast iteration, which can be different from the actual indices.
However, `indices(A, i)` gives you the indices for dimension `i`, so that's good.
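To make the difference between the per-dimension ranges and the `eachindex` iterator concrete, here is a small sketch. It assumes a recent Julia (1.x), where, as far as I know, the function discussed in this thread as `indices` is spelled `axes`. The arrays `A` and `V` are made up purely for illustration.

```julia
A = rand(3, 4)

# Per-dimension index ranges (what this thread calls `indices(A)`).
ax = axes(A)        # one range per dimension
ax1 = axes(A, 1)    # the range for dimension 1 only

# `eachindex` returns an efficient iterator, not the per-dimension ranges.
# For a plain Array it is a linear range over all 12 elements.
lin = eachindex(A)

# For a view like this one the iterator may instead be Cartesian,
# so its form can differ from one array type to another.
V = view(A, 1:2, :)
it = eachindex(V)
```

Either way, iterating `eachindex` visits every element once; only `axes` tells you the valid range along each dimension.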
I like the idea of switching our terminology to "axes" to refer to what we sometimes call "dimensions." The problem with dimensions is that it both means "number of dimensions" (as in "n-dimensional") and "the dimensions of this room" (aka, `size`). "axes" seems to clearly imply the former and not the latter. For example, I'd propose that most methods that take an argument called `dims` or `region` should probably rename that variable `axes`.
With AxisArrays, @mbauman has played with making `indices` return the information that is currently returned by `AxisArrays.axes`. This makes sense because both return a tuple that is essentially a "broadcast-worthy" description of the complete set of Cartesian indices for the array. However, `axes` returns these in physical units (e.g., `mm` if you've assigned such units to the axes of the array) whereas `indices` returns these in computer science units (aka, integers). Both turn out to have their uses, so AFAICT attempts to unify `axes` and `indices` have proved problematic. For Base, the advantage of `indices` is that it somehow seems to imply integer units to me.
I wonder if the distinction you're working towards is the difference between the set of all indices, an iterator to generate all indices, and the "basis vectors" for constructing (via reshape and broadcasting) all possible indices. Given that arrays have rectangular/Cartesian indexing, all valid indices can be constructed from the "basis vectors". Maybe call it `indexvectors`?
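As a rough illustration of the "basis vectors" idea, the sketch below builds every Cartesian index of a matrix from its per-dimension ranges using reshape and broadcasting. It assumes a recent Julia (1.x) where these ranges come from `axes`; the names `rows`, `cols` and `allidx` are just illustrative.

```julia
A = rand(3, 4)

# The per-dimension ranges act as the "basis vectors".
rows, cols = axes(A)

# Reshape the second range into a 1×4 row so that broadcasting
# forms the full 3×4 grid of (row, column) combinations.
allidx = CartesianIndex.(rows, reshape(cols, 1, :))

# The result matches the set of all valid Cartesian indices of A.
same = allidx == CartesianIndices(A)
```

The same pattern should extend to higher dimensions by reshaping each range so that it varies along its own dimension.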
There might be some more inconsistencies in how properties of arrays are named. I could see that size in the singular refers to both an individual `size(A, i)` and the collection `size(A)`, but I always find this confusing when, on the next line, I probe strides as `stride(A, i)` or `strides(A)`.
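For reference, the naming asymmetry looks like this in practice (a minimal sketch with a made-up 3×4 array; the values follow from column-major storage):

```julia
A = rand(3, 4)

# `size` is singular for both the whole tuple and a single dimension.
sz = size(A)        # (3, 4)
sz2 = size(A, 2)    # 4

# Strides switch between plural and singular.
st = strides(A)     # (1, 3)
st2 = stride(A, 2)  # 3
```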
On a more philosophical level, I would say that the current nomenclature for array properties is very much oriented towards arrays representing data on a discretised grid in space(time), i.e. the rank of the tensor is referred to as `ndims`, being typically 2 (plane), 3 (space) or 4 (spacetime), and `indices(A, i)` is used for the possible indices (the range of values) that the `i`th dimension can take, and the length of that range is `size(A, i)`, again referring to the geometrical interpretation of the array.
That use of indices and dimensions is exactly opposite to how tensors are often described and used in physics. A tensor has a number of indices (the rank of the tensor), which are abstract objects (not the actual values they take), and the `i`th index of a tensor has a certain dimension associated to a vector space in which it lives, which can be spacetime. For example, tensors in classical physics or general relativity can have a number of indices whose dimension is typically 3 (for space) or 4 (for spacetime), but the number of indices itself has nothing to do with space(time) dimensionality. `size` and `ndims` are therefore awkward descriptions of these properties. Similarly, in quantum physics, the indices of a tensor take values in some Hilbert space with a certain dimension, and neither those dimensions nor the number of indices have any relation to space(time) dimensionality.
All of this just as a side note, as I am not actually voting to change the current convention.
Would existing users of an `axes()` function have to rename their versions? (It might be quite common in the plotting/graphics world as well as the array world...)
I made a prototype PR of changing this name at #25057 - it's pretty easy to choose a new name on that branch, as desired.
I'll cross-post a discussion I made there:
I feel that the _indices_ of an indexable container would probably be the collection of things you can index with, which is currently what the `keys` function does. [...]

Going further, I feel there is a bit of discord between "keys" and "indices", since we have `keys` and `haskey` as doing discovery of things we can do `getindex` and `setindex!` with. I don't see the advantage in forcing users to learn and use multiple words for the same concept. We could consider harmonizing like so:

| Current | "Index" terminology | "Key" terminology |
| --- | --- | --- |
| `getindex` | `getindex` | `getkey` |
| `setindex!` | `setindex!` | `setkey!` |
| `keys` | `indices` or `index` | `keys` |
| `haskey` | `hasindex` | `haskey` |

Personally, I'd prefer the "index" version.
I prefer the "index" terminology as well – "index" sounds right for both arrays and dicts (although it's a bit non-standard for dicts), whereas "key" sounds quite wrong for arrays.
Has the name `dims` been considered? It matches `ndims` and seems like a more intuitive name for what this function returns: a description of the dimensions of the argument. It would be a little weird that `dims(A, i)` is plural, but it seems ok to me:
julia> dims(rand(3, 4))
(Base.OneTo(3), Base.OneTo(4))
julia> dims(rand(3, 4), 1)
Base.OneTo(3)
julia> ndims(rand(3, 4))
2
With `strides`, it's also `strides(A)` and `stride(A, n)` (plural and singular). However, `dim` sounds like `dim(A, n)` should just be `size(A, n)`, not the corresponding range, but that's maybe my difference in background, as explained above: https://github.com/JuliaLang/julia/issues/23434#issuecomment-334051258
I had thought of `dimranges`, since it returns a range for each dimension.
However, maybe `dimindices` / `dimkeys` would be better, since it's the indices (keys) for each dimension.
FWIW, I prefer `axes` to `dimindices`, but I prefer `dims` to both.
Just looked into this - `dims` is really commonly used in `Base` as a variable name, containing a tuple of integers (the result of `size`). We'd want to do a bit of a nomenclature switch around if we go with `dims` here.
Has the name `dims` been considered?
Wasn't this precisely what Tim Holy's comment above addressed? Or am I missing something?
Closed via #25057