Julia: Ellipsis slicing

Created on 15 Jan 2014 · 24 Comments · Source: JuliaLang/julia

NumPy has an ellipsis notation for filling index tuples up to the array's dimension. In
A[..., i, k]
the ... is interpreted as shorthand for however many colons (:, :, : and so on) are needed to fill the indexing tuple up to dimension n.

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

This gives nice access to contiguous subarrays in the dense array case and would be no more harmful than the : notation itself.

Was this feature already considered? It was mentioned by @toivoh on the mailing list, but did not trigger discussion.
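For readers coming from Julia rather than NumPy, the requested behaviour can be spelled out with tools that exist today. This is a minimal sketch, not proposed syntax: the array `A` and the index `i` are made up for illustration, and only the `A[..., i]` case is covered.

```julia
# Illustrative 2×3×4 array; the values are arbitrary.
A = reshape(collect(1:24), 2, 3, 4)
i = 2

# What NumPy would write as A[..., i], with the colons spelled out by hand.
explicit = A[:, :, i]

# The same slice without hard-coding the number of dimensions:
# build ndims(A) - 1 colons and splat them in front of the last index.
leading = ntuple(_ -> Colon(), ndims(A) - 1)
generic = A[leading..., i]

# selectdim (Julia 1.0 and later) returns a view of the same slice.
viewed = selectdim(A, ndims(A), i)
```

All three results hold the same elements; the ellipsis would simply hide the bookkeeping in the second form.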

arrays

All 24 comments

+1. I find myself using this a lot in numpy. I also like the numpy newaxis feature, which seems useful and harmless.

+1 for newaxis and something that works like ... in numpy (I'm not sure if
it's appropriate to actually use ... for this purpose, since its meaning
right now is entirely different)

In Python 3.x, ... (or Ellipsis) is the singleton instance of type ellipsis:

Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)] on win32
>>> ...
Ellipsis
>>> str(...)
'Ellipsis'
>>> type(...)
<class 'ellipsis'>
>>> type(...)()
Ellipsis
>>> type(...)() is type(...)()
True

You can use it as a prettier alternative to None, if you want.

In Python 2.7, however, you cannot just write ... as that results in a syntax error (but Ellipsis is possible).
Nevertheless it is possible to use it in __getitem__:

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
>>> ...
  File "<stdin>", line 1
    ...
    ^
SyntaxError: invalid syntax
>>> class K:
...   def __getitem__(self, o): print(o)
...
>>> K()[...]
Ellipsis
>>> Ellipsis
Ellipsis
>>> type(Ellipsis)
<type 'ellipsis'>
>>> type(Ellipsis)() is type(Ellipsis)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'ellipsis' instances
>>>

In Julia, we have to decide whether to adopt one of these two behaviours or simply not allow ..., which would be sad.

In a sense this is already here, although at the scalar level rather than the vectorized level:

R = CartesianRange(size(A)[1:end-1])
for j = 1:size(A,ndims(A))
    for i in R
        # do something with A[i,j]
    end
end

In fact it's even better than that, because you can do this with "middle indexes," see #10814, which is presumably impossible with ellipsis indexing.

There are still some challenges with regards to type-stability, at least in context of the example in #10814. But for this example, I suspect it would be OK (I haven't tested).
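The CartesianRange loop above predates Julia 1.0. As a hedged sketch of the same idea in current Julia, where CartesianRange has been replaced by CartesianIndices and a CartesianIndex can be mixed with an integer index: the function name `sum_by_last_dim` and the test array are illustrative assumptions, not anything from the thread.

```julia
# Sum each slice along the last dimension by iterating over the
# leading dimensions with a CartesianIndices range.
function sum_by_last_dim(A::AbstractArray)
    R = CartesianIndices(size(A)[1:end-1])        # all leading index combinations
    sums = zeros(eltype(A), size(A, ndims(A)))    # one accumulator per last-dim slice
    for j in axes(A, ndims(A))
        for I in R
            sums[j] += A[I, j]                    # CartesianIndex followed by an Int
        end
    end
    return sums
end

A = reshape(collect(1:24), 2, 3, 4)
sum_by_last_dim(A)
```

The loop body never mentions how many leading dimensions there are, which is the scalar-level counterpart of what `A[..., j]` would express for whole slices.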

If we do decide to add this, it'd be good to try it out first before adding syntax. We could experiment with allowing indexing with CartesianRange objects directly.

My hunch, though, is that this isn't needed nearly as much as it is in NumPy since we don't need to bend over backwards to vectorize things for performance. In fact, we do the opposite...

I don't have an opinion on the syntax, but it has been pointed out before
that sometimes vectorized versions of algorithms are easier to read and
understand (and that we want these to be fast as well).

(No bending over backwards, though...!)

On Thursday, April 23, 2015, Matt Bauman wrote:

If we do decide to add this, it'd be good to try it out first before
adding syntax. We could experiment with allowing indexing with
CartesianRange objects directly.

My hunch, though, is that this isn't needed nearly as much as it is in
NumPy since we don't need to bend over backwards to vectorize things for
performance. In fact, we do the opposite...

@timholy Does that example actually work? A[j, i] works, but it looks like A[i, j] (e.g., getindex(::Array, ::CartesianIndex, ::Int)) isn't implemented in master.

Aha! Yes, you're right that not all possible combinations have been implemented, for reasons described in #10814. Would be easy to add. The right way to do it is through #10525, though.

Indeed. Indexing with any combination of I::Union(Real,AbstractArray,Colon,CartesianIndex)... is already implemented there.

Writing A[..., i] is very natural if A is a collection of, say, points or small matrices. In particular, expressions like x[..., j] = Phi*x[..., j-1] + b mean the same thing for A::Matrix and A::Vector{Point}, so, for example, the Euler scheme will be oblivious to the matrix type and will default to a very fast operation in the latter case.

For the vectorized access a not very intrusive approach is to define

getindex{T}(A::AbstractArray{T,1}, ::Type{Val{:...}}, i) = A[i]
getindex{T}(A::AbstractArray{T,2}, ::Type{Val{:...}}, i) = A[:,  i]

and friends. This does not need any additional infrastructure.
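The two definitions above use pre-1.0 method syntax and need one method per dimensionality. A rough modern sketch of the same dispatch trick follows. It swaps `Val{:...}` for a hypothetical marker type `Rest` (an assumption made for this example) and uses a single method that fills in the colons for any number of dimensions.

```julia
# Hypothetical marker type standing in for Val{:...} in the comment above.
struct Rest end

# One method for every dimensionality: put ndims(A) - 1 colons
# in front of the trailing index. Because Rest is our own type,
# this method cannot collide with Base's getindex methods.
Base.getindex(A::AbstractArray, ::Rest, i) =
    A[ntuple(_ -> Colon(), ndims(A) - 1)..., i]

v = [10, 20, 30]
M = [1 2; 3 4]

v[Rest(), 2]   # same as v[2]
M[Rest(), 2]   # same as M[:, 2]
```

This only handles a trailing index after the marker; it does nothing for `setindex!` or `view`, which would each need their own method under this approach.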

Yes, as @mschauer mentioned, this isn't about forcing vectorization, but rather it is the natural thing to do in many cases. This allows you to index 1,...,n arbitrary size tensors, and because it's in a standard array it's very performant (and can make clean code in conjunction with the f.() syntax). Your fix there is a good stand-in until something more general comes along.

@mbauman contributed a great solution to EllipsisNotation.jl which essentially solves this problem.

https://github.com/ChrisRackauckas/EllipsisNotation.jl

It handles cases like A[5,..,5], A[..,5], etc. It's very robust and (now) it infers well!
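To give a feel for how a package can support forms like A[5,..,5] without parser changes, here is a sketch that hooks into `Base.to_indices`, the function Julia uses to normalise index tuples. The marker type `Dots` and the constant `dots` are hypothetical names for this example. EllipsisNotation.jl exports its own `..`, and whether its internals match this sketch is an assumption rather than a description of the package.

```julia
# Hypothetical marker type and instance, used only in this sketch.
struct Dots end
const dots = Dots()

# When the marker is the next index to process, replace it with as many
# colons as are needed so that the remaining N indices line up with the
# trailing dimensions, then let Base continue as usual.
function Base.to_indices(A, inds, I::Tuple{Dots, Vararg{Any, N}}) where {N}
    colons = ntuple(_ -> Colon(), length(inds) - N)
    return to_indices(A, inds, (colons..., Base.tail(I)...))
end

A = reshape(collect(1:24), 2, 3, 4)

A[dots, 2]         # same as A[:, :, 2]
A[1, dots, 2]      # same as A[1, :, 2]
view(A, dots, 2)   # the same slice, as a view
```

Because `getindex`, `setindex!` and `view` all pass their indices through `to_indices`, one method covers all three, including a marker in the middle of the index list.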

I for one think it would be nice to have this in Base so it could be standardized as part of Julia notation. It would also solve the #24069 problem of wanting a type for all indices that keeps shape.

Since the implementation is all there, I think the main question would be notation since interval arithmetic people seem to want ...

If it gets into Base, then it should use A[i, ..., j]. Test: "I could not think of anything else that syntax could mean."

... is getting pretty overloaded, but this seems like a fairly natural meaning.

Amusingly :: parsed as an identifier in this context up through 0.6. It's also overloaded, though.

We could use a word like colons, which has the advantage of also having a natural extension to a fixed number of colons with colons(N). That'd give us a complete replacement for slicedim.

I like the idea of sometime in the future having colons(N), since then you'd be able to have more than one ... and still parse it.
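The `colons(N)` idea can be tried out in user code today, with the caveat that the result has to be splatted explicitly. The sketch below is illustrative only: `colons` is the hypothetical name from the comments above, and `pick` is a made-up helper showing how it could stand in for slicedim (selectdim in Julia 1.0 and later).

```julia
# Hypothetical helper from the discussion: a tuple of N colons.
colons(N::Integer) = ntuple(_ -> Colon(), N)

A = reshape(collect(1:24), 2, 3, 4)

# Pick index i along dimension d, keeping every other dimension whole.
pick(A, d, i) = A[colons(d - 1)..., i, colons(ndims(A) - d)...]

pick(A, 2, 3)   # same elements as A[:, 3, :]
```

With a fixed count on each side, two runs of colons in one index expression are unambiguous, which is the advantage over a bare ellipsis noted above.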

I prefer ... for this anyway – at least the "and the rest" sense is fitting. The other sense of :: as a type annotation does not dovetail with the "many colons" meaning here, imo.

I would like to propose the Unicode symbol … for this. It has the advantage of being succinct - only 1 character, yet preserves all the good perks of ....

Hopefully this is still on the "future features" check-list.
This kind of indexing could return slices of unknown-dimensional arrays,
and the returned slices would be shape-ready for broadcasting operations on the original array.

Related discourse discussion.
And a work around repo pointed out by @crstnbr (Sorry if this @ bothers you)

I tried to make this possible in #24091, but that approach wasn't welcomed. Instead, someone would need to add this as a special form in the [] indexing syntax — adding both parser and lowering support. We could then use the implementation from EllipsisNotation.jl — it's a really small, efficient, and great module.

I'm not the biggest fan of adding a special case that works in A[...] but not view(A, ...), but if enough folks want this then I suppose I can get on board.

(I hope you don't mind I took the liberty of fixing your link).

There is not much discussion in #24091, but well, I can trust your judgement "not gonna happen". A pity in my opinion.

Well, if we can vote for features, then I would say +1 for this syntax. It would be nice and clean if it were built into the language.
What exactly is the problem with this feature? Does it come with a performance drawback?

Note BTW that the objection above that .. is also wanted for intervals is now solved, in that IntervalSets.jl uses the definition from EllipsisNotation.jl.
