Julia: Broadcast Array Indexing

Created on 31 Oct 2016  Â·  25Comments  Â·  Source: JuliaLang/julia

I have an array x::Array{ImmutableArray.Vector2}.

It would be cool to be able to broadcast indexing as is done with functions.
e.g.
x.[1],
which would return the first element of each Vector2 in my array.

This is currently achievable with getindex.(x,1) or in the case of a range of indices
getindex.(x,[range]).

broadcast speculative

Most helpful comment

If we assign .[] to pointwise-indexing then we are effectively "pigeon-holing" indexing.

Yup. On the other hand, if we assign A.[X] to getindex.(A,X), then we're limiting the syntax to nested collections of arrays. Now, it looks like you use them pretty heavily, but in my uses, arrays of scalars are a much more common data structure.

I typically only use collections of arrays when I don't know if all the arrays will be the same shape. Otherwise, I'll just use one higher dimensional array. And if the shapes of the arrays aren't all the same, I'm not sure I'd be able to index into them in a homogenous manner.

There's no limit to the number of short-hand syntaxes that one can invent. The key is to find the middle-ground where it's both concise and understandable.

All 25 comments

I know your request is much more general than your example, but in this very specific case I'd use first.(x).

This syntax has also been proposed as a "pointwise" indexing operation — that is, instead of vectorizing over the array being indexed, you vectorize over the indices (#2591). Personally, I think that's more in-line with vectorized function calls; your suggestion would seem surprising to me.

getindex.(x, 1) works fine for this.

@mbauman if getindex.() works then I would suggest .[] should be equivalent. If you want .[] to mean something different, then that would be confusing, wouldn't it?

(Actually, destructuring arrays like this is a very common thing to want to do, in some fields. Sometimes I even want something like vector..field where vector::Vector{T} and T has a field named field, to extract that field from each element. Currently getindex.() is slow for vectors of tuples and getfield.() is always slow, since inference doesn't work it's magic...)

It just seems complicated and surprising to me. By analogy to the f.(args...) operator, I'd expect that only the indices in A.[inds...] would be broadcast over. We don't broadcast over arrays of functions.

This gets even worse with assignment. What parts of A.[inds...] = B should be broadcast over? Does it also lower to setindex!.(A, B, inds...)? Then what happens with A.[inds...] += B?

If B is compatible with A.[inds...], then I fail to see the problem.

For example,
A.[inds] += B --> setindex!.(A, getindex.(A , inds)+B, inds)
works.

This seems reasonable to me. I'm trying to think of whether there is some other reasonable use for A.[inds], but nothing comes to mind. #2591 seems less compelling to me.

Is this really entirely at odds with #2591? I feel like there's a generalization that covers both ideas. cc @toivoh

I think the difficulty stems from the fact that indices work element-wise into an array. That is, I typically think of collections of indices and one atomic source array. By broadcasting over _both_ the source and the indices you're treating both parts as though they're both collections of things. This really only makes sense if the source is a collection of arrays. Personally, I find that a lot less useful.

It's easiest to think about this with concrete examples. Pointwise indexing makes it trivial to extract the diagonal from a square matrix with A.[1:end, 1:end]. If the matrix is included in the broadcast part, that should be a BoundsError since it attempts to index the scalar elements of the matrix by (2,2) and (3,3). You'd have to protect the source in a collection itself with (A,).[1:end, 1:end].

Finally, I must say it's much easier to explain the behavior if the source array isn't a part of the broadcast operation. You broadcast the indices to a common size and index into the source. This works for both indexing and assignment. While the other semantic is _really_ easy to implement, I'm not sure I know how to explain it without resorting to getindex.(A, I...).

Isn't #2591 doable (and perhaps clearer) with a getindex that accepts a function?

By which I mean something like

diagonals = mat[i -> (i,i), 1:end]

In fact what you really want is a Generator

diagonals = mat[i,i for i = 1:end]

Would there be parsing issues with typed comprehensions here? Currently we try to construct Array with this syntax, giving an error. @mbauman Is "pointwise" indexing is more complex in some way?

I agree with @mbauman that it does seem that getindex.(A, inds) is confusing if inds is not scalar. That's really a tough one to nut out. Nonetheless, if we're going to have .+ parse as +.() and so-on, but not the [] operator, it would be surprising, so the pay-off would have to be large (and something we couldn't get with just a Generator, say). Of course [] is confusing for millions of reasons - it is used for so many things and has special parsing rules - like where does end come from in getindex.(a, 1:end)?

@andyferris I think that a nice way to deal with ambiguity would be to overload y.[x] such that if x is an array getindex.(y,x) is called; otherwise the argument is considered a tuple or enclosed in an array and getindex.(y,[x]) is called. I don't know how end is handled, so I'll leave that case to the gurus.

The type of indexing suggested in #2591 seems like a whole new brand of indexing, which should have a unique notation, such as x{inds...}, if it is permissible. .[] has the connotation of broadcasting.

I think I agree with @mbauman's assessment that broadcasting over the collection being indexed into is too confusing and not terribly common. The pointwise indexing operation described in #2591, otoh, seems fairly useful to me – I've certainly used ind2sub and sub2ind with linear indexing to accomplish this sort of thing before. These days we want to avoid linear indexing wherever possible, especially in generic code, so it would be good to have a syntax for this. The difference seems to be that this issue wants broadcasting getindex over the collection _and_ the indices whereas #2591 describes broadcasting the operation of indexing over just the indices. Not sure if that clarifies anything to anyone else. I'm inclined to support a version of #2591 that dovetails with broadcasting. What about having A.[X...] mean ((x...)->A[x...]).(X...), and whatever the corresponding definition for A.[X...] = ... would be.

Yeah, I think we're talking past each other with three different behaviors here. Correct me if I'm wrong:

  • #2591: A.[X...] => broadcast((x...)->A[x...], X...). This is also Numpy's advanced indexing.
  • @jecs: A.[X...] => map(a->a[X...], A). This indexes into each element of A with the given indices.
  • @andyferris: A.[X...] => broadcast(getindex, A, X...). Coupled broadcasting across both A and the indices.

I'm actually rather surprised that #2591 hasn't garnered more support, but perhaps you're right that folks have just been getting by with linear indexing. I don't think we need to bend over backwards to find a meaning for .[] if there's not an obvious one.

@mbauman @andyferris and I are referring to the same thing, namely the behavior that you assigned to him.

A.[1:N] --> getindex.(A, [1:N])

A.[1] --> getindex.(A, 1) or getindex.(A,[1])

A.[[1,2,3,4]] --> getindex.(A,[1,2,3,4])

I = ones(3,3); A.[cat(2,I,2I)] --> getindex.(A, cat(2,I,2I))

I want the singleton expansion that is offered with broadcast. map does not handle that.

I'm all for #2591. It was more out of left field when it was proposed, but these syntaxes are perfectly analogous:

f.(X...) => broadcast((x...)->f(x...), X...) # aka broadcast(f, X...)
A.[X...] => broadcast((x...)->A[x...], X...)

When you write f.(x...), f can't be a collection of functions, so it would be bizarre if A.[x...] expected A to be a collection of collections to index into.

I completely disagree. [] is an operation that you apply to A. f is an operator or function in its own right. A.[] means apply [] to its elements. Doesn't seem bizarre to me.

@StefanKarpinski from that perspective, broadcast((x...)->A[x...], X...) and broadcast(getindex, A, X...) are both pretty natural analogues of f.(x).

getindex.(A, i) is easy enough to write, however, that maybe this tips the balance in favor of #2591. Combined with the fact that Numpy implements #2591, which speaks to the utility of that form of indexing.

Here's the problem with that point of view though: the thing to the left of the . (f or A) does not play an analogous role in your proposed definition for A.[...], whereas in mine it does. In other words, in f.(x...) the value f is captured and not broadcast over, so by analogy in A.[x...], the value A should be captured and not broadcast over. The analogy you're trying to make is something like this:

A.[X...] => broadcast(getindex, A, X...) # aka getindex.(A, X...)
f.(X...) => broadcast(call,     f, X...) # implied by analogy ≠ real definition

Except that call isn't a thing anymore and we don't define f.(X...) that way. Which is why #2591 is actually the right analogous behavior. Not to mention that getindex.(A, X...) works already.

Indexing is a different beast from calling a function. It's like calling a function with an argument on the left and the other arguments on the right. I'm not sure I follow your logic. Basically, if indexing were done as [](A,inds) then you would be agreeing with me. Is A[sub2ind(S,inds...)] really more cumbersome than getindex.(A, inds) and setindex!.(A, f(getindex.(A,inds)), inds)?

Calling a function and indexing into an array are not all that different. In Matlab, APL and other languages, they even have the same syntax (which has its own problems but makes a certain amount of sense). Moreover, in Julia 0.4 f(x...) was overloadable and meant call(f, x...) just like A[x...] means getindex(A, x...), so they were not different at all. But details of how indexing and calling happen to be implemented don't matter since the argument that A.[X...] should mean broadcast((x...)->A[x...], X...) is _syntactic_ – because that's the appropriate level of reasoning for a syntactic constructs and it's the level of reasoning people intuitively apply to syntax.

As it happens, getindex.(A, X...) already works and is efficient and unambiguous. On the other hand, A[sub2ind(A,X...)] is inefficient and terribly opaque unless you're already steeped in the ways of Matlab and ugly functions like sub2ind (any converter with a "2" in its name is suspect). The pointwise indexing described in #2591 is called "advanced indexing" in NumPy, and we know it's generally considered to be useful, and it's very hard to emulate efficiently using only sub2ind, which is also bad practice for generic arrays.

This seems perfectly compelling to me:

f.(X...) => broadcast((x...)->f(x...), X...) # aka broadcast(f, X...)
A.[X...] => broadcast((x...)->A[x...], X...)

I'll withdraw my objection in the interest of moving forward. I assume this is also nice for A.[x...] .= ... (i.e. broadcast!? (And is it .= or =?)

The problem is that [] is sugar and . is sugar, and the "order of desugaring", if you will, is not commutative, so in either case someone might be confused and a choice must be made, plus @StefanKarpinski's arguments seem correct. (At the very least, it was nice to be down to two options! Other things have been suggested for .[] before).

@StefanKarpinski You make a compelling case. I'm almost fully convinced that your approach is the way to go (not that it really matters). Though... this example that I'm thinking of is nagging at me, and I'd like to hear your thoughts on it.

If we assign .[] to pointwise-indexing then we are effectively "pigeon-holing" indexing. Say I have the following datatype.

x::Array{Array{Matrix2x2},2},1}

I would like to access the diagonal elements of certain Matrix2x2s at the bottom-most level.

If we were to assign {} to pointwise indexing and .[] to broadcast indexing, then a natural extension to pointwise indexing would be .{}. Then, the operation that I would like to carry out would be as easy as

x[indices1].{indices2}.{:,:} or x[indices1].[indices2].{:,:}

(I know that this could be done with diagind, but that is just a byproduct of the example that I have chosen.)

Using {}, .[], .{} makes it easy to chain indexing operations via broadcasting. However, if .[] is made to correspond to pointwise indexing, then there does not seem to be an elegant way of broadcasting pointwise indexing similar to the manner that I have shown. This is what I mean by pigeon-holing indexing. In order to perform the operation that I have proposed, then we would have to use for-loops, or chain maps, resulting in verbose and ugly code. What attracts me to Julia is its elegance, and my solution seems more elegant (at least to me :-)).

Thoughts?

If we assign .[] to pointwise-indexing then we are effectively "pigeon-holing" indexing.

Yup. On the other hand, if we assign A.[X] to getindex.(A,X), then we're limiting the syntax to nested collections of arrays. Now, it looks like you use them pretty heavily, but in my uses, arrays of scalars are a much more common data structure.

I typically only use collections of arrays when I don't know if all the arrays will be the same shape. Otherwise, I'll just use one higher dimensional array. And if the shapes of the arrays aren't all the same, I'm not sure I'd be able to index into them in a homogenous manner.

There's no limit to the number of short-hand syntaxes that one can invent. The key is to find the middle-ground where it's both concise and understandable.

@jecs Collections of collections are definitely tricky. I've been tempted to think about what ..() means and so on, but at some stage surely the short-hand syntax will degrade, rather than improve, code clarity (as opposed to explicit broadcast calls, or loops).

@mbauman wrote

I typically only use collections of arrays when I don't know if all the arrays will be the same shape.

_Typically_, yes, but do you like it? I thought this was a pragmatic way of handling the fact that Vector{Vector{T}} is less efficient of a data structure than Matrix{T}. What I have found improves code clarity a lot is using Vector{SVector{N,T}} (where SVector is an immutable from _StaticArrays.jl_) in the case where the length N is known (e.g. positions in 3D space - you can have a vector of 3D points, which is a more precise abstraction than a 3xM matrix).

Granted, sometimes N is uniform but unknown to the compiler.

Either way here, I don't think we'll be able to allow fusion through to a computation on the indices. I'm using the definition that A[X] := broadcast(x->A[x], X) here, but that doesn't really matter for this problem.

One of the things I would expect to work is A.[A .> 0]. Clearly the naive broadcast(x->A[x > 0], A) won't work. Or worse: A.[A.[:, 1] .> 0, :] ./ maximum(A). This is a problem for all indices that aren't integers or collections thereof (logical arrays and colons, by default… and extensible to more).

I think we'll have to compute the indices first and then broadcast. This means that A.[A.[:, 1] .> 0, :] ./ maximum(A) is:

inds1 = to_indices(A, (:, 1))
inds2 = to_indices(A, (broadcast((I...)->A[I...] > 0, inds1...), :))
broadcast((x,I...)->A[I...] / x, maximum(A), inds2...) # ERROR!

But that's wrong, too! I didn't realize it when I came up with the toy example, but A.[mask, :] is only well-defined when sum(mask) == size(A, 2) since both mask and : expand to vectors that must be able to broadcast to a common size. It's confusing since regular old indexing _already_ does a slightly different form of broadcasting where it first orthogonalizes all non-scalar indices to be perpendicular to each other and then calls the scalar getindex across the product.

This syntax would be useful for JuMP. The use case I'm imagining would just require broadcasting over the indices and not the collection.

Hi, just here to drop a comment that today I tried

p = v..x

where p was an array of objects that had field x, and I was sorely disappointed. I've been using

p = getfield.(v, :x)

entirely too long, and anyone who looks at my code and isn't super familiar with Julia has no idea that [] and . are just syntactic sugar for getindex() and getfield() respectively (and sometimes broadcast()). At least with only an extra dot, I can just say "when you see a dot that you don't expect, take that to mean it's vectorizing the operation". With the current syntax, I have to explain both, as well as what is going on with the :x. This isn't to say the language should decide on something just for newcomers, but if we are going to have the generalized dot syntax, there has to be some way (even if it isn't .[] and ..) to use it on getindex() and getfield() without breaking out the full function names and field symbols. Just my two cents.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

StefanKarpinski picture StefanKarpinski  Â·  3Comments

sbromberger picture sbromberger  Â·  3Comments

Keno picture Keno  Â·  3Comments

felixrehren picture felixrehren  Â·  3Comments

i-apellaniz picture i-apellaniz  Â·  3Comments