I would prefer that range indexing of an `Array` (e.g. `X[1:10, :]`) created a `SubArray` (a view/reference), not a copy. This seems more in the spirit of Julia's pass-by-reference semantics, and would eliminate some of the performance gotchas with range indexing.
It might also make future loop-devectorization optimizations easier, because subarray references can be devectorized back into references to the original array without worrying about changing the semantics.
It would reduce Matlab compatibility, but we already diverge with pass-by-reference semantics, so I doubt many users will be surprised to find the same reference semantics for range indexing.
This is something that has come up informally several times on the mailing list, but I didn't see any issue for it.
Thanks for making the issue. We should definitely do this.
+1
At the moment, the issue is not that it would reduce Matlab compatibility, but that it would almost certainly reduce performance: for many operations, SubArrays have nowhere near the performance of Arrays. #3503 will help when it can be merged (but there are a few "deep" issues that need to be addressed first).
@timholy, there is no technical reason (as far as I can tell) why accessing a SubArray should not generate the same performance (and in fact, almost exactly the same code) as accessing the same data directly in the original Array. If that's not the case now, then this is a performance bug that should be fixed, but I see that as a somewhat orthogonal issue to this one.
(There will be cases where it is advantageous for cache reasons to make a contiguous copy of a subarray that is discontiguous in memory, but of course the user can always call `copy` explicitly in that case.)
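In today's syntax, that explicit opt-in might look like this (an illustrative sketch, not code from the discussion):

```
A = rand(1000, 1000)
v = view(A, 1:2:999, :)   # strided view, discontiguous in memory
c = copy(v)               # materialize a contiguous Array when repeated
                          # passes make cache locality worth the copy
```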
As @timholy mentioned, we are working on an improved version of SubArray; the current implementation is way too slow.
However, the main obstacle is performance degradation. Even a very simple immutable wrapper over `Array` can lead to a 3x-5x slowdown, because the wrapped accesses often do not get inlined. I believe Jeff has been working on improving the compiler, and hopefully such issues will be resolved when that work is done.
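A sketch of the kind of thin wrapper in question, in current syntax (the type name and methods here are hypothetical, not the actual SubArray implementation):

```
# Ideally w[i] should compile to the same code as w.data[i]; the 3x-5x
# slowdown mentioned above comes from calls like this not being inlined.
struct Wrap{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(w::Wrap) = size(w.data)
Base.getindex(w::Wrap, i::Int) = w.data[i]
```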
@stevengj, there are two issues. Dahua mentioned one: inlining the wrapper.
The other principal hurdle is linear indexing, which is heavily used. If `A` is a 10x20 `Array`, I can access the memory location of each element using a simple for loop. But if `A` is a 10x20 `SubArray` of a 27x43 `Array`, then I need to do something more complicated. These operations have never been as well optimized as they could be.
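A sketch of the extra index arithmetic, in current Julia (illustrative only, assuming column-major layout and the sizes above):

```
parent = zeros(27, 43)
V = view(parent, 3:12, 5:24)    # a 10x20 SubArray

# For a plain Array, element i lives at linear offset i. For the view,
# i must be split into a (row, col) pair and re-offset into the parent:
function linear_to_parent(i)
    r = (i - 1) % 10 + 1        # row within the view
    c = (i - 1) ÷ 10 + 1        # column within the view
    (r + 2) + 27 * (c + 3)      # linear index into the 27x43 parent
end

linear_to_parent(1) == LinearIndices(parent)[3, 5]   # true
```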
Heck, even getting good performance from `Array`s for operations like `getindex` took some work, e.g., `make_arrayind_loop_nest`. I never got around to doing that for SubArrays, although in principle it's a pretty small change. The issue, in part, is that you might need something like this even for "simple" operations like taking the sum of all elements of the subarray, whereas with arrays you can confine such complexity to a few operations.
@StefanKarpinski has been doing some work on the linear indexing problem, hopefully this will help.
Regarding inlining, that is a compiler issue; there is no technical reason why subarray accesses could not be inlined with zero overhead. It seems like a bad idea to decide fundamental language semantics based on temporary performance issues.
Regarding linear indexing, exactly the same issue applies to looping over a portion of the original `Array` directly. As I said, there are cases where one would want to make a contiguous copy of a subarray for performance reasons; the appropriate performance comparison here is between looping over a portion of an `Array` versus looping over a `SubArray` abstraction of the _same memory locations_.
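Concretely, the comparison is between something like these two loops (an illustrative sketch):

```
B = rand(27, 43)

# Looping over a region of the parent directly...
s1 = 0.0
for j in 5:24, i in 3:12
    s1 += B[i, j]
end

# ...versus looping over a SubArray of the same memory:
s2 = sum(view(B, 3:12, 5:24))
s1 ≈ s2   # true; both touch exactly the same elements
```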
> Regarding inlining, that is a compiler issue; there is no technical reason why subarray accesses could not be inlined with zero overhead. It seems like a bad idea to decide fundamental language semantics based on temporary performance issues.
Absolutely. We're all for range slices producing views. These performance issues are clearly surmountable.
The linear indexing issue with subarrays was actually one reason that I started to work on the `broadcast` stuff. I think that with the proper broadcast primitives, we could handle most interesting uses of linear indexing almost as efficiently and in a more powerful way (though generating a bit more code than the linear indexing approach). But we're not there yet.
I strongly agree with having a good design and improving the compiler to match it. I know it will be possible to generate much better code for subarrays than we do now.
Is this already implemented in @lindahua's ArrayViews branch? #5556
Since we're in the chaos period, I'd be down to just rebase that branch and merge it as soon as @lindahua gets a chance to do so.
We can add it, but we can't remove SubArrays until the AbstractArray problem is solved. That's basically waiting on stagedfunctions.
To answer @quinnj's question: no, #5556 provides the infrastructure to make this happen, but the `getindex` methods still have to be adapted.
Is this issue the reason why the v0.5 release will be called "Arraymageddon"?
One of several.
@stevengj: Since you initiated this issue, it would be very interesting to hear your opinion on the recent discussion of the topic in https://github.com/JuliaLang/julia/issues/13157
Current plan is to revisit this after all immutables can be inlined. See #9150
I'm pretty sure this was decided against, and the thought now is to keep returning copies but with a more convenient syntax for views (`@view`).
@KristofferC, is there a discussion/rationale leading to this decision available somewhere online? From several issues here in this repo I got an impression that people were generally in favor of returning views by default as soon as some general performance issues are fixed. I am really curious as to why the opposite decision was finally made.
Probably some useful discussion in #7941 and #13157. We also have `@views` now and dot-fusion, so there really isn't much need for it.
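For reference, the opt-in looks like this (a minimal sketch in post-1.0 syntax):

```
A = rand(100, 100)
c = A[1:10, :]           # default semantics: allocates a copy
v = @view A[1:10, :]     # a SubArray sharing A's memory
@views w = A[1:10, :]    # @views rewrites every slice in the expression
v[1, 1] = 0.0            # writes through to the parent
A[1, 1] == 0.0           # true
```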
@timholy, thanks. I knew about `@views` (and dot-fusion seems orthogonal to this); it's just that I come from `numpy`, and almost never needed copied (or even mutable) slices in my numeric code, so the decision seemed a bit counter-intuitive. In fact, I only discovered that this is the case recently, when I got an unexpected performance drop from using slices; before that I was under the impression that slicing works just as it does in `numpy`.
There are two major issues:

- Aliasing: views share memory with the parent array, so code that mutates the result of an indexing expression would now mutate the original (a copying `getindex!` into a preallocated buffer addresses some of the same performance concerns without this hazard).
- Churn: everyone would need to audit their indexing code, adding `copy(X[Y])` to see if they want to opt back out to retain their original performance.

There'd need to be a really compelling story for it to be worth the churn. I used to think it'd be more compelling.

As it stands, every single indexing expression in Julia can be performed as a view. You can opt in to using views over large sections of code with `@views begin … end`. Even entire modules can be written this way:
```
module M @views begin

# ... entire module body here: every slicing expression in it produces a view ...

end end
```
> and dot-fusion seems orthogonal to this
One of the best reasons for returning views was to minimize the amount of copying in complex expressions; fusion is really a better solution to the same problem.
One thing that could help there is hammering out a definition for `A.[X]`, so indexing syntaxes can also participate in fusion.
The options seem to be:

- `A.[X] := getindex.(A, X)` (broadcast `getindex` over both `A` and `X`)
- `A.[X] := (x -> A[x]).(X)` (treat `A` as a scalar and broadcast over the indices `X`)

I've found myself wanting both on occasion. IIRC, there's a discussion about this somewhere.
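The two readings can be written out with today's syntax (`A.[X]` itself is not implemented; this is only to make the distinction concrete):

```
# Option 1: broadcast getindex over both arguments; natural when A is
# a container of containers:
A = [[1, 2], [3, 4]]
X = [2, 1]
getindex.(A, X)       # == [2, 3], i.e. [A[1][2], A[2][1]]

# Option 2: treat A as a scalar and broadcast over the indices
# ("fancy indexing" of a single array):
B = [10, 20, 30]
(x -> B[x]).(X)       # == [20, 10], i.e. [B[2], B[1]]
```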
Yes, as have I. I've wanted each of them (to the exclusion of the other) within a few lines of the same code when working with nested JSON-like data. #19169.
It tends to only be a performance win when you're slicing inside Matlab/Numpy style vectorized inner loops. In some other cases it can be a huge performance loss.
To clarify my usage pattern, I'm talking about something like this (a stencil-type computation, essentially). Omitting `@views` leads to roughly fivefold worse performance, which is what I would expect for large arrays, since creating a view is a constant-time operation in the size of the array, whereas copying is linear. Of course, the views in my example contain large contiguous parts; perhaps the situation is different for sparse ones.
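Something along these lines (a hypothetical minimal stencil, not the original code):

```
# Three-point moving average; without @views each slice on the right
# would allocate a fresh temporary array on every call.
function smooth!(out, x)
    @views out[2:end-1] .= (x[1:end-2] .+ x[2:end-1] .+ x[3:end]) ./ 3
    return out
end

x = rand(10^6)
smooth!(similar(x), x)
```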
> One of the best reasons for returning views was to minimize the amount of copying in complex expressions; fusion is really a better solution to the same problem.
I guess you can look at it that way. The problem is that loop fusion and views prevent the creation of temporary arrays resulting from two separate expression types: arithmetic and slicing respectively. If fusion worked for the latter type of expressions too (e.g. as @mbauman suggested), it would indeed be a better solution to this problem.
I find myself wanting to use this pattern all the time: `a .= b[:,i]`. But it seems that the right way to do this is `a .= view(b,:,i)`, which reads a bit less clearly (besides not being able to use `end`).
```
a = zeros(1000)
b = zeros(1000,10)

function f1(a,b,i)
    a[:] = b[:,i]       # slice allocates a copy, then assigns into a
    nothing
end

function f2(a,b,i)
    a .= b[:,i]         # slice still allocates a copy before broadcasting
    nothing
end

function f3(a,b,i)
    a .= view(b,:,i)    # no copy: broadcasts straight from the view
    nothing
end

@time f1(a,b,1)   # 0.000011 seconds (6 allocations: 8.109 KiB)
@time f2(a,b,1)   # 0.000006 seconds (6 allocations: 8.109 KiB)
@time f3(a,b,1)   # 0.000002 seconds (4 allocations: 160 bytes)
```
Preserving the semantics of copying on slicing, could the compiler optimise away the copy in `a .= b[:,i]`? That would seem logical for a dot-operation, barring further use of the RHS through the return value.
I find the idea of a module-wide `@views begin ... end` dangerous, as it makes the semantics modal.
It is entirely possible that the compiler could at some point recognize that it doesn't actually need to make a copy in these cases. We've also considered the syntax `b@[:,i]` for view slices, which would make the awkwardness difference considerably smaller. Some people do hate that syntax, though.
```
@views function f2(a,b,i)
    a .= b[:,i]
    nothing
end
```

isn't so bad, and that way you don't have to put it on the entire module. Optimization would, of course, be better.
`@views` on a function is more acceptable already, though I agree that optimisation would be better.
`b@[:,i]` sounds fine to me for times when the intent is to create a view that will be used as such later in the code, though I don't mind `view()` in that case either. But doesn't the dot syntax already imply that one wants to use views?
From the doc:

> If the left-hand side is an array-indexing expression, e.g. `X[2:end] .= sin.(Y)`, then it translates to `broadcast!` on a view, e.g. `broadcast!(sin, view(X, 2:endof(X)), Y)`, so that the left-hand side is updated in-place.
How about a similar rule for the RHS, e.g. indexing produces a view whenever the array is an operand of a dot call?
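For comparison, the existing left-hand-side rule already avoids a temporary (a minimal sketch); the proposal is to extend the same treatment to slices on the right:

```
X = zeros(10)
Y = rand(9)
X[2:end] .= sin.(Y)   # lowers to broadcast! into a view of X; no LHS copy

# Proposed: inside a dot-expression, RHS slices would also become views,
# so this would stop allocating a temporary for b[:, i]:
# a .= b[:, i]
```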