This issue supersedes the 0.4 work towards array nirvana (#7941), and will track the issues we aim to complete during 0.5 and beyond — now updated through work on 0.7. This is an umbrella issue and will track specific tasks in other issues. Please feel free to add things that I've missed.

_Required underlying technologies_

- Julia native bounds checking and removal (#7799). Several tries have been made at this, but the current plan of action is to make `@inbounds` elide code blocks hidden within an `@boundscheck` macro, propagating down only one level of inlining (https://github.com/JuliaLang/julia/issues/7799#issuecomment-117362695). This is a strong requirement for the subsequent steps. (Implemented in #14474.)
- ReshapedArrays (#10507). Requires better performance: https://groups.google.com/d/msg/julia-dev/7M5qzmXIChM/kOTlGSIvAwAJ
- Change `sub` behaviour to `slice` (#16846).

_Breaking behavior changes_

- Drop dimensions indexed by a scalar (https://github.com/JuliaLang/julia/issues/4774#issuecomment-81228816).
- Return slices as views. A first attempt at this was at #9150. Special attention may be needed for BitArrays.
- Flip the switch on the concatenation deprecation (#8599).

_New functionality_

- Allow expression of varargs of defined length (#11242). This allows us to take full advantage of #10525.
- Transpose returns a covector or transpose type (https://github.com/JuliaLang/julia/issues/4774#issuecomment-81228816).
- Ditch special lowering of Ac_mul_Bt; use dispatch instead.
- Dimensions indexed by multidimensional arrays add dimensions (full APL-style: the dimensionality of the result is the sum of the dimensionalities of the indices).
- Allow any index type in non-scalar indexing (#12567): restrict scalar indexing to `<: Integer` and widen non-scalar indexing to `<: Union{Number, AbstractArray, Colon}` (https://github.com/JuliaLang/julia/pull/12567#issuecomment-170982983).
- Convert all indices to `Int` or `AbstractArray`: #19730.
- Easier creation of immutable arrays with tuples and #12113.

_Other speculative possibilities_

- A mutable fixed-size buffer type, which would allow for a Julia-native `Array` definition (#12447); this type could also be used for I/O buffers and string storage.
- Base `IndexSet` (`IntSet`?) on `BitArray` or perhaps any `AbstractArray{Bool}`.
- Rework non-scalar indexing to prevent calling `find` on logical arrays and simply wrap them with an `IndexSet` (or a `LogicalIndex` instead? #19730).
- Negated indexing with a complement `IndexSet` (https://github.com/JuliaLang/julia/issues/1032) or a special `Not` type? (Perhaps in a package: https://github.com/mbauman/InvertedIndices.jl)
- Remove indexing into a type as syntax for typed arrays: `T[...]` is kind of a bad pun. This syntax is especially bad since some variations are parsed as indexing into a type while others are parsed as special forms (typed hvcat, typed comprehensions).
- A `@sub`/`@view` macro (#16564).

This looks like a great list Matt, thanks. I'm a little scared of the fallout but it'll be a huge leap forward for the language.
Yes, great list! Let's roll up our sleeves!
Are there any pieces of this that are unclaimed? I'd like to help, but I
don't want to step on anyone's toes.
It's a phenomenal list. There are actually only a few changes that are very breaking, which is nice, but those are significant enough.
There is definitely more than enough work to go around! I don't think any of these tasks are claimed.
Things like dropping scalar dimensions are rather simple changes, but will take a lot of work finding and fixing bugs... and that part is easy to collaborate on. Same goes for views (if you ignore the perf issues with ReshapedArrays and inbounds). Anyone is welcome to dig in!
Views is hard, you have to make it through bootstrap without :hocho: yourself in the :eyes:.
Having just done a bunch of work to get string changes through bootstrap, I'm inclined to believe this.
Thanks for doing this @mbauman, so much to digest
I've added "Remove default no-op behavior for (c)transpose" as an item. I expect much complaining, but as we've discussed before, it's simply wrong to assume that `<:Any` is a scalar, and the logic error rears its head every time one tries to wrap and/or implement custom array/matrix types. cc @jakebolewski @andreasnoack
I think we need to think through the options for that carefully. It's pretty idiomatic to write `A'` to transpose a non-complex matrix.
Isn't it possible that the (c)transpose wrapper will solve this issue? There's a lot of design work that will need to go into it, but:
transpose(A::AbstractVectorOrMatrix) = TransposeType(A) # Will call `transpose` upon indexing, too
transpose(::AbstractArray) = error("cannot take transpose of 3+ dim array") # yeah, error methods aren't the best…
transpose(x) = x
related: I think some of the typing problems in linalg (and transpose behavior) come from the fact that we represent a blocked linear operator as an array of arrays. We may want to switch to a new type that knows of the various sizes inside it for that; I remember discussing that with @andreasnoack. 0.5 might be a time to at least think about it.
Isn't it possible that the (c)transpose wrapper will solve this issue?
Maybe; we'd have to think about it.
The blocking issue last time is that transpose has to be recursive to handle cases like `Matrix{Matrix}` (example: `A = [rand(1:5, 2, 2) for i=1:2, j=1:2]; A'`) correctly, but people want to write `A'` on 2D arrays of non-numeric types (e.g. images, as `Matrix{<:Colorant}`) and expect the transpose to not apply to the scalar elements. The no-op `transpose(x::Any)` method exists to handle these cases. However, this definition conflicts with matrix-like objects, which have the algebraic semantics of matrices but are not stored internally in any array-like form, and hence by #987 should not be `AbstractArray`s (`QRCompactWYQ` is the poster child, but we have many such examples). If you introduce a new matrix-like type, you have to explicitly define `(c)transpose`; otherwise you get the no-op fallback, which is a source of many bugs.
To be clear, the behavior we would break explicitly is the claim (which you can find in the help for `permutedims`) that

Transpose is equivalent to permutedims(A, [2,1]).

This equivalence makes no sense for types that are not `AbstractArray`s, or for `AbstractArray`s of non-scalars, and we actually do have matrix-like types that need this more abstract sense of transpose.
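For concreteness, here is a sketch of why block permutation alone is not a transpose, using only plain `permutedims` and `map` (no proposed wrapper type implied):

```julia
# A 2x2 block matrix stored as a Matrix of Matrixes.
A = [rand(1:5, 2, 2) for i=1:2, j=1:2]

# Permuting the outer dimensions only moves the blocks around;
# a mathematically correct transpose must also transpose each block.
At_blocks  = permutedims(A, [2, 1])                  # not a true transpose
At_correct = map(transpose, permutedims(A, [2, 1]))  # recursive transpose
```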
I think assuming that a `Matrix{Matrix}` will automatically recursively transpose the elements is bad and dangerous. I'd rather see a special type `BlockMatrix{Matrix}` that does what you're looking for.
@tbreloff that's exactly my point. Most people think it's odd that transpose should be recursive, but it _must_ be to be a mathematically correct transpose, and that disquiet exposes corner cases where transposition is not simply `permutedims(A, [2,1])`. (Although it's true that `Matrix{Matrix}` is not really a blocked matrix type, because there are absolutely no guarantees that the inner matrices have dimensions that are consistent with any partitioning of a larger matrix.)
Ah, yes, I forgot about all the Tensor-like objects that aren't AbstractArrays. No matter what happens, either the authors of Tensor-like objects will need to communicate to Julia somehow that they're not scalars (sometimes being an AbstractArray works, but not always), or the authors of Scalar-like objects will need to do the reverse (sometimes being a Number works, but not always), or both. This same sort of scalar-or-not question rears its head all over the place… e.g., indexing: https://github.com/JuliaLang/julia/pull/12567.
Right now we require a mishmash of method specializations for the tensors with some scalar-like fallbacks. This means that some get forgotten and we end up with scalar methods getting called, returning the wrong result.
Since we can't communicate this through supertypes (https://github.com/JuliaLang/julia/issues/987#issuecomment-31531602), I think it's gotta either be through better documentation of the required methods or having those methods explicitly encoded into a traits-based system. And if we remove all fallbacks (which ensures correct behavior at the cost of missing methods for everyone), I think we need to make this as simple as possible with traits.
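To make that concrete, one rough sketch of what such a traits-based encoding could look like (hypothetical names, 0.4-era syntax; just one of several possible designs):

```julia
# Hypothetical trait marking whether a type acts as a scalar or as a
# tensor for indexing/linalg purposes. All names here are illustrative.
abstract IndexBehavior
immutable ScalarLike <: IndexBehavior end
immutable TensorLike <: IndexBehavior end

indexbehavior(x) = ScalarLike()                 # conservative default
indexbehavior(x::AbstractArray) = TensorLike()  # containers opt in explicitly

# Operations dispatch on the trait value instead of relying on a
# no-op `transpose(x) = x` fallback:
mytranspose(x) = mytranspose(indexbehavior(x), x)
mytranspose(::ScalarLike, x) = x
mytranspose(::TensorLike, x) = permutedims(map(mytranspose, x), [2, 1])  # 2-d case only
```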
+1
I really think we should start enumerating traits for AbstractArray's. Linear indexing seems to have been an amazing example of the power of a few careful design decisions involving traits.
One possible solution would be to keep `AbstractArray <: Any`, and introduce `AbstractNonArrayTensor <: Any` alongside it. Everything else would be considered "scalar" as far as indexing and linear algebra are concerned.

Note that this is distinct from, and much more well-defined than, an `Atom` vs. `Collection` distinction (https://github.com/JuliaLang/julia/pull/7244#issuecomment-46333261); `A[:] = (1,2,3)` and `A[:] = "123"` behave very differently from `A[:] = 1:3` for `A = Array(Any, 3)`, as they should.
I really think we should start enumerating traits for AbstractArray's.
@johnmyleswhite By any chance did you mean language supported "traits"? That's one thing I've really wanted to see in the language since JuliaCon.
Yes, I meant language supported traits.
Do you have any ideas/suggestions about how traits could be added to Julia, syntactically? They would be very useful for arrays, strings, encodings, at the very least.
I prefer leaving those decisions to Jeff rather than speculating.
Given that this is an umbrella issue, it would be nice to discuss specific items in their own issues.
I do think though that having traits in the language might substantially change the design for Arrays, which is why discussion of traits, at least in the area of how they could be used for better Array abstractions, would be useful.
Please move traits discussion to #5, and the (c)transpose issue to #13171.
I don't think the syntax for traits needs to be figured out before figuring out what traits are important for arrays. In fact, having more examples of traits that we actually need is excellent for helping design the language feature.
Ok, good point, as long as traits are being thought of as part of the design.
Upper and lower for triangular matrices? Those seem like good candidates for being done as traits to me.
Added speculative item for replacing indexing into a type as syntax for typed arrays.
I would prefer not to move sparse arrays out of base in 0.5. I think we should aim to have SparseVectors in Base for 0.5, and make sure that there is one really high quality default implementation.
As we move some of the other functionality out of base, we can move out the various sparse solvers first. If alternative sparse data structures are shaping up nicely at that point in JuliaSparse, we can eventually move SparseMatrixCSC there too.
How about a small housecleaning task here as well - move all the array code into its own subdirectory and module in Base?
There are a bunch of things marked as milestone 0.5. I would like to suggest that the issues referenced here should be marked as `0.5 blocker`, and the rest that are 0.5 are essentially `0.5 nice to have`. We could then have a 1-2 week time period for accumulating a complete list of `0.5 blocker` issues, and release 0.5 as soon as the blockers are all done. Any nice-to-have issues that get fixed in the same timeframe will make their way into 0.5, but the release won't block on any nice-to-have feature that is not ready.
I'd like to add a few additional possibilities. I'm extremely excited about Arraypocalypse!

First, I think it would make sense to have operators for tensor sum and tensor product. These could be `++` and `**`, or could be the unicode symbols `\otimes` and `\oplus`. Tensor product can replace `kron()`. Tensor sum gives concatenation of vectors and, as an aside, it is a great description of string concatenation (there have been many, many conversations about this, and the core Julia developers seem worried that `+` corresponds to the incorrect monoid here - and so went with `*` because it possibly seems less commutative - while I would strongly argue the correct operation is the tensor sum `\oplus`, which is also not commutative). These can also be used for higher-order arrays/tensors (see below) and perhaps (I'm speculating here) relational algebra operations on DataFrames?

Second, it has occurred to me that Julia may be really, really good at doing multi-linear algebra with extremely nice syntax. I have been playing with a package for assigning names to each index of an array as a part of the type using a `Tuple{...}` type. An array is turned into a tensor by indexing it like `A[Tuple{:index1,:index2}]`, and if Julia 0.5 uses `{}` for `Tuple{}` then it's much improved to `A[{:index1,:index2}]` and we get the following awesomeness:

- `A[{1,2,3}] = B[{3,2,1}]`
- `A[{:a}] = B[{:b,:a}] * C[{:b}]` (contracts over the `:b` index, so `A = B' * C`).
- `A[{:a,:b}] = B[{:a,:c,:d}] * C[{:c,:e}] * D[{:d,:f}] * E[{:e,:f,:b}]`. Brackets can be used to choose the fastest contraction ordering.

The permutation syntax may solve another issue - `transpose` is a linear algebra thing, but since it is so simple to type `'` it is used instead of `permutedims` to, e.g., reflect an image (a 2D array of pixel values). These different use cases seem to be causing an issue as to whether `transpose` is called recursively, but I would argue this "tensor" syntax for `permutedims` is clear enough for use in general programming and data/array manipulation.
I'm not sure if people would like this approach to multi-linear algebra (it relies on generated functions quite a bit) but I would welcome feedback!
Could you elaborate a bit on tensor sum and vector (and string) concatenation? I think it would be great to have a "generic" concatenation operator, which the mathematicians would also like.
The tensor sum (or direct sum) of v = [v_1,v_2,v_3] and w = [w_1,w_2,w_3] is written v \oplus w = [v_1,v_2,v_3,w_1,w_2,w_3]. (Imagine I'm using LaTeX - \oplus and \otimes work in the Julia REPL already and look like + (or times) drawn with a circle around them.)
The tensor/direct sum truly is the mathematical way of concatenating vectors (like `vcat`), and I would suggest strings also. For matrices, A \oplus B is the bigger, block-diagonal matrix [A 0; 0 B] (and is a good candidate to be represented by the `BlockedArray`s mentioned further above). For transposed (row) vectors, the tensor sum is again just the concatenation given by `hcat`, so this would have to be treated differently depending on whether you had a row vector or a 1-by-n matrix (which hopefully should be OK in Julia 0.5).

So if we think of strings as vectors of characters, then direct/tensor sums are the way to go if the core developers are going to use words like "monoid" as a guiding syntax principle. Can string manipulation be thought of as a monoidal category where parallel arrows are concatenations?? Certainly, multilinear algebra over real (or complex) numbers is given by a (dagger) symmetric bi-monoidal category where tensor/direct sum (\oplus) and tensor/outer/direct product (\otimes) are the two "monoids".
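A sketch of the two cases described above, with a hypothetical `⊕` defined as an ordinary infix operator (nothing like this exists in Base; purely illustrative):

```julia
# Hypothetical direct-sum operator: concatenation for vectors,
# block-diagonal embedding for matrices.
⊕(v::AbstractVector, w::AbstractVector) = vcat(v, w)

function ⊕(A::AbstractMatrix, B::AbstractMatrix)
    T = promote_type(eltype(A), eltype(B))
    C = zeros(T, size(A,1) + size(B,1), size(A,2) + size(B,2))
    C[1:size(A,1), 1:size(A,2)] = A          # top-left block
    C[size(A,1)+1:end, size(A,2)+1:end] = B  # bottom-right block
    return C
end

[1, 2] ⊕ [3, 4]          # == [1, 2, 3, 4]
ones(2,2) ⊕ ones(1,1)    # 3x3 block-diagonal matrix [A 0; 0 B]
```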
@andyferris please open a new issue or PR for tensor concatenation. This issue is long and complicated enough without throwing in more features.
The relationship between multidimensional arrays and multilinear algebra was discussed in the APL literature of the 1970s, notably by Trenchard More and others working on APL2.
Yes, this deserves a new issue, this is too much of a derailment from the main work here.
(@andyferris, see also https://github.com/Jutho/TensorOperations.jl, https://github.com/mbauman/AxisArrays.jl, https://github.com/timholy/AxisAlgorithms.jl.)
I'd like to share with y'all some code I did a while ago that I think hits one point in the design space for array operations that reflect locality and support sparse as first class. Porting it over would require traits, I suspect, but you may find some of the ideas useful: https://github.com/wellposed/numerical/tree/master/src/Numerical/Array/Layout (the layout modules specifically).
Thanks @timholy for those links; I wasn't aware of the last two.
How about a small housecleaning task here as well - move all the array code into its own subdirectory and module in Base?
@ViralBShah That sounds like a very good idea - also move array unit tests so there is a nice 1-1 correspondence between implementation and test files.
Should we get the ball rolling here by flipping the switch on the concatenation deprecation (#8599)?
+1
@mbauman has a branch where most of the transpose changes have been implemented. @jiahao will correct me if I'm wrong here, but the new behavior of transpose can be separated into a matrix transpose and a vector transpose, where the former is less controversial. So, to make some progress here and start the process of adjusting code, I'm proposing that we start with the matrix transpose type only and postpone the vector transpose a bit. @mbauman Do you think that would be feasible, and do you have the time to make this separation if we decide that it's a useful way to proceed?
Implementing the transpose types themselves is rather simple — just a few dozen LOC. And the matrix and vector transpose types themselves aren't intrinsically coupled (except that making matrix transposes views without changing vectors would be even more strange than the status quo).
The most difficult part about the direction I headed in mb/transpose is that I _also_ tried to remove the special lowering to Ax_mul_Bx. That part is a ton of work, with each operator containing hundreds of permutations of transpose types, special matrix types, and BLAS-compatible element types. I didn't quite make it to the end of the tunnel for `mul`, and I didn't even start tackling the other operators in that branch. Removing the special lowering is also the point where we need to make a decision about the vector transpose.

That said, there's no need for these two changes to be coupled to each other. The transpose type _allows_ for the removal of the Ax_op_Bx special lowering steps, and even simplifies some of that mess, but removing Ax_op_Bx isn't necessary in order to introduce transpose views. I think that the path forward here is to make smaller steps. While it was very informative to try tackling both together (it works! and it can make things simpler!), it's too much to bite off at one time.
I think it should be possible to add simple fallbacks, e.g., `*(A::Transpose, B::AbstractArray) = At_mul_B(untranspose(A), B)`, in order to get transpose views working with the current system without touching hundreds of methods. My time is very limited these days, so I cannot promise I'll find time to do this myself. But I think this could be done in a relatively small PR. Anyone is welcome to pick up the torch here.
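For anyone picking up that torch, a minimal sketch of the shape such a fallback could take (the `Transpose` and `untranspose` names are stand-ins for whatever mb/transpose actually defines; 0.4-era syntax):

```julia
# Lazy transpose wrapper plus one fallback method; element-type
# subtleties (recursive transpose) are glossed over for brevity.
import Base: size, getindex, *

immutable Transpose{T, S <: AbstractMatrix} <: AbstractMatrix{T}
    parent::S
end
Transpose(A::AbstractMatrix) = Transpose{eltype(A), typeof(A)}(A)

untranspose(A::Transpose) = A.parent
size(A::Transpose) = reverse(size(A.parent))
getindex(A::Transpose, i::Int, j::Int) = transpose(A.parent[j, i])

# Route the wrapper through the existing Ax_mul_Bx machinery instead
# of rewriting hundreds of specialized methods:
*(A::Transpose, B::AbstractMatrix) = At_mul_B(untranspose(A), B)
```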
Couldn't we start by defining the Ax_mul_Bx methods in terms of the
transpose types?
How are things coming along with Arraypocalypse? Is the original plan to have a 0.5 release by the end of Jan with all of the things above in it still on the books?
Lots of other things are happening in 0.5, but AFAICT little is happening on this particular list. Some of us who have contributed a lot in the past to Julia's arrays are busy (in part, reaping the benefits of our previous work :wink:). To make this happen, it's possible others will have to step up to the plate (hint). But the just-posted #14474 is perhaps the most exciting thing on this front to come in a long time---in many ways, it's THE crucial bottleneck.
If other stuff is ready to go, we could delay arrays to a later release.
Yeah, lots of other exciting work going on, but this list is a little slow with progress. My thought is that we review this when the debugger is in good shape, perhaps around end of Jan. We have a new LLVM, thread safety, fast closures, Juno-Atom and many other goodies coming up. I do hope we can also get the array views in, if not this whole list.
Here is one vote for delaying 0.5 until all the breaking changes around arrays are implemented. My understanding is that APL style slicing (which breaks backwards compatibility) is already on master, so package authors will have to update a large amount of code no matter what for 0.5. If the other major change (returning views instead of copies from `[]`) would only come in some post-0.5 julia release, everyone would have to review/update all the same lines of code again, which would really be a LOT of extra work that could be entirely avoided if both of these breaking changes were delivered in one release.
Plus, from a users perspective, it would be great if the major breaking changes in the language would come rather earlier than later... From my point of view it would be much preferable to have things like a debugger, ARM compatibility, faster closures, threads etc. all come after the breaking language changes are done. All the code I write these days will have to be updated once these breaking changes land in master, whereas all the other good things mentioned would (most likely) not affect any of the code people are writing right now... The "looming" things like Arraypocalypse are the major reason a lot of people I know are staying away from julia for now...
Of course I have no insight whether there is anyone left who has the capacity to work on this, it seems some of the people that drove this earlier have moved on... But maybe Julia Computing can dedicate some resources to finishing this job?
In any case, could someone update the due date on the 0.5 milestone? At least if my interpretation of @ViralBShah's comment above is correct, there won't be a julia 0.5 release on 1/31, which is the current information associated with the 0.5 milestone, right?
This is not really an issue about dedicating resources. There are a fair number of open questions, and many folks still have apprehensions about array views. We'll have to do it and try it out, and see how it feels. As mentioned above, @blakejohnson 's PR is the first step and let's get that merged. There is certainly no plan to release 0.5 on 1/31. I am removing that date.
@davidanthoff: Is returning views instead of copies really such a deal breaker to stay away from julia? One can already use `sub` to get a view into an array.
While I agree that it is always best to make major changes early, there are simply things that take time and in my perspective the Julia release team has done a pretty amazing job in judging which feature is ready for release inclusion and which is not. 0.3 and 0.4 are pretty stable in my point of view.
I was under the impression that the idea was a rapid 0.5 release _precisely_ because the breaking Array changes (these should be in a single release) were going to be coming soon. If they're coming over the next few months, should 0.5 be pushed back?
@ViralBShah The milestone for this issue is still 0.5 isn't it? It can't be 0.4.x?
A while back, I added this benchmark to track how our array performance is relative to Numpy's (which is faster than we are on this benchmark). The benchmark is an LU factorization with complete pivoting, and it deliberately allocates a lot of temporary arrays by working on vectors instead of looping through the elements of the matrix. The Julia version has a _copy_ and a _view_ version of the benchmark, so this also gives an indication of how much speedup we'd get from a change to _views by default_. On my machine, the results are
➜ arrayalloc git:(master) ✗ make
python lucompletepiv.py
size 100 matrix factorized in 0.010 seconds
size 250 matrix factorized in 0.047 seconds
size 500 matrix factorized in 0.246 seconds
size 1000 matrix factorized in 2.330 seconds
Julia version with copy slices
size 100 matrix factorized in 0.016 seconds
size 250 matrix factorized in 0.093 seconds
size 500 matrix factorized in 0.517 seconds
size 1000 matrix factorized in 4.126 seconds
Julia version with view slices
size 100 matrix factorized in 0.004 seconds
size 250 matrix factorized in 0.078 seconds
size 500 matrix factorized in 0.453 seconds
size 1000 matrix factorized in 3.555 seconds
so using views improves the timings a bit, but not enough to match Numpy.
It's very easy to change a _copy_ based algorithm to _view_ based algorithm and this change could be even easier if we introduce special syntax for views. Therefore, it might not be worth the trouble of changing array indexing to _views by default_.
This is a great starting point. The next obvious step seems like figuring out what we need to do to make subarrays faster. Perhaps @timholy or @mbauman already know.
We need to have the compiler elide subarray construction altogether. I think the subarray code is already well-suited for that, so it's probably "just" hard work on codegen that's needed.
@andreasnoack the difference with small matrices makes this totally worthwhile for my workloads.
Agreed, and note we're faster than python for small matrices.
Hmm, I wonder if there's something else going on---this may not be a benchmark for subarrays at all. What happens if you replace https://github.com/JuliaLang/julia/blob/6d7a50b880fe2189b1efa34eb47d4dfeb181b674/test/perf/arrayalloc/lucompletepiv.jl#L39 with a call to `BLAS.syrk2!`?
@blakejohnson My main point is that you are already free to use views and we can make this even simpler with syntax.
@timholy It would be faster, but the point of the benchmark is that it allocates temporaries. The Numpy implementation doesn't use `syrk2`.
@ViralBShah I had interpreted @timholy's comment as a call for more resources (i.e. manpower) for this issue because the devs that intended to do the work on this have moved on. But maybe I misunderstood, or maybe new people have already volunteered to take over.
@tknopp It is not the lack of views from `[]` that is keeping people from using julia, it is the expectation of major breaking changes coming 1-2 releases down the line. At least I'm surrounded by people for whom julia would be perfect, but who are simply not willing to use a language where they might have to update their code so that it still runs correctly on the next julia release.

I'm not saying that any design decision should be rushed because of that group, I'd rather see things being sorted out for good and taking a little longer. I mainly just wanted to bump this thread, and see whether anyone is still working on things here :)
@andreasnoack Also, your view version doesn't have, e.g., @blakejohnson's work on bounds removal incorporated, right? My sense was that one of the major points of the general issue here was that there are known perf issues that need to be resolved to make views faster, but that those haven't been done yet. If things are still not faster once they are done, I would understand the argument to just stick with copies by default. But before that call is made, the perf work discussed at the top of this issue should be finished, right?
simply not willing to use a language where they might have to update their code so that it still runs correctly on the next julia release.
Is that because they wouldn't use any pre-1.0 language, or they've heard specifically there will soon be breaking array changes (after which they'd use julia)?
@hayd I'm not 100% sure, but the array stuff I think is especially scary to people because it affects so much code (although, I guess the APL style indexing that is already on master even more so than the view/copy thing).
Are NumPy and Julia using the same BLAS here? I can get a substantial (>1/3) speedup on the size 1000 matrix case by calling `Base.disable_threaded_libs()`.
@davidanthoff:
If this is a concern, then Julia is probably too young for these people. In my point of view, the number of changes to the core language over the last 3 years was not that drastic.

There are of course pretty stable languages, but if I think about C++11 or Python 3, there are big changes even in mature languages.
@davidanthoff, yes, it's a call for more people to contribute to this effort---previous contributors are probably not "done" (I'm not), but the list is long and individuals have multiple priorities.
A fair bit of array stuff is not actually that hard to work on, because it's a focused part of julia. Or maybe I'm just biased by having collaborated on it with a bunch of super-smart people :smile:. But I think it's amenable to contributions from a larger community, and waiting until some mysterious "they" do all the work may not be the best strategy if "they" don't have time.
Specifically regarding the bounds-check removal: currently views are quite fast, but they are unsafe because there is no bounds-checking. In order to make them generally usable, we need to _add_ bounds-checking, but until @blakejohnson's work merges we can't do that without destroying performance.
@andreasnoack, can you open a separate issue about your lu benchmark? There's more to discuss, but I think it's distracting from the larger issues here.
What about allowing fields in abstract types (https://github.com/JuliaLang/julia/issues/4935)? Anything new in that direction?
How about stealing the `{}` syntax from the planned tuple types and having `A{:, 1}` create a view while `A[:, 1]` still creates a copy?

@JeffBezanson has suggested `A@[:,1]` for views and `A[:,1]` for copies as well.
Naive question: why is `A[:,1]` not a view by default, with a copy obtained by calling `copy` on the view?

To make this terse in a related usage, we overloaded `*`, e.g. `*(A[:, 1])` - if unary `*` were in the parser then this becomes the terse `*A[:,1]`. Not advocating strongly for the use of `*`, but it maps in my mind to the C usage.
I'm still in favor. @JeffBezanson had some reservations about that – I'll let him explain.
There are two main issues with views by default IMO
1) it's a breaking change, one of the worst kinds since it introduces aliasing bugs (basically, short of deprecating range indexing for a full release, you would have to audit every bit of code that uses it)
2) immutables containing julia references are quite inefficient for now so if people start using views for convenience (i.e. to avoid manually computing indices) they will probably see performance regressions
I admit that (2) is something we should work on anyway but it might be important to do it before widespread view usage. I don't see how (1) could be solved without having a separate syntax.
edit: thinking about it, a less troublesome transition for (1) could be to return read only views for a release, avoiding some aliasing bugs (but not those where you write into the original array), and enable writing through a view later
While I appreciate efforts to minimize breaking changes and/or provide deprecation paths... prior to 1.0 shouldn't we be willing to break some things?
I like the idea of having a new syntax for array views. The current indexing behaviour (that generates lots of temporaries) should get faster as we can do better escape analysis and memory reuse. I personally don't know if I like array views by default, and I like the idea of at least exploring potentially different syntax for views.
In general I am in favour of breaking things before 1.0, but this doesn't appear to be a case where it is clear that the new behaviour is a sure shot win.
To take this at least a couple more steps forward, what are the potential candidates for syntax for array views?
There is something interesting about the `A@[:,1]` syntax, in that it reads _A at the location of the following indices_.
There is something interesting about the A@[:,1] syntax, in that it reads A at the location of the following indices.
Agreed, it seems surprisingly intuitive.
Also, it is further growing on me because it is very easy to experiment with views in code with this syntax.
My view (ba-ding!) on `A@[:,1]` is that it is confusing to introduce the macro symbol `@` into a completely different context.
That is what my first reaction was, but I got over it for the reasons I wrote above.
But similarly, the `{}` introduces a type-like thing into a completely different context.
`view(A, :, 1)`?
@kkmann: views need special syntax so that features like `end` can work.
`A@[:,1]` has potential. +1 for testing it.
I'd argue that in most cases you want views, and only occasionally do you want copies. There are going to be a plethora of threads about code running slowly to which the answer will be "litter your code with `Array@[...]`". Can't array views be immutable by default? Then you would have to explicitly request a mutable view.
Immutable views by default is an interesting idea. The `A@[...]` syntax could be for getting a mutable view, and `copy(A[...])` would be how to get a non-view slice.
short of deprecating range indexing for a full release you would have to audit every bit of code that uses it
I don't think we have a choice here. If we go ahead with this breaking change, we have to deprecate range indexing for a full release, maybe with an opt-in to disable the warnings. If you don't then @tkelman will get calls about autonomous vehicles driving off the road because code stopped working as intended.
But immutable (read-only) views would turn any potential bug into an error, right? Because reading from views is safe.
@mauro3, not all potential bugs.
from @carnaval:
edit: thinking about it, a less troublesome transition for (1) could be to return read only views for a release, avoiding some aliasing bugs (but not those where you write into the original array), and enable writing through a view later
The intended meaning of code changes with julia version, and that kind of upgrade is something that people are hopefully being more cautious about than a Pkg.update.
Views by default still feels like the right long term choice to me. Least risky way to get there might be a transition via read only views by default.
+1 to an immutable-view transition period followed by mutable views
I totally understand the hesitation to make such a breaking change, but I think that argument depends upon understating another risk that's not being emphasized much: the loss of credibility for Julia as a language if it reverses course on a change that's been planned for several years and publicly announced repeatedly. To me, abandoning this planned change feels like a substantive betrayal of trust about the language's long-term course. I don't think the loss of credibility implied by reversing that decision should be understated.
How about an argument/instruction to make all attempts to mutate a view raise an error for debugging purposes? Anyway, people need to go through a porting phase to support a new Julia version: they could run their code once with that flag on to identify problematic cases, and be safe after that.
I vote for just doing views by default. In my point of view this is absolutely consistent with the current behavior of `setindex!`:
julia> a= zeros(3,3)
3x3 Array{Float64,2}:
0.0 0.0 0.0
0.0 0.0 0.0
0.0 0.0 0.0
julia> a[:,1] = 1
1
julia> println(a)
[1.0 0.0 0.0
1.0 0.0 0.0
1.0 0.0 0.0]
This behavior is inconsistent with the case where `a[:,1]` is an r-value and is silently copied. I therefore am also against making views immutable (which would not be the same as an immutable type, by the way).

Of course all this is a design decision and there are pros and cons. But let's decide this early and not postpone it for yet another release.
@johnmyleswhite, I think not providing views by default (but still making them easy to access with something like `A@[...]`) wouldn't be a betrayal of trust. There may be valid reasons not to make indexing mutable by default.

For example, as we move toward multithreading, views by default would mean that I suddenly have to worry that the data I'm working with might be changed by _another_ thread (even if my view of it is immutable). So I might need to add semaphores or copy the data explicitly, and my program and my reasoning about it suddenly become more complicated.
Don't read my comments as opposing views by default, my main point is that we need to be aware of the what should be expected of the transition.
It would also be interesting to know how much code this really breaks. We had large discussions about how `Int8+Int8 -> Int8` would break lots of code, but in the end the change got in and there was no massive breakage.
@kmsquire: Multithreading is in my opinion totally orthogonal to this feature. Yes, there are scenarios where it may make sense to copy data in multithreaded code so that each thread has its own copy, but usually this will be done by partitioning an array and operating on parts of it. And exactly this will require views. Nobody wants to start a bunch of threads in order to copy around data (which `[:,:]` currently does) in a "hot" loop.
Would be good to have a syntax that gives you always a view, and that works identically in both 0.4 and 0.5. Ditto for a syntax that makes a copy (without copying twice). If you're worried about backward compatibility, you can then use this syntax, and avoid the syntax that changes meaning.
Isn't this kind of thing always done when making a breaking change? The discussion here conflates two issues -- what the long-term future state should be, and how to go about making this change.
I don't have a strong view on what the final solution should be, I could easily live with any of the suggestions made here.
The only thing I really care about is that it would be great to have all these breaking array changes in one release, namely 0.5, and not stretch them out over multiple releases. I will have to go over all my array code already to accommodate APL style indexing for 0.5, and it would be a great help if whatever breaking change in terms of views came in the same release, so that I only have to change my array code once. My guess is that a fair number of package devs/users would feel the same way.
I think you could find some bugs where a view changes because you've written into the original array by making indexing return an object that wraps both a copy and a view, and checks that the copy matches the view on `getindex` (and has no `setindex!`). But that wouldn't be a sensible default, since it's slower than either copies or views.

But in addition to @carnaval's points above, a third point against views by default is that they would not always be a performance win even if implemented efficiently. Even for cases like indexing a vector with a `Vector{Int}`, if the array is being traversed enough times it's likely to be faster to create a copy. A view into a sparse matrix with a vector of integers as rows seems difficult to make anywhere near as fast as a copy for most operations. Given that whether a view is preferable to a copy depends on what's being indexed and what's being done with it, I think there should be easy ways to do either.
Given that whether a view is preferable to a copy depends on what's being indexed and what's being done with it, I think there should be easy ways to do either.
This!! Views should not be considered a definite performance win. Mutable views in particular are a very powerful and elaborate feature. I think it's much too difficult to promise that indexing will always return a reference to the underlying data rather than copying. We can't tell everyone who has ever defined `getindex` that they must now arrange to return a view.

I bet nobody wants scalar indexing to return a view, nor does it seem likely to be implementable efficiently. However, with separate syntax we could have even that, i.e. `A@[1,1]` gives a 0-d view that can be written to with `v[] = x`.
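Something close to that 0-d behavior can already be sketched with today's `slice` (assuming current 0.4 semantics, where scalar indices drop their dimensions; purely illustrative):

```julia
# A 0-dimensional "scalar view": all indices are scalars, so `slice`
# drops every dimension and the SubArray references a single cell.
A = zeros(2, 2)
v = slice(A, 1, 1)   # 0-dimensional SubArray into A[1,1]
v[1] = 3.0           # writes through to the parent array
A[1, 1] == 3.0       # true
```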
I think it has already been said before, but there are too many cases like sparse matrices, distributed arrays, and out-of-core array data structures where array views by default are either not a good idea or difficult to get good performance on.

The proposed syntax for array views allows me to use views when I want them, and keep my existing code the way it is without breakage - all with the addition/deletion of one character.
There are clearly cases where mutating views can be preferable, and cases where you want a copy. Many people do believe that views are a good default choice for performance, and while I tend to agree, I am not sure if this is very clear cut.
Is it worth gathering some benchmarks, as we go about this change? That could perhaps even generally be a good thing.
The argument that making subarrays views on dense matrices would lead to the expectation that they're views on all data structures, even when taking a mutable view into a sparse matrix is generally a terrible idea, is a pretty convincing argument against doing so. If Julia is still going for the principle of least surprise, then making views opt-in with a pleasing syntax seems pretty reasonable.
Sorry to chime in late here.
@eschnett,
Would be good to have a syntax that gives you always a view, and that works identically in both 0.4 and 0.5.
We already have that: the two functions `slice` and `sub`.
@johnmyleswhite,
a substantive betrayal of trust
At a minimum, I agree it's embarrassing. However, without mentioning examples I have the impression that other well-regarded languages have changed their mind about a major initiative. It would be worse to persist in the face of obvious problems, and force a non-working strategy on users.
Also, some of us have periodically surfaced on the mailing list to say "no need to wait for Arraypocalypse, we have efficient views now. What's the big hangup over syntax, anyway?" So at least that viewpoint, espoused multiple times over the past year, is relatively consistent with the direction this conversation is moving in.
@JeffBezanson,
I think it's much too difficult to promise that indexing will always return a reference to the underlying data rather than copying.
Having spent a fair amount of time thinking about how to compose two different view types, `slice` and `reshape`, I couldn't agree more. Not that I think we can't do it, but it's far from obvious that we'll achieve performance that's as good as you'd get from making a copy. (We still have to implement it, so that it's possible, but it's not obvious we want to make it the default.)
I have some vague recollection of some two-character brackets that were being proposed for addition to base (found it -- #8892). Things like `[| ... |]`, `{| ... |}`, etc. Couldn't we take one of these to construct a view? I think the former could work well without introducing a use for `@` that doesn't mean macro.
My proposal is:

- `@` is reserved for macros
- `{ }` is reserved for tuple types
- `x[1,:]` means index and create a copy
- `x[|1,:|]` means index and create a view

As far as I can tell this syntax is free (`|` is not a unary operator) and is unlikely to surprise people. The downside is that it's a little more work to type `[|` and `|]` than `[` and `]`.
`A@[1,:]` seems much more intuitive for a view than `x[|1,:|]`. I personally don't feel any confusion with macro syntax.
Is distinguishing mutable and immutable views advantageous not only in the near term, but indefinitely?
+1 for `A@[1,:]`
The current indexing behaviour (that generates lots of temporaries) should get faster as we can do better escape analysis and memory reuse.
Any guess what gains could be achieved via this sort of stuff?
Also, I find the simplicity arguments for keeping the current behavior quite compelling. Seems much easier to reason about copies than views for novices, so having a slightly more complicated syntax for views for those that understand the differences seems like a good idea.
On the comment that `@` is currently reserved for macros, I would also point out that Julia _already_ has two symbols that are defined (or idiomatically used) to change the behaviour of operators and functions (and I'm still talking of `[]` as an operator, _a la_ C++). These are the `.` and `!` symbols, and I'm worried adding a third `@` symbol to the mix adds (possibly unnecessary) confusion.

Have people considered `A.[1,:]` or even `A![1,:]` or `A[1,:]!` as possible syntax? That is, change the operator `[]` into `.[]` or `[]!`? Intuitively, the former might be interpreted as element-wise indexing (which arguably might be interpreted as forcing a copy??) and the latter a kind of mutating indexing (which seems similar in spirit to views, where mutating the resultant view results in mutations to the original matrix).
`A![1,:]` is already valid syntax for indexing into an array named `A!`. The other syntaxes are available.
Furthermore, `@[]` could still be made available for definition for doing macro-like magic with the indices (I'm thinking of something along the lines of @jutho's TensorOperations.jl here, possibly even Devectorize... just guessing here).
@StefanKarpinski Yes, I thought of that after; `[]!` makes the most sense to me personally.
See https://github.com/JuliaLang/julia/issues/7721#issuecomment-170942938 for a syntactically similar proposal to keep in mind here: `f(x)!` for calling `f(x)` and autofinalizing the result on scope exit.
I really dislike using `!` for anything except mutating operations. For example, `reshape!` has been proposed for a data-sharing reshape. I think this creates a lot of confusion about what `!` means. Taking a view is not mutating; in fact it works especially well in programs that never use mutation at all!
(@JeffBezanson: does that include opposing the `f(x)!` syntax proposed in #7721?)
@JeffBezanson That is a fair point... I'm sometimes already confused by `!`. The view allows a kind of possibly-delayed mutation, but doesn't modify the array immediately, so I'll admit it's a bit of a stretch of the interpretation of `!`.

`f(x)!` from #7721 doesn't bother me quite as much, maybe because it also involves side effects of some kind, whereas returning a view of an array is just a pure function.
OK, what about the other operator, `.[]`? I can see two alternatives:

In a world where `[]` returns a view (where defined, e.g. `Array`, or else a copy where it makes more sense), `.[]` could be convenient to _force_ a copy. Of course, as discussed, it makes it hard to predict what you are getting back from `[]`.

Or, perhaps `.[]` could be considered an alternative to `@[]` for views, in case that seems intuitive to anyone? How would people interpret "element-wise indexing"?
How would people interpret "element-wise indexing"?
Pointwise indexing is one possibility. `A.[1:3,1:3]` would return the first three elements along the diagonal. #2591
I really wish that there'd be a clear right answer here... but neither the semantics nor performance of views makes this decision obvious. Perhaps supporting two operators is the right thing to do.
OK, I agree; #2591 makes much, much more sense for `.[]`.
+1 for `A@[1,:]`
I also think having the ability to make an array temporarily immutable would be very useful for tracking down bugs where an array gets altered and you can't figure out why (or just to check that it is not being altered while inside a function). This would not necessarily need to be part of the `slice`, but it could be. Is there a way to do that now?
I really wish we had more data on the performance impact here. I just don't have the time to chase down all the changes that are necessary to make this work, but here's a barely functional start that skirts around the bootstrapping issue with views as default: mb/viewdefault. See the caveats in the commit message. But it does at least allow running the array perf tests… just so long as you pass it the full filename. Also note that it doesn't tell the whole story since bounds checking hasn't been incorporated into views yet.
It'd be absolutely wonderful to see someone pick up the torch here and shoot for some preliminary numbers with the benchmark CI infrastructure.
@andyferris Don't macro names always need to be identifiers (at least now)? I think special work would need to be done to allow characters like `[`, `(` or `{` to be used as macro names, and to have the corresponding closing character be handled correctly.
I think the syntax `A[ind]!` could be ambiguous. For example, what does `A[ind]!=5` mean?

I think the proposed `f(x)!` syntax might run into problems of ambiguity also. (Yes, you could require a space in between the `!` and a `=`, but it does seem potentially confusing.)
What about simply `@view A[1,:]`? (see https://github.com/JuliaLang/ArrayViews.jl/pull/14)
I think that is too verbose when you want to use all views in an algorithm, for example.
Couldn't a `@view` macro be scoped to replace getindex within an entire `begin ... end` block?
i don't see why not
Another thing to consider if we introduce a new syntax for views (e.g. `@[...]`): what should the corresponding `setindex!` variant `A@[1,:] = x` do? Since it feels kind of unclear, it might be best to disallow it, but then you would have to remember to exclude the `@` when you do assignment.
Cool idea @tkelman, would make it similar to `@inbounds`.
Not exactly: `@inbounds` is a little bit more magical (see https://github.com/JuliaLang/julia/blob/9bfd27bd380124174a5f37c342e5c048874d71a4/test/boundscheck.jl). This would be a straightforward syntax transformation.
Similar in usage, not implementation, then. Anyway, I like it :P
But problematic for this use case: `A` is a sparse matrix (so let's say it's faster to copy than to construct a view), `B` is dense and has fast views. What should `@view c += A[1:10,2:15]*B[:,j]` do?
Couldn't it be written as something like `c += A[1:10,2:15]*(@view B[:,j])`, where `@view` only works on what is inside the parentheses? I don't know enough about macros to know if that is feasible. But maybe it loses on verbosity now...
I think having both syntaxes would be best. Then use `A2@[]` if fine-grained control is necessary. (edit: added `@`)
@KristofferC, yes, that works, but the "verbose" criticism applies if you regularly have to pick and choose.
That said, overall I like it. Note the implementation is quite simple: it comes down to replacing `Expr(:ref, :A, args...)` with `Expr(:call, :slice, :A, args...)` (although you also have to scan for `end` and replace it manually).
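A bare-bones sketch of that transformation (ignoring the `end` replacement and nested/assignment expressions; illustrative only):

```julia
# Rewrite A[...] into slice(A, ...) at macro-expansion time.
macro view(ex)
    isa(ex, Expr) && ex.head == :ref ||
        error("@view expects an indexing expression like A[i, j]")
    esc(Expr(:call, :slice, ex.args...))
end

# usage: @view A[:, 1]  becomes  slice(A, :, 1)
```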
I don't mind at least trying out `@view` on master for some time and seeing how it feels before we do `A@[]`.
In case anyone wants to have a shot at this, here is the current logic for replacing `end`.
`A@[]` is going in the Perl-sigil direction of being hard to read and understand for new users. At least the macro or function spells out what it does...
Why would a sparse view have to be expensive to construct? It could be lazy and essentially do nothing but reference the parent array and indices, and only be expensive upon use. It would let you trade allocation vs indirection depending how many times you expect to use the indexed result. There's a big missing method issue though.
Offering two syntaxes has advantages, but it doesn't solve all the use cases. I can actually see three of them:

1) You want a copy which can be modified without affecting the original array -> `[]`.
2) You want a view to modify the original array -> `@[]`.
3) You want the fastest/cheapest way to extract a subset of an array, but do not need to modify it. You would prefer a (possibly immutable) view for index/array type combinations where it is faster, and a copy elsewhere. This is similar to using `eachindex(a)` instead of `1:length(a)` when iterating over an array. -> no proposed syntax

Offering three syntaxes is a bit too much... But it seems to me that most of the gain we expect(ed) from using views by default corresponds to case 3.

So I agree with @tkelman that the question of whether views could be made efficient for all Base types deserves investigation: it would effectively allow merging cases 2 and 3, and possibly stealing the syntax of case 1 (i.e. views by default). With generalized generator syntax and fast anonymous functions, views sound like a great feature to avoid allocations and ensure high performance.
One problem with using `@view` to wrap expressions/blocks: what if I have a block wrapped in `@view` and I need to add a copying indexing operation inside? Then I would have to refactor the whole block. What if I don't notice? What if I paste code into that block? I would feel a lot better if taking a view or not depended only on the operation, and not on the context of the surrounding code.
For what it's worth:
I deal a lot with views in Machine Learning. Many algorithms loop over the observations of some dataset, which itself is more often than not represented as a matrix. In cases such as SVMs, one cannot formulate this in terms of matrix multiplications, but instead needs to pretend each observation is a vector. In these cases a simple, readable syntax would be appreciated, since I am aiming for pseudo-code-resembling implementations. I don't think `A@[1,:]` is confusing at all; it even reads "A at [1,:]".
I agree with @toivoh that using @view
seems like it forces one to decide between either only using views or more verbose trickery. @inbounds
feels much more clear cut in that regard
Would `view(arr, dims)` and `view!(arr, dims)` work, to return an immutable and a mutable view respectively? That would allow parameterizing it based on whether `arr` is dense, sparse, or something else entirely, right?
@tkelman In `A@[x,y]`, why is `@` a sigil? I read `@[]` as an operator, rather than `@` as a sigil attached to `A`.
It's a funny-looking operator that doesn't have a lot of precedent. Having to keep track of syntax for different kinds of indexing doesn't seem ideal. It's a bit like how Rust has different kinds of pointers (and used to have more): it's not intuitive and a lot to have to keep track of when trying to read code for the first time. I could probably live with `A@[]` but it feels off to me.
edit: if views were the default and lazily constructed, then one would hope `copy(A[1,:])` would be exactly equivalent to today's copying `A[1,:]`, but I'm not sure how close we would be in practice using today's compiler.
edit2: I think `@[]` is as good as or better than any of the other "two kinds of brackets" proposals. The alternative from my standpoint is to make `[]` always return a view (read-only for a release transition) and call `copy` on an indexing expression to get behavior equivalent to today's. I'll dig into @mbauman's branch.
If we look at precedent, `A(1,:)` could be always-copy, since that's what it is in Matlab AFAIK, and `A[1,:]` always a view, as is almost the case in NumPy. I think `()` is a bad idea, just throwing it into the brainstorm. I also dislike `A@[1,:]`.
If the main objection to `@view` is how to handle exceptional cases (e.g. a copy inside a `@view` block, or a view normally), can people just use the existing `sub`/`slice` in the latter case, and maybe we add methods to the `copy` function that handle the former? Currently it doesn't look like `copy` has any multi-argument methods, so it could basically be a shorter name for what `getindex` does now.
So `copy(A, idx...)` would always give a copy, `sub(A, idx...)` would always give a view, and `A[idx...]` would copy by default but could be wrapped in a `@view` to change behavior (or view by default with a `@copy` macro to change behavior, IDK which is better).
I'm not a big fan of subtle syntactical cues changing subtle behavior (like bracket types determining copy/view behavior) because it seems like it's too easy to make a mistake that seems like it's working until it doesn't. It seems like it would be a common pitfall for new users, and also when refactoring code or just skimming code to see what it does.
@timholy The implementation is not quite that simple, because if you simply replace `:ref` Exprs then `@view X[:, 1] += 2` will become `slice(X, :, 1) += 2`. I think you could use the `custom_ref` parser tweak from #8227 to get the desired behavior including `end`, though.
@ssfrr I think `@view` could be at least as obscure as `@[]` when refactoring code. Different people will undoubtedly place `@view` around blocks of different sizes, and perhaps some people would even place it around entire files.
To interject as an aside: @tkelman and I spent a bit of time last spring chatting about what techniques can be used to make sparse matrix manipulations and slicing pretty efficient. Fully general slice notation is at odds with sparse storage, but the sparse-friendly, non-copying, restricted versions of slicing are all super easy to define for pretty much any standard dense layout.
Not sure if there's interest though?
@cartazio I am interested in revisiting those design discussions and seeing what's possible to adapt within Julia's type system, but this issue is getting very long and it might be better to move some pieces of this discussion to their own separate issues or mailing list / direct email threads.
If you consider `A[1,:]` as shorthand for constructing a 'copying slice type' which gets applied via the call method, `A( copy_slice(1, :) )`, would it be unreasonable to have a mutable view be `A( view_slice( 1, : ) )`?
Adding a syntax for directly constructing `copy_slice` and `view_slice` makes this easier to use inline. This is very similar to the existing range constructors: `1:3 -> UnitRange{Int64}`. This basically works today if we assert that `view_slice` is an alias for tuple (unfortunately this is the opposite of NumPy/Matlab), but you have the choice to be clear and explicit: `1,:,1:2 -> (1,Colon(),1:2)`
So `A( 1,: ) -> A( view_slice( 1,: ))`. Is it that verbose to write `A( view( 1, : ) )`? This also allows the extension to immutable views: `A( immutable_view( 1,: ) )`
You can implement this today without parser changes (and likely in a backward-compatible way, dispatching to `getindex` et al.), excepting the support for `end`. If the parser is modified to substitute `:end` for `end` in the general short form of colon, we can rely on dispatch to do the correct thing: `1:end -> 1 : :end -> colon( ::Int64, ::Symbol )` [edit: though I would prefer `Type{End}`]
Before worrying about syntax, I would really like to see more engineering effort put into views and a closer look at the performance tradeoffs. One thing that'd be a great help and is fairly mechanical is updating SubArray's indexing methods to use the fancy new `@boundscheck checkbounds(A, I...)` idiom. (In general, all usages of `unsafe_getindex` can be replaced by `@inbounds A[...]`, and all of its method definitions removed.)
@nalimilan wrote:

3) You want the fastest/cheapest way to extract a subset of an array, but do not need to modify it. You would prefer a (possibly immutable) view for index/array type combinations where it is faster, and a copy elsewhere. This is similar to using eachindex(a) instead of 1:length(a) when iterating over an array. -> no proposed syntax

I wish it were this simple. Even without worrying about mutability, the trouble is that the fastest option depends not only on the structure you're indexing into, but also how much data is involved and how you end up using the returned object. There's also the difficulty that specialized array structures suddenly become unspecialized. We currently work around this for BLAS-compatible dense arrays and views with the `StridedArray` typealias, but it'll take a very long time to find all the missing methods and type mismatches that subarrays will hit in general. Think of the `LinSpace` missing methods… but much more pervasive.
And then there's the complexity-class issue for specialized arrays. It's not expensive to construct sparse views, for example, but the trouble with them is that the returned SubArrays no longer hit the specialized sparse structures. So `sum(slice(S, :, 1))` hits the abstract fallback and is no longer O(nnz).
@mdcfrancis — one of the most powerful uses of the `end` keyword is to use it in arithmetic, e.g., `A[(end+1) ÷ 2]` or `A[mod1(x, end)]`.
-1 for `A@[]`. It does feel a ton like introducing Perl contexts, and the `@` symbol suggests that there are macros involved. I like the `A[|x,y|]` idea, or some other kind of novel bracket.
I think what people are looking for is not "views by default", but narrowing the performance difference between benchmarks showcased by https://github.com/JuliaLang/julia/issues/13157#issuecomment-168378436. It would be nice for the default behaviour to do that, but an easy tweak one can find in the Performance Tips section is almost as good.
@mbauman - make the `end` sentinel type act as a number. The expression tree is then passed to the call method and can be evaluated with the length of the array? `END + 1 -> EndExpr(length(A))`
@mauro3 did some work on that in a ragged array package. It's not easy.
-1 for `A@[]`. Is `A[[index]]` available for syntax? That would call something like `getindexview(A, index)`? I believe that expresses that we are getting a level of indirection, which is appropriate for a view.
@GlenHertz That already means something:

```julia
julia> a = ["pablo","martín","garcía"] ; a[[1,2]]
2-element Array{UTF8String,1}:
 "pablo"
 "martín"
```
@lucasb-eyer wrote:

If we look at precedent, A(1,:) could be always-copy, since that's what it is in matlab AFAIK, and A[1,:] always view as it's almost the case in NumPy. I think () is a bad idea, just throwing it into the brainstorm. I also dislike A@[1,:].

This suggestion regarding `()` for copies and `[]` for views appears to me to be the most concise solution, and it may be the most intuitive for new users coming from MATLAB and Python (it would increase the amount of runnable MATLAB code). It kinda feels "right" that the C-like `[]` does something pointerish and the `()` performs a function call to return a new object (and it naturally doesn't make sense on the left-hand side... but we'd have to add a warning to avoid logical errors here). Overall, I like it.
Unfortunately we lose the ability to use `call` for array-like objects...
Using `()` for indexing eliminates the main advantage of having a syntax for indexing – namely allowing things like `end` to work.
It really rubs me the wrong way to tell everybody to first switch all their indexing to use `( )`, then in the future use `[ ]` for views when you want them.
The truly C-like thing would be to use `&A[x,y]` for views, which is currently a syntax error and so may actually be possible.
@JeffBezanson Wouldn't that also be ambiguous, though, with `&` the bitwise-and operator? Having syntax whose meaning changes with or without spaces makes things harder to understand. Shouldn't Julia be better than C, also?
Being a better language doesn't necessarily mean you need to change the spelling of every operator.
Using () for indexing eliminates the main advantage of having a syntax for indexing – namely allowing things like end to work.
Out of curiosity, couldn't `end` be implemented like `:` / `Colon` and let the compiler deal with it rather than the parser? That is, make it a fully-typed and overloadable feature rather than a syntax-sugar feature? I understand it would have to support a bunch of arithmetic operators (`end - 1`, etc.), but it should be in the land of the feasible, no?
EDIT: Sorry, I see now the discussion about this above, e.g. Ragged.jl.
The truly C-like thing would be to use &A[x,y] for views, which is currently a syntax error and so may actually be possible.

Could we imagine using `&` to make changes to tuples, `NTuple`s, immutable arrays, etc.? (i.e. useful for arrays on the stack, like #11902)
`&A[x,y]` doesn't look too bad either! It seems intuitive, and I would absolutely prefer it over `()`.
Having syntax where meaning changes with or without spaces makes things harder to understand.
While not wrong, I believe that is a bit of a stretch in this case. Can we think of a sensible ambiguous statement where both interpretations would make sense? All I can think of is the multiply-number sugar. This works on 0.4:

```julia
julia> A = rand(Int, 10,10);

julia> 3&A[1:2,3:5]
2x3 Array{Int64,2}:
 1  0  3
 3  0  2
```
@andyferris: making `end` work in full generality with that approach requires delayed evaluation; without it, you can at best implement a half-baked version with a type that caches all operations on it.
Here is a random thought which I have not thought through yet. I just bring it up here for us to brainstorm.
Maybe it is a good time now to rethink the semantic meaning of an array (or tensor), and the relationship between arrays and functions.
I'd like to think of an array this way: as an abstract object, an array is no different from a function. Both take some input (the index, for an array) and give an output (the value). An array v = A(i) is a function whose input parameter is an index. A function y = f(x) is an infinite-dimensional array, where the index is x. This is also true mathematically: vectors in Euclidean space are not much different from functions in Hilbert space.
I'd like to think the semantic meaning of A[1] is that A is some storage, and we use [] to access a particular value of that storage. This is a low-level semantic.
However, we could use () to model the high-level abstraction. A(1) is a function: you input index 1 and get something out, and then we can use map and other higher-order functions to manipulate A. For evaluating a function over a range of values (or a container), we could either define a range version of f which takes a range as a parameter, or we could map f over the range. We can do the same thing to an array too.
To me, a view is a lazy-evaluation mechanism for an array, like a continuation of a function. Therefore we could construct some higher-order functions for arrays that produce views, just like getting a continuation monad for a function f. I don't have a clear idea how to do it though.
Maybe it is a good time to unify the behaviour of functions and arrays. I don't know whether this is feasible in theory (I am no language expert). But it would be so cool if this could be done in Julia. Again, I know I am asking for something that is too big a change for current Julia. If this is irrelevant, just ignore me.
+1 for `[]` views
+/-1 for `()` copies
-1 for `&A[x,y]` views
-1 for `A@[x,y]` views
Reasoning: there is a common idiom that the common case should be fast and easy (to code, in this case), while uncommon cases should be possible. Currently, copies are treated as the common case via `[]` syntax (which is essentially sugar for `getindex`/`setindex!`); views are treated as the uncommon case via `sub`/`slice` and no sugar. I think the fact that so many have piped in about having a better approach to views indicates that views should be treated as the common case, with optimized fast/easy syntax.
Here is my response to some outstanding concerns @JeffBezanson raised:
It really rubs me the wrong way to tell everybody to first switch all their indexing to use ( ), then in the future use [ ] for views when you want them.
Is making everyone switch really necessary though? My impression is that most code probably doesn't rely on any automatic copying currently done by `[]`, and would probably benefit automagically performance-wise from using views. Since there are already major breaking changes in v0.5 w.r.t. arrays, I don't think the switch is too big a deal.
The truly C-like thing would be to use &A[x,y] for views, which is currently a syntax error and so may actually be possible.

I find this a rather strange comment. C doesn't really have the notion of a sub-array in general N dimensions; either end pointers or element counts (or sentinels!) are used ad hoc for one-dimensional (sub)arrays, and none of those really use `&`. Regardless--as I mentioned above--I think most Julia array code probably wants views for performance reasons, which would result in an unsightly deluge of ampersands with the `&A[x,y]` syntax.
Finally, do we really need a special syntax for copies of sub-arrays? We already have `deepcopy` and `convert`. If copies are really important, I'm certainly not opposed to having special syntax like `A(x,y)`, or some other slight tweak of `A[x,y]`, e.g. with your favorite symbol. I just don't think copies are nearly as important as views.
If anyone has evidence that copies should be the most common/easy-to-use case, such as a large package that would be difficult to change, please let me/us know.
Is making everyone switch really necessary though? My impression is that most code probably doesn't rely on any automatic copying currently done by [], and would probably benefit automagically performance-wise from using views. Since there are already major breaking changes in v0.5 w.r.t. arrays, I don't think the switch is too big a deal.
I'm not saying that we should not do it but, at the risk of repeating myself, the point is not whether many people rely on the behavior. The point is that the difference between the two behaviors is, AFAICT, basically impossible to detect (ok, doable, but it would make every array operation extremely slow).
So it's not as if people will have to go through and fix a couple of annoying warnings; the failure mode (however rare) is: code works but returns the wrong answer. Again, I'm not saying that makes the change a non-starter, just that, among the backward-incompatible changes, this kind requires careful thinking.
I think most Julia array code probably wants views for performance reasons
Aha! This is an absolutely crucial distinction: do you just want programs to run faster, or do you want an object A such that modifying A modifies another array B?
If you don't care about having the _semantics_ of views, then the best approach is to return a copy-on-write view object, and the library can decide based on whatever criteria it wants whether to return a view. It could return views only for certain contiguous regions that it thinks will be efficient. You get speedups without needing to think about when something is a view or not.
In contrast, with view semantics we would be _promising_ that these two functions are equivalent:

```julia
function f1(A)
    A[1,:] = 2
end

function f2(A)
    row = A[1,:]
    row[:] = 2
end
```
If I think _only_ about the semantics of this (ignoring any performance consideration), I think I actually prefer copies over views as the default. In general copies seem much easier to reason about and probably have way less potential for introducing subtle bugs into code.
Fair point, Jeff. You are right that the semantics matter a lot too. I think when I said performance, I was also thinking of how quickly and easily the programmer can get the job done, at least on average.
Your hypothetical snippet could easily go either way in terms of copy/view semantics. But in real code, I think the need for view semantics comes up more often than the need for copy semantics. As an example, someone posted earlier in this thread about viewing column vectors in a data matrix for machine learning codes, etc. I think most everyone arguing about the syntax also has concrete examples of code that needs view semantics.
If I'm right about there being more code that wants views (in terms of either performance or semantics), then I think my point stands that view syntax should be easiest, and existing code which doesn't rely on copy semantics would get a free performance boost.
Regarding Oscar's very good point, yes there would be subtle bugs in some cases for sure. But most package maintainers should be well aware of the breaking changes coming in 0.5, so it should end up as relatively simple bug fixes, I think. Again, I'm open to persuasion if someone has a concrete example of a code base that would be difficult to switch to view semantics.
I think the most conservative (and probably correct) approach is to retain current copy functionality with square brackets. It's safe, and unlikely to ruin people's lives (don't underestimate the subtle bugs that could crop up). The problem is that I, personally, will want to use views frequently, and so they should have short, dedicated syntax.
Again, I'm open to persuasion if someone has a concrete example of a code base that would be difficult to switch to view semantics.

I think the better question is whether someone has a concrete example of a code base that is _not_ difficult to switch. The problem is that the package maintainer might not even know there's an issue, even with good test coverage. That's the nature of subtle, hard-to-detect bugs.
Can we use curly brackets for views?

```julia
julia> x = rand(1,5)
1x5 Array{Float64,2}:
 0.877481  0.18628  0.739978  0.306893  0.037569

julia> x{:,2:3}
ERROR: TypeError: Type{...} expression: expected Type{T}, got Array{Float64,2}
```

It gives an error, which means it might be feasible. (I think it was suggested before, but I don't remember seeing a reason it is a bad idea.)
In light of the fact that we can't use `A[[2,21]]`, what about something non-symmetric: `A[[2,21]`? All the suggestions with the `&` and `@` symbols would make code look really foreign. I know the suggestion is not symmetric, but it's easy to type and looks more familiar.
@nstiurca Yes, view semantics are very useful, but to me this is still not enough to resolve the case. Is `A[1,1]` a view? If not, then (1) square brackets are a strange construct that sometimes returns a reference and sometimes doesn't, and (2) if you want a pointer to a single element you need to switch to some other syntax. This is made much worse by the fact that some types, e.g. sparse matrices, either won't implement views or have very slow views. Making views _1 character_ easier to get does not seem to be worth adding all of this confusion to the humble square bracket operator. We would need some other operator to be the "sane indexing" operator, for which some have proposed `A( )`, but that is _not going to happen_. The new, more complicated operation will get new syntax.
What about something not symmetric
Because that would drive a lot of programmers crazy: https://xkcd.com/859/
@JeffBezanson `A[1,1]` is currently a view if on the left-hand side of `=`, and a copy otherwise... To me that's actually _more_ complicated than "always a view" for one syntax and "always a copy" for another syntax. But you've already made up your mind, so whatever.
@Evizero @musmo In `A[1,1] = z`, it is perhaps more accurate to consider `[1,1]` a modifier for `=`, not `A`. For example (and to insert parentheses where they aren't actually allowed), it is `A ([1,1]=) z` (the actual representation, however, is `setindex!(A, z, [1, 1])`), similar to how `A (+=) z` parses to `A = A + z` instead of (the non-existent construct) `(A+) = z`.
The result of `&x[3]` can already be entered: `Ref(x, 3)`
That's right, it doesn't make sense to say the `A[1,1]` on the left of `=` is a view --- what would it mean for it to be a copy instead?
I fully agree that the right thing is to have one syntax mean "always a view" and another mean "always a copy". The only disagreement is about what the syntax for each should be. I think changing `A[ ]` is too drastic. What about, say, dictionary lookup? Will that support views as well?
At this point I think "`A[...]` always means copy while `A@[...]` always means view" is worth trying out to see how it feels. The exact syntax isn't really as important as trying it out. For what it's worth, however, I'm not entirely convinced by what I'm going to call "the Jeff position" for a few reasons:

- If a `copy` of a view collapses the lazy view so that it is no longer a view, then you have a consistent way of expressing both views and copies: `A[...]` is a view and `copy(A[...])` is a non-view slice. No odd syntax needs to be introduced. This also has the syntactic advantage that the shortest, most obvious thing is cheap (and often fast to use too), and the expensive thing – i.e. making a copy – is very explicit; you definitely know a copy is being made.
- I don't buy the scalar indexing argument. That's because the biggest difference remains either way: non-scalar slices – e.g. `A[1,:]` and `v[1:5]` – return containers; scalar slices – e.g. `A[1,2]` and `v[1]` – return single values. Compared to this HUGE semantic difference, the difference between a non-scalar slice being a view and a scalar slice being a non-view seems trivial. Since we already have to understand that difference, I don't see how this argument is compelling.

I know that in #8892 we discussed unicode-based angle brackets, but what about ASCII angle brackets (`A<:,3>`)? I recognize that they are the least readable of all bracket choices, but they do look like a sideways "v" for "view" :smile:.
An obvious problem: currently

```julia
julia> 2<3>1
true

julia> 1<2>3
false
```
I think that, personally, I'd be willing to have to switch to `1 < 3 && 3 > 2` in exchange for getting another single-character bracket. In some contrast to the impossibility of issuing a warning for copy->view semantics, I wonder whether it would be possible to issue an informative parser error for this change in syntax.
Or, we could go all C++ and use `A<<:,3>>`. I'm already quivering with thoughts of the awesomeness!
For copy-on-write: 'COW' (U+1F404)
For slices: 'SLICE OF PIZZA' (U+1F355)
Ok, I agree the scalar indexing argument is not a strong one. The more important aspect is how this affects implementers of `getindex` (of which there are many) in general. Is the guidance that you should not implement `getindex`, instead letting it fall back to the default that constructs a view, and implement `copy(::View{MyType})` instead?
Making a generic lazy view type that can be the result type for slicing general containers doesn't seem all that hard to me.

We've done it, and it _was_ hard. @timholy I recall the biggest problem was linear indexing; is that right? While it's true that the special cases for dense arrays are "just optimizations", we already know that there is no efficient implementation e.g. for sparse arrays on the horizon. And what about something like DataFrames?
I'd say that DataFrames are hard to use as a clear argument for either side: indexing into a pandas DataFrame generally produces a view; indexing into an R data.frame always produces a copy-on-write copy.
Implementing linear indexing isn't hard – implementing it _efficiently_ is hard. But linear indexing is only efficient for a few data structures anyway (arrays that happen to be stored contiguously) – and we'd like to move away from it in general. If you implement scalar indexing, then implementing a view `V = A[I,J]` is just a matter of hanging onto `A`, `I` and `J` and defining `V[i,j] = A[I[i],J[j]]`. If there's a more efficient way to do this, then you can overload non-scalar indexing, but in general I think this should actually be _less_ work for most container implementers. Note that composition of views can be made generically efficient: `V[I,J][K,L] = V[I[K],J[L]]`. Since slicing a range with a range produces a range, this works quite nicely.
Addendum to my last comment: while the scalar indexing case doesn't really work as a "semantic consistency" argument, it works better as a "future generality" argument. While we don't have or want it now, if we ever wanted the ability to return a pointer to a single array element then special view syntax gives you that already.
@johnmyleswhite Can I get a more opinionated answer --- which would you prefer? :)
Well played, @JeffBezanson. :)
For pedantic-level clarity, I should point out that I don't particularly care which of these two options ends up being chosen for my own code, since I almost always use `sub` now.
As I said before, what really worries me about this thread is the reversal of a long-standing plan to switch to views as the default semantics for slicing -- one that strongly appealed to some people (e.g. some of the Torch devs). The reversal strikes me as particularly odd given the complexity of the topic and the absence of a clear right answer: no one has proven that copies were the right decision all along, it's just become clear that neither copies nor views strictly dominate the other.
With regard to DataFrames, I've come to think that providing any form of indexing at all is a mistake: you really need a higher level of abstraction like that provided by R's dplyr.
I don't think a reversal happened; I may well lose this one. It's just that the issue wasn't fully discussed --- reading #3701 again, the discussion was mostly about performance, and it was too long before the semantic issue and performance issue were cleanly separated. I think what we need now most of all is to finish up #9150, and if this debate ends in stalemate just go with the original plan.
@johnmyleswhite I agree that views would really appeal to Torch devs and others who prefer views, but what is the harm in using `A@[]` for achieving exactly that? If they don't like making copies, they can just avoid the `A[]` syntax.
I personally would prefer `A[]` to have copy semantics, where the implementation may choose to use copy-on-write in some cases for performance reasons. View semantics are then guaranteed with the `A@[]` syntax.
@timholy, I agree, it would be good to be able to use `<>`, but it has a few issues; e.g. logical indexing would be tricky with `<>` too: `A[x.<1]` vs. `A<x.<1>`. Probably there are too many corner cases/extra rules to make `<>` work.
A[...] is a view and copy(A[...]) is a non-view slice

I like this because, whenever one cares about getting either a view or a copy for semantic reasons (as opposed to performance), it will typically not be in the middle of an expression, so the extra verbosity of `copy()` feels much saner than the extra magic of `@[]` to me.
@lucasb-eyer I think the issue is that the concept of a view does not always make sense, so `A[...]` would have a different conceptual meaning for different types of `A`. Someone correct me if I misunderstood that.
Plus there is this angst that changing to views by default would introduce a lot of subtle bugs in the existing ecosystem.
I'm still in favour of `A@[...]`, by the way.
We've done it, and it was hard. @timholy I recall the biggest problem was linear indexing; is that right?
Linear indexing accounted for 2 of the 3 main challenges, but it's also fair to say that `sub` vs `slice` accounted for 2 of the 3 challenges:
- For `sub` we convert integer indexes into ranges. This is relevant because `sub(A, 3, :)` can be inferred to have efficient linear indexing but `sub(A, 3:3, :)` cannot (change that to `sub(A, 3:4, :)` and you break linear indexing), yet internally during storage we effectively convert 3 -> 3:3. Despite that conversion, we've managed to pull this off:

```julia
julia> A = rand(5,5);

julia> B = sub(A, 3, :);

julia> Base.linearindexing(B)
Base.LinearFast()

julia> C = sub(B, :, 2:4);

julia> Base.linearindexing(C)
Base.LinearFast()
```
and also this:
```julia
julia> A = rand(5,5,5);

julia> B = sub(A, :, 2:3, 2:3);

julia> Base.linearindexing(B)
Base.LinearSlow()

julia> C = sub(B, :, :, 1);

julia> Base.linearindexing(C)
Base.LinearFast()
```
because we are just that awesome. (The trick comes down to the final `LD` parameter and reasoning about how you must have gotten whatever it was for the parent SubArray.) Were I to do it over again, I'd avoid trying to be awesome, and instead preserve "non-sliced" dimensions with a new type of index, `NoSlice(3)`, that makes inference straightforward.
```julia
A = rand(5,5,5)
B = sub(A, :, 3:17)  # constructs a 2d view from a 3d array
```

or

```julia
A = rand(5,5,5)
B = sub(A, 2:4, 2:4, 2:4)
B[1, 7]
```
If we get rid of the latter, and replace the former with a composition of two view types, `SubArray{ReshapedArray{Array}}`, things will be cleaner (but possibly of lower performance).
```julia
S1 = slice(A, :, 5, 2:6)
S2 = slice(A, 5, :, 2:6)

S1[i,j] -> A[i,5,(2:6)[j]]
S2[i,j] -> A[5,i,(2:6)[j]]
```
Now that I've grown more accustomed to "lisp-y" styles of coding, I suspect we might be able to dispense with the generated functions if the various performance problems from splatting (#13359) can be resolved.
@mauro3, good point.
I think the issue is that the concept of a view does not always make sense. So A[...] would have a different conceptual meaning for different types of A. Someone correct me if I misunderstood that.
@Evizero could you explain what you meant by this? A[...] already has different conceptual meanings for different types of A.
- I don't buy the scalar indexing argument. That's because the biggest difference remains either way: non-scalar slices – e.g. A[1,:] and v[1:5] – return containers; scalar slices – e.g. A[1,2] and v[1] – return single values. Compared to this HUGE semantic difference, the difference between a non-scalar slice being a view and a scalar slice being a non-view seems trivial. Since we already have to understand that difference, I don't see how this argument is compelling.
I agree with @StefanKarpinski and @lucasb-eyer here. Yes, `copy(A[...])` is more verbose, but it is much more readable, intuitive, and explicit. `A@[...]` is likely impenetrable to people coming to Julia from Python, Matlab, C, etc.
The more important aspect is how this affects implementers of getindex, in general.

What if we just changed `A[…]` to lower to `slice` instead of `getindex`, and kept `getindex`'s semantics as they currently are? Then `copy(::SubArray)` could be spelled like this:

```julia
copy(S::SubArray) = getindex(S.parent, S.indexes...)
```
Implementers of `getindex` still only need to handle the scalar case. The base library still provides (copying) nonscalar fallbacks for `getindex`. If there is an optimization available to them, they can specialize `getindex` (as they currently do). Or if they can implement a more efficient view type, they can specialize `slice`. This also means that `copy(A[…])` is effectively the same as the good old `getindex(A, …)`. It's just flipping the privilege of syntax to views, without loss of functionality.
(This doesn't quite work as-is, since `slice(A, 1, 1)` is a 0-dimensional view right now, but either that or the name of the allocating getindex can change.)
could you explain what you meant by this? A[...] already has different conceptual meanings for different types of A

I guess my point was that it would be weird if `A[1,:]` returned a view for dense arrays and a copy for sparse arrays. In other words, for some types manipulating the return value would in fact change the original object, while for other types it wouldn't.
But why is `A[1,:]` defined for sparse arrays at all? It seems to me that this is only defined to fulfill the full `AbstractArray` interface (whatever that is). It's clear that types that don't have the data for a slice directly available have to calculate it on the fly and therefore naturally create a copy.
I worry about that too, but @StefanKarpinski addressed that by proposing to implement a fallback `getindex` that always returns a generic view type. This type would simply do indirect indexing, i.e.

```julia
getindex(v::View, i) = v.a[v.idx[i]]
```

and therefore works for anything that has scalar indexing. It might be slow sometimes, but that's better than having indexing copy (and therefore be slow) in the common case, and better than introducing weird syntax (trying to make Stefan's case for him here :) ).
Since we're switching roles here, I should point out that spelling copy-slices as `copy(A[...])` has the drawback that, at least semantically and in the absence of clever optimizations, you have to pay for creating the view object _and_ then copying it. This additional cost could be optimized away (especially plausible if we manage to get stack allocation of objects that contain references to other objects), but without that optimization work, this would be slower than our current copying slices.
I should point out that spelling copy-slices as copy(A[...]) has the drawback that, at least semantically and in the absence of clever optimizations, means that you have to pay for creating the view object and then copying it.

Maybe this could be solved similarly to how we currently have `sub`, with `copy(A, ...)`?
Maybe this could be solved similar to how we currently have sub with copy(A, ...)?
:+1:
Maybe this could be solved similar to how we currently have sub with copy(A, ...)?

That would have the same problem `sub` has now, namely that you can't use `end` as an indexing expression.
`copy(A[...])` has a lot of appeal for the copying version: it reads entirely straightforwardly, doesn't hide two operations inside one, and looks just like `copy(foo(A))` for any other operation `foo()`.
Now for a somewhat crazy suggestion as a stopgap in the absence of sufficiently clever optimizations: what if `copy(A[...])` were just syntactic sugar for a combined copy and getindex? I know people will be happy to get rid of things like `Ac_mul_B`, but those also served as a useful stopgap until something better came along.
Is `copy(A[...])` actually an issue? If subarray construction is cheap, the combined operation will only have a slight overhead.
@c42f I think for syntactic sugar it should be possible to implement a `@copy` macro that does that. Making `copy(A[...])` itself syntactic sugar would require changes to the parser.
@tknopp I have been wondering that too. The cost of copying should make the cost of creating the subarray negligible.
What if `A` is of a type that doesn't lend itself to views (i.e. `A[...]` has to be a copy for `A` anyhow)? Does `copy(A[...])` then perform two copies? Seems like a subtle source of performance problems.
Can we get a use case where copying is desired? In my point of view we should concentrate on the majority of use cases instead of looking at corner cases that might make `[...]` more generic. I don't really buy the `SparseMatrix` argument, since `[...]` should really not be provided for this type.
Regarding the subtle bug argument: there are people that think the following is surprising:

```julia
julia> a = [1,2,3];

julia> b = a;

julia> b[1] = 0;

julia> a
3-element Array{Int64,1}:
 0
 2
 3
```
Seems to be very similar to an array returning a view to me.
The reason to favor array views is that it minimizes the number of hidden memory allocations. It's true that in some cases (multithreaded code) it can be faster to create copies in order to better utilize caches. However, this use case is better programmed explicitly, since it requires trading off memory for better throughput.
@tknopp Are you proposing that `A[:, 1:2]` simply shouldn't work for SparseMatrices? That seems pretty weird.
So you think that scipy is pretty weird?
I actually don't care whether these methods are defined for sparse matrices. But taking these methods as an argument for not doing array views isn't that convincing to me. Or do you have a use case where these methods are used?
For scipy it depends what format your matrix is in. For `CSC` you can do slicing, but only with a step size of 1.
So you think that scipy is pretty weird?
IMHO python isn't a good role model when it comes to syntax for linear algebra
But taking these methods as an argument to not do array views isn't that convincing to me.

I don't think _anyone_ is against doing array views. The question is what they should look like.
The point I wanted to make is that `A[:, 1:2]` is a convenience method for sparse matrices, and I wanted to know whether people think this is a very important use case that should drive the syntax discussion or not. We have learned from this thread that we can be more generic with a copying `[...]`, but this alone is IMHO not enough to warrant special syntax. It's also important that the syntax be what people actually want in most situations. I think it's a view, but this is my personal preference, and therefore I think this should be discussed here.
I suspect that SciPy works that way precisely because indexing returns views. You would never want to use an inefficient sparse matrix view, so you can't have it. But it still strikes me as really bizarre if you can do `A[1, 2]` but not `A[:, 1:2]`. You would still need some way to express the latter operation, and it can't be `copy(...)` if you can't construct a view. It also seems like a total disaster if you want to write code that can operate on both dense and sparse matrices.
One theme of this thread is that there are two components to "what people actually want": the semantics they want, and what's fast now. I prefer copy semantics, and when I use `sub` it's usually for performance and not because I actually want view semantics. I think the ideal approach may be copy-on-write views, as @JeffBezanson proposed in https://github.com/JuliaLang/julia/issues/13157#issuecomment-180506892.
I suspect that SciPy works that way precisely because indexing returns views. You would never want to use an inefficient sparse matrix view so you can't have it. But it still strikes me as really bizarre if you can do A[1, 2] but not A[:, 1:2]. You would still need some way to express the latter operation, and it can't be copy(...) if you can't construct a view. It also seems like a total disaster if you want to write code that can operate on both dense and sparse matrices.
- Yes, scalar indexing does not make too much sense for sparse arrays, so this should be treated similarly.
- If you want to write generic code that operates on sparse and dense matrices, would you use range indexing? Of course I get your argument, but it would be interesting to know whether this is something people are using in that form, or if this is a hypothetical use case.
One theme of this thread is that there are two components to "what people actually want": The semantics they want, and what's fast now. I prefer copy semantics, and when I use sub it's usually for performance and not because I actually want view semantics. I think the ideal approach may be copy-on-write views as @JeffBezanson proposed in #13157 (comment).
I am not sure it's that simple to decouple semantics from performance. In my point of view, `[...]` should be as lightweight as possible and should prevent any heap allocation.
Regarding copy-on-write: I was actually surprised to see this suggestion by @JeffBezanson, since it would complicate the entire array system. With multithreading under way, it's not obvious to me how this could be implemented in an efficient manner.
WRT copy on write, you can make the kernel do it, but only on OS X AFAICT and it would probably still have many of the costs associated with heap allocation. If we don't use the kernel, it seems you might need to do some fancy things to make copy on write efficient for loops that read and then write to an array, but if it's easier to make the compiler smarter than to make users smarter, it might be worth it.
Threading is already kind of complicated with the array semantics we have now, because it's possible to change the array pointer using push!
etc. and then some threads might be accessing the wrong buffer. I'm not sure if/how that's currently handled but it seems like a similar problem.
If copy-on-write isn't on the table, then the question is essentially whether we should build our language semantics around what yields the best performance for the most common case (dense arrays), or whether we will accept a performance penalty for unoptimized code in the interest of generality. The main caveat to the "best performance" choice is that it doesn't actually help very much. Making `[...]` return views will not free you from thinking about how you're allocating memory unless you're only using scalar operations, since addition, multiplication, etc. of arrays will still allocate the output. So you're still going to have to manually optimize your memory allocation when you care about performance, and reusing memory in all the other places you might allocate is often much more painful than switching `[...]` indexing to `sub`. Given that fact, as well as the fact that changing `[...]` to return views will break a very large proportion of extant Julia code, I don't really think it's a good idea.
Please keep in mind that we can do views of sparse arrays at relatively little cost simply by indexing through the slice objects: i.e. indexing into `S[1, 3:25]` just requires one additional integer add per element access. If the slice is a vector, then we need to hang onto that vector, but I don't see that as a show-stopper – it can be constructed in constant time and is only a little bit slow to work with.
It's true that if you're doing scalar indexing into a view of a sparse matrix, the overhead is going to be minimal compared to the cost of scalar indexing to begin with, because scalar indexing of sparse matrices is not very efficient. But in the absence of specialized methods/algorithms, many things that scale only with nnz for a SparseMatrix would scale with the total number of elements (zeros and non-zeros). Given that sparse matrices are sparse, this seems pretty bad.
SparseMatrices are the worst case, but for basically any subtype of AbstractMatrix that is not a StridedMatrix, as well as StridedMatrices with non-unit stride between adjacent elements, it would likely be faster to allocate a new matrix before trying to do matrix multiplication. Views can really only be used directly for types where there is no more efficient way to do anything than to perform scalar indexing, or for the subset of cases where it's possible to write efficient algorithms to operate on (potentially only certain types of) views and someone has implemented those algorithms.
The least bad option for views of other types is likely to copy them before performing matrix multiplication etc. That's still more expensive than copying at indexing or copy on write if you're going to use them more than once, but at least the asymptotic complexity shouldn't change.
This is why I'm a bit unsure whether it's worth getting too worked up about performance concerns for sparse array views; yes, the naive algorithms are penalized with a view, but it's quite straightforward to fix them. Presumably quite a few algorithms could just be rewritten using

```julia
for M in (SparseMatrixCSC, SubArray{...,SparseMatrixCSC,...})
    @eval begin
        ....
    end
end
```
The thought of optimizing views of BitArrays, on the other hand, is positively frightening to me.
Sorry to butt in here as a pure user who is anxiously watching this debate: the suggestion to make slices _copy-on-write_ is the best news I've heard about Julia in a very long time. I myself, and virtually all of my collaborators on our Julia codes, will be thrilled if the standard slicing notation `A[...]` behaves semantically like a copy.
P.S.: I also like the idea of having simple notation such as `A@[...]` for views.
@cortner: Could you back up why you would be thrilled? I mean: currently it's always a copy, so why are you thrilled?
@tknopp: sorry if I wasn't clear. I actually don't care particularly whether `A[2:5, 6:12]` returns a copy or a copy-on-write abstract array, as long as it behaves like a copy. To some extent it may be just personal preference, but in the end, isn't this debate essentially about the best option for the largest number of users, i.e. personal preferences? For me personally, the view approach that Python takes is something that annoys me every time I write Python code, so I was dreading the day that Julia switched to this behaviour in 0.5.
I strongly believe that the "default behaviour" should be as elementary and safe as possible. On this particular question, I just believe that copy-style behaviour (whether it is a copy or copy-on-write) achieves this.
And to be sure: it would annoy me immensely to have to write `copy(A[2:5, 6:12])`, just as it annoys me to have to write `slice(A, 2:5, 6:12)` at the moment if I want to drop singleton dimensions. (This is just to prove that I am not _just_ a Matlab evangelical.)
Maybe a last point to add to this: while I didn't entirely follow the difficulties with sparse arrays, I think it is _paramount_ that we can slice sparse arrays and get the same behaviour as for dense arrays. For a production code, where you can tweak and tune, this may not matter, but Julia wants to be useful for dirty prototype codes. Here, it is incredibly useful to have this behaviour.
Does anyone know of a write-up or discussion of either Matlab or NumPy developers about their decision to copy-by-default and view-by-default? I don't believe this is the first time in history of programming languages this is discussed, and those who have gone through it might raise relevant points.
Views in NumPy date all the way back to Numeric 1. You can see some discussion around the withdrawn PEP209 from 2001 for a proposed redesign, wherein they cite performance. On the mailing list, they further discuss composability and syntax:
Yes, performance was the main reason. But there is another one: if
slicing returns a view, you can make a copy based on it, but if
slicing returns a copy, there's no way to make a view. So if you
change this, you must provide some other way to generate a view, and
please keep the syntax simple (there are many practical cases where a
view is required).
Sounds familiar. Their choice might have also been influenced by support for the list-of-lists abstraction for multidimensional arrays; `A[0]` is a reference to the first row of a matrix.
Matlab, interestingly, only seems to use copy-on-write sharing for the `A(:)` syntax and reshapes. All other indexing syntaxes are immediate copies (tested on 2014a). I'd be very surprised if there were any public discussions about this… but views would be a huge deviation from the copy-on-write semantics for bindings.
As a Julia user, here are some of my concerns about making array views the default:
@annalam : thank you for this great list.
The change breaks significant amounts of existing packages and user code in a subtle way, without a deprecation period or any way to automatically warn users.
Could you please indicate on what you base your judgement that the change breaks significant amounts of existing packages in a subtle way? This is an honest request. Please give us links to the important Julia packages that all break and where they break.
What happens to a view if the shape/size of the underlying array changes?

Arrays with more than one dimension cannot be resized. A `Vector` may be resized, so some bounds-check inefficiency may creep in there.
@tknopp : You are absolutely right. I should not have written "significant amounts of", that was unwarranted. I have edited the sentence to say "can break existing packages and user code", and added a sentence pointing out that a period of read-only views can mitigate this concern somewhat (although not entirely).
My humble request to everyone here is to not weigh in with further pros and cons. I think pretty much everything is covered here, and what we really need is a set of user codes that serve as benchmarks to help make this a more informed decision.
I am happy to create a repo and manage it, if everyone who has chimed in here is willing to submit a few self-contained codes and test cases, which we can benchmark.
I am happy to create a repo and manage it
Please do @ViralBShah. I think some real code will help the discussion along nicely. Things I'd like to see:
I think with a few "real" examples, some of this might become more obvious.
This is a great suggestion, Viral. I can contribute a use case where the copying behavior is misleading. We should summarize all the subtle bugs there.
Regarding COW, is there any implementation that _won't_ slow down access? I'm thinking of this:
```julia
type COWArray{T,N,A<:AbstractArray} <: AbstractArray{T,N}
    iscopy::Bool
    parent::A
end

function Base.setindex!(A::COWArray, val, indexes...)
    if !A.iscopy                  # copy lazily, on the first write
        A.parent = copy(A.parent)
        A.iscopy = true
    end
    A.parent[indexes...] = val
end
```
Now, _maybe_ branch prediction will mean that the check doesn't slow down non-SIMD access, but testing this would be a first priority.
COW might not be as bad in Matlab because the language wasn't initially designed for element-by-element access patterns---if the only reasonable way to use arrays is via vectorized syntax, you effectively hoist that COW check.
@timholy If we can stipulate that we only care about the overhead in loops, then I think henchmen unrolling might work. The copying would happen in the unrolled first iteration, and the check could be eliminated from the main loop (with appropriate TBAA). This only helps for loops with simple branching patterns but those are likely the only loops where the overhead would matter.
I think this still does not have the same semantics as copy since you will see mutations of the original array
note that if we choose copy _semantics_ then we can maybe play tricks where we copy small arrays and use the virtual memory system for arrays large enough to be page-aligned. I'm not convinced that would be a lot faster but at least you can use normal loads/stores in the compiled code.
@carnaval: In a proper COW implementation, the "mother" array would need a backlink to all its children and initiate a copy if it changes. IMHO much too complex, and even more complex (-> slow) when considering multithreading.
I'd like to provide a concrete example for people to consider. Presume I want to implement completely pivoted LU, an algorithm that you will see in textbooks described like so (Golub and Van Loan 4/e, p. 132, Algorithm 3.4.3):
This particular algorithm is quite rich in indexing and slicing behaviors.
A naive user would want to write for this algorithm something like
```julia
function lucompletepiv!(A)
    n = size(A, 1)
    rowpiv = zeros(Int, n-1)
    colpiv = zeros(Int, n-1)
    for k = 1:n-1
        As = abs(A[k:n, k:n])
        μ, λ = ind2sub(size(As), indmax(As))
        μ += k-1; λ += k-1
        rowpiv[k] = μ
        A[[k,μ], 1:n] = A[[μ,k], 1:n]
        colpiv[k] = λ
        A[1:n, [k,λ]] = A[1:n, [λ,k]]
        if A[k,k] ≠ 0
            ρ = k+1:n
            A[ρ, k] = A[ρ, k]/A[k, k]
            A[ρ, ρ] = A[ρ, ρ] - A[ρ, k] * A[k, ρ]
        end
    end
    return (A, rowpiv, colpiv)
end
```
but because of all the copies that are made by the indexing expressions, this Julia code in 0.4 is about twice as slow as the equivalent Python code
```python
import numpy as np

def lucompletepiv(A):
    assert np.size(A, 0) == np.size(A, 1)
    n = np.size(A, 1)
    rowpiv = np.zeros(n-1, dtype=int)
    colpiv = np.zeros(n-1, dtype=int)
    for k in range(n-1):
        Asub = abs(A[k:n, k:n])
        mu, lam = np.unravel_index(np.argmax(Asub), np.shape(Asub))
        mu, lam = mu + k, lam + k
        rowpiv[k] = mu
        A[[k, mu], :n] = A[[mu, k], :n]
        colpiv[k] = lam
        A[:n, [k, lam]] = A[:n, [lam, k]]
        if A[k, k] != 0:
            rho = slice(k+1, n)
            A[rho, k] /= A[k, k]
            A[rho, rho] -= np.dot(np.reshape(A[rho, k], (n - (k + 1), 1)),
                                  np.reshape(A[k, rho], (1, n - (k + 1))))
    return (A, rowpiv, colpiv)
```
I'd be curious to see how much faster people can make this Julia code by changing only the indexing operations.
Ref: https://github.com/JuliaLang/julia/issues/13157#issuecomment-168378436
:+1:
For comparison, this is what we'd have to write in 0.4 for better performance:
```julia
function lucompletepiv3!(A)
    n = size(A, 1)
    lda = stride(A, 2)
    rowpiv = zeros(Int, n - 1)
    colpiv = zeros(Int, n - 1)
    @inbounds begin
        for k = 1:n - 1
            offsetk = (k - 1)*lda
            # idxmaxabs2, swaprows! and swapcols! are helper routines
            # defined elsewhere (not shown here)
            μ, λ = idxmaxabs2(A, k:n, k:n)
            rowpiv[k] = μ
            swaprows!(A, k, μ)
            colpiv[k] = λ
            swapcols!(A, k, λ)
            if A[k,k] ≠ 0
                ρ = k + 1:n
                scale!(1/A[k + offsetk], sub(A, ρ, k))
                for j in ρ
                    offsetj = (j - 1)*lda
                    Akj = A[k + offsetj]
                    @simd for i in ρ
                        A[i + offsetj] -= A[i + offsetk] * Akj
                    end
                end
            end
        end
    end
    return (A, rowpiv, colpiv)
end
```
I have some code here that I would be happy to contribute to the testing repo. The code is a simplified model of one part of the CFD solver I am working on (calculating the Euler flux for a chunk of the mesh). It tests working with large numbers of small arrays.
Wow - the examples came fast. I think we need many more codes that are closer to workloads or real applications. Maybe packages will be a good source.
I just created this repo: https://github.com/JuliaLang/IndexingBenchmarks
If everyone here who submitted snippets can commit them into the above repo or provide a PR, I will take on the task of organizing and preparing a harness to run stuff etc.
Permissions problem with the repo:

```
remote: Permission to JuliaLang/IndexingBenchmarks.git denied to JaredCrean2.
fatal: unable to access 'https://github.com/JuliaLang/IndexingBenchmarks.git/': The requested URL returned error: 403
```
@JaredCrean2 You should be able to write to that repo now. In general, for folks who do not have commit access, they will have to submit PRs, but happy to add anyone who is contributing/helping there.
I submitted a PR instead (I didn't see your comment until just now).
What is a minimal set of tests to be useful? A julia (0.4) version of an algorithm using views and one using copies?
Two checkboxes in two days. At this rate, we'll be done in no time :smile:.
Several examples were posted in this issue demonstrating that simply returning views-by-default is insufficient to solve many performance problems---many of these would be better addressed by "automatic devectorization." This gets us into the realm of the sufficiently-smart compiler. However, in https://github.com/JuliaLang/julia/pull/6837#issuecomment-213617933 it struck me that, since the merger of jb/functions, we now have the machinery to at least represent these operations in an efficient manner.
Of course, there's still an enormous amount of compiler magic that would need to be done.
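To make "represent these operations" concrete: since jb/functions, every function has its own concrete type, so a generic fused kernel specializes per function with no indirect-call overhead. A rough illustration of the idea (not the proposed compiler work itself):

```julia
# Each `f` gets its own specialized, inlined loop because functions now
# have distinct types; this is the basis for fusing elementwise operations.
function fuse!(f, dest, A, B)
    @inbounds for i in eachindex(dest, A, B)
        dest[i] = f(A[i], B[i])
    end
    return dest
end

fuse!((a, b) -> a + 2b, zeros(3), rand(3), rand(3))
```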
I would add that we also don't know how fast the views-by-default approach can be made once view objects can be stack allocated. Since the ability to do that is in sight before 1.0, this decision could still go either way – hence the decision not to decide yet.
Which items here are still in scope for 0.5?
I think I've found a way to avoid the big slowdown in test times when introducing the Transpose type, so "Ditch special lowering of Ac_mul_Bt" should still be in scope. Working on it.
Of the "core" items, I have the impression we've decided against "Return slices as views." What @andreasnoack is working on will close most of the rest. Seems like the prospect for "Easier creation of immutable arrays" is dim? The "any index type" could probably be finished quickly.
I am hoping for #16260, but no promises. On my local branch (which is well ahead of what I've pushed), I have a weird inference error I am having trouble tracking down.
Can someone point to the "return slices as views" discussion? I know it's probably been hashed over and over, but is there at least a way in Base to now do something like `view(A, i:j)` (such as provided by the ArrayViews package)?
I think it was mostly here, but github collapses really long discussions now. There's a "show hidden comments" button midway through.
I would characterize the "slices as views" decision as "not yet, maybe never". In the current state of our compiler technology, returning views rather than copies is a wash in terms of performance: sometimes it's better but often it's not and sometimes it's worse. In the future, when we can allocate objects on the stack that refer to the heap this may change, at which point we can re-evaluate the performance tradeoff. It's possible that with that capability to stack-allocate views, returning views will be such a performance win that we decide to do it, but it's also possible that it isn't and we decide against. Given the uncertainty and the extremely (subtly) disruptive nature of the change, it seemed best to hold off for now.
Thanks for the summary @StefanKarpinski; do we at least have the ability, though, to return a view? I think that would be the minimum feature, so that in my own code I know I can do `view(A, i:j)` when I know that's all I need/want for performance.
We've had that for ages; see `sub` and `slice`, which got fast in time for julia-0.4. We now have a few additional view types, too (at least `ReshapedArray` and the unexported `PermutedDimsArray`).
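For anyone catching up, the practical difference between the two (to the best of my understanding of the 0.4 behavior) is what happens to scalar-indexed dimensions:

```julia
A = rand(3, 4)
size(sub(A, 2, :))    # (1, 4): sub keeps dimensions indexed by a scalar
size(slice(A, 2, :))  # (4,):   slice drops them (the planned getindex behavior)
```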
Thanks @timholy, sorry for my ignorance here. :)
The "any index type" could probably be finished quickly.
I no longer think we should support any index type due to the difficulty in getting dispatch to do so — I've edited that bullet point. I think two things should happen for that item:

1. Scalar indexing fallbacks need to be tightened to `Integer`. This will cause `A[1.0]` to error with a message that indexing with `Float64` is not supported. And the deprecation in `to_index` can then be removed.
2. Nonscalar indexing fallbacks can be widened to allow all `Number`s. They only really need to know that those indices are scalar, and then pass them along to the scalar indexing methods. If the custom array adds support for floating point or other more complicated indices, it'll just work*ⁱˢʰ. Otherwise, it'll error like above.

There are two tricky parts here:

1. What to do with `to_index`? It now really only does two things: call `find` on logical arrays and convert integers to `Int`. It'd be weird for SubArrays and non-scalar indexing methods to pass all scalars through untouched except integers — those would get converted to `Int` first by `to_index`. They probably shouldn't muck with any scalars at all. We could split `to_index` into two parts: `to_nonscalar_index` (calls `find` and allows a hook for e.g., NullableArrays to optimize their use as indices) and `to_scalar_index` (just converts to `Int`)? While we're refactoring here, do we also want to allow for the array being indexed into to have a say about its indices, a la https://github.com/JuliaLang/julia/pull/15750/files#diff-a21a7fc275830bf9efb5b7c17c4fb98eR439?
2. The *ⁱˢʰ caveat: custom index types whose results aren't representable in the array's `eltype` (like interpolating with Dual numbers). They'll still allocate an array of `eltype`s for the output, but will hit an error upon trying to assign into it. It's fine for now, and in the future a motivated individual could possibly implement a `promote_eltype_op` specialization for `getindex`.

> Scalar indexing fallbacks need to be tightened to `Integer`. This will cause `A[1.0]` to error with a message that indexing with `Float64` is not supported.
Is the problem with allowing floats to be used for indexing technical or philosophical?
> We've had that for ages; see `sub` and `slice`, which got fast in time for julia-0.4. We now have a few additional view types, too (at least `ReshapedArray` and the unexported `PermutedDimsArray`).
Has there been any consideration of renaming `sub` to `view`? `sub` really doesn't indicate very well that it will return a view...
`view` is used by the ArrayViews package, which currently has significant performance advantages in certain cases (small contiguous views).
Following up on @davidanthoff, I think we should deprecate either `sub` or `slice`, and it should probably be `sub` now that `getindex` behavior matches `slice`.
> Is the problem with allowing floats to be used for indexing technical or philosophical?
It's a mix of both, but if we detangle scalar/nonscalar `to_index` like I suggest above, then that removes (the last of?) the technical reasons. Linear indexing requires doing math on the indices, which requires integers. A lot has changed since we merged that deprecation (https://github.com/JuliaLang/julia/pull/10458).
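A rough sketch of what that detangling might look like (hypothetical names from the comment above, not actual Base code):

```julia
# Nonscalar path: only transform logical masks; pass scalars through
# untouched so custom scalar index types reach scalar getindex unchanged.
to_nonscalar_index(I::AbstractArray{Bool}) = find(I)
to_nonscalar_index(I) = I

# Scalar path: the one place integers get normalized to Int.
to_scalar_index(i::Integer) = Int(i)
```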
Ah, I had thought the Base capability is now a complete superset of the ArrayViews stuff and always preferred... It is a bit of a shame that the more intuitive name is used in the package and Base is left with something less clear...
It seems unfortunate and somewhat backwards for a name used in a package to block choosing a much clearer name for a function in Base. Perhaps the way forward is to eliminate the performance advantage of ArrayViews, deprecate that package, and change the function name to Base.view.
My assumption/hope is that the performance gap will go away when we are able to stack-allocate containers with heap references...and I am skeptical that there's anything we can do about it without that. The `ArrayView` wrapper is just smaller, so being able to inline & elide the wrapper creation should do the trick. That's why the difference only shows up with creation of small arrays---on most other benchmarks, `SubArray` equals or outperforms ArrayViews.
Oh, and I agree about deprecating `sub`.
Base and ArrayViews could both use `view`. After all, ImageView also uses `view` :)
I suppose that could work because ArrayViews defines methods only for `Array` but Base defines methods for `AbstractArray` (I think?). How would this work when different scopes exist:
module A
function myfunc(a::AbstractMatrix)
    av = view(a, :, 1)
    # do something with av
end
end

module B
using ArrayViews
end
Does `typeof(av)` change depending on whether module B has been loaded (assuming `a` is an `Array`)?
ArrayViews would have to stop exporting `view`, and anytime you wanted to use ArrayViews' version you'd say `ArrayViews.view(A, :, 1)`.
ImageView would also have to stop exporting `view`, but I'm fine with that idea.
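A tiny illustration of the name-collision behavior in question, using toy modules rather than the real packages: once two modules export the same name, Julia warns/errors on unqualified use rather than silently picking one, so you must qualify.

```julia
module M1
export view
view(a) = "M1's view"
end

module M2
export view
view(a) = "M2's view"
end

using M1, M2   # (0.x-era syntax; newer Julia wants `using .M1, .M2`)
# view(1)      # warns/errors: both M1 and M2 export "view"; must qualify
M1.view(1)     # => "M1's view"
M2.view(1)     # => "M2's view"
```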
Of the remaining issues, only #5332 and #16846 remain for 0.5; moving this issue to 0.6.
I'm also planning to do https://github.com/JuliaLang/julia/pull/16260#discussion_r67140179:
- Make `size` and `length` throw an error for arrays with unconventional indexing
- Add `@arraysafe` to rewrite calls to `size` and `length` to something that doesn't throw an error
- Merge `allocate_for` into `similar`
Probably won't be done until after JuliaCon, unfortunately. In that same discussion, @eschnett has made a case for introducing a separate type for linear indexing, and pointed out that the introduction of `linearindices` is an opportune moment to do so; I don't disagree, but I don't think I'll have time to tackle that myself, so that one is up-for-grabs.
As long as you're on it, @timholy – needs to be done by next week so we can tag an RC. If you'd like to open an issue and put it in the 0.5.0 milestone so we can track it, you can.
Shouldn't the title be adapted, if the milestone is moved?
I believe the 0.6 parts of this are reflected in more specific issues and PRs; moving to 1.0.
See also #20164 to more easily opt-in to views for a large block of code.
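If I'm matching the PR correctly, that opt-in is the `@views` macro; a quick sketch of its effect (assuming the behavior as it eventually landed):

```julia
# Inside @views, every slice on the right-hand side is a view, not a copy.
function smooth!(out, x)
    @views out[2:end-1] .= (x[1:end-2] .+ x[3:end]) ./ 2
    return out
end
```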
Everything in this issue is either done, has its own issue, or isn't going to happen (I checked).