Julia: What should repeat(::OffsetArray, ...) do

Created on 15 Jun 2020  ·  6Comments  ·  Source: JuliaLang/julia

In #35944 we noted that repeat(::OffsetArray, ...) is a little funny; it errors for inner repeats and happens to return 1-based arrays for outer repeats. I don't think either of these behaviors were ever explicitly thought through; it'd be good to decide how we want them to behave in the future.

arrays

Most helpful comment

My original intention for collect was pretty simple: it should produce the vector of values that you would get if you iterated an entire iterable and put the results in a vector. But that concept was not adhered to terribly well and now it has a hodgepodge of behaviors 🤷‍♂️

All 6 comments

repeat, like cat, push!, etc, seems to belong to the "arrays as lists" worldview rather than "arrays as associative containers" worldview. xref https://github.com/JuliaArrays/CatIndices.jl, which has never yet gotten around to deciding on a very complete solution to this question but has taken a few tentative stabs at it.

One option I can't remember noticing in the past (but surely @andyferris and others have): should we use values(A) to indicate that we wish to strip the indices, returning a 1-based indexing view? At one point I decided to call this collect, but that's eager rather than lazy and the docstring for collect is not as clear on this point as I seem to remember it.

Currently it's a bit of a wreck:

julia> a = Base.IdentityUnitRange(-3:3)
Base.IdentityUnitRange(-3:3)

julia> collect(a)
7-element Array{Int64,1}:
 -3
 -2
 -1
  0
  1
  2
  3

julia> Array(a)
7-element Array{Int64,1}:
 -3
 -2
 -1
  0
  1
  2
  3

julia> b = OffsetArray(-3:3, -3:3)
-3:3 with indices -3:3

julia> collect(b)
7-element Array{Int64,1}:
 -3
 -2
 -1
  0
  1
  2
  3

julia> Array(b)
ERROR: BoundsError: attempt to access 7-element Array{Int64,1} at index [-3:3]
Stacktrace:
 [1] throw_boundserror(::Array{Int64,1}, ::Tuple{OffsetArrays.IdOffsetRange{Int64,Base.OneTo{Int64}}}) at ./abstractarray.jl:537
 [2] checkbounds at ./abstractarray.jl:502 [inlined]
 [3] copyto!(::Array{Int64,1}, ::OffsetArray{Int64,1,UnitRange{Int64}}) at ./multidimensional.jl:959
 [4] Array at ./array.jl:541 [inlined]
 [5] Array(::OffsetArray{Int64,1,UnitRange{Int64}}) at ./boot.jl:429
 [6] top-level scope at REPL[9]:1

julia> values(b)
-3:3 with indices -3:3

julia> values(a)
Base.IdentityUnitRange(-3:3)

It's a good question. I have to say, I'm even sometimes a little confused about what collect is supposed to do (well, it's eager?) - it seems like such a nice generic english verb one might like to extend, but it's tied to the Array literal syntax.

we wish to strip the indices, returning a 1-based indexing view

If I think beyond arrays, I can use enumerate(iter) to get "indices" for each element in an iterable which might not even have been iterable to begin with. Not all iterables will support fast lookup based on their iteration position, but for those that do I can imagine instead of doing something like:

for (i, x) in enumerate(iter)
    ...
end

I could do something like

for (i, x) in pairs(enumeration(iter))
    ...
end

and enumeration(iter) would be an indexable such that enumeration(iter)[1] === first(iter), etc.

values is currently an AbstractDict thing also. E.g. it would seem natural to me if values(::Dict) would still be indexable - and the indices don't change. Whatever we come up with it would be nice to have similar semantics across dictionaries and arrays (and their keys).

My original intention for collect was pretty simple: it should produce the vector of values that you would get if you iterated an entire iterable and put the results in a vector. But that concept was not adhered to terribly well and now it has a hodgepodge of behaviors 🤷‍♂️

Thanks Stefan. Out of curiosity, what was wrong with using vector for that - like we have tuple and string as constructors for other built-in collections?

Related to this, I suggested more "lawful" approach to values etc. in #36175.

I think it's a good idea to make values and keys optionally indexible by "tokens". The tokens can be 1-origin dense indices for arrays.

what was wrong with using vector for that

I think the problem is that it doesn't scale. I don't think it's ideal for each custom container to have lowercase factory function that may or may not do what you expect (although it might be OK for Vector and Array). See #36288

Was this page helpful?
0 / 5 - 0 ratings