Now that we distinguish `one` and `oneunit`, `ones(T, sz)` seems like a misnomer. Deprecate it in favor of `fill(oneunit(T), sz)`? If so, should we ditch `zeros` too?
xref https://github.com/JuliaLang/julia/issues/11557#issuecomment-339776065 and below, and also @Sacha0's PR #24389
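For context, the `one`/`oneunit` distinction at the heart of this issue can be seen with the standard `Dates` library (a minimal sketch; `Day` is used here only as a convenient dimensionful type):

```julia
using Dates

# For plain numbers the two coincide:
one(Float64)      # 1.0
oneunit(Float64)  # 1.0

# For dimensionful types they differ: `one` is the dimensionless
# multiplicative identity, while `oneunit` carries the type's unit.
one(Day(3))       # 1, a plain Int — something you can multiply a Day by
oneunit(Day(3))   # Day(1) — "1 day"
```

So `ones(Day, 3)` filling with `oneunit(Day)` (an array of 1-day periods) is genuinely different from filling with `one(Day)` (an array of dimensionless 1s), which is the ambiguity being discussed.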
I have work in progress to this effect that I hope to post in the next day or two :). Best!
To me `fill(oneunit(T), sz)` looks like a non-trivial loss in readability compared to `ones(T, sz)`.
Note that you rarely need to write something as verbose as `fill(oneunit(T), sz)`; usually a literal or something similarly compact suffices in place of `oneunit(T)`. Shorter incantations may also become possible with future changes to array constructors. Additionally, once you've transitioned to `fill`, you grow fond of it, I assure you :). Best!
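The "a literal usually suffices" point can be illustrated concretely (the `make_ones` helper at the end is hypothetical, just to show the one case where the verbose form is still needed):

```julia
# With a concrete element type, a literal replaces oneunit(T):
fill(1.0, 3) == ones(3)            # true — both Vector{Float64}
fill(1, 2, 2) == ones(Int, 2, 2)   # true — both Matrix{Int}

# The verbose spelling only matters when T is a type parameter,
# e.g. in a generic helper (hypothetical name):
make_ones(::Type{T}, sz...) where {T} = fill(oneunit(T), sz...)
make_ones(Int, 2)                  # [1, 1]
```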
We could just pick whether `ones` uses `one` or `oneunit`. `ones` and `zeros` should be considered convenience functions, which is ok as long as they have clear meanings in terms of more general functions.
Plan per triage: Revisit next week, specifically considering moving `ones` and `zeros` to the MATLAB compatibility layer, in favor of `fill` in base. Best!
`ones` and `oneunits` would distinguish the two options.
Not sure how I feel about moving this and `zeros` to MATLAB compatibility. Just because MathWorks invented it doesn't mean it's not a good convenience API that we can be proud of. Just briefly, what were the reasons behind this thought? (Sorry, I have to say that triage seems much more efficient, but significantly less transparent when the reasoning isn't given.)
It's not a change I'm thrilled about either, but I raised it because it's a logical inconsistency. I think it's fair to say that `ones(T, sz)` implies `fill(one(T), sz)`, but what it really does under the hood is `fill(oneunit(T), sz)`.
One option would be to rename it to `oneunits`. An alternative would be to do a horrific swap: `one` -> `algebraic_one` and `oneunit` -> `one`. This is breaking and not possible to do via deprecation, which is why I say "horrific."
Yes, I would say adding `oneunits` and changing `ones` to give `fill(one(T), ...)` would be the obvious fix for that, no?
I'd be fine with that. Out of curiosity, what are the uses of `fill(one(T), sz)`? Do we even need `ones` at all?
Haha :) I was going to ask: what do you use `fill(oneunits(T), sz)` for? (E.g. what would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)
I think `fill(one(T), sz)` is used for much the same reason as `zeros(T, sz)`, which is to initialize an array for a custom reduction-type operation. An array `z = zeros(T, sz)` is ready to have its elements added to with `z[i] += x`, whereas `o = fill(one(T), sz)` is ready to have its elements multiplied with `o[i] *= x`. For example, I'm thinking of situations where the array elements might represent (relative) probability. In these two cases, I'm looking for the identity for the `+` or `*` operator respectively (and not additive generators).
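The accumulator-identity pattern described above can be sketched as follows (the data is made up for illustration):

```julia
xs = [0.5, 2.0, 0.25]

# Additive accumulation: start from the identity of +, i.e. zeros.
z = zeros(Float64, 3)
for (i, x) in enumerate(xs)
    z[i] += x            # sums accumulate into a zero-initialized array
end

# Multiplicative accumulation: start from the identity of *, i.e. ones.
o = fill(one(Float64), 3)
for (i, x) in enumerate(xs)
    o[i] *= x            # products accumulate into a one-initialized array
end

# After a single accumulation step both arrays simply hold xs:
z == xs && o == xs       # true
```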
See also #23544
Haha :) I was going to ask - what do you use fill(oneunits(T), sz) for? (E.g. What would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)
The dependent variables for an ODE. It would be weird if it just randomly chopped off units when you asked for an array of type `T`?
The fundamental reason to prefer `fill` is that a tight set of powerful, orthogonal, composable tools serves better than a larger collection of ad hoc, limited, and overlapping tools such as `ones`/`zeros`/`trues`/`falses`. The discussion above highlights this point: while `fill` accommodates all use cases above unambiguously and (in most real use) concisely, the `ones`/`zeros`/`trues`/`falses` approach requires a new function for each use case.
A couple of relevant examples from rewrites in base: `complex.(ones(size(Ac, 1)), ones(size(Ac, 1)))` becomes `fill(1.0 + 1.0im, size(Ac, 1))`, and `2ones(Int, 2, 2, 2)` becomes `fill(2, (2, 2, 2))` (or `fill(2, 2, 2, 2)` if you prefer). Please note that these `fill` incantations are simpler, more readable, more compact, and more efficient than their non-`fill` counterparts, and `base/` and `test/` are rife with examples like these. As with all change, transitioning to `fill` requires some initial mental adjustment. But post-adjustment you find you have more power and elegance at your fingertips :). Best!
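The two rewrites quoted above can be checked directly (`Ac` here is just a stand-in array; only its first dimension matters):

```julia
Ac = zeros(3, 2)  # stand-in for whatever array appeared in base

# Rewrite 1: broadcasting complex over two ones-arrays vs one fill.
@assert complex.(ones(size(Ac, 1)), ones(size(Ac, 1))) ==
        fill(1.0 + 1.0im, size(Ac, 1))

# Rewrite 2: scaling an Int ones-array vs filling with the literal,
# with both fill spellings (tuple and splatted dims) agreeing.
@assert 2ones(Int, 2, 2, 2) == fill(2, (2, 2, 2)) == fill(2, 2, 2, 2)
```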
@Sacha0: `trues`/`falses` are not directly replaceable by `fill`, but need to use `fill!` with a `BitArray` initializer. They are also not included in the ambiguity between `one` and `oneunit`. Therefore I don't think that they fit in this discussion at all.
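The container-type point being made here is easy to demonstrate: `fill` produces an ordinary `Vector{Bool}`, not the packed `BitArray` that `trues`/`falses` give you.

```julia
tb = trues(100)        # packed BitVector (BitArray{1})
vb = fill(true, 100)   # ordinary Vector{Bool}, one byte per element

tb == vb               # true — same values...
typeof(tb) == typeof(vb)   # false — ...different storage
sizeof(tb) < sizeof(vb)    # true — BitArray packs 64 elements per chunk
```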
As for `ones`, I'm generally opposed to deprecating it anyway; I don't see the benefit. The argument that some expressions can be written more effectively with `fill` is not very compelling in my opinion, as all these examples use `ones` as an intermediate step to do something else; but what about when you actually want an array of ones? Then having to use `fill` is longer, less obvious and just more annoying. I like @JeffBezanson's proposal best. If parts of `base` or `test` can be rewritten more effectively using `fill` instead of `ones`, there's nothing preventing that right now.
@carlobaldassi While that is true, a quick GitHub search reveals that almost all uses of `ones` should really be using `fill` to avoid intermediate allocations...
Using `fill` for these cases makes sense to me. Note that we will want a new `fill(x, A) = fill!(copymutable(A), x)` method to replace the corresponding methods of `ones` etc.
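A sketch of what such an array-shaped `fill` method could look like. This is hypothetical and not in Base; the name `fill_like` is made up, and it is written with `similar` rather than the internal `Base.copymutable` mentioned above:

```julia
# Hypothetical convenience method: fill an array shaped/indexed like A,
# with the element type taken from x.
fill_like(x, A::AbstractArray) = fill!(similar(A, typeof(x)), x)

fill_like(0, rand(2, 3))   # 2×3 Matrix{Int} of zeros, shaped like A
fill_like(1.0, 1:4)        # 4-element Vector{Float64} of ones
```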
@TotalVerb From a quick skim of the first 10 pages of that search, I wouldn't say that "almost all" is a fair assessment. A "significant fraction", at best. There are lots and lots of legitimate use cases for `ones` there, perhaps even the majority (and even if they were just, say, 20%, I think my argument still stands).
(I also actually have reservations about the readability argument. I think it's questionable to claim that e.g. `fill(2, 2, 2, 2)` is more readable than `2 * ones(Int, 2, 2, 2)`, or that, say, `fill(k * 1.0, 3)` would be more readable than `k * ones(3)`; I'm not at all convinced that it's merely a matter of habit. This is a minor point though.)
trues/falses are not directly replaceable by fill, but need to use fill! with a BitArray initializer. They are also not included in the ambiguity between one and oneunit. Therefore I don't think that they fit in this discussion at all.
Indeed, `trues` and `falses` are not directly replaceable by `fill` :). Rather, `trues` and `falses` are additional instances of the general problem described above / related to #11557 (and that the recent direction of array constructors will hopefully tackle). Other examples include the existence of e.g. `bones`, `bzeros`, `brand`, `brandn`, and `beye` in BandedMatrices.jl and equivalents with a `d` prefix in DistributedArrays.jl.
As for ones, I'm generally opposed to deprecating it anyway, I don't see the benefit. The argument that some expression can be written more effectively with fill is not very compelling in my opinion, as all these examples use ones as intermediate steps to do something else
Having just rewritten a few hundred uses of `ones` in base, I can affirm @TotalVerb's statement

a quick GitHub search reveals that almost all uses of ones should really be using fill to avoid intermediate allocations...

(Edit: Though I would say roughly half rather than almost all, and appropriate rewrites may be something other than `fill`.) Moreover, that rewrite experience has taught me...
but what about when you actually want an array of ones? Then having to use fill is longer, less obvious and just more annoying.
... that on the other hand, `fill` is frequently shorter and simpler in that case: the requested ones are frequently not `Float64` (instead e.g. `ones(Int, n...)` and `ones(Complex{Float64}, n...)`), in which case `fill` is shorter and simpler by admitting a literal (e.g. `fill(1, n...)` and `fill(1.0 + 0im, n...)`). In measurable terms, the branch in which I have been rewriting `ones` calls in base is ~5% shorter by character count from `ones` -> `fill` rewrites. Best!
To get an objective sense of how `ones` appears in the wild, I collected all `ones` calls appearing in the first ten pages of a GitHub search for `ones` in Julia code, rewrote each such call as appropriate and classified the corresponding change (see this gist), and then reduced the classification data to the following summary:
The analysis included 156 `ones` calls. Of those calls,

- 84 calls (~54%) were ad hoc `fill`s. (For example, `ones(100)/sqrt(100)*7` simplifies to `fill(7/sqrt(100), 100)` or, better yet, `fill(.7, 100)`. My favorite was `kron(0.997, ones(1, J*J*s))` -> `fill(0.997, 1, J*J*s)`.)
- 3 calls (~2%) were ad hoc broadcasts. (For example, `A - ones(n,n)` simplifies to `A .- 1.`.)
- 5 calls (~3%) were ad hoc vector literals. (For example, `ones(1)` simplifies to `[1.]`.)
- 1 call (~0.5%) was semantically a junk matrix construction. (Though relatively uncommon in the wild, this pattern is quite common in `test/` because we do not have a concise convenience constructor for uninitialized `Array`s, as in e.g. `@test_throws DimensionMismatch BLAS.trsv(...,Vector{elty}(n+1))` versus `@test_throws DimensionMismatch BLAS.trsv(...,ones(elty,n+1))`.)
The remaining calls were semantically reasonable as `ones`, though often `ones` sees use merely because it's short, rather than because `one`s specifically are necessary. Of those remaining calls,

- 13 calls (~8%) were slightly shorter as `fill`. (For example, `ones(Int, n, n)` -> `fill(1, n, n)` or `ones(Float64, n)` -> `fill(1., n)`.)
- 50 calls (~32%) were slightly longer as `fill`. (For example, `ones(n, n)` -> `fill(1., n, n)`.)
Overall, in the wild ~60% of `ones` calls are better written somehow else, ~8% are reasonably semantically `ones` and slightly shorter as `fill`, and ~32% are reasonably semantically `ones` and slightly longer as `fill`.
An additional observation:
I encountered only one instance of a `ones` call accepting an array argument, and it wasn't clear whether the enclosing snippet was real code. So the `ones` methods accepting an array argument see little if any use in the wild.
Really interesting discussion... I went from being on the against side to the for side. Also, as another language precedent, R uses `rep` and `matrix` in a manner that is equivalent to `fill` (just corresponding to the 1d and 2d cases), and you get used to it very quickly -- even though I came from a zeros/ones world.
Wow, thank you @Sacha0 for putting in that effort!
The question naturally arises: "what about `zeros`?" I'm guessing there will be significantly more usage, and a few more categories of use (including things like "I simply don't trust uninitialised arrays" or "I don't know how to use the `Array` constructors").
For whatever reason (I guess it's the symmetry of `one` and `zero`), I'm somewhat attracted towards replacing either both `ones` and `zeros` with `fill`, or neither.
The thing with `zeros` is that you would seem to be in one of these situations:

`0I`

There's really no use case where actually allocating a matrix of dense zeros is a good idea.
That's perhaps true of linear algebra. It's not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it's not worth the performance impact to represent them compactly.
Fair enough -- sometimes you don't care about density and the simplicity is worth it.
Triage: resolved that we keep only the completely non-generic methods, i.e. `zeros(dims...)` and `ones(dims...)`, and maybe `zeros(dims)` and `ones(dims)` as well.
@StefanKarpinski For usage recommendations, does that mean we would recommend `zeros(3, 3)` over `fill(0.0, 3, 3)` for normal code (when a dense array is wanted, etc. etc.)? Some of the details of efficiency etc. are out of my reach; I am just thinking of how I would teach best/idiomatic practices in Julia moving forward.
This decision seems very surprising to me; it's not so common in base to specifically prevent genericity. What is the reasoning behind it? Is it that this function comes from MATLAB, where it is not generic (and works only with floats)?
Triage: resolved that we keep only the completely non-generic methods
What is the reasoning behind?
Further to this, is this issue somehow related to the general problem of constructing arrays that seems to be being considered right now, and if so, how does this help? Or are we trying to reduce the number of exported methods from `Base`? (Or something else entirely?)
Writeup forthcoming in https://github.com/JuliaLang/julia/issues/24595 :). Best!
The writeup in #24595 discusses `ones`, `zeros`, and `fill` in depth. Reading the former is valuable in appreciating the latter. Best!

Well, saving at least the `Float64` case is better than nothing.
I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.
It should also be noted that `zeros` does not have the "problem" of allowing the `3 * ones(n)` anti-pattern. In fact, I don't really see why `ones` and `zeros` should go together, except in the broad sense of being convenience constructors. There is no real 'symmetry' between those two.
A couple of additional comments on the statistical analysis, since it seems to be the basis of the following discussions and the writeup of #24595. First, ten pages are not really enough for fine-grained conclusions about what's going on in the wild; they can give a rough idea at best. Some files came straight from MATLAB, for example, as is clear from their name/style. Secondly, as I suspected, even that analysis shows that roughly half of the uses of `ones` in there were "legitimate". Thirdly, looking at code like this doesn't tell us anything about whether writing `3 * ones(...)` is really an anti-pattern that creates performance issues, or whether it's a piece of code that has no performance implications at all (and the writer may have decided that it's just more readable written that way -- which I strongly feel is nobody else's business to decide otherwise, in that case).
Related to the last point, and I think more importantly, what you get to see from a GitHub search will never take into account what's going on in the REPL of people doing exploratory/preliminary work in Julia, which is exactly where convenience functions come in most handy, and taking them away for no discernible reason is correspondingly most annoying. My point being: having a consistent set of orthogonal primitives that allows one to write generic and efficient code is a great goal, and the effort to get there is truly commendable; it's just that not all code is supposed to be beautiful, generic, composable library code. Just my two cents.
Regarding

I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.

which refers to

It's not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it's not worth the performance impact to represent them compactly.

note that `fill` serves as well or better in that case: `fill(0, shape...)` versus `zeros(Int, shape...)`.
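The claimed equivalence is easy to verify, and the `fill` spelling also covers nonzero sentinels that have no `zeros`/`ones` analogue:

```julia
# Identical contents and container type:
fill(0, 2, 3) == zeros(Int, 2, 3)                   # true
typeof(fill(0, 2, 3)) == typeof(zeros(Int, 2, 3))   # true — Matrix{Int}

# fill also handles initializers that zeros/ones can't express,
# e.g. a -1 sentinel value:
fill(-1, 2, 3)   # 2×3 Matrix{Int}, every element -1
```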
Concerning your other points, #24595 may be worth reading :). Best!
I simply find that `zeros(Int, 10, 10)` is more readable/explicit than `fill(0, 10, 10)`, and `zeros(T, k)` is better than `fill(zero(T), k)`. Why can't we just have both? I don't buy the argument that `zeros` suffers the same ambiguity problem as `ones`.
Concerning your other points, #24595 may be worth reading
I had read it. (I even linked it.)
I had read it. (I even linked it.)
Having read #24595 in full and given it due consideration, you are then aware that #24595: (1) concerns a much broader issue of which convenience constructors are only a part; and (2) considers far more than just the statistical analysis posted above and the points you focus on here.
I appreciate that you feel strongly about `ones` and `zeros`; your sentiment has come through loud and clear :). As such, chances are our bandwidth would be better spent pushing other fronts forward than continuing this conversation in its present form. Best!
Is there a generic `fill` incoming along with the `zeros` change? `zeros` had a very legitimate use for generic programming since it's much safer than `similar`, so all of DiffEq, and I know recently Optim and NLsolve, moved from allocating with `similar` to `zeros`, since knowing everything is allocated to have zeros in it stops a lot of bugs. However, now it seems there's going to be no method for `zeros(X)` anymore, other than `fill!(similar(X), 0)`, because the current `fill` only builds `Array`s and thus doesn't type-match like `similar` or `zeros` does. I know that some people misused `zeros` to allocate when they shouldn't have, but allocating with `zeros` is a very reasonable thing to do in many cases. I hope that a `fill(0, X)` shorthand gets added to fill this void.
Much thanks for the thoughtful post, Chris! :) As an interim shorthand replacement, does `zero(X)` do the trick?
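For Base arrays at least, `zero(X)` does behave like the requested shorthand: it allocates a fresh zero-filled array with the same shape, eltype, and container type, leaving `X` untouched.

```julia
X = [1.0 2.0; 3.0 4.0]
Z = zero(X)               # fresh zero-filled array shaped like X

Z == fill(0.0, 2, 2)      # true — all zeros
typeof(Z) == typeof(X)    # true — container type preserved
X == [1.0 2.0; 3.0 4.0]   # true — X itself is not modified
```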
Note that such use cases are precisely where the ambiguity in `zeros` and `ones` can be problematic: does `zeros(X)` yield an object with `eltype(X)` and filled with `eltype(X)`'s additive identity (i.e. `fill!(similar(X), 0)`), or instead an object filled with a multiplicative zero for `eltype(X)` (possibly not of `eltype(X)`)? (For expansion, see #24595.)
The `fill(0, X)` concept sees a bit of discussion in #11557, and I agree that it might be a useful generalization of `fill`. Thanks and best!
The other problem is that arrays with unconventional indices might want to be created with something like `zeros(inds...)` (because the index type determines the array type). But for a 1-d case, is `X` "the array you want to be similar to" or "the indices for the desired array"? (After all, `AbstractUnitRange <: AbstractArray`.) Concretely, does `zeros(3:5)` mean `fill!(similar(3:5), 0)` or `fill!(OffsetArray(Vector{Float64}(3), 3:5), 0)`?
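The first interpretation is expressible in plain Base (the second would need a package such as OffsetArrays.jl to produce an array indexed `3:5`, so only the Base-expressible half is sketched here):

```julia
# Interpretation 1: the range is "the array to be similar to".
# similar(3:5) is an uninitialized Vector{Int} of length 3 with
# ordinary 1-based indices, which we then zero out:
v = fill!(similar(3:5), 0)

v          # [0, 0, 0]
axes(v)    # (Base.OneTo(3),) — indices 1:3, not 3:5

# Interpretation 2 — the range as "the desired indices" — cannot be
# expressed with Base similar/fill! alone; the index information is lost.
```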
Linking https://github.com/JuliaLang/julia/pull/24656, which contains additional discussion of `{ones|zeros}(A::AbstractArray, ...)` convenience constructors. Best!
I am surprised it's considered odd to use `zeros` to make cache variables, according to #24656. I would think that, if `zeros` were reduced to close to zero overhead, almost any case where people are using `similar` should instead be `zeros`, since that tends to fix quite a few bugs. I think we should encourage more people to do this, since `similar` can be quite unsafe, and not having a function -- instead pulling together `fill!` + `similar` -- makes it less obvious that this is what people should be doing. Here's a comment about this popping up in Optim.jl:
https://github.com/JuliaNLSolvers/NLsolve.jl/issues/89#issuecomment-294585960
However, I do agree with @timholy that it is non-obvious how it should be interpreted. Let me point you to a really non-simple example in DiffEq.
https://github.com/JuliaDiffEq/MultiScaleArrays.jl
MultiScaleArrays.jl created abstract arrays which are recursive graph-like structures that can be used in the diffeq solvers (and I think it may be compatible with Optim.jl and NLsolve.jl now?). It's a nice convenience for biological models, among other things. When we throw one into the ODE solver there's a question: what should we make the cache arrays? In some cases it matters that the user gets back the array they wanted, since it will show up in their ODE function `f(t,u,du)` and they will want to treat it accordingly. However, in other cases it only shows up as something that is internally broadcast against. So there are two different types of cache variables.
To handle this, take a look at the cache for one of the algorithms:
https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/caches/low_order_rk_caches.jl#L224-L234
Here, `rate_prototype = similar(u, first(u)/t, indices(u))` is similar but with a potentially different eltype for units. But notice that there are two separate forms here: `similar(u)` vs `similar(u, indices(u))`. I have interpreted those to mean "match the type and the shape" vs "match the shape and the eltype, but it doesn't need to be the same type". So for an `AbstractMultiScaleArray`, the first will create another `AbstractMultiScaleArray` while the other, for speed since it isn't seen by the user, will just create an `Array` of the appropriate size. These are then extended to `similar(u, T)` and `similar(u, T, indices(u))`.
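The two-forms-of-`similar` convention described above can be sketched with a minimal hypothetical wrapper type (`MyVec` is made up for illustration; it relies on the fact that Base's fallback `similar(A, dims)` returns a plain `Array` unless overridden):

```julia
# Minimal hypothetical wrapper to illustrate the two `similar` forms.
struct MyVec{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(v::MyVec) = size(v.data)
Base.getindex(v::MyVec, i::Int) = v.data[i]

# Override only the zero-extra-argument form to preserve the wrapper;
# the dims-accepting forms fall back to Base and return a plain Array.
Base.similar(v::MyVec) = MyVec(similar(v.data))

u = MyVec([1.0, 2.0])
typeof(similar(u))           # MyVec{Float64} — "match type and shape"
typeof(similar(u, axes(u)))  # Vector{Float64} — Base fallback, plain Array
```

This mirrors the user-facing cache (same container type) vs internal cache (plain `Array` for speed) split in the comment above.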
Maybe that's just punning off of what already exists, but I think this is an important distinction. When doing generic programming, you have two separate caches: user-facing caches that you want to match types to match their expectation, and internal caches which are just used by the algorithm and you want as much speed as possible.
Notice that these are using `similar` because I was too lazy to come up with a `zeros` version of it. I actually have a separate place that can zero out some of these arrays, because if the user only sets `du` in their `f(t,u,du)` derivative calculation, they tend to implicitly think "what I don't set means zero", which is only true when it was allocated with `zeros`, so I try to pre-allocate using `zeros` as much as possible (the same issue comes up in NLsolve.jl for this).
Hopefully that explanation wasn't too confusing to follow. In all of my cases I can just switch to `similar` followed by `fill!`, but I know some packages won't, and that will be a source of bugs.
Interesting. You're right that `similar(A, inds)` might create a different type than `similar(A)`, but in general I've always thought of it as likely creating the same type but with different indices. For example, if you needed a 1-dimensional cache for a columnwise operation on a 2-d object, I would use `similar(A, first(inds))`. (Of course it is a different type because the dimensionality is a type parameter, but it might be the same abstract container type.) You could also use it to create a 5x5 cache of a small tile, etc.
Overall this seems to be a challenging problem. It's a little late in the game, but should we introduce `same`? It could have the same arguments as `similar`, but the contract would be that it would be required to return the same abstract container.
I could support a one-argument form of `same`, but even this is tricky: note that `same(a)` couldn't return the same array type if `a` doesn't support `setindex!`, because `same` and `similar` are only useful if you are going to write to the array afterwards. We could make this an error for immutable `a`, but as an interface for `AbstractArray` this seems unnecessary (and maybe unhelpful) for making correct, generic code.
Similarly, we can't assume every `AbstractArray` can support different eltypes or indices. To me, having a two- or three-argument form of `same` would just introduce run-time errors in a bunch of places while giving people a false sense of security that their generic code will work well for any `AbstractArray` input, when it does not.
But for a 1-d case, is X "the array you want to be similar to" or "the indices for the desired array"?
This is another reason I am in favour of `keys` returning a container with identical indices and values, and then making this a requirement of `similar` (unless you provide one (some) integer(s), in which case `Base.OneTo` (`CartesianRange`) is assumed).
This discussion is veering towards #18161, and perhaps should continue there :).
Most recent triage was leaning towards just keeping `ones`.
Does it hurt to keep them? I think it helps people who come from numpy feel at home in Julia.
Closing since we are keeping `ones` and `zeros`.