Julia: Deprecate `ones`?

Created on 2 Nov 2017  路  46Comments  路  Source: JuliaLang/julia

Now that we distinguish one and oneunit, ones(T, sz) seems like a misnomer. Deprecate in favor of fill(oneunit(T), sz)? If so, should we ditch zeros too?

decision linear algebra stdlib

Most helpful comment

To me fill(oneunit(T), sz) looks like a non-trivial loss in readability compared to ones(T, sz).

All 46 comments

xref https://github.com/JuliaLang/julia/issues/11557#issuecomment-339776065 and below, and also @Sacha0's PR #24389

I have work in progress to this effect that I hope to post in the next day or two :). Best!

To me fill(oneunit(T), sz) looks like a non-trivial loss in readability compared to ones(T, sz).

Note that you rarely need write something as verbose as fill(oneunit(T), sz), as usually a literal or something similarly compact suffices in place of oneunit(T). Shorter incantations may also become possible with future changes to array constructors. Additionally, once you've transitioned to fill, you grow fond of it I assure you :). Best!

We could just pick whether ones uses one or oneunit. ones and zeros should be considered convenience functions, which is ok as long as they have clear meanings in terms of more general functions.

Plan per triage: Revisit next week, specifically considering moving ones and zeros to the MATLAB compatibility layer, in favor of fill in base. Best!

ones and oneunits would distinguish the two options.

Not sure how I feel about moving this and zeros to MATLAB compatibility. Just because Mathworks invented it doesn鈥檛 mean it鈥檚 not a good convenience API that we can be proud of. Just briefly - what were the reasons behind this thought? (Sorry, I have to say that triage seems much more efficient, but significantly less transparent when the reasoning isn鈥檛 given).

It's not a change I'm thrilled about either, but I raised it because it's a logical inconsistency. I think it's fair to say that ones(T, sz) implies fill(one(T), sz) but really what it does under the hood is fill(oneunit(T), sz).

One option would be to rename it to oneunits. An alternative would be to do a horrific swap: one -> algebraic_one and oneunit->one. This is breaking and not possible to do via deprecation, which is why I say "horrific."

Yes, I would say adding oneunits and changing ones to give fill(one(T), ...) would be the obvious fix for that, no?

I'd be fine with that. Out of curiosity, what are the uses of fill(one(T), sz)? Do we even need ones at all?

Haha :) I was going to ask - what do you use fill(oneunits(T), sz) for? (E.g. What would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)

I think fill(one(T), sz) is used for much the same reason as zeros(T, sz), which is to initialize an array for a custom reduction-type-of operation. An array z = zeros(T, sz) is ready to have it's elements added to with z[i] += x whereas as o = fill(one(T), sz) is ready to have it's elements multiplied o[i] *= x. For example, I'm thinking of situations where the array elements might represent (relative) probability. In these two cases, I'm looking for the identity for the + or * operator respectively (and not additive generators).

See also #23544

Haha :) I was going to ask - what do you use fill(oneunits(T), sz) for? (E.g. What would an array filled with 1 metre, or 1 kilogram, or 1 second, be used for?)

The dependent variables for an ODE. It would be weird if it just randomly chopped of units when you asked for an array of type T?

The fundamental reason to prefer fill is that a tight set of powerful, orthogonal, composable tools serves better than a larger collection of ad hoc, limited, and overlapping tools such as ones/zeros/trues/falses. The discussion above highlights this point: While fill accommodates all use cases above unambiguously and (in most real use) concisely, the ones/zeros/trues/falses approach requires a new function for each use case.

A couple relevant examples from rewrites in base:

complex.(ones(size(Ac, 1)), ones(size(Ac, 1)))

becomes

fill(1.0 + 1.0im, size(Ac, 1))

and

2ones(Int, 2, 2, 2)

becomes

fill(2, (2, 2, 2)) # or fill(2, 2, 2, 2) if you prefer

Please note that these fill incantations are simpler, more readable, more compact, and more efficient than their non-fill counterparts, and base/ and test/ are rife with examples like these. As with all change, transitioning to fill requires some initial mental adjustment. But post-adjustment you find you have more power and elegance at your fingertips :). Best!

@Sacha0: trues/falses are not directly replaceable by fill, but need to use fill! with a BitArray initializer. They are also not included in the ambiguity between one and oneunit. Therefore I don't think that they fit in this discussion at all.

As for ones, I'm generally opposed to deprecating it anyway, I don't see the benefit. The argument that some expression can be written more effectively with fill is not very compelling in my opinion, as all these examples use ones as intermediate steps to do something else; but what about when you actually want an array of ones? Then having to use fill is longer, less obvious and just more annoying. I like @JeffBezanson's proposal best. If parts of base or test can be rewritten more effectively using fill instead of ones there's nothing preventing that right now.

@carlobaldassi While that is true, a quick GitHub search reveals that almost all uses of ones should really be using fill to avoid intermediate allocations...

Using fill for these cases makes sense to me. Note that we will want a new fill(x, A) = fill!(x, copymutable(A)) method to replace the corresponding methods of ones etc.

@TotalVerb From a quick skim at the first 10 pages of that search, I wouldn't say that "almost all" is a fair assessment. A "significant fraction", at best. There are lots and lots of legitimate use cases for ones there, perhaps even the majority (and even if they were just say 20% I think my argument still stands).

(I also actually have reservations about the readability argument, I think it's questionable to claim that e.g. fill(2, 2, 2, 2) is more readable than 2 * ones(Int, 2, 2, 2), or that, say, fill(k * 1.0, 3) would be more readable than k * ones(3); I'm not at all convinced that it's merely a matter of habit. This is a minor point though.)

trues/falses are not directly replaceable by fill, but need to use fill! with a BitArray initializer. They are also not included in the ambiguity between one and oneunit. Therefore I don't think that they fit in this discussion at all.

Indeed, trues and falses are not directly replaceable by fill :). Rather, trues and falses are additional instances of the general problem described above / related to #11557 (and that the recent direction of array constructors will hopefully tackle). Other examples include the existence of e.g. bones,bzeros, brand, brandn, and beye in BandedMatrices.jl and equivalents with a d prefix in DistributedArrays.jl.

As for ones, I'm generally opposed to deprecating it anyway, I don't see the benefit. The argument that some expression can be written more effectively with fill is not very compelling in my opinion, as all these examples use ones as intermediate steps to do something else

Having just rewritten a few hundred uses of ones in base, I can affirm @TotalVerb's statement

a quick GitHub search reveals that almost all uses of ones should really be using fill to avoid intermediate allocations...

(Edit: Though I would say roughly half rather than almost all, and appropriate rewrites may be something other than fill.) Moreover, that rewrite experience has taught me...

but what about when you actually want an array of ones? Then having to use fill is longer, less obvious and just more annoying.

... that on the other hand, fill is frequently shorter and simpler in that case: The requested ones are frequently not Float64 (instead e.g. ones(Int, n...) and ones(Complex{Float64}, n...)), in which case fill is shorter and simpler by admitting a literal (e.g. fill(1, n...) and fill(1.0 + 0im, n...)). In measurable terms, the branch in which I have been rewriting ones calls in base is ~5% shorter by character count from ones -> fill rewrites. Best!

To get an objective sense of how ones appears in the wild, I collected all ones calls appearing in the first ten pages of a GitHub search for ones in Julia code, rewrote each such call as appropriate and classified the corresponding change (see this gist), and then reduced the classification data to the following summary:

The analysis included 156 ones calls. Of those calls,

  • 84 calls (~54%) were ad hoc fills. (For example, ones(100)/sqrt(100)*7 simplifies to fill(7/sqrt(100), 100) or, better yet, fill(.7, 100). My favorite was kron(0.997, ones(1, J*J*s) -> fill(0.997, 1, J*J*s).)

  • 3 calls (~2%) were ad hoc broadcasts. (For example, A - ones(n,n) simplifies to A .- 1..)

  • 5 calls (~3%) were ad hoc vector literals. (For example, ones(1) simplifies to [1.].)

  • 1 call (~0.5%) was semantically a junk matrix construction. (Though relatively uncommon in the wild, this pattern is quite common in test/ because we do not have a concise convenience constructor for uninitialized Arrays, as in e.g. @test_throws DimensionMismatch BLAS.trsv(...,Vector{elty}(n+1)) versus @test_throws DimensionMismatch BLAS.trsv(...,ones(elty,n+1)).)

The remaining calls were semantically reasonable as ones, though often ones sees use merely because it's short, rather than because ones specifically are necessary. Of those remaining calls,

  • 13 calls (~8%) were slightly shorter as fill. (For example, ones(Int, n, n) -> fill(1, n, n) or ones(Float64, n) -> fill(1., n).)

  • 50 calls (~32%) were slightly longer as fill. (For example, ones(n, n) -> fill(1., n, n).)

Overall, in the wild ~60% of ones calls are better written somehow else, ~8% are reasonably semantically ones and slightly shorter as fill, and ~32% are reasonably semantically ones and slightly longer as fill.

An additional observation:

I encountered only one instance of a ones call accepting an array argument, and it wasn't clear whether the enclosing snippet was real code. So the ones methods accepting an array argument see little if any use in the wild.

Really interesting discussion ... went from being on the against side to the for side ... Also as another language precedent, R uses rep and matrix in a manner that is equivalent to the fill (just corresponding to the 1d and 2d cases) and you get used to it very quickly -- even though I came from a zeros/ones world.

Wow, thank you @Sacha0 for putting in that effort!

The question naturally arises as "what about zeros"? I'm guessing there will be significantly more usage, and a few more categories of use (including things like "I simply don't trust uninitialised arrays" or "I don't know how to use the Array constructors").

For whatever reason (I guess it's the symmetry of one and zero), I'm somewhat attracted towards replacing either both ones and zeros with fill, or neither.

The thing with zeros is that you would seem to be in one of these situations:

  1. You need to overwrite most of the zeros 鈥撀爄n this case you're better off using a comprehension;
  2. You don't need to replace most of the zeros 鈥撀爄n this case you're better off using a sparse matrix;
  3. You actually need an all-zero matrix 鈥撀爄n this case you're better off using 0I.

There's really no use case where actually allocating a matrix of dense zeros is actually a good idea.

That鈥檚 perhaps true of linear algebra. It鈥檚 not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it鈥檚 not worth the performance impact to represent them compactly.

Fair enough 鈥撀爏ometimes you don't care about density and the simplicity is worth it.

Triage: resolved that we keep only the completely non-generic methods i.e. zeros(dims...) and ones(dims...) and maybe zeros(dims) and ones(dims) as well.

@StefanKarpinski for usage recommendations does that mean we would recommend zeros(3, 3) over fill(0.0, 3, 3) for normal code (when a dense array is wanted etc etc)? Some of the details of efficiency etc is out of my reach, I am just thinking of how I would teach best/idiomatic practices in julia moving forward.

This decision seems very surprising to me, it's not so common in base to specifically prevent genericity. What is the reasoning behind? is it that this function comes from matlab where it is not generic (and works only with floats)?

Triage: resolved that we keep only the completely non-generic methods

What is the reasoning behind?

Further to this, is this issue somehow related to the general problem of constructing arrays that seems to be being considered right now, and if so, how does this help? Or, are we trying to reduce the number of exported methods from Base? (Or something else entirely?)

Writeup forthcoming in https://github.com/JuliaLang/julia/issues/24595 :). Best!

24595's OP detailed this issue's broader context, and now a followup in #24595 specifically addresses convenience constructors ones, zeros, and fill in depth. Reading the former is valuable in appreciating the latter. Best!

Well, saving at least the Float64 case is better than nothing.
I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.

It should also be noted that zeros does not have the "problem" of allowing the 3 * ones(n) anti-pattern. In fact, I don't really see why ones and zeros should go together, except in the broad sense of being convenience constructors. There is no real 'symmetry' between those two.

A couple of additional comments on the statistical analysis, since it seems to be the basis of the following discussions and the writeup of #24595. First, ten pages are not really enough for fine-grained conclusions of what's going on in the wild, they can give a rough idea at best. Some files came straight from matlab, for example, as it's clear from their name/style. Secondly, as I suspected, even that analysis shows that roughly half of the uses of ones in there were "legitimate". Thirdly, looking at code like this doesn't tell anything about when writing 3 * ones(...) is really an anti-pattern that creates performance issues, or it's a piece of code that has no performance implications at all (and the writer may have decided that it's just more readable written that way -- which I strongly feel is nobody else's business to decide otherwise, in that case).

Relatedly to the last point, and I think more importantly, what you get to see from a github search will never take into account what's going on in the REPL of people doing exploratory/preliminary work in Julia. Which is exactly where convenience functions come in most handy, and taking them away for no discernible reason is correspondingly most annoying. My point being, having a consistent set of orthogonal primitives that allow to write generic and efficient code is a great goal, and the effort to get there is truly commendable; it's just that not all code is supposed to be beautiful, generic, composable library code. Just my two cents.

Regarding

I believe that the case of integer zeros is also quite relevant, which -- I assume -- is basically the point that @vtjnash was making here.

which refers to

It鈥檚 not uncommon to need a zero-initialized collection in my work on compilers and other data structures. Maybe they are often sparse, but it鈥檚 not worth the performance impact to represent them compactly.

Note that fill serves as well or better in that case: fill(0, shape...) versus zeros(Int, shape...).

Concerning your other points, #24595 may be worth reading :). Best!

I simply find that zeros(Int, 10, 10) is more readable/explicit than fill(0, 10, 10), and zeros(T, k) is better than fill(zero(T), k). Why can't we just have both? I don't buy the argument that zeros suffers the same ambiguity problem as ones.

Concerning your other points, #24595 may be worth reading

I had read it. (I even linked it.)

I had read it. (I even linked it.)

Having read #24595 in full and given it due consideration, you are then aware that #24595: (1) concerns a much broader issue of which convenience constructors are only a part; and (2) considers far more than just the statistical analysis posted above and the points you focus on here.

I appreciate that you feel strongly about ones and zeros; your sentiment has come through loud and clear :). As such, chances are our bandwidth would be better spent pushing other fronts forward than continuing this conversation in its present form. Best!

Is there a generic fill incoming along with the zeros change? zeros had a very legitimate use for generic programming since it's much safer than similar, so all of DiffEq, and then I know recently Optim and NLsolve moved from allocating with similar to zeros since knowing everything is allocated to have zeros in it stops a lot of bugs. However, now it seems there's going to be no method for:

zeros(X)

anymore, other than:

similar(X); fill!(X,0)

because the current fill only builds Arrays and thus doesn't type-match like similar or zeros does. I know that some people misused zeros to allocate when they shouldn't have, but allocating with zeros is a very reasonable thing to do in many cases. I hope that a fill(0,X) shorthand gets added to fill this void.

Much thanks for the thoughtful post Chris! :) As an interim shorthand replacement, does zero(X) do the trick?

Note that such use cases are precisely where the ambiguity in zeros and ones can be problematic: Does zeros(X) yield an object with eltype(X) and filled with eltype(X)'s additive identity (i.e. fill!(similar(X), 0)), or instead an object filled with a multiplicative zero for eltype(X) (possibly not of eltype(X))? (For expansion, see #24595.)

The fill(0, X) concept sees a bit of discussion in #11557, and I agree that might be a useful generalization of fill. Thanks and best!

The other problem is that arrays with unconventional indices might want to be created with something like zeros(inds...) (because the index type determines the array type). But for a 1-d case, is X "the array you want to be similar to" or "the indices for the desired array"? (After all, AbstractUnitRange <: AbstractArray.) Concretely, does zeros(3:5) mean fill!(similar(3:5), 0) or fill!(OffsetArray(Vector{Float64}(3), 3:5), 0)?

Linking https://github.com/JuliaLang/julia/pull/24656, which contains additional discussion of {ones|zeros }(A::AbstractArray, ...) convenience constructors. Best!

I am surprised it's considered odd to use zeros to make cache variables according to #24656. I would think that, if zeros was reduced to close to zero overhead, almost any case where people are using similar should instead be zeros since that tends to fix quite a few bugs. I think we should encourage more people to do this since similar can be quite unsafe, and not having a function and instead pulling together fill! + similar makes it less obvious that is what people should be doing. Here's a comment about this popping up in Optim.jl:

https://github.com/JuliaNLSolvers/NLsolve.jl/issues/89#issuecomment-294585960

However, I do agree with @timholy that it is non-obvious how it should be interpreted. Let me point you to a really non-simple example in DiffEq.

https://github.com/JuliaDiffEq/MultiScaleArrays.jl

MultiScaleArrays.jl created abstract arrays which are recursive graph-like structures which can be used in the diffeq solvers (and I think it may be compatible with Optim.jl and NLsolve.jl now?). It's a nice convenience for biological models among other things. When we throw it into the ODE solver there's a question: what should we make the cache arrays? In some cases it matters that the user gets back the array they wanted since it will show up in their ODE function f(t,u,du) and they will want to treat it accordingly. However, in other cases it only shows up as something that is internally broadcasted against. So there's two different types of cache variables.

To handle this, take a look at the cache for one of the algorithms:

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/blob/master/src/caches/low_order_rk_caches.jl#L224-L234

Here, rate_prototype = similar(u,first(u)/t,indices(u) is similar but with potentially different eltype for units. But notice that there's two separate types here: similar(u) vs similar(u,indices(u)). I have interpreted those to mean "match the type and the shape" vs "match the shape and the eltype, but doesn't need to be the same type". So for an AbstractMultiScaleArray, the first will create another AbstractMultiScaleArray while the other, for speed since it isn't seen by the user, will just create an Array of the appropriate size. This are then extended to similar(u,T) and similar(u,T,indices(u)).

Maybe that's just punning off of what already exists, but I think this is an important distinction. When doing generic programming, you have two separate caches: user-facing caches that you want to match types to match their expectation, and internal caches which are just used by the algorithm and you want as much speed as possible.

Notice that these are using similar because I was too lazy to come up with a zeros version of it. I actually have a separate place that can zero out some of these arrays because if the user only sets du in their f(t,u,du) derivative calculation, they tend to implicitly thing "what I don't set means zero", which only is true when it was allocated with zeros, so I try to pre-allocate using zeros as much as possible (the same issue comes up in NLsolve.jl for this).

Hopefully that explanation wasn't too confusing to follow. In all of my cases I can just switch to similar followed by fill!, but I know some packages won't and that will be a source of bugs.

Interesting. You're right that similar(A, inds) might create a different type than similar(A), but in general I've always thought of it as likely creating the same type but with different indices. For example, if you needed a 1-dimensional cache for a columnwise operation on a 2-d object, I would use similar(A, first(inds)). (Of course it is a different type because the dimensionality is a type parameter, but it might be the same abstract container type.) You could also use it to create a 5x5 cache of a small tile, etc.

Overall this seems to be a challenging problem. It's a little late in the game, but should we introduce same? It could have the same arguments as similar, but the contract would be that it would be required to return the same abstract container.

I could support a one-argument form of same, but even this is tricky - note that same(a) couldn't return the same array type if a doesn't support setindex! because same and similar are only useful if you are going to write to the array afterwards. We could make this an error for immutable a, but as a interface for AbstractArray this seems unnecessary (and maybe unhelpful) for making correct, generic code.

Similarly, we can't assume every AbstractArray can support different eltypes or indices. To me, having a two- or three-argument form of same would just introduce run-time errors in a bunch of places while giving people a false sense of security that their generic code will work well for any AbstractArray input, when it does not.

But for a 1-d case, is X "the array you want to be similar to" or "the indices for the desired array"?

This is another reason I am in favour of keys returning a container with identical indices and values, and then making this a requirement of similar (unless you provide one (some) integer(s) in which case Base.OneTo (CartesianRange) is assumed).

This discussion is veering towards #18161, and perhaps should continue there :).

Most recent triage was leaning towards just keeping ones.

Does it hurt to keep them? I think it helps people who come from numpy feel at home in Julia.

Closing since we are keeping ones and zeros.

Was this page helpful?
0 / 5 - 0 ratings