Julia: Reconsider uses of promote_op

Created on 21 Dec 2016 · 19 comments · Source: JuliaLang/julia

At the moment, the element type of broadcast's result is determined in a straightforward way, by typejoining the types of all of the result's elements. (Exception: in the empty case, inference is invoked.) However, for certain operations, the more elaborate mechanism of promote_op is used, which in particular aims at doing magical stuff if the input arrays have non-leaf element types. E.g.

julia> x = ones(Real, 1)
1-element Array{Real,1}:
 1

julia> x .+ x
1-element Array{Int64,1}:
 2

julia> x + x
1-element Array{Real,1}:
 2

While the behavior of + here does look useful, I wonder whether it is worth the inconsistency and the effort. I tend to say no, but before starting any real work on the issue, I'd like some feedback.

Labels: types and dispatch

All 19 comments

As a quick test, I did

--- a/base/arraymath.jl
+++ b/base/arraymath.jl
@@ -68,7 +68,7 @@ promote_array_type{S<:Integer}(F, ::Type{S}, ::Type{Bool}, T::Type) = T

 for f in (:+, :-, :div, :mod, :&, :|, :xor)
     @eval ($f)(A::AbstractArray, B::AbstractArray) =
-        _elementwise($f, promote_eltype_op($f, A, B), A, B)
+        _elementwise($f, Broadcast._broadcast_type($f, A, B), A, B)
 end
 function _elementwise(op, ::Type{Any}, A::AbstractArray, B::AbstractArray)
     promote_shape(A, B) # check size compatibility

which at least gives

julia> x + x
1-element Array{Int64,1}:
 2

for x as above, and it _breaks no tests_. So in case we decide to keep the current behavior, we should add tests...

Poking around a bit, I found the code lines for the mixed scalar/array cases pointed out by @andreasnoack in https://github.com/JuliaLang/julia/issues/19615#issuecomment-268329094 more interesting. They are responsible, e.g., for

julia> 1.0 + Float32[1.0]
1-element Array{Float32,1}:
 2.0

julia> 1.0 + [1]
1-element Array{Float64,1}:
 2.0

Preserving the array element type is a no-go for the second example. Blindly using broadcast would give a Vector{Float64} in the first case. The question is whether this doubling of memory consumption would be too surprising for the casual user (also see #14252). Consistency with .+ still seems preferable to me, but here it is maybe not as clear-cut anymore.
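
To give the memory point a concrete shape, here is a small illustration (the sizes are simply what sizeof reports for the data buffers of the arrays involved):

A = rand(Float32, 10^6)
sizeof(A)          # 4000000 bytes
sizeof(1.0 .+ A)   # 8000000 bytes: broadcast promotes the eltype to Float64
sizeof(1.0f0 .+ A) # 4000000 bytes: a Float32 scalar keeps the Float32 eltype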

Regarding the preservation of the array element type, I believe we should consider the following for a scalar s and an array A

  1. Doing s ⨳ A creates a new array by applying the ⨳ operation elementwise. Each such operation yields a result whose type, if it is the same for all array elements, should be the eltype of the new array; otherwise, the eltype should be a type wide enough to hold all the values. For the empty case, use inference. This is the behavior of broadcast.

  2. If you need to preserve the type of your container, you now have many options (see the sketch after this list):
    a. Modify your array in place, e.g. A .+= 1.0 for A = Float32[1.0]
    b. In some cases you can use a scalar such that the return type is the same as the array type: Float32[1.0] + 1.0f0
    c. Use a comprehension: with T = eltype(A), either T[a ⨳ s for a in A] or [T(a ⨳ s) for a in A]
    d. Take advantage of dot fusion: T.(s .⨳ A)
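
A minimal sketch of options (a)-(d), using + in place of ⨳ and the dotted forms throughout so that it runs as-is, assuming a Float32 array and a Float64 scalar:

A = Float32[1.0, 2.0]
s = 1.0

A .+= s                        # (a) in-place update: the Float32 eltype is preserved
Float32[1.0, 2.0] .+ 1.0f0     # (b) choose a scalar whose type matches the array's eltype
T = eltype(A)
T[a + s for a in A]            # (c) comprehension with an explicit element type
T.(s .+ A)                     # (d) dot fusion with an explicit conversion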

This, together with the fact that the code would be more easily maintainable, as @martinholters has mentioned before, makes me think that we should follow the route proposed in this issue (point 1 above).

I don't think that Array+Number can meaningfully be anything but broadcast (or an error, see https://github.com/JuliaLang/julia/pull/5810). * is different but I'd guess that most operators are similar to + here. Consequently, the behavior should be exactly that of broadcast.

I have collected some of the low-hanging fruit in this branch. I probably won't spend time on this during the holiday season, so if anyone wants to pick up from here, feel free. (Note the remarks in the commit regarding Diagonal!)

Edit: I have deleted the branch, as most of the work there has made it into master, and the work on Diagonal wasn't really usable as it was.

I think that the performance issues with using .* etc in Diagonal should be fixed by #19639?

@stevengj, is that in response to the changes in the mentioned branch? Then no, the problem is that transposition cannot be fused into broadcast, leading to additional temporaries. (broadcast_At_B, anyone? :smile:)

Another voice of support for this thread's proposal and the overall direction in the branch linked above. (Re. the first commit on that branch / @stevengj's comment, ref. #19672.) Best!

This is quite convincing. I think we should just list the typical cases where preserving abstract element types could have been useful (if any), just to be sure we won't regret losing this feature too much.

@martinholters, ah, I see. Once we have some kind of Transpose type (#6837 or #19670), fusion of transposition won't be a problem.

> I don't think that Array+Number can meaningfully be anything but broadcast (or an error, see #5810). * is different but I'd guess that most operators are similar to + here. Consequently, the behavior should be exactly that of broadcast.

I would prefer that +, -, *, etc. take their linear algebra meanings only (i.e. scalar*vector and vector+vector, with the understanding that arrays of any dimension still live in a vector space (they are like tensors), plus the variety of matrix/vector multiplications we have).

So Array + Number should definitely be an error, IMO (with a nice error message telling people to use .+).

We already tried deprecating array + scalar and it was too inconvenient; nor is there a strong mathematical need to restrict the meaning of + to linear algebra. On the other hand, we are now deprecating most vectorized functions like sin(array), so there is greater consistency in deprecating array + scalar at this point (and potential performance advantages, though these are limited by the fact that we won't deprecate array + array or array * scalar).

The Diagonal multiplication is actually an interesting case to study. E.g.

(*)(A::AbstractMatrix, D::Diagonal) =
    scale!(similar(A, promote_op(*, eltype(A), eltype(D.diag))), A, D.diag)

Here, using similar(A, T) to preserve the container type of A (if possible) makes a lot of sense, but using promote_op to determine the element type T leads to the discussed inconsistencies with broadcast behavior. But OTOH, using broadcast to obtain an Array of the correct element type and then converting to the desired final return type (by combining the container type of A with this element type) sounds prohibitively wasteful.
Now _broadcast actually supports broadcasting into an AbstractArray (using similar to update the element type if needed), but the wrappers broadcast_t and broadcast_c that do a lot of setup work are restricted to Arrays.
What might be very useful here, and probably elsewhere, would be a function that combines the element type computation of broadcast with the possibility of passing a container, similar to the destination container of broadcast!, but using it only as the first argument to similar to obtain the output container. Does that sound reasonable? (And what should the function be called?)

That would allow

(*)(A::AbstractMatrix, D::Diagonal) = function_to_be_named(*, A, A, D.diag.')
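
For concreteness, a rough sketch of what such a function could look like (the name broadcast_like is invented here, and Base.Broadcast.combine_eltypes / combine_axes are internal helpers from the current broadcast code, used as stand-ins for whatever element-type and shape computation broadcast ends up with):

function broadcast_like(f, model::AbstractArray, As...)
    T     = Base.Broadcast.combine_eltypes(f, As)  # element type as broadcast would compute it
    shape = Base.Broadcast.combine_axes(As...)     # broadcast shape of the arguments
    dest  = similar(model, T, shape)               # container type taken from model; its eltype is ignored
    return broadcast!(f, dest, As...)
end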

Couldn't some of this logic go into broadcast_c and broadcast_t? It seems wrong to me that these explicitly use Array.

Certainly. Replacing similar(Array{T}, shape) with similar(newly_introduced_parameter, T, shape) would probably go a long way toward getting the desired functionality. I just wanted to see whether this was deemed useful/desirable before cooking up an actual design/implementation.

starting to sound a bit like #16740 again

A bit like it, but different in two aspects: The newly_introduced_parameter could be a (container) type or an instance; not sure which is better or whether both should actually be allowed. Further, the element type of the given container (type) would always be ignored, while it would be used in #16740 if set.

The OP doesn't use promote_op anymore, but the discussion seems to have strayed a bit from there. Is there anything left here, or has everything mentioned here eventually been covered by the implementation and changes surrounding broadcast?

I don't know whether to open another issue for this, so I am commenting here: the docstring of promote_op explicitly discourages its use:

> Due to its fragility, use of promote_op should be avoided. It is preferable to base
> the container eltype on the type of the actual elements. Only in the absence of any
> elements (for an empty result container), it may be unavoidable to call promote_op.

while the design pattern section encourages it:

https://docs.julialang.org/en/v1.4-dev/manual/methods/#Output-type-computation-1

I think this is confusing.
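
For reference, a minimal sketch of the pattern the docstring recommends (the helper name map_pairs is hypothetical): derive the element type from the actual results, and fall back to promote_op only for the empty case.

function map_pairs(f, A::AbstractArray, B::AbstractArray)
    if isempty(A)
        T = Base.promote_op(f, eltype(A), eltype(B))  # only the empty case needs promote_op
        return similar(A, T)
    end
    return f.(A, B)  # element type comes from the computed elements (broadcast widens as needed)
end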
