Julia: document that 2-arg mul! methods trashes all arguments

Created on 5 Jun 2016 · 88Comments · Source: JuliaLang/julia

Example methods are here for Triangular, or here for Diagonal where these follow the BLAS trmm convention of mutating whichever argument is not the type in question. So A_mul_B!(A, B) might mutate A, or it might mutate B, depending on their types. This is the case for several other combinations of input types as well, and I think it's bad for generic programming. Ref collapsed discussion right below https://github.com/JuliaLang/julia/pull/16615#issuecomment-222230001, and #16577 and #16562.

If we can't apply a consistent rule here, "always mutates the second argument" (or, if we apply it consistently everywhere, the first?), then we shouldn't define this method. Better to favor the more explicit A_mul_B!(C, A, B) method where the output is a separate argument. Some combinations of types can allow one of A_mul_B!(A, A, B) or A_mul_B!(B, A, B) to work, but writing code that does that will be inherently non-generic. The individual methods of A_mul_B! should be responsible for checking whether the inputs are === (and possibly more complicated forms of alias checks if we have more view types in common use) and only allowing the specific combinations that are expected to work.

This issue is still present for all the in-place multiplication methods even if #6837 renames all 7 (or 9 if you count possible combinations we don't define right now, ref #5332) variants to methods of mul!. I think this is an explicitly post-0.5 cleanup to make though since we don't have time to deal with it right now.

doc linear algebra stdlib

Source

tkelman

Most helpful comment

As I see it, there are 3 options that have been proposed so far:

the 3-argument version with a check if some of the arguments are common, e.g. mul!(A,B,A)
an Inplace type signifying which argument to overload, e.g. either
- mul!(B, Inplace(A)), or
- mul!(Inplace{2}(), B, A)
separate functions, e.g. mul!2(B, A)

I don't like option 1. for two reasons:

a. You would have to write all mul! methods with a bunch of branches

function mul!(A,B,C)
     if A === B
          # reuse B
     elseif A === C
          # reuse C
     else
          # don't reuse
     end
end

To avoid this, my guess that we would end up implementing the 3-arg mul! as one of the other options internally anyway.

b. It encourages potentially dangerous behaviour: at the moment we have the explicit rule that in the 3-arg mul!, the first argument shouldn't alias either of the other two. The new rule would be that the first argument shouldn't alias either of the other two, unless it is identical to one of the other objects (but not both). The subtlety of this exception seems like a giant footgun to me.

So I would prefer not to do option 1. As far as options 2 & 3 go, the difference is mostly semantics. Option 2 does have the slight advantage of extending more easily to ternary and higher-order operations, but I will concede it is a bit of abuse of dispatch.

My final observation is that this would probably be easier to do in concert with lazy transpose, as it touches a lot of the same code.

simonbyrne on 9 Nov 2017

❤2 👍2

All 88 comments

I know I've mentioned it before, but I would like to plug the Inplace{N} type I use in InplaceOps.jl.

simonbyrne on 5 Jun 2016

See also #16702

stevengj on 6 Jun 2016

Adding to this list the 2-argument scale!(A, b) vs scale!(b, A) from #14425 (and #17739). The ambiguity in which argument gets mutated strikes me as bad for the same reasons as in A_mul_B!, so we should probably require an explicit output argument if you need to be picky about left multiplication vs right-multiplication.

tkelman on 1 Aug 2016

I don't think scale! is problematic, because in general one of them is a scalar, unlike A_mul_B! which are both arrays.

Keno on 1 Aug 2016

There are a few methods where that's not the case:

scale!(A::AbstractArray{T<:Any,2}, b::AbstractArray{T<:Any,1}) at linalg/generic.jl:477
scale!(b::AbstractArray{T<:Any,1}, A::AbstractArray{T<:Any,2}) at linalg/generic.jl:478

If you write code where some inputs may need to have varying dimensionality of scalar, 1, or 2, there's an ambiguity here.

tkelman on 1 Aug 2016

Arguably those shouldn't exist. I had no idea what they were going to do until I looked at the code. I think that should be explicitly written as

A*Diagonal(b)

(or rather whatever mutating equivalent we have).

Keno on 2 Aug 2016

👍1

scale! is the mutating equivalent and, as far as I'm aware, nobody who has been using these methods has had problems. With a language that supports dispatch, I think it fine to have scale!(Vector, Matrix) and scale!(Matrix,Vector) instead of scale_matrix_with_vector_left!(Matrix,Vector) and scale_matrix_with_vector_right!(Matrix,Vector).

andreasnoack on 2 Aug 2016

What does

function foo(a, b)
    scale!(a, b)
    return a
end

do then? If we can't give a consistent definition for what the generic function verb scale! means in terms of its side effects, that's bad.

tkelman on 2 Aug 2016

I think we should use types in the definitions. That was the point I tried to make in the last comment.

andreasnoack on 2 Aug 2016

I don't have a particularly strong opinion on the topic, but it feels like a pun to me.

Keno on 2 Aug 2016

I think we should use types in the definitions

That defeats the purpose of generic programming. I think the non-commutative case would be better served by a scaleleft!(A, b) and scaleright!(A, b) where the mutated argument is the first input, as is the convention in most other ! functions.

edit: or mul!(A, A, b) for right-multiplication by a scalar and mul!(A, b, A) for left-multiplication by a scalar.

tkelman on 2 Aug 2016

I am also in favor of only having the 3-argument mul!, where the only difference for the special types (scaling with a vector, or multiplying with a triangular matrix) is that you can use the same output argument as the (other) input argument, whereas you can't for a general matrix. That might still be hard to explain, but a simple detection in the generic matrix multiplication code (which is already present) suffices to catch this.

Jutho on 2 Aug 2016

Unlikely to receive attention prior to 0.6. Best!

Sacha0 on 22 Dec 2016

Bump, if any linalg people are looking for things to do.

StefanKarpinski on 20 Jul 2017

Seems like the two-argument mul! should be documented to return the product and potentially trash either or both of its arguments and that it may return one of its arguments or a new object.

StefanKarpinski on 27 Jul 2017

👍1

A majority of triage'ers think this behavior is fine if it's clearly documented, warning people that you can't predict which argument will be mutated.

JeffBezanson on 27 Jul 2017

The three-argument mul! should be documented to only mutate the output, so if you want to mutate only the left argument of the product, then write mul!(A, A, B); if you want to mutate only the right argument of the product, write mul!(B, A, B); if you want to leave them both intact, write mul!(A, B, C). Without the two-argument form mul!(A, B) there's no way to express that you're ok with either argument being trashed as long as you get the product as a result – this is far less constrained than any of the above three calls since it could return either argument or neither and completely independently it could trash either, both or neither of the inputs depending on what's efficient. It should be clearly documented that this is expert functionality to be used with caution.

StefanKarpinski on 27 Jul 2017

In practice in current package code the two-argument versions are very rarely used, and when they are they are virtually always used for their side effects rather than assigning their output and using that. Given that usage, these are more harmful than helpful since their side effects aren't predictable in a generic way.

tkelman on 27 Jul 2017

A lot of the 3-argument versions will fail or give incorrect results if you reuse input and output arguments:

julia> A = eye(2)
2×2 Array{Float64,2}:
 1.0  0.0
 0.0  1.0

julia> B = eye(2)
2×2 Array{Float64,2}:
 1.0  0.0
 0.0  1.0

julia> Base.A_mul_B!(A,A,B)
ERROR: ArgumentError: output matrix must not be aliased with input matrix
Stacktrace:
 [1] gemm_wrapper!(::Array{Float64,2}, ::Char, ::Char, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/matmul.jl:349
 [2] A_mul_B!(::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,2}) at ./linalg/matmul.jl:148


julia> A = ones(BigInt, 4, 4)
4×4 Array{BigInt,2}:
 1  1  1  1
 1  1  1  1
 1  1  1  1
 1  1  1  1

julia> B = ones(BigInt,4,4)
4×4 Array{BigInt,2}:
 1  1  1  1
 1  1  1  1
 1  1  1  1
 1  1  1  1

julia> A_mul_B!(A,A,B)
4×4 Array{BigInt,2}:
 4  7  13  25
 4  7  13  25
 4  7  13  25
 4  7  13  25

simonbyrne on 28 Jul 2017

Looks like the BigInt method is skipping the aliasing checks that the Float64 method caught, that should be fixed. 2-arg methods wouldn't be able to work in any situations where 3-arg with the equivalent aliased input/output arguments doesn't.

tkelman on 28 Jul 2017

The three-argument methods being broken is a separate issue (which should be fixed, of course). I think the fact that 2-arg mul! is not sufficiently documented and therefore misused is perfectly valid, but not really an argument for deleting it.

StefanKarpinski on 28 Jul 2017

At some point, I tried implementing a subset of Lapack in pure Julia, and for those case where in place matrix multiplication and division are possible and useful (e.g. triangular matrices), I just still had a three argument version mul!(C,A,B), but where you could just choose C equal to A or B (depending on which of the two is e.g. triangular). That is, there was no two-argument mul!, but you could access it with mul!(B,A,B), if e.g. A is Diagonal or Triangular.

Jutho on 28 Jul 2017

I found this more intuitive, i.e. it is explicitly clear which argument is overwritten. Like the implementation of generic matrix multiplication has to check against aliasing, in that case the implementation would look like

function mul!(C::Matrix,A::Triangular,B::Matrix)
    C !== B && copy!(C,B)
    # ... in place multiplication of A onto C
end

Jutho on 28 Jul 2017

I agree that mutating either A or B in a function f!(A, B) is pretty confusing and the methods should probably have different names then .

KristofferC on 28 Jul 2017

👍1

The issue is that there are generic programming situations where you don't care about preserving either input and you just want the resulting product to be computed as efficiently as possible. In generic code, the caller may not know or care which argument is better to clobber – the method implementing this is the only one in the position to make that choice. Without the 2-arg mul! there's no way to express that since the only 3-arg options are:

mul!(similar(A), A, B) which can only overwrite similar(A) not A or `B
mul!(A, A, B) which can only overwrite A
mul!(B, A, B) which can only overwrite B

The pattern here would be C = mul!(A, B) which could overwrite A or B (or both or neither). Afterwards, C is expected to alias A or B. By using this, you've explicitly indicated that you no longer care about the contents of A and B and that you don't care which one C aliases. Note that it does not make sense to use 2-arg mul! unless you assign the result somewhere, which could be a useful lint.

StefanKarpinski on 28 Jul 2017

We deleted map!(f, A) without an explicit output array due to the ambiguity and this seems very similar. You'd need to use the return value of the two-argument version for it to do anything usefully predictable or generic. In practice the three-argument version is usually called for its side effects, with its return value ignored. Having correct usage of the two be so different doesn't seem like a good API design to me. We wouldn't be losing much, and we'd be gaining clarity and simplicity, if we didn't have it at all.

tkelman on 28 Jul 2017

We could call this mul_destroy_all_the_args!(A, B) but that seems like an unfortunate name.

StefanKarpinski on 28 Jul 2017

It would be fine to delete it for now and reintroduce it later with a sternly worded docstring. That would force people to change what they're doing now (which may be broken) but still allow it at some point in the future. It may turn out that no one really ever needs this.

StefanKarpinski on 28 Jul 2017

There's also always the dreaded double bang: mul!!(A, B).

StefanKarpinski on 28 Jul 2017

👍2

Dunno how serious you're being, but I don't hate that. It won't show up under the same docstring that way, and be possibly a bit clearer that it should be used differently.

tkelman on 28 Jul 2017

I'm pretty sure the main "main effect" of these methods is their "side effects" as Tony labels it. The return values of often not assigned when using the x_mul_x! methods (2 as well as 3 argument versions). You mainly want to use these methods for updating existing variables.

I use the two argument versions a lot for orthogonal multiplication with Givens rotations and sometimes for triangular multiplication and I don't think it causes any problems. Of course, I happen to have written the two argument multiplication methods so I might be biased, but I think the argument against these is a bit of a straw man. When using the two argument version, it is pretty clear from the context what is getting updated and what is not. So far the examples against the use have been pretty abstract.

I think that we'll have to accept that different methods of the same function do slightly different things. There is little point in having the type information if you'd have to name the function something like mul2argLowerTriangularStridedMatrixUpdatingSecondArgument!(A::LowerTriangular, B::StridedMatrix). I think it is better to have mul!(A::LowerTriangular,B::StridedMatrix) and specific docstring for that method explaining the behavior.

Notice also that settling on three arguments won't even solve the problem generally because general matrix multiplication is actually supposed to be five arguments, C := α*A*B + β*C, and the current situation is actually incomplete causing users to call gemm! directly in the BLAS module when they should be using a more generic mul! function. The sparse x_mul_x! already has the five argument version.

andreasnoack on 28 Jul 2017

Semiserious. It's not terrible.

StefanKarpinski on 28 Jul 2017

When using the two argument version, it is pretty clear from the context what is getting updated and what is not.

No, it isn't clear from context when you write a generic function that should behave predictably when called with various different types of inputs. Which input gets mutated, when that's the reason you're calling the function, is more than "slightly different things."

BLAS and Julia have different APIs, scalar factors are a totally separate discussion unrelated to everything else here.

You can do "orthogonal multiplication with Givens rotations" with a three argument version just fine, then you wouldn't have to understand the context or input types of each mul! call to be able to read the code and see at a glance what it does. Generic functions should have a consistent contract for what they do.

tkelman on 28 Jul 2017

I agree with @tkelman . That's why I like the three argument version, where you explicitly choose C=A or C=B. Is there a good use case for C=mul!!(A,B) where you don't care which matrix is overwritten? If there is, I also like the double bang and it should also work in general, i.e. even if neither A or B can be overwritten.

Jutho on 28 Jul 2017

No, it isn't clear from context when you write a generic function that should behave predictably when called with various different types of inputs.

It is not clear what the basis for this assertion is. Are you really in doubt which of G and U is getting mutated in this line?

You can do "orthogonal multiplication with Givens rotations" with a three argument version just fine

Yes, but you'd never want to and if you suggest otherwise, please provide a real example.

andreasnoack on 28 Jul 2017

Are you really in doubt which of G and U is getting mutated in this line?

If G came from an untyped input to the function defined far away in the caller (which could be unknown user code) rather than on the immediately preceding line, yes that would absolutely be unclear which of G or U is being mutated.

but you'd never want to

What is the argument for why not, other than a desire to save 2-3 characters to write the ambiguous A_mul_Bc!(U,G) instead of the explicit A_mul_Bc!(U,U,G) ? Those 2-3 characters are worth the clarity and better generality - the code will do the same thing for more combinations of input types for U and G.

tkelman on 28 Jul 2017

There is a point in what @andreasnoack said that I think is worth repeating. The 2-argument methods are most useful when the non-mutating argument is a special type of matrix (Triangular, Diagonal, product of Givens' rotations or Householder transformations) for which it is possible to operate on the target in place. Those seeking to avoid unnecessary allocation and copying will gravitate to such methods. I know that for me it is important to perform multiplications and solve systems with triangular matrices in place.

I would be fine with abandoning the 2-argument form in favor of the 3-argument form if it was explicit that in the aliased case the operation will be performed in place if possible.

dmbates on 28 Jul 2017

@dmbates that would indeed be the point of implementing it like https://github.com/JuliaLang/julia/issues/16772#issuecomment-318654389

Jutho on 28 Jul 2017

@Jutho Agreed. I was just repeating a point that I thought was getting lost in the discussion.

dmbates on 28 Jul 2017

If G came from an untyped input to the function defined far away

This is not a likely use case for in-place multiplcation. I've provided a specific example and you haven't. We are trying to solve a non-existing problem by generalizing to something that won't happen.

What is the argument for why not

I'm saying that you don't want to do A_mul_Bc!(Uout, Uin, G). It is an unnecessary generalization. You'd always want to write A_mul_Bc!(U, U, G). If a user doesn't care to understand which operations can be done in-place then the in-place API is probably not the right one for that user.

the code will do the same thing for more combinations of input types for U and G.

You don't suddenly need to pass a general matrix to a function that uses two argument multiplication. It's a straw man. Context is important.

andreasnoack on 28 Jul 2017

Also, in the Givens case it would probably be slower, as you would need to either copy elements, or at least check if they need to be copied, as most of the elements remain unchanged

simonbyrne on 28 Jul 2017

@simonbyrne , I don't understand that argument: A_mul_Bcl!(U, U, G) would really do the same thing as A_mul_Bc!(U, G) (except for a trivial check that the first argument is indeed the same (===) as the second. There would be no copying or anything involved.

So it's a matter of typing two more characters, against the argument that there might be no use case for a more general A_mul_Bc!(Uout, Uin, G) (in which case the first step is indeed to copy Uin into Uout). I find it hard to exclude that there will ever be a use case for the latter.

Jutho on 28 Jul 2017

@jutho Ah, I missed your comment above. Good point.

My main objection to the mul!(A,A,B) pattern is that I'm not convinced that the alias check can be made 100% accurate for cases (such as generic matmul) where aliased arguments shouldn't be used, in which case we should generally discourage aliasing.

simonbyrne on 28 Jul 2017

e.g.

julia> A = ones(5,5);

julia> Av = view(A,1:5,1:5);

julia> B = ones(5,5);

julia> Base.A_mul_B!(Av,A,B)
5×5 SubArray{Float64,2,Array{Float64,2},Tuple{UnitRange{Int64},UnitRange{Int64}},false}:
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0

simonbyrne on 28 Jul 2017

But that's true for the generic matrix multiplication A_mul_B!, and while a rightful concern, I don't think it's very relevant for this issue.

Here, the question is, can you faithfully detect that the C and A argument are the same in order to do the equivalent of the current 2-argument form, and C===A should be sufficient for that.

Jutho on 28 Jul 2017

My point was that it seems odd to encourage aliasing in one context (to avoid the need for 2-arg functions) and discourage it in another (to avoid getting invalid results).

simonbyrne on 28 Jul 2017

👍2

what if we just made the 2-arg form new functions, i.e. mul!l/mul!r, or mul!1/mul!2?

simonbyrne on 28 Jul 2017

An observation from having recently interacted with several A_mul_B!(A, B) uses in base: Many (most?) calls to A_mul_B!(A, B) do not assume that either or both arguments may be mutated. Instead such calls expect that only one argument will be mutated, and mutating the other argument would cause trouble (e.g. an upstream A_mul_B(A, B) call would unexpectedly mutate A or B).

In other words, while saying A_mul_B!(A, B) can mutate either or both arguments is a reasonable contract, many (most?) uses of A_mul_B!(A, B) do not follow that contract, which strongly suggests something is amiss. Furthermore, given that which method an A_mul_B!(A, B) call hits often isn't obvious and may inadvertently change with introduction of new A_mul_B!(A, B) methods, the situation with any given call is somewhat precarious/brittle. Best!

Sacha0 on 22 Oct 2017

Are there any methods of A_mul_B!(A, B) that mutate both arguments?

JeffBezanson on 22 Oct 2017

Are there any methods of A_mul_B!(A, B) that mutate both arguments?

andreasnoack on 22 Oct 2017

Perhaps this thread distills to the following fundamental point: Existing three-plus argument mul! methods are out-of-place, whereas existing two-argument mul! methods are in-place; we have a reasonable API for the former, but lack an API for the latter (and so shoe-horn the latter into the former's API, precipitating this discussion). Best!

Sacha0 on 22 Oct 2017

Something along the lines of @simonbyrne's design in InplaceOps.jl could work beautifully as an interface for in-place multiplication. Specifically, to indicate in-place-ness and unambiguously identify the mutated argument(s), one could write e.g. mul!(InPlace(A), B) (or mul!(InPlace(1), A, B)). This approach also generalizes to other operations. Thoughts? Best!

Sacha0 on 26 Oct 2017

👍1

That's a pretty verbose way of (ab)using dispatch to try and use the same generic function name for what are semantically distinct operations. If you care at all about the values of the arguments after the call returns (which in most existing uses of these people do, since the return value is often not saved), then in-place right multiplication and in-place left multiplication aren't the same operation, they should use different names - same with left-multiply scale! and right-multiply scale!.

If you don't care about the values afterwards, then something like the mul!! suggestion seems best. For safety's sake you'd even maybe want to poison all the arguments as un-referenceable after the call, since you can't generically predict which has been mutated.

The title of this issue should be changed back to being about naming and semantics rather than documenting the existing input-type-dependent mutation behavior.

tkelman on 27 Oct 2017

Is it possible for mul!(A, A, B) and mul!(B, A, B) to work by seeing if the output is === to one of the arguments, and using an in-place algorithm if so?

JeffBezanson on 27 Oct 2017

If you care at all about the values of the arguments after the call returns (which in most existing uses of these people do, since the return value is often not saved), then in-place right multiplication and in-place left multiplication aren't the same operation, they should use different names - same with left-multiply scale! and right-multiply scale!

Could you expand? The proposal in https://github.com/JuliaLang/julia/issues/16772#issuecomment-339811134 allows for both left- and right- in-place multiplication: mul!(InPlace(A), B) would compute A*B mutating and storing the result in A, and mul!(B, InPlace(A)) would compute B*A mutating and storing the result in A. Insofar as I can see, that interface unambiguously covers all two-argument scale! and mul! methods. Best!

Sacha0 on 27 Oct 2017

Is it possible for mul!(A, A, B) and mul!(B, A, B) to work by seeing if the output is === to one of the arguments, and using an in-place algorithm if so?

Perhaps, yes, though the concerns in e.g. https://github.com/JuliaLang/julia/issues/16772#issuecomment-318756167 and https://github.com/JuliaLang/julia/issues/16772#issuecomment-318761560 apply.

Sacha0 on 27 Oct 2017

Could you expand? The proposal in #16772 (comment) allows for both left- and right- in-place multiplication:

Would InPlace annotation wrappers work? Probably. But I don't think they would make for a beneficial or desirable API design in this case. It would be lifting into the type dispatch domain something that doesn't really belong there. Somewhat like using a Val argument unnecessarily (if there's no type stability reason), when separating the dispatch into distinct generic function names would be simpler in both use and implementation, less overhead, and more accurately separate concerns.

At the heart of this issue is what generic contract should an in-place multiplication function be designed to have across all input types. If you aren't programming to the same contract across types (with each type-specialized implementation providing an optimization of the same behavior), you should use separate function names, as it's not best-practices generic programming if the function does substantially divergent things for different input types.

If the contract should be "mutates one but not both of its inputs, and different input types are free to make the choice however they like" then we would have to accept that few if any of the existing uses of this are correctly programming to that contract since they typically don't save or use the return value. For this to be the desired contract that in-place multiplication should use, we'd need good reasons and benefits motivating the choice, beyond this being what the existing 2-argument A_mul_B! does.

Instead I believe there are two separate contracts that would be useful here:

Multiply the inputs, overwriting the first input with the result
Multiply the inputs, overwriting the second input with the result

These are distinct operations and I believe we should give them separately named, distinct generic functions, or deprecate both in favor of the more general 3 argument form that can subsume both as special cases. Different combinations of input types may only be able to implement one or the other operation, so usage of these in highly generic code would run the risk of method errors (or aliasing errors in the three argument case), rather than silently mutating a different input argument giving very different results than the author may have expected.

tkelman on 2 Nov 2017

As I see it, there are 3 options that have been proposed so far:

the 3-argument version with a check if some of the arguments are common, e.g. mul!(A,B,A)
an Inplace type signifying which argument to overload, e.g. either
- mul!(B, Inplace(A)), or
- mul!(Inplace{2}(), B, A)
separate functions, e.g. mul!2(B, A)

I don't like option 1. for two reasons:

a. You would have to write all mul! methods with a bunch of branches

function mul!(A,B,C)
     if A === B
          # reuse B
     elseif A === C
          # reuse C
     else
          # don't reuse
     end
end

To avoid this, my guess that we would end up implementing the 3-arg mul! as one of the other options internally anyway.

My final observation is that this would probably be easier to do in concert with lazy transpose, as it touches a lot of the same code.

simonbyrne on 9 Nov 2017

❤2 👍2

Thanks for the great writeup @simonbyrne! I share your misgivings about the first option. And between the second and third options, I also slightly favor the second for its scalability. Perhaps the second option is the best path forward for now?

Sacha0 on 21 Nov 2017

The Inplace type seems really ugly to me and I think we'd regret it. We should have a function that checks whether two arrays alias. That certainly seems generally useful.

JeffBezanson on 21 Nov 2017

👍1

I don't share the opinion about option one.
Part a. of the argument:
For general matrices, C should not alias either A or B and this is already checked now in gemm_wrapper:

    if C === A || B === C
        throw(ArgumentError("output matrix must not be aliased with input matrix"))
    end

Only if A has special structure, can B be reused, and if it is not, the 3-argument implementation would/could/should probably start as

C === B || copy!(C,B)
# perform multiplication with `A` in-place on B

Similarly, C===A is only possible if A has special structure.

So these different branches would not really be branches as they would end up in different method definitions anyway. The only place where branches would occur is in something like UpperTriangular times UpperTriangular. But what is the alternative to a simple branch then? Having two or three different method definitions depending on which argument has been dressed with InPlace ?

So that leaves part b. of the argument. Users should know which argument they can alias. People using in-place methods are supposed to know a little bit what they are doing, and people matrices with special structure do probably know a little bit about what multiplications can be performed in place.

Jutho on 21 Nov 2017

We could have similar infrastructure to @inbounds and @boundscheck for aliasing, e.g. @noalias and @aliascheck. That way functions can be generically written to handle aliasing correctly while allowing the caller to declare that they are sure there is no aliasing and avoid potentially expensive alias checks. Of course, if we can make alias checking cheap enough that's even better.

StefanKarpinski on 21 Nov 2017

For general matrices, C should not alias either A or B and this is already checked now in gemm_wrapper

There are other ways that matrices can alias which aren't currently checked, e.g. if A is a view into part of B or C. I'm not even convinced that such checking is even possible in all cases.

So that leaves part b. of the argument. Users should know which argument they can alias. People using in-place methods are supposed to know a little bit what they are doing, and people matrices with special structure do probably know a little bit about what multiplications can be performed in place.

By that argument we should just keep what we currently have.

simonbyrne on 21 Nov 2017

If you try to do in-place multiplication with types that are unsupported or in the wrong order, it should predictably be a method (or aliasing) error, not sometimes mutate the opposite input from what you intended.

tkelman on 21 Nov 2017

👍1

I think we should go with option 3. It seems the easiest to implement (essentially it just requires renaming the appropriate functions), and is probably the easiest to do until we have a transpose immutable.

simonbyrne on 21 Nov 2017

There are other ways that matrices can alias which aren't currently checked, e.g. if A is a view into part of B or C. I'm not even convinced that such checking is even possible in all cases.

True but irrelevant to my argument as this is anyway a problem. I guess a conservative solution would be if any AbstractArray overloads a function that gives some range in memory that it occupies (or could possibly occupy) and these are called and checked not to overlap. Not sure if that really is such a big problem in practice.

By that argument we should just keep what we currently have

Unless you also want a specialized inplace multiplication of two UpperTriangular matrices, and want to be able to specify which of the two you would like to overwrite (which is not implemented currently). That problem also arises with the mul2! method suggestion.

Jutho on 21 Nov 2017

That problem also arises with the mul2! method suggestion.

You specify it via the function: mul!1 will overwrite the first one, mul!2 the second.

simonbyrne on 21 Nov 2017

Ah, I thought the 2 was just for 2-argument version :-).

At this point, it's probably personal preference. I think the fear for misuse is purely hyptothetical, especially if a more elaborate aliasing checking mechanism would be put in place. But even without, similar to how accidental aliassing does not seem to appear often or cause many problems currently.

Personally, I find mul!(B,A,B) as clear (if not more) as mul!2(A,B), without requiring new syntax conventions (specifiers after the exclamation mark), but I don't use this sufficiently to have any strong opinion on the final decision.

Jutho on 21 Nov 2017

I'm not even convinced that such checking is even possible in all cases.

Precise checking is probably not possible, but checking if the underlying "buffer" array is the same is possible and we can fall back to a slow, alias-safe implementation in those cases. Of course, we'd also want tooling to help detect such situations and avoid it when it's a performance problem.

StefanKarpinski on 22 Nov 2017

Nitpick: the ! should come last.

JeffBezanson on 22 Nov 2017

mul!(A, B) is documented as overwriting one of its arguments, plus this is not even exported, so closing.

JeffBezanson on 3 Jan 2018

This really shouldn't be a doc issue, "overwrites one argument but which one depends on input types" isn't a useful thing for this function to do.

tkelman on 3 Jan 2018

Ok, we can reopen it if you want but I don't think it belongs on the 1.0 milestone since the function is not exported.

JeffBezanson on 3 Jan 2018

Some aspects of this are still not totally clear to me. I can imagine two reasons for an API like the current mul!(A, B):

I know something about the types of A and B, and I want the function to use that information to figure out which argument I expect it to mutate, so I don't have to specify e.g. mul2!(A,B) or mul!(B,A,B).
I don't care at all which of A or B is mutated, the function can trash both for all I care, I just want it to do the fastest thing.

These are very different styles of use. The cases @andreasnoack is referring to seem to be (1). Are there any examples of (2)?

If we mostly care about (1), then I do think it would be better to write such calls as mul!(B,A,B). Existing 2-argument methods can be mechanically converted to 3-argument methods by checking that the output is === to the expected input and throwing an error if not.

JeffBezanson on 3 Jan 2018

👍1

When I last looked at the uses of 2-arg in place multiplication across packages over the summer, hardly any used and referred to the return value, which is the only sane way you might use (2) to do something worthwhile

tkelman on 3 Jan 2018

@JeffBezanson , mul!(B,A,B) + checking has been my proposal all along. Furthermore, this resolves the ambiguity when either of the two arguments could in fact be mutated (e.g. Diagonal times Diagonal). I think there was a case of such ambiguity that @Sacha0 bumped into at some point during his amazing work on the mul! transition.

Jutho on 3 Jan 2018

👍1

If we mostly care about (1), then I do think it would be better to write such calls as mul!(B,A,B)

I agree with @simonbyrne's criticism of this proposal in https://github.com/JuliaLang/julia/issues/16772#issuecomment-343034593. You don't win much in terms of generic programming when mul!(B,A,B) is only valid for some subtypes of AbstractMatrix.

I still don't get why it is so important to change the name or signature of methods that no users have reported issues with. The methods multiply matrices in-place so the name mul! seems pretty reasonable but if the consensus is that generic programming considerations prohibit the use of the most obvious name for this operation then I'd prefer renaming over the three-argument version.

andreasnoack on 3 Jan 2018

when mul!(B,A,B) is only valid for some subtypes of AbstractMatrix

I think that's exactly the point --- if you hit a case where the types don't work, you get a method error instead of silently mutating a different value.

JeffBezanson on 3 Jan 2018

I think that's exactly the point --- if you hit a case where the types don't work, you get a method error instead of silently mutating a different value.

Isn't it the other way around? At the moment you get a MethodError, if we change it to mul!(B,A,B) then you will either get silent mutation or an error if the method manually checks.

simonbyrne on 3 Jan 2018

Also, I should point out that the two separate mul1!/mul2! method deal with the Diagonal ambiguity case, as well as a bunch of other ones (I managed to get rid of quite a few definitions). We could replace scale! as well.

simonbyrne on 3 Jan 2018

Isn't it the other way around?

Not the way I'm thinking about it. Currently I can write mul!(A, B), expecting B to be mutated. If the types change, this can call a different method that mutates A instead. But if I write mul!(B,A,B) then it has to mutate B, and if the types of A and B make that impossible you get an error. The 3 argument form can't do what I'm calling "silent" mutation, because it says explicitly which value to mutate.

JeffBezanson on 3 Jan 2018

👍1

I've updated my previous PR at #25421.

I did have a quick stab at trying the 3-element version, but it was far from clear how to go about it.

simonbyrne on 5 Jan 2018

Thanks for the nice work.

For the alternative proposal of the 3-argument version, what is wrong with replacing
every instance of function f1(A,B)
by

function f(C,A,B)
C === A || copy!(C,A)
...

and similar for f2 type of methods.

Sure, this is a problem if C and A differ but alias, but that is the same now with all mutating methods, and the advanced user that wants to use these methods needs to know about this. If there is at some point a method to check for aliasing that's great, but I don't see this as a breaking point right now.

Jutho on 5 Jan 2018

It's not so straightforward: for some it should throw an error, for others that wouldn't actually work at all (e.g. some of the triangular types return a value of a different type that reuses the underlying array).

If we still wanted to do it, we could implement that on top of #25421.

simonbyrne on 5 Jan 2018

I trust your assessment as I am not too familiar with all the special array types. However, I do find that statement confusing, as above it was mentioned that almost all use-cases of the 2-argument version were not using the return type explicitly, but where just called as mul!(A,B) without assigning the return value.

Jutho on 6 Jan 2018

@simonbyrne,

(e.g. some of the triangular types return a value of a different type that reuses the underlying array)

sounded familiar, so I skimmed base/linalg/triangular.jl to refresh my memory but found no examples. I encountered only the following somewhat idiosyncratic (perhaps accidental?) definition.

https://github.com/JuliaLang/julia/blob/08620e5c1ec64d4961b78d6b038520cc31c0e2fe/base/linalg/triangular.jl#L449

Could you perhaps link a relevant definition or two for consideration? Thanks! :)

Sacha0 on 6 Jan 2018

Most are in triangular.jl, e.g. https://github.com/JuliaLang/julia/pull/25421/files#diff-f3a0525264ed2a88c32e4ec3f72c8fa4R163

simonbyrne on 11 Jan 2018

Ah, XUnitTriangular -> XTriangular, I see! Thank you! :)

Sacha0 on 12 Jan 2018

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Help regression for Rational in 0.6

omus · 3Comments

Getting Openssl error while building from source on ubuntu 14.04.3 cmake version 2.8.12.2

arshpreetsingh · 3Comments

isposdef() is incorrect

wilburtownsend · 3Comments

Dates.format regression on master: Width of milliseconds field cannot be fixed anymore

helgee · 3Comments

split() enhancement, keep_splitter

i-apellaniz · 3Comments