Julia: implement Ac_mul_B(Vector, Matrix) to avoid temporary

Created on 16 Aug 2016 · 35 comments · Source: JuliaLang/julia

Let mx be a 3x3 matrix, and vv be an nx3 matrix.

Up to now v=vv[row,:] was a row vector, i.e. a 1x3 matrix,
so the matrix multiplication v*mx worked, and provided a 1x3 row vector.

In Julia 0.5.0 this has changed:
rows such as v=vv[row,:] are now true 1-dimensional vectors.

This is a welcome change.
But as a side effect, it has become more obvious and painful
that v*mx does not work for true 1-dimensional vectors.

I know that transpose(mx)*v is a workaround,
but I think this code rewrite is an unnecessary requirement.

Considering the original definition of matrix multiplication,
summing inner indices should naturally work in this case.
A good example is Fortran's matmul(v,mx): it just works fine.
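To illustrate the situation concretely (a minimal sketch; the sizes are arbitrary):

```julia
using LinearAlgebra   # transpose for matrices (in 0.5 this lived in Base)

mx = rand(3, 3)   # 3x3 matrix
vv = rand(5, 3)   # 5x3 matrix
v  = vv[1, :]     # with 0.5 slicing, a true 1-dimensional Vector of length 3

# v * mx              # errors: no method matching *(::Vector, ::Matrix)
y = transpose(mx) * v # the workaround: contracts v with the first index of mx
```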

I would greatly appreciate your opinion,
and any pointers to your future plans.

Labels: help wanted, linear algebra


All 35 comments

What about either doing

v'mx

or

v=vv[row:row,:]

?

Yes, these are all working tricks,
but conceptually they are not clean.

I think Julia is still struggling with some Matlab-like behaviour.
Having true 1-dimensional vectors is very important,
and with 0.5.0 Julia made a big step in this direction.
It would be nice for it to reach a really consistent state of its own.

How does Fortran handle complex conjugation in matmul(Vector,Matrix)? Is x*A the same as A'x or conj(A'x)?

Personally I don't see v'mx as a trick; it seems like the sensible way to go about this. If you think of two vectors in some space, the proper way to multiply them would be to multiply one by the transpose of the other, as vectors from a purely mathematical standpoint don't necessarily have an "orientation" (i.e. row or column).

I think Julia is still struggling with some Matlab-like behaviour.

Actually, what we had prior to APL-style slicing was more similar to Matlab, which makes a distinction between column and row vectors.

@andreasnoack

As far as I see, there is no implicit complex conjugation involved.

This is what the Metcalf-book says:
matmul(matrix_a,matrix_b) performs matrix multiplication.
For numeric arguments three cases are possible:

i) matrix_a has shape (n,m) and matrix_b has shape (m,k).
The result has shape (n,k) and element (i,j) has the value
sum( matrix_a(i,:) * matrix_b(:,j) )

ii) matrix_a has shape (m) and matrix_b has shape (m,k).
The result has shape (k) and element (j) has the value
sum( matrix_a * matrix_b(:,j) )

iii) matrix_a has shape (n,m) and matrix_b has shape (m).
The result has shape (n) and element (i) has the value
sum( matrix_a(i,:) * matrix_b )
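Case (ii) is the operation under discussion. As a sketch, Fortran's semantics can be mirrored in Julia with a plain loop (fortran_matmul is a hypothetical name, not an actual function):

```julia
# contract the vector with the matrix's first index, no implicit conjugation,
# exactly as Fortran's matmul(vector, matrix) does
fortran_matmul(v, mx) =
    [sum(v[i] * mx[i, j] for i in 1:length(v)) for j in 1:size(mx, 2)]
```

For real inputs this agrees with transpose(mx) * v.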

Hm. Thanks. That was a surprising choice. In any case, I think this is very much #4774 territory.

Also note, that in Fortran transpose is defined only for an array of rank two.
In _Fortran-speak_ creating a one-column matrix from a 1-dimensional vector is not a transpose.

@ararslan
I agree that from the current options v'mx seems to be the best.
But is there a performance penalty?

Performance penalty compared to what?

Compared to storing predefined mxT=mx' and using mxT*v,
that also returns a nice 1-dimensional vector.

I wouldn't think so offhand (though I'm certainly no expert), but you could check with @time. :)

No performance penalty, no.

I have just tested.
There is a significant performance penalty.

n = Int(10e6)
vv = rand(n,3)
mx = rand(3,3)
mxT = mx'

f1(vv,mx) = (@time for i=1:n; y=vv[i,:]'*mx; end)
f2(vv,mxT) = (@time for i=1:n; y=mxT*vv[i,:]; end)

f1(vv,mx)
  4.843726 seconds (120.00 M allocations: 5.215 GB, 8.36% gc time)
  4.836846 seconds (120.00 M allocations: 5.215 GB, 8.33% gc time)

f2(vv,mxT)
  2.714492 seconds (80.00 M allocations: 3.278 GB, 9.10% gc time)
  2.709794 seconds (80.00 M allocations: 3.278 GB, 9.07% gc time)

Just checked and v'A actually calls the fallback in operators.jl so it allocates an unnecessary temporary. That could easily be avoided though.
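For illustration only, one way such a specialization could avoid the temporary (ac_mul_b is a hypothetical stand-in for an Ac_mul_B method, not the actual patch):

```julia
using LinearAlgebra   # transpose/adjoint for matrices (in 0.5 this lived in Base)

# compute v' * A in one pass, without materializing the transposed row vector;
# conj handles the complex case and is effectively a no-op for real inputs
ac_mul_b(v::AbstractVector, A::AbstractMatrix) =
    reshape(transpose(A) * conj(v), 1, size(A, 2))
```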

I have made a mistake.
Of course n should also be passed as a function argument.

f1(n,vv,mx) = (@time for i=1:n; y=vv[i,:]'*mx; end)
f2(n,vv,mxT) = (@time for i=1:n; y=mxT*vv[i,:]; end)

It brings a speedup, but the performance penalty becomes even worse:

f1(n,vv,mx)
  3.529541 seconds (100.00 M allocations: 4.768 GB, 11.86% gc time)
  3.434869 seconds (100.00 M allocations: 4.768 GB, 8.80% gc time)

f2(n,vv,mxT)
  1.490498 seconds (60.00 M allocations: 2.831 GB, 11.13% gc time)
  1.507337 seconds (60.00 M allocations: 2.831 GB, 10.67% gc time)

Dunno if it'll make a difference but you should try moving @time outside of the functions and using @inbounds on the loops.

@ararslan
I have done it. The results are basically the same.

@andreasnoack
Does it mean that you can cure the performance penalty while keeping the current syntax?

If by y=vv[i,:]'*mx you mean the current syntax, then the answer is yes.

Yes.
At present your suggestion seems to be the best,
and if there is no performance penalty involved, then I would happily use it.

Hm. Maybe I spoke too soon. It seems like OpenBLAS' GEMM is slower than GEMV

julia> A = randn(3,3);

julia> x = randn(3);

julia> xt = x';

julia> let A = A, x = x
       @time for i in 1:10^6; BLAS.gemv('T', 1.0, A, x); end
       end
  0.113836 seconds (2.00 M allocations: 122.070 MB, 8.78% gc time)

julia> let A = A, xt = xt
       @time for i in 1:10^6; BLAS.gemm('N', 'N', 1.0, xt, A); end
       end
  0.188009 seconds (2.00 M allocations: 137.329 MB, 7.56% gc time)

which I don't think is reasonable.

Even so, your test code is instructive for me.

I've updated the title. I think the remaining aspects of this discussion are fully covered by #4774.

@andreasnoack @StefanKarpinski
My original issue (and title) was about making v*mx work
when v is a real 1-dimensional vector and not a 1xn matrix,
and returning the result as a real 1-dimensional vector.

Is this your present target?

Or is it just making v'mx work efficiently (without creating a temporary)
but still returning the result as a 1xn matrix?

It seems that this simple code does what I am missing.
It is fast and it returns a real 1-dimensional vector.

import Base.*
*(v::Vector{Float64},mx::Matrix{Float64}) = BLAS.gemv('T',mx,v)

v = rand(3)
mx = rand(3,3)

v*mx
3-element Array{Float64,1}:
 0.786774
 0.821454
 0.444747

Update:
Unfortunately, this is not the general solution.
Integer vectors and matrices are not handled by BLAS.gemv.
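One way around that limitation could be a generic fallback alongside the BLAS fast path (a sketch; vecmat is a hypothetical name, and the two-method dispatch is only an illustration):

```julia
using LinearAlgebra   # BLAS bindings (in 0.5, BLAS lived in Base)

# hypothetical helper: BLAS fast path for Float64, generic loop for
# everything else (Integer, Rational, ...), at the cost of speed
vecmat(v::Vector{Float64}, mx::Matrix{Float64}) = BLAS.gemv('T', mx, v)
vecmat(v::AbstractVector, mx::AbstractMatrix) =
    [sum(v[i] * mx[i, j] for i in 1:length(v)) for j in 1:size(mx, 2)]
```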

Or is it just making v'mx work efficiently (without creating a temporary) but still returning the result as a 1xn matrix?

Yes. The idea was that this issue tracks the problem with the temporary. For your other suggestion, please refer to #4774.

I don't understand this issue. v*mx is not standard mathematical notation, right? Why not just write it the way you suggest implementing it: BLAS.gemv('T',mx,v) is obtained by writing mx.'*v, or just mx'*v if you're working over the reals.

I respectfully disagree.

v*mx is standard mathematical notation that contracts
the last (and only) index of v and the first index of mx.

Otherwise we should also leave mx*v undefined
in the case when v is a true 1-d vector.

I don't think there is one standard mathematical definition for *; it's highly context dependent. But you are right to suggest other definitions than the one currently adopted in Julia, which I agree is indeed a bit hard to capture in one "mathematical" operation.

Can I assume that your suggestion generalises to using it in a kind of multilinear algebra context as some default tensor contractions operator: contract the last index of the first object with the first index of the second, irrespective of the rank of the two tensors involved?

I did not dare to go as far as multilinear algebra.
I have just suggested to clearly separate the three types of products in linear algebra.
See #4774.

What are the three types of product in linear algebra? I still don't see where vector times matrix is some standard mathematical product in any given context.

Either one works in MATLAB style, where everything is a matrix and * always denotes matrix multiplication without discussion, or one works in some more abstract setting and indeed needs to define what * means. In abstract linear algebra, I know of inner and outer products, but most of the time * in code is neither of those two.

In an abstract setting, matrices correspond to linear mappings from vectors to vectors. So in mx*v I read * as function application. Composing such linear mappings gives rise to matrix-matrix multiplication. So in m1*m2 I read * as function composition ∘. Given that these two different operations are rather close, I think it is ok to use the same symbol for both.

There is also the concept of linear forms / covectors, which map vectors to numbers. These can be represented as one-dimensional objects and are isomorphic to vectors, namely via the inner product: to any linear form f there corresponds a vector w such that f(v) = dot(w,v) for any v. You can also compose such a linear form with a linear mapping (represented by a matrix), which I think is the operation you are referring to. But I don't see why this would be a standard use of * in mathematics. Such linear forms are isomorphic to vectors, but not identical, and thus Julia currently forces you to act with the transpose on a vector in order to get the corresponding covector (well, not exactly, but that is indeed the point of issue #4774).
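For instance (a minimal sketch of the covector view):

```julia
using LinearAlgebra   # dot and adjoint (in 0.5 these lived in Base)

w = [1.0, 2.0, 3.0]
v = [4.0, 5.0, 6.0]
f = w'                      # the linear form (covector) corresponding to w
@assert f * v == dot(w, v)  # applying the form is the inner product with w
```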

I am not defending that this choice is ideal or optimal, but I also don't see how v*mx would in any way be more standard. If anything, I would argue that you actually need the dot product for this operation, so that it should be written as dot(v,mx). Maybe that's actually a possibility which is open, since currently dot is not defined for a combination of a vector and a matrix argument.

My apologies, wrong button.

I find all that you wrote instructive.
Maybe it should be copied to #4774, keeping the present issue to track
just the problem with the temporary, as suggested by @andreasnoack.

Yes indeed. My apologies for leading the original discussion astray. I'll make a similar post there, and I agree that, for now, v'mx should work efficiently and produce a 1xn matrix.

