MWE:
using LinearAlgebra, Random
Random.seed!(1)
n = 100
A = rand(n, n)
B = rand(n, n)
C = zeros(n, n)
mul!(C, B, A, -1, 0)
D = -B * A
@show norm(D - C) # 2534.227986442133
This also happens for sparse arrays.
@tkf this occurs on Julia 1.4 master, for a variety of matrix types. Any idea why?
Thanks for the heads up. It turned out to be a stupid mistake (alpha and beta swapped). #33218 should fix it.
I only just hit another problem where alpha and beta were both floats and the output was incorrect- makes sense that it's not an issue with integers :-). In case it's useful to test,
using Random, SparseArrays, LinearAlgebra
Random.seed!(1)
n = 100;
A = Symmetric(rand(n, n))
B = sparse(rand(n, n))
C = zeros(n, n)
mul!(C, B, A, -1.0, 0.0)
D = -B * A
@show norm(D - C) # 2543.524362254463
The error is not triggered without the Symmetric
or without the sparse
.
This also happens for sparse arrays.
I forgot to address this part. @lkapelevich Can you give me an example? Is the snippet you just posted the same as what you mentioned in the OP?
https://github.com/JuliaLang/julia/issues/33214#issuecomment-530181496 dispatches to the same method I fixed in #33218
What I meant in the description was that I was also getting incorrect results when A
and B
were sparse, e.g.
Random.seed!(1)
n = 100
A = sparse(rand(n, n))
B = sparse(rand(n, n))
C = spzeros(n, n)
mul!(C, B, A, -1, 0)
D = -B * A
@show norm(D - C) # 2534.227986442133
(I didn't notice that alpha and beta need not be integers)
Thanks. Fortunately, the new snippet (https://github.com/JuliaLang/julia/issues/33214#issuecomment-530183714) is also handled by #33218, too (which is a bit surprising; I guess dispatch plumbing can be improved a lot...).
Why did this hit generic_matmul
at all? Seems like a huge performance trap if you just forget to use 1.0
and 0.0
instead of 1
and 0
?
julia> @btime mul!($C, $B, $A, -1, 0);
835.482 渭s (69 allocations: 3.56 KiB)
julia> @btime mul!($C, $B, $A, -1.0, 0.0);
52.149 渭s (0 allocations: 0 bytes)
Most helpful comment
Why did this hit
generic_matmul
at all? Seems like a huge performance trap if you just forget to use1.0
and0.0
instead of1
and0
?