Problem Statement
The diag() function in Linear Algebra returns an error for 1- and 0-dimensional matrices.
Desired Behavior
In both cases, the input matrix can simply be returned
Error Example: 1 dimensional case
julia> using LinearAlgebra
julia> a=[5]
1-element Array{Int64,1}:
5
julia> diag(a)
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix
Stacktrace:
[1] diag(::Array{Int64,1}) at C:\Users\julia\AppData\Local\Julia-1.4.0\share\julia\stdlib\v1.4\LinearAlgebra\src\generic.jl:425
[2] top-level scope at none:0
Error Example: 0 dimensional case
julia> b=[]
0-element Array{Any,1}
julia> diag(b)
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix
Stacktrace:
[1] diag(::Array{Any,1}) at C:\Users\julia\AppData\Local\Julia-1.4.0\share\julia\stdlib\v1.4\LinearAlgebra\src\generic.jl:425
[2] top-level scope at none:0
What do other languages do?
| language | diag([1]) | diag([]) |
|--------------|-----------|----------------|
| Python Numpy | [[1]]†| error* |
| MATLAB | [1] | [] |
| SAS JMP JSL | [1] | error |
| Octave | ans=1 | ans = [](0x0) |
| R | 1 | <0 x 0 matrix> |
* the numpy error returned says that diagonal() requires at least a 2x2 matrix; however, it works for a 1-dimensional matrix.
†result corrected from comment below.
_System information for above examples:_
julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: Intel(R) Core(TM) i7-4960X CPU @ 3.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, ivybridge)
Note that both [5]
and []
are not matrices (2D arrays), but vectors (1D arrays). It works for actual matrices:
julia> a = fill(5, 1, 1) # 1x1 matrix
1×1 Array{Int64,2}:
5
julia> diag(a)
1-element Array{Int64,1}:
5
julia> a = fill(5, 0, 0) # 0x0 matrix
0×0 Array{Int64,2}
julia> diag(a)
0-element Array{Int64,1}
Similar behavior with numpy as well actually; [] turns into a 1D numpy array, so this should be [[1]]
and [[]]
respectively, and both behave just like matrices in julia:
>>> numpy.diag([[1]])
array([1])
>>> numpy.diag([[]])
array([], dtype=float64)
The table is also wrong for numpy.diag([1])
, it does not give you [1] but [[1]], since it actually does the equivalent of julias diagm
and constructs a matrix from a diagonal
>>> numpy.diag([1])
array([[1]])
Note that both
[5]
and[]
are not _matrices_ (2D arrays), but _vectors_ (1D arrays). It works for actual matrices:
We can contrast _Linear Algebra_ to _Julia's Implementation_ of linear algebra.
Is [5] a vector or an array? From a Linear Algebra perspective, it's both, or either.
The Julia Implementation can store [5] in two different data structures (vector or array). The storage data structure shouldn't break the Linear Algebra.
Furthermore, what does Julia tell me about vector [5]?
julia> a=[5]
1-element Array{Int64,1}:
5
It informs the user that it is a 1-element Array. It doesn't say Vector, it says Array.
I've corrected the table for the Numpy result based on the other comment.
Examples are still wrong, because you are not calling the same functionality; numpy.diag
has 2 completely opposite modes into one function; constructing a 2D thing from a 1D thing, or extract a 1D thing from a 2D thing. Julia named these diagm
and diag
respectively.
All languages and libraries I'm aware of, except matlab, is consistent on this; 2D input, 1D output, even when the size in each dimension is less than 2:
Julia:
julia> diag(ones((1,1)))
1-element Array{Float64,1}:
1.0
Numpy:
>>> diag(ones((1,1))) # or diag([[1]])
array([1.])
Mathematica:
In[4]:= Diagonal[ConstantArray[1, {1, 1}]] (* or Diagonal[{{1}}] *)
Out[4]= {1}
R
> diag(matrix(1, nrow=1))
[1] 1
The only difference is syntax for constructing 2D like things. Matlab/octave is the odd one out, as it decides to give special treatment and disallow true 1D things by padding on a second dimension. The ambiguity of the math is a bad thing.
Julia tells you that [5] is a (1) dimensional thing, and not a 1x1 dimensional thing.
Python Numpy:
>>> a=np.arange(1).reshape(1,1)
>>> a
array([[0]])
>>> a.shape
(1, 1)
>>> a.diagonal()
array([0])
>>> np.diag(a)
array([0])
>>> b=np.array([5])
>>> b
array([5])
>>> b.shape
(1,)
>>> b.diagonal()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: diag requires an array of at least two dimensions
>>> np.diag(b)
array([[5]])
Julia tells you that [5] is a (1) dimensional thing, and not a 1x1 dimensional thing.
Julia is telling me it is _storing it_ as a (1) dimensional thing, not a 1x1 dimensional thing.
[5] is a special case of a (1) dimensional thing and a 1x1 dimensional thing.
You're explained _why_ Julia does it the way it does it.
But . . . how _should_ it be done? You see how I think it should be done under "desired behaviors".
As documented, diag
returns "the k
th diagonal of a matrix, as a vector." That's fairly specific and justifies the current refusal to do anything for [5]
or []
, which are both vectors. We do as a convention, however, allow many operations that work on matrices to work on vectors by interpreting them as column matrices. Under that interpretation, it would be possible to define diag([1, 2, 3])
to return [1]
since that's the diagonal of the matrix with a single column, [1, 2, 3]
. That would prevent us from giving the helpful warning we currently give:
julia> diag([1, 2, 3])
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix
This is helpful since people coming from other systems may be expecting diag
to construct a matrix with [1, 2, 3]
as its diagonal. The question is whether allowing vectors to act like column matrices with respect to the diag
function is worth not giving users a clear warning that they're using the wrong function here. Note that if we added this diag
would apply to any vector, not just the cases you've picked where diag
and diagm
happen to have similar results.
The question is whether allowing vectors to act like column matrices with respect to the diag function is worth not giving users a clear warning that they're using the wrong function here.
Eh, they're not using the wrong function, they're using the wrong type. The user truly wanted the diagonal of [5].
It's cumbersome in the REPL to create a 1 dimensional matrix.
julia> a=[5] # creates vector
1-element Array{Int64,1}:
5
julia> a=ones(1)*5 # creates vector
1-element Array{Float64,1}:
5.0
julia> a=fill(5,1) # creates vector
1-element Array{Int64,1}:
5
julia> a=Array{Int64}(undef,(1,1)) # creates array
1×1 Array{Int64,2}:
934448784
julia> a[1,1]=5 # or a[1]=5
5
julia> a=[5] # easy to do an oopsy and turn it into a vector
1-element Array{Int64,1}:
5
julia> a=ones(1,1)*5 # creates array
1×1 Array{Float64,2}:
5.0
julia> a=fill(5,(1,1)) # creates array
1×1 Array{Int64,2}:
5
Just considering the user interface, it would be friendlier to a statistician typing in the REPL if diag([n]) returned [n] and diag([]) returned [].
Note that if we added this diag would apply to any vector, not just the cases you've picked where diag and diagm happen to have similar results.
I don't see why this is the case. It would just apply to 1-dimensional and 0-dimensional vectors.
_However,_ if you don't want to modify the behavior of the function, then I would request that the error message change.
julia> a=[5]
1-element Array{Int64,1}:
5
julia> diag(a)
ERROR: ArgumentError: use diagm instead of diag to construct a diagonal matrix
The current error message is assuming the user is trying to construct a diagonal matrix.
If the user was just trying to get the diagonal of a 1-dimensional matrix, this error is confusing.
The user error is feeding diag() a vector and not an array. The error message should say that.
Perhaps something like:
ERROR: ArgumentError: diag input must be of an Array type. Use diagm instead if you were trying to construct a diagonal matrix
That would be a helpful error message.
I don't see why this is the case. It would just apply to 1-dimensional and 0-dimensional vectors.
We don't do that sort of thing. Making a function that works when a vector has certain sizes but not others is how you end up with a language that's impossible to write reliable code in.
_However,_ if you don't want to modify the behavior of the function, then I would request that the error message change.
Yes, that would be a reasonable change as well since some users may have wanted to take the diagonal of a vector as though it were a column matrix.
To be clear, it's also reasonable to make diag(v)
work and return v[1:end]
for vectors, but what's not ok is making it _only_ work for vectors of length zero or one. If we were to do that then we would not be able to warn users expecting diag(v)
to do what diagm(v)
does as we do now.
Most helpful comment
Examples are still wrong, because you are not calling the same functionality;
numpy.diag
has 2 completely opposite modes into one function; constructing a 2D thing from a 1D thing, or extract a 1D thing from a 2D thing. Julia named thesediagm
anddiag
respectively.All languages and libraries I'm aware of, except matlab, is consistent on this; 2D input, 1D output, even when the size in each dimension is less than 2:
Julia:
Numpy:
Mathematica:
R
The only difference is syntax for constructing 2D like things. Matlab/octave is the odd one out, as it decides to give special treatment and disallow true 1D things by padding on a second dimension. The ambiguity of the math is a bad thing.
Julia tells you that [5] is a (1) dimensional thing, and not a 1x1 dimensional thing.
Related:
https://www.youtube.com/watch?v=C2RO34b_oPM