If a user needs to define custom inner products for general Hilbert spaces, what is the current state in Base for this type of generalization? https://github.com/JuliaLang/julia/issues/16573 is a related, but less general issue. My concern is with new types that aren't arrays.
I'd like to propose renaming dot to inner, or perhaps directing users to define inner(x,y) as a general inner product between objects x and y, including the array case:
inner(x::AbstractVector, y::AbstractVector) = dot(x, y)
In case the change is reasonable, could it be part of Julia v1.0?
Could you explain a little bit more about your use case and why having it in Base instead of just defining it in your package is beneficial? A concrete example would be best. Do you expect several inner definitions across packages loaded simultaneously?
I think having a formal interface for these mathematical spaces will help users exploit the type system better. For example, I would expect clustering methods to work on any metric space. If I could define my type with an inner product, I would benefit from Clustering.jl out of the box (after the package is fixed accordingly). Many other distance-based or projection-based algorithms could also be generalized.
As a concrete example, I came across this limitation today trying to define a geometry for compositional data: https://github.com/juliohm/CoDa.jl I'd rather specialize on a well-known inner function defined in Base than define my own interface that no one else will be aware of.
Why not extend dot for your Hilbert space types? I'm pretty sure it's designed with being the generic inner product in mind.
The concept of dot product is more strict than the concept of inner product. Whereas the latter is defined for general spaces, a dot product is only defined when there is the notion of a coordinate system defined by a finite basis. The semantics of dot(x,y) is x'*y where x and y represent coordinates of the objects in a Cartesian world. Mathematical textbooks will rarely mention the term dot product as the authors are usually interested in treating the material in more general (not necessarily finite nor Euclidean) spaces.
To distinguish further, in a Hilbert space with inner product <x,y> (or inner(x,y)) objects can be infinite and the semantics x'*y doesn't apply. For example, in functional data analysis the objects are functions f and g and the inner product is usually obtained by numerical integration: inner(f,g) = quadrature(f*g). Calling this operation a dot product is misleading.
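As a rough illustration of the functional case, here is a minimal sketch that approximates the inner product of two real-valued functions with a composite trapezoidal rule. The name inner, the keyword arguments a, b and n, and the choice of quadrature rule are all assumptions made for this example; they are not part of Base or of any package mentioned in this thread.

```julia
# Hypothetical sketch: inner product of two real-valued functions on [a, b],
# approximated with the composite trapezoidal rule on n subintervals.
function inner(f, g; a=0.0, b=1.0, n=1000)
    h = (b - a) / n
    # endpoints carry weight 1/2, interior nodes carry weight 1
    s = (f(a) * g(a) + f(b) * g(b)) / 2
    for k in 1:n-1
        x = a + k * h
        s += f(x) * g(x)
    end
    return s * h
end

# sin and cos are orthogonal on [0, 2π], so this is close to zero
inner(sin, cos; a=0.0, b=2π)

# the squared norm of sin on [0, 2π] is close to π
inner(sin, sin; a=0.0, b=2π)
```

Complex-valued functions would additionally need a conjugate on one argument, which this sketch leaves out.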
Another example, as I pointed out in my CoDa.jl package, is compositional data. Composition objects lie in a simplex world for which the operation x'*y doesn't make any sense. However, there exists an isometric transformation (the log-ratio transformation) that one can use to map compositions into another geometry where one can then apply the dot product with coordinates. Working with coordinates is not necessary, but it is common practice in this field. The result can be back transformed to the original space where the objects exist.
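To make the compositional case concrete, the following sketch computes an inner product of two compositions through centered log-ratio (clr) coordinates. It assumes that the clr transform is the log-ratio transformation meant here; the function names clr and inner are hypothetical and are not the CoDa.jl API.

```julia
# Hypothetical sketch, not the CoDa.jl API.
# Centered log-ratio (clr) coordinates of a composition with positive parts.
function clr(x::AbstractVector{<:Real})
    logx = log.(x)
    return logx .- sum(logx) / length(logx)
end

# Inner product of two compositions, evaluated on their clr coordinates.
inner(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) = sum(clr(x) .* clr(y))

x = [0.2, 0.3, 0.5]
y = [0.6, 0.3, 0.1]

# Rescaling a composition leaves the result unchanged, which is not true of x'*y.
inner(x, y) ≈ inner(10 .* x, y)
```

The scale invariance in the last line is the property that the plain coordinate-wise product lacks on the simplex.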
I don't see benefit in maintaining the term dot in the language, but if one asks for backward compatibility, the generalization inner(x::AbstractVector, y::AbstractVector) = dot(x,y) works perfectly.
Can you elaborate on the objections for this change?
Can you elaborate on the objections for this change?
We generally require a fair amount of justification for adding new public functions to Base, that's the objection. This could be provided by an InnerProducts package. Why does it need to be built into the language itself? This was the first question that @andreasnoack asked above; it got a somewhat vague answer of "I think having a formal interface for these mathematical spaces will help users exploit the type system better". There's no reason that an interface defined in a package is any less formal than one in Base. What does having Base.inner offer that InnerProducts.inner doesn't? This is a genuine question that could have a convincing answer, but I don't know what that answer might be, which is why the question is being asked.
I don't see a good argument for defining a basic mathematical concept like inner products anywhere other than Base. A language whose main audience is scientific computing folks would benefit from correct terminology. Why is the concept of norm defined in Base.LinAlg while inner, which is in the same cohort, should be defined in a package? Besides this inconsistency, the language already has dot, which makes me wonder why it should have something so specific rather than a more general concept?
So you want all possible mathematical concepts in the base language? Not having something defined in Base doesn't force people to use the wrong terminology. The norm function is exported from LinAlg because it is defined and used in LinAlg. Similar for dot. Are you proposing that dot should be renamed to inner?
So you want all possible mathematical concepts in the base language?
I never said that.
Not having something defined in Base doesn't force people to use the wrong terminology.
I am sure that it doesn't. Promoting the wrong terminology is the issue. People coming from a less mathematical background will adopt the usage of dot because they see it in Base. Usage of the term "dot" product to represent the concept of inner product is incorrect. It is also harmful to the mathematical community, which struggles every now and then to fix the scars that wrong terminology has left. Students from my generation consistently have to refer to old books to get the terminology right; this shouldn't be the case.
Are you proposing that dot should be renamed to inner
That would already be a major improvement in my opinion. See all the examples I gave above on functional and compositional data. People in these communities would never use the term dot in their work. "dot" is more like a computer science term than anything else.
Renaming dot to inner is quite a different proposal than adding inner to Base in addition to dot. That's more of a "correct terminology" question, which you and other linalg folks will have to hash out, although I seem to recall we bikeshedded this once and concluded that dot was the correct name for what this function implements.
There was a little discussion of this in https://github.com/JuliaLang/julia/issues/22227 and https://github.com/JuliaLang/julia/pull/22220
Renaming dot to inner is quite a different proposal than adding inner to Base in addition to dot.
This is what I proposed in my first message in this thread:
I'd like to propose renaming dot to inner, or perhaps directing users to define inner(x,y) as a general inner product between objects x and y
I repeat: dot product is the incorrect term for the operation I am discussing here. Inner, outer, scalar product... these are mathematical objects. "dot product" is a computational object: it takes two sequences of numbers and computes x1*y1 + x2*y2 + ... + xn*yn, a useless operation in other mathematical spaces.
I had focused on the second option you proposed, which seems to have been adding Base.inner with a fallback to call Base.dot. Either option is possible, but both require some justification: to add a new operation, one needs a reason why it can't just be in a package (what the initial part of this discussion was about); to rename, it needs to be decided that dot is the wrong name and inner is the correct one (what the conversation seems to have turned to).
@juliohm It's probably worth (re)stating that there is an active effort currently trying to shrink Base and encourage the use of packages. In this case dot seems to be correct for all the types participating in linear algebra provided in standard Julia (i.e. Number and Array - so yes, there is a definite, known, finite basis in all cases - thus I don't think we've made a mistake in terminology, though there may be better choices). I'm not against this proposal - but wanted to point this out to clarify why you might be experiencing some "latent" resistance to change.
Also worth keeping in mind that a fair number of Julia newcomers may be familiar with a dot product but not an inner product (say, they did a bit of physics at university, but aren't math majors) so there are also some reasons to keep dot (not to mention that we have an infix operator that it corresponds with - we could just map it to inner I suppose but that's slightly less obvious). We also don't have an outer function, or a variety of other possible operations.
Thus, there is a burden to make a reasonable case for how putting this in Base (or LinAlg) is strictly better than putting this in a user package. The primary reason seems to be to provide an interface that can be shared and extended by others - is that a reasonable summary? The argument about letting generic code from Clustering.jl work with your inner product seems pretty compelling. Also, in the context that we seem to be splitting LinAlg into a stdlib package - I was thinking that if I were to author a package called LinearAlgebra I'd probably be happy to include an inner function for others to extend.
Thank you @andyferris for sharing your thoughts. I see the resistance very clearly, which is something that I am not very excited about. Nevertheless, I am curious about how this specific proposal leads to code increase? To me, it seems like a trivial change in code with major improvement in abstraction. The example with Clustering.jl is just one of many, think of any kernel-based method that can be made to work with arbitrary Julia types for which the notion of inner product exists. MultivariateStats.jl has plenty of them.
Regarding the comment about LinAlg split into a separate package, I agree that it seems like a good place to encapsulate mathematical products. I am assuming that this LinearAlgebra package of the future would be imported in a Julia session by default and so all users would have access to the concept of inner, outer, etc right away.
Yes, the standard libraries are all built together with the Julia system image and available by default. At least for the v1.x series no-one will need to type using LinAlg (I don't think it will be renamed LinearAlgebra, btw, I just made that up as a hypothetical competitor).
To clarify, it would be loaded with standard Julia so you don't have to install anything, but you would still have to write using LinAlg to get the names it exports.
This is where it gets odd, right, since we'll get the * methods and so-on without using LinAlg? (in other terms, LinAlg is a type pirate).
Yes, that's basically where we'll have to draw the line: Base must define as much linear algebra functionality as needed to make LinAlg not a pirate, so matmul is defined in Base because Array and * both are. Funky matrix types and non-base operations live there though.
Let me give you a concrete example and ask you how would you solve it with the current interface, maybe this can clarify things for me.
The goal is to perform factor analysis with compositional data. I have a type called a Composition and an inner product in the space of compositions. I collect many samples (e.g. rock samples) and put all of them into a big Vector{Composition} (e.g. composition = %water, %grain, %air). Now I want to call a factor analysis algorithm implemented in another package (e.g. MultivariateStats.jl) on this vector of data. How would you implement that generically without having an inner product imported by default?
What I understood from the last comments is that both MultivariateStats.jl and CoDa.jl would have to depend on LinAlg.jl. The dependency in MultivariateStats.jl is just to bring the name inner into scope. The dependency in CoDa.jl is to define a method for inner that can be called by MultivariateStats.jl. Is that what you are suggesting?
It seems Composition{D} is a D dimensional vector space under + and *.
I would be quite tempted to define the dual vector space.
So, you could define adjoint(::Composition) -> DualComposition and *(::DualComposition, ::Composition) -> scalar (currently inner). DualComposition wouldn't have to do much except hold a Composition inside.
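The dual-space suggestion could look roughly like the sketch below. The struct layouts, the field names and the clr helper are assumptions made for illustration only; the actual Composition type in CoDa.jl may be defined differently.

```julia
# Hypothetical types for illustration; not taken from CoDa.jl.
struct Composition
    parts::Vector{Float64}
end

# The dual object does nothing except hold a Composition.
struct DualComposition
    comp::Composition
end

# Assumed helper: centered log-ratio coordinates of a composition.
function clr(c::Composition)
    l = log.(c.parts)
    return l .- sum(l) / length(l)
end

# x' builds the dual object, and dual * composition returns a scalar.
Base.adjoint(c::Composition) = DualComposition(c)
Base.:*(d::DualComposition, c::Composition) = sum(clr(d.comp) .* clr(c))

x = Composition([0.2, 0.3, 0.5])
y = Composition([0.6, 0.3, 0.1])

x' * y   # a scalar, playing the role of inner(x, y)
```

With this design the x'*y notation is reused for the new space, at the cost of one extra wrapper type.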
The first sentence in https://en.wikipedia.org/wiki/Dot_product does seem to suggest that dot could be an operation on any two iterables. We could make it recursive and define it for Number, and define inner as the abstract linear algebra function, which happens to overlap for Number and AbstractArray.
Thank you @andyferris, I appreciate your thoughts on the dual space. I'd rather not rely on a new type for this task though. The final solution is unnecessarily complex.
What I am interested in understanding is why something like:
inner(x,y) = sum(x.*y)
norm(x) = sqrt(inner(x,x))
export inner, norm
is not welcome in Base? I am assuming this is all that is required to define the function names generically for users of the language to specialize on. Keep in mind I am asking these questions with the genuine interest of understanding the point of view of the core devs. I want to say this before the conversation goes in the wrong direction again.
From the perspective of someone interested in math in general, it feels unnatural to have these concepts not exported by default, and instead have them defined inside of LinAlg. I think of LinAlg as implementations of these high-level concepts for array types. Perhaps my entire work does not require linear algebra on arrays, but I could still benefit from the concept of inner product across packages (e.g. MultivariateStats.jl, Clustering.jl). Also, I may not want to have LinAlg as a dependency in my package because it is not.
If I can emphasize it further, there is the concept of inner product, which is independent of arrays. This concept is represented by the statement export inner in Base. There is the implementation of inner products for array-like objects representing coordinates inner(x,y) = sum(x.*y). This operation can be defined as a fallback method in Base like above, if necessary.
Another example of a use case is Krylov methods. If you have e.g. function spaces with inner products, then you could use Krylov methods to approximate a linear problem or eigenproblem in a small finite-dimensional subspace of that infinite dimensional function space.
I too have my own objects which form a vector/Hilbert space but are not part of <: AbstractArray. From the analogy that also arrays with rank N>1 form vector spaces and can be used as 'vectors' in Krylov methods, I've come to rely on using vecdot and vecnorm being the generalized notion of inner product and norm. So I've been developing a package with Krylov methods that uses functions as linear operators and where the 'vector's can be of any type, provided objects of that type support vecdot, vecnorm and a few other things (scale!, zero, ...). But maybe that is abusing what was meant by these concepts in Base, so it would be good to straighten out the correct interface here.
Right - vecdot could be renamed inner.
(Now I'm vaguely wondering if norm should actually be called matrixnorm for matrices with norm always matching inner. It seems that maybe there are two distinct things going on with norm which is causing some difficulties with generalising it)
In fact, for general vector-like objects, it's also useful to query the dimension of the vector space (e.g. to verify that Krylov dimension should not be larger than dimension of the full space in my example use case). The example of nested arrays shows that length is not the right concept here, i.e. there would need to be some recursive notion of length for those cases.
Now for the example of using nested arrays as a general vector, vecdot and vecnorm are in some cases not even the correct notion of inner product and norm, as discussed in #25093, i.e. they are not recursively calling vecdot and vecnorm. My interpretation of these functions as a generic inner product and norm function is what triggered #25093, but it seems that this might not be how these functions were intended (not sure what they were intended to do instead).
So I do agree that we need a consistent interface here to be used across packages, that would therefore belong in a central location (probably not in Base but certainly in a Standard Library, e.g. such that one has to do using VectorSpaces). As for naming, I see two options:
Option 1 (my interpretation so far): the prefix vec indicates the property of that object when interpreting it as a generic vector, hence
- vecdot and vecnorm for nested arrays are fixed (PR #25093)
- veclength definition is added
Option 2 (probably better): use more mathematically correct names
- inner
- dimension
- norm?
And finally, just pinging @stevengj as he will certainly have some useful comments; my apologies if this is inconvenient.
The name is the least interesting part of all of this. I have zero problems with using the function dot to refer to a general inner product for arbitrary Hilbert spaces. Not only is there no other reasonable meaning for e.g. "dot product of two functions", it's pretty common to see "dot product of functions" in informal usage, especially in pedagogical settings where one is trying to emphasize the analogy to finite-dimensional vector spaces.
@juliohm, inner(x,y) = sum(x.*y) is not even an inner product in general, so this would be a pretty terrible fallback to put into Base.
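One way to see the objection is with a complex vector, where the element-wise sum is not positive definite. This is a minimal sketch, assuming a Julia 1.x session in which the linear algebra standard library is loaded as LinearAlgebra; naive_inner is a hypothetical name for the proposed fallback.

```julia
using LinearAlgebra  # assumed Julia 1.x stdlib name

# The fallback under discussion, under a hypothetical name.
naive_inner(x, y) = sum(x .* y)

x = [1.0im, 0.0]

# Real part is -1: an inner product of a vector with itself must not be negative.
naive_inner(x, x)

# Real part is 1: dot conjugates its first argument.
dot(x, x)
```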
But dot is already not computing the correct inner product (in fact failing) for various objects in Base that behave as vectors, e.g. arrays with rank N>1 or nested arrays (nested vectors being the only case where it does work correctly). Furthermore, the generic name norm becomes ambiguous for matrices, because I agree with the current choice of having this return the induced norm, but occasionally the "vector norm" (Frobenius norm) is also required.
Hence, my least-impact proposal would be to let go of the semantics vecnorm(x) = norm(vec(x)) and rather interpret vecnorm(x) as "for x being an object of some generic type that behaves as a vector space, compute the corresponding vector norm of x" (and similar with vecdot). While this is a shift in interpretation (and hence documentation), the actual implementation/action for objects in Base would not be very different (PR #25093) and would produce the same result for most cases (rank N arrays of scalars or of vectors). A function veclength(x) that returns the corresponding vector space dimension of x would complete the interface.
Custom packages should then learn to implement these functions when they define new types which behave as vectors.
it's pretty common to see "dot product of functions" in informal usage, especially in pedagogical settings where one is trying to emphasize the analogy to finite-dimensional vector spaces
Please don't say the name is unimportant, because it is. I will repeat for the n-th time: inner product and dot product are not the same thing. Any serious material exposing work with abstract Hilbert spaces will never use "dot". If you prefer to trust Wikipedia rather than my words, here are the definitions copied and pasted:
In linear algebra, an inner product space is a vector space with an additional structure called an inner product. This additional structure associates each pair of vectors in the space with a scalar quantity known as the inner product of the vectors.
In mathematics, the dot product or scalar product is an algebraic operation that takes two equal-length sequences of numbers (usually coordinate vectors) and returns a single number.
This resistance to improve terminology and mathematical consistency in the language is demotivating. No matter how many facts I present to you, no matter the number of examples and use cases, there is no counter argument other than "I'm fine with dot".
@juliohm, terminology is a matter of convention, not correctness. I agree that in formal usage for Hilbert spaces, especially infinite-dimensional ones, the term "inner product" is used pretty much exclusively. But, as I said, if you google "dot product functions" you will find plenty of informal usages of that terminology too. If you say "take a dot product of two elements of this Hilbert space", every mathematician will know that you are referring to an inner product, even for infinite-dimensional spaces, so there's no real danger of confusion, because there is no other standard generalization of the term "dot product". That's why I don't find the spelling debate of "dot" vs. "inner" to be a central issue.
It is important to decide on the semantics one wants here, and the set of functions that types should implement if they define a new Hilbert space or Banach space. Currently, if you want to define a type representing a new Hilbert space, you should arguably define dot and norm (since we currently lack a fallback for the latter), and I guess adjoint if you want the mapping to a dual-space object.
As @Jutho says, this is all complicated by the array-of-arrays case, since there are multiple possible things one might want there. Since there aren't standardized names for all of the possible semantics, it's hard to find names/semantics that will satisfy everyone. See #25093 for the discussion of vecdot semantics. I don't have a good answer here, myself.
Some possibilities here:
1. Sum of x[i]' * y[i]. Currently, this is dot(x,y). Not an inner product for vectors of matrices (where it gives a matrix), and currently not defined at all for multidimensional arrays.
2. Sum of dot(x[i], y[i]), including for multidimensional arrays, and conj(x)*y for Number. Currently, this is vecdot(x,y).
3. inner(x,y) defined to always be a true inner product, and for arrays make it sum inner(x[i],y[i]), essentially the "recursive vecdot" that @Jutho wants. But then, for matrices A, this inner product is inconsistent with the induced norm norm(A) that is our current norm definition. To fix it, we would have to change norm(A) for matrices to default to the Frobenius norm, which would potentially be a far-reaching breaking change.
A question (partially discussed in #25093) is whether we need all three of these in Base, or if we can get away with two (and which two, and what do we call them). @Jutho's proposal, as I understand it, is essentially to eliminate option 2 in Base and then use vecdot and vecnorm for option 3. Then we have a true inner product, but the terminology is rather unique to Julia, and a bit odd for e.g. infinite-dimensional Hilbert spaces. That wouldn't be the end of the world, of course.
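The difference between the first and the third behavior shows up on a vector of matrices. The sketch below uses hypothetical helper names (option1, rinner) rather than the Base functions, since the behavior of dot and vecdot depends on the Julia version; it assumes a Julia 1.x session with the LinearAlgebra standard library.

```julia
using LinearAlgebra  # assumed Julia 1.x stdlib name

# Behavior 1 (hypothetical name): sum of x[i]' * y[i]; not necessarily a scalar.
option1(x, y) = sum(x[i]' * y[i] for i in eachindex(x, y))

# Behavior 3 (hypothetical name): a recursive, always-scalar inner product.
rinner(x::Number, y::Number) = conj(x) * y
rinner(x::AbstractArray, y::AbstractArray) = sum(rinner(a, b) for (a, b) in zip(x, y))

x = [[1 2; 3 4], [0 1; 1 0]]

option1(x, x)   # a 2×2 matrix, so not an inner product
rinner(x, x)    # a scalar: the sum of squares of all entries
```

For this input the scalar from rinner equals the trace of the matrix from option1, which is one way to see how the two notions are related.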
Another possibility (somewhat independent of what we do with vecdot) would be to go (back) to requiring dot to be a true inner product, i.e. eliminate behavior 1, and make dot(x::AbstractVector, y::AbstractVector) equal to sum dot(x[i],y[i]). Still don't define it for multidimensional arrays (to stay consistent with norm).
My current personal inclination would be to define dot to be a true inner product (which should be consistent with norm), changing it to sum of dot(x[i],y[i]) for vectors (i.e. changing the vector-of-matrices case), and continuing to not define it for multidimensional arrays. Then define vecdot to recursively call vecdot as @Jutho suggests, with a fallback vecdot(x,y) = dot(x,y). Finally, say that new "Hilbert-space" types should define dot and norm. This seems like the least-disruptive, most comprehensible change to me.
(A norm(x) = sqrt(real(dot(x,x))) fallback is also a possibility, although it is somewhat dangerous since it is vulnerable to spurious overflow. Note that we can't use sqrt(dot(x,x)) as the fallback for technical reasons: we want a Real result, not a Complex result.)
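Both caveats can be reproduced in a few lines. This is a sketch assuming a Julia 1.x session with the LinearAlgebra standard library; naive_norm is a hypothetical name for the fallback being discussed.

```julia
using LinearAlgebra  # assumed Julia 1.x stdlib name

# The fallback under discussion, under a hypothetical name.
naive_norm(x) = sqrt(real(dot(x, x)))

x = [1e200, 1e200]

naive_norm(x)   # Inf: squaring 1e200 overflows Float64
norm(x)         # finite: the library implementation avoids the overflow

# Without real(...), a complex input yields a Complex result.
z = [1.0 + 2.0im]
sqrt(dot(z, z)) isa Complex
```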
Thanks @stevengj for this informative reaction. Just one small comment:
with a fallback vecdot(x,y) = dot(x,y). Finally, say that new "Hilbert-space" types should define dot and norm.
There are two problems with that. First, a vecdot(x,y) = dot(x,y) fallback cannot exist, since vecdot already accepts Any arguments to deal with general iterators. The second problem is that, if dot and norm are exposed to be the true inner product and norm that any vector like user type should define, then even when writing a package with e.g. Krylov methods that should work with completely generic vector like types, it will still not work for the case where the user wants to use nested or multidimensional arrays as the vector like objects. Therefore, I would argue that vecdot and vecnorm are the general inner product and norm of vector like objects. This also fits nicely with the fact that for matrices, most people will indeed expect norm to be the induced matrix/operator norm.
As for an actual use case (to show that this is not some exceptional case): stochastic matrices have a largest (Perron-Frobenius) eigenvalue for which the corresponding eigenvector represents a fixed point probability distribution. In the quantum generalization thereof, the probability distribution generalizes to a positive definite matrix (the density matrix) and such a matrix is the fixed point (eigenvector corresponding to largest eigenvalue) of a completely positive map, i.e. the map rho -> sum(A[i] rho A[i]^\dagger for i = 1:N) where rho is the density matrix and A[i] is a matrix for every i (known as the Kraus operators representing the completely positive map). For large matrix dimensions, an Arnoldi method is ideally suited for finding the fixed point density matrix.
My current personal inclination would be to define dot to be a true inner product (which should be consistent with norm), changing it to sum of dot(x[i],y[i]) for vectors. Finally, say that new "Hilbert-space" types should define dot and norm.
That is a huge improvement already. Documenting dot to have inner semantics in Base will at least allow users to define their own spaces without importing unnecessary libraries. I am not happy about the naming, but at least the functionality would be available for those who need it.
Yes, I think it will be nice to have a documented interface to implement for "Hilbert-space" types.
Of course, thinking about this generic interface for vector spaces, if it includes norm as suggested above then that should be the Frobenius norm for matrices (and generalize for higher-dimensional arrays, since all arrays are elements of a vector space). In that case we'd need a separate "operator norm" function for matrices (matnorm or opnorm or something, or a keyword argument on norm...).
@andyferris, please note my last comment. norm and dot cannot become the general Hilbert space interface, as they don't even work on vector like objects in Julia such as higher-dimensional arrays and nested arrays. Hence vecdot and vecnorm are a 'better' (in the sense of least breaking) candidate for this.
Reviving this topic, which I consider quite relevant to the type of math I expect to do with the language in the near future. Is there a consensus on what will be done to improve the generality and semantics of inner products?
Here is the part of my personal maths ontology concerning product.
If it could help to brush up memory/ bring consensus
Bonus: no wikipedia refs
At this point, @Jutho's proposal in #25093 seems like the least disruptive change, even though the vec* terminology is a bit odd to me in this context.
I agree the vec* terminology is odd. That is why renaming the functions to have standard names would be beneficial to all users.
I likewise agree that the vec* terminology is odd.
I agree, as an alternative to vecdot we could introduce a new method inner, but I don't know of a good name to "replace" vecnorm. In fact, I don't find vecnorm that bad, vector norm is a well established and explicit term for the operation we want.
The basic issue here is with matrices and multidimensional arrays, for which the usual norm(A) does not correspond to an inner product, as well as with arrays of arrays as discussed above. Some disambiguation (e.g. vec* or fro*) is required in these cases to indicate which inner product is intended.
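A quick numerical check of this point uses the parallelogram law, which a norm satisfies exactly when it is induced by an inner product. The sketch assumes a Julia 1.x session, where opnorm computes the induced matrix norm that this thread calls norm(A) and norm(A) computes the Frobenius norm; parallelogram_gap is a hypothetical helper.

```julia
using LinearAlgebra  # assumed Julia 1.x stdlib name

# Zero (up to rounding) for every A and B when the norm comes from an inner product.
parallelogram_gap(nrm, A, B) = nrm(A + B)^2 + nrm(A - B)^2 - 2 * (nrm(A)^2 + nrm(B)^2)

A = [1.0 0.0; 0.0 0.0]
B = [0.0 0.0; 0.0 1.0]

parallelogram_gap(opnorm, A, B)   # clearly nonzero: the induced 2-norm fails the law
parallelogram_gap(norm, A, B)     # approximately zero: the Frobenius norm passes
```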
You could have an inner function that defaults to vecdot, but it is a little silly to have two names for the same function, and there is still the problem of what to call the norm.
I also find the vecdot name odd, in fact, I didn't even know it existed and had made my own function for it... called inner.
My understanding is that we can just deprecate the odd vecdot in favor of inner, and give it the inner product semantics for users to implement their own spaces.
Regarding the norm, that I don't know. I opened this issue to discuss inner products, maybe another issue would be appropriate to discuss the state of norms in Base.
I suppose we could have inner(x,y) and innernorm(x) = sqrt(inner(x,x)) (with optimized special cases to avoid overflow) instead of vecdot and vecnorm. innernorm is slightly unusual but is reasonably clear in context.
Thumbs up for this change. The names inner and innernorm are clear and consistent with the concepts. I wish they could make it to Julia v1.0.
inner and innernorm seem OK to me.
I'd still say that, in my opinion, our norm function doesn't really fit very nicely into Julia's generic function and dispatch system and what I'd call "clear interfaces" where dispatch shouldn't be making semantic choices, just implementation choices. I'd personally rather we could say "norm returns the norm of an element of a vector space", where matrices and linear operators are still elements of vector spaces (you can add them and multiply them by a scalar). We could also have e.g. "opnorm returns the operator norm of a linear operator" (or matnorm or whatever).
At the moment we have "norm returns the norm of an element of a vector space, unless the element is also a linear operator, in which case we'll give you the operator norm instead". I personally feel that dispatch should never be surprising.
I.e. I'd prefer one function that always does vector norm and another function that always does operator norm, and no function that tries to do both.
Like it even better @andyferris :+1: Specific norms that aren't the norms induced by the inner product in the space could have a more specific name. The name norm would mean exactly norm(x) = sqrt(inner(x,x)), and could be redefined as needed for user types.
I'd personally rather we could say "norm returns the norm of an element of a vector space"
The current norm function satisfies that definition. For matrices, it computes the induced (operator) norm, which is a perfectly valid norm for a vector space. (Vector spaces don't have to have inner products or norms at all.)
You may be somewhat confused about the definition of a "norm" if you think that the operator norm is not a "norm of a vector space".
This is also a useful distinction between norm and innernorm. If you define norm, I would say that it implies only that you have a Banach space (or at least a normed vector space). If you define innernorm, it implies that you have a Hilbert space (or at least an inner product space) and that this norm is consistent with inner.
For example, adaptive numerical integration (ala quadgk) is something that only requires a normed vector space, not an inner-product space.
Sure, sorry, I was perhaps a bit too imprecise with my language. There are obviously many valid norms for a vector space, including various operator norms.
I guess what I'm getting at is that maybe I'd prefer the choice of which norm to be more explicit than implicit? And that if you use the same function (without e.g. additional keyword arguments) you get the "same" norm, in which case the Euclidean seems like a somewhat defensible choice for AbstractArray.
This is also a useful distinction between norm and innernorm. If you define norm, I would say that it implies only that you have a Banach space (or at least a normed vector space). If you define innernorm, it implies that you have a Hilbert space (or at least an inner product space) and that this norm is consistent with inner.
This does seem reasonable, but I'd still wonder why if an object has an innernorm
it would need a different norm
? I would alternatively propose that the interface for Banach space requires norm
while an interface for inner product spaces would provide both norm
and inner
. These functions can then be used in generic code that expects objects of Banach or inner-product spaces as appropriate (EDIT: with the thought that code that works in Banach spaces will automagically also work on inner-product spaces).
I think you're proposing that norm(x) always refer to some kind of element-wise Euclidean norm (i.e. a Frobenius norm for matrices), i.e. basically what vecnorm is now modulo the recursive case. In this case we might as well redefine dot(x,y) to be the corresponding inner product (inner works too, but dot has the advantage of an infix variant x ⋅ y).
I'm fine with this in principle, but this would be a breaking change and it might be a little late before 0.7 to get that in…
Is L2 a good default in high dimensions too?
This article talks about distance, but maybe it applies to norms too:
https://stats.stackexchange.com/questions/99171/why-is-euclidean-distance-not-a-good-metric-in-high-dimensions
In this case we might as well redefine dot(x,y) to be the corresponding inner product (inner works too, but dot has the advantage of an infix variant x ⋅ y)
Can we get rid of dot
entirely? The infix notation should be unrelated to the existence of a function called dot
. Just define the infix with the inner
method for Julia arrays. Is that possible?
That is really what it is, the dot product: a convenient notation x ⋅ y for inner products between x and y vectors in R^n with Euclidean geometry.
@stevengj I think that's a good summary, yes.
@o314 Is L2 a good default in high dimensionality? Possibly not, but I'd really hate it if e.g. the norm chosen by norm(v::AbstractVector)
depended on length(v)
:) I'd equally not like it to second guess whether my matrix or higher-dimensional array is "too big for L2" - I'd suggest that perhaps this should be explicitly marked by the user?
@juliohm That's definitely possible, though like mentioned, these are breaking changes we're suggesting. (Again, modulo what to do in the recursive case and earlier discussions on the possible differences between inner
and dot
).
@stevengj, my interpretation of what @andyferris was implying is that, because of duck typing, it is hard to decide whether a user wants to interpret an object as a vector (and use a corresponding vector p
-norm) or as an operator (and compute an induced p
-norm). So I think there is no choice but to specify explicitly what behaviour is wanted. The current approach is a bit odd in the sense that norm
tries to guess implicitly whether to choose vector norm or induced norm based on the input, and vecnorm
is a way of explicitly specifying that you want the vector norm (which is also why I don't find vecnorm
such a bad name). A more radical change would be to make norm
always default to the vector norm, and specify explicitly when you want the induced norm, using a (keyword) argument or a different function altogether.
On the other hand, I also don't mind the name innernorm, which is explicit in that this is an inner product based norm (i.e. always p=2 in the Euclidean case). I find it hard to judge whether for custom objects (vec)norm should support an optional argument p as part of the interface, since in some of my use cases, only p=2 is easy to compute.
That is really what it is, the dot product: a convenient notation x ⋅ y for inner products between x and y vectors in R^n with Euclidean geometry.
I agree with this, in the sense that I don't recall ever having seen the notation x ⋅ y in the context of general (e.g. complex) vector spaces. I think only the mathematical notation (x,y) or the Dirac notation < x | y > is used in such cases. In electromagnetism one often uses E ⋅ B for vectors in 3-dimensional Euclidean space, and even if one uses complex notation (i.e. phasors) this does not imply complex conjugation. If needed, complex conjugation is denoted explicitly in such cases. So I wouldn't mind if dot just became sum(x_i * y_i) without complex or Hermitian conjugation, and inner became the correct inner product for general inner product spaces. Unfortunately, this can probably not be done in a single release cycle.
Is L2 a good default in high dimensionality? Possibly not, but I'd really hate it if e.g. the norm chosen by norm(v::AbstractVector) depended on length(v) :) I'd equally not like it to second guess whether my matrix or higher-dimensional array is "too big for L2" - I'd suggest that perhaps this should be explicitly marked by the user?
I work in the BIM world, where we handle 2D and 3D, but also 4D, 5D, 6D, and maybe 7D. We never go further. At any point we know which dimensions we are working in and which algorithm is involved. That is largely enough.
I cannot speak for people who work in ML, information retrieval, etc. There, maybe norminf is better. What matters from my point of view is guessability and stability. I would not be shocked at all if people in ML needed a different default for their work, as long as there is no confusion, e.g. it is decided explicitly and statically at compile time. It is even a luxury if it remains stable and consistent while the algorithms run.
Inspired by similar for arrays. Not fully implemented or tested.
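norm2 = x -> sqrt(inner(x, x))  # L2 norm induced by the inner product
norminf = ...
NMAX = 10
for N in 1:NMAX
    # interpolate N so each generated method dispatches on a concrete dimension
    @eval norm(a::Array{T,$N}) where {T} = norm2(a)
end
# fallback for dimensions above NMAX
norm(a::Array{T,N}) where {T,N} = norminf(a)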
Can we get rid of dot entirely? The infix notation should be unrelated to the existence of a function called dot. Just define the infix with the inner method for Julia arrays. Is that possible?
norm(x::AbstractVector, p::Real=2) = vecnorm(x, p) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L498
vecdot(x::Number, y::Number) = conj(x) * y # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L657
dot(x::Number, y::Number) = vecdot(x, y) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L659
function dot(x::AbstractVector, y::AbstractVector) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L677
# Call optimized BLAS methods for vectors of numbers
dot(x::AbstractVector{<:Number}, y::AbstractVector{<:Number}) = vecdot(x, y) # https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/generic.jl#L698
dot / vecdot imply conjugation and a decision about when to call BLAS. This has to be handled somewhere, but it should be manageable in a single namespace.
Is L2 a good default in high dimensionality? Possibly not
L2 is also the most common norm for infinite-dimensional spaces (e.g. functions). I think it is a reasonable default to expect for any vector space.
Obviously you want to have other norms available, too. If we redefine norm(x) to be elementwise L2 wherever possible, then norm(x, p) would be elementwise Lₚ, and we'd need some other function (e.g. opnorm) for the corresponding induced/operator norms.
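To make the elementwise versus induced distinction concrete, here is a small sketch using the LinearAlgebra standard library as it exists in Julia 1.x, where `norm` on a matrix is elementwise and `opnorm` is the induced norm. The matrix `A` is just an illustrative value.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]

# Elementwise norms: A is treated as a flat collection of its entries.
norm(A)          # elementwise L2 (Frobenius): sqrt(1 + 4 + 9 + 16) = sqrt(30)
norm(A, 1)       # elementwise L1: 1 + 2 + 3 + 4 = 10
norm(A, Inf)     # elementwise L∞: 4

# Induced (operator) norms: A is treated as a linear map.
opnorm(A, 1)     # maximum absolute column sum: max(1 + 3, 2 + 4) = 6
opnorm(A, Inf)   # maximum absolute row sum: max(1 + 2, 3 + 4) = 7
opnorm(A)        # largest singular value of A
```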
I agree with this, in the sense that I don't recall ever having seen the notation x ⋅ y in the context of general (e.g. complex) vector spaces.
I gave several citations in another thread, IIRC (e.g. BLAS uses dot for complex dot product, and you can find pedagogical sources even using the term for inner products of functions). The very term "inner product" is usually introduced as "a generalization of a dot product". I don't think anyone will be too surprised by the notation of dot for a Euclidean inner product, and it is convenient to have an infix operator.
We could keep dot as-is and introduce inner, of course, but I think that would create a confusing dichotomy: in the most common cases, the functions would be equivalent, but in odd cases (e.g. arrays of matrices) they would differ.
But again, it might be a little late for breaking changes, so we might have to resort to innernorm and inner. In any case, someone would need to create a PR ASAP.
If a reasonable measure of consensus forms, I may be able to devote some bandwidth to exploring implementation on a relevant (short) timescale, potential breaking changes included. I appreciate the drive to clarify these operations' semantics and give them explicit names. Best!
I see two main options:
1. Non-breaking, adds a feature: inner(x,y) and innernorm(x), replacing vecdot and vecnorm, and recursive for arrays of arrays.
2. Breaking: change norm(x,p=2) to be always elementwise and recursive, replacing vecnorm, and introduce a new function opnorm for the operator/induced norm. Make dot(x,y) the corresponding elementwise dot product, replacing vecdot. (Alternative: rename to inner, but it's nice to have an infix operator, and it is annoying to have both dot and inner.)
If I were designing things from scratch I would prefer 2, but it might be too disruptive to silently change the meaning of norm.
One intermediate option would be to define inner
and innernorm
(deprecating vecdot
and vecnorm
), and deprecate norm(matrix)
to opnorm
. Then, in 1.0, re-introduce norm(matrix) = innernorm(matrix)
. That way, people can eventually just use inner
and norm
, and we leave dot
as the current odd beast for vectors-of-arrays (coinciding with inner
for vectors of numbers).
One oddity about innernorm
is you want a way to specify the L1 or Linf "elementwise" norms, but neither of these corresponds to an inner product so innernorm(x,p)
is a bit of a misnomer.
I like your intermediate option.
As stated above, I like the name innernorm(x)
because it implies p=2
and there shouldn't be a second argument . I have objects for which I only know how to compute the inner product norm. But with the current (vec)norm
, it is unclear to me if the p
argument is part of the assumed Base interface, and so I don't know whether to omit the second argument, or to support it but then check explicitly for p != 2
and yield an error.
But I see the problem with not having any non-deprecated way of doing vecnorm(matrix, p!=2)
during the intermediate stage of your proposal.
I also like the intermediate option - we definitely want to go through a proper cycle of deprecation for the norms rather than make an immediate breaking change. (As a user, the breaking changes scare me, but I see fixing deprecations in my code for v1.0 are like an investment in clean, clear code for the future).
Would we actually need innernorm
or could we just use vecnorm
for now (and deprecate vecnorm
in favor of norm
later)?
I actually don't see any potential uproar in simply replacing dot
with inner
... I too think it's clear enough that inner product is meant to be a generalization of dot products.
Changes could be implemented in two separate PRs:
Replace dot with inner and give it the generalized meaning. Optionally, make the infix \cdot notation point to inner between Julia arrays.
My understanding is that PR 1 could be merged before Julia v1.0. It is not breaking.
Replacing dot with inner would still be breaking because dot is currently not a true inner product for arrays of arrays, so you would be changing the meaning, not just renaming. I'm for changing the meaning to be a true inner product, but if you change the meaning (defining it as the true inner product) I don't see the problem in continuing to spell it as dot.
So, we could do the following in 0.7:
- Deprecate norm(matrix) to opnorm(matrix) and norm(vector of vectors) to vecnorm.
- Deprecate dot([vector of arrays], [vector of arrays]) to a call to sum.
- Specify that vecdot(x,y) and vecnorm(x, p=2) are Euclidean inner products/norms (for p=2), and make them recursive (which is slightly breaking, but in practice probably not a big deal).
Then, in 1.0:
- Deprecate vecnorm to norm and vecdot to dot. (Not sure if this is allowed by the 1.0 release rules, @StefanKarpinski?)
(Note that the numpy.inner function, amazingly, is not always an inner product. But NumPy's terminology on inner and dot has been weird for a while.)
The reasons I prefer to continue spelling it as dot:
- It is nice to have an infix variant.
- For non-mathematicians operating on ordinary finite-dimensional vector spaces, dot is a more familiar name for the Euclidean inner product. (Mathematicians will easily adjust to using the name dot for the inner-product function on arbitrary Hilbert spaces; "dot product" has no other possible meaning for such spaces.)
- Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).
- Outside of linear algebra, inner has a lot of other potential meanings in computer science, and hence it is somewhat annoying to export this name from Base.
Can you elaborate on your opposition to the name inner? I still don't get why you prefer to go against a terminology everyone on this thread seems to agree on?
On Tue, May 15, 2018, 5:13 AM Steven G. Johnson wrote:
(Note that the numpy.inner function, https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.inner.html, amazingly, is not always an inner product.)
None of the reasons are compelling to me:
> - It is nice to have an infix variant.
Yes, and the infix notation can still exist regardless of the rename to inner as explained above.
> - For non-mathematicians operating on ordinary finite-dimensional vector spaces, dot is a more familiar name for the Euclidean inner product. (Mathematicians will easily adjust to using the name dot for the inner-product function on arbitrary Hilbert spaces; "dot product" has no other possible meaning for such spaces.)
This argument is not good: let's teach ordinary people the wrong terminology because they are lazy and can't learn a new appropriate word, and force mathematicians to use the wrong terminology against their will.
> - Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).
We don't need both, get rid of the less general name, which we agree is dot at this point.
> - Outside of linear algebra, inner has a lot of other potential meanings in computer science, and hence it is somewhat annoying to export this name from Base.
Outside of linear algebra I can find many uses for dot. Even more for the dot infix notation meaning completely different things.
I'm reposting @juliohm's last post with fixed formatting.
None of the reasons are compelling to me:
It is nice to have an infix variant.
Yes, and the infix notation can still exist regardless of the rename to inner as explained above.
For non-mathematicians operating on ordinary finite-dimensional vector spaces, dot is a more familiar name for the Euclidean inner product. (Mathematicians will easily adjust to using the name dot for the inner-product function on arbitrary Hilbert spaces; "dot product" has no other possible meaning for such spaces.)
This argument is not good: let's teach ordinary people the wrong terminology because they are lazy and can't learn a new appropriate word, and force mathematicians to use the wrong terminology against their will.
Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).
We don't need both, get rid of the less general name, which we agree is dot at this point.
Outside of linear algebra, inner has a lot of other potential meanings in computer science, and it hence it is somewhat annoying to export this name from Base.
Outside of linear algebra I can find many uses for dot. Even more for the dot infix notation meaning completely different things.
Yes, and the infix notation can still exist regardless of the rename to inner as explained above.
You can certainly define const ⋅ = inner, but then your terminology is inconsistent. I thought you didn't like using the "dot product" as a general inner product?
force mathematicians to use the wrong terminology against their will
Mathematicians know that terminology is neither right nor wrong, it is only conventional or unconventional (and maybe consistent or inconsistent). (And most people don't go into mathematics because they have a passion for prescriptive spelling.) In my experience, if you tell mathematicians that in quantum mechanics a vector is called a "state", the adjoint is called "dagger", and a dual vector is called a "bra", they are sublimely unconcerned. Similarly, I don't think any experienced mathematician will blink more than once if you tell them that in Julia an inner product is spelled dot(x,y) or x ⋅ y, especially since the terms are already understood to be synonyms in many contexts. (I doubt you will find any mathematician who does not know instantly that you are referring to an inner product if you say "take the dot product of two functions in this function space".)
On the other hand, for people who aren't trained mathematicians and haven't been exposed to abstract inner-product spaces (i.e. the majority of users), my experience is that unfamiliar terminology is more of an obstacle. "How do I take a dot product of two vectors in Julia?" will become a FAQ.
There really is no mathematical difficulty here to be solved aside from choosing the semantics. The spelling question is purely one of convenience and usage.
Outside of linear algebra I can find many uses for dot. Even more for the dot infix notation meaning completely different things.
Except that Julia and many other programming languages have had dot
for years and it hasn't been a problem. inner
would be new breakage.
Ultimately, the spelling of this (or any other) function is a minor matter compared to the semantics and the deprecation path, but I think the balance tips in favor of dot
.
You can certainly define const ⋅ = inner, but then your terminology is inconsistent. I thought you didn't like using the "dot product" as a general inner product?
I think you still don't get it. There is no inconsistency in calling dot an inner product. It is an inner product, a very specific and useless one for many of us. Nothing more than sum(x.*y)
.
If the term dot
ends up in Julia having the semantics of inner
, this will be a historical disaster that I can guarantee to you many will feel annoyed. I can foresee professors in a classroom explaining things like: "You know, we are now gonna define the inner product for our space, but in Julia someone (@stevengj) decided to call it dot."
I will make sure I will screenshot this thread for future reference if that ends up happening.
You are the only one, @stevengj, insisting on the dot terminology; no one else has manifested opposition to it. It would be nice if you could reconsider this fact before making a decision.
It is an inner product, a very specific and useless one for many of us. Nothing more than sum(x.*y).
If you think "dot product" can only refer to the Euclidean inner product in ℝⁿ, then you shouldn't define const ⋅ = inner, you should define only ⋅(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) = inner(x,y).
You can't have it both ways: either inner can use ⋅ as an infix synonym (in which case the infix operator is both "wrong" in your parlance and the naming is inconsistent) or it doesn't have an infix synonym (except in one special case).
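The "one special case" alternative can be sketched as follows. The generic `inner` method here is a hypothetical stand-in, and LinearAlgebra is deliberately not loaded, so this defines a fresh `⋅` rather than extending the one LinearAlgebra exports.

```julia
# Hypothetical generic inner product on vectors (conjugating the first argument).
inner(x::AbstractVector, y::AbstractVector) = sum(conj(a) * b for (a, b) in zip(x, y))

# Infix form offered only for real vectors, as in the definition quoted above.
⋅(x::AbstractVector{<:Real}, y::AbstractVector{<:Real}) = inner(x, y)

[1, 2] ⋅ [3, 4]               # 11
inner([1im, 2], [1im, 2])     # 5 + 0im

# [1im, 2] ⋅ [1im, 2] throws a MethodError: complex vectors have no infix form here.
```

With `const ⋅ = inner` instead, the last line would succeed, which is exactly the "infix synonym for every inner product" choice being contrasted.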
I can foresee professors in a classroom explaining things like: "You know, we are now gonna define the inner product for our space, but in Julia someone (@stevengj) decided to call it dot."
Ha ha, I'm willing to take the heat from this imaginary outraged professor. Seriously, you need to look around more if you think the term "dot product" is only ever used in ℝⁿ, or that mathematicians are outraged if the term is used in other Hilbert spaces.
this will be a historical disaster
Seriously?
This discussion seems to be eroding beyond what one might consider a welcoming, civil and constructive environment. Opinions and backgrounds differ, but please refrain from making personal attacks or placing blame on anyone and assume all parties are debating for their point in good faith.
I can foresee professors in a classroom explaining things like: "You know, we are now gonna define the inner product for our space, but in Julia someone (@stevengj) decided to call it dot."
It may also be worthwhile here to note that Steven _is_ a professor. :wink:
I am also on the fence about removing dot in favor of inner. The dot term is quite widely used, and not having the function in Julia, when it is in Python and MATLAB, would be surprising. However, I do also like the term inner, given it is more appropriate for non-ℝⁿ vector spaces, and especially matrices.
Incidentally, while I was testing what methods were doing in Julia, I noticed that dot only works on real vectors/matrices. Is that intentional?
Having both inner and dot would be confusing, since they would coincide in some cases but maybe not others (if we keep the current dot meaning).
@stevengj Would it be completely ridiculous to replace vecdot
with inner
, and also keep dot
? Right now, that exact problem you are describing exists already, just with vecdot
instead of inner
.
OK... looking forward, what are the live suggestions? Are they to:
- Use dot as a generic inner product for a wider range of types. It's already correctly recursive on vectors-of-vectors, but we would make it work on matrices, etc. (@jebej I don't feel having both dot and inner is that useful, and as Steven says, we at least colloquially use dot to mean inner product quite often, and this is not incorrect - it's just terminology.)
- Make norm a bit more consistent with the above dot and across all AbstractArray, eventually introducing e.g. opnorm for operator norms (on AbstractMatrix) and having (in new-to-old notation) norm(matrix) == vecnorm(matrix) after suitable deprecations. At this point perhaps we don't need vecdot and vecnorm anymore?
Is that right? I think these would at least get us to a relatively consistent linear algebra story with "clean" interfaces, where generic code can use dot and norm as a reliable pair for working with inner-product spaces independent of type.
@andyferris, yes, I think if we make this change then we only need dot
and norm
(which are now the recursive Euclidean operations on arrays or arrays-of-arrays of any dimensionality, though for norm we also define norm(x,p)
to be the p-norm) and opnorm
, and no longer have vecdot
or vecnorm
.
Note that the change to dot
is a breaking change because dot
is currently not a true inner product for vectors of matrices (#22392), something that was debated for a long time in #22220 (at which point eliminating vecdot
was not considered IIRC). However, that was introduced in 0.7, so it doesn't break any actual released code. In fact, dot
in 0.6 is already the Euclidean dot product on arbitrary-dimensionality arrays, somewhat by accident (#22374). The suggested change here would restore and extend that 0.6 behavior and change norm
to be consistent with it.
One question is whether norm(x,p) would call norm(x[i]) or norm(x[i],p) recursively. Both are potentially useful behaviors. I lean towards the former because it is more general: x[i] may be some arbitrary normed vector space that only defines norm but not the p-norm. Calling norm recursively is also what vecnorm does now, so it is consistent with deprecating vecnorm to norm.
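The two recursive behaviors can be compared side by side. In this sketch the helper names `pnorm_outer` and `pnorm_full` are hypothetical, chosen only to label the two variants; `norm` is the one from LinearAlgebra in Julia 1.x.

```julia
using LinearAlgebra

# Variant A: p applies only at the outer level; each element contributes its
# default norm, so elements only need to define norm(x).
pnorm_outer(x, p) = norm([norm(xi) for xi in x], p)

# Variant B: the same p is passed down to every element, so elements must
# also support norm(x, p).
pnorm_full(x, p) = norm([norm(xi, p) for xi in x], p)

x = [[3.0, 4.0], [6.0, 8.0]]

pnorm_outer(x, 1)   # 5 + 10 = 15
pnorm_full(x, 1)    # (3 + 4) + (6 + 8) = 21
pnorm_outer(x, 2) ≈ pnorm_full(x, 2)   # true: the variants agree for p = 2
```

The two variants only differ for p ≠ 2, which is why the choice matters mainly for element types that cannot compute a general p-norm.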
@jebej, dot
on both master and 0.6 works for me on complex arrays: dot([3im],[4im])
correctly returns 12+0im
, for example.
Another good point about changing norm(matrix)
to be the Frobenius norm is that is a lot cheaper. It is common to just use norm(A-B)
to get a sense of how big the difference between two matrices is, but not to care too much about the specific choice of norm, but many users won't realize that the current default norm(matrix)
requires us to compute the SVD.
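A small sketch of the cost difference, written out by hand so the work involved is visible. It assumes the LinearAlgebra standard library (for `svdvals`); the matrices are illustrative values only.

```julia
using LinearAlgebra

A = [1.0 2.0; 3.0 4.0]
B = [0.5 2.0; 3.0 3.0]
D = A - B

# Frobenius norm: a single pass over the entries.
frob = sqrt(sum(abs2, D))

# Operator 2-norm: the largest singular value, which needs an SVD.
op2 = maximum(svdvals(D))

# For any matrix, op2 <= frob <= sqrt(rank(D)) * op2, so either one gives a
# usable sense of how large the difference is.
op2 <= frob + eps(frob)
```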
Wonderful to see consensus forming around several major points! :) (Unless someone beats me to it (please do if you have bandwidth!) or an alpha tag hits prior, I will give implementing the present consensus points a shot after shipping #26997.) Best!
Another link for future reference: https://math.stackexchange.com/a/476742
To illustrate the poor naming that is being adopted here consciously, and the poor decision imposed by a single mind. Dot and inner products have different mathematical properties. You are forcing a whole community against what is well known in the mathematics literature.
And for future readers, what should have been done instead had we had a collective decision:
# make dot what it is, a NOTATION
⋅(x::AbstractVector, y::AbstractVector) = sum(x[i]*y[i] for i in eachindex(x, y))
# replace the name dot by the more general inner
function inner end  # generic function; each space adds its own method
I guess we will just be the first people in the universe to employ the term "dot product" for an inner product on anything but ℝⁿ. It's a good thing I was able to impose my will on this thread (mainly by blackmailing the other developers) to force this innovation into the world! No longer will the dot product be relegated to mere "notation": instead, it will be a symbol that means an inner product (as all should know, assigning meanings to symbols is the opposite of "notation").
Very good decision making :clap: it was definitely a consensus. Read the comments above, and you will see how everyone agreed. :+1:
Or maybe I should quote some comments so that it is very clear how it was a consensus:
>
Right - vecdot could be renamed inner
by @andyferris
Option 2 (probably better): use more mathematically correct names
inner
dimension
But what to do with norm?
by @Jutho
I agree, as an alternative to vecdot we could introduce a new method inner
by @Jutho
I also find the vecdot name odd, in fact, I didn't even know it existed and had made my own function for it... called inner.
by @jebej
And many more...
People can debate vociferously with one another, and raise many points of disagreement, but still arrive at a consensus (albeit not always unanimity) by being persuaded and by balancing the pros/cons. (I agree that there are both pros and cons of each option here.) I'm sorry that the result which seems (tentatively!) to be gelling here is not the outcome that you preferred, but I'm not sure how you think I "imposed" my will.
(Not that any final decision has been made, of course; there isn't even a PR yet, much less anything merged.)
I only wish we could make a decision that is based on the audience of the language. If someone picks Julia as a tool, I am sure the person has at least heard of the term inner product. It is quite a popular concept and far from being exotic. Exotic things include "persistent homology" and "quantum theory"; these are less widespread, and I would be against including this type of terminology.
After all I just want to have a language that is the best language for scientific computing, math, etc.
@juliohm, all of the arguments have been based on the needs of who we think the audience is, and all of us are trying to make Julia as good a language as possible. Reasonable people can come to different conclusions about terminology, since mathematics does not determine spelling.
Firstly, as mentioned above, I can certainly agree with @stevengj 's current proposal and sticking to dot
as the general name for inner product. Also, I dislike the way this discussion is going and would certainly like to be quoted correctly. @juliohm, the second quote you attribute to me is not mine.
That being said, I would like to mention the following as food for thought in the consideration of pros and cons. The following are mostly cons, but I agree with the pros mentioned by @stevengj. There could easily be separate use cases for having dot just mean sum(x[i]*y[i] for i ...). In the cases where the infix dot notation is most used in mathematics, this is indeed typically the meaning. As an inner product, the infix dot notation is typically (though certainly not exclusively) reserved for real vector spaces. Other use cases include enabling things like σ ⋅ n with σ a vector of Pauli matrices and n a vector of scalars. This was one of the motivations behind the way dot is currently implemented, as was pointed out to me in some other thread. The fact that BLAS decided to only use dot for real vectors and make a distinction between dotu and dotc for complex vectors is another issue to consider. People with BLAS background might get confused whether, having complex vectors, they want to compute dot(conj(u),v) or dot(u,v) when they want the true inner product (i.e. dotc). Furthermore, they might look for a way to do dotu without first making a conjugate copy of the vector at hand.
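The σ ⋅ n use case can be sketched with a plain contraction. The helper name `contract` is hypothetical; it is the unconjugated, non-recursive sum of products described above, not any function from Base or LinearAlgebra.

```julia
# Pauli matrices.
σx = [0 1; 1 0]
σy = [0 -im; im 0]
σz = [1 0; 0 -1]
σ = [σx, σy, σz]

n = [0.0, 0.0, 1.0]   # a vector of real scalars

# Plain contraction: no conjugation and no recursion into the elements,
# so the result is a 2×2 matrix rather than a scalar.
contract(x, y) = sum(x[i] * y[i] for i in eachindex(x, y))

contract(σ, n)        # equals σz for this choice of n
```

A true inner product could never return a matrix here, which is why this use case pulls toward a separate meaning for the infix dot.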
@Jutho the quote is yours, your full comment is copied below:
I agree, as an alternative to vecdot we could introduce a new method inner, but I don't know of a good name to "replace" vecnorm. In fact, I don't find vecnorm that bad, vector norm is a well established and explicit term for the operation we want.
In any case, the quoting is intended to show what is the desire of many here (at least as a first natural thought) when we think about this subject. If you changed your desire over time, that is another story. I myself would never pop up the term "dot" out of my head during any modeling with Hilbert spaces. It feels unnatural and inconsistent with what I learned.
@Jutho: Furthermore, they might look for a way to do
dotu
without first making a conjugate copy of the vector at hand.
The possibility of exporting a dotu function has come up from time to time (see e.g. #8300). I agree that this is sometimes a useful function: an unconjugated Euclidean "inner product" (not really an inner product anymore) that is a symmetric bilinear (not sesquilinear) form dotu(x,y) == dotu(y,x) (not conjugated) even for complex vector spaces. But the utility of that operation is not limited to ℂⁿ; for example, this kind of product often shows up in infinite-dimensional vector spaces (functions) for Maxwell's equations as a consequence of reciprocity (essentially: the Maxwell operator in typical lossy materials is analogous to a "complex-symmetric matrix", symmetric under the unconjugated "inner product"). So, if we define dot(x,y) to be the general Euclidean inner product (with the first argument conjugated), it would be quite natural to define a dotu(x,y) function for the unconjugated Euclidean product on any vector space where it makes sense. I don't see the possibility of a dotu function as an argument against dot, however. In the majority of cases, when you are working with complex vector spaces you want the conjugated product, so this is the right default behavior.
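The difference between the conjugated and unconjugated products is easy to see on a small complex example. Here `dotu` is a hypothetical helper defined for illustration (LinearAlgebra does not export a function of that name), while `dot` is the LinearAlgebra function in Julia 1.x.

```julia
using LinearAlgebra

# Hypothetical unconjugated product: a symmetric bilinear form.
dotu(x, y) = sum(x[i] * y[i] for i in eachindex(x, y))

x = [1 + 2im, 3im]
y = [2 - 1im, 1 + 1im]

dot(x, y) == conj(dot(y, x))   # true: dot conjugates its first argument
dotu(x, y) == dotu(y, x)       # true: no conjugation, so the order is irrelevant
dot(x, x)                      # 14 + 0im: real and nonnegative, as a norm requires
dotu(x, x)                     # -12 + 4im: not usable as a squared norm
```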
But I agree that one possibility would be to define dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x))
, which is how it's currently defined in master (not 0.6), and define inner(x,y)
as the true dot product. This has the advantage of supplying both functions, both of which may be useful in certain cases. However, we then have two functions that almost always coincide except for arrays of matrices, and I suspect it would be a bit confusing to decide when to use one or the other. Many people would write dot
when they meant inner
, and it would work fine for them in most cases, but then their code would do something unexpected if it is passed an array of matrices. My suspicion is that in 99% of cases people want the true inner product, and the "sum of product" version can be left to a package, if indeed it is needed at all (as opposed to just calling sum
).
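The "arrays of matrices" case where the two candidates diverge can be sketched like this. Both helper names, `sumprod` and `recinner`, are hypothetical labels for the two definitions under discussion; only `dot`, `vec`, and `tr` come from Julia itself.

```julia
using LinearAlgebra

x = [[1 2; 3 4], [0 1; 1 0]]
y = [[1 0; 0 1], [2 0; 0 2]]

# "Sum of products": the result is a matrix, so this is not an inner product.
sumprod(x, y) = sum(x[i]' * y[i] for i in eachindex(x, y))

# Recursive Euclidean inner product: always a scalar.
recinner(x, y) = sum(dot(vec(x[i]), vec(y[i])) for i in eachindex(x, y))

sumprod(x, y)    # [1 5; 4 4]
recinner(x, y)   # 5
```

For this example the scalar is the trace of the matrix-valued result, `tr(sumprod(x, y)) == recinner(x, y)`, which shows how closely related the two are and why code written with one in mind can silently misbehave with the other.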
@juliohm , I misread your post as I thought the names were above (instead of below) the respective quotes, hence I thought you attributed the quote of @jebej to me. My apologies for that.
@stevengj, I certainly was not thinking of having dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x))
as a reasonable default. In a case like σ ⋅ n, the complex/hermitian conjugation of the first or second argument is unnecessary. So what I was saying is that, in many (but indeed not all) cases where the infix dot notation is used in scientific formulas, its meaning coincides with dotu
, i.e. sum(x[i]*y[i] for i = 1:length(x))
without conjugation, either as inner product on real vector spaces or as some more general construction.
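To make the conjugated versus unconjugated distinction concrete, here is a small sketch. `dotu_sketch` is a hypothetical helper written for this illustration; it is not something `LinearAlgebra` exports.

```julia
using LinearAlgebra

# Hypothetical helper for the unconjugated product; not a LinearAlgebra export.
dotu_sketch(x, y) = sum(x[i] * y[i] for i in eachindex(x))

xr, yr = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
xc, yc = [1 + 2im, 3 - 1im], [2 - 1im, 0 + 1im]

# Real vectors: both definitions agree.
println(dot(xr, yr) == dotu_sketch(xr, yr))

# Complex vectors: dot conjugates its first argument, the sketch does not.
println(dot(xc, yc))
println(dotu_sketch(xc, yc))
```

Only the conjugated version gives a real, non-negative value for `dot(x, x)` on complex input, which is why it is the one that qualifies as an inner product.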
So if I were to make an alternative proposal (though I am not necessarily advocating it), it would be to have two functions:
dot(x,y) = sum(x[i]*y[i] for i...)
, which would still be the correct inner product for real vectors (which is likely the use case of people who are less familiar or unfamiliar with the term inner product) but also allows more general constructions like σ ⋅ n, and is thus the function corresponding to the infix notation
inner(x,y)
being the always valid inner product, with conjugation and recursion, that will be used by people in more general and technical contexts.
I am not defending this as a good choice to adopt in the Julia Language, but I do think this is how it is used in much of the literature. When infix dot is used, it is either as an inner product in the context of real vectors, or in some more general construction where it just means contraction. When a general inner product on arbitrary vector spaces is intended, most scientific literature (but you certainly have shown counter examples) switches to <u,v>
or <u|v>
(where in the first notation there is still discussion which of the two arguments is conjugated).
I could live with this proposal, but I could equally well live with having only dot
as the general inner product. In the end, it's a matter of having good documentation, and I too cannot believe that anyone would stumble over this "design" choice.
@Jutho, I agree that it is not uncommon to define dot
to just mean contraction. Certainly, one can find examples both ways. For example, in programming languages and popular libraries:
Unconjugated: Numpy dot
(and, bizarrely, inner), Mathematica's Dot
, Maxima .
, BLAS dotu
Conjugated: Matlab's dot
, Fortran's DOT_PRODUCT, Maple's DotProduct
, PETSc's VecDot
, Numpy vdot
, BLAS dotc
(note that the lack of overloading in Fortran 77 made it impossible to call this dot
even if they wanted to), Eigen's dot
On the one hand, the conjugated inner product is usually introduced in textbooks as the "natural" extension of the "dot product" notion to complex vectors; the unconjugated version is in some sense an "unnatural" extension, in that it is usually not what you want. (Consider the fact that, of the languages that provide a conjugated dot
function in their standard libraries, namely Matlab, Fortran, Julia, and Maple, only Maple provides an unconjugated variant, hinting at a lack of demand.) On the other hand, an unconjugated dotu
function is convenient (as a supplement) in certain special cases (some of which I mentioned above).
If we have both dot
and inner
, I suspect that many people will end up using dot
by accident when they really want inner
for their code to be generic. (I'd bet that Numpy's inner
is unconjugated due to just such an accident: they implemented it with real arrays in mind, and didn't think about the complex case until it was too late to change, so they added the awkwardly named vdot
.) Whereas if we have dot
and (possibly) dotu
, it will be clearer that dot
is the default choice and dotu
is the special-case variant.
(I agree that ⟨u,v⟩
, ⟨u|v⟩
, or (u,v)
are more common notations for inner products on arbitrary Hilbert spaces, and they are what I typically use myself, but those notations are a nonstarter for Julia. There was some discussion of parsing Unicode brackets as function/macro calls, e.g. #8934 and #8892, but it never went anywhere and this seems unlikely to change soon.)
I fully agree with your assessment @stevengj .
Me too.
I suspect it's time for one of us to play with either implementation in a PR and see how it comes out.
@Jutho I always saw the dot product with Pauli matrices as shorthand for a contraction over higher order tensors... one of the vector spaces is real, 3D.
I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces, and they are what I typically use myself, but those notations are a nonstarter for Julia.
It would actually be possible to make ⟨u,v⟩
work.
@StefanKarpinski: It would actually be possible to make ⟨u,v⟩ work.
Absolutely, and supporting this precise notation was suggested in #8934, but it never went anywhere. (Note also that angle brackets have other common uses, e.g. ⟨u⟩ often denotes an average of some kind.) It is non-breaking and could still be added at some point, but it doesn't seem reasonable to expect in the near term. It's also quite slow to type \langle<tab> x, y \rangle<tab>
, so it's not very convenient from a programming standpoint for an elementary operation.
and we can't overload <> for it, right?
No
Can't say I've read every comment on this humongous thread, but I just want to highlight a few points, some of which have been made before:
dot
being conjugated by default is the best thing ever. No decision point here, I just wanted to say how glad I am that I no longer have to do these conj(dot())
s!
norm
: if you're coding up an optimization algorithm and want to stop whenever norm(delta x) < eps
, you're going to write norm
. But then you want to optimize with respect to an image or something, you run your code, and it suddenly launches into an unkillable (because BLAS) SVD of a big array. This is not academic: it has caused trouble in Optim.jl, and doubtless in other packages as well. Nobody is going to know that vecnorm
exists unless they have a specific reason to look for it.
Merging dot
and vecdot
, and norm
and vecnorm
is good, even if it removes a bit of flexibility in array-of-arrays cases. For norms, I'd add that often, when working with things on which there are multiple norms defined (e.g. matrices), what the user wants is to call norm
to get a norm, without particularly caring which one. Induced norms are most often of theoretical rather than practical interest due to their computational intractability. They are also specific to the 2D-array-as-operator interpretation rather than the 2D-array-as-storage one (an image is a 2D array, but it's not an operator in any useful sense). It's good to have the possibility of computing them, but they don't deserve to be the default norm
. Reasonable, simple and well-documented defaults that have discoverable alternatives are better than attempted cleverness (if the user wants to do a clever thing, let them do it explicitly).
Therefore, +1 on @stevengj's
yes, I think if we make this change then we only need dot and norm (which are now the recursive Euclidean operations on arrays or arrays-of-arrays of any dimensionality, though for norm we also define norm(x,p) to be the p-norm) and opnorm, and no longer have vecdot or vecnorm.
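As a quick illustration of the split described in that quote, here is how the two functions behave in Julia 1.0 and later, where this naming was adopted. The matrix is an arbitrary example.

```julia
using LinearAlgebra

A = [3.0 0.0; 0.0 4.0]

println(norm(A))        # Euclidean (Frobenius) norm of the entries: sqrt(3^2 + 4^2)
println(opnorm(A))      # induced 2-norm: the largest singular value
println(norm(vec(A)))   # same value as norm(A): the array is treated as storage

# norm is also recursive on arrays of arrays.
println(norm([[3.0, 0.0], [0.0, 4.0]]))
```

Only `opnorm` needs an SVD here, so the cheap operation is the one with the short name.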
A more "julian" alternative to norm/opnorm might be to have an Operator type, which could wrap a 2D array, on which norm
does opnorm. This can be done at the level of packages (several of which already exist)
I'd much rather type opnorm(matrix)
than norm(Operator(matrix))
…
I'm going to chime in from the peanut gallery here and say that I like where this is going: vecnorm
and vecdot
have always bothered me. Needing to explicitly ask for the operator norm, which has always seemed fairly specialized to me, seems much saner than having to ask for a norm that's much faster and easier to compute (e.g. the Frobenius norm). Writing opnorm
seems like a fine interface for asking for the relatively specialized operator norm.
I also feel that having a subtle distinction between dot
and inner
is likely to lead to confusion and rampant misuse. Lecturing users about which function they're _supposed_ to use when both functions do what they want and one of them is easier tends not to work out very well. My impression is that it's relatively rare in generic code that sum(x*y for (x,y) in zip(u,v))
is actually what you want when a true inner product ⟨u,v⟩
actually exists. When that's really what's wanted, it's fairly easy, clear and efficient (because Julia is what it is) to just write something like that to compute it.
Whether to call the u⋅v
function dot
or inner
seems like the least consequential part of all of this. I'm fairly sure that neither choice would be looked back upon by historians as a disaster, although the notion that historians would care at all certainly is flattering. On the one hand, if we agree to keep the "true inner product" meaning of u⋅v
then yes, inner
is the more correct mathematical term. On the other hand, when there is a syntax with a corresponding function name it tends to confuse users less when the name matches the syntax. Since the syntax here uses a dot, that rule of thumb supports spelling this operation as dot
. Perhaps this might be a reasonable case to define const dot = inner
and export both? Then people can use or extend whichever name they prefer since they're the same thing. If someone wants to use either name for something else they can, and the other name will remain available with its default meaning. Of course that would make three exported names for the same function (dot
, inner
and ⋅
), which seems a bit excessive.
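A rough sketch of what the `const dot = inner` idea would look like, written inside a hypothetical module so it does not touch `LinearAlgebra`. The module name and the toy `inner` definition are assumptions made for this illustration only.

```julia
module InnerAliasSketch   # hypothetical module, for illustration only

# A toy conjugated inner product on vectors.
inner(x::AbstractVector, y::AbstractVector) = sum(conj(a) * b for (a, b) in zip(x, y))

# Alias: both names are bound to the same function object.
const dot = inner

# A method added under one name is visible under the other.
dot(x::Number, y::Number) = conj(x) * y

end # module

println(InnerAliasSketch.dot === InnerAliasSketch.inner)
println(InnerAliasSketch.inner(2, 3))
println(InnerAliasSketch.dot([1im, 2], [1im, 1]))
```

Because the alias is a plain constant binding, there is still only one function and one method table, which is what makes "use or extend whichever name you prefer" workable.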
Is it an option to remove the ⋅
symbol or replace it with <u,v>
?
Comments:
<u,v> * M * x
vs.
u ⋅ v * M * x
The <u,v>
syntax implies association: first we operate on u
and v
and then the rest of the expression follows.
If a user made the effort to type <u,v>
, it is very unlikely that he had in mind a simple sum(x[i]*y[i])
. The symbol ⋅
is easy to skip with the eyes, and has many other connotations. In particular, in linear algebra, for a vector space V over a field F, the product of a scalar α ∈ F
with a vector v ∈ V
is denoted α ⋅ v
in various textbooks.
Removing or replacing the ⋅
would also eliminate the issue of multiple names being exported. One would only have to export inner
and <,>
for general inner products, with the default implementation for arrays matching the iterable summation semantics.
If one needs to define a scalar-by-vector product as described above for a vector space V over a field F, he/she would be able to define the ⋅
notation for it. The vector space would then be fully defined with nice short syntax, and could be extended to a Hilbert space by further defining <u,v>
.
We definitely cannot use the syntax <u,v>
; the syntax we could use is ⟨u,v⟩
, note the Unicode brackets, not the less-than and greater-than signs, <
and >
. We also have u'v
as a syntax for something that is either a dot product or an inner product? (I'm not sure which...)
Yes, sorry, the Unicode version of it. It would be very clear to read. It would also solve this issue with multiple names, and free ⋅
for other purposes.
I don't think we want to use ⋅
for any other purpose; that seems like it would be confusing.
Just imagining how wonderful it would be to be able to write code that looks like:
⟨α ⋅ u, v⟩ + ⟨β ⋅ w, z⟩
for abstract vectors (or types) u,v,w,z ∈ V
and scalars α, β ∈ F
.
u'v
is an inner product (and a dot product, if you follow the conjugated convention) only for 1d arrays, not for e.g. matrices. (This is another reason why it is pointless to limit infix dot to 1d arrays, since we already have a terse notation for that case.)
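A short check of that remark, assuming Julia 1.0 or later. The vectors and matrices are arbitrary examples.

```julia
using LinearAlgebra

# 1d arrays: u'v is the conjugated inner product and matches dot.
u = [1 + 1im, 2 + 0im]
v = [3 + 0im, 1 - 1im]
println(u'v == dot(u, v))

# Matrices: A'B is a matrix product, while dot(A, B) is a scalar.
A = [1 2; 3 4]
B = [0 1; 1 0]
println(A'B)
println(dot(A, B))
```

For matrices the scalar returned by `dot(A, B)` equals the trace of `A'B`, so the infix form is not redundant with the adjoint syntax.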
Stefan, "correct mathematical term" is a category error: mathematical correctness is not a concept that applies to terminology/notation. (Substitute "conventional" for "correct." But then the concern becomes less urgent.)
More use cases: https://stackoverflow.com/questions/50408177/julia-calculate-an-inner-product-using-boolean-algebra
And a formal derivation of boolean inner products using the ⟨,⟩
notation: https://arxiv.org/abs/0902.1290
EDIT: fixed link to paper
What do you think of the angle brackets syntax proposal? Would it solve the issues raised here?
So what's your proposal exactly? Is it roughly this:
dot
to inner
u⋅v
to ⟨u,v⟩
So then there would be no dot
function and no ⋅
operator?
Would that change be something reasonable?
Sorry for the delay to reply, I am at a conference with limited access to internet.
And just for clarity and completeness, what is the counterproposal here? Do nothing?
To clarify the proposal even further, there is semantic change involved: generalized inner products.
Heads up: we've now debated this to the point where there's a real risk it won't make it into 0.7-alpha. That doesn't mean it can't be changed after the alpha, but there will be a lot more reluctance to change things after that.
Yes, I wish I had the skills to have submitted a PR a long time ago. It is beyond my abilities to make it happen, even though I find it to be an extremely important feature.
Even discounting the operator syntax question, there's still some complexity with name shadowing and multi-stage deprecations for each set of semantic concepts (current dot
and vecdot
and current norm
and vecnorm
).
For the dot
side it seems like the whole space of options (again discounting operators) is:
I. Silently break dot
on vectors of arrays by changing the behavior in 0.7 to an inner product without a standard depwarn (although you can warn that the behavior is being changed). Deprecate vecdot
to dot
in 0.7 as well.
II. In 0.7, deprecate vecdot
on all inputs to inner
.
III. In 0.7, deprecate dot
on vectors of arrays to its definition, and dot
on other inputs and vecdot
on all inputs to inner
.
IV. In 0.7, deprecate both dot
and vecdot
on vectors of arrays either to unexported functions or to their definitions, and vecdot
on all other inputs to dot
. In 1.0, add dot
on vectors of arrays with inner product semantics.
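For readers unfamiliar with what "deprecate X to Y" means mechanically, here is a minimal sketch using `Base.@deprecate`. The names `old_vecdot` and `new_inner` are hypothetical stand-ins; this is not the deprecation that Base itself carried for `vecdot`.

```julia
# Hypothetical names standing in for the replacement and the deprecated function.
new_inner(x, y) = sum(conj(a) * b for (a, b) in zip(x, y))

# Forward the old name to the new one; calling it emits a deprecation
# warning when Julia runs with --depwarn=yes. The final `false` keeps
# the old name unexported.
Base.@deprecate old_vecdot(x, y) new_inner(x, y) false

println(old_vecdot([1, 2], [3, 4]))
```

Options I to IV above differ mainly in which old name forwards to which new name, and in whether a behavior change is allowed to happen without such a forwarding step.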
For the norm side there's some consensus around a single path (in 0.7, deprecate norm
on matrices to opnorm
and possibly deprecate vecnorm
to innernorm
; in 1.0, add norm
on matrices with current vecnorm
semantics), but that also results in an extra name in 1.0 (either vecnorm
or innernorm
); again a way to avoid that could be to deprecate vecnorm
in 0.7 to its definition or to an unexported function like Base.vecnorm
rather than to an exported name.
...I think. Hope I didn't make things mushier than they already were.
Can anyone familiar with the codebase submit a PR for the change?
Can we split off the norm stuff that everyone seems to agree on and get that done at least? The dot
versus inner
bit is quite a bit more controversial, but let's not let that stymie the part that's not.
@StefanKarpinski, note that they are somewhat coupled: for types where you have both a dot (inner) product and a norm, they should be consistent.
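One way to read "consistent" here is that the norm of a type should be the one induced by its inner product. A minimal sketch for a custom non-array type, assuming Julia 1.0 or later; `WeightedVec` is a hypothetical type invented for this example.

```julia
using LinearAlgebra

# Hypothetical type: a real vector with positive weights defining the inner product.
struct WeightedVec
    data::Vector{Float64}
    weights::Vector{Float64}
end

# Weighted inner product sum_i w_i * a_i * b_i.
# Assumes both arguments carry the same weights; only a's are used.
LinearAlgebra.dot(a::WeightedVec, b::WeightedVec) = sum(a.weights .* a.data .* b.data)

# Define the norm from the inner product so the two stay consistent.
LinearAlgebra.norm(a::WeightedVec) = sqrt(dot(a, a))

a = WeightedVec([3.0, 4.0], [2.0, 0.5])
println(dot(a, a))
println(norm(a))
```

Generic code that only calls `dot` and `norm` then sees a coherent geometry, regardless of how the type stores its data.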
Ok, I don't really care which way this goes. Whoever does the work gets to decide.
I had a PR ( #25093 ) to make vecdot
behave as a true inner product (and vecnorm
as the corresponding norm), by making them recursive. This might be useful as a starting point for what the future dot
and norm
should look like. Unfortunately, my lack of git skills made me screw up that PR so I closed it, planning to return to it after the new iteration syntax was completed.
However, having just become father for the second time a few days ago means that there are currently no "free time" slots in my calendar.
having just become father for the second time a few days ago
Congratulations Jutho!
Yes, congrats!
It seems like there might be some consensus forming around the idea of having both dot
and inner
, where:
inner
is a true recursive inner product
dot
is dot(x,y) = sum(x[i]'*y[i] for i = 1:length(x))
, conjugated or not, and would therefore overlap with inner
for Vector{<:Number}
or Vector{<:Real}
Regarding:
Many people would write dot when they meant inner, and it would work fine for them in most cases, but then their code would do something unexpected if it is passed an array of matrices.
I don't believe that would be an issue. Since this is a fairly uncommon operation, I would expect people to at the very least try it to see what it does and/or go look at the documentation.
I think that the above would be a great change, and not very disruptive since the semantics of dot
are not changed in most cases.
It seems like there might be some consensus forming around the idea of having both
dot
and inner
To the contrary, the discussion from https://github.com/JuliaLang/julia/issues/25565#issuecomment-390069503 on appears to favor having one or the other but not both, as e.g. laid out in https://github.com/JuliaLang/julia/issues/25565#issuecomment-390388230 and well supported with reactions.
Maybe inner
(and also dot
) should be recursive inner/dot/scalar products and the old behaviour could be implemented in functions such as dotc(x,y) = sum(x[i]' * y[i] for i in eachindex(x))
and dotu(x,y) = sum(transpose(x[i]) * y[i] for i in eachindex(x))
? The names dotu
and dotc
would match corresponding BLAS names.
(I agree that ⟨u,v⟩, ⟨u|v⟩, or (u,v) are more common notations for inner products on arbitrary Hilbert spaces, and they are what I typically use myself, but those notations are a nonstarter for Julia. There was some discussion of parsing Unicode brackets as function/macro calls, e.g. #8934 and #8892, but it never went anywhere and this seems unlikely to change soon.)
@stevengj, when you added this paragraph to a previous comment by yourself, you meant that the syntax ⟨u,v⟩
is hard to implement in the language?
Any chance this feature will make it to Julia v1.0? I have so many ideas and packages that depend on the notion of general inner products. Please let me know if I should lower my expectations. Sorry for the constant reminder.
Have you not seen #27401?
Thank you @jebej and thank you @ranocha for taking the lead :heart:
when you added this paragraph to a previous comment by yourself, you meant that the syntax ⟨u,v⟩ is hard to implement in the language?
Not technically hard to add to the parser, but it has proven difficult to come up with a consensus on how (and whether) to represent custom brackets in the language. See the discussion at #8934, which went nowhere 4 years ago and has not been revived since. (Add that to the fact that in different fields people use the same brackets for many different things, e.g. ⟨u⟩ is used for ensemble averages in statistical physics.) Another issue, raised in #8892, is the visual similarity of many different Unicode brackets.
Thank you @stevengj, I appreciate the clarifications. I am already very excited that we are gonna have general inner products standardized across packages. :100: Maybe the angle bracket notation could shine in another release cycle in the future. Not as important, but quite convenient to be able to write code that is literally like the math in our publications.
If ⟨args...⟩
is a valid syntax for calling the anglebrackets
operator or something (what to call the function this syntax calls is actually kind of tricky since we don't have any precedent), then people could opt into whatever meaning they want for the syntax.
@StefanKarpinski, the argument in #8934 was that it should be a macro. I don't think we ever came to a consensus.
(If we decide in Base that anglebrackets(a,b)
means inner(a,b)
, that will discourage people from "opting into whatever meaning they want" because the decision will have already been made.
It's not a terrible choice, of course, but it may be unnecessary to assign this a meaning in Base as long as they are parsed.)
I don't recall the details of that discussion, but making it a macro seems like an obviously bad idea to me.
With #27401 I think we can consider inner products to have been taken seriously.
Traditionally, an issue is closed only when the relevant PR is merged...
Sure, we can leave it open I guess. Just wanted to get it off the triage label.
Should this be closed since #27401 is merged now?