My recent series of Unicode pull requests omitted some characters because their proper treatment wasn’t obvious. This issue addresses some of those characters, presents a proposed treatment, and solicits opinions. Anyone can opine, but responses will be especially valued from @kevinbarabash, @edemaine, @xymostech, @sophiebits, @gagern, @kasperpeulen, and @flying-sheep. If and when we reach agreement, I’ll submit PRs for the characters that we’ve agreed upon.
For each of the following items, I ask that you consider the resolution and either vote thumbs up, or vote thumbs down and state a reason why.
∴ ∵ These two characters obviously map to \therefore and \because. I’ve withheld them until now because there is a conflict as to the proper type of atom. KaTeX symbols.js says they are rel atoms. unicode-math says they are \mathord. I think KaTeX got this one right. They come from the amssymb package, which contains: \DeclareMathSymbol{\therefore} {\mathrel}{AMSa}{"29}.
Resolved: Map ∴ ∵ to KaTeX symbols \therefore and \because respectively.
⋗ ⋖ These two characters obviously map to \gtrdot and \lessdot, but there is again a type conflict. KaTeX symbols.js says they are bin atoms. unicode-math says they are \mathrel. I’ve never used these symbols, so I don’t have a personal opinion on this one. The symbols come from the amssymb package, which contains: \DeclareMathSymbol{\gtrdot} {\mathbin}{AMSb}{"6D}.
Resolved: Map ⋗ ⋖ to KaTeX symbols \gtrdot and \lessdot respectively.
∣ This character is U+2223, DIVIDES. I’ve withheld it until now only because it looks just like |, U+007C. So the question here is do we include any confusables at all. If we do, then the path forward is reasonably clear. unicode-math maps this character to \mid, the vertical line with rel spacing, useful for set builder notation. John Cook maps it the same way, as does Microsoft Word.
Resolved: Map ∣ to KaTeX symbol \mid.
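To make the three resolutions above concrete, here is a minimal sketch of the proposed mappings as a lookup table. This is not KaTeX's actual symbols.js code: the names `proposed`, `byChar`, and `toCommands` are hypothetical, and the `group` values simply record the atom types described above (the amssymb / symbols.js types, not the unicode-math ones).

```javascript
// Hypothetical table of the resolutions proposed in this issue.
// `group` follows the atom types KaTeX symbols.js already uses,
// as described above; it is not taken from unicode-math.
const proposed = [
  { char: "\u2234", command: "\\therefore", group: "rel" }, // ∴
  { char: "\u2235", command: "\\because", group: "rel" },   // ∵
  { char: "\u22D7", command: "\\gtrdot", group: "bin" },    // ⋗
  { char: "\u22D6", command: "\\lessdot", group: "bin" },   // ⋖
  { char: "\u2223", command: "\\mid", group: "rel" },       // ∣
];

// Index the table by character for quick lookup.
const byChar = new Map(proposed.map((m) => [m.char, m]));

// Illustrative helper: rewrite mapped Unicode characters as their
// TeX commands. A trailing space keeps the command from running
// into a following letter.
function toCommands(tex) {
  let out = "";
  for (const ch of tex) {
    const m = byChar.get(ch);
    out += m ? m.command + " " : ch;
  }
  return out;
}

console.log(toCommands("a\u2223b")); // a\mid b
```

A table like this only illustrates the aliasing. In KaTeX itself the mapping would presumably live alongside the existing symbol definitions rather than in a preprocessing step.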
That’s all I have prepared for now. More to come later.
I would have expected U+2223 (DIVIDES) to resolve to \divides, though I guess that's only defined in a somewhat obscure package (mathabx). I think \mvert which is equivalent to \mid is the typical way to denote the divides operator in AMSMath. (FWIW, mathabx's \divides is a thinner vertical line with somewhat larger space than \mid.)
Also I observe that unicode-math defines U+2223 to \mid. Overall I'm in favor of this.
The type conflicts are weird -- why does unicode-math redefine the type of existing characters? I think it makes sense for us to follow the aliasing, but not redefine the types. I find it weird that \gtrdot and \lessdot are \mathbin not \mathrel, but if AMS defines them that way, I'm fine with it.
i’m for using confusables.
are all confusable, and those are just the most common ones, there’s more!
LaTeX translates hyphen-minus into minus, idk if we do it, too.
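As a small illustration of what "confusable" means here, the sketch below prints the code points of two look-alike pairs mentioned in this thread: the vertical line versus DIVIDES, and hyphen-minus versus the minus sign. The helper name `hex` and the `pairs` array are hypothetical and only serve the example.

```javascript
// Format a single character as a U+XXXX code point label.
function hex(ch) {
  return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
}

// Look-alike pairs discussed in this thread.
const pairs = [
  ["|", "\u2223"], // VERTICAL LINE vs DIVIDES
  ["-", "\u2212"], // HYPHEN-MINUS vs MINUS SIGN
];

for (const [ascii, lookalike] of pairs) {
  console.log(hex(ascii), "vs", hex(lookalike));
}
```

The characters in each pair render almost identically but are distinct code points, so each one needs its own entry if confusables are to be accepted as input.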