Hello everyone,
The MathML specification defines two markups: "presentation MathML" describes the layout of equations (it is what KaTeX already implements today), and "content MathML" encodes the semantics and makes equations understandable by computers and especially screen readers. Implementing it in KaTeX would be a good idea.
If you're looking for the specification of content MathML, all the details can be found in the fourth chapter of the specification. It will be a lot of work, because I guess a new serializer needs to be written, but it may be a good way to become fully standards-compliant as far as MathML is concerned.
If you're looking for more information and an example, this MDN page is a good place to start.
I am familiar with "content" MathML. In general, converting from a presentational description of math to a semantic one is very difficult. Some parts of TeX are pretty easy to convert, e.g. \frac{x}{y}. Others are very difficult or impossible, e.g. typesetting long division. A more middle-of-the-road example would be something like f(x) = x^2. We'd have to recognize that the series of atoms f, (, x, ), = represents a function definition. This is the kind of thing that would fit well as an extension in our contrib folder.
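To make the f(x) = x^2 case concrete, here is a rough sketch of what such a recognizer might look like. Everything in it is hypothetical: the `Atom` type is just a string, not KaTeX's real parse node, and `matchFunctionDefinition` is a name made up for illustration. It only handles a single-letter name with one single-letter parameter, and it cannot tell a function application apart from an implicit multiplication like f·(x).

```typescript
// Hypothetical atom type for illustration; KaTeX's real parse nodes differ.
type Atom = string;

interface FunctionDefinition {
  name: string;
  parameter: string;
  body: Atom[];
}

// Returns a definition if the atoms start with `name ( parameter ) =`
// followed by at least one body atom, otherwise null.
function matchFunctionDefinition(atoms: Atom[]): FunctionDefinition | null {
  if (atoms.length < 6) return null;
  const [name, open, parameter, close, equals, ...body] = atoms;
  const isLetter = (s: string): boolean => /^[A-Za-z]$/.test(s);
  if (!isLetter(name) || !isLetter(parameter)) return null;
  if (open !== "(" || close !== ")" || equals !== "=") return null;
  return { name, parameter, body };
}

// Usage: the atoms of f(x) = x^2
const definition = matchFunctionDefinition(["f", "(", "x", ")", "=", "x", "^", "2"]);
console.log(definition);
```

Even this tiny pattern needs a guess about intent, which is why the general problem is hard.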
Converting from "content" MathML (or some other semantic representation) to LaTeX (or another "presentational" format) is the easier direction. Why not start with a semantic representation and use KaTeX for rendering?
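As a minimal sketch of that easier direction, the snippet below walks a small semantic tree and emits a LaTeX string that could then be handed to KaTeX. The `Expr` type is an assumption made up for this example; its `ci`, `cn` and `apply` kinds are loosely modelled on the content MathML elements of the same names, and only four operators are covered. It parenthesizes every compound operand of a product or a power, so the output is safe but not always pretty.

```typescript
// Hypothetical semantic tree, loosely modelled on content MathML's
// <ci>, <cn> and <apply> elements. Not part of KaTeX.
type Op = "plus" | "times" | "divide" | "power";
type Expr =
  | { kind: "ci"; name: string }
  | { kind: "cn"; value: number }
  | { kind: "apply"; op: Op; args: Expr[] };

// Wraps compound sub-expressions in parentheses; identifiers and
// numbers are returned as they are.
function parens(e: Expr): string {
  const s = toLatex(e);
  return e.kind === "apply" ? `\\left(${s}\\right)` : s;
}

function toLatex(e: Expr): string {
  if (e.kind === "ci") return e.name;
  if (e.kind === "cn") return String(e.value);
  const args = e.args;
  if (e.op === "plus") return args.map(toLatex).join(" + ");
  if (e.op === "times") return args.map(parens).join(" \\cdot ");
  if (e.op === "divide") return `\\frac{${toLatex(args[0])}}{${toLatex(args[1])}}`;
  // Remaining case: "power", with args[0] as base and args[1] as exponent.
  return `${parens(args[0])}^{${toLatex(args[1])}}`;
}

// Usage: x squared divided by y
const x: Expr = { kind: "ci", name: "x" };
const y: Expr = { kind: "ci", name: "y" };
const two: Expr = { kind: "cn", value: 2 };
const quotient: Expr = {
  kind: "apply",
  op: "divide",
  args: [{ kind: "apply", op: "power", args: [x, two] }, y],
};
console.log(toLatex(quotient)); // \frac{x^{2}}{y}
```

The resulting string could then be passed to `katex.renderToString`, which is the rendering step the question above has in mind.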
One thing that would make things a little easier is being able to specify a list of macros to not expand. We'd also need a way to specify what family each of the unexpanded macros is.
I thought about \frac a little more and even that's not that easy to convert since \frac is also used for partial differentials.
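To illustrate the ambiguity, the two expressions below use exactly the same \frac structure, but only the first is a plain quotient. The second denotes a partial derivative, which content MathML would encode as a derivative rather than a division. The symbols f, x and y are just placeholders.

```latex
% A quotient of two quantities
\frac{x}{y}

% A partial derivative of f with respect to x, same \frac layout
\frac{\partial f}{\partial x}
```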