Katex: Add \html command to insert HTML

Created on 12 Aug 2018 · 20 Comments · Source: KaTeX/KaTeX

I would like to be able to insert some HTML into KaTeX output (example use: inserting an SVG image in the middle of LaTeX code).
Would it be possible to have a simple function, say html, that takes as arguments the height/depth and the HTML code, and inserts it into the KaTeX output?
I have actually managed to do it myself, but my code is a bad hack, and I won't even suggest using it as a basis for proper code...
(if this issue has already arisen, feel free to direct me to past discussions)

enhancement security

All 20 comments

I think https://github.com/Khan/KaTeX/issues/898 would cover your request to insert an SVG image in the middle of some LaTeX. \html seems useful too. I think we'd want to put some restrictions on \html. For instance, with \includegraphics you can optionally include dimensions. Making that mandatory would be helpful for both \html and \includegraphics. In the latter case, we may not know how big an image is until it's loaded, but we need that information during typesetting, which happens first.

I hadn't thought of using \includegraphics, I'll give it a try.

@pzinn we don't support \includegraphics yet. PRs are welcome.

oh. oops. in any case, to reply to your comment, I agree that the dimensions (say height/depth) should be mandatory. In a sense I think katex shouldn't worry about it -- if the user gives wrong information (e.g., if the dimensions can't really be determined yet), then so be it.

This probably goes without saying, but \html and even \includegraphics create rather large security holes if the LaTeX source is untrusted. (\html{<script>...</script>} most notably, but also including an image from an arbitrary host can enable broad tracking.) We're wondering the same in #1437 with the much simpler notion of class and id attributes. I feel like we need a security option...

I'm no expert on security issues, so pardon my very naive question: how is \html{<script>...</script>} a security hole exactly? i.e., whenever one opens a webpage in the browser, one allows the author of the page to execute arbitrary JavaScript code on one's client, so how is the \html above any different?

As to contributing myself, I'd be happy to, but to be honest there's just not enough documentation for an outsider to be able to write proper code. For example, what are the different types of arguments to a function? is that explained anywhere? I'm reading defineFunction.js but don't see anything there (incidentally, I suspect L73 should read "math mode", not "text mode")

The reason why it could be a security hole is for pages that allow users to submit math and then render it to other users. Rendering arbitrary JS in your own browser isn't a big deal, but this would allow a user of such a site to run JS in other people's browsers.

@pzinn You'll get a lot of mileage by looking at the types. defineFunction refers to type FunctionDefSpec, and its props has type FunctionPropSpec, and its argTypes has type ArgType[] (admittedly, this is a deep chain). ArgType is defined in types.js with a decent comment:

// LaTeX argument type.
//   - "size": A size-like thing, such as "1em" or "5ex"
//   - "color": An html color, like "#abc" or "blue"
//   - "url": An url string, in which "\" will be ignored
//   -        if it precedes [#$%&~_^\{}]
//   - "original": The same type as the environment that the
//                 function being parsed is in (e.g. used for the
//                 bodies of functions like \textcolor where the
//                 first argument is special and the second
//                 argument is parsed normally)
//   - Mode: Node group parsed in given mode.
export type ArgType = "color" | "size" | "url" | "original" | Mode;

Unfortunately, \html is going to be tricky because (I think) it needs to work at the lexer level -- you don't want it parsed by LaTeX at all. You'd want to mimic how \verb, \url, and \href are currently implemented, primarily in Lexer.js and Parser.js.

Agreed about L73 of defineFunction.js. Feel free to submit that as a tiny PR!

thanks, will take a look when I have time. and, yes, my current version of html is a lame hack of url / href.

I've been experimenting with injecting HTML into KaTeX to provide an entry box in which users can enter simplified roots of the quadratic formula (https://trkern.github.io/quadratic_test.html). The result could use some tweaking but it looks really good and even by default correctly handles adjusting the width of the input boxes.

The basic process is to renderToString the formula with unique characters (in this case alpha, beta,... with adjusted heights) in place of the input boxes, and then String.replace the result to replace the characters with "

Hopefully this can provide some motivation and use cases for insertion of HTML.

As much as I would like to be able to fully adjust the size of inserted HTML, this would require shifting all computations of dimensions to the browser via CSS, which is probably possible but seems like a great deal of work.
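
For readers who want to try the placeholder approach described above, here is a minimal sketch of the replacement step only. It is not the code behind the linked demo: fillPlaceholders and the inputs map are hypothetical names, the <input> markup is an assumption, and it assumes the rendered markup contains the placeholder glyph literally.

```javascript
// Sketch only: swap placeholder glyphs in already-rendered KaTeX markup
// for input boxes. `fillPlaceholders` and `inputs` are hypothetical names,
// not part of KaTeX.
function fillPlaceholders(renderedHtml, inputs) {
    let result = renderedHtml;
    for (const [glyph, id] of Object.entries(inputs)) {
        // split/join replaces every occurrence and avoids regex escaping.
        // The <input> markup here is an assumption chosen for illustration.
        result = result.split(glyph).join(`<input id="${id}" size="3">`);
    }
    return result;
}

// Possible use, assuming KaTeX is loaded as `katex` and `container` is a DOM node:
//   const html = katex.renderToString("x = \\frac{\\alpha}{\\beta}", {output: "html"});
//   container.innerHTML = fillPlaceholders(html, {"α": "numerator", "β": "denominator"});
```

Asking for HTML-only output keeps the glyphs from also being replaced inside the MathML copy of the formula. The ids should come from trusted code, since they are written into the markup unescaped.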

@trkern that's really neat. Thank you for sharing. We're in the process of finishing up #1794. Once that's merged we'll have a new trust setting that people can hide unsafe commands behind. I think implementing \html is probably the way to go since that should be able to output <input> elements directly.

The trust setting changes have been merged into master so if anyone's interested in putting up a PR for this, go for it.

I added the security tag to this not b/c it adds security, but rather b/c we'll need to be mindful of security when implementing this.

For this command we may want to provide a way to limit which tags are allowed inside of it.

We could just give the entire argument to the trust function (as a new field, maybe html) and let it decide.
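
To make that suggestion concrete, here is a rough sketch of what such a trust function might look like. The html field on the context is the hypothetical one proposed above, not a confirmed part of KaTeX, and trustHtml and ALLOWED_TAGS are made-up names. A regex check like this is only an illustration and is no substitute for a real HTML sanitizer.

```javascript
// Sketch only: a trust callback that allows \html when every tag it finds
// is on an allowlist.
// ASSUMPTION: `context.html` is the hypothetical field proposed in this
// thread; it is not a confirmed part of KaTeX's trust context.
const ALLOWED_TAGS = new Set(["input", "span", "svg", "path"]);

function trustHtml(context) {
    if (context.command !== "\\html") {
        return false; // this sketch trusts no other command
    }
    const html = String(context.html || "");
    // Collect tag names from opening and closing tags.
    const tagPattern = /<\s*\/?\s*([a-zA-Z][a-zA-Z0-9-]*)/g;
    let match;
    while ((match = tagPattern.exec(html)) !== null) {
        if (!ALLOWED_TAGS.has(match[1].toLowerCase())) {
            return false;
        }
    }
    // Reject inline event handlers such as onclick=...
    if (/\son[a-z]+\s*=/i.test(html)) {
        return false;
    }
    return true;
}
```

It would be passed as the trust option, e.g. katex.render(tex, element, {trust: trustHtml}). The check is deliberately conservative: anything that merely looks like an unknown tag is refused.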

That would definitely be the least amount of work. I'm sure there are all sorts of tools out there already for checking for certain tags in an HTML string. Leaving the check to the trust function fits with our goal of not having dependencies.

Quite excited about this feature. I am working on a text format called "Recursive Text", which is supposed to mix features of Markdown and Latex, and make HTML/CSS/KaTeX the primary output of rendering. Having the html command would allow for much neater integration as it would be possible to nest inner recursive text within outer recursive text even when the outer recursive text is rendered via KaTeX.

Is there any progress on this?

Yes. See #2082.

looks great, will try it out!
