KaTeX: Chaining \sqrt can be used to break sites

Created on 16 Oct 2014 · 12 Comments · Source: KaTeX/KaTeX

Consider the following:

\Huge \sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{\sqrt{}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}

Someone on my /b/ board just discovered this; it makes JS take up 100% CPU. I've had to disable KaTeX as an emergency measure.

All 12 comments

I tried this out and if you give KaTeX enough time it will render it. Once it's rendered the browser is very sluggish. There are two problems:

  • rendering time (spent in KaTeX) for very complex expressions. My guess is that nesting depth affects this more than the number of \sqrts, though I'll have to do some testing to validate that assumption. At a certain point, a very long non-nested expression will be too much as well.
  • complexity of the resulting HTML

The easiest solution would be to have some sort of heuristic for judging whether an expression is too complex, and then either not process it or throw an "expression is too big" error. We could also keep track of time and, if we've been processing for more than 20ms, throw a "taking too long" error. The timeout could be user-configurable.
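A minimal sketch of what such a pre-check could look like on the calling side, before the input is handed to KaTeX. Everything here is an assumption for illustration: `ComplexityLimits`, `measureComplexity`, and `assertRenderable` are hypothetical names, not part of KaTeX, and the metrics (source length, brace nesting depth, number of control sequences) are only one plausible choice of heuristic.

```typescript
// Hypothetical pre-check run on the TeX source; not a KaTeX API.
interface ComplexityLimits {
  maxLength: number;   // maximum characters of TeX source
  maxDepth: number;    // maximum nesting depth of { ... } groups
  maxCommands: number; // maximum number of control sequences such as \sqrt
}

interface ComplexityReport {
  length: number;
  depth: number;
  commands: number;
}

// Scan the source once, tracking brace depth and counting backslash commands.
function measureComplexity(tex: string): ComplexityReport {
  let depth = 0;
  let maxDepth = 0;
  let commands = 0;
  for (let i = 0; i < tex.length; i++) {
    const ch = tex[i];
    if (ch === "\\") {
      // A backslash starts a control sequence (\sqrt) or escapes one character (\{).
      commands++;
      i++; // skip the next character so \{ and \} are not counted as groups
    } else if (ch === "{") {
      depth++;
      if (depth > maxDepth) maxDepth = depth;
    } else if (ch === "}") {
      if (depth > 0) depth--;
    }
  }
  return { length: tex.length, depth: maxDepth, commands: commands };
}

// Throw before rendering if any limit is exceeded.
function assertRenderable(tex: string, limits: ComplexityLimits): void {
  const report = measureComplexity(tex);
  if (
    report.length > limits.maxLength ||
    report.depth > limits.maxDepth ||
    report.commands > limits.maxCommands
  ) {
    throw new Error(
      `expression is too big (length ${report.length}, depth ${report.depth}, commands ${report.commands})`
    );
  }
}
```

A caller would run `assertRenderable(input, limits)` and only then pass `input` to the renderer. The limits themselves are a policy decision for the site; this sketch does not suggest specific values, and it does nothing about the time-based cutoff, which would have to live inside the renderer or in a separate thread.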

We could also look at running KaTeX in a web worker. This doesn't solve the issue of the resulting HTML being overly complex.

We are thinking of using your library on our own large site, but we are concerned about these security vulnerabilities. What is their status?

@jschatz1 no update. You could count how many times \sqrt occurs in the input string, but this doesn't protect against other, as-yet-unknown vulnerabilities. I think rendering to a string in a web worker is probably the way to go when dealing with possibly hostile input.
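A rough sketch of the web worker approach with a time limit on the main-thread side. The names `WorkerLike`, `RenderRequest`, `RenderReply`, and `renderWithTimeout` are hypothetical, as is the message format. `WorkerLike` only mirrors the shape of the browser `Worker` API (`onmessage`, `postMessage`, `terminate`), so a real `Worker` may need a cast or a thin adapter. The worker script is assumed to receive `{ tex }`, call `katex.renderToString`, and post back `{ html }` or `{ error }`.

```typescript
// Hypothetical message format between the page and the worker.
type RenderRequest = { tex: string };
type RenderReply = { html: string } | { error: string };

// The subset of the Worker API this sketch relies on.
interface WorkerLike {
  onmessage: ((event: { data: RenderReply }) => void) | null;
  postMessage(message: RenderRequest): void;
  terminate(): void;
}

// Render `tex` in a fresh worker and give up after `timeoutMs` milliseconds.
// `done` is called exactly once, with either an error or the HTML string.
function renderWithTimeout(
  createWorker: () => WorkerLike,
  tex: string,
  timeoutMs: number,
  done: (error: Error | null, html: string | null) => void
): void {
  const worker = createWorker();
  let settled = false;

  const finish = (error: Error | null, html: string | null): void => {
    if (settled) return;
    settled = true;
    clearTimeout(timer);
    // Terminating the worker is what actually stops a runaway render.
    worker.terminate();
    done(error, html);
  };

  const timer = setTimeout(
    () => finish(new Error("taking too long"), null),
    timeoutMs
  );

  worker.onmessage = (event) => {
    const reply = event.data;
    if ("html" in reply) {
      finish(null, reply.html);
    } else {
      finish(new Error(reply.error), null);
    }
  };

  worker.postMessage({ tex: tex });
}
```

Creating and terminating a worker per expression is wasteful for trusted input, but terminating the worker is the only way to interrupt a render that is already running. As noted earlier in the thread, this keeps the page responsive during rendering but does not address the resulting HTML being overly complex once it is inserted into the DOM.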

I tried the same \sqrt chain with MathJax and it renders it in a fraction of the time.

@kevinbarabash do you mean MathJax renders it faster or KaTeX renders it faster? I find that MathJax renders a low-quality version quickly and then still pegs my CPU to 100% for a while while it renders the high quality version.

@xymostech I pasted the code from the original post into the demo on https://www.mathjax.org/ and it shows up quick and it seems to be the final render. It's using MathJax's HTML renderer.

I tested putting the giant expression into the "ask a question" box on math.stackexchange.com/questions/ask, and it indeed also pegs my CPU and freezes the page (they use MathJax, for the record). They seem like a pretty high-profile site, so if they don't have protections against it, I wouldn't be too worried.

@kevinbarabash Wow, you're right! Much quicker there for me. Maybe it's a newer version of MathJax? It's speeding up! :)

> The easiest solution would be to have some sort of heuristic for judging whether an expression is too complex, and then either not process it or throw an "expression is too big" error. We could also keep track of time and, if we've been processing for more than 20ms, throw a "taking too long" error. The timeout could be user-configurable.

I would like either or both of those. How involved do you think it would be to implement them?

> We could also look at running KaTeX in a web worker. This doesn't solve the issue of the resulting HTML being overly complex.

Interesting. That would involve using renderToString in the web worker, and putting it into the DOM after it's finished?

We are thinking of adding math rendering support using KaTeX to GitLab (https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/8003), and although I don't think it's a blocker, we would prefer for users to _not_ be able to blow up the page with a specially crafted comment :)

@DouweM Any site that allows user input that has to run through JS will eventually get pegged. GH uses a checkbox list renderer, and I am sure that if you added a couple thousand markdown checkboxes you could get GH to crash. The question is _how many_ \sqrts make it crash. If it's 5, that's a problem. If it's 9000, that's probably fine.

It seems KaTeX rendering speed and performance have been greatly improved, and the nested \sqrts render instantly. If this is still an issue, please reopen the issue. Thank you!

I tried it for myself just now and was pleasantly surprised. There is probably still some number of \sqrts that will cause issues, but any sufficiently large input will cause issues at some point, so sanitizing user input is important. That being said, I don't believe input sanitization is necessarily something KaTeX needs to solve.

@kevinbarabash Welcome back!
