cache
in the GitHub issues, found some issues on cache but not what I'm asking for.I'm parsing a fairly heavy (500KB) piece of user-provided text using a ~1000 lines grammar.
{cache: true}
...--max-old-space-size=3000
), the heap grows to 2.5GB, and parsing succeeds in 12s.{cache: false}
, as expected, parsing clocks slightly faster at 10s (non-pathological case) and doesn't balloon memory usage.This is user data and my server resources are limited, so bumping Node to use X GB of heap isn't an option, as tomorrow I might get 1MB user data that would require X+1 GB of heap. And of course I'd like to keep using {cache: true}
when possible, to " avoid exponential parsing time in pathological cases", which I've met.
What approach do you recommend?
{cache: true}
based on the size of the input. That will cost me more CPU usage, but at least I won't OOM.Thanks for PEG.js! 馃檪
Exponential parsing times is something that happens in very pathological cases, and I'd recommend to rewrite the grammar there.
Consider https://github.com/sirthias/pegdown/issues/43#issuecomment-18469752
(I'm not a contributor)
As @polkovnikov-ph pointed out, it is best to rewrite parts of your grammar that deal with pathological cases, but if you continue hitting OOM cases, it might be better to do what you (@ronjouch) suggested; toggle the _cache_ option based on the size of the input: cache: input.length >= 250000
After this (and only if you have access to the user provided text), I would suggest examining any input that hits OOM cases to locate any common pathological cases and update your grammar to explicitly handle these so that you can reduce the number of OOM cases hitting your app.
If you are still hitting OOM cases often, and are willing to not only rewrite your grammar but also add an extra pass (or few) to your toolchain, I'de suggest trying any of these methods:
@polkovnikov-ph @futagoza thanks to both of you for taking the time to come back with advice 馃憤! That makes sense. I deployed the size workaround, and will consider rewriting the grammar next time trouble knocks at the door. Good day; closing the question.