When trying to load a very old and large post from my blog, I get these errors:

As far as I've tested, this fails randomly at lines 96, 349, or 372 of the parser when I reload the page.
I've tried increasing the maximum file size allowed for the network, but it didn't solve the issue. For context, if I deactivate the Gutenberg plugin, the post loads and shows the content.
Not sure how I can share any more details for debugging, just ping me and I'll provide what you need.
cc @aduth @dmsnell @pento as you did recent work on that part of the code.
Can you share the source of the post?
@dmsnell you can find it here. Expect broken links, non-existing media, and the like. I haven't touched it since I discovered the bug, so we can reproduce and debug what happened.
Hm. Not that big of a document. I'll start thinking about this - thanks!
I can reproduce similar excessive memory usage with Gutenberg on this post:
I believe that I know _why_ it's occurring: it's probably because of how we're handling those freeform HTML blocks. We build a list of each character in the parser and then later join them into a single string.
This was a fear of mine when converting to the nested parser with our move back to the idea of tokens (necessary), but I'm not sure yet how to fix this.
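To make the pattern described above concrete, here is a minimal sketch. It is not the actual Gutenberg parser code: the function names `collectPerCharacter` and `collectBySlice` are hypothetical, and the claim that an array holding one entry per character costs noticeably more memory than the string itself is an assumption about typical JavaScript engines.

```typescript
// Hypothetical illustration only; this is not the Gutenberg parser.

// The pattern described above: push every character of a freeform HTML run
// into an array, then join the array into a single string at the end.
function collectPerCharacter(input: string, start: number, end: number): string {
  const chars: string[] = [];
  for (let i = start; i < end; i++) {
    chars.push(input.charAt(i)); // one array entry per character
  }
  return chars.join("");
}

// One possible alternative: track only the offsets and take a single slice,
// so no per-character intermediate list is built.
function collectBySlice(input: string, start: number, end: number): string {
  return input.slice(start, end);
}

// Both approaches produce the same text; only the intermediate storage differs.
const html = "<p>Some freeform HTML</p>".repeat(1000);
const joined = collectPerCharacter(html, 0, html.length);
const sliced = collectBySlice(html, 0, html.length);
console.log(joined === sliced); // prints true
```

Whether offset tracking fits a parser that also has to emit tokens for nested blocks is a separate question; the sketch only shows where the extra allocation comes from.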
A follow-up to this in Gutenberg 2.0: the post loads and the memory issue is gone.
Not sure if related to the fix or not, but when the post loads it has the raw HTML and the text is grey:

The console shows an error:

Attached the error log.
When I convert this to blocks, the conversion completes, but some paragraphs and list items are stripped.
For the record, the mismatch that fails the validation is based on around tags:

In contrast, _pasting_ the raw content works, though it's very slow (several seconds on a 3.1 GHz i7).
I can confirm that @aduth https://github.com/WordPress/gutenberg/pull/4591 fixed the issue: convert to blocks works. Thanks!
I have found some issues with a couple of lists when they are converted to Gutenblocks, but I will report them in a separate issue. The source of the error is likely the original, badly formatted HTML.
Thanks for the follow ups, @nosolosw. :)
I've been battle-testing Gutenberg again with this post I wrote 10 years ago. I've tried to convert it to blocks using Gutenberg 4.4:
cc @dmsnell @mcsf @mtias @aduth @youknowriad
Going to see if I can bisect when this started to happen. Will post any follow ups here.
Bisected from v4.0.0-rc.1: it looks like the content loss may have started happening in 489eb792391a4d2bcb2bde701b52507f1d7334d1 cc @iseulde
Note that before that commit, Gutenberg reported 9,981 words after converting the old post to blocks (fewer than the 10,605 in the original), but I couldn't find any noticeable content loss.
@nosolosw could you report a new issue with it?
@iseulde created at https://github.com/WordPress/gutenberg/issues/12029
Thanks for battle-testing and dog-fooding, @nosolosw. Out of curiosity, since this issue was originally about the parser, have you noticed any issues with calling the server-side parser on that post of yours? Because since #10463 we no longer skip the parser (as fixed in #4591) but rather use the improved parser implementation added in #8083.
I haven't been able to become familiar with all Gutenberg components in my few weeks on the project, so I'm not sure what server-side parsing means. I'm happy to try if I'm given some steps to reproduce.
Oh, now that I've had some coffee, it occurred to me that by _server-side parsing_ you meant whether the content within the classic block in Gutenberg was the same as the content in the classic editor? I've just compared both and yes, the content is equal.
cc @mcsf
> Oh, now that I've had some coffee

😄

> it occurred to me that by _server-side parsing_ you meant whether the content within the classic block in Gutenberg was the same as the content in the classic editor? I've just compared both and yes, the content is equal.

@nosolosw, actually I meant whether PHP's gutenberg_parse_blocks ever acts up with your largest posts. At its simplest, this means: when loading the front-end for a post, does anything ever fail? In more detail, this could mean examining the output of gutenberg_parse_blocks (called in do_blocks) for anything unexpected.
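One way to look for "anything unexpected" is to summarize the parsed block tree and compare the numbers against what you know about the post. The sketch below is only illustrative and rests on several assumptions: that the parser output has been serialized to JSON somehow, that each block carries `blockName`, `attrs`, `innerBlocks`, and `innerHTML` fields, and that a `null` block name marks freeform (classic) content. The `ParsedBlock` interface and `summarizeBlocks` helper are hypothetical names, not part of Gutenberg.

```typescript
// Assumed shape of one parsed block; the field names are an assumption.
interface ParsedBlock {
  blockName: string | null; // null is assumed to mean freeform content
  attrs: Record<string, unknown> | null;
  innerBlocks: ParsedBlock[];
  innerHTML: string;
}

interface ParseSummary {
  totalBlocks: number;
  freeformBlocks: number;
  htmlLength: number;
  countsByName: Record<string, number>;
}

// Hypothetical helper: walk the tree and collect a few numbers to eyeball.
function summarizeBlocks(blocks: ParsedBlock[]): ParseSummary {
  const summary: ParseSummary = {
    totalBlocks: 0,
    freeformBlocks: 0,
    htmlLength: 0,
    countsByName: {},
  };
  const visit = (block: ParsedBlock): void => {
    summary.totalBlocks += 1;
    summary.htmlLength += block.innerHTML.length;
    if (block.blockName === null) {
      summary.freeformBlocks += 1;
    } else {
      const previous = summary.countsByName[block.blockName] || 0;
      summary.countsByName[block.blockName] = previous + 1;
    }
    for (const inner of block.innerBlocks) {
      visit(inner); // recurse into nested blocks
    }
  };
  for (const block of blocks) {
    visit(block);
  }
  return summary;
}

// Made-up sample data, standing in for a real post's parsed output.
const sampleBlocks: ParsedBlock[] = [
  { blockName: "core/paragraph", attrs: null, innerBlocks: [], innerHTML: "<p>Hello</p>" },
  { blockName: null, attrs: null, innerBlocks: [], innerHTML: "raw legacy HTML" },
];
console.log(summarizeBlocks(sampleBlocks));
```

For an unconverted legacy post, a single large freeform block would be the expected result under these assumptions; a total HTML length far below the size of the stored content would be worth a closer look.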