As already mentioned in a comment in #2718, a simple WordPress search for "paragraph" or "core" or "image" (if an image was added) shows unexpected results:
example.com/?s=paragraph
Gutenberg's serialization markup leads to unexpected search results for the keywords above and many more, including partial keywords such as para, graph, text, but, butt, button, cat, ate, categories, code, over, cover, form, head, ding, html, late, latest, post, list, quote, tor, table, ...
WP 4.9.1, Gutenberg Plugin 1.8.1
Could #1422 be related? CC: @youknowriad
@jasmussen I don't think so, this is a separate issue: search runs against the raw value of post_content, which includes the block comments, and those comments pollute the results. wp:paragraph...
This is not specific to Gutenberg, though. Gutenberg makes it more visible, but it is a Core bug that can be reproduced with shortcodes as well.
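To illustrate the point about raw post_content, here is a minimal Python sketch rather than WordPress code. It assumes the search behaves like a case-insensitive substring match (roughly what a `LIKE '%term%'` query does under a case-insensitive collation). The sample content, `naive_match` and `strip_block_comments` are hypothetical names made up for this example.

```python
import re

# Hypothetical post content, serialized the way Gutenberg stores a paragraph block.
RAW_POST_CONTENT = (
    "<!-- wp:paragraph -->\n"
    "<p>Hello world</p>\n"
    "<!-- /wp:paragraph -->"
)

# Matches opening and closing block comments such as <!-- wp:paragraph -->.
BLOCK_COMMENT = re.compile(r"<!--\s*/?wp:.*?-->", re.DOTALL)


def naive_match(term, content):
    """Case-insensitive substring test (assumed stand-in for LIKE '%term%')."""
    return term.lower() in content.lower()


def strip_block_comments(content):
    """Remove block delimiter comments, leaving the inner markup untouched."""
    return BLOCK_COMMENT.sub("", content)


for term in ("paragraph", "graph", "hello"):
    raw_hit = naive_match(term, RAW_POST_CONTENT)
    clean_hit = naive_match(term, strip_block_comments(RAW_POST_CONTENT))
    print(f"{term}: raw={raw_hit} stripped={clean_hit}")
```

In this sketch "paragraph" and "graph" only match the raw content, while "hello" matches both. Stripping the comments does not remove ordinary HTML tags, so tag names can still match.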
One new roadblock?
Unfortunately, this is a known issue in WordPress core - in a vanilla WordPress install, if you search for "table", you'll get results including <table> tags. MySQL's string searching isn't capable of dealing with this kind of contextual parsing.
If you require 100% accurate search results, the best option is to use a dedicated search engine, like Elasticsearch. There are also Elasticsearch services available within the WordPress world, if setting up a dedicated search server is not an option.
Given Gutenberg blocks will add substantially more "hidden" strings, I wonder how much larger of a problem this will become. It'd be interesting to do some analysis comparing an English-language dictionary to partial string matches with Gutenberg blocks.
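A rough sketch of what such an analysis could look like, in Python. Everything here is an assumption for illustration: `BLOCK_NAMES` is a small sample rather than the full list shipped by any Gutenberg release, `WORDS` is a tiny stand-in for a real dictionary file, and `partial_matches` is a hypothetical helper.

```python
# Illustrative sample of block names only (assumption, not a complete list).
BLOCK_NAMES = ["paragraph", "heading", "button", "categories", "latest-posts", "table"]

# Tiny stand-in word list; a real analysis would load a full dictionary file.
WORDS = ["para", "graph", "head", "ding", "butt", "cat", "ate", "late",
         "post", "tab", "search", "editor"]


def partial_matches(words, block_names):
    """Map each word to the block names that contain it as a substring."""
    hits = {}
    for word in words:
        matched = [name for name in block_names if word in name]
        if matched:
            hits[word] = matched
    return hits


hits = partial_matches(WORDS, BLOCK_NAMES)
for word, names in sorted(hits.items()):
    print(f"{word}: {', '.join(names)}")
print(f"{len(hits)} of {len(WORDS)} words collide with a block name")
```

With a real word list, the ratio printed on the last line would give a first estimate of how many search terms could produce false positives from block markup alone.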