Gutenberg: Unexpected block validation error with unescaped ampersands

Created on 29 Nov 2018  ·  6 Comments  ·  Source: WordPress/gutenberg

Describe the bug
I have a block with a save function identical to the HTML block. When the block contains one unescaped ampersand and one escaped ampersand, validation fails after updating and refreshing.

This is a simple description to illustrate the issue. Since people will likely paste into this block, we could very well end up with a mix of escaped and unescaped ampersands.

I've used the HTML block to demonstrate this since the issue happens there too. The issue does not occur if both ampersands are unescaped...

screenshot

This is the result after updating and refreshing...

screenshot

To Reproduce
Steps to reproduce the behavior:

  1. Add an HTML block.
  2. Add an h2 tag containing an unescaped ampersand (&).
  3. Add a p tag containing an escaped ampersand (&amp;).
  4. Update and refresh the page.

Expected behavior
The block should be valid.
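
To make the expectation concrete, here is a minimal sketch of why the two forms should be treated as equivalent: once bare ampersands are escaped, the mixed markup and the fully escaped markup are the same string. The helper name `escapeBareAmpersands` is hypothetical and is not part of Gutenberg. The regular expression assumes an entity is `&`, then a name or numeric reference, then `;`, which is a simplification of what HTML actually allows.

```javascript
// Hypothetical helper (not a Gutenberg API): escape only those ampersands
// that do not already start a well-formed entity such as &amp; or &#38;.
// Assumption: an entity is "&" + (name | #decimal | #xhex) + ";".
function escapeBareAmpersands(html) {
  return html.replace(/&(?!(?:#[0-9]+|#x[0-9a-f]+|[a-z][a-z0-9]*);)/gi, '&amp;');
}

const mixed = '<h2>Test & Test</h2>\n<p>Test &amp; Test</p>';
const normalized = escapeBareAmpersands(mixed);

// Both ampersands now use the same encoding.
console.log(normalized);

// Running the helper again changes nothing, so repeated saves stay stable.
console.log(escapeBareAmpersands(normalized) === normalized);
```

A comparison that normalizes both sides this way would not distinguish `&` from `&amp;`, which is the behavior this report asks for.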

Desktop:

  • OS: OSX 10.14
  • Browser: Chrome
  • Version: 70.0.3538.102

Additional context

  • WordPress 5.0 RC1
[Block] HTML [Type] Bug

All 6 comments

Is there anything I can do to help move this forward? I know everyone is super busy :)

For my use case, I'd be fine with overwriting with the expected markup. We want raw HTML without any kind of validation issues for this block. It's very similar to the HTML block.

It looks like #12708, #10444, #8166, and others might be working towards addressing this but it would be nice if the user didn't have to worry about that. They just want the block to work without a scary warning :)

FYI, the markup in my example above saves/validates fine on the first try but breaks once the page is reloaded.

Thanks!

Following up with an example that was sent in to us. Notice the use of & and &amp; in the screenshot below. When those are present together in a block like the HTML block, you get the invalid content error...

screenshot

Hey @designsimply! I noticed that you added the Needs Testing tag. Is there anything I can do to help move this one forward (or any of the related issues)?

We're still seeing quite a few people running into broken blocks. While the ampersand example is all I have right now, I have a feeling there are other similar issues with HTML-style blocks.

I don't mind the entities being encoded (and prefer it!) but it looks like there's a bug where they are encoded and pass the validation check sometimes, while other times they do not.

Thanks for your help!

@fastlinemedia! I'm so sorry for the delay! Thank you for the extra nudge, it is appreciated.

I tested and confirmed that using both & and &amp; in a Custom HTML block and then refreshing the page results in an invalid block and the error: "This block contains unexpected or invalid content."

Steps to reproduce:

  1. Add a Custom HTML block.
  2. Paste the following content into the block:
<h2>Test & Test</h2>
<p>Test &amp; Test</p>
  3. Publish the post.
  4. Either reload the editor for that post or close and re-open it.

Result: the block becomes invalid and the following message is displayed:

This block contains unexpected or invalid content.


Tested with WordPress 5.0.3 and no active plugins using Chrome 71.0.3578.98 on macOS 10.13.6.

No problem and thanks @designsimply!

I have a pretty good sense of what's going on here. It's an issue in the IdentityEntityParser implementation in the validator, which attempts to evaluate the following string as an entity:

' Test</h2>\n<p>Test &amp'

(The segment of text between an opening & and terminating ;)

Note that this only seems to affect strings where there's a plain ampersand and at some point later in the same markup string an encoded entity.
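
As an illustration only (this is not the validator's actual code), the sketch below shows how a scan that treats everything between an `&` and the next `;` as an entity candidate picks up the string quoted above. For contrast, it also shows a stricter pattern that only accepts well-formed entity names. Both function names are hypothetical.

```javascript
// Hypothetical sketch: take the text between the first "&" and the next ";".
// Returns null when no such pair exists.
function naiveEntityCandidate(text) {
  const start = text.indexOf('&');
  if (start === -1) return null;
  const end = text.indexOf(';', start + 1);
  if (end === -1) return null;
  return text.slice(start + 1, end);
}

// Hypothetical stricter variant: only accept "&" + name or numeric reference + ";".
// Assumption: this simplified pattern is enough for the example markup.
function strictEntityNames(text) {
  const pattern = /&(#[0-9]+|#x[0-9a-f]+|[a-z][a-z0-9]*);/gi;
  return [...text.matchAll(pattern)].map((match) => match[1]);
}

const markup = '<h2>Test & Test</h2>\n<p>Test &amp; Test</p>';

// Prints " Test</h2>\n<p>Test &amp" as a JSON string: the bare ampersand
// is paired with the semicolon that belongs to the later &amp;.
console.log(JSON.stringify(naiveEntityCandidate(markup)));

// Prints [ 'amp' ]: the bare ampersand is skipped.
console.log(strictEntityNames(markup));
```

This also matches the observation that the problem needs a plain ampersand followed later by an encoded entity: without a later `;`, the naive scan finds no candidate at all.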

I'll try to put together a fix shortly.
