Gutenberg: Characters for greater than and less than are not encoded as entities in attribute values

Created on 14 May 2019 · 6 Comments · Source: WordPress/gutenberg

When adding > or < into an attribute value as follows:

<!-- wp:paragraph -->
<p><abbr title="1 &gt; 2">False</abbr> and <abbr title="1 &lt; 2">True</abbr></p>
<!-- /wp:paragraph -->

The serializer forces them to be > and < respectively.

<!-- wp:paragraph -->
<p><abbr title="1 > 2">False</abbr> and <abbr title="1 < 2">True</abbr></p>
<!-- /wp:paragraph -->

This gets rendered by WordPress as:

<p><abbr title="1 > 2&#8243;>False</abbr> and <abbr title="1 < 2">True</abbr></p>

The “False and” text is erroneously consumed by the attribute.

In this case, it seems to be a problem with wptexturize(), as it is converting the attribute closing " into &#8243;.

But if the block serializer output > as &gt; then this symptom could be avoided.
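
To make that suggestion concrete, here is a minimal sketch of attribute escaping that also encodes < and >. The function name `escapeAttributeValue` is hypothetical and this is not Gutenberg's actual serializer code; it only illustrates the output the paragraph above asks for.

```javascript
// Hypothetical helper for illustration; not Gutenberg's actual serializer code.
function escapeAttributeValue( value ) {
	return value
		.replace( /&/g, '&amp;' ) // Must run first so the entities added below are not double-escaped.
		.replace( /"/g, '&quot;' )
		.replace( /</g, '&lt;' )
		.replace( />/g, '&gt;' );
}

const attribute = 'title="' + escapeAttributeValue( '1 > 2' ) + '"';
console.log( attribute ); // title="1 &gt; 2"
```

With the > encoded, the saved markup would contain no bare angle bracket inside the attribute value, which is the case that appears to confuse wptexturize() here.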

Possible regression after https://github.com/WordPress/gutenberg/pull/9963.

Relates to https://github.com/WordPress/gutenberg/issues/9915, https://github.com/WordPress/gutenberg/issues/8779, https://github.com/WordPress/gutenberg/issues/12683.

Tested in v5.7.0-rc.1.

[Feature] Parsing [Feature] Rich Text [Feature] Saving [Type] Bug

All 6 comments

Technically it would be a duplicate of #12683, though the earlier issue was intended to have been closed as a result of the merge of #9963.

I don't think it's entirely a regression of #9963, in that the element serializer itself is not the culprit, and it does still encode the characters.

I think the issue is a combination of RichText.Content effectively bypassing the serializer using RawHTML (source) and the behavior of the browser DOM in parsing the paragraph contents as HTML, where &gt; is normalized in this context to > (because, as noted in #9963, it is _not invalid_).

var d = document.createElement( 'div' );
d.innerHTML = '<p><abbr title="1 &gt; 2">False</abbr> and <abbr title="1 &lt; 2">True</abbr></p>';
console.log( d.innerHTML );
// "<p><abbr title="1 > 2">False</abbr> and <abbr title="1 < 2">True</abbr></p>"
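
Since the normalization happens when the DOM serializes the markup, one conceivable workaround is to re-encode the angle brackets in the resulting string. The sketch below is a naive, string-level illustration only: `reencodeAngleBrackets` is a hypothetical name, it is not part of Gutenberg, and it assumes every attribute value is double-quoted, as in the innerHTML output above.

```javascript
// Naive, hypothetical post-processing step; not part of Gutenberg.
// Assumes every attribute value is double-quoted, as in innerHTML output.
function reencodeAngleBrackets( html ) {
	return html.replace( /(\s[\w-]+=")([^"]*)"/g, ( match, prefix, value ) => {
		// Only < and > are touched, so existing entities are left alone.
		const encoded = value.replace( /</g, '&lt;' ).replace( />/g, '&gt;' );
		return prefix + encoded + '"';
	} );
}

const normalized = '<p><abbr title="1 > 2">False</abbr> and <abbr title="1 < 2">True</abbr></p>';
console.log( reencodeAngleBrackets( normalized ) );
// <p><abbr title="1 &gt; 2">False</abbr> and <abbr title="1 &lt; 2">True</abbr></p>
```

A regular expression is not a real HTML parser, so this is only meant to show the shape of the transformation, not a robust fix.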

So the blame lies entirely with wptexturize()?

So the blame lies entirely with wptexturize()?

Technically it always has 😆. #9963 was always a band-aid workaround because nobody dared touch the texturize function.

https://core.trac.wordpress.org/ticket/45387

Anything actionable? Is this still an issue?

@ellatrix This is still an issue as of v7.1.0-rc.1.
