Do you want to request a feature or report a bug?
I'm guessing it's a bug.
What is the current behavior?
The following source code,
<meta property="og:image" content="https://onepixel.imgix.net/60366a63-1ac8-9626-1df8-9d8d5e5e2601_1000.jpg?auto=format&q=80&mark=watermark%2Fcenter-v5.png&markalign=center%2Cmiddle&h=500&w=500&s=60ec785603e5f71fe944f76b4dacef08" />
is escaped when rendered on the server (each & becomes &amp;):
<meta property="og:image" content="https://onepixel.imgix.net/60366a63-1ac8-9626-1df8-9d8d5e5e2601_1000.jpg?auto=format&amp;q=80&amp;mark=watermark%2Fcenter-v5.png&amp;markalign=center%2Cmiddle&amp;h=500&amp;w=500&amp;s=60ec785603e5f71fe944f76b4dacef08"/>
You can reproduce the behavior like this:
const React = require("react");
const ReactDOMServer = require("react-dom/server");
const http = require("http");

const doc = React.createElement("html", {
  children: [
    React.createElement("head", {
      children: React.createElement("meta", {
        property: "og:image",
        content:
          "https://onepixel.imgix.net/60366a63-1ac8-9626-1df8-9d8d5e5e2601_1000.jpg?auto=format&q=80&mark=watermark%2Fcenter-v5.png&markalign=center%2Cmiddle&h=500&w=500&s=60ec785603e5f71fe944f76b4dacef08"
      })
    }),
    React.createElement("body", { children: "og:image" })
  ]
});

// Create a server object:
http
  .createServer(function(req, res) {
    // Write a response to the client
    res.write("<!DOCTYPE html>" + ReactDOMServer.renderToStaticMarkup(doc));
    res.end(); // End the response
  })
  .listen(8080); // The server object listens on port 8080
editor: https://codesandbox.io/s/my299jk7qp
output : https://my299jk7qp.sse.codesandbox.io/
What is the expected behavior?
I would expect the content not to be escaped. It's related to https://github.com/zeit/next.js/issues/2006#issuecomment-355917446.
I'm using the og:image meta element so my pages can have nice previews within Facebook :).

Which versions of React, and which browser / OS are affected by this issue? Did this work in previous versions of React?
16.5.2
Has this ever worked as you intend? Can you send a fix?
We are solving the problem this way:
import Entities from 'html-entities/lib/html5-entities'
const entities = new Entities()
const contentRegExp = /content="([^"]+)"/g
const handleContent = (match, content) => {
  return `content="${entities.decode(content)}"`
}
html = html.replace(contentRegExp, handleContent)
We spend ~1ms per request on this step, which is not too bad. I can take a look at a proper fix at some point.
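For anyone who wants to try the same idea without the html-entities dependency, here is a minimal sketch. It assumes that undoing only &amp; is enough for the URL case described above; the helper name unescapeContentAttributes and the example.com URL are made up for illustration.

```javascript
// Hypothetical helper (not part of React or html-entities):
// undo "&amp;" inside content="..." attributes of already-rendered markup.
// Like the regex in the workaround above, this also matches attributes whose
// name merely ends in "content" (for example data-content).
function unescapeContentAttributes(html) {
  return html.replace(/content="([^"]*)"/g, (match, content) => {
    // Only "&amp;" is decoded; "&quot;" is left alone so the
    // attribute value cannot break out of its quotes.
    return 'content="' + content.replace(/&amp;/g, "&") + '"';
  });
}

const sample =
  '<meta property="og:image" content="https://example.com/a.jpg?w=500&amp;h=500"/>';
console.log(unescapeContentAttributes(sample));
```

Decoding every entity, as a full decoder does, would also turn &quot; back into a literal double quote inside the attribute, which is why this sketch restricts itself to &amp;.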
I have found this related issue: #6873. Digging into the implementation, the behavior comes from
https://github.com/facebook/react/blob/0005d1e3f54b79fe4707fbccc44b89e0fb4ce565/packages/react-dom/src/server/DOMMarkupOperations.js#L61
⬇️
https://github.com/facebook/react/blob/b87aabdfe1b7461e7331abb3601d9e6bb27544bc/packages/react-dom/src/server/quoteAttributeValueForBrowser.js#L17
⬇️
https://github.com/facebook/react/blob/b87aabdfe1b7461e7331abb3601d9e6bb27544bc/packages/react-dom/src/server/escapeTextForBrowser.js#L108
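For readers who don't want to follow the links: the chain ends in a function that replaces a handful of characters with entities before the value is wrapped in quotes. Below is a simplified sketch of that kind of escaping. It is not React's source; the names escapeHtml and quoteAttribute and the example.com URL are assumptions for illustration.

```javascript
// Simplified sketch of attribute escaping. Not React's implementation.
const REPLACEMENTS = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;",
};

// Replace each HTML-significant character with its entity.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => REPLACEMENTS[ch]);
}

// Build name="value" with the value escaped, so it cannot end the attribute early.
function quoteAttribute(name, value) {
  return name + '="' + escapeHtml(value) + '"';
}

const markup = quoteAttribute("content", "https://example.com/a.jpg?w=500&h=500");
console.log(markup); // content="https://example.com/a.jpg?w=500&amp;h=500"
```

The same function is applied regardless of which attribute or element is being written, which is why a URL in a meta content attribute gets the same treatment as any other string.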
Now, all the escaping tests I could find are covering the children use case:
https://github.com/facebook/react/blob/b87aabdfe1b7461e7331abb3601d9e6bb27544bc/packages/react-dom/src/__tests__/escapeTextForBrowser-test.js#L23-L24
I have limited knowledge of web escaping related security issues.
I don't see any harm potential with:
const response = ReactDOMServer.renderToString(<span data-src={'&'}></span>);
expect(response).toMatch('<span data-reactroot="" data-src="&"></span>');
I have the same problem in the content of <style> elements:
const React = require("react");
const ReactDOMServer = require("react-dom/server");

console.log(ReactDOMServer.renderToStaticMarkup(
  <html>
    <head>
      <link
        href="https://fonts.googleapis.com/css?family=Source+Sans+Pro"
        rel="stylesheet"
      />
      <style>{`
        html {
          font-family: "Source Sans Pro", sans-serif;
        }
      `}</style>
    </head>
    <body>
      <p>Test.</p>
    </body>
  </html>
));
This outputs (note the escaped quotes inside the style element):
<html><head><link href="https://fonts.googleapis.com/css?family=Source+Sans+Pro" rel="stylesheet"/><style>
        html {
          font-family: &quot;Source Sans Pro&quot;, sans-serif;
        }
      </style></head><body><p>Test.</p></body></html>
By the parsing rules in the HTML spec (I'm consulting WHATWG here), the contents of elements style, xmp and iframe (as well as noscript, noframes and noembed when they're not being rendered) are parsed with the RAWTEXT tokenizer state, which treats everything as plaintext until it finds a matching closing tag.
Escaping the contents of style elements _is_, however, valid (in fact, mandatory for angled brackets) in the XML syntax of HTML; and indeed, adding an xmlns="http://www.w3.org/1999/xhtml" attribute to the <html> element results in valid XML. But if the intention of ReactDOMServer is indeed to render XML syntax, that should be explicitly noted in the documentation, because there are a number of tools (such as Next.js) which serve the output of these functions with content-type text/html.
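To make the distinction concrete, here is a rough sketch of what a serializer aware of raw text elements could look like. It is an illustration under stated assumptions, not how ReactDOMServer works: serializeElement, escapeText and RAW_TEXT_ELEMENTS are hypothetical names, and the set deliberately leaves out noscript (whose parsing depends on whether scripting is enabled) and script (which uses a different tokenizer state).

```javascript
// Hypothetical sketch: emit raw text elements verbatim, escape everything else.
const RAW_TEXT_ELEMENTS = new Set(["style", "xmp", "iframe", "noembed", "noframes"]);

const TEXT_REPLACEMENTS = { "&": "&amp;", "<": "&lt;", ">": "&gt;" };

// Escaping for ordinary text content.
function escapeText(text) {
  return text.replace(/[&<>]/g, (ch) => TEXT_REPLACEMENTS[ch]);
}

function serializeElement(tagName, text) {
  const tag = tagName.toLowerCase();
  let body;
  if (RAW_TEXT_ELEMENTS.has(tag)) {
    // Raw text cannot be escaped, so anything that looks like the
    // closing tag has to be rejected instead.
    if (new RegExp("</" + tag, "i").test(text)) {
      throw new Error("Text would terminate <" + tag + "> early");
    }
    body = text;
  } else {
    body = escapeText(text);
  }
  return "<" + tag + ">" + body + "</" + tag + ">";
}

console.log(serializeElement("style", 'html { font-family: "Source Sans Pro", sans-serif; }'));
console.log(serializeElement("p", "a < b & c"));
```

The trade-off is that raw text elements lose the safety net of escaping, so the serializer has to refuse input containing a closing-tag lookalike instead of neutralizing it.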
@andreubotella This is a different problem, you should use dangerouslySetInnerHTML. Can an admin mark the comments as "resolved"?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contribution.
Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!
This is not a bug in React. Using an entity reference for & (e.g. &amp;) is the correct behavior for XHTML documents:
In both SGML and XML, the ampersand character ("&") declares the beginning of an entity reference (e.g., &reg; for the registered trademark symbol "®"). Unfortunately, many HTML user agents have silently ignored incorrect usage of the ampersand character in HTML documents - treating ampersands that do not look like entity references as literal ampersands. XML-based user agents will not tolerate this incorrect usage, and any document that uses an ampersand incorrectly will not be "valid", and consequently will not conform to this specification. In order to ensure that documents are compatible with historical HTML user agents and XML-based user agents, ampersands used in a document that are to be treated as literal characters must be expressed themselves as an entity reference (e.g. "&amp;"). For example, when the href attribute of the a element refers to a CGI script that takes parameters, it must be expressed as http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user rather than as http://my.site.dom/cgi-bin/myscript.pl?class=guest&name=user.
In the HTML spec you do not need to use a character reference for & as long as what follows it is not a string that forms a named character reference.
The example they give is:
<a href="?bill&ted">Bill and Ted</a> <!-- &ted is ok, since it's not a named character reference -->
<a href="?art&copy">Art and Copy</a> <!-- the & has to be escaped, since © is a named character reference -->
Personally, I feel like React made the right call with escaping & since that works in both XHTML and HTML5.
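To illustrate why the escaped form is the safe one, here is a deliberately simplified model of how an HTML parser decodes character references inside an attribute value. It is a sketch, not a spec-complete decoder: it knows only six names (all of which HTML also accepts without a trailing semicolon), and the names LEGACY_REFERENCES and decodeAttributeValue are made up for this example.

```javascript
// Simplified model of character reference decoding in attribute values.
// Only a few legacy names are handled; the real table is much larger.
const LEGACY_REFERENCES = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  copy: "\u00A9",
  reg: "\u00AE",
};

function decodeAttributeValue(value) {
  return value.replace(
    /&(amp|lt|gt|quot|copy|reg)(;?)/g,
    (match, name, semicolon, offset, whole) => {
      const next = whole.charAt(offset + match.length);
      // In attributes, a reference without ";" that is followed by "=" or an
      // alphanumeric character is left as literal text for historical reasons.
      if (semicolon === "" && /[=A-Za-z0-9]/.test(next)) {
        return match;
      }
      return LEGACY_REFERENCES[name];
    }
  );
}

console.log(decodeAttributeValue("?bill&ted"));     // ?bill&ted
console.log(decodeAttributeValue("?art&copy"));     // ?art©
console.log(decodeAttributeValue("?art&amp;copy")); // ?art&copy
```

An escaped value round-trips to the intended string in every case, whereas an unescaped one only works as long as nothing after the & happens to spell a named reference.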
In meta tags escaped paths don't work... otherwise, this bug would not have been opened.
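This is the change needed to get the behavior you expect:
Replace https://github.com/facebook/react/blob/ee409ea3b577f9ff37d36ccbfc642058ad783bb0/packages/react-dom/src/server/ReactPartialRenderer.js#L383
with an escape hatch:
if (tagVerbatim === 'meta' && propKey === 'content') {
  markup = 'content="' + propValue + '"';
} else {
  markup = createMarkupForProperty(propKey, propValue);
}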
This would explicitly exempt the meta tag's content attribute from being properly escaped which wouldn't help @oliviertassinari's issue of wanting <span data-src={'&'}></span>.
A more generic solution would involve having something like dangerouslySetAttributes:
<span
  dangerouslySetAttributes={{__attributes: [{name: 'data-src', value: '&'}]}}
/>
This could easily lead to parsing errors and unexpected results if any value after the & is a named character reference, e.g. &copy (without the ;).
Again, the issue was with the HTML parser FB was using for the Sharing Debugger, not React. It is properly parsing the escaped paths now.