Express: Option to unicode escape <, >, and & in JSON output

Created on 5 Apr 2017  路  17Comments  路  Source: expressjs/express

An option to unicode escape <, >, & as \u003c, \u003e, \u0026 "to keep some browsers from misinterpreting JSON output as HTML."

https://golang.org/src/encoding/json/encode.go?s=6456:6499#L48

This can be implemented as a middleware layer monkey-patching res.send as below, but it might be nicer to have in core.

const escapeJson = function (json) {
    if (typeof json !== 'string') {
    return json;
    }
    return json.replace(/</g, '\\u003c')
           .replace(/>/g, '\\u003e')
           .replace(/&/g, '\\u0026');
};

module.exports = function(req, res, next) {
    var _send = res.send;

    res.send = function (body) {
    if (res.get('Content-Type') === 'application/json') {
        body = escapeJson(body);
    }
    _send.call(res, body);
    };

    next();
};
discuss ideas

Most helpful comment

Delivered as part of the 4.16 release, currently off by default.

All 17 comments

Hi @g-k I can certainly see the use-case this would serve. Would like to hear what others think as well, as I'm generally neutral on this right now.

I think it would be useful.

I do not think this is the responsibility of express. I think it is, and should remain, the developer who is inserting the json into the html's job to properly escape the content.

FWIW, I have two modules to for this, one for the server side escaping, and one for the client side unescaping, both using the underlying escape method from lodash:

https://github.com/wesleytodd/recursive-escape
https://github.com/wesleytodd/recursive-unescape

The issue is escaping JSON so it isn't misinterpreted as an HTML page when navigated to directly (for browsers that do mime type sniffing). For example, XSS from a JSON response like:

{"post_id":"81730c8682f1efa5","title":"<img src=x onerror=alert(1)>"}}

X-Content-Type-Options: nosniff should be set as a defense in depth measure too.

Using an escaping library and res.send with a string or buffer would work too, but I'd rather have a safe default that works with res.json and res.send called with an object.

I think the proposed change is a reasonable default or option for users to enable. Providing a hook for encoding (like node-restify formatters) is an option too.

Oh, that is a different context than I got from reading the comments in the go code, which just talked about embedding it in a script tag. I still think doing it by default is a bad idea, but I could get onboard with an option to turn this escaping on.

Oh yeah, the script context thing is for the line and paragraph separators which express already does for JSONP: http://timelessrepo.com/json-isnt-a-javascript-subset

This does incur a small perf cost and some apps won't care or can guarantee they aren't consumed by older brower clients (e.g. for internal or mobile app apis), so I'd be in favor for having an option to enable this with it off by default.

Just for clarification, this use case is ONLY for browsers directly requesting the json asset and displaying it to a user, not for any ajax or programmatic use?

If the answer to the above is true, I really do think that this should be left to a third party library, as it is too niche of a use case, IMO, to receive built in support. This could be as simple as:

var myJsonEscaper = require('my-json-escaper')

app.use(function (req, res) {
  res.json(myJsonEscaper({foo: '<bar>'}))
})

this use case is ONLY for browsers directly requesting the json asset and displaying it to a user, not for any ajax or programmatic use?

Yep, at least I can't think of any other use cases at the moment. If an attacker can get someone to visit their site they can programmatically redirect or window.open the JSON page for XSS. Or iframe it if the JSON endpoint is missing X-Frame-Options. There are a couple examples from the wild in: https://blog.qualys.com/securitylabs/2014/09/11/xss-vulnerability-shows-how-security-issues-can-creep-into-popular-software

In general, I'd rather have a safe default to enable than use middleware to monkey patch res.send or manually escape all JSON passed to res.json and res.send.

Escaping is a built-in option in rails and google considers it enough of a best practice to enable by default in the stdlib in go.

According to that post:

By referring to the methods provided in the article, I have confirmed that this vulnerability could be reproduced and exploited with IE8 (Exploited with IE8.0.6001.17184), IE7 and IE6. But I was unable to exploit the vulnerability with IE9 because the response page whose content type is application/json could not be rendered by IE 9.

As much as I totally agree we should have safe defaults, Microsoft dropped support for ie9 almost a year ago, so IMO setting a default behavior that would be a breaking change is not only a bad idea, but mostly pointless from a security perspective. The real issue is that by escaping the data, something on the other end has to unescape it, thus this being a breaking change.

I'm fine with this being opt-in I'd just like to have the option available.

Unless there is a strong objection, I'm proposing that this lands in the 4.16.0 release.

My only objection is that this has weak reasons for inclusion. IE9 support is a non-reason for me, and I have never worked on, nor heard of an application, where non-developer users are encouraged to directly visit a url that serves a json body. Also considering how easy it is for a third party library to solve this issue for the people who have it, I believe this is best left to the user space.

It loos like the discusion above has some confusion going on. This is strictly a to prevent a form of XSS attack. For example, the extra encoded chars do not require any extra unescaping on the cliebt side to work; it is just standard JSON escaping. An attacking website can include an iframe (or even a top level redirect) to a JSON endpoint due to no faut of yours. If it is then one of the web browsers who then in turn see the large number of angle brackets and decide that it should be displayed as HTML instead of as text, then whatever HTML was in that response will execute.

An example attack is a comment board where all comments are plain text. User can submit anything and you duefully properly insert it on your page to display. You can load these comments via AJAX over a JSON endpoint, though. Now a user posts a comment with a script tag which doesn't cause XSS because you accounted for properly encoding everything that displays it, but it was unfortunate for a user who visited a sute where the ad network was malicious and included an iframe tag to your comments JSON where they planted that comment. Cookies are send to the attacker.

Effectively this is an issue for all users of Express.

Ok, that is much more clear to me. Thanks @dougwilson! This explanation changes my whole position from above. Do we plan on turning this on by default for 5.x?

Glad it helped :) ideally it could just be default in 4.x especially since it is transparent for all well behaved clients, but the size of the install base makes me iffy on that. We can always turn it on in a later 4.x if warranted or even yell (ugh) at users to make them either opt in or out, not sure. As for 5.x itself I didn't givr too much though besides to follow 4.x, but maybe it should default on in 5 even if it never does in 4.

Delivered as part of the 4.16 release, currently off by default.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ZeddYu picture ZeddYu  路  3Comments

wxs77577 picture wxs77577  路  3Comments

Domiii picture Domiii  路  3Comments

extensionsapp picture extensionsapp  路  3Comments

haider0324 picture haider0324  路  3Comments