I have a handler like this:
app.get('/rename/:id/:name', rename);
function rename(req, res, next)
{
    var id = req.params.id;
    var name = req.params.name;
}
And if I request the URL "/rename/1/%A0" I get this stack trace:
URIError: URI malformed
at decodeURIComponent (native)
at match (/usr/lib/node/.npm/express/2.3.3/package/lib/router/index.js:316:17)
at pass (/usr/lib/node/.npm/express/2.3.3/package/lib/router/index.js:98:19)
at Object.router [as handle]
A similar issue came up with querystring (https://github.com/joyent/node/issues/60) and seems to have been resolved with their unescape method.
The last time I saw this issue, the person had used encodeURI() instead of encodeURIComponent(),
or maybe it was escape().
None of my code shows up in the stack trace; it seems to be all Express and underlying libraries:
URIError: URI malformed
at decodeURIComponent (native)
at match (/usr/lib/node/.npm/express/2.3.3/package/lib/router/index.js:316:17)
at pass (/usr/lib/node/.npm/express/2.3.3/package/lib/router/index.js:98:19)
at Object.router [as handle]
at next (/usr/lib/node/.npm/connect/1.4.1/package/lib/http.js:204:15)
at next (/usr/lib/node/.npm/connect/1.4.1/package/lib/http.js:183:54)
at Object.assetManager [as handle]
at next (/usr/lib/node/.npm/connect/1.4.1/package/lib/http.js:204:15)
at Object.favicon [as handle]
at next (/usr/lib/node/.npm/connect/1.4.1/package/lib/http.js:204:15)
I haven't had time to trace through all of those calls to find an encodeURI() or escape() call yet.
What is generating that URL?
For this simplified repro I'm typing it in by hand into Chrome. Here is a very minimal repro app:
var app = require('express').createServer();
app.get('/:param', function(req, res){ res.send('Hello world'); });
app.listen(3000);
Run that and visit the URL http://localhost:3000/%A0 to throw an exception. That is a valid (RFC 1738) URL that Express does not seem to cope with.
> encodeURIComponent(String.fromCharCode(0xA0))
'%C2%A0'
Yeah, 0xC2 0xA0 is the UTF-8 encoding of a non-breaking space, while 0xA0 is the ISO 8859-1 encoding. I fixed the bug on my client side, where it was using ISO 8859-1 instead of UTF-8 for JSON requests. But is it a bug or a known issue that Express throws an exception for requests using a non-UTF-8 encoding?
It's not Express, it's decodeURIComponent().
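For reference, the behaviour can be reproduced in plain Node without Express. This is only a small sketch; the variable names are made up for illustration:

```javascript
// U+00A0 (non-breaking space) percent-encodes to its two UTF-8 bytes.
const nbsp = String.fromCharCode(0xA0);
const utf8Escape = encodeURIComponent(nbsp);       // '%C2%A0'
const roundTrip = decodeURIComponent(utf8Escape);  // decodes back to U+00A0

// A lone %A0 is the ISO 8859-1 byte, which is not valid UTF-8,
// so decodeURIComponent throws a URIError instead of returning a string.
let thrown = null;
try {
  decodeURIComponent('%A0');
} catch (err) {
  thrown = err;
}

console.log(utf8Escape, roundTrip === nbsp, thrown instanceof URIError);
// prints: %C2%A0 true true
```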
Would it make sense to wrap the call to decodeURIComponent in a try/catch and pass the original raw request through or trigger some sort of bad request response? I'd rather return a 500 error specifying exactly what happened to the client (invalid characters in the request) than just triggering my catch-all error handler and logging a stack trace.
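A rough sketch of the "pass the original raw value through" idea might look like the following. Note that safeDecode is a hypothetical helper written for this example, not something Express provides:

```javascript
// Hypothetical helper: decode a percent-encoded path segment, falling back
// to the raw, still-encoded string when the escapes are not valid UTF-8.
function safeDecode(raw) {
  try {
    return decodeURIComponent(raw);
  } catch (err) {
    if (err instanceof URIError) {
      return raw; // hand the undecoded value to the route handler
    }
    throw err; // anything else is unexpected, so let it propagate
  }
}

console.log(safeDecode('%C2%A0') === '\u00A0'); // valid UTF-8 escape is decoded
console.log(safeDecode('%A0'));                 // malformed escape is returned as-is
```

The trade-off is that a handler can no longer tell whether a parameter was decoded or not, which is one argument for returning an error response instead.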
I think it's fine to leave it as a regular error; it doesn't differ much from anything else that could go wrong. Leaving out the high byte just doesn't play well with decodeURIComponent(), but people should use encodeURIComponent() in whatever they are serializing.
Ok, I'll close this one out. Thanks for looking into it.
It could always be treated specially in the error handler if you wanted, since it uses a different constructor (URIError).
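As a sketch of that suggestion, an error handler could check the constructor with instanceof. The function name, the 400 status code and the message below are assumptions made for this example, and how the handler gets registered depends on the Express version in use:

```javascript
// Hypothetical Connect/Express-style error handler (four arguments).
// It answers malformed-URL errors itself and forwards everything else.
function uriErrorHandler(err, req, res, next) {
  if (err instanceof URIError) {
    res.statusCode = 400; // assumed status; pick whatever suits your API
    res.end('Bad request: malformed percent-encoding in URL');
    return;
  }
  next(err); // not a URIError, so leave it to the catch-all handler
}
```

That keeps the catch-all handler and its stack trace logging for genuinely unexpected failures.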