http, querystring
Node is not able to properly send query string parameters containing Russian-language characters.
Example:
var http = require('http');
http.get('http://httpbin.org/get?search=обязательный&a=b c', function (response) {
  // Continuously update stream with data
  var body = '';
  response.on('data', function (d) {
    body += d;
  });
  response.on('end', function () {
    console.log(body);
  });
});
Results in the output:
{
  "args": {
    "a": "b c",
    "search": ">1O70B5;L=K9"
  },
  "headers": {
    "Host": "httpbin.org"
  },
  "origin": "106.51.38.69",
  "url": "http://httpbin.org/get?search=>1O70B5%3BL=K9&a=b c"
}
Note that the space between b and c was correctly escaped by Node, but the first parameter ("search") was sent as garbage. Ideally, that should be escaped as well. Chrome's XMLHttpRequest handles the encoding correctly.
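One way to read the garbage value: it matches what you get when each Cyrillic character is cut down to the low byte of its UTF-16 code unit (for example, "о" is U+043E and 0x3E is ">"). The sketch below reproduces that string using Node's 'latin1' Buffer encoding. The lowBytes helper is a hypothetical name introduced for illustration, and this only shows that the output is consistent with single-byte truncation; it does not by itself identify where in Node the truncation happens.

```javascript
// Hypothetical helper (not part of Node): keep only the low byte of each
// UTF-16 code unit, which is what the 'latin1' encoding does on write.
function lowBytes(str) {
  return Buffer.from(str, 'latin1').toString('latin1');
}

console.log(lowBytes('обязательный'));
// prints: >1O70B5;L=K9
```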
Oh, I remember this; there's an older issue for it. IIRC it's not allowed in the spec and is a potential security issue if we do.
Edit: hmmm, maybe I'm remembering https://github.com/nodejs/node/issues/1693 ... this could be different.
cc @nodejs/http probably
This is one of several issues we have in the current url.parse implementation relating to extended character support. The new WHATWG URL parser should handle this without a problem, but it's still going to take a bit to work that in. I'll investigate to see how difficult it would be to fix this in the current impl.
I think the problem lies here: https://github.com/nodejs/node/blob/c8619ea3c38d025e6558ee19b40cd5b8f9d49f73/lib/querystring.js#L96, specifically in the multi-byte handling.
In the meantime, is there any workaround for this?
Well, you can always manually encode query parameters.
@vkurchatkin That results in double-encoding of parameters. Unfortunately, there does not seem to be a way to turn off encoding in the http.request() API.
@czardoz I'm 98% sure you can work around that by using a { host: '...', path: '...' } options object.
Something like this:
var http = require('http');
var options = {
  host: 'httpbin.org',
  path: `/get?search=${encodeURIComponent('обязательный')}&a=b%20c`
};
http.get(options, function (response) {
  // Continuously update stream with data
  var body = '';
  response.on('data', function (d) {
    body += d;
  });
  response.on('end', function () {
    console.log(body);
  });
});
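If there are several parameters, the same workaround can be sketched with the core querystring module instead of hand-encoding each value. This is only an illustration: buildOptions is a hypothetical helper name, and the host and parameters are the ones from the example above.

```javascript
const querystring = require('querystring');

// Hypothetical helper: build a { host, path } options object whose query
// string is percent-encoded (as UTF-8) by querystring.stringify().
function buildOptions(host, pathname, params) {
  return {
    host: host,
    path: pathname + '?' + querystring.stringify(params)
  };
}

const options = buildOptions('httpbin.org', '/get', {
  search: 'обязательный',
  a: 'b c'
});

console.log(options.path);
// prints: /get?search=%D0%BE%D0%B1%D1%8F%D0%B7%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D1%8B%D0%B9&a=b%20c
```

The resulting object can be passed to http.get(options, callback) in the same way as the hand-built options above.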
Just to confirm, the new WHATWG URL API does handle this correctly:
> process.versions
{ http_parser: '2.7.0',
  node: '7.0.0',
  v8: '5.4.500.36',
  uv: '1.9.1',
  zlib: '1.2.8',
  ares: '1.10.1-DEV',
  icu: '57.1',
  modules: '51',
  openssl: '1.0.2j' }
> new url.URL('http://httpbin.org/get?search=обязательный&a=b c')
URL {
  href: 'http://httpbin.org/get?search=%D0%BE%D0%B1%D1%8F%D0%B7%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D1%8B%D0%B9&a=b%20c',
  protocol: 'http:',
  hostname: 'httpbin.org',
  pathname: '/get',
  search: '?search=%D0%BE%D0%B1%D1%8F%D0%B7%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D1%8B%D0%B9&a=b%20c'
}
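On a Node version where the WHATWG URL class is available, the parsed URL could in principle feed the { host, path } options form suggested earlier in the thread. A minimal sketch, assuming URL is exported from the url module as in the REPL session above; the variable names are illustrative only.

```javascript
const { URL } = require('url');

// Let the WHATWG parser do the percent-encoding, then hand the already
// encoded pieces to http.get() through an options object.
const target = new URL('http://httpbin.org/get?search=обязательный&a=b c');

const requestOptions = {
  host: target.hostname,
  path: target.pathname + target.search
};

console.log(requestOptions.path);
// prints: /get?search=%D0%BE%D0%B1%D1%8F%D0%B7%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D1%8B%D0%B9&a=b%20c
```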
Until the bug is fixed in Node.js core, you may be able to blindly toss the URL into the https://www.npmjs.com/package/encodeurl module to get it encoded (and without encountering double-encoding).
This issue has been inactive for sufficiently long that it seems like perhaps it should be closed. Feel free to re-open (or leave a comment requesting that it be re-opened) if you disagree. I'm just tidying up and not acting on a super-strong opinion or anything like that.