https://runkit.com/szmarczak/5f0d0b92c01e77001bd2be09
const https = require('https');
https.request('https://www.sachadrake.com/images/assetimages/social_media_default.jpg', {
method: 'HEAD'
}, response => {
console.log(response.statusCode);
}).end();
Always.
Error {
bytesParsed: 478
code: "HPE_INVALID_CONSTANT"
rawPacket: Buffer <48, 54, 54, 50, 2F, 31, 2E, 31, 20, 34, 30, 34, 20, 4E, 6F, 74, 20, 46, 6F, 75, 6E, 64, 0D, 0A, 43, 6F, …>
reason: "Expected HTTP/"
message: "Parse Error: Expected HTTP/"
}
The server replies with these data:
HTTP/1.1 404 Not Found
Content-Type: text/html
Server:
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: NOSNIFF
X-XSS-Protection: 1
Content-Security-Policy: default-src * 'unsafe-inline' 'unsafe-eval' ; img-src * data: 'unsafe-inline' ; font-src * data: 'unsafe-inline' ; media-src * blob: 'unsafe-inline' ; frame-src * data: 'unsafe-inline' 'unsafe-eval' ;
Date: Tue, 14 Jul 2020 01:35:23 GMT
Strict-Transport-Security: max-age=3600
Transfer-Encoding: chunked
/cc @Kikobeats
@nodejs/http-parser
Here is a simpler test case without external servers:
'use strict';
const http = require('http');
const net = require('net');
const server = net.createServer();
server.on('connection', function (socket) {
socket.resume();
socket.write(
[
'HTTP/1.1 404 Not Found',
'Transfer-Encoding: chunked',
'',
'0',
'',
''
].join('\r\n')
);
});
server.listen(function () {
const request = http.request({
method: 'HEAD',
port: server.address().port
});
request.on('response', function (response) {
response.resume();
console.log(response.statusCode);
});
request.end();
});
It seems this happens every time the HTTP response to a HEAD request has a message body. The only relevant parts I can find on RFC 7230 are:
https://tools.ietf.org/html/rfc7230#section-3.3
Responses to the HEAD request method (Section 4.3.2
of [RFC7231]) never include a message body because the associated
response header fields (e.g., Transfer-Encoding, Content-Length,
etc.), if present, indicate only what their values would have been if
the request method had been GET (Section 4.3.1 of [RFC7231])
https://tools.ietf.org/html/rfc7230#section-3.3.3
Any response to a HEAD request and any response with a 1xx
(Informational), 204 (No Content), or 304 (Not Modified) status
code is always terminated by the first empty line after the
header fields, regardless of the header fields present in the
message, and thus cannot contain a message body.
Right, this issue has come up before - numerous times, in fact. Servers shouldn't include a response body when replying to HEAD requests but some buggy servers do.
The parser (correctly) assumes that the first byte after the headers-terminating newline is the start of a new response.
Node could work around it if it tracked requests and responses, i.e., if it knew no second response is expected because no matching request has been fired off yet. Would still break with HTTP pipelining though.
/cc @ronag (slightly related with https://github.com/mcollina/undici/issues/246 and https://github.com/mcollina/undici/pull/250)
From https://tools.ietf.org/html/rfc7231#section-4.3.2
The HEAD method is identical to GET except that the server MUST NOT
send a message body in the response (i.e., the response terminates at
the end of the header section).
so I think it's ok to mark this as "wontfix".
/cc @ronag (slightly related with mcollina/undici#246 and mcollina/undici#250)
What we did in undici was that if we suspect that the server might be sending extra data after the expected response we simply stop pipelining, ignore the extra data and close the connection. Though currently we only do this when we send requests that have undefined semantics, i.e. payload with GET or HEAD, and we know that the server might "misbehave".
Though in this case we are sending a valid request and the server is providing an invalid response, i.e. no hint regarding server behavior. I don't have any good ideas how to detect that without fully disabling pipelining.
Without pipelining we could do the above and just ignore the rest of the data once we have received what we expect.
@lpinca @ronag Thanks for explaination, closing :)
@szmarczak: One way to actually solve this is to always reset the connection after a HEAD response. Though that would significantly reduce performance of HEAD requests. But it would make it more resilient.
Most helpful comment
It seems this happens every time the HTTP response to a HEAD request has a message body. The only relevant parts I can find on RFC 7230 are:
https://tools.ietf.org/html/rfc7230#section-3.3
https://tools.ietf.org/html/rfc7230#section-3.3.3