Node: Unexpected HTTP Parse Error

Created on 14 Jul 2020 · 9 Comments · Source: nodejs/node

  • Version: v14.5.0
  • Platform: Linux solus 5.6.18-156.current #1 SMP PREEMPT Sun Jun 21 07:16:38 UTC 2020 x86_64 GNU/Linux
  • Subsystem: https

What steps will reproduce the bug?

https://runkit.com/szmarczak/5f0d0b92c01e77001bd2be09

const https = require('https');

https.request('https://www.sachadrake.com/images/assetimages/social_media_default.jpg', {
    method: 'HEAD'
}, response => {
    console.log(response.statusCode);
}).end();

How often does it reproduce? Is there a required condition?

Always.

What is the expected behavior?


What do you see instead?

Error {
    bytesParsed: 478
    code: "HPE_INVALID_CONSTANT"
    rawPacket: Buffer <48, 54, 54, 50, 2F, 31, 2E, 31, 20, 34, 30, 34, 20, 4E, 6F, 74, 20, 46, 6F, 75, 6E, 64, 0D, 0A, 43, 6F, …>
    reason: "Expected HTTP/"
    message: "Parse Error: Expected HTTP/"
}

Additional information

The server replies with these data:

HTTP/1.1 404 Not Found
Content-Type: text/html
Server: 
X-Frame-Options: SAMEORIGIN
X-Content-Type-Options: NOSNIFF
X-XSS-Protection: 1
Content-Security-Policy: default-src  * 'unsafe-inline' 'unsafe-eval' ; img-src * data: 'unsafe-inline' ; font-src * data: 'unsafe-inline' ; media-src * blob: 'unsafe-inline' ; frame-src * data: 'unsafe-inline' 'unsafe-eval' ;
Date: Tue, 14 Jul 2020 01:35:23 GMT
Strict-Transport-Security: max-age=3600
Transfer-Encoding: chunked

/cc @Kikobeats

Labels: http, http_parser, wontfix

All 9 comments

@nodejs/http-parser

Here is a simpler test case without external servers:

'use strict';

const http = require('http');
const net = require('net');

const server = net.createServer();

server.on('connection', function (socket) {
  socket.resume();
  socket.write(
    [
      'HTTP/1.1 404 Not Found',
      'Transfer-Encoding: chunked',
      '',
      '0',
      '',
      ''
    ].join('\r\n')
  );
});

server.listen(function () {
  const request = http.request({
    method: 'HEAD',
    port: server.address().port
  });

  request.on('response', function (response) {
    response.resume();
    console.log(response.statusCode);
  });

  request.end();
});

It seems this happens every time the HTTP response to a HEAD request has a message body. The only relevant parts I can find on RFC 7230 are:

https://tools.ietf.org/html/rfc7230#section-3.3

Responses to the HEAD request method (Section 4.3.2
of [RFC7231]) never include a message body because the associated
response header fields (e.g., Transfer-Encoding, Content-Length,
etc.), if present, indicate only what their values would have been if
the request method had been GET (Section 4.3.1 of [RFC7231])

https://tools.ietf.org/html/rfc7230#section-3.3.3

Any response to a HEAD request and any response with a 1xx
(Informational), 204 (No Content), or 304 (Not Modified) status
code is always terminated by the first empty line after the
header fields, regardless of the header fields present in the
message, and thus cannot contain a message body.
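
As a rough illustration of the rule quoted above, the check can be written as a small predicate. This is only a sketch: `responseCanHaveBody` is a hypothetical helper, not a Node.js API, and it covers only the cases named in the quoted paragraph.

```javascript
'use strict';

// Hypothetical helper (not part of Node.js): encodes only the rule quoted
// from RFC 7230 section 3.3.3 above.
function responseCanHaveBody(requestMethod, statusCode) {
  // Any response to HEAD ends at the first empty line after the headers.
  if (requestMethod.toUpperCase() === 'HEAD') return false;
  // 1xx (Informational) responses never carry a body.
  if (statusCode >= 100 && statusCode < 200) return false;
  // 204 (No Content) and 304 (Not Modified) never carry a body either.
  return statusCode !== 204 && statusCode !== 304;
}

console.log(responseCanHaveBody('HEAD', 404)); // false
console.log(responseCanHaveBody('GET', 404));  // true
console.log(responseCanHaveBody('GET', 304));  // false
```

Under this rule, the `Transfer-Encoding: chunked` header in the 404 response above says nothing about bytes on the wire for a HEAD request.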

Right, this issue has come up before - numerous times, in fact. Servers shouldn't include a response body when replying to HEAD requests but some buggy servers do.

The parser (correctly) assumes that the first byte after the headers-terminating newline is the start of a new response.
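
To make that concrete, here is a minimal sketch that splits the bytes from the test case above at the first empty line. It does not use the real parser; `splitAfterHeaders` is a hypothetical helper, and the point is only to show what is left over for the parser to read as the "next response".

```javascript
'use strict';

// The raw bytes from the test case above: a status line, one header, the
// empty line that ends the header section, then a stray chunked terminator.
const raw = Buffer.from(
  'HTTP/1.1 404 Not Found\r\nTransfer-Encoding: chunked\r\n\r\n0\r\n\r\n',
  'latin1'
);

// Hypothetical helper: split at the first empty line, which is where a
// response to HEAD must end.
function splitAfterHeaders(buffer) {
  const end = buffer.indexOf('\r\n\r\n');
  if (end === -1) return null; // header section not complete yet
  return {
    head: buffer.subarray(0, end + 4),
    leftover: buffer.subarray(end + 4)
  };
}

const { leftover } = splitAfterHeaders(raw);
console.log(JSON.stringify(leftover.toString('latin1')));     // "0\r\n\r\n"
console.log(leftover.toString('latin1').startsWith('HTTP/')); // false
```

The leftover bytes do not begin with `HTTP/`, which matches the reported reason, "Expected HTTP/".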

Node could work around it if it tracked requests and responses, i.e., if it knew no second response is expected because no matching request has been fired off yet. Would still break with HTTP pipelining though.
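
A toy model of that bookkeeping might look like the following. It is purely illustrative: `ResponseTracker` is a hypothetical class, not something Node.js implements, and it only counts outstanding requests to show why pipelining defeats the idea.

```javascript
'use strict';

// Hypothetical bookkeeping, not how Node.js works: count requests that have
// been sent but whose responses have not completed yet.
class ResponseTracker {
  constructor() {
    this.pending = 0;
  }

  requestSent() {
    this.pending += 1;
  }

  responseCompleted() {
    this.pending -= 1;
  }

  // Bytes that arrive while nothing is pending cannot be a legitimate
  // response, so they could be dropped instead of parsed.
  canDiscardIncoming() {
    return this.pending === 0;
  }
}

const single = new ResponseTracker();
single.requestSent();       // HEAD
single.responseCompleted(); // headers parsed, HEAD response is done
console.log(single.canDiscardIncoming()); // true

const pipelined = new ResponseTracker();
pipelined.requestSent();       // HEAD
pipelined.requestSent();       // GET, pipelined behind the HEAD
pipelined.responseCompleted(); // HEAD response is done
console.log(pipelined.canDiscardIncoming()); // false
```

In the pipelined case the next bytes might be the start of the GET response, so a stray body cannot safely be thrown away.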

/cc @ronag (slightly related with https://github.com/mcollina/undici/issues/246 and https://github.com/mcollina/undici/pull/250)

From https://tools.ietf.org/html/rfc7231#section-4.3.2

The HEAD method is identical to GET except that the server MUST NOT
send a message body in the response (i.e., the response terminates at
the end of the header section).

so I think it's ok to mark this as "wontfix".

/cc @ronag (slightly related with mcollina/undici#246 and mcollina/undici#250)

In undici, if we suspect that the server might send extra data after the expected response, we stop pipelining, ignore the extra data, and close the connection. Currently we only do this when we send requests that have undefined semantics, i.e. a payload with GET or HEAD, where we know that the server might "misbehave".

In this case, though, we are sending a valid request and the server is returning an invalid response, so there is no hint about how the server will behave. I don't have any good ideas for detecting that without fully disabling pipelining.

Without pipelining we could do the above and just ignore the rest of the data once we have received what we expect.

@lpinca @ronag Thanks for the explanation, closing :)

@szmarczak: One way to actually solve this is to always reset the connection after a HEAD response. That would significantly reduce the performance of HEAD requests, but it would make them more resilient.
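
For anyone who needs to talk to such a server today, the "read the headers, then drop the connection" idea can be sketched in user land with a raw socket. This bypasses `http.request` and its parser entirely, speaks plain HTTP only (no TLS), and `headRequest` is a hypothetical helper written for this thread, not an existing API.

```javascript
'use strict';

const net = require('net');

// Hypothetical helper: send a bare HEAD request, read up to the first empty
// line, then destroy the socket so stray body bytes are never parsed.
function headRequest({ host = '127.0.0.1', port, path = '/' }) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port });
    let buffered = Buffer.alloc(0);
    let done = false;

    socket.on('connect', () => {
      socket.write(
        `HEAD ${path} HTTP/1.1\r\nHost: ${host}\r\nConnection: close\r\n\r\n`
      );
    });

    socket.on('data', (chunk) => {
      buffered = Buffer.concat([buffered, chunk]);
      const end = buffered.indexOf('\r\n\r\n');
      if (end === -1) return; // header section not complete yet

      const [statusLine, ...headerLines] = buffered
        .toString('latin1', 0, end)
        .split('\r\n');
      done = true;
      socket.destroy(); // reset the connection; leftover bytes are ignored
      resolve({ statusCode: Number(statusLine.split(' ')[1]), headerLines });
    });

    socket.on('error', reject);
    socket.on('close', () => {
      if (!done) reject(new Error('connection closed before headers ended'));
    });
  });
}
```

Calling `headRequest({ port })` against the `net` test server from earlier in the thread should resolve with the status code and the raw header lines. The cost is the one mentioned above: every HEAD request pays for a fresh connection.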
