Node: HTTP connections aborted after 5s / keepAliveTimeout

Created on 2 Jun 2017 · 37 comments · Source: nodejs/node

  • Version: v8.0.0
  • Platform: Windows 10, 64bit
  • Subsystem: http

Short Description

Node 8 introduced a change in the handling of HTTP keep-alive connections. IMHO, this is (at least) a breaking change. When an HTTP server serves long-running requests (>5s) and the client requests a Connection: keep-alive connection, the server closes the connection after 5s. This can cause browsers to re-send the request, even if it is a POST request.
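
For context, the failing scenario boils down to a handler like this (a minimal sketch, not the linked repo's exact code; the route, port, and delay are illustrative):

const http = require('http');

// Responds after 10s, mirroring the repo's /longpost handler.
http.createServer((req, res) => {
  setTimeout(() => {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end('{"status":"OK"}');
  }, 10000);
}).listen(3000);

Under Node 8's defaults, the kept-alive socket can be torn down at the 5s mark, before this 10s response is written.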

To Reproduce

Clone https://github.com/pbininda/node8keepAliveTimeout and run npm install. Then

    npm test

Starts a little express server (server.js) and a client.

  • The server is a standard Express server with a long-running POST handler (/longpost takes 10s).
  • The client calls POST /longpost, preceded by a preflight OPTIONS /longpost.

The test runs through fine on node 6 and node 7:

> node test.js

got request OPTIONS /longpost
got options response 200
sending post request
got request POST /longpost
got post response 200 { status: 'OK' }

but fails on node 8 with

> node test.js

got request OPTIONS /longpost
got options response 200
sending post request
got request POST /longpost
C:\Users\pbininda\projects\ATRON\node8keepAliveTimeout\client.js:39
            throw err;
            ^

Error: socket hang up
    at createHangUpError (_http_client.js:343:15)
    at Socket.socketOnEnd (_http_client.js:435:23)
    at emitNone (events.js:110:20)
    at Socket.emit (events.js:207:7)
    at endReadableNT (_stream_readable.js:1045:12)
    at _combinedTickCallback (internal/process/next_tick.js:102:11)
    at process._tickCallback (internal/process/next_tick.js:161:9)

Browser Retries

It seems most of the major browsers (Chrome, Firefox, Edge) implement https://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.2.4. Since the server closes the connection on which it received the POST request before sending a response, the browsers re-send the POST. Note that you don't see the re-send in Chrome dev tools, but Wireshark shows the retransmission. To have a look at this, run

    npm start

which launches the server (server.js) and then load browsertest.html in chrome. This runs browsertest.js in the browser which does a simple $.ajax request against the server. On the server side you will see:

> node server.js

got request OPTIONS /longpost
got request POST /longpost
got request POST /longpost
format called 5003ms after previous

This shows that the server received two POST requests, the second one 5s after the first, even though the browser client code only makes one request.

Bug or Breaking Change?

I'm not sure if this is a bug or a breaking change. It was probably introduced through #2534. It only seems to happen when two connections are used (that's why the preflight OPTIONS is forced to happen in my code), so it may be that the wrong connection is being closed here.

Workaround

Setting the keepAliveTimeout (see https://nodejs.org/dist/latest-v8.x/docs/api/http.html#http_server_keepalivetimeout) of the http server to a value greater than the maximum duration of a request solves the problem. You can try this with

    npm start -- --keepAliveTimeout 20000

and then in another terminal

    node client.js
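
In code, the same workaround looks like this (a minimal sketch; server is whatever http.createServer or app.listen returned, and the handler is illustrative):

const http = require('http');

const server = http.createServer((req, res) => res.end('OK'));
// Keep sockets alive longer than the slowest expected request.
server.keepAliveTimeout = 20000; // ms; Node 8's default is 5000
server.listen(3000);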

All 37 comments

/cc @indutny @tshemsedinov @aqrln
Could be related to #2534.

Thanks for the ping, I'll be able to look into this in a few hours.

Thanks for bringing this up @pbininda. We hit this issue as well when running requests longer than a few seconds. We also see multiple POSTs affecting all browsers on 8. We reverted back to 7.10 and all is well. Seems like a pretty pernicious bug. We're using Koa 1.x middleware on Linux.

Same behaviour here; a request ran a bit longer than usual and the browser kept re-sending it. A general remark: Firefox and Chrome did re-send the request, but Safari didn't. We're using Express 4.15.2. Finally, we reverted to 7.10 and it is working normally.

Thanks for the note regarding Safari, I'll change the wording regarding "all major browsers".

Ugh... sorry for the delay, I was busier than I hoped. Let's fix it today. Thanks a lot for the detailed report and reproduction!

Confirmed the issue is still present in release 8.1 as well.

@pbininda @lvpro @juanecabellob I'm very sorry for not making it in time for the 8.1 release. I took a look at the reproduction back then, but didn't have an opportunity to debug it until now. https://github.com/nodejs/node/pull/13549 should fix it.

Don't be sorry @aqrln! Thank you very much for getting this resolved! :)

@aqrln No problem, I can keep the server.keepAliveTimeout workaround in place until the fix is released. Thanks for your effort.

I have this issue too with webpack-dev-server. It somehow does not work.

How would I pass the keepAliveTimeout parameter so that the default timeout is 20 seconds?

Above, this is the example given:

npm start -- --keepAliveTimeout 20000

I tried this:

 npm run-script start -- --"keepAliveTimeout 20000"

It gets expanded to:

webpack-dev-server --inline --hot --history-api-fallback  --public --host 104.222.96.51 --port 9090 "--keepAliveTimeout 20000"

Unfortunately it does not have any effect.

This problem still exists in node 8.7.0

@rickywu which problem? Please show the code that produces it.

@rickywu I am using webpack-dev-server to host a bundle.js file that is about 5 MB in size.

webpack-dev-server --inline --hot --history-api-fallback  --public --host 127.0.0.1 --port 9090 --keepAliveTimeout 20000

It all worked when the bundle size was 2 MB. Now that it is bigger, I get timeouts in the browser.
I believe this is a pretty well-known problem. The first link shows some solutions, which unfortunately do not work; the other two links show that the issue is present in webpack-dev-server and also in angular-cli:

https://github.com/nodejs/node/issues/13391
https://github.com/webpack/webpack-dev-server/issues/1044
https://github.com/angular/angular-cli/pull/7563

I had this bug until Node 8.2, as far as I know. It is fixed now and is not related to this. I have bundles of about 10 MB, and if the server responds, everything is fine. I recommend tuning the MTU if you are using a VPN.

@Delagen This is really strange. I run this on a dedicated server with a direct IP. I have worked around the issue so far by creating a bundle that I now serve via express.js, where I set the timeout in code. To me, all the descriptions fit 100% with the issue that came up in Node recently. I cannot run webpack-dev-server on my server anymore; it only works on localhost on my development machine, as there I have no transfer delay.

@Delagen see this issue

Setting server.keepAliveTimeout = 30000 in app.js solved it for me, but I think the default value of 5s is too short.
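
For an Express app that looks roughly like the following (a sketch; the route and timings are illustrative, and app.listen returns the underlying http.Server):

const express = require('express');

const app = express();
app.post('/longpost', (req, res) => {
  setTimeout(() => res.json({ status: 'OK' }), 10000); // slow handler
});

// app.listen returns the underlying http.Server instance.
const server = app.listen(3000);
server.keepAliveTimeout = 30000; // longer than the slowest request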

@rickywu For me, the error on Express was a repeated request callback. I have no problems with Express on Node.js 8.7.0. I only set timeout because I have heavy requests that need up to a minute to begin responding, but I don't set any keepAliveTimeout.

FWIW, there is a PR open that's dealing with this: https://github.com/nodejs/node/pull/15791 and there's a more recent (open) issue: https://github.com/nodejs/node/issues/15082

@apapirovski As I understand it, does it only occur when TLS is used in Node?

@Delagen No, it occurs in both tls & net (the PR just fixes both tls & net issues related to it). See https://github.com/nodejs/node/issues/15082 for a good explanation of what triggers it.

But to summarize: once Node starts writing data, the timeout starts ticking (whether it's the keep-alive one or the regular one), and it completely ignores whether any writing is still going on. The timer is only updated when a write starts or ends. That's why most reports of this issue involve downloading/serving large files. Also, this issue has existed for a very long time; it just wasn't obvious before keepAliveTimeout because the regular timeout is 120s.
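
For reference, both knobs are documented http.Server properties (a sketch showing where each one lives, with the defaults mentioned above):

const http = require('http');

const server = http.createServer((req, res) => res.end('OK'));
server.timeout = 120000;        // regular socket timeout, 120s by default
server.keepAliveTimeout = 5000; // idle keep-alive timeout, 5s default in Node 8
server.listen(8000);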

@Delagen What's the size of your data per request? Mine is about 3 MB.
Using a debug tool to monitor the network, I see the transmission suspend after 6s if the keep-alive timeout is not set.

About 5 MB max, but I seem to have a fast connection, so I don't hit this timeout.

FYI, a fix for this just landed on master, so it should be in 8.x LTS & 9.x at some point within the next month. With the fix, there should no longer be a need to bump up keepAliveTimeout just so transfers don't time out.

@apapirovski can you link to the fix please :D

v4.x and v6.x are unaffected, correct?

v6.x is affected. Probably v4.x too.

Here's a way to test:

const http = require('http');

const server = http.createServer((req, res) => {
  // A 30 MB payload, so the write outlasts the 5s timeout set below.
  const content = Buffer.alloc(30000000, 0x44);

  res.writeHead(200, {
    'Content-Type': 'application/octet-stream',
    'Content-Length': content.length.toString(),
    'Vary': 'Accept-Encoding'
  });

  // _handle is internal; it is grabbed here only to log how much data
  // is still queued when the end callback fires.
  const handle = res.socket._handle;
  res.end(content, function() {
    console.log(handle.writeQueueSize);
  });
});
server.timeout = 5000;

server.listen(8000);

Then, in another terminal, slow the download down so it takes longer than the timeout:

    curl -v http://localhost:8000/ | dd bs=1 of=/dev/null

I don't know if my code can be backported straight up, though. Maybe label the PR with backport-requested and I'll try to get around to figuring out how to make it work on those lines.

@apapirovski please make a backport to the v6.x and v4.x branches.

We are doing a maintenance release of v4.x soon, which can be postponed for this.

Ok, I labeled the PR and will work on it this week. Will keep you posted on it. Let me know if it lands cleanly on v8.x or not; I want to make sure it makes it into the first LTS release of that.

@apapirovski I just landed it in v8.x-staging in f6a725ea09, it should hopefully go out in tomorrow's release.

@MylesBorins You'll need the other commit: https://github.com/nodejs/node/commit/aaf2a1c2264fffd6eea82fe3778f41ac046e6349; they probably should've landed as one, or I should've indicated that the 2nd one depends on the first working properly. My bad.

@apapirovski done

landed in eeef06a4bb

Would this have been present in node 10.6.0? Test code above seems to pass.

3 years and the bug still exists? How's that possible?

How's that possible?

As with all things in Node.js, it requires someone free to work on it. Pull requests are always welcome. One thing that may be helpful is a reproduction of the issue in the form of a known_issue test that can be used to guide someone in making a fix.

Not intended to sound rude. I've been trying to solve this riddle for hours now; maybe someone else has hit it as well:

  • Trying to send, in a response, an 8 KB array of 50 items that takes 6.7 seconds to produce fails with an error.
  • When I slice(10) the array, it suddenly takes 2 seconds and everything works like a charm.

My question would be: how is that possible? It seems that if the data sent is too big (?), the response just freezes and then times out.

I'm using Node 10.19.0. Thanks for your help.

In case this helps: I think there's an additional use case which isn't related to transmission size; rather, it's a race condition between the start/end of requests on the same connection.

  1. First request, new connection opened, keep-alive timer starts.
  2. First request ends
  3. Keep-alive timer kicks in, "timeout" event is about to be emitted on the socket
    3.1 First bytes of second request come in on the soon-to-be-destroyed socket
  4. "timeout" event is emitted on the socket, resulting in the destruction of the socket
    4.1 Second request gets dropped
  5. The second request's parsing (parseOnIncoming...) starts, trying to reset the timeout on the already-destroyed socket
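
A rough way to provoke that timing (a sketch only, and inherently non-deterministic: it reuses a single keep-alive socket and fires a second request just as the default 5s keep-alive deadline expires; the port and paths are illustrative):

const http = require('http');

// One keep-alive socket, reused for both requests.
const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

function get(path, onEnd) {
  http.get({ port: 8000, path, agent }, (res) => {
    res.resume();
    res.on('end', onEnd);
  }).on('error', (err) => console.error(path, err.message));
}

// The second request is timed to arrive just as the server's
// keep-alive timeout (5000 ms by default) is firing.
get('/first', () => {
  setTimeout(() => get('/second', () => console.log('done')), 5000);
});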

Important:
Note that in this test (see "Reproduce" below) we're not transmitting large amounts of data on the connection, which is (I think) why f6a725e doesn't fix this issue. Put differently, I think it's a timing issue: one request ends (and the keep-alive timer is about to fire), and at (almost) the same time a new request comes in.

*I'm not 100% sure this is the correct flow; it's just my best educated guess, based on some debugging.*

Reproduce:

Tested Versions:

  • v8.16.1
  • v12.18.0
  • v12.19.1
  • v12.20.0
  • v14.7.0

Naive Approach to a Fix
Assuming my debug analysis is correct, there are two main ways to go about a fix for this:

  • either prevent the timeout event from firing on the socket while the socket is still in use,
  • or, whenever the timeout event has fired, check whether the socket is still in use, and if so do nothing.

Here's a naive code listing of the second approach:

// file: _http_server.js
function socketOnTimeout() {
  // "this" is the socket
  if (this.isInUse) {
    return;
  }
  // ... existing timeout handling ...
}
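
And for comparison, a sketch of the first approach (the isInUse flag is hypothetical here too, as are both hook functions; socket.setTimeout(0) is the real net.Socket call for disabling a pending timeout):

// Instead of letting the timer fire and bailing out afterwards,
// disarm it while a request is active and re-arm it when done.
function onRequestStart(socket) {
  socket.isInUse = true;  // hypothetical flag
  socket.setTimeout(0);   // disable the pending timeout
}

function onResponseFinish(socket, server) {
  socket.isInUse = false;
  socket.setTimeout(server.keepAliveTimeout); // re-arm keep-alive timer
}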
