Due to what I thought would be a simple documentation PR, I have recently started investigating how http.Agent handles both connection pooling and HTTP keep-alive behavior, and I am quite confused. I would like to know whether my understanding of the existing code is correct, and if so, whether this is the intended behavior.
The following text is a copy/paste of one of my comments from that documentation pull request.
I have been reading the code all morning, and the keep-alive logic is mind-boggling, shared between _http_outgoing.js, _http_agent.js, _http_client.js, and net.js. If I'm reading it correctly, as far as clients go, by default, we're not implementing HTTP/1.1 persistence.
The HTTP/1.1 spec says,
> A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Both http.globalAgent and an Agent created with the default constructor behave so that multiple _simultaneous_ requests will reuse an existing connection, but once the Agent instance has no more pending requests for a given host/port, the socket is destroyed instead of assuming the server will maintain the connection.
But really, I wonder if this is the intended behavior. It's not necessarily broken, but it seems like an odd default. That, combined with the fact that the default also does not pool sockets -- you must specify keepAlive: true to enable this -- makes it seem like the Agent doesn't do much on its own, unless expressly created/configured to do so.
To summarize: if I read it correctly, the default behavior for all HTTP client requests using an Agent (even the default http.globalAgent) is the following:
- Always send HTTP/1.0 `Connection: keep-alive` headers. See: https://github.com/nodejs/node/blob/master/lib/_http_outgoing.js#L276-L279
- Destroy socket connections when there are no more pending requests for a given host/port.
- To use connection pooling, you must specify `{ keepAlive: true }` as a constructor option in a new `Agent`.
- Even with pooling enabled, there is no change to the `Connection: keep-alive` header. See above re: always sending `Connection: keep-alive`.
- When using connection pooling, and therefore also HTTP/1.0 connection persistence via `Connection: keep-alive` headers (because you can't have one without the other in this implementation), TCP SO_KEEPALIVE is used via libuv's uv_tcp_keepalive function. See: https://github.com/nodejs/node/blob/master/lib/_http_agent.js#L73, https://github.com/nodejs/node/blob/master/lib/net.js#L368, and https://github.com/nodejs/node/blob/master/src/tcp_wrap.cc#L161
- You can avoid the `Agent` entirely by passing a custom `createConnection` function to the `http.request()` function.

I would love for someone with better knowledge of the code to chime in here. My reading of the code could be wrong. But if it's correct, it sure is confusing and I find it hard to believe this is the true intent.
> Always send HTTP/1.0 `Connection: keep-alive` headers.
I suspect that's an accident of history. Node has always done that.
> Destroy socket connections when there are no more pending requests for a given host/port.
...
> To use connection pooling, you must specify `{ keepAlive: true }` as a constructor option in a new `Agent`.
The logic is essentially that the agent doesn't know more requests will follow unless they are already queued up, and it's bad netizenship to keep connections open unnecessarily. It would also prevent the node process from exiting.
You can then override the default by passing { keepAlive: true, keepAliveMsecs: ms }. (Aside: I expect { keepAliveMsecs: Infinity } won't work. Probably a bug.)
> When using connection pooling, and therefore also HTTP/1.0 connection persistence via `Connection: keep-alive` headers (because you can't have one without the other in this implementation), TCP SO_KEEPALIVE is used via libuv's uv_tcp_keepalive function.
Yes. TCP keep-alive needs to be enabled for long-lived persistent connections, otherwise the local or remote TCP stack may decide to kill the connection due to inactivity.
@bnoordhuis thanks for the reply. It's still a little confusing to me. Maybe it will help if I am more specific about what seems odd to me.
- The default Agent sends `Connection: keep-alive` for all requests, but then closes the connection after all _simultaneous_ requests complete.
- Connection persistence is signaled with `Connection: keep-alive` headers, but it only works in conjunction with socket pooling in the Agent.

I just want to be sure that this is the intended behavior before finalizing updates to http.md.
Clients don't keep the connection open unless told otherwise, and that's intentional, yes.
That node sends a keep-alive header may or may not be intentional. It used to be good practice ca. 2000 to work around servers that implemented HTTP/1.1 poorly but I don't know if that is why node does it. I didn't write that code, that was Ryan.
Where you say 'simultaneous', I would probably use 'queued'. Node works down a backlog and closes the connection once the backlog is empty (unless told otherwise).
OK - thanks for the clarification. I'll close this then.