(go1.2rc4) We're working on a reverse HTTP proxy that does some "stuff" (not relevant here) and proxies towards one or more backend HTTP servers. During testing I ran into connection failures, apparently due to running out of file descriptors. In narrowing this down I reduced the problem to:

The "null" reverse proxy (doing nothing but proxying, using httputil.ReverseProxy): http://play.golang.org/p/p1g4bpTZ_g

A trivial HTTP server acting as the backend behind the above proxy; it simply counts the number of connections and responds successfully: http://play.golang.org/p/F7W-vbXEUt

Running both of these, I have the proxy on :8080 forwarding to the server at :8090. To test it, I use wrk (https://github.com/wg/wrk):

jb@jborg-mbp:~ $ wrk -c 1 -t 1 http://localhost:8080/
...

Granted, this isn't a super realistic reproduction of the real world, since the latency on localhost is minimal. I can't prove the same thing _can't_ happen in production, though.

Using one connection (-c 1) up to about three, this works perfectly. The server side sees a bunch of requests over one to three connections, i.e. the number of backend connections from the proxy matches the number of incoming connections.

At around -c 4 and upwards, it blows up. The proxy doesn't manage to recycle connections quickly enough and starts doing regular Dials at a rate of thousands per second, resulting in:

2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
2013/11/18 20:18:21 http: proxy error: dial tcp 127.0.0.1:8090: can't assign requested address
...

from the proxy, and of course HTTP errors as seen by wrk.

My theory, after going through http.Transport, is that when the number of requests/s to the proxy goes up, the small amount of bookkeeping required to recycle a connection (putIdleConn etc.) starts taking just long enough that the next request in the pipeline gets in before a connection is idle. The Transport starts Dialing, adding more connections to take care of, and it explodes chaotically. I would prefer that it blocked and waited for an idle connection instead of Dialing at some point.

Line 92, "// TODO: tunable on global max cached connections", seems relevant, although in my case the tunable should probably be max connections per host rather than global. I'll probably take a stab at implementing something like that to fix this (i.e. rewriting Transport to limit connections), since I couldn't figure out a way around it using just the available APIs. Unless I've misunderstood something obvious...

//jb
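The playground snippets above aren't reproduced here; a minimal proxy along the lines described (httputil.ReverseProxy on :8080 forwarding to a backend on :8090, those addresses being the only details taken from the report) might look roughly like this:

// Minimal sketch of a "null" reverse proxy as described above; the actual
// playground code may differ. Assumes the backend is on :8090 and the
// proxy listens on :8080, per the description.
package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
)

func main() {
    backend, err := url.Parse("http://localhost:8090")
    if err != nil {
        log.Fatal(err)
    }
    // NewSingleHostReverseProxy uses http.DefaultTransport, which at the
    // time had no per-host connection limit.
    proxy := httputil.NewSingleHostReverseProxy(backend)
    log.Fatal(http.ListenAndServe(":8080", proxy))
}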
Just wanted to check if this is planned anytime soon. I ran into a similar issue: at scale there appears to be high contention on the lock around putIdleConn and getIdleConn.

316 @ 0x415989 0x415a0b 0x429eee 0x42a130 0x49ad56 0x45056e 0x450ab8 0x44ef26 0x43390d 0x4330db 0x43456f 0x433bbf 0x400e4f 0x415c20
# 0x42a130 sync.runtime_Semacquire+0x30 /usr/local/go/src/pkg/runtime/sema.goc:199
# 0x49ad56 sync.(*Mutex).Lock+0xd6 /usr/local/go/src/pkg/sync/mutex.go:66
# 0x45056e net/http.(*Transport).getIdleConn+0x9e /usr/local/go/src/pkg/net/http/transport.go:402
# 0x450ab8 net/http.(*Transport).getConn+0xa8 /usr/local/go/src/pkg/net/http/transport.go:452
# 0x44ef26 net/http.(*Transport).RoundTrip+0x416 /usr/local/go/src/pkg/net/http/transport.go:201
# 0x43390d net/http.send+0x43d /usr/local/go/src/pkg/net/http/client.go:195
# 0x4330db net/http.(*Client).send+0x15b /usr/local/go/src/pkg/net/http/client.go:118
# 0x43456f net/http.(*Client).doFollowingRedirects+0x97f /usr/local/go/src/pkg/net/http/client.go:343
# 0x433bbf net/http.(*Client).Get+0xaf /usr/local/go/src/pkg/net/http/client.go:275
# 0x400e4f main.startClient+0x16f /home/adsteam/go_workspace/src/suraj.narkhede/stresstester/main.go:48

347 @ 0x415989 0x415a0b 0x429eee 0x42a130 0x49ad56 0x44fd5e 0x455b87 0x454c1b 0x454931 0x483291 0x480f6d 0x400f0b 0x415c20
# 0x42a130 sync.runtime_Semacquire+0x30 /usr/local/go/src/pkg/runtime/sema.goc:199
# 0x49ad56 sync.(*Mutex).Lock+0xd6 /usr/local/go/src/pkg/sync/mutex.go:66
# 0x44fd5e net/http.(*Transport).putIdleConn+0xce /usr/local/go/src/pkg/net/http/transport.go:342
# 0x455b87 net/http.func·023+0x97 /usr/local/go/src/pkg/net/http/transport.go:853
# 0x454c1b net/http.(*bodyEOFSignal).condfn+0xab /usr/local/go/src/pkg/net/http/transport.go:1161
# 0x454931 net/http.(*bodyEOFSignal).Read+0x321 /usr/local/go/src/pkg/net/http/transport.go:1133
# 0x483291 io/ioutil.devNull.ReadFrom+0xb1 /usr/local/go/src/pkg/io/ioutil/ioutil.go:151
# 0x480f6d io.Copy+0x13d /usr/local/go/src/pkg/io/io.go:349
# 0x400f0b main.startClient+0x22b /home/adsteam/go_workspace/src/suraj.narkhede/stresstester/main.go:55

I have written a simple load tool in Go (to evaluate the Go HTTP client's performance): http://play.golang.org/p/MA-a2ZAVNe.
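The playground link above isn't reproduced here, but a rough sketch of the kind of client loop the stack traces point at (many goroutines doing Client.Get and draining the body, which is where getIdleConn and putIdleConn serialize on the Transport's lock) could look like this; all names and numbers are illustrative, not the original tool:

// Sketch of a load-generating client loop similar to what the stack
// traces above suggest (main.startClient): each goroutine repeatedly
// issues GETs and drains the response body.
package main

import (
    "io"
    "io/ioutil"
    "log"
    "net/http"
    "sync"
)

func startClient(client *http.Client, url string, n int) {
    for i := 0; i < n; i++ {
        resp, err := client.Get(url) // contends on Transport.getIdleConn
        if err != nil {
            log.Print(err)
            continue
        }
        io.Copy(ioutil.Discard, resp.Body) // hitting EOF triggers putIdleConn
        resp.Body.Close()
    }
}

func main() {
    const workers, reqsPerWorker = 500, 1000
    var wg sync.WaitGroup
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            startClient(http.DefaultClient, "http://localhost:8090/", reqsPerWorker)
        }()
    }
    wg.Wait()
}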
Wondering if any progress has been made on this? I'm seeing similar results when running synthetic benchmarks. Does this performance profile hold in production/real scenarios? If so, does using Go as a serious reverse proxy become impractical?
@marbemac, are you asking about max connections per host (what this bug is about), or the separate issue of mutex contention brought up in Comment 6? If the latter, I'd prefer to move it to a new bug.
@bradfitz I'm definitely seeing the first issue; I'm not sure if the issue in comment 6 is related. Basically I'm relatively new to Go and am trying to evaluate whether I can use it to build a decently fast reverse proxy. Running wrk against a Go HTTP server that simply returns a hello world string, I get the following (very unscientific) results:
With no proxy
wrk -t2 -c100 -d15s http://127.0.0.1:3000
29k requests/second, 0% failure
With nginx in front
wrk -t2 -c100 -d15s http://127.0.0.1:4205
28k requests/second, 0% failure
With NewSingleHostReverseProxy in front (https://play.golang.org/p/vk33Eg4QOS)
wrk -t2 -c100 -d15s http://127.0.0.1:4204
800 requests/second, 20% failure
Is this typical, or am I doing something wrong in the Go code / testing setup? I'm running this on a MacBook Pro; max file descriptors have been increased.
I'm seeing a similar issue with a Go program on both OS X and Linux. With only one concurrent request the proxy is stable, but when adding more it starts to break. My implementation is _very_ simple.
@mentat Try playing with these two settings. I'm seeing some improvement with:
runtime.GOMAXPROCS(runtime.NumCPU())
http.DefaultTransport.(*http.Transport).MaxIdleConnsPerHost = 100
I'm curious to hear if it helps in your case, let me know.
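A related option, as a sketch rather than anything from the comment above: instead of mutating the process-wide http.DefaultTransport, the proxy can be given its own Transport via the ReverseProxy.Transport field. The ports below are just the ones from the earlier benchmarks, and 100 is only an example value:

// Sketch: give the reverse proxy a dedicated Transport instead of
// changing the process-wide http.DefaultTransport.
package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
)

func main() {
    backend, err := url.Parse("http://127.0.0.1:3000")
    if err != nil {
        log.Fatal(err)
    }
    proxy := httputil.NewSingleHostReverseProxy(backend)
    proxy.Transport = &http.Transport{
        MaxIdleConnsPerHost: 100, // keep more idle conns to the backend instead of re-dialing
    }
    log.Fatal(http.ListenAndServe(":4204", proxy))
}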
Thanks @marbemac. I've pushed out a new version with these settings and will let you know how it does with real-world traffic.
Hit this issue in prod; blocking would definitely be preferable to exhausting sockets.
Hitting this issue as well in production. Was there a better solution here than cranking up MaxIdleConns to fit your workload?
The persistConn objects used by the Transport define a closech channel to listen for the closing of the underlying connections:
https://github.com/golang/go/blob/master/src/net/http/transport.go#L821

If this is reliable, it seems like it could be used as the basis for releasing a (per-host) semaphore that dialConn would need to acquire in order to create a new connection. As a start on a proposal for limiting total per-host connections, does this sound way off?
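For what it's worth, the "release when the underlying connection closes" idea can be approximated from outside the Transport with a custom DialContext; a rough sketch under that assumption (all type and field names here are made up for illustration, not Transport internals):

// Application-level approximation of the idea above: cap the number of
// live connections per host:port by acquiring a token in DialContext and
// releasing it when the returned net.Conn is closed.
package main

import (
    "context"
    "net"
    "net/http"
    "sync"
)

type limitedDialer struct {
    mu      sync.Mutex
    perHost map[string]chan struct{}
    limit   int
    dialer  net.Dialer
}

func (d *limitedDialer) sem(addr string) chan struct{} {
    d.mu.Lock()
    defer d.mu.Unlock()
    if d.perHost == nil {
        d.perHost = make(map[string]chan struct{})
    }
    s, ok := d.perHost[addr]
    if !ok {
        s = make(chan struct{}, d.limit)
        d.perHost[addr] = s
    }
    return s
}

func (d *limitedDialer) DialContext(ctx context.Context, network, addr string) (net.Conn, error) {
    sem := d.sem(addr)
    select {
    case sem <- struct{}{}: // acquire; blocks once `limit` conns to addr exist
    case <-ctx.Done():
        return nil, ctx.Err()
    }
    c, err := d.dialer.DialContext(ctx, network, addr)
    if err != nil {
        <-sem // release on failed dial
        return nil, err
    }
    return &limitedConn{Conn: c, sem: sem}, nil
}

type limitedConn struct {
    net.Conn
    sem  chan struct{}
    once sync.Once
}

func (c *limitedConn) Close() error {
    c.once.Do(func() { <-c.sem }) // release when the underlying conn closes
    return c.Conn.Close()
}

func main() {
    d := &limitedDialer{limit: 4}
    client := &http.Client{Transport: &http.Transport{DialContext: d.DialContext}}
    _ = client // use as usual; at most 4 connections per host:port
}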
Related CL: https://go-review.googlesource.com/#/c/20485/
Sorry, didn't happen for Go 1.8.
I think I'm running into the same issue with a proxy. lsof is showing hundreds of "ESTABLISHED" connections even when a response has been sent back to the client:
https://github.com/alexellis/faas/blob/master/watchdog/main.go

Can anyone confirm? @bradfitz
@alexellis Don't you just need to add an r.Body.Close() to close the io.ReadCloser? Sent you a PR: https://github.com/alexellis/faas/pull/13
@tleyden From the Go http docs:
https://golang.org/pkg/net/http/#Request
// The Server will close the request body. The ServeHTTP
// Handler does not need to.
Body io.ReadCloser
@tleyden @rogchap I added a Close call to both ends (client and server were both written in Go), which fixed the issue.
@alexellis Yup, you have to call Close on the client. Glad you got it working.
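To spell out the client-side pattern being described (a general net/http note, not the code from the PR above): the response body needs to be drained and closed, otherwise each request can leave an established connection behind that never returns to the idle pool.

package main

import (
    "io"
    "io/ioutil"
    "log"
    "net/http"
)

func main() {
    resp, err := http.Get("http://localhost:8090/")
    if err != nil {
        log.Fatal(err)
    }
    // Drain to EOF, then close: both are needed for the Transport to put
    // the connection back in its idle pool rather than leaving it open.
    defer resp.Body.Close()
    if _, err := io.Copy(ioutil.Discard, resp.Body); err != nil {
        log.Print(err)
    }
}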
Should I file a separate issue, or is this the place to mention it? I have code that does:

ht := http.DefaultTransport.(*http.Transport)
ht.MaxIdleConns = numConnections
ht.MaxIdleConnsPerHost = numConnections

and then establishes numConnections simultaneous (one goroutine each) back-to-back HTTP connections (it's a load-testing tool).

If I check the number of sockets, it works as expected for smaller numbers (between 1 and 9 or 10): I get numConnections established sockets for numConnections input. But for 11 and up I end up with more sockets than the limit:
# with up to 10 it works
$ fortio -qps 0 -c 10 http://localhost:8080
$ netstat -na |awk ' $4 == "::1.8080" { print $0}'|wc -l
10
# with 16 there are more sockets:
$ echo `netstat -na |awk ' $4 == "::1.8080" { print $5}'| sort`
::1.56603 ::1.56604 ::1.56605 ::1.56606 ::1.56607 ::1.56608 ::1.56609 ::1.56610 ::1.56611 ::1.56612 ::1.56613 ::1.56615 ::1.56616 ::1.56617 ::1.56618 ::1.56619 ::1.56621 ::1.56622 ::1.56623 ::1.56624
$ netstat -na |awk ' $4 == "::1.8080" { print $0}'|wc -l
20
This happens on both go version go1.8.3 darwin/amd64 and go version go1.8.1 linux/amd64.

The full code to repro is go get istio.io/istio/devel/fortio/cmd/fortio (branch ldemailly-bench6).

Besides doing my own connection pooling, is there a way to guarantee a maximum/exact number of sockets used? (And why do I get a few more than I ask for?)
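One way to see where the extra sockets come from (a sketch, not part of fortio; the concurrency and URL below just mirror the example above): instrument the Transport's DialContext with a counter. MaxIdleConns and MaxIdleConnsPerHost only bound the idle pool, so the Transport still dials whenever a request arrives and no connection happens to be idle at that instant, which is where the handful of extra sockets appears.

// Sketch: count how many times the Transport actually dials the backend.
// The idle limits cap the pool, not the number of dials, so the dial
// count can exceed them when requests race the return of idle conns.
package main

import (
    "context"
    "io"
    "io/ioutil"
    "log"
    "net"
    "net/http"
    "sync"
    "sync/atomic"
)

func main() {
    const numConnections = 16 // the concurrency from the netstat example above

    var dials int64
    base := &net.Dialer{}
    tr := &http.Transport{
        MaxIdleConns:        numConnections,
        MaxIdleConnsPerHost: numConnections,
        DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
            atomic.AddInt64(&dials, 1) // count real dials, not reuses
            return base.DialContext(ctx, network, addr)
        },
    }
    client := &http.Client{Transport: tr}

    var wg sync.WaitGroup
    for i := 0; i < numConnections; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < 1000; j++ {
                resp, err := client.Get("http://localhost:8080/")
                if err != nil {
                    continue
                }
                io.Copy(ioutil.Discard, resp.Body)
                resp.Body.Close()
            }
        }()
    }
    wg.Wait()
    log.Printf("dialed %d times for %d workers", dials, numConnections)
}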
Given that this has been like this since Go 1.2 and still isn't fixed, it seems likely we should just close it. But moving to Go 1.11 at least.
I don't want to close this. There are a number of related bugs I'd like to all address.
Looks like Transport.MaxConnsPerHost (new in Go 1.11) addresses this.
Agree that Transport.MaxConnsPerHost fixes this.
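For anyone landing here later, a sketch of the Go 1.11+ knob (values are only examples): Transport.MaxConnsPerHost caps the total number of connections per host, counting dialing, active, and idle connections, and further requests wait for a free connection instead of dialing.

// Sketch of the Go 1.11+ fix for this issue, applied to a reverse proxy.
package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
)

func main() {
    backend, err := url.Parse("http://127.0.0.1:8090")
    if err != nil {
        log.Fatal(err)
    }
    proxy := httputil.NewSingleHostReverseProxy(backend)
    proxy.Transport = &http.Transport{
        MaxConnsPerHost:     64, // hard cap per backend host (dialing + active + idle)
        MaxIdleConnsPerHost: 64, // keep them all reusable when idle
    }
    log.Fatal(http.ListenAndServe(":8080", proxy))
}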