$ go version
go version go1.10.3 linux/amd64
$ go env | grep -v GOPATH
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/davrodpin/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build822424801=/tmp/go-build -gno-record-gcc-switches"
I was trying to write a program that sends HTTP requests to a web server running on the same Linux machine but in a different Linux namespace, using the HTTP client provided by the net/http package. Every request failed with the message Get http://127.0.0.1:8080: dial tcp 127.0.0.1:8080: connect: connection refused.
The code to reproduce the issue is published as a gist. Link below:
https://gist.github.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692
A few preparatory steps must be executed before running the code in order to create a new Linux namespace, so here is a step-by-step process to fully reproduce the bug:
mkdir golang-http-client-bug && cd golang-http-client-bug
curl -O https://gist.githubusercontent.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692/raw/140a40e17e6dfabd8be5d8dafbb5c49da2330420/client.go
curl -O https://gist.githubusercontent.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692/raw/140a40e17e6dfabd8be5d8dafbb5c49da2330420/server.go
ip netns add gobug
ip netns exec gobug ifconfig lo 127.0.0.1 netmask 255.0.0.0 up
go get github.com/vishvananda/netns && go build server.go && go build client.go
ip netns exec gobug ./server
./client
The output that you should see is:
switching linux namespace to 'gobug'
request to http server using exec.Command(curl) returned with success: Hello, "/foo"
request to http server using net.Dial returned with success: HTTP/1.0 200 OK
error while sending http request using http.Client: Get http://127.0.0.1:8080: dial tcp 127.0.0.1:8080: connect: connection refused
switching linux namespace to previous one
The client first switches to the new Linux namespace, gobug, using an open source library, https://github.com/vishvananda/netns, and then tries to perform an HTTP request to a web server listening on 127.0.0.1:8080 in three different ways: using curl, using net.Dial directly, and using http.Client.
The first two methods (curl and net.Dial) work, which means they can reach the server running in the gobug namespace, but the third fails.
I suspect this is related to the goroutines created by http.Transport (links to source code below) to manage the connections, since they may run on OS threads that are still in the default Linux namespace instead of the gobug namespace.
This behavior is explained here: https://golang.org/doc/go1.10#runtime
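To illustrate the suspicion, here is a minimal, Linux-only sketch (the helper name threadIDs is made up for this example). It locks the calling goroutine to its OS thread, starts a second goroutine, and compares thread IDs via syscall.Gettid. Since a locked thread is reserved for its goroutine, the second goroutine should end up on a different thread, which is also what would happen to a dial started by another goroutine.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
)

// threadIDs locks the calling goroutine to its OS thread, starts a new
// goroutine, and returns the OS thread ID seen by each of them.
// (Hypothetical helper, not taken from the gist.)
func threadIDs() (parent, child int) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()

	parent = syscall.Gettid()

	done := make(chan int)
	go func() {
		// The caller's thread is reserved for the locked goroutine,
		// so this goroutine has to run on some other thread.
		done <- syscall.Gettid()
	}()
	child = <-done
	return parent, child
}

func main() {
	parent, child := threadIDs()
	fmt.Printf("caller thread: %d, spawned goroutine thread: %d\n", parent, child)
}
```

Any per-thread state changed on the caller's thread, such as the network namespace, is therefore not visible to work that runs on the spawned goroutine.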
Links to http.Transport source code that creates goroutines:
I was expecting the HTTP request, GET http://127.0.0.1:8080/foo, to succeed.
Instead I got the error message below:
Get http://127.0.0.1:8080: dial tcp 127.0.0.1:8080: connect: connection refused
Your problem is indeed that Go is dialing from another goroutine, which will run on a different thread. What you can do to fix this is provide a custom dialer that locks the thread and sets the namespace right before the dial.
I have not tested this code as I am not able to at the moment:
func main() {
	// ...
	if err = RequestUsingHttpClient(func() {
		// Setup: switch the current OS thread into the target namespace.
		err := netns.Set(ns)
		if err != nil {
			panic(fmt.Sprintf("can't switch to linux namespace 'gobug': %v", err))
		}
	}, func() {
		// Teardown: switch back to the original namespace.
		netns.Set(origin)
	}); err != nil {
		fmt.Printf("%v\n", err)
	}
	// ...
}

func RequestUsingHttpClient(setup, teardown func()) error {
	defer teardown()
	defer runtime.UnlockOSThread()
	c := http.Client{
		Transport: &http.Transport{
			Proxy: http.ProxyFromEnvironment,
			// The transport may dial from its own goroutine, so pin that
			// goroutine to its OS thread and switch the thread's namespace
			// right before dialing.
			DialContext: func(ctx context.Context, network, address string) (net.Conn, error) {
				runtime.LockOSThread()
				setup()
				return net.Dialer{
					Timeout:   30 * time.Second,
					KeepAlive: 30 * time.Second,
					DualStack: true,
				}.DialContext(ctx, network, address)
			},
			MaxIdleConns:          100,
			IdleConnTimeout:       90 * time.Second,
			TLSHandshakeTimeout:   10 * time.Second,
			ExpectContinueTimeout: 1 * time.Second,
		},
	}
	if _, err := c.Get(fmt.Sprintf("http://%s", serverAddress)); err != nil {
		return fmt.Errorf("error while sending http request using http.Client: %v", err)
	}
	fmt.Println("http.Client is working as expected")
	return nil
}
Hi @erikdubbelboer,
Thank you very much for providing the code snippet. It works!
I have updated the gist with the working version of your code: https://gist.github.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692#file-client_custom_dialer-go
That solves my specific issue with the http client, but I believe there is a broader problem to be solved when your program has a component that relies on concurrency (it spawns goroutines) and you want to make sure all of them will be executed on a given linux namespace.
I was wondering if we could have a way to pass some sort of context to a goroutine to give hints to the scheduler on how a goroutine should be executed, which could include a given OS thread that was previously changed to run on a specific linux namespace.
Pseudo golang code below:
osThread := runtime.GetCurrentOSThread()
// code to change `osThread` to a different linux namespace
hints := SchedulerHints(Hints{
	"osThread": osThread,
})
go(hints) func() {
	// go scheduler uses the given hints to schedule the goroutine
	// do something
	go(hints) func() {
		// do something else
	}()
}()
To be honest I don't see anything like that ever being added to Go. The use case is too specific.
If all the operations that require a specific thread are fast you could also run all of them on the main thread. To do this you can use this library: https://github.com/faiface/mainthread so you do your netns.Set in main and do Dial in a mainthread.Call(func() { ... }) closure.
If you need multiple threads you could even expand on the idea of this library and make a work queue per thread. Spawning new threads would be spawning a Goroutine that calls runtime.LockOSThread() and then waits for work to be done on that thread. Using this you could in theory have different threads with different namespaces that you can dispatch work to.
Agreed. Use case might be too specific and there is a current solution for making http requests on another linux namespace by providing a custom Dialer. Maybe this is not a real issue at all.
All very valuable thoughts, @erikdubbelboer. Thanks for sharing your ideas. It will help me a lot with what I am currently working on.
Yeah, sorry, we're not going to modify the standard library to accommodate different OS threads being in different namespaces. If you need to do that, do it early in init before goroutines are created (or from a parent process) so all your threads (and thus goroutines) are running in a consistent environment.
I'm just leaving this here in case it could be useful for someone. I had a similar issue, but using the library inside a container and calling another container on a different port. I replaced 127.0.0.1 with localhost and it worked.
@judavi - That helped me for my case. I really appreciate you sharing! 🙇