go version)?$ go version
go version go1.10.1 linux/amd64
yes
go env)?GOARCH="amd64"
GOBIN=""
GOCACHE="/home/test/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GORACE=""
GOROOT="/home/test/go"
GOTMPDIR=""
GOTOOLDIR="/home/test/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build614182611=/tmp/go-build -gno-record-gcc-switches"
and nginx version: nginx/1.13.3
Here is my client:
package main
import (
"fmt"
"net/http"
"sync"
"time"
"errors"
"bytes"
"io/ioutil"
)
type Client struct {
urlBase string
httpClient *http.Client
}
func NewClient(url string) *Client {
cc := &Client{
urlBase: url,
httpClient: &http.Client{Transport: &http.Transport{MaxIdleConnsPerHost: 100, MaxIdleConns: 100, IdleConnTimeout: 90 * time.Second, ExpectContinueTimeout: 5 * time.Second}},
}
cc.httpClient.Timeout = 5 * time.Second
return cc
}
func (self *Client) Send(content []byte) error {
if content == nil {
return errors.New("nil content")
}
body := &bytes.Buffer{}
body.Write(content)
begin := time.Now()
req, err := http.NewRequest("POST", self.urlBase, body)
if err != nil {
return fmt.Errorf("Send %+v (from %v to %v)", err, begin, time.Now())
}
req.Header = make(http.Header)
req.Header.Set("Content-Type", "application/json")
begin = time.Now()
resp, err := self.httpClient.Do(req)
if err != nil {
return fmt.Errorf("Send %+v (from %v to %v)", err, begin, time.Now())
}
defer resp.Body.Close()
if resp.StatusCode >= http.StatusBadRequest {
bs, _ := ioutil.ReadAll(resp.Body)
return fmt.Errorf("Send : HTTP : %d , BODY : %s , REQ : %s", resp.StatusCode, string(bs), string(content))
}
bs, err := ioutil.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("Send : ERR : %v", err)
}
fmt.Println("Send : Resp : ", string(bs))
return nil
}
func main() {
client := NewClient(
"https://test.cn", // I use /etc/hosts to bind the domain
)
var content = []byte(`{"test":1}`)
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
defer wg.Done()
client.Send(content)
}()
}
wg.Wait()
}
And here is my nginx configure of http2:
http2_max_concurrent_streams 40960;
http2_max_requests 3;
http2_max_field_size 8k;
http2_max_header_size 32k;
http2_recv_buffer_size 1024k;
http2_recv_timeout 120s;
I expect all requests are successfully.
But I got 400 occasionally and found client prematurely closed stream: only 0 out of 10 bytes of request body received line in nginx log.
I used GODEBUG to try to figure out, and found data not sent after END_HEADERS:
2018/04/23 18:32:25 http2: Transport encoding header ":method" = "POST"
2018/04/23 18:32:25 http2: Transport encoding header ":path" = "/"
2018/04/23 18:32:25 http2: Transport encoding header ":scheme" = "https"
2018/04/23 18:32:25 http2: Transport encoding header "content-type" = "application/json"
2018/04/23 18:32:25 http2: Transport encoding header "content-length" = "10"
2018/04/23 18:32:25 http2: Transport encoding header "accept-encoding" = "gzip"
2018/04/23 18:32:25 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2018/04/23 18:32:25 http2: Framer 0xc4205480e0: wrote HEADERS flags=END_HEADERS stream=5 len=8
2018/04/23 18:32:25 http2: Framer 0xc4205480e0: wrote DATA flags=END_STREAM stream=5 len=0 data=""
That's the reason why nginx responsed 400 directly.
Does this happen only with http2 ?
Is it possible to for you to write a quick http2 server in Go and make this entire issue to be self-contained and fully reproducible ?
@agnivade Yes, it's fine with http1.1. Here is a quite simple server:
package main
import (
"fmt"
"log"
"net/http"
)
func myHandler(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello there!\n")
}
func main() {
http.HandleFunc("/", myHandler)
log.Fatal(http.ListenAndServe(":18800", nil))
}
And nginx is needed.
And nginx is needed.
I don't understand. The whole point of having a Go server was to have the repro self-contained. If this is a client issue, why do we need nginx ? Is the Go client-server not enough to cause the repro ?
@agnivade Nginx is needed because it will check the body length against the header content-length. That's why nginx reponse 400 Bad Request directly. I don't think this is nginx issue because according to the tcpdump and GODEBUG, no body is sent indeed
And please note that in order to make this issue reproducible, http2_max_requests is set to 3 for nginx.
The point of having a Go server is to allow us to reproduce the issue without nginx. You can simply add the equivalent check in the Go server so that it's behavior is same as nginx.
Thank you @ictar for the report, we'll investigate it.
@agnivade thank you for handling this. I just wanted to add that I can relate to @ictar's need to have nginx for a repro(many companies that I know heavily use it) and that the question of absolving the issue only because Go's servers can't reproduce it might be tricky. This is because many companies will usually put nginx or some other web server as their frontend to their web apps, and from there proxy the content to polyglot services on their backend e.g Go, Node.js, Python(etc) web server. Interoperability between protocols and servers matters a lot and discrepancies between Go clients talking to various servers creep in :)
Yes I understand. I was trying to isolate the issue.
I did not mean to say to try to reproduce the issue with a plain Go HTTP server. Rather I meant to write equivalent code in Go to mimic nginx behavior so that we hit the bug. If that is hard to do, then we obviously need nginx. But if a simple check of verifying the content-length header and the body shows the issue, then we theoretically don't need nginx. Kind of like the test case that one will write if this indeed is a bug.
Seems like just verifying content-length to body length does not show the issue and we need nginx for that.
Also does not reproduce if http.Client is used just by single goroutine (in a non-concurrent way).
/cc @bradfitz , @tombergan
@ictar I just got sometime this evening to try to reproduce this:
ssg --host localhostuser www-data;
worker_processes auto;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;
events {
worker_connections 768;
# multi_accept on;
}
http {
http2_max_concurrent_streams 40960;
http2_max_requests 3;
http2_max_field_size 8k;
http2_max_header_size 32k;
http2_recv_buffer_size 1024k;
http2_recv_timeout 120s;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
include /etc/nginx/mime.types;
default_type application/octet-stream;
##
# SSL Settings
##
ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
ssl_prefer_server_ciphers on;
##
# Logging Settings
##
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;
##
# Gzip Settings
##
gzip on;
# gzip_vary on;
# gzip_proxied any;
# gzip_comp_level 6;
# gzip_buffers 16 8k;
# gzip_http_version 1.1;
# gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
##
# Virtual Host Configs
##
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
server {
listen 9889;
ssl_certificate /home/emmanuel/Desktop/opensrc/bugs/golang/24763/cert.pem;
ssl_certificate_key /home/emmanuel/Desktop/opensrc/bugs/golang/24763/key.pem;
ssl on;
location / {
proxy_pass https://127.0.0.1:9789;
}
}
}
and I ran it 50 times in a row for ((i=0; i<50; i++)) do go run client.go >> output;done
and even with curl as a control
$ curl -i -k -X POST https://localhost:9889
HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Fri, 15 Jun 2018 06:45:44 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 12
Connection: keep-alive
Hello world!
$ curl -i -k -X POST https://localhost:9889
HTTP/1.1 200 OK
Server: nginx/1.14.0 (Ubuntu)
Date: Fri, 15 Jun 2018 06:45:46 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 12
Connection: keep-alive
Hello world!
but no failures and the client and server look like this
package main
import (
"bytes"
"crypto/tls"
"errors"
"fmt"
"io/ioutil"
"net/http"
"sync"
"time"
)
type client struct {
base string
hc *http.Client
}
func newClient(baseURL string) *client {
return &client{
base: baseURL,
hc: &http.Client{
Timeout: 5 * time.Second,
Transport: &http.Transport{
MaxIdleConnsPerHost: 100,
MaxIdleConns: 100,
IdleConnTimeout: 90 * time.Second,
ExpectContinueTimeout: 5 * time.Second,
TLSClientConfig: &tls.Config{
InsecureSkipVerify: true,
},
},
},
}
}
func (c *client) send(content []byte) error {
if content == nil {
return errors.New("nil content")
}
body := new(bytes.Buffer)
body.Write(content)
start := time.Now()
req, err := http.NewRequest("POST", c.base, body)
if err != nil {
return fmt.Errorf("Send %+v (from %s)", err, time.Since(start))
}
req.Header = make(http.Header)
req.Header.Set("Content-Type", "application/json")
resp, err := c.hc.Do(req)
if err != nil {
return fmt.Errorf("Send: HTTPClient.Do: %+v (from %s)", err, time.Since(start))
}
defer resp.Body.Close()
blob, err := ioutil.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("Send: Read: %+v (from %s)", err, time.Since(start))
}
fmt.Printf("Send: Resp: %s\n", blob)
return nil
}
func main() {
client := newClient("https://localhost:9889/")
content := []byte(`{"test":1}`)
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func(i int) {
defer wg.Done()
if err := client.send(content); err != nil {
fmt.Printf("#%d: err: %v\n", i, err)
}
}(i)
}
wg.Wait()
}
package main
import (
"io/ioutil"
"log"
"net/http"
)
func main() {
addr := ":9789"
log.Printf("Serving at: %s", addr)
http.HandleFunc("/", helloWorld)
if err := http.ListenAndServeTLS(addr, "cert.pem", "key.pem", nil); err != nil {
log.Fatalf("Failed to serve: %v", err)
}
}
func helloWorld(w http.ResponseWriter, r *http.Request) {
_, _ = ioutil.ReadAll(r.Body)
defer r.Body.Close()
w.Write([]byte("Hello world!"))
}
@ictar with the setup that I've provided above, could you please try it locally too and see if this can reproduce the bug? Thank you.
@odeke-em
Can you try a http server instead of https?
Also, you can take a look at #25652 where there are client and server code.
It is likely that 50 requests are not enough.
With a minor edit to client.go to ensure we've configured the transport for HTTP2 with
--- old.go 2018-06-14 23:55:32.030299882 -0700
+++ client.go 2018-06-14 23:55:10.526187144 -0700
@@ -9,6 +9,8 @@
"net/http"
"sync"
"time"
+
+ "golang.org/x/net/http2"
)
type client struct {
@@ -17,19 +19,22 @@
}
func newClient(baseURL string) *client {
+ tr := &http.Transport{
+ MaxIdleConnsPerHost: 100,
+ MaxIdleConns: 100,
+ IdleConnTimeout: 90 * time.Second,
+ ExpectContinueTimeout: 5 * time.Second,
+ TLSClientConfig: &tls.Config{
+ InsecureSkipVerify: true,
+ },
+ }
+
+ http2.ConfigureTransport(tr)
return &client{
base: baseURL,
hc: &http.Client{
- Timeout: 5 * time.Second,
- Transport: &http.Transport{
- MaxIdleConnsPerHost: 100,
- MaxIdleConns: 100,
- IdleConnTimeout: 90 * time.Second,
- ExpectContinueTimeout: 5 * time.Second,
- TLSClientConfig: &tls.Config{
- InsecureSkipVerify: true,
- },
- },
+ Timeout: 5 * time.Second,
+ Transport: tr,
},
}
}
@ictar would you mind sharing your nginx.conf adapted from the one I posted above?
*rather @ernado
@odeke-em
I will try to create something like docker-compose so we can have fully reproducible environment.
@ernado roger that
It is likely that 50 requests are not enough.
@ernado I used 100 requests and ran it 50 times, so 5000 requests. I've done 4 concurrent for ((i=0; i<1000; i++)) do go run client.go;done so 4 * 1000 * 100 = 400,000 requests
@odeke-em
I've created an example via docker-compose.
Please take a look at https://github.com/ernado/go-issue-25009
Awesome, thank you @ernado for the repro via docker-compose

I am going to update the title a little though as this seems to happen a few times and not always
and I'll defer for some help from @bradfitz @rs @tombergan
The http2_max_requests nginx directive affects the error rate. If I set it high enough (more than total requests is done by the client) there are no errors.
Somehow relates to #18112.
I've updated my go-issue-25009 with test.sh so one can play with nginx config or client/server and quickly do the repro.
Yeah, I wondered and noticed that too that the max_requests if high enough, depreciates the error rate.
Interesting point: if go1.7 is used, I'm getting an error on http request:
http2: Transport: cannot retry err [http2: Transport received Server's graceful shutdown GOAWAY] after Request.Body was written; define Request.GetBody to avoid this error
(tried setting request.GetBody with no effect).
And with 1.8 I get http2: no cached connection was available.
Maybe that could help.
With such a low max_request, you have a high chance of getting a GOAWAY frame from the server before the body is sent. The h2 client will decide that that the request can be retried, but because it is a POST, the client needs a way to get a new version of the body buffer. To let the client retry POST you need to define Request.GetBody on your client as describe in the error message. I'm not sure why it would not work if defined though.
The h2 retry logic is here for reference: https://github.com/golang/net/blob/master/http2/transport.go#L422
@rs , I'm passing *bytes.Reader to http.NewRequest which is already setting req.GetBody: https://github.com/golang/go/blob/master/src/net/http/request.go#L816
Also if we use separate http.Client for each goroutine (so we don't call it concurrently) there are no errors.
I've updated my repo so you can try it via environment variables. E.g.
$ CONCURRENT=0 ./test.sh
# works fine
$ CONCURRENT=1 ./test.sh
# same 400 Bad Request
Also you can set BODY=1 to setreq.GetBody explicitly (no effect) and pass REQUESTS and JOBS count.
Something to consider is the way h2 is integrated with h1 in the std lib to handle upgrades (I'm not used to this setup as I mostly use h2 lib directly). In the configureTransport we set the h2 transport with a special version of the pool named noDialClientConnPool which is not able to dial on call to GetClientConn. Instead, new connections are only added to the pool on upgrade, by promoting an existing TLS connection created by the h1 Transport.
I'm not sure what happen in the case this connection is becoming invalid while the request is being sent. As seen, h2 may decide to retry, but if the pool does not contain a valid connection anymore and is not able to dial, you will get the http2: no cached connection was available you are seeing. I guess the proper way to retry would be to go back to h1 and go over the full upgrade process all over again.
So basically, without GetBody set on request, the request can't be retried and when set, it can't either because it can't reconnect, giving a different error. Nginx might send 400 back because Go decides to not send the DATA frames after the HEADERS because it received a GOAWAY frame in the meantime and failed to properly cancel the stream. Note that I did not verify those assumptions, I'm just interpreting what I'm reading in the code.
Please pay attention that http2: no cached connection was available is only on go1.8, there are no such errors on latest versions, just 400 code from nginx. Sorry for possible misguidance.
Also GetBody is set in all (my) cases,NewRequest handles it implicitly.
This error is now logged with vlogf. If you turn on h2 debugging, you should see them: GODEBUG=http2debug=1.
I tried your test by changing the transport of the client you create in newClient with a http2.Transport directly in order to remove the h1->h2 upgrade logic. With this configuration, the problem goes away.
I've updated my test and added HTTP2_TRANSPORT env var support.
Can confirm that with HTTP2_TRANSPORT=1 the problem goes away.
Adding the output of following command:
$ REQUESTS=10 HTTP2_TRANSPORT=0 GODEBUG=http2debug=1 ./test.sh
https://gist.github.com/ernado/2bc6a64764ce6ee6e05960967ed999a0
Some output from it:
Transport failed to get client conn for nginx:443: http2: no cached connection was available
Transport received GOAWAY len=8 LastStreamID=9 ErrCode=NO_ERROR Debug=""
Transport readFrame error on conn 0xc42038c000: (*errors.errorString) EOF
The most interesting is that with JOBS=1 (only 1 goroutine) there are no errors.
Some race in retry code?
I think you need the concurrency to get the GOAWAY sent by the server while another request is being posted by the client. Otherwise you have a clean 3 requests THEN GOWAY / conn recycle then 3 other requests on a new connection and so on.
Seems like req.Body can be read during cc.roundTrip(req) call here:
https://github.com/golang/net/blob/master/http2/transport.go#L378
And then is never reset.
I've tried to add following before the call:
body, err := req.GetBody()
if err != nil {
return nil, err
}
req.Body = body
And the problem goes away.
Not sure about the cause of such behavior.
Any additional info needed?
@ernado, I think you're close, but that's not exactly it. That GetBody code is already being done in shouldRetryRequest.
I think the problem is that the gotErrAfterReqBodyWrite value is wrong, or at least the wrong thing to track.
We don't care whether it's been written as much as whether it's been read.
Change https://golang.org/cl/123476 mentions this issue: http2: use GetBody unconditionally on Transport retry, when available
@ernado, I mailed out https://go-review.googlesource.com/c/net/+/123476 for Go 1.11 but I'd still like to understand more why the old code was wrong.
If you're able to reproduce this easily, can you run the old code (without my patch) under the race detector and see if you see any reported data races?
Thanks.
@bradfitz, on go1.10.3 it reports the following:
==================
WARNING: DATA RACE
Read at 0x00c42009f278 by goroutine 98:
bytes.(*Reader).Read()
/usr/local/go/src/bytes/reader.go:40 +0x5c
io/ioutil.(*nopCloser).Read()
<autogenerated>:1 +0x7d
net/http.(*http2clientStream).writeRequestBody()
/usr/local/go/src/net/http/h2_bundle.go:7598 +0x5ff
net/http.(*http2Transport).getBodyWriterState.func1()
/usr/local/go/src/net/http/h2_bundle.go:8794 +0x132
Previous write at 0x00c42009f278 by goroutine 96:
bytes.(*Reader).Read()
/usr/local/go/src/bytes/reader.go:45 +0x160
io/ioutil.(*nopCloser).Read()
<autogenerated>:1 +0x7d
net/http.(*http2clientStream).writeRequestBody()
/usr/local/go/src/net/http/h2_bundle.go:7598 +0x5ff
net/http.(*http2Transport).getBodyWriterState.func1()
/usr/local/go/src/net/http/h2_bundle.go:8794 +0x132
Goroutine 98 (running) created at:
net/http.http2bodyWriterState.scheduleBodyWrite()
/usr/local/go/src/net/http/h2_bundle.go:8841 +0xb2
net/http.(*http2ClientConn).roundTrip()
/usr/local/go/src/net/http/h2_bundle.go:7403 +0x77a
net/http.(*http2Transport).RoundTripOpt()
/usr/local/go/src/net/http/h2_bundle.go:6929 +0x3ba
net/http.(*http2Transport).RoundTrip()
/usr/local/go/src/net/http/h2_bundle.go:6891 +0x4b
net/http.(*Transport).RoundTrip()
/usr/local/go/src/net/http/transport.go:420 +0x986
net/http.send()
/usr/local/go/src/net/http/client.go:252 +0x352
net/http.(*Client).send()
/usr/local/go/src/net/http/client.go:176 +0x1b0
net/http.(*Client).Do()
/usr/local/go/src/net/http/client.go:615 +0x4dd
main.startClient.func1()
/go/src/github.com/ernado/go-issue-25009/main.go:108 +0x666
Goroutine 96 (finished) created at:
net/http.http2bodyWriterState.scheduleBodyWrite()
/usr/local/go/src/net/http/h2_bundle.go:8841 +0xb2
net/http.(*http2ClientConn).roundTrip()
/usr/local/go/src/net/http/h2_bundle.go:7403 +0x77a
net/http.(*http2Transport).RoundTripOpt()
/usr/local/go/src/net/http/h2_bundle.go:6929 +0x3ba
net/http.(*http2Transport).RoundTrip()
/usr/local/go/src/net/http/h2_bundle.go:6891 +0x4b
net/http.(*Transport).RoundTrip()
/usr/local/go/src/net/http/transport.go:420 +0x986
net/http.send()
/usr/local/go/src/net/http/client.go:252 +0x352
net/http.(*Client).send()
/usr/local/go/src/net/http/client.go:176 +0x1b0
net/http.(*Client).Do()
/usr/local/go/src/net/http/client.go:615 +0x4dd
main.startClient.func1()
/go/src/github.com/ernado/go-issue-25009/main.go:108 +0x666
==================
However I can't trigger that warning on cc422e6 (current tip), while behaviour is staying the same (test failing).
You can do the repro with my https://github.com/ernado/go-issue-25009, now it allows to run custom binary, e.g. go build -race . in project root and run test-bin.sh (or only docker-compose -f docker-compose.bin.yml up to just play with it)
Keeping this bug open to investigate the race in the code in Go 1.10 and before https://github.com/golang/net/commit/cffdcf672aee934982473246bc7e9a8ba446aa9b.
Change https://golang.org/cl/123515 mentions this issue: net/http: update bundled http2
@bradfitz I'm still getting the "400" codes even when GetBody is utilized on go1.11.
I've debugged it once more and now I'm 100% sure that the request body is not getting rewinded by HTTP/2 as expected here:
https://github.com/golang/go/blob/e03220a594a1a4b7fa8c901eebddb9ea11ecbece/src/net/http/transport.go#L480-L483
So we are ending up with non-rewinded body with effective length of zero here:
https://github.com/golang/go/blob/e03220a594a1a4b7fa8c901eebddb9ea11ecbece/src/net/http/transport.go#L460-L463
However, we still have non-zero content length in headers, which results in 400 error from nginx.
I will work on easier ways to reproduce that issue.
Some logs (the most interesting part is trasnport: calling alt.RoundTrip which is tracing I've added before the 463 line of transport.go):
Click to view logs
2018/08/25 07:56:35 http2: RoundTripOpt bytes.Reader length: 5 [/01/001]
2018/08/25 07:56:35 http2: Transport failed to get client conn for go-issue-25009:443: http2: no cached connection was available
2018/08/25 07:56:35 http2: RoundTripOpt bytes.Reader length: 5 [/00/002]
2018/08/25 07:56:35 http2: Transport failed to get client conn for go-issue-25009:443: http2: no cached connection was available
2018/08/25 07:56:35 http2: Transport creating client conn 0xc000108380 to 127.0.0.1:443
2018/08/25 07:56:35 trasnport: calling alt.RoundTrip with bytes.Reader length: 5 [/01/001]
2018/08/25 07:56:35 http2: RoundTripOpt bytes.Reader length: 5 [/01/001]
2018/08/25 07:56:35 trasnport: calling alt.RoundTrip with bytes.Reader length: 5 [/00/002]
2018/08/25 07:56:35 http2: roundTrip bytes.Reader length: 5 [/01/001]
2018/08/25 07:56:35 http2: RoundTripOpt bytes.Reader length: 5 [/00/002]
2018/08/25 07:56:35 http2: Transport encoding header ":path" = "/01/001"
2018/08/25 07:56:35 http2: Transport received SETTINGS len=18, settings: MAX_CONCURRENT_STREAMS=128, INITIAL_WINDOW_SIZE=65536, MAX_FRAME_SIZE=16777215
2018/08/25 07:56:35 http2: roundTrip bytes.Reader length: 5 [/00/002]
2018/08/25 07:56:35 http2: Client 0xc00027c000 writeRequestBody
2018/08/25 07:56:35 http2: Client 0xc00027c000 body.Read: 5, <nil>
2018/08/25 07:56:35 http2: Transport received WINDOW_UPDATE len=4 (conn) incr=2147418112
2018/08/25 07:56:35 http2: Transport received SETTINGS flags=ACK len=0
2018/08/25 07:56:35 http2: Transport encoding header ":path" = "/00/002"
2018/08/25 07:56:35 http2: Client 0xc00027c000 body.Read: 0, EOF
2018/08/25 07:56:35 http2: Transport received GOAWAY len=8 LastStreamID=1 ErrCode=NO_ERROR Debug=""
2018/08/25 07:56:35 processGoAway
2018/08/25 07:56:35 http2: Transport failed to get client conn for go-issue-25009:443: http2: no cached connection was available
2018/08/25 07:56:35 http2: Client 0xc00027c000 writeRequestBody
2018/08/25 07:56:35 http2: Client 0xc00027c000 body.Read: 5, <nil>
2018/08/25 07:56:35 http2: Transport received HEADERS flags=END_STREAM|END_HEADERS stream=1 len=40
2018/08/25 07:56:35 http2: Transport readFrame error on conn 0xc000108380: (*errors.errorString) EOF
2018/08/25 07:56:35 http2: Transport creating client conn 0xc000108fc0 to 127.0.0.1:443
2018/08/25 07:56:35 trasnport: calling alt.RoundTrip with bytes.Reader length: 0 [/00/002]
2018/08/25 07:56:35 http2: RoundTripOpt bytes.Reader length: 0 [/00/002]
2018/08/25 07:56:35 http2: roundTrip bytes.Reader length: 0 [/00/002]
2018/08/25 07:56:35 http2: Transport encoding header ":path" = "/00/002"
2018/08/25 07:56:35 http2: Client 0xc00027c460 writeRequestBody
2018/08/25 07:56:35 http2: Client 0xc00027c460 body.Read: 0, EOF
2018/08/25 07:56:35 http2: Transport received SETTINGS len=18, settings: MAX_CONCURRENT_STREAMS=128, INITIAL_WINDOW_SIZE=65536, MAX_FRAME_SIZE=16777215
2018/08/25 07:56:35 http2: Transport received WINDOW_UPDATE len=4 (conn) incr=2147418112
2018/08/25 07:56:35 http2: Transport received SETTINGS flags=ACK len=0
2018/08/25 07:56:35 http2: Transport received GOAWAY len=8 LastStreamID=1 ErrCode=NO_ERROR Debug=""
2018/08/25 07:56:35 processGoAway
2018/08/25 07:56:35 http2: Transport received HEADERS flags=END_HEADERS stream=1 len=51
2018/08/25 07:56:35 http2: Transport received DATA flags=END_STREAM stream=1 len=173 data="<html>\r\n<head><title>400 Bad Request</title></head>\r\n<body bgcolor=\"white\">\r\n<center><h1>400 Bad Request</h1></center>\r\n<hr><center>nginx/1.15.2</center>\r\n</body>\r\n</html>\r\n"
2018/08/25 07:56:35 http2: Transport readFrame error on conn 0xc000108fc0: (*errors.errorString) EOF
UPD:
Also everything works fine if I just change the condition
https://github.com/golang/go/blob/e03220a594a1a4b7fa8c901eebddb9ea11ecbece/src/net/http/transport.go#L482
To
if req.GetBody != nil {
I've investigated it further and I think that I finally understood what is going on.
So in HTTP/2 we are rewinding body here:
https://github.com/golang/go/blob/e03220a594a1a4b7fa8c901eebddb9ea11ecbece/src/net/http/h2_bundle.go#L7091-L7103
But we are not setting the Body field of original req, we are making a copy.
If we will do the following:
if getBody != nil {
body, err := getBody()
if err != nil {
return nil, err
}
req.Body = body
return req, nil
}
Everything works too, because the assumption "HTTP/2 does this itself" in transport.go is now true.
So from my point of view (I don't know the internals of http2 and http packages) we either can remove that false assumption and remove the pconn.alt == nil condition in transport.go or make that assumption true by setting the req.Body in http2shouldRetryRequest.
Technically we are rewinding the request in the RountTripOpt method scope, but we are returning here:
The connection is getting closed after GOAWAY and we can't get it from pool, so we are stopping the retry cycle in RoundTripOpt and returning to transport.roundTrip with http2: no cached connection was available, where the pconn.shouldRetryRequest(req, err) will be true because of http2isNoCachedConnError in it, but the req.Body was already read and not re-winded.
Btw I'm getting ALL TESTS PASSED on current tip after applying the following patch:
--- a/src/net/http/transport.go
+++ b/src/net/http/transport.go
@@ -477,9 +477,8 @@ func (t *Transport) roundTrip(req *Request) (*Response, error) {
}
testHookRoundTripRetried()
- // Rewind the body if we're able to. (HTTP/2 does this itself so we only
- // need to do it for HTTP/1.1 connections.)
- if req.GetBody != nil && pconn.alt == nil {
+ // Rewind the body if we're able to.
+ if req.GetBody != nil {
newReq := *req
var err error
newReq.Body, err = req.GetBody()
I've implemented a test that does not require nginx:
https://github.com/ernado/go-issue-25009/blob/9f8369b77105a097b36f583c37f66391b31f85d5/main_test.go#L85-L214
$ git clone https://github.com/ernado/go-issue-25009.git && cd go-issue-25009 && go test
[send] SETTINGS Frame (length:6, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:0, flags:0x01, stream_id:0)
[recv] WINDOW_UPDATE Frame (length:4, flags:0x00, stream_id:0)
[recv] HEADERS Frame (length:39, flags:0x04, stream_id:1)
[recv] DATA Frame (length:1, flags:0x00, stream_id:1)
[send] HEADERS Frame (length:4, flags:0x04, stream_id:1)
[send] DATA Frame (length:7, flags:0x01, stream_id:1)
[send] GOAWAY Frame (length:20, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:6, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:0, flags:0x01, stream_id:0)
[recv] WINDOW_UPDATE Frame (length:4, flags:0x00, stream_id:0)
[recv] HEADERS Frame (length:39, flags:0x04, stream_id:1)
[send] GOAWAY Frame (length:12, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:6, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:0, flags:0x01, stream_id:0)
[recv] WINDOW_UPDATE Frame (length:4, flags:0x00, stream_id:0)
[recv] HEADERS Frame (length:39, flags:0x04, stream_id:1)
[recv] DATA Frame (length:0, flags:0x01, stream_id:1)
[send] HEADERS Frame (length:1, flags:0x05, stream_id:1)
[send] DATA Frame (length:5, flags:0x01, stream_id:1)
[send] GOAWAY Frame (length:20, flags:0x00, stream_id:0)
--- FAIL: TestIssue25009 (0.05s)
main_test.go:119: server: read request: [header field ":authority" = "127.0.0.1:39629" header field ":method" = "POST" header field ":path" = "/" header field ":scheme" = "https" header field "content-length" = "1" header field "accept-encoding" = "gzip" header field "user-agent" = "Go-http-client/2.0"]
main_test.go:140: server: got length: 1
main_test.go:208: client: ok: 002
main_test.go:119: server: read request: [header field ":authority" = "127.0.0.1:39629" header field ":method" = "POST" header field ":path" = "/" header field ":scheme" = "https" header field "content-length" = "1" header field "accept-encoding" = "gzip" header field "user-agent" = "Go-http-client/2.0"]
main_test.go:119: server: read request: [header field ":authority" = "127.0.0.1:39629" header field ":method" = "POST" header field ":path" = "/" header field ":scheme" = "https" header field "content-length" = "1" header field "accept-encoding" = "gzip" header field "user-agent" = "Go-http-client/2.0"]
main_test.go:140: server: got length: 0
main_test.go:206: client: failed: code=400 001
FAIL
exit status 1
FAIL github.com/ernado/go-issue-25009 0.057s
So, as mentioned earlier, we are getting zero length data frame:
[recv] DATA Frame (length:0, flags:0x01, stream_id:1)
If I remove the pconn.alt == nil condition, I'm getting the following:
go test -v
=== RUN TestIssue25009
[send] SETTINGS Frame (length:6, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:0, flags:0x01, stream_id:0)
[recv] WINDOW_UPDATE Frame (length:4, flags:0x00, stream_id:0)
[recv] HEADERS Frame (length:39, flags:0x04, stream_id:1)
[recv] DATA Frame (length:1, flags:0x00, stream_id:1)
[send] HEADERS Frame (length:4, flags:0x04, stream_id:1)
[send] DATA Frame (length:7, flags:0x01, stream_id:1)
[send] GOAWAY Frame (length:20, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:6, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:0, flags:0x01, stream_id:0)
[recv] WINDOW_UPDATE Frame (length:4, flags:0x00, stream_id:0)
[recv] HEADERS Frame (length:39, flags:0x04, stream_id:1)
[send] GOAWAY Frame (length:12, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:6, flags:0x00, stream_id:0)
[send] SETTINGS Frame (length:0, flags:0x01, stream_id:0)
[recv] WINDOW_UPDATE Frame (length:4, flags:0x00, stream_id:0)
[recv] HEADERS Frame (length:39, flags:0x04, stream_id:1)
[recv] DATA Frame (length:1, flags:0x00, stream_id:1)
[send] HEADERS Frame (length:4, flags:0x04, stream_id:1)
[send] DATA Frame (length:7, flags:0x01, stream_id:1)
[send] GOAWAY Frame (length:20, flags:0x00, stream_id:0)
--- PASS: TestIssue25009 (0.05s)
main_test.go:119: server: read request: [header field ":authority" = "127.0.0.1:37237" header field ":method" = "POST" header field ":path" = "/" header field ":scheme" = "https" header field "content-length" = "1" header field "accept-encoding" = "gzip" header field "user-agent" = "Go-http-client/2.0"]
main_test.go:140: server: got length: 1
main_test.go:208: client: ok: 002
main_test.go:119: server: read request: [header field ":authority" = "127.0.0.1:37237" header field ":method" = "POST" header field ":path" = "/" header field ":scheme" = "https" header field "content-length" = "1" header field "accept-encoding" = "gzip" header field "user-agent" = "Go-http-client/2.0"]
main_test.go:119: server: read request: [header field ":authority" = "127.0.0.1:37237" header field ":method" = "POST" header field ":path" = "/" header field ":scheme" = "https" header field "content-length" = "1" header field "accept-encoding" = "gzip" header field "user-agent" = "Go-http-client/2.0"]
main_test.go:140: server: got length: 1
main_test.go:208: client: ok: 001
PASS
ok github.com/ernado/go-issue-25009 0.056s
@bradfitz Do you need more info?
CC @rs
UPD: Can we consider fixing it in go1.11.1 instead of go1.12?
UPD2: Re-checked on b7d3e14a52, no changes, still failing without patch
I think you have inferred a behavior that is incorrect. The GOAWAY Frame returns the stream id that was last seen. Given the test is sending back stream id 0, stream 1 is guaranteed not to succeed. See GOAWAY
@fraenkel I'm 100% sure that my test server behavior is incorrect, but http client should not retry the request without body rewind even in that case. I'm sure that server behavior triggers the condition that is very close to bug that I've discovered. I can try to do the full repro, but I think it is not necessary.
So the logs for nginx as a server with http2debug=2:
=== RUN TestWithNginx
2018/08/26 18:48:54 http2: Transport failed to get client conn for go-issue-25009:443: http2: no cached connection was available
2018/08/26 18:48:54 http2: Transport failed to get client conn for go-issue-25009:443: http2: no cached connection was available
2018/08/26 18:48:54 http2: Transport creating client conn 0xc0003a4380 to 127.0.0.1:443
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote SETTINGS len=18, settings: ENABLE_PUSH=0, INITIAL_WINDOW_SIZE=4194304, MAX_HEADER_LIST_SIZE=10485760
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote WINDOW_UPDATE len=4 (conn) incr=1073741824
2018/08/26 18:48:54 http2: Transport encoding header ":authority" = "go-issue-25009"
2018/08/26 18:48:54 http2: Transport encoding header ":method" = "POST"
2018/08/26 18:48:54 http2: Transport encoding header ":path" = "/00/002"
2018/08/26 18:48:54 http2: Transport encoding header ":scheme" = "https"
2018/08/26 18:48:54 http2: Transport encoding header "content-length" = "1"
2018/08/26 18:48:54 http2: Transport encoding header "accept-encoding" = "gzip"
2018/08/26 18:48:54 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote HEADERS flags=END_HEADERS stream=1 len=44
2018/08/26 18:48:54 http2: Framer 0xc00035e000: read SETTINGS len=18, settings: MAX_CONCURRENT_STREAMS=128, INITIAL_WINDOW_SIZE=65536, MAX_FRAME_SIZE=16777215
2018/08/26 18:48:54 http2: Transport encoding header ":authority" = "go-issue-25009"
2018/08/26 18:48:54 http2: Transport received SETTINGS len=18, settings: MAX_CONCURRENT_STREAMS=128, INITIAL_WINDOW_SIZE=65536, MAX_FRAME_SIZE=16777215
2018/08/26 18:48:54 http2: Transport encoding header ":method" = "POST"
2018/08/26 18:48:54 http2: Transport encoding header ":path" = "/01/001"
2018/08/26 18:48:54 http2: Transport encoding header ":scheme" = "https"
2018/08/26 18:48:54 http2: Transport encoding header "content-length" = "1"
2018/08/26 18:48:54 http2: Transport encoding header "accept-encoding" = "gzip"
2018/08/26 18:48:54 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote HEADERS flags=END_HEADERS stream=3 len=13
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote DATA stream=3 len=1 data="1"
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote DATA stream=1 len=1 data="2"
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote SETTINGS flags=ACK len=0
2018/08/26 18:48:54 http2: Framer 0xc00035e000: read WINDOW_UPDATE len=4 (conn) incr=2147418112
2018/08/26 18:48:54 http2: Transport received WINDOW_UPDATE len=4 (conn) incr=2147418112
2018/08/26 18:48:54 http2: Framer 0xc00035e000: read SETTINGS flags=ACK len=0
2018/08/26 18:48:54 http2: Transport received SETTINGS flags=ACK len=0
2018/08/26 18:48:54 http2: Framer 0xc00035e000: read GOAWAY len=8 LastStreamID=1 ErrCode=NO_ERROR Debug=""
2018/08/26 18:48:54 http2: Transport received GOAWAY len=8 LastStreamID=1 ErrCode=NO_ERROR Debug=""
2018/08/26 18:48:54 http2: Framer 0xc00035e000: read WINDOW_UPDATE stream=1 len=4 incr=1
2018/08/26 18:48:54 http2: Transport received WINDOW_UPDATE stream=1 len=4 incr=1
2018/08/26 18:48:54 http2: Transport failed to get client conn for go-issue-25009:443: http2: no cached connection was available
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote DATA flags=END_STREAM stream=3 len=0 data=""
2018/08/26 18:48:54 http2: Framer 0xc00035e000: wrote DATA flags=END_STREAM stream=1 len=0 data=""
2018/08/26 18:48:54 http2: Framer 0xc00035e000: read HEADERS flags=END_STREAM|END_HEADERS stream=1 len=41
2018/08/26 18:48:54 http2: decoded hpack field header field ":status" = "200"
2018/08/26 18:48:54 http2: decoded hpack field header field "server" = "nginx/1.15.2"
2018/08/26 18:48:54 http2: decoded hpack field header field "date" = "Sun, 26 Aug 2018 15:48:54 GMT"
2018/08/26 18:48:54 http2: decoded hpack field header field "content-length" = "0"
2018/08/26 18:48:54 http2: Transport received HEADERS flags=END_STREAM|END_HEADERS stream=1 len=41
2018/08/26 18:48:54 http2: Framer 0xc00035e000: read WINDOW_UPDATE stream=1 len=4 incr=2147418111
2018/08/26 18:48:54 http2: Transport received WINDOW_UPDATE stream=1 len=4 incr=2147418111
2018/08/26 18:48:54 http2: Transport readFrame error on conn 0xc0003a4380: (*errors.errorString) EOF
2018/08/26 18:48:54 http2: Transport creating client conn 0xc0001c81c0 to 127.0.0.1:443
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: wrote SETTINGS len=18, settings: ENABLE_PUSH=0, INITIAL_WINDOW_SIZE=4194304, MAX_HEADER_LIST_SIZE=10485760
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: wrote WINDOW_UPDATE len=4 (conn) incr=1073741824
2018/08/26 18:48:54 http2: Transport encoding header ":authority" = "go-issue-25009"
2018/08/26 18:48:54 http2: Transport encoding header ":method" = "POST"
2018/08/26 18:48:54 http2: Transport encoding header ":path" = "/01/001"
2018/08/26 18:48:54 http2: Transport encoding header ":scheme" = "https"
2018/08/26 18:48:54 http2: Transport encoding header "content-length" = "1"
2018/08/26 18:48:54 http2: Transport encoding header "accept-encoding" = "gzip"
2018/08/26 18:48:54 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: wrote HEADERS flags=END_HEADERS stream=1 len=44
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: wrote DATA flags=END_STREAM stream=1 len=0 data=""
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: read SETTINGS len=18, settings: MAX_CONCURRENT_STREAMS=128, INITIAL_WINDOW_SIZE=65536, MAX_FRAME_SIZE=16777215
2018/08/26 18:48:54 http2: Transport received SETTINGS len=18, settings: MAX_CONCURRENT_STREAMS=128, INITIAL_WINDOW_SIZE=65536, MAX_FRAME_SIZE=16777215
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: wrote SETTINGS flags=ACK len=0
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: read WINDOW_UPDATE len=4 (conn) incr=2147418112
2018/08/26 18:48:54 http2: Transport received WINDOW_UPDATE len=4 (conn) incr=2147418112
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: read SETTINGS flags=ACK len=0
2018/08/26 18:48:54 http2: Transport received SETTINGS flags=ACK len=0
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: read GOAWAY len=8 LastStreamID=1 ErrCode=NO_ERROR Debug=""
2018/08/26 18:48:54 http2: Transport received GOAWAY len=8 LastStreamID=1 ErrCode=NO_ERROR Debug=""
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: read HEADERS flags=END_HEADERS stream=1 len=52
2018/08/26 18:48:54 http2: decoded hpack field header field ":status" = "400"
2018/08/26 18:48:54 http2: decoded hpack field header field "server" = "nginx/1.15.2"
2018/08/26 18:48:54 http2: decoded hpack field header field "date" = "Sun, 26 Aug 2018 15:48:54 GMT"
2018/08/26 18:48:54 http2: decoded hpack field header field "content-type" = "text/html"
2018/08/26 18:48:54 http2: decoded hpack field header field "content-length" = "173"
2018/08/26 18:48:54 http2: Transport received HEADERS flags=END_HEADERS stream=1 len=52
2018/08/26 18:48:54 http2: Framer 0xc0003d41c0: read DATA flags=END_STREAM stream=1 len=173 data="<html>\r\n<head><title>400 Bad Request</title></head>\r\n<body bgcolor=\"white\">\r\n<center><h1>400 Bad Request</h1></center>\r\n<hr><center>nginx/1.15.2</center>\r\n</body>\r\n</html>\r\n"
2018/08/26 18:48:54 http2: Transport received DATA flags=END_STREAM stream=1 len=173 data="<html>\r\n<head><title>400 Bad Request</title></head>\r\n<body bgcolor=\"white\">\r\n<center><h1>400 Bad Request</h1></center>\r\n<hr><center>nginx/1.15.2</center>\r\n</body>\r\n</html>\r\n"
2018/08/26 18:48:54 http2: Transport readFrame error on conn 0xc0001c81c0: (*errors.errorString) EOF
--- FAIL: TestWithNginx (0.04s)
main_test.go:262: client: ok: 002
main_test.go:260: client: failed: code=400 001
FAIL
And the corresponding client code:
func TestWithNginx(t *testing.T) {
const (
requests = 2
addr = "https://go-issue-25009"
)
go func() {
http.HandleFunc("/", func(writer http.ResponseWriter, request *http.Request) {
io.Copy(ioutil.Discard, request.Body)
request.Body.Close()
writer.WriteHeader(http.StatusOK)
})
if err := http.ListenAndServe("localhost:31001", nil); err != nil {
t.Fatal(err)
}
}()
var (
wg = new(sync.WaitGroup)
client = newClient()
count int64
)
for i := 0; i < requests; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
for {
current := atomic.AddInt64(&count, 1)
if current > requests {
break
}
buf := new(bytes.Buffer)
fmt.Fprint(buf, current)
req, err := http.NewRequest(http.MethodPost, addr+fmt.Sprintf("/%02d/%03d", id, current), buf)
if err != nil {
t.Error(err)
continue
}
res, err := client.Do(req)
if err != nil {
t.Errorf("client: %v", err)
continue
}
io.Copy(ioutil.Discard, res.Body)
res.Body.Close()
if res.StatusCode != http.StatusOK {
t.Errorf("client: failed: code=%d %03d", res.StatusCode, current)
} else {
t.Logf("client: ok: %03d", current)
}
}
}(i)
}
wg.Wait()
}
nginx config:
server {
listen 443 ssl http2;
http2_max_requests 1;
error_log /var/log/nginx/error.log debug;
location / {
proxy_http_version 1.1;
proxy_request_buffering off;
proxy_buffering off;
proxy_pass_request_headers on;
proxy_set_header Connection "";
proxy_pass http://localhost:31001;
}
server_name go-issue-25009;
ssl_certificate /src/go-issue-25009/certs/go-issue-25009.pem;
ssl_certificate_key /src/go-issue-25009/certs/go-issue-25009-key.pem;
}
There is something funny with that trace.
You send out 2 requests, although the order gets a bit flipped due to racy goroutines which is fine.
stream=3 is 00 and stream=1 is 01.
We get a GOAWAY on stream 1. And we receive a 200 for stream 1. The connection is then closed.
There is a retry for 00, a GOAWAY comes back for it with a 400.
The question is what was nginx doing.
@ernado You are correct that the retry does not reset the body. If you look at the retry, it sends a 0 byte body instead of a 1 byte body.
I will have to try your testcase and see if it reproduces that issue.
From https://trac.nginx.org/nginx/ticket/1250#comment:4
When nginx reaches http2_max_requests, it sends GOAWAY and closes the socket using close() (observed in strace).
But packets from the other side can still be in flight.
When an in-flight packet arrives at the kernel of the machine running nginx, after close(), the kernel will reply with TCP RST.
When the browser side receives the RST, its kernel will discard any receive buffers, even for packets that it already ACKed.
These receive buffers may contain the GOAWAY (or the last headers/data after it in the same stream).
Can be related to that issue.
@fraenkel
My tetstcase seems to reproduce the issue:
server: got length: 0
Also I've described why we the body is not reset in
https://github.com/golang/go/issues/25009#issuecomment-415939211
You can try the following patch to force the correct body reset:
--- a/src/net/http/transport.go
+++ b/src/net/http/transport.go
@@ -477,9 +477,8 @@ func (t *Transport) roundTrip(req *Request) (*Response, error) {
}
testHookRoundTripRetried()
- // Rewind the body if we're able to. (HTTP/2 does this itself so we only
- // need to do it for HTTP/1.1 connections.)
- if req.GetBody != nil && pconn.alt == nil {
+ // Rewind the body if we're able to.
+ if req.GetBody != nil {
newReq := *req
var err error
newReq.Body, err = req.GetBody()
Your testcase does not replicate the issue. You are trying to mimic the behavior but you aren't. The problem is that the request is retried and the body isn't reset so nginx sends back a 400 since it expected 1 byte of data and got 0.
The fix should be in the http2 code and not http which is what you have done.
Your testcase does not replicate the issue. You are trying to mimic the behavior but you aren't.
You are absolutely right, but the test case still triggers the invalid retry, isn't it? The client retries with request which body is not reset properly.
The problem is that the request is retried and the body isn't reset so nginx sends back a 400 since it expected 1 byte of data and got 0.
Yes, it is exactly what I was trying to describe.
And I've already described why it is not actually resetting the body in my https://github.com/golang/go/issues/25009#issuecomment-415937471
The fix should be in the http2 code and not http which is what you have done
I've provided both possible http2 and http "fixes", and they are described in https://github.com/golang/go/issues/25009#issuecomment-415939211
Why aren't you using the req.GetBody func? The is how you provide a body on retries.
Because it is already used under the hood when body is *bytes.Buffer
So what is currently there should work because the newReq is returned and retried in the loop. Setting it back on the original request is racy, but the outcome should be the same. Let me see if I can reproduce this and dig as to why the newReq isn't working which is what should be working.
Also I've tried both of my "patches", they are forcing the correct body reset which results in 200 OK from nginx (and passed test case).
NewReq is not working because we are not using it, we are failing to get connection from the pool and returning from http2 retry loop.
I agree that you found the bug. The fix while it works doesn't seem right.
@bradfitz : The bug is when http2 fails to obtain a connection and leaves http2/transport.go(RoundTripOpt:389), the new request is not returned to be retried from http/transport.go (roundtrip:463) instead the original one is used with a body that has already been read.
@ernado Your fix in http/transport.go looks correct. We still would need a test case to go along with the change.
I've implemented a test case for transport and currently preparing the patch.
Change https://golang.org/cl/131755 mentions this issue: net/http: rewind request body unconditionally
Gentle ping for @bradfitz @ianlancetaylor
Please take a look, seems like we are close to resolving that issue.
Have you found the solution to this issue? I encountered the same problem where I'd get back a statuscode 400 from nginx sometimes when making a lot of concurrent calls. At some point I'd even get a 'broken pipe' error after which I'd have to restart the client. Using an environment variable (GODEBUG="http2client=0") fixes the issue, but I don't like this solution.
@kaspergyselinck what version of go are you using?
This problem should be fixed in go1.12.
Hi @ernado,
I am still seeing this issue. I have tested on both Go 1.12.9 and 1.13
It seems that others are as well in https://github.com/golang/go/issues/32441
I can confirm that my go-issue-25009 repro still fails on go1.12 and go1.13 for some reason.
Seems like issue is not fixed (or there is a regression), I will investigate it further.
Thank you @kwestein, I've found the cause and commented in referenced issue with possible solution.
Change https://golang.org/cl/210123 mentions this issue: net/http: rewind request body on ErrSkipAltProtocol
@ernado hello, i have meet the same solution, have you fix it ?
Hey @Centhero, I'm trying to fix in https://golang.org/cl/210123, but it seems like fix is not as trivial as I expected, sorry.
Most helpful comment
I can confirm that my go-issue-25009 repro still fails on go1.12 and go1.13 for some reason.
Seems like issue is not fixed (or there is a regression), I will investigate it further.