Google Cloud Storage
The SDK is used in HashiCorp Vault, running on Ubuntu 18.04 on GCE.
When canceling a context passed to ObjectHandle.NewWriter(ctx), all resources associated with the request should be cleaned up, and the cancellation should not affect subsequent operations.
I have a minimal reproduction test case here: https://github.com/KJTsanaktsidis/refused_stream_repro
The reproduction tries to upload lots of files to GCS with ObjectHandle.NewWriter(ctx), and cancels some of the contexts at random times. Eventually, after running for a few minutes, all uploads start returning the following error:
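The cancel-while-uploading pattern the reproduction exercises can be sketched with the standard library alone. This is only an illustration under stated assumptions: it uses plain net/http against a local httptest server (HTTP/1.1) instead of the storage SDK and GCS, so it shows how a canceled upload surfaces to the caller but does not reproduce REFUSED_STREAM. The names cancelMidUpload and canceledAgainstLocalServer are hypothetical helpers, not part of any library or of the linked repo.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// cancelMidUpload (hypothetical helper) starts a POST whose body never ends,
// cancels the request's context after delay, and returns the client's error.
func cancelMidUpload(client *http.Client, url string, delay time.Duration) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// The pipe stands in for a large object that is still being written.
	pr, pw := io.Pipe()
	defer pr.Close() // unblocks the writer goroutine on every return path
	go func() {
		for {
			if _, err := pw.Write([]byte("chunk")); err != nil {
				return // the reader was closed, so stop producing data
			}
		}
	}()

	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, pr)
	if err != nil {
		return err
	}

	// Cancel while the upload is still in flight.
	timer := time.AfterFunc(delay, cancel)
	defer timer.Stop()

	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	resp.Body.Close()
	return nil
}

// canceledAgainstLocalServer (hypothetical helper) runs cancelMidUpload
// against a throwaway local server and reports whether the client saw
// context.Canceled.
func canceledAgainstLocalServer() bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Drain the body until the client goes away.
		io.Copy(io.Discard, r.Body)
	}))
	defer srv.Close()

	err := cancelMidUpload(srv.Client(), srv.URL, 50*time.Millisecond)
	return errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println("upload canceled:", canceledAgainstLocalServer())
}
```

The real reproduction does the same thing many times in parallel over a shared HTTP/2 connection to GCS, which is where the stream accounting described below comes into play.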
Post https://www.googleapis.com/upload/storage/v1/b/kjs_cool_vault_bucket/o?alt=json&prettyPrint=false&projection=full&uploadType=multipart: stream error: stream ID 45457; REFUSED_STREAM
I used a packet capture to debug the communication between Vault and GCS, and I found that the SDK would often
/upload/storage/v1/b/vaultgcs_backend_us1_staging/o?alt=json&prettyPrint=false&projection=full&uploadType=multipart
So, eventually, the GCS server starts sending REFUSED_STREAM errors back to the client whenever a new upload is attempted, because the number of forgotten-about upload streams has exceeded the server's HTTP/2 MAX_CONCURRENT_STREAMS value of 100.
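To make the stream accounting concrete, here is a toy model of a single connection with a stream limit. It is not real HTTP/2 code and does not use golang.org/x/net; the conn type and the simulate function are hypothetical, and the key assumption (labeled in the comments) is that a canceled upload is dropped on the client side without the server ever learning that the stream ended. Only the limit of 100 comes from the report above.

```go
package main

import (
	"errors"
	"fmt"
)

// errRefusedStream mimics the REFUSED_STREAM error seen by the client.
var errRefusedStream = errors.New("stream error: REFUSED_STREAM")

// conn is a toy model of the server's view of one connection.
type conn struct {
	maxStreams int
	open       map[int]bool
	nextID     int
}

func newConn(maxStreams int) *conn {
	return &conn{maxStreams: maxStreams, open: make(map[int]bool), nextID: 1}
}

// openStream admits a new stream unless the concurrency limit is reached.
func (c *conn) openStream() (int, error) {
	if len(c.open) >= c.maxStreams {
		return 0, errRefusedStream
	}
	id := c.nextID
	c.nextID += 2 // client-initiated stream IDs are odd
	c.open[id] = true
	return id, nil
}

// closeStream models the server learning that a stream has ended.
func (c *conn) closeStream(id int) {
	delete(c.open, id)
}

// simulate runs a number of uploads on one connection, canceling every
// cancelEvery-th upload, and returns how many uploads were refused.
// ASSUMPTION: when resetOnCancel is false, a canceled upload is forgotten by
// the client and the server keeps counting its stream as open.
func simulate(maxStreams, uploads, cancelEvery int, resetOnCancel bool) int {
	c := newConn(maxStreams)
	refused := 0
	for i := 0; i < uploads; i++ {
		id, err := c.openStream()
		if err != nil {
			refused++
			continue
		}
		canceled := i%cancelEvery == 0
		if canceled && !resetOnCancel {
			continue // leaked: the server still counts this stream
		}
		c.closeStream(id)
	}
	return refused
}

func main() {
	fmt.Println("refused when canceled streams leak:", simulate(100, 1000, 5, false))
	fmt.Println("refused when canceled streams are closed:", simulate(100, 1000, 5, true))
}
```

In this model, once 100 canceled streams have leaked, every later upload on the connection is refused. That matches the observed behavior where all uploads start failing after a few minutes.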
This issue seems to be the cause of https://github.com/hashicorp/vault/issues/5419
This is likely related to #753, as well.
Another related issue: golang/go#20985.
Thanks for the reproduction @KJTsanaktsidis. It's been very helpful.
Meanwhile, here's another related issue: golang/go#27208.
Yup - https://github.com/golang/go/issues/27208 is exactly it!
I tried my reproduction with golang.org/x/net built from the PR linked in that issue, https://github.com/golang/net/pull/18 (see the working_version branch in my repo). This seems to have fixed the issue - I no longer got REFUSED_STREAM errors!
So I think this isn't a bug in google-cloud-go at all, and should go away in Go 1.12?
Well done!
Provided the patch lands in Go 1.12, yes, the problem should go away. I'll ping the Go issue.
Awesome - thanks heaps for your help in joining these dots!
Since this is an issue outside this repo, I'm going to close it.
Thanks again for all your help getting to the bottom of this.