Title: RST on HTTP/2 half-close bug
Description:
envoy version = v1.7.0
Envoy resets bidirectional gRPC streams when only the client side of the stream is closed. HTTP/2 streams should not be reset on half-close. Instead, they should remain open until both sides of the stream are closed.
Repro steps:
Here's the example Python code and Envoy config I used for testing:
envoy.yaml: https://pastebin.com/s9ECZRjy
server: https://pastebin.com/VrpwFQYZ
client no-close: https://pastebin.com/ZGPhFZFm
client half-close: https://pastebin.com/4ww0c5y6
grpc api: https://github.com/grpc/grpc/blob/master/examples/protos/route_guide.proto
With the no-close client, I continue to stream forever. With the half-close client, after about 15 seconds I see:
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INTERNAL, Received RST_STREAM with error code 0)>
This was also reported on the envoy-users group: https://groups.google.com/forum/#!topic/envoy-users/wRtoav1QfL4
This is going to be non-trivial to fix. I'm marking this as a design proposal because someone will need to dig into the code and come up with a proposal before we do any implementation.
I'm running into this problem as well, but with a gRPC endpoint that only streams from the server side:
rpc Subscriber(SubscriberRequest) returns (stream SubscriberResponse)
What's also strange is that I only have issues when using the Python client; the Go client works fine. Anyone have any insight into why?
FWIW I'm running into this problem as well, but with a gRPC endpoint where only the client streams.
This is a problem for us too (Java client/server).
I'm closing because the reported problem is now the documented default behavior and can be resolved through configuration.
Here's the relevant setting, route.RouteAction.timeout:
Specifies the upstream timeout for the route. If not specified, the default is 15s. This spans between the point at which the entire downstream request (i.e. end-of-stream) has been processed and when the upstream response has been completely processed.
This matches the original report: the timer starts once the client half-closes, and the stream is reset about 15 seconds later.
For my application, I resolved it by setting timeout to 0s, which disables the timeout entirely.
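For reference, here is a minimal sketch of where the route timeout would go in a v2-API route configuration. It is not taken from the envoy.yaml linked in the report: the virtual host name, the catch-all domain and prefix match, and the cluster name grpc_backend are placeholders assumed for illustration.

```yaml
# Fragment of an HTTP connection manager config (assumed layout).
route_config:
  virtual_hosts:
  - name: grpc_service            # placeholder name
    domains: ["*"]                # placeholder: match any host
    routes:
    - match:
        prefix: "/"               # placeholder: match every path
      route:
        cluster: grpc_backend     # placeholder upstream cluster
        # route.RouteAction.timeout: defaults to 15s when omitted.
        # 0s disables the timeout, so a half-closed stream is not
        # reset while the server is still sending.
        timeout: 0s
```

Disabling the timeout on a catch-all route affects unary calls as well, so it may be preferable to scope it to a route that matches only the streaming methods.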