/area autoscale
/area networking
0.11.x
When pods autoscale under gRPC traffic, responses should come from more than one pod.
Only the first pod responds to the gRPC client.
os.Hostname
For more context, refer to: https://github.com/knative/serving/pull/6618#issuecomment-578357029
Once you establish a gRPC connection, it is bound to a single pod. You need to have several client-side connections (https://grpc.io/blog/grpc_on_http2/)
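To illustrate the point about connections being pinned, here is a toy model in plain Go. It is not the gRPC API: `balancer`, `conn`, and `distinctBackends` are hypothetical names, and the round-robin, connection-level balancer is an assumption made only for the sketch. Real behaviour depends on how the mesh (for example Istio) balances HTTP/2 traffic.

```go
package main

import "fmt"

// conn models a long-lived HTTP/2 connection: the backend is chosen once,
// at dial time, and every request on the connection reuses it.
type conn struct{ backend string }

// balancer is a hypothetical connection-level round-robin balancer.
type balancer struct {
	backends []string
	next     int
}

// dial picks the next backend and pins the new connection to it.
func (b *balancer) dial() *conn {
	c := &conn{backend: b.backends[b.next%len(b.backends)]}
	b.next++
	return c
}

// send returns the name of the backend that served the request.
func (c *conn) send() string { return c.backend }

// distinctBackends opens conns fresh connections, spreads requests over
// them, and returns how many different backends answered.
func distinctBackends(b *balancer, conns, requests int) int {
	pool := make([]*conn, conns)
	for i := range pool {
		pool[i] = b.dial()
	}
	seen := map[string]bool{}
	for i := 0; i < requests; i++ {
		seen[pool[i%conns].send()] = true
	}
	return len(seen)
}

func main() {
	pods := []string{"pod-a", "pod-b", "pod-c"}
	// One connection: every request lands on the same pod.
	fmt.Println(distinctBackends(&balancer{backends: pods}, 1, 30))
	// Three connections: requests can reach three pods.
	fmt.Println(distinctBackends(&balancer{backends: pods}, 3, 30))
}
```

In this model, more requests on one connection never reach a second pod; only additional connections do.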
The alternative is
@vagababov : In this PR every goroutine creates a separate connection, and they all still connect to the first pod. https://github.com/knative/serving/pull/6618
Am I missing something here?
Do you see pods actually scale?
Yes, pods scale up and down with varying traffic. In the e2e test PR I have added assertions to verify this.
But all the new connections go to the same pod?
Yes, that's what I notice in the test.
What's your container concurrency or tbc settings?
It's 6, it seems. Try with 1 and tbc=-1, I don't think we can really control how Istio would LB grpc traffic.
I wonder if using the grpc port name will have an effect.
@vagababov : Updating to cc=1, tbc to -1 did not make any difference.
That's interesting. I wonder if the gRPC subsystem reuses the connection internally.
@vagababov : @tanzeeb suspected the same. We looked into the gRPC library, and from the codebase grpc.Dial should create a new connection. From a cursory glance there does not appear to be any pooling, nor is there an option to create a client without reusing connections.
Updating to cc=1, tbc to -1 did not make any difference.
Interesting
/assign @shashwathi @tanzeeb
After taking a deeper look with @shashwathi, it looks like gRPC load balancing works fine. The test conditions in #6618 were sending all of the requests to the first pod before any of the other pods had a chance to come up.
We should add an e2e test that waits for the scaling to happen and then asserts that incoming requests are load balanced across the multiple running instances.
😄 and there I was worried
@shashwathi @tanzeeb I see reports of no load balancing when using gRPC streaming (new streams keep getting the same pod). So when we add the e2e test for https://github.com/knative/serving/issues/6681#issuecomment-580813409 , I wonder if we can add coverage for streaming as well? Thanks
Is there anything to be done here, or can we close this one out?
@vagababov : @tcnghia had asked for e2e grpc test coverage. There is an outstanding PR for this #6778 . Please consider reviewing the PR