Serving: gRPC traffic should load balance during autoscale

Created on 29 Jan 2020 · 19 Comments · Source: knative/serving

In what area(s)?

/area autoscale
/area networking

What version of Knative?

0.11.x

Expected Behavior

When the service autoscales under gRPC traffic, responses should come from more than one pod.

Actual Behavior

Only the first pod responds to the gRPC client.

Steps to Reproduce the Problem

  1. Update the gRPC example app to respond with os.Hostname.
  2. Send enough gRPC traffic to trigger autoscaling.
  3. Verify that the responses include more than one hostname.
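
Step 3 can be sketched as a small check over the collected replies. This is only an illustration, assuming each reply carries the os.Hostname value of the pod that served it; the function name distinctHosts and the pod names are hypothetical and do not come from the example app.

```go
package main

import "fmt"

// distinctHosts returns the number of unique hostnames in responses.
// Each element is assumed to be the os.Hostname value echoed by one reply.
func distinctHosts(responses []string) int {
	seen := make(map[string]struct{})
	for _, h := range responses {
		seen[h] = struct{}{}
	}
	return len(seen)
}

func main() {
	// Hypothetical pod names; real values would come from the gRPC replies.
	replies := []string{"grpc-app-pod-a", "grpc-app-pod-a", "grpc-app-pod-b"}
	if n := distinctHosts(replies); n > 1 {
		fmt.Printf("replies came from %d pods\n", n)
	} else {
		fmt.Println("all replies came from a single pod")
	}
}
```

If the count stays at 1 while several pods are running, traffic is not being spread.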

For more context, see: https://github.com/knative/serving/pull/6618#issuecomment-578357029

area/autoscale area/networking kind/bug

All 19 comments

Once you establish a gRPC connection, it is bound to a single pod. You need several client-side connections (https://grpc.io/blog/grpc_on_http2/).
The alternatives are:

  • building a gRPC LB proxy (we probably don't want that)
  • an HTTP/2 LB?
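
A toy model of the connection-level behaviour described in this comment, as a rough sketch only. It assumes a balancer that picks a backend once per connection, using round robin. The balancer and conn types are made up for illustration and are not gRPC, Istio, or Knative APIs. Whether a real mesh behaves this way depends on whether its proxy balances per connection or per request.

```go
package main

import "fmt"

// balancer is a toy connection-level load balancer: it picks a backend
// once, when a connection is opened, using round robin.
type balancer struct {
	backends []string
	next     int
}

// conn models a long-lived HTTP/2 connection pinned to one backend.
type conn struct{ backend string }

// dial opens a new connection and pins it to the next backend.
func (b *balancer) dial() *conn {
	c := &conn{backend: b.backends[b.next%len(b.backends)]}
	b.next++
	return c
}

// call sends one request; it always lands on the pinned backend.
func (c *conn) call() string { return c.backend }

// hitsOverOneConn sends n requests over a single shared connection.
func hitsOverOneConn(b *balancer, n int) map[string]int {
	hits := make(map[string]int)
	c := b.dial()
	for i := 0; i < n; i++ {
		hits[c.call()]++
	}
	return hits
}

// hitsOverManyConns opens a fresh connection for every request.
func hitsOverManyConns(b *balancer, n int) map[string]int {
	hits := make(map[string]int)
	for i := 0; i < n; i++ {
		hits[b.dial().call()]++
	}
	return hits
}

func main() {
	pods := []string{"pod-a", "pod-b", "pod-c"} // hypothetical pod names
	fmt.Println("one connection:  ", hitsOverOneConn(&balancer{backends: pods}, 6))
	fmt.Println("many connections:", hitsOverManyConns(&balancer{backends: pods}, 6))
}
```

In this model, six requests over one connection all reach the same pod, while six separate connections are spread evenly over the three pods.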

@vagababov: In this PR every goroutine creates a separate connection, and they all still connect to the first pod: https://github.com/knative/serving/pull/6618
Am I missing something here?

Do you see pods actually scale?

Yes, pods scale up and down with varying traffic. In the e2e test PR I have added assertions to verify this.

But all the new connections go to the same pod?

Yes, that's what I notice in the test.

What's your container concurrency or tbc settings?

It's 6, it seems. Try with 1 and tbc=-1. I don't think we can really control how Istio would LB gRPC traffic.

I wonder if using the grpc port name will have an effect.

@vagababov: Updating to cc=1 and tbc=-1 did not make any difference.

That's interesting. I wonder if the gRPC subsystem reuses the connection internally.

@vagababov: @tanzeeb suspected the same. We looked into the gRPC library, and according to the code grpc.Dial should create a new connection. From a cursory glance at the codebase, there does not appear to be any pooling, nor is there an option to create a client without reusing connections.

Updating to cc=1 and tbc=-1 did not make any difference.

Interesting

/assign @shashwathi @tanzeeb

After taking a deeper look with @shashwathi, it looks like gRPC load balancing works fine. The test conditions in #6618 were sending all of the requests to the first pod before any of the other pods had a chance to come up.

We should add an e2e test that waits for the scaling to happen and then asserts that incoming requests are load balanced across the multiple running instances.

😄 and there I was worried

@shashwathi @tanzeeb I see a report of no load balancing when using gRPC streaming (new streams keep getting the same pod). So when we add the e2e test for https://github.com/knative/serving/issues/6681#issuecomment-580813409, I wonder if we can add coverage for streaming as well? Thanks

Is there anything to be done here, or can we close this one out?

@vagababov: @tcnghia had asked for e2e gRPC test coverage. There is an outstanding PR for this, #6778. Please consider reviewing it.
