From @ryanolson
@richarddli by just increasing the replicas in the example i get:
root@7aad1319fc7d:/devel# python greeter_client.py
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-6xdvh!
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-6xdvh!
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-6xdvh!
...
root@7aad1319fc7d:/devel# python greeter_client.py
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-qzlt9!
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-qzlt9!
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-qzlt9!
...
root@7aad1319fc7d:/devel# python greeter_client.py
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-6xdvh!
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-6xdvh!
Greeter client received: Hello, you from host=grpc-greet-f468d5c7d-6xdvh!
...
as you can see, i get load-balancing per invocation of the client; essentially L4 load-balancing.
however, i'm looking for 1 client, i.e. 1 grpc stub, to load-balance over all backend pods; that is, L7 load-balancing.
note: i had to modify the greeter_server to output its HOSTNAME in the response; similarly, the client makes 10 repeated calls using the same stub. (I shortened the output above.)
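The difference between the two behaviours can be sketched with a small simulation. This is only an illustration, not gRPC or Envoy code: the pod names are hypothetical, and a deterministic round-robin picker stands in for whatever selection the real balancer performs.

```python
import itertools
from collections import Counter

# Hypothetical pod names, used only for this illustration.
BACKENDS = ["pod-a", "pod-b", "pod-c"]


def per_connection(num_clients, calls_per_client, backends):
    """L4-style: a backend is picked once per connection (one per client run).

    Every call made over that connection lands on the same backend.
    """
    picker = itertools.cycle(backends)
    hits = Counter()
    for _ in range(num_clients):
        backend = next(picker)  # chosen when the connection is opened
        hits[backend] += calls_per_client
    return hits


def per_request(num_clients, calls_per_client, backends):
    """L7-style: a backend is picked for every individual request."""
    picker = itertools.cycle(backends)
    hits = Counter()
    for _ in range(num_clients * calls_per_client):
        hits[next(picker)] += 1  # chosen per call, regardless of connection
    return hits
```

With one client making several calls on the same stub, `per_connection` sends every call to a single backend, while `per_request` spreads them across all three, which is the behaviour being asked for here.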
Also reported by Jean-Christophe Baey @jcbaey_twitter
Sounds like https://github.com/envoyproxy/envoy/issues/2744
I'm able to reproduce this pretty consistently so now I'm digging deeper. The linked issue doesn't apply as we're not using socket_address.
So when I switch to a headless service in Kubernetes such as below:
apiVersion: v1
kind: Service
metadata:
  name: grpc-basic
  namespace: stable
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: grpc-basic-stable
      grpc: true
      prefix: /helloworld.Greeter/
      rewrite: /helloworld.Greeter/
      service: grpc-basic:50051
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: grpc-basic-stable-grpcreflect
      grpc: true
      prefix: /grpc.
      rewrite: /grpc.
      service: grpc-basic:50051
    forge.repo: [email protected]:datawire/ambassador-examples.git
    forge.descriptor: grpc-basic/service.yaml
    forge.version: d1e8a449005bc14fa572a7b54296ade0fa3fcf2d.sha
  labels: {forge.service: grpc-basic, forge.profile: stable}
spec:
  clusterIP: None
  type: ClusterIP
  ports:
  - name: grpc
    port: 50051
    targetPort: grpc
  selector:
    app: grpc-basic
The requests are load balanced "correctly" across my three gRPC service pods:
plombardi@plombowski ~/w/ambassador-examples> ./grpcurl.sh -plaintext -d '{"name": "Phil"}' <REDACTED>:32212 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-68879db8d9-8pq6l)!"
}
plombardi@plombowski ~/w/ambassador-examples> ./grpcurl.sh -plaintext -d '{"name": "Phil"}' <REDACTED>:32212 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-68879db8d9-czmcg)!"
}
plombardi@plombowski ~/w/ambassador-examples> ./grpcurl.sh -plaintext -d '{"name": "Phil"}' <REDACTED>:32212 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-68879db8d9-czmcg)!"
}
plombardi@plombowski ~/w/ambassador-examples> ./grpcurl.sh -plaintext -d '{"name": "Phil"}' <REDACTED>:32212 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-68879db8d9-xktlp)!"
}
plombardi@plombowski ~/w/ambassador-examples> ./grpcurl.sh -plaintext -d '{"name": "Phil"}' <REDACTED>:32212 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-68879db8d9-8pq6l)!"
}
plombardi@plombowski ~/w/ambassador-examples> ./grpcurl.sh -plaintext -d '{"name": "Phil"}' <REDACTED>:32212 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-68879db8d9-xktlp)!"
}
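For longer runs, eyeballing the responses gets tedious. A minimal sketch of tallying the responding pods is shown here; it assumes the `(host: ...)` message format produced by the modified greeter server above, and the helper name `tally_hosts` is made up for this example.

```python
import re
from collections import Counter

# Matches the "(host: <pod-name>)" suffix the modified server adds to replies.
HOST_PATTERN = re.compile(r"\(host: ([^)]+)\)")


def tally_hosts(messages):
    """Count how many responses came from each pod."""
    hits = Counter()
    for message in messages:
        match = HOST_PATTERN.search(message)
        if match:
            hits[match.group(1)] += 1
    return hits


# The six responses from the headless-service run above.
RESPONSES = [
    "Hello, Phil (host: grpc-basic-68879db8d9-8pq6l)!",
    "Hello, Phil (host: grpc-basic-68879db8d9-czmcg)!",
    "Hello, Phil (host: grpc-basic-68879db8d9-czmcg)!",
    "Hello, Phil (host: grpc-basic-68879db8d9-xktlp)!",
    "Hello, Phil (host: grpc-basic-68879db8d9-8pq6l)!",
    "Hello, Phil (host: grpc-basic-68879db8d9-xktlp)!",
]
```

Calling `tally_hosts(RESPONSES)` shows each of the three pods answering twice, which matches the even spread seen with the headless service.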
So I have a working theory here. Envoy is opening a persistent connection to the first backend server it resolves.
I reverted from headless mode to use the following configuration:
apiVersion: v1
kind: Service
metadata:
  name: grpc-basic
  namespace: stable
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: grpc-basic-stable
      grpc: true
      prefix: /helloworld.Greeter/
      rewrite: /helloworld.Greeter/
      service: grpc-basic:50051
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: grpc-basic-stable-grpcreflect
      grpc: true
      prefix: /grpc.
      rewrite: /grpc.
      service: grpc-basic:50051
    forge.repo: [email protected]:datawire/ambassador-examples.git
    forge.descriptor: grpc-basic/service.yaml
    forge.version: 0b75a9d4a1e7357b71c91cd73714e46e9ef94fb8.sha
  labels: {forge.service: grpc-basic, forge.profile: stable}
spec:
  type: ClusterIP
  ports:
  - name: grpc
    port: 50051
    targetPort: grpc
  selector:
    app: grpc-basic
Then I booted up another Pod on the cluster and started to hit grpc-basic.stable:50051 with grpcurl. This path bypasses Ambassador/Envoy entirely. Because grpcurl is a short-lived program, it opens a new connection every time.
# ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-n6chn)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-n6chn)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-k9trs)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-n6chn)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-k9trs)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' grpc-basic.stable:50051 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
Talking to the service through Ambassador yields:
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
{
"message": "Hello, Phil (host: grpc-basic-799d574f58-8z46f)!"
}
/ # ~/go/bin/grpcurl -plaintext -d '{"name": "Phil"}' ambassador.stable:80 helloworld.Greeter/SayHello
Reading through the Envoy docs it may also be the case we want to use logical_dns rather than strict_dns but that's just a guess at the moment.
So we think we understand the problem and have a workaround:
When you create a Kubernetes v1.Service object you are creating a virtual host representing an iptables rule that randomly selects Pod addresses for you (see: https://kubernetes.io/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies).
Envoy asynchronously queries DNS and only ever receives a single IP address from the DNS server (the Kubernetes v1.Service object's cluster IP). The current working assumption is that, because Envoy only ever sees a single address that never changes, it establishes a single persistent connection that lands on one backend Pod, which is why traffic does not get load balanced.
We need to perform more testing, but that will take some time. In the meantime we have a simple workaround detailed below:
The workaround to this problem is to use a headless Kubernetes service. A headless Kubernetes service creates DNS A records that point to the individual Pod IP addresses for a service. When Envoy performs one of its asynchronous DNS queries to populate its internal concept of a cluster, it then receives N records from DNS, where N represents the number of running pods.
You can create a headless service using the clusterIP: None attribute on a Kubernetes v1.Service, for example:
---
apiVersion: v1
kind: Service
metadata:
  name: grpc-basic
  namespace: stable
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: grpc-basic-stable
      grpc: true
      prefix: /helloworld.Greeter/
      rewrite: /helloworld.Greeter/
      service: grpc-basic:50051
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: grpc-basic-stable-grpcreflect
      grpc: true
      prefix: /grpc.
      rewrite: /grpc.
      service: grpc-basic:50051
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: grpc
    port: 50051
    targetPort: grpc
  selector:
    app: grpc-basic
More information about headless services can be found in the Kubernetes docs: https://kubernetes.io/docs/concepts/services-networking/service/#headless-services
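One way to check what a DNS client such as Envoy would see for a given service name is to resolve it from a Pod inside the cluster. The sketch below uses only the Python standard library; the helper name `resolve_ipv4` is made up for this example, and it assumes an IPv4 cluster. With a regular ClusterIP service the expectation is a single address (the cluster IP), and with a headless service one address per running Pod.

```python
import socket


def resolve_ipv4(host, port):
    """Return the sorted, de-duplicated IPv4 addresses that DNS gives for host."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


# Example, run from a Pod in the cluster (service name taken from this thread):
# print(resolve_ipv4("grpc-basic.stable", 50051))
```

If the list contains only one address for a service backed by several Pods, the service is most likely not headless and the single-connection behaviour described above would be expected.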
It sounds like your hypothesis is on the right trail. Here is an article I found that is more specifically oriented around Envoy but goes over load balancing algorithms and confirms your hypothesis of needing to use Headless services: https://blog.markvincze.com/how-to-use-envoy-as-a-load-balancer-in-kubernetes/
Hi, I just have a question about this solution. I already use a headless service on top of my gRPC server A and Ambassador, and the load balancing works perfectly when I scale server A up or down. My concern is more about scaling Ambassador itself up and down. On the gRPC client side I build the channel to Ambassador only once, and when I scale up Ambassador I see that the new Ambassador instances are not getting any traffic, I assume because the client holds a long-lived connection to only one Ambassador instance. Do you have any suggestions for this case?
Thanks
Closing this as it's resolved with endpoint routing