Charts: [stable/redis] networkPolicy blocks slave pods from connecting

Created on 19 Jun 2019 · 15 comments · Source: helm/charts

Describe the bug
Enabling networkPolicy blocks the slave pods from connecting to the master, because the slave pods are missing the pod-selector label (e.g. redis-client=true)

Version of Helm and Kubernetes:
Helm: v2.14.1
Kubernetes: v1.12.7

Which chart:
Redis

What happened:
Redis slave pods are unhealthy and restart constantly because they cannot connect to the master

What you expected to happen:
Redis-slave pods come up clean

How to reproduce it :
helm install stable/redis --values values-production.yaml (or --set networkPolicy.enabled=true)

Most helpful comment

Thank you very much for the PR, we will review it.

All 15 comments

Hi @Yitaek ,
I cannot reproduce the issue. I deployed using helm install stable/redis --set networkPolicy.enabled=true.
Could you clarify the description of the issue? Also, are there any other steps to reproduce it?

@miguelaeh thank you for the quick response.

I am deploying this to GKE with Calico installed and enabled. Then I run the command helm install --name redis stable/redis --set networkPolicy.enabled=true, which brings up redis-master fine. However, the slave pods attempt to connect to the master pod and fail:

Connecting to MASTER redis-master-0.redis-headless.default.svc.cluster.local:6379
Unable to connect to MASTER: Resource temporarily unavailable
MASTER <-> REPLICA sync started
Timeout connecting to the MASTER...
Received SIGTERM scheduling shutdown...

It then continues to restart.

It makes sense to me, since enabling the network policy only allows pods with the label to connect, as the note says:

Note: Since NetworkPolicy is enabled, only pods with label
"redis-client=true"
will be able to connect to redis.

And the slave pods only have the following labels:

  • app: redis
  • chart: redis-8.0.10
  • release: redis
  • role: slave
  • controller-revision-hash: redis-slave-c6d8bccdf
  • statefulset.k.../pod-name: redis-slave-0
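
For context, the NetworkPolicy the chart generates when networkPolicy.enabled=true is roughly of the following shape (an approximation for illustration, not the chart's exact manifest; label names simplified). Any pod without the client label, including the slaves above, is therefore denied ingress to the Redis port:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: redis   # illustrative name
spec:
  # the policy targets the Redis pods themselves (master and slaves)
  podSelector:
    matchLabels:
      app: redis
      release: redis
  ingress:
    # only pods carrying the client label may reach the Redis port
    - from:
        - podSelector:
            matchLabels:
              redis-client: "true"
      ports:
        - port: 6379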

It's likely because you did not set redis.clusterDomain. By default it is cluster.local, and that is probably why the slaves cannot resolve the name of the master.

This reminds me of another bug I fixed in my local redis chart (but forgot to mention).
In https://github.com/helm/charts/blob/master/stable/redis/templates/configmap.yaml there is one line which is broken:
sentinel monitor {{ .Values.sentinel.masterSet }} {{ template "redis.fullname" . }}-master-0.{{ template "redis.fullname" . }}-headless.{{ .Release.Namespace }}.svc.cluster.local {{ .Values.redisPort }} {{ .Values.sentinel.quorum }}

cluster.local should be {{.Values.clusterDomain}}
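
Presumably the fix is to render the cluster domain from values, so the line would become something like:

sentinel monitor {{ .Values.sentinel.masterSet }} {{ template "redis.fullname" . }}-master-0.{{ template "redis.fullname" . }}-headless.{{ .Release.Namespace }}.svc.{{ .Values.clusterDomain }} {{ .Values.redisPort }} {{ .Values.sentinel.quorum }}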

Hi,
@avirtual we would be very pleased if you created pull requests to fix the kind of bugs you found.
Regarding the issue, @Yitaek, can you try @avirtual's suggestion?
In any case, if that was the problem, why can't I reproduce it?

It makes sense to me since enabling network policy only allows pods with the label to connect as the note says

It is true there is something weird here, thank you for reporting.

In any case, if that was the problem, why can't I reproduce it?

My guess would be that your DNS resolves service.namespace.svc.cluster.local addresses, while mine and others' do not.

I have no name!@redis-required-master-0:/$ redis-cli -h redis-required.required
redis-required.required:6379>
I have no name!@redis-required-master-0:/$ redis-cli -h redis-required.required.svc.kube-staging
redis-required.required.svc.kube-staging:6379>
I have no name!@redis-required-master-0:/$ redis-cli -h redis-required.required.svc.cluster.local
Could not connect to Redis at redis-required.required.svc.cluster.local:6379: Name or service not known

I tried with that, and mine doesn't resolve it either:
8:01:13 › k exec -it redis-slave-0 bash
I have no name!@redis-slave-0:/$ redis-cli -h redis-required.required.svc.cluster.local
Could not connect to Redis at redis-required.required.svc.cluster.local:6379: Name or service not known

BTW, I have created a pull request to fix what you mentioned: https://github.com/helm/charts/pull/14955

You need to use whatever service name and namespace you have.
The URL typically is SERVICE-NAME.NAMESPACE.svc.cluster.local.
In my case the namespace is 'required'; in your case it probably isn't.
kubectl get svc -n WHATEVER_NAMESPACE_YOU_USED | grep redis
then start from there.

@miguelaeh and @avirtual thanks for the discussion. Unfortunately, I don't believe it is a clusterDomain issue for me. It fails with the simple setup (just using the default namespace), and I can connect just fine to the master when network policies are disabled.

I have no name!@redis-slave-0:/$ redis-cli -h redis-master.default.svc.cluster.local
redis-master.default.svc.cluster.local:6379

Perhaps it is a race condition? If I initially bring up the Redis pods with networkPolicy disabled and then enable it on a subsequent upgrade, they connect fine until a pod dies. If I bring them up with networkPolicy enabled and manually add the redis-client: 'true' label to the slave pods, they connect fine, as you would expect.

Hi @Yitaek ,
Could you try to set a label to the pods?

kubectl label pod <pod-name> redis-client=true

Hi @miguelaeh, once I set the label, it works. If we update the Helm templates to add that label when networkPolicy is enabled, it would work out.
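
A minimal sketch of that change (illustrative only, assuming the slave StatefulSet template can key off networkPolicy.enabled; not the chart's actual diff) would be to add the label to the slave pod template:

  template:
    metadata:
      labels:
        role: slave
        {{- if .Values.networkPolicy.enabled }}
        # let the slaves through the NetworkPolicy that guards the master
        redis-client: "true"
        {{- end }}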

Hi @Yitaek,
We can do that, but anyway, I would like to understand why it works in my cluster. Could you tell me where you are deploying the chart?

@miguelaeh yup, here are the steps I'm taking.

  1. Create a standard GKE cluster with network policy enabled (it will install Calico)
  2. Install Helm
  3. Install Redis chart with network policy enabled

The slave pods will not come up healthy because they fail to talk to the master pod. You can label the pods and see them connect.

Thank you for your reporting @Yitaek,
I will try to create a pull request as soon as I have the time; please set the labels manually in the meantime.

@miguelaeh I took a crack at the PR. It whitelists the slave pods when cluster.slaveCount is >= 1.
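
Conceptually, the whitelist adds an extra ingress source to the NetworkPolicy template that matches the slave pods by their existing labels (a sketch under the assumption the template can read cluster.slaveCount; not the PR's exact diff):

{{- if gt (int .Values.cluster.slaveCount) 0 }}
# also accept replication traffic coming from the slave pods themselves
- from:
    - podSelector:
        matchLabels:
          app: redis
          release: {{ .Release.Name }}
          role: slave
{{- end }}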

Thank you very much for the PR, we will review it.
