Kops: kube-dns stops working after upgrading from 1.6.1 to 1.6.2 using kops rolling-update

Created on 16 May 2017 · 9 comments · Source: kubernetes/kops

I upgraded the cluster from 1.6.1 to 1.6.2 using the kops update/rolling-update features.

Initially I was seeing the following error with weave-net:

modprobe: module br_netfilter not found in modules.dep
Ignore the error if "br_netfilter" is built-in in the kernel
2017/05/16 01:42:13 error contacting APIServer: the server has asked for the client to provide credentials (get nodes); trying with fallback: http://localhost:8080
2017/05/16 01:42:13 Could not get peers: Get http://localhost:8080/api/v1/nodes: dial tcp [::1]:8080: getsockopt: connection refused
Failed to get peers

Once I deleted the weave-net deployment manually and re-created it, I stopped seeing that error. Now I am seeing the following:

kubectl logs kube-dns-16966370-02741 --namespace=kube-system kubedns

E0516 17:11:28.050939       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
I0516 17:11:28.413863       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:28.913878       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:29.052806       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
E0516 17:11:29.052806       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
I0516 17:11:29.413835       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:29.913822       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:30.054696       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
E0516 17:11:30.054707       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
I0516 17:11:30.413847       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:30.913821       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:31.056598       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
E0516 17:11:31.056598       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
I0516 17:11:31.413867       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:31.913832       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:32.058466       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
E0516 17:11:32.058521       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
I0516 17:11:32.413821       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:32.913855       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:33.060197       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
E0516 17:11:33.060237       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
I0516 17:11:33.413805       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:33.913821       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:34.062302       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
E0516 17:11:34.062380       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
I0516 17:11:34.413816       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:34.913869       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:35.064285       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
E0516 17:11:35.064361       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
I0516 17:11:35.413949       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
I0516 17:11:35.913850       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
E0516 17:11:36.065885       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Endpoints: the server has asked for the client to provide credentials (get endpoints)
E0516 17:11:36.065884       1 reflector.go:199] k8s.io/dns/vendor/k8s.io/client-go/tools/cache/reflector.go:94: Failed to list *v1.Service: the server has asked for the client to provide credentials (get services)
I0516 17:11:36.413823       1 dns.go:174] DNS server not ready, retry in 500 milliseconds
F0516 17:11:36.913871       1 dns.go:168] Timeout waiting for initialization

I am also seeing a similar error message in kubernetes-dashboard:

kubectl logs kubernetes-dashboard-2457468166-mfz3g  --namespace=kube-system
Using HTTP port: 9090
Creating API server client for https://100.64.0.1:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: the server has asked for the client to provide credentials (get .meta.k8s.io)
Refer to the troubleshooting guide for more information: https://github.com/kubernetes/dashboard/blob/master/docs/user-guide/troubleshooting.md

All 9 comments

@pandeybk I saw this too after a rolling-update. The fix for me was to delete the token associated with the service account:
You can find the token with this:

kubectl get serviceaccount/kube-dns --namespace=kube-system -o yaml

It will be under secrets[0].name.
Then just delete that token:

kubectl delete secret kube-dns-token-xxxx --namespace=kube-system

The masters should automatically recreate the token. You may need to delete the kube-dns pods after doing so.
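
If you want to confirm that a stale token is actually the culprit before deleting anything, here is a minimal diagnostic sketch (the variable names and the expectation of a 401 are my assumptions, not part of the original fix):

# Look up the token secret referenced by the kube-dns service account.
SECRET=$(kubectl get serviceaccount kube-dns --namespace=kube-system -o jsonpath='{.secrets[0].name}')
# Extract the bearer token stored in that secret.
TOKEN=$(kubectl get secret "$SECRET" --namespace=kube-system -o jsonpath='{.data.token}' | base64 --decode)
# Find the API server endpoint from the current kubeconfig context.
APISERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
# An HTTP 401 here means the API server no longer accepts the old token,
# which matches the "provide credentials" errors in the kube-dns logs.
curl -sk -o /dev/null -w '%{http_code}\n' -H "Authorization: Bearer $TOKEN" "$APISERVER/api/v1/namespaces/kube-system/services"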

@iterion, unfortunately I cannot test this now to confirm. I deleted that cluster completely and created a new one with version 1.6.2. I will try this solution if I run into the same issue again.

Thank you for your answer.

@iterion do we need to rotate a secret when we do a rolling update? Can you give me an ELI5 breakdown of the steps?

@chrislovecnm yeah, it probably wouldn't hurt to rotate the secret with the rolling update. Also, I'm not sure why this affected kube-dns but no other services. Though, I guess the kube-dns pods were the only pods in kube-system that weren't running on the masters. The ELI5 steps are pretty close to what I had above:
To remove the secret, you can do it in one step with:

kubectl delete secret $(kubectl get serviceaccount/kube-dns --namespace=kube-system -o jsonpath='{.secrets[0].name}') --namespace=kube-system

Then the only thing left would be to restart the dns pods, which you could do rather violently with:

kubectl delete pod -l k8s-app=kube-dns --namespace=kube-system

Let me know if I can clarify anything.
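
A quick way to verify the fix afterwards (just a sketch; the throwaway busybox pod and its name are placeholders I'm assuming, not part of the steps above):

# Confirm a fresh token secret was minted for the service account.
kubectl get serviceaccount kube-dns --namespace=kube-system -o jsonpath='{.secrets[0].name}'
# Confirm the replacement kube-dns pods are Running.
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
# Spin up a throwaway pod and check that cluster DNS resolves again.
kubectl run dns-test --rm -it --image=busybox --restart=Never -- nslookup kubernetes.default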

I'm getting the exact same error when running kubectl --namespace=kube-system logs kubernetes-dashboard-3498442487-4mxbj after upgrading from 1.5.4 to 1.6.4, and @iterion's fix doesn't work for me. I've also tried restarting each master and node sequentially to no avail.

Just to update: I managed to get my dashboard working by running this, taken straight from the dashboard repository:

kubectl create -f https://git.io/kube-dashboard

I was originally running this:

kubectl create -f https://raw.githubusercontent.com/kubernetes/kops/master/addons/kubernetes-dashboard/v1.6.0.yaml
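
After swapping manifests, a rough way to check the result (a sketch; substitute the actual pod name that the listing shows):

# Find the new dashboard pod.
kubectl get pods --namespace=kube-system | grep kubernetes-dashboard
# Tail its logs and make sure the "provide credentials" error is gone.
kubectl logs <dashboard-pod-name> --namespace=kube-system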

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
