K3s: Network or DNS problem for some pods

Created on 14 Jun 2019 · 25 comments · Source: k3s-io/k3s

I have a k3s cluster that has been running fine for some time but suddenly started having problems with DNS and/or networking. Unfortunately I haven't been able to determine what caused it or even what exactly the problem is.

This issue seems related but according to that it should be enough to change the coredns ConfigMap and that should already be fixed in this release of k3s.

The first sign of trouble was that metrics-server didn't report metrics for nodes. I found out that this was because it couldn't fully scrape metrics and timed out. Further investigation led me to believe that it wasn't able to resolve the nodes' hostnames.

To work around the first problem, I added the flags --kubelet-insecure-tls and --kubelet-preferred-address-types=InternalIP. It works, but I don't like it; everything worked fine before without these flags.

After this, I realized that this problem was not isolated to metrics-server. Other pods in the cluster are also unable to resolve any hostnames (cluster services or public). I haven't been able to find a pattern to it. The cert-manager pod can resolve everything correctly, but my test pods cannot resolve anything no matter what host they run on, same as metrics-server.
I guess it is relevant also to note that I can reach the internet just fine and lookup any public domain names on the nodes directly.
I have also tried changing the coredns ConfigMap to use 8.8.8.8 instead of /etc/resolv.conf.
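For reference, changing the upstream means editing the proxy line in the coredns ConfigMap (a sketch under the assumption that the rest of the Corefile, shown further below, stays unchanged):

```
# kubectl -n kube-system edit configmap coredns
# then replace:
proxy . /etc/resolv.conf
# with, e.g.:
proxy . 8.8.8.8 8.8.4.4
```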

System description

The cluster consists of 3 Raspberry Pis running Fedora IoT.

$ kubectl get nodes -o wide
NAME     STATUS   ROLES    AGE    VERSION         INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                             KERNEL-VERSION           CONTAINER-RUNTIME
fili     Ready    master   104d   v1.14.1-k3s.4   10.0.0.13     <none>        Fedora 29.20190606.0 (IoT Edition)   5.1.6-200.fc29.aarch64   containerd://1.2.5+unknown
kili     Ready    <none>   97d    v1.14.1-k3s.4   10.0.0.15     <none>        Fedora 29.20190606.0 (IoT Edition)   5.1.6-200.fc29.aarch64   containerd://1.2.5+unknown
pippin   Ready    <none>   41d    v1.14.1-k3s.4   10.0.0.2      <none>        Fedora 29.20190606.0 (IoT Edition)   5.1.6-200.fc29.aarch64   containerd://1.2.5+unknown

Relevant logs

CoreDNS logs messages like the following when one of the pods is trying to reach a service in another namespace (gitea):

2019-06-14T16:36:16.234Z [ERROR] plugin/errors: 2 gitea.gitea. AAAA: unreachable backend: read udp 10.42.4.93:49037->10.0.0.1:53: i/o timeout
2019-06-14T16:36:16.234Z [ERROR] plugin/errors: 2 gitea.gitea. A: unreachable backend: read udp 10.42.4.93:59310->10.0.0.1:53: i/o timeout

This is from the start of the CoreDNS logs:

$ kubectl -n kube-system logs coredns-695688789-lm947 
.:53
2019-06-12T19:01:15.388Z [INFO] CoreDNS-1.3.0
2019-06-12T19:01:15.389Z [INFO] linux/arm64, go1.11.4, c8f0e94
CoreDNS-1.3.0
linux/arm64, go1.11.4, c8f0e94
2019-06-12T19:01:15.389Z [INFO] plugin/reload: Running configuration MD5 = ef347efee19aa82f09972f89f92da1cf
2019-06-12T19:01:36.395Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:60396->10.0.0.1:53: i/o timeout
2019-06-12T19:01:39.397Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:56286->10.0.0.1:53: i/o timeout
2019-06-12T19:01:42.397Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:38791->10.0.0.1:53: i/o timeout
2019-06-12T19:01:45.399Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:39417->10.0.0.1:53: i/o timeout
2019-06-12T19:01:48.401Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:39276->10.0.0.1:53: i/o timeout
2019-06-12T19:01:51.401Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:36239->10.0.0.1:53: i/o timeout
2019-06-12T19:01:54.403Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:47541->10.0.0.1:53: i/o timeout
2019-06-12T19:01:57.404Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:39486->10.0.0.1:53: i/o timeout
2019-06-12T19:02:00.405Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:53211->10.0.0.1:53: i/o timeout
2019-06-12T19:02:03.405Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:53654->10.0.0.1:53: i/o timeout
2019-06-12T20:03:31.063Z [ERROR] plugin/errors: 2 update.containous.cloud. AAAA: unreachable backend: read udp 10.42.4.93:38504->10.0.0.1:53: i/o timeout
2019-06-12T20:03:36.064Z [ERROR] plugin/errors: 2 update.containous.cloud. AAAA: unreachable backend: read udp 10.42.4.93:38491->10.0.0.1:53: i/o timeout
2019-06-12T20:03:41.570Z [ERROR] plugin/errors: 2 api.github.com. AAAA: unreachable backend: read udp 10.42.4.93:56122->10.0.0.1:53: i/o timeout
2019-06-12T20:03:46.572Z [ERROR] plugin/errors: 2 api.github.com. AAAA: unreachable backend: read udp 10.42.4.93:39048->10.0.0.1:53: i/o timeout
2019-06-13T00:00:50.170Z [ERROR] plugin/errors: 2 stats.drone.ci. AAAA: unreachable backend: read udp 10.42.4.93:38093->10.0.0.1:53: i/o timeout
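All of these errors point at the same upstream (10.0.0.1:53). A quick way to summarize which upstreams CoreDNS is failing to reach is to extract the destination from each error line (a sketch; in practice you would pipe in the real `kubectl logs` output):

```shell
# Summarize which upstream DNS servers CoreDNS failed to reach,
# counted by occurrence. In practice:
#   kubectl -n kube-system logs <coredns-pod> | summarize
summarize() {
  grep -o 'read udp [0-9.]*:[0-9]*->[0-9.]*:53' |
    sed 's/.*->//' |
    sort | uniq -c | sort -rn
}

# Demo on two sample lines from the log above:
printf '%s\n' \
  'a: unreachable backend: read udp 10.42.4.93:60396->10.0.0.1:53: i/o timeout' \
  'b: unreachable backend: read udp 10.42.4.93:38504->10.0.0.1:53: i/o timeout' |
  summarize
```

If every failure targets one upstream, the problem is the path from the CoreDNS pod to that server rather than CoreDNS itself.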

Cert-manager pod has working DNS:

$ kubectl exec -it -n utils cert-manager-66bc958d96-b6b7k -- nslookup gitea.gitea
nslookup: can't resolve '(null)': Name does not resolve

Name:      gitea.gitea
Address 1: 10.43.111.72 gitea.gitea.svc.cluster.local
[lennart@legolas ~]$ kubectl exec -it -n utils cert-manager-66bc958d96-b6b7k -- nslookup www.google.com
nslookup: can't resolve '(null)': Name does not resolve

Name:      www.google.com
Address 1: 216.58.207.228 arn09s19-in-f4.1e100.net
Address 2: 2a00:1450:400f:80c::2004 arn09s19-in-x04.1e100.net

Debugging DNS with busybox pods:

[lennart@legolas ~]$ kubectl get pods -o wide
NAME           READY   STATUS    RESTARTS   AGE    IP            NODE     NOMINATED NODE   READINESS GATES
busybox        1/1     Running   47         2d     10.42.4.90    pippin   <none>           <none>
busybox-fili   1/1     Running   26         25h    10.42.0.132   fili     <none>           <none>
busybox-kili   1/1     Running   1          116m   10.42.2.167   kili     <none>           <none>
[lennart@legolas ~]$ kubectl exec -it  busybox -- nslookup www.google.com
;; connection timed out; no servers could be reached

command terminated with exit code 1
[lennart@legolas ~]$ kubectl exec -it  busybox -- nslookup gitea.gitea
;; connection timed out; no servers could be reached

command terminated with exit code 1
[lennart@legolas ~]$ kubectl exec -it  busybox-fili -- nslookup www.google.com
Server:     10.43.0.10
Address:    10.43.0.10:53

Non-authoritative answer:
Name:   www.google.com
Address: 2a00:1450:400f:80a::2004

*** Can't find www.google.com: No answer

[lennart@legolas ~]$ kubectl exec -it  busybox-fili -- nslookup gitea.gitea
;; connection timed out; no servers could be reached

command terminated with exit code 1
[lennart@legolas ~]$ kubectl exec -it  busybox-kili -- nslookup www.google.com
Server:     10.43.0.10
Address:    10.43.0.10:53

Non-authoritative answer:
Name:   www.google.com
Address: 2a00:1450:400f:807::2004

*** Can't find www.google.com: No answer

[lennart@legolas ~]$ kubectl exec -it  busybox-kili -- nslookup gitea.gitea
;; connection timed out; no servers could be reached

command terminated with exit code 1

Description of coredns ConfigMap:

====
Corefile:
----
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods insecure
      upstream
      fallthrough in-addr.arpa ip6.arpa
    }
    hosts /etc/coredns/NodeHosts {
      reload 1s
      fallthrough
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}

NodeHosts:
----
10.0.0.13 fili
10.0.0.2 pippin
10.0.0.15 kili

Some IP-related outputs (attached files):
fili-ip-route.txt
fili-iptables-save.txt
kili-ip-route.txt
kili-iptables-save.txt
pippin-ip-route.txt
pippin-iptables-save.txt

If you made it through all that, kudos to you! Sorry for the long description.

status: more-info


All 25 comments

Thanks for reporting this issue and all of the info @lentzi90 !

I think a good clue is that metrics-server needing kubelet-insecure-tls might indicate a cert issue. I am curious if there is some time drift between servers which may be causing an issue; if you aren't already syncing with ntp periodically, that might be a good thing to test and set up. There may be additional information in the k3s server or agent logs which would prove useful.

Sorry, actually you probably do need kubelet-insecure-tls with kubelet-preferred-address-types=InternalIP, as I think we only provide a cert for the hostname. The times in the log files look close enough, so I'm guessing that is not the issue. Can you provide some more info about the setup? Is it on a laptop, hosted, a VM, etc.? It is interesting that you are getting only IPv6 for nslookup www.google.com; from the iptables entries I am curious if CNI has run out of IPs or is otherwise having issues. Do you have a lot of pods that restart, or do you otherwise remove and deploy a lot of pods? If using the install script, k3s-killall.sh may help to reset the containers and networking on the nodes; you would then need to start the k3s server/agents again.

Thanks for the fast response!

You were right about the insecure-tls part, the certs are just for the hostnames.

This is a bare metal setup with Raspberry Pis (specifically fili and kili are model 3B+ and pippin is 3B).
They are all connected to a switch which in turn is connected to my home router (IP 10.0.0.1).

I wouldn't say that I restart/deploy/remove a lot of pods, but this cluster has been running for several weeks so in that time maybe yes.
The master node is already over 100 days old :)

I did use the install script but the k3s-killall.sh script did not exist at the time.
Would it be enough to just restart the systemd units after running this script or would everything be wiped?

Here are some logs from the nodes. I'm afraid they are a bit messy and have been rotated at different times.
fili-k3s-server.txt
kili-k3s-agent.txt
pippin-k3s-agent.txt

Checking time for all nodes:

$ for node in fili kili pippin; do ssh $node date; done
Sat Jun 15 09:17:13 UTC 2019
Sat Jun 15 09:17:14 UTC 2019
Sat Jun 15 09:17:16 UTC 2019

Checking resolv.conf for all nodes:

$ for node in fili kili pippin; do ssh $node cat /etc/resolv.conf; done
# Generated by NetworkManager
nameserver 10.0.0.1
# Generated by NetworkManager
nameserver 10.0.0.1
# Generated by NetworkManager
nameserver 10.0.0.1

For reference, the systemd units I use:

# Server unit
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
After=network-online.target

[Service]
Type=notify
EnvironmentFile=/etc/systemd/system/k3s.service.env
ExecStart=/usr/local/bin/k3s server --no-deploy=servicelb --kubelet-arg system-reserved=cpu=100m,memory=100Mi --kubelet-arg kube-reserved=cpu=200m,memory=300Mi
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always

[Install]
WantedBy=multi-user.target

# Agent unit
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
After=network.target

[Service]
Type=exec
EnvironmentFile=/etc/systemd/system/k3s-agent.service.env
ExecStart=/usr/local/bin/k3s agent --kubelet-arg system-reserved=cpu=100m,memory=100Mi --kubelet-arg kube-reserved=cpu=200m,memory=100Mi
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Restart=always

[Install]
WantedBy=multi-user.target

I should probably also mention that SELinux is set to permissive and Firewalld is disabled.

I installed the killall script on all nodes and ran it on one node at a time. Unfortunately, it didn't help. :disappointed:

Met the same problem; can't resolve by pod's hostname.

thanks for all the great info @lentzi90 ! Is 10.0.0.1 pointing to DNS running on the router? Is it possible to view the state of the home router to ensure that something like the NAT table hasn't filled up and DNS server is good? Rebooting the router and making sure the wires are still good (ping test to google or something similar) might help. It might also help to perform nslookup tests against a specific server (maybe 8.8.8.8) or start k3s pointed to a different resolv.conf. Seems like an issue where maybe the network itself is having problems, or perhaps a DNS change upstream is causing issues. If it is an upstream DNS change I would think that killing everything and restarting would help, maybe a reboot of the router & nodes are in order too.

The switch may also be a suspect for dropping UDP packets.

#544 may be related

@ericchiang In my situation no UDP packets are being dropped. Resolving by service name works; only resolving by pod name fails.

dig @XXX some-service.default.svc.cluster.local OK
dig @XXX podname-n.some-service.default.svc.cluster.local FAILED with no address returned

@chennqqi it looks like you might have accidentally pinged someone else by mistake
I think this issue is unique in that DNS was working fine for a long period of time but is now having sporadic issues. I suspect that it is a general network issue, so I am not sure there is a lot we can do without being more heavy handed and opinionated on the default resolv.conf settings for CoreDNS.

It looks like you are having a basic configuration issue with your pods @chennqqi, it might be worth looking over the instructions at https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pods, and file a new issue with lots of information if you are still having issues. For what it is worth I was able to modify the example a little to verify pod DNS is working:

[ 2019-06-19 17:56:27 ]
root@k3s-1:~$ kubectl exec -ti busybox -- nslookup k3s-1.default-subdomain.default.svc.cluster.local
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

Name:      k3s-1.default-subdomain.default.svc.cluster.local
Address 1: 10.42.1.3 k3s-1.default-subdomain.default.svc.cluster.local
[ 2019-06-19 17:56:34 ]
root@k3s-1:~$ kubectl exec -ti busybox -- nslookup default-subdomain.default.svc.cluster.local
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

Name:      default-subdomain.default.svc.cluster.local
Address 1: 10.42.1.3 k3s-1.default-subdomain.default.svc.cluster.local
Address 2: 10.42.0.7 k3s-2.default-subdomain.default.svc.cluster.local

@erikwilson you're right. I checked my service.yml and added subdomain; pods resolve OK now. Thank you very much!

@erikwilson the router at 10.0.0.1 is a very simple netgear router for home usage. I tried restarting it but it didn't help. As far as I can see it is operating normally, all laptops, phones and other devices connected to it work just fine.

Also, all nodes have been updated to k3s 0.6.1 now and rebooted without any effect.

The router is running a DHCP server configured to give out addresses from 10.0.0.2 to 10.0.0.254. It is using 8.8.8.8 and 8.8.4.4 for DNS.

I did some more debugging:

  • Connected fili (master) and pippin (worker, running the CoreDNS pod at the time) directly to the router instead of to the switch. No effect.
  • Changed the /etc/resolv.conf to include 8.8.8.8 and 8.8.4.4 (in addition to the automatic 10.0.0.1) on all nodes. No effect.
  • Tried pinging external IP addresses and the router itself. This is working from pods on all nodes.

I have run into #544 or similar before on Ubuntu with kubeadm but I don't think this is the same issue. Actually systemd-resolved is not running on any of the machines, could that be a problem?

Found a pattern: the node where CoreDNS is running is unable to nslookup www.google.com if I don't specify a server. (If I do nslookup www.google.com 8.8.8.8, they all work fine.) Any idea what this could mean?

# Note that coredns is running on node pippin
$ kubectl get pods -n kube-system -o wide
NAME                             READY   STATUS      RESTARTS   AGE     IP           NODE     NOMINATED NODE   READINESS GATES
coredns-695688789-625vl          1/1     Running     0          8m5s    10.42.4.24   pippin   <none>           <none>
helm-install-traefik-lnx75       0/1     Completed   0          64m     10.42.2.17   kili     <none>           <none>
metrics-server-7cf965b7d-k4h5v   1/1     Running     0          5h16m   10.42.4.22   pippin   <none>           <none>
tiller-deploy-5d47d8c8f7-2m976   1/1     Running     0          5h16m   10.42.4.20   pippin   <none>           <none>
traefik-56688c4464-47t4h         1/1     Running     0          63m     10.42.2.18   kili     <none>           <none>

# Lookup from pod running on pippin fails
$ kubectl exec -it  busybox-pippin -- nslookup www.google.com
;; connection timed out; no servers could be reached

command terminated with exit code 1

# Lookup on other node works
$ kubectl exec -it  busybox-kili -- nslookup www.google.com
Server:         10.43.0.10
Address:        10.43.0.10:53

Non-authoritative answer:
Name:   www.google.com
Address: 172.217.20.36

*** Can't find www.google.com: No answer

Now if I cordon pippin and kill the coredns pod it ends up on kili instead and I get the reverse result:

# Note that coredns is running on node kili
$ kubectl get pods -n kube-system -o wide
NAME                             READY   STATUS      RESTARTS   AGE     IP           NODE     NOMINATED NODE   READINESS GATES
coredns-695688789-8jclm          1/1     Running     0          5m45s   10.42.2.20   kili     <none>           <none>
helm-install-traefik-lnx75       0/1     Completed   0          74m     10.42.2.17   kili     <none>           <none>
metrics-server-7cf965b7d-k4h5v   1/1     Running     0          5h26m   10.42.4.22   pippin   <none>           <none>
tiller-deploy-5d47d8c8f7-2m976   1/1     Running     0          5h26m   10.42.4.20   pippin   <none>           <none>
traefik-56688c4464-47t4h         1/1     Running     0          73m     10.42.2.18   kili     <none>           <none>

# Lookup from pod running on kili fails
$ kubectl exec -it  busybox-kili -- nslookup www.google.com
;; connection timed out; no servers could be reached

command terminated with exit code 1

# Lookup on other node works
$ kubectl exec -it  busybox-pippin -- nslookup www.google.com
Server:         10.43.0.10
Address:        10.43.0.10:53

Non-authoritative answer:
Name:   www.google.com
Address: 216.58.207.196

*** Can't find www.google.com: No answer

Nothing interesting in the CoreDNS logs:

$ kubectl -n kube-system logs coredns-695688789-8jclm 
.:53
2019-06-21T14:55:05.782Z [INFO] CoreDNS-1.3.0
2019-06-21T14:55:05.782Z [INFO] linux/arm64, go1.11.4, c8f0e94
CoreDNS-1.3.0
linux/arm64, go1.11.4, c8f0e94
2019-06-21T14:55:05.783Z [INFO] plugin/reload: Running configuration MD5 = ef347efee19aa82f09972f89f92da1cf
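Since only pods colocated with CoreDNS fail, one thing worth ruling out (an assumption on my part, not something confirmed in this thread) is the kernel forwarding and bridge-netfilter settings on each node, which must be enabled for pod-to-service traffic to be handled correctly by iptables:

```shell
# Print kernel settings that commonly break pod->service traffic when
# disabled; "missing" means the key (or the sysctl tool) is unavailable,
# e.g. when the br_netfilter module is not loaded.
for key in net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables; do
  val=$(sysctl -n "$key" 2>/dev/null || echo missing)
  echo "$key = $val"
done
```

A value of 0 for either key, or "missing" for the bridge key, would be suspicious on the node hosting CoreDNS.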

I'll give it another few days but if I don't find a solution after that I'll just reinstall.

met the same problem
can't resolve by pod's hostname

EDIT: Though I think there are two separate issues here, one about no DNS records being present for pods and another about the larger DNS issue with the nodes that @lentzi90 is having. Maybe this should be split into two separate tickets?

EDIT2: Actually I didn't have an issue after all. I was looking for the old .pod.cluster.local DNS entries which apparently have been replaced by ...svc.cluster.local.

Yes, it should be split into another ticket @varesa. This issue isn't about resolving a pod's DNS entries, it is about resolving any DNS from within a pod on a system which used to function fine.

Please look over this comment and file a new issue: https://github.com/rancher/k3s/issues/535#issuecomment-503800950

@erikwilson actually it turns out that this was also a misunderstanding on my part in addition to being off-topic (even if the topic was already mixed here before). Cleaned up some of the irrelevant information as it's unlikely to be of use to anyone. Hope you find a resolution to the real issue here as well

Alright, here is an update.
I reinstalled the cluster and it is now working fine. Metrics-server works without any extra args and all pods can resolve services just the way they should. I used the install script to install k3s after some struggles with other methods (see below).

Is there anything more I could do to find the cause of this, or should I just close the issue for now?

As a side note, I tried to use ansible to set everything up this time instead of using the install script. It did not go well. Basically there were problems with containerd all the time. Anyway, that doesn't belong in this issue. I will investigate some more and open a new issue if necessary.

I am also seeing this issue. My cluster is now ~32 days old and I am seeing failed DNS requests, both internal and external. The issue is very strange because it's sporadic: the DNS requests start working, but after a while they start failing again, and then the cycle repeats. I have turned on logging for CoreDNS and I hope to capture something useful.

Hi all, I'm currently facing this issue too.
ENV:

  • Rancher: v2.2.7
  • Kubernetes: v1.13.5
  • Docker: v18.9.6

But it only happens when we benchmark our system. (screenshot attached in the original issue)
One more thing: I'm wondering if there is any issue with deploying all 5 nodes with all 3 roles (etcd, controlplane, worker). When we benchmark our deployed API pods, the Rancher UI becomes unresponsive and etcd keeps reporting an unhealthy status.

I don't know if this helps anyone else. But on my raspi 4 cluster, I installed docker.io package and that's when DNS inside the cluster stopped working. apt-get remove docker.io solved this particular issue for me

I ran into a very similar problem and solved it by restarting dockerd on the system.

Before the fix, my CoreDNS pod had the exact same error logs as the OP:

2019-06-12T19:01:36.395Z [ERROR] plugin/errors: 2 3521834610273354494.4337686964088628928. HINFO: unreachable backend: read udp 10.42.4.93:60396->10.0.0.1:53: i/o timeout

Then in journalctl I saw this message, and followed this post.

May  4 10:44:44 ari dockerd: time="2020-05-04T10:44:44.186337155+08:00" level=warning msg="IPv4 forwarding is disabled. Networking will not work."
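If that warning shows up, re-enabling IPv4 forwarding and persisting it is the usual fix; a sketch assuming standard sysctl conventions rather than anything confirmed in this thread:

```
# Enable immediately:
#   sysctl -w net.ipv4.ip_forward=1
# Persist across reboots, e.g. in /etc/sysctl.d/99-ip-forward.conf:
net.ipv4.ip_forward = 1
```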

Similar to @zackb, after removing docker.io from my master node, rebooting, and killing all pods, everything returned to normal. It looks like it's related to Docker using the older iptables while k3s uses nftables; mixing both seems to be a recipe for disaster.

I'm having the same issues, but restarting k3s (systemctl restart k3s) on the master and agents fixes it.

I have a similar problem. I even tried setting up nodelocaldns to improve the situation, but the problem still exists.

I have encountered this problem many times; it has been bothering me for a long time.

It seems to be caused by wrong iptables rules, but I haven't found the root cause.

The direct symptom is that you cannot access other services by cluster IP, so none of the pods running on the node can reach the kube-dns service. When I encounter this problem, the following method works:

# Flush all rules and delete user-defined chains in the filter table
iptables -F
iptables -X
# ...and in the nat table
iptables -F -t nat
iptables -X -t nat

The commands above flush the iptables rules; restarting k3s then recreates them and the problem is resolved. But I don't know when it will happen again: after running for hours or days, it comes back.

The following is a snapshot of the iptables rules (left: node in the abnormal state, right: normal state). (screenshot attached in the original issue)
