I've been chasing this bug for months: pods can't talk to CoreDNS on agent nodes.
This is on a fresh v0.8.1 arm64 deployment with 3 nodes (one master, two agents), but the same issue existed with previous k3s versions, on kernel 4.4 or 5.3; the host is Arch.
iptables v1.8.3 (legacy)
Using the default install script:
curl -sfL https://get.k3s.io | K3S_URL=https://rk0:6443 K3S_TOKEN=xxxx sh -
Expected:
The 10.43.0.10 DNS service (and, I assume, the whole network) should be correctly set up on each node.
It's easy to test, since the host can't reach the DNS when the problem appears:
dig www.google.com @10.43.0.10
I found that scaling up the coredns deployment moves the working node to an agent, leaving the master node unable to reach the DNS.
sudo kubectl scale -n kube-system deployment.v1.apps/coredns --replicas=3
For some reason it sometimes just works from all 3 nodes, but most of the time it doesn't.
I've tried starting the agent manually after the boot sequence completes, with no luck, and I've compared the iptables output; everything looks fine...
I've also tried pointing coredns at 8.8.8.8 directly, with no result.
I ran into the same issue last month. I'm using k3s v0.7.0 on the master, with 3 nodes, but coredns deploys on only one node, even though the node selector is "beta.kubernetes.io/os = linux".
To "solve" this issue I created a clone of the coredns deployment for each node...
Note: this doesn't actually solve the issue. Kubernetes needs just one instance of CoreDNS; putting a copy of CoreDNS on the local node restored DNS for pods on that node, but broke it on the other node.
EDIT: In my case I've tested running 2 k3s VMs on VirtualBox (using a shared network), with the same OS as my server (Ubuntu 18.04), and the default DNS of k3s worked fine. So I think my problem is related to my router (MikroTik), even though I disabled all the firewall rules; my servers are behind a NAT too. I will keep trying.
EDIT 2: I've installed a new cluster with the same setup using kubeadm and the Weave CNI: one VM on DigitalOcean (all-in-one k8s) and one bare-metal machine behind the NAT (MikroTik), with all ports forwarded (dst-nat) to my local node. Pod communication worked fine, but stopped working when Weave Net encryption was enabled (the default when using the Rancher command to deploy the cluster).
It also doesn't work using Rancher with Canal networking. I will try all the Rancher options and dig further into the cause, but I think the issue is not related to the OS (iptables) but to the MikroTik NAT.
Same LAN here.
I've patched 0.9.1 to use host-gw instead of vxlan and all the problems disappeared.
Since 0.10 brings options for flannel, are you interested in a patch to enable host-gw?
The diffs for 0.9.1 are very small:
--- a/pkg/agent/flannel/flannel.go
+++ b/pkg/agent/flannel/flannel.go
@@ -29,6 +29,7 @@ import (
log "k8s.io/klog"
// Backends need to be imported for their init() to get executed and them to register
+ _ "github.com/coreos/flannel/backend/hostgw"
_ "github.com/coreos/flannel/backend/vxlan"
)
diff --git a/pkg/agent/flannel/setup.go b/pkg/agent/flannel/setup.go
index c2da4f34..a6f6f11b 100644
--- a/pkg/agent/flannel/setup.go
+++ b/pkg/agent/flannel/setup.go
@@ -38,7 +38,7 @@ const (
netJSON = `{
"Network": "%CIDR%",
"Backend": {
- "Type": "vxlan"
+ "Type": "host-gw"
}
}
`
I think it would be okay to have an option for host-gw; not sure how @ibuildthecloud feels about it.
It would be good to get to the bottom of the issue, though.
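For anyone landing here on a newer release: k3s v0.10+ exposes the flannel backend as a server flag, so no patching should be needed there. A sketch, assuming all nodes sit on the same L2 segment (which host-gw requires):

```shell
# k3s >= v0.10: pick the flannel backend at server start instead of
# patching the binary. host-gw routes pod traffic via the node's routing
# table, so every node must be directly reachable on the same L2 network.
k3s server --flannel-backend=host-gw
```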
I have a similar issue, not sure if it's the same. I noticed this problem when deploying k3s to more than one node. In my case it seems the master node cannot resolve DNS while the other nodes can, so any workload ending up on the master fails to connect to things. For example, when deploying external-dns or cert-manager, if they end up on the master they fail.
Can confirm, I'm experiencing exactly the same issue as @johnae on k3s version v1.0.0 (18bd921c) on a multi-node Raspberry Pi setup. Have you been able to find a workaround for that problem other than scheduling pods onto nodes other than master?
I have a v1.0 k3s cluster with 3 masters and 2 agents, and the same problem here. To add some details:
- Only the node running the coredns pod can resolve via 10.43.0.10; the others (both masters and agents) can't.
- Pods with hostNetwork: true can't resolve via it; normal in-cluster networking is okay.
- dig SERVICE @10.43.0.10 +tcp works on any node.
Another workaround/fix is to use NodeLocal DNSCache https://kubernetes.io/docs/tasks/administer-cluster/nodelocaldns/. Replace __PILLAR__DNS__SERVER__, __PILLAR__LOCAL__DNS__ and __PILLAR__DNS__DOMAIN__ with the desired values. Hope it helps.
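A sketch of that NodeLocal DNSCache install, following the kubernetes.io page linked above. The values are assumptions: 10.43.0.10 is the stock k3s kube-dns service IP, and 169.254.20.10 is the link-local address conventionally used for the node-local cache.

```shell
# Assumed values: stock k3s DNS service IP, conventional link-local
# address for the node-local cache, default cluster domain.
kubedns=10.43.0.10
localdns=169.254.20.10
domain=cluster.local

# Fetch the upstream manifest.
curl -sLO https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml

# Substitute the pillar placeholders with the values above.
sed -i "s/__PILLAR__LOCAL__DNS__/$localdns/g; s/__PILLAR__DNS__DOMAIN__/$domain/g; s/__PILLAR__DNS__SERVER__/$kubedns/g" nodelocaldns.yaml

kubectl apply -f nodelocaldns.yaml
```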
I'm thinking this issue happens when your DNS server is one of the hosts itself.
I feel like I have a similar problem.
With only the master node, DNS works well.
With additional nodes, DNS also works, but only in pods that are not on the same node as the coredns pod.
I'm thinking this issue happens when your DNS server is one of the hosts itself.
Indeed! My DNS server is deployed on the same host.
I'm thinking this issue happens when your DNS server is one of the hosts itself.
Can confirm this is not universal. I believe I am running into this issue, and my upstream DNS server is external to both the router and any k3s node.
I had thought that perhaps it might be an issue with a mixed-architecture cluster; my master is running on a Raspberry Pi 4 with Raspbian Buster; I have one worker node on AMD64/Ubuntu 18.04. I haven't been able to test out the multi-arch theory due to lack of nodes (I only have the one RPi right now).
Another commonality I see mentioned in this thread is that I have a MikroTik router. I will go down that rabbit hole momentarily. I think it's a fair possibility that this, or something on the host-OS side, is the issue, and that it's related to VXLAN, because I can only ping pod IPs on the local node in the cluster.
And I've resolved my issue.
Check your firewalls -- make sure that your nodes can communicate with each other on UDP port 8472 (assuming you're using the default VXLAN backend for Flannel).
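To make that concrete, here is a sketch of opening the flannel VXLAN port on a plain-iptables host. The subnet is a placeholder, and the exact tooling is an assumption on my part; adapt to ufw/firewalld if that's what your distro uses.

```shell
# Sketch: allow flannel VXLAN traffic in from the other nodes.
# 10.0.0.0/24 is a placeholder; replace it with your node subnet.
sudo iptables -A INPUT -p udp --dport 8472 -s 10.0.0.0/24 -j ACCEPT

# Verify nothing else in the INPUT chain rejects the port.
sudo iptables -L INPUT -n | grep 8472
```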
@akenakh this could explain why the host-gw backend was working and VXLAN was not, in your case.
I'm thinking this issue happens when your DNS server is one of the hosts itself.
This seems to be the case for me.
Everything was working fine when I had my DHCP and DNS handled by my router and forwarding DNS requests to a DNS server inside my cluster (PiHole).
When I tried changing the DHCP and DNS to use PiHole directly (no changes to the pods, only router settings), the pods using hostNetwork: true all broke (they are pinned to the same node). All of the requests from the host-network pods seem to go through coredns and fail when it tries to forward them to 8.8.8.8.
Playing around inside one of the host-network pods, I found that queries to 10.43.0.1 resolved fine for cluster DNS, as did querying 127.0.0.1.
EDIT: Sorry, I had some config here to try to get all names to resolve, but it seemed to only use the first nameserver. Specifying both didn't have the desired effect.