K3s: coreDNS unable to resolve upstream

Created on 27 Feb 2019 · 9 Comments · Source: k3s-io/k3s

Hello, I have a plain installation of k3s on Ubuntu 18.04.

I am running a container which is failing to resolve DNS:

# nslookup index.docker.io 10.43.0.10
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'index.docker.io': Try again

# nslookup quay.io 10.43.0.10
Server:    10.43.0.10
Address 1: 10.43.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'quay.io': Try again
# k3s kubectl logs -f pod/coredns-7748f7f6df-8htwl -n kube-system
2019-02-26T22:52:50.556Z [ERROR] plugin/errors: 2 index.docker.io. AAAA: unreachable backend: read udp 10.42.0.6:50878->1.1.1.1:53: i/o timeout
2019-02-26T22:52:50.556Z [ERROR] plugin/errors: 2 index.docker.io. A: unreachable backend: read udp 10.42.0.6:38587->1.1.1.1:53: i/o timeout
2019-02-26T22:53:18.425Z [ERROR] plugin/errors: 2 quay.io. AAAA: unreachable backend: read udp 10.42.0.6:48427->1.1.1.1:53: i/o timeout
2019-02-26T22:53:18.425Z [ERROR] plugin/errors: 2 quay.io. A: unreachable backend: read udp 10.42.0.6:53214->1.1.1.1:53: i/o timeout

I am not sure what 1.1.1.1 is or where it's coming from.

help wanted


All 9 comments

I am not sure what 1.1.1.1 is or where it's coming from.

This is Cloudflare's public DNS service, analogous to Google's 8.8.8.8.

Hmm, it's probably being blocked on my network. Any idea how it's being configured and how I could change it?

@latchmihay We may have hardcoded 1.1.1.1; we will make that configurable. The default behavior of k8s is to use the host's /etc/resolv.conf as the upstream DNS, but because systemd-resolved is the default these days (and older dnsmasq setups behave similarly), that file typically points at a 127.0.0.x IP, which then breaks. It's quite hard in general to figure out what the upstream DNS should actually be, so we probably hardcoded it to 1.1.1.1.
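
For anyone wanting to confirm where the upstream comes from on their own cluster, the CoreDNS ConfigMap can be inspected. A minimal sketch; the sample Corefile below is only an approximation of what k3s generated at the time, and your actual output may differ:

```shell
# On a live cluster you would dump the Corefile with:
#   kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}'
# Here a sample Corefile is grepped locally to show the line that pins
# the upstream resolver.
corefile='.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa
    proxy . 1.1.1.1
    cache 30
}'
# Every query CoreDNS cannot answer from the cluster zone is proxied here:
printf '%s\n' "$corefile" | grep proxy
```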

We will add this as an option to the agent and also document it.

@ibuildthecloud Please keep the current behavior but make it configurable. Keeping a sane default saves a lot of time: I've hit the issue you describe many times, and otherwise this becomes an extra step on every new server installation...
Anyway, thank you; I was hoping to migrate to the new Rancher v2.

I fixed it by changing the coredns configmap from 1.1.1.1 to 8.8.8.8... for whatever reason I could not reach 1.1.1.1:53.

This can be done by replacing proxy . 1.1.1.1 with your own DNS server in cm coredns. I wrote a detailed guide on how to change this manually, and in an automated way for tools like Ansible, here: https://devops.stackexchange.com/a/6521/6923
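
The replacement can be scripted with sed. A sketch under assumptions: 192.168.0.19 stands in for your own DNS server, and the Corefile fragment is only a sample, so check your actual ConfigMap contents first:

```shell
# On a live cluster the edit would be piped through kubectl, e.g.:
#   kubectl -n kube-system get cm coredns -o yaml \
#     | sed 's/proxy \. 1\.1\.1\.1/proxy . 192.168.0.19/' \
#     | kubectl apply -f -
# Demonstrated here on a sample Corefile fragment:
corefile='proxy . 1.1.1.1'
patched="$(printf '%s\n' "$corefile" | sed 's/proxy \. 1\.1\.1\.1/proxy . 192.168.0.19/')"
echo "$patched"
```

The CoreDNS pod may also need to be restarted before the updated ConfigMap takes effect.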

We have created a release candidate v0.3.0-rc3 which will hopefully fix these DNS issues. Please try it out and let me know if it helps!

The setting is configurable: we will either take a --resolv-conf flag to pass down to the kubelet, or the K3S_RESOLV_CONF environment variable will work as well. We now try to use the system resolv.conf files (from /etc and systemd), and will create a /tmp/k3s-resolv.conf file with nameserver 8.8.8.8 if the nameservers in the system files are not global unicast IPs.
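
The fallback behaviour described above can be sketched roughly as follows. This is only an illustration of the described logic, not the actual k3s code: it checks merely for loopback/link-local prefixes rather than doing full global-unicast validation, and the sample resolv.conf with the systemd-resolved stub address is an assumption:

```shell
# If no usable nameserver is found in the system resolv.conf (e.g. only the
# systemd-resolved stub 127.0.0.53), fall back to a generated file that
# points at 8.8.8.8, as described for the v0.3.0-rc3 behaviour.
resolv='nameserver 127.0.0.53'   # typical /etc/resolv.conf under systemd-resolved
if printf '%s\n' "$resolv" | grep -Eq '^nameserver (127\.|169\.254\.)'; then
    printf 'nameserver 8.8.8.8\n' > /tmp/k3s-resolv.conf
    echo "using fallback /tmp/k3s-resolv.conf"
fi
cat /tmp/k3s-resolv.conf
```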

I tried it out on Ubuntu 16.04.6 LTS with v0.3.0 (9a1a1ec), since the final 0.3 got released a few hours ago. Using curl -sfL https://get.k3s.io | K3S_RESOLV_CONFIG=192.168.0.19 sh - and removing my sed workaround from cm/coredns, it works, but only without a custom TLD:

root@rocket-chat:/# ping my-pc
PING my-pc.fritz.box (192.168.0.20) 56(84) bytes of data.
64 bytes from my-PC.fritz.box (192.168.0.20): icmp_seq=1 ttl=61 time=0.787 ms

But when I try ping my-pc.fritz.box, it can't resolve. nslookup also times out:

root@rocket-chat:/# nslookup my-pc.fritz.box
;; connection timed out; no servers could be reached

Using other machines in the same network that use 192.168.0.19 as their DNS server, both names resolve successfully. Although I'm able to resolve my-pc.fritz.box inside the Vagrant VM itself, it may have something to do with the fact that I'm trying this in Vagrant on Ubuntu 18.04. Content of /etc/resolv.conf inside Vagrant:

nameserver 10.0.2.2
search fritz.box

Update: It's a Kubernetes issue

I found out that this was caused by the Kubernetes ndots config. By default, Kubernetes sets options ndots:5 in the pod's resolv.conf. This means a DNS name must contain at least five dots before it is processed as an absolute name. my-pc doesn't contain any dots, so it's resolved by our upstream 192.168.0.19, where we have an alias without the .fritz.box suffix by default.

But my-pc.fritz.box contains two dots. The usual default setting is ndots:1, so any DNS name with at least one dot is resolved as an absolute name. Since Kubernetes uses ndots:5, my-pc.fritz.box is treated as a relative name, so all suffixes from search are applied. This can't work: it appends another .fritz.box suffix, so my-pc.fritz.box becomes my-pc.fritz.box.fritz.box.
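
The effect of ndots on the lookup order can be simulated in a few lines of shell. This is a simplified sketch of the resolver's candidate-list logic, assuming a single search domain (fritz.box) rather than a full search list:

```shell
# If a name has fewer dots than ndots, search suffixes are tried before the
# name itself; otherwise the name is tried first.
candidates() {  # candidates <name> <ndots> <search-domain>
    name=$1; ndots=$2; search=$3
    dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
    if [ "$dots" -ge "$ndots" ]; then
        echo "$name"; echo "$name.$search"
    else
        echo "$name.$search"; echo "$name"
    fi
}
candidates my-pc.fritz.box 5 fritz.box   # ndots:5 -> my-pc.fritz.box.fritz.box is tried first
candidates my-pc.fritz.box 1 fritz.box   # ndots:1 -> my-pc.fritz.box is tried first
```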

I assume this is meant to speed things up for internal cluster DNS entries, but for external DNS it can slow things down. Using apt-get to install some debug packages like netutils was very slow; after I switched back to the usual default of ndots:1 it got about as fast as on my working machine. You can also find blog posts about this issue. But in my case, the primary problem was that it breaks my absolute external DNS entries.

To solve this, customize the pod's DNS configuration by setting dnsConfig at the pod level in the pod spec:

  spec:
    containers:
    # ...
    dnsConfig:
      options:
        - name: ndots
          value: "1"

But with regard to Kubernetes' own DNS, I'd consider this a workaround for local purposes, since I'm not yet fully sure about its performance in production. As another solution, we could force absolute domain names with a trailing dot.

Currently I'm using the dnsConfig entry, and DNS works well with my custom server. So this problem wasn't related to k3s directly, and the fix in 0.3 works well :)

Is there any way to set the dnsConfig options globally instead of on a per-pod basis?
