Kubeadm: Install on a system using `systemd-resolved` leads to broken DNS

Created on 19 May 2017 · 18 comments · Source: kubernetes/kubeadm

What keywords did you search in kubeadm issues before filing this one?

systemd resolved dns

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): v1.6.3
Environment:

  • Kubernetes version (use kubectl version): v1.6.3
  • Cloud provider or hardware configuration: bare metal
  • OS (e.g. from /etc/os-release): Ubuntu 17.04
  • Kernel (e.g. uname -a): Linux gjc-XPS-8500 4.10.0-21-generic #23-Ubuntu SMP Fri Apr 28 16:14:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
  • Others:

What happened?

Installed Kubernetes on bare metal using kubeadm. DNS inside pods did not work.

What you expected to happen?

I would expect DNS inside pods to work.

How to reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

As noted in kubernetes/kubernetes#45828, the problem is that on a normal Ubuntu desktop (and maybe other desktop Linux OSes), /etc/resolv.conf contains 127.0.0.35, which doesn't work inside Pods.

The correct fix is to add --resolv-conf=/run/systemd/resolve/resolv.conf to the kubelet configuration when systemd-resolved is running with DNSStubListener and /etc/resolv.conf points at the local stub resolver (solution suggested by @antoineco and @thockin).
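As an illustration, one way to apply that flag is a systemd drop-in. This is a hypothetical sketch (the drop-in filename and the use of KUBELET_EXTRA_ARGS are assumptions based on how the kubeadm packages wire up the kubelet unit; adjust for your distro):

```ini
# /etc/systemd/system/kubelet.service.d/20-resolv-conf.conf
# Point kubelet at the real upstream resolver list maintained by
# systemd-resolved, instead of the 127.0.0.53 stub in /etc/resolv.conf.
[Service]
Environment="KUBELET_EXTRA_ARGS=--resolv-conf=/run/systemd/resolve/resolv.conf"
```

Followed by `systemctl daemon-reload && systemctl restart kubelet` for the change to take effect.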

kind/bug priority/important-soon


All 18 comments

So kubeadm doesn't lay down the kubelet startup configuration; that's done in the systemd unit file, which comes from here: https://github.com/kubernetes/release

/cc @marcoceppi @castrojo - this appears to be an Ubuntu default for desktop setups.

@timothysc @marcoceppi @castrojo Critical for v1.7?

@luxas no.

Sorry, not sure if anybody will still look at closed issues. #272 is not resolved by the solution suggested here.

Please reopen #272 or start working on this issue considering the other context as well.

I'm hitting this when I try to use kubeadm with GCE's ubuntu-1710 image so it looks like it's _not_ limited to the desktop install.

As an FYI: as I commented on https://github.com/kubernetes/kubernetes/issues/45828, I don't believe that over-riding kubelet's resolv.conf reference will work anyway. This will just dump a broken (referencing 127.0.0.53) resolv.conf into all the pods and bypass cluster-local resolution. The current state of affairs is that just external resolution is broken because kube-dns has a broken upstream, but it is able to stub the cluster-local zones off to k8s. The only fix I can see is adding / editing config to kube-dns / CoreDNS.

NB

  • It's not just Ubuntu desktop; this isn't a NetworkManager thing, this is systemd-resolved, which is used on the server version of 17.10 at least.
  • It's 127.0.0.53 (as in the DNS port), not 35

@mt-inside that's why pointing kubelet to /run/systemd/resolve/resolv.conf makes sense, because in an environment running systemd-resolved:

  1. /etc/resolv.conf contains only one entry: localhost
  2. /run/systemd/resolve/resolv.conf contains your actual DNS servers

kube-dns merely uses whatever nameservers kubelet provides as its forwarders, so if kubeadm configures kubelet to use 2) instead of 1) you're all set.
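To make the distinction concrete, here is a hypothetical check in the spirit of the preflight check kubeadm later added: it flags a resolv.conf that points at the systemd-resolved stub listener (127.0.0.53), which is unreachable from inside pod network namespaces. The function name and messages are illustrative, and the demo uses a sample file rather than the real /etc/resolv.conf:

```shell
# check_resolv FILE - warn if FILE routes DNS to the local systemd-resolved stub.
check_resolv() {
  if grep -qE '^nameserver[[:space:]]+127\.0\.0\.53' "$1"; then
    echo "stub resolver detected: pass --resolv-conf=/run/systemd/resolve/resolv.conf to kubelet"
  else
    echo "resolv.conf looks usable by pods"
  fi
}

# Demonstrate on a sample file shaped like /etc/resolv.conf under systemd-resolved.
tmp=$(mktemp)
printf 'nameserver 127.0.0.53\nsearch example.com\n' > "$tmp"
check_resolv "$tmp"
rm -f "$tmp"
```

On an affected node, running the same grep against /etc/resolv.conf versus /run/systemd/resolve/resolv.conf shows why only the latter is safe to hand to kubelet.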

@antoineco I agree that'll get kube-dns forwarding correctly, but won't every other user-level Pod in the system then go straight to your upstream servers and not query kube-dns at all? When I tried the --resolv-conf option, it just used that file verbatim and didn't inject the kube-dns Service ClusterIP (the --resolv-conf option was ignored until I removed the --cluster-dns option)

By default, if --cluster-dns is set (should be!), all user workloads send DNS requests to kube-dns, which in turn does the forwarding job for you.

What you described is the behaviour of ClusterFirst.

ref https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pods-dns-policy
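For illustration, the two policies being discussed look like this in a Pod spec (a minimal hypothetical pod; ClusterFirst is the default for regular pods, while Default inherits the node's resolv.conf via kubelet):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-demo
spec:
  # ClusterFirst (the default) sends lookups to the kube-dns Service
  # ClusterIP, which forwards external names to its upstream servers.
  # "Default" would instead inherit the node's resolv.conf via kubelet.
  dnsPolicy: ClusterFirst
  containers:
  - name: main
    image: busybox
    command: ["sleep", "3600"]
```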

@antoineco Ah, you're right. I was confused about dnsPolicy. I was confused about what CoreDNS is running as, because _Default isn't the default_. I also confused myself by looking at a ClusterFirst Pod that was falling back to Default when I didn't specify --cluster-dns in some of my tests. Also, the scope of --resolv-conf (not applying to ClusterFirst) and --cluster-dns (not applying to Default) isn't documented, and I didn't think of it until I really grokked the different DNS modes.

I agree this fix is perfectly sensible.

So what is the consensus?

@timothysc Sorry, it's not spelt out. A combination of what @antoineco says here and @thockin says on https://github.com/kubernetes/kubernetes/issues/45828
Kubelet needs the argument --resolv-conf=/run/systemd/resolve/resolv.conf
My kubeadm wrapper script adds that to KUBELET_DNS_ARGS in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

However (deferring to the kubeadm authors here):

  • I don't know what kubelet's behaviour is wrt non-existent files. If it doesn't like them, this should _only_ be done on systems running systemd-resolved
  • You seem to think the kubelet's args file isn't laid down by kubeadm, but by https://github.com/kubernetes/release ? I take it .../10-kubeadm.conf comes from this project at least and could be used?

I've hit the very same issue with kubeadm 1.10.0 and CoreDNS, with even worse results: when asked to resolve any external name, CoreDNS starts looping to itself, consuming all allowed RAM and getting OOM-killed.

Obviously it can be fixed either with the kubelet --resolv-conf parameter (as mentioned above) or by editing the ConfigMap containing the Corefile, but it takes a moment to realise what's failing and why. It's unfortunate that the default setup fails so miserably.

I've raised an issue in CoreDNS tracker for better handling of such a misconfiguration on CoreDNS side: https://github.com/coredns/coredns/issues/1647

/assign @detiber @timothysc

seems like a duplicate of https://github.com/kubernetes/kubeadm/issues/787
which is being worked on.

Yes, this one and #787 are duplicates. I'll close #787 as this one is older.

As we have the preflight check (added in https://github.com/kubernetes/kubernetes/pull/63691), I'm going to close this.
To make this work automatically, we have filed https://github.com/kubernetes/kubeadm/issues/845

Thanks a lot to everyone who contributed to fixing this!
