Kind: Private DNS Fails on 0.8 with systemd-resolved

Created on 15 May 2020 · 18 Comments · Source: kubernetes-sigs/kind

What happened: Deployments that reference a private registry that is only resolvable via a private DNS server still fail on 0.8. It is not exactly clear to me what is serving DNS on the 172.18.0.1 interface. The systemd-resolved stub listens on 127.0.0.53, and its authors are pretty adamant that it should never be exposed on other interfaces.
Relevant logs below.

What you expected to happen: The new DNS approach would allow k8s to successfully resolve my hosts and fetch container images.

How to reproduce it (as minimally and precisely as possible): Configure systemd-resolved to reference a DNS server on a private network/VPN. That same private network hosts a container registry. Create a kind cluster and deploy something that pulls from that registry.
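For reference, per-link DNS like mine can be set up roughly like this; the interface name (wg0) and the server address are stand-ins for my real VPN settings:

# Point the VPN link at the private DNS server and route the private domain to it
resolvectl dns wg0 10.10.0.2
resolvectl domain wg0 '~myprivatedomain.com'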

Anything else we need to know?: I see that it is attempting to use the bridge interface (172.18.0.1), but the DNS server (proxy?) on that IP still does not resolve my private hosts.

Environment: Arch Linux connecting to a remote network via a Wireguard VPN.
The VPN (and its DNS) are defined in systemd-networkd and systemd-resolved.

  • kind version: (use kind version): kind v0.8.0 go1.14.2 linux/amd64
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-30T20:19:45Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info): Server Version: 19.03.8-ce
  • OS (e.g. from /etc/os-release):
NAME="Manjaro Linux"
ID=manjaro
ID_LIKE=arch
PRETTY_NAME="Manjaro Linux"

Pod Docker Log

  Warning  Failed     23m (x4 over 25m)    kubelet, ricks-local-control-plane  Failed to pull image "registry.myprivatedomain.com/project/core/bifrost:latest": rpc error: code = Unknown desc = failed to pull and unpack image "registry.myprivatedomain.com/project/core/bifrost:latest": failed to resolve reference "registry.myprivatedomain.com/project/core/bifrost:latest": failed to do request: Head https://registry.myprivatedomain.com/v2/project/core/bifrost/manifests/latest: dial tcp: lookup registry.myprivatedomain.com on 172.18.0.1:53: no such host

Checking DNS from my local host

➜ drill @127.0.0.53 registry.myprivatedomain.com
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 9308
;; flags: qr rd ra ; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; registry.myprivatedomain.com.    IN  A

;; ANSWER SECTION:
registry.myprivatedomain.com.   60  IN  A   10.192.5.119
registry.myprivatedomain.com.   60  IN  A   10.192.10.209
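For comparison, the same lookup fails from inside the kind node container (node name taken from the events above; getent is just one way to check):

# From the host, exec into the node container and try to resolve the registry
docker exec ricks-local-control-plane getent hosts registry.myprivatedomain.com
# exits non-zero with no output, while the same lookup on the host succeeds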
Labels: kind/external, kind/support


All 18 comments

Digging into iptables within the docker container, I see :

[1735:124684] -A DOCKER_OUTPUT -d 172.18.0.1/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:50484
[0:0] -A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -m tcp --sport 46063 -j SNAT --to-source 172.18.0.1:53
[0:0] -A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -m udp --sport 50484 -j SNAT --to-source 172.18.0.1:53
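For anyone repeating this, that output came from the NAT table inside the node container; something along these lines, with the node name from the events above:

# Dump the NAT table, with counters, from inside the kind node container
docker exec ricks-local-control-plane iptables-save -c -t nat | grep DOCKER_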

I am pretty sure it's pointing at Docker's embedded DNS, then. Now to figure out what that server/proxy is using for resolution, and/or if there is a way to tweak it.

Following that route, it appears that what I need to be able to do is pass a --dns flag containing my private DNS server to the container invocation. This will set it as a fallback DNS server, which should suit my purposes well.

Not sure if this is the most current documentation, but:

In the absence of the --dns=IP_ADDRESS..., --dns-search=DOMAIN..., or --dns-opt=OPTION... options, Docker uses the /etc/resolv.conf of the host machine (where the docker daemon runs). While doing so the daemon filters out all localhost IP address nameserver entries from the host's original file.

Filtering is necessary because all localhost addresses on the host are unreachable from the container's network. After this filtering, if there are no more nameserver entries left in the container's /etc/resolv.conf file, the daemon adds public Google DNS nameservers (8.8.8.8 and 8.8.4.4) to the container's DNS configuration.
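As a sanity check of that default-bridge behavior, something like this should show --dns landing directly in a container's resolv.conf (the image and address are just examples):

docker run --rm --dns 10.10.0.2 busybox cat /etc/resolv.conf
docker run --rm --dns 10.10.0.2 busybox nslookup registry.myprivatedomain.com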

If I may summarize what I've found:

  1. systemd-resolved cannot listen on anything other than localhost (127.0.0.*)
  2. Docker's embedded DNS will consult the host's /etc/resolv.conf, but it will not use any servers listed there that are localhost, because they are unreachable from the docker container.
  3. systemd-resolved does not add any entries to /etc/resolv.conf other than 127.0.0.53. If you configure it to use/forward additional servers, it updates the daemon's internal state, not resolv.conf (a quick check is shown below).
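A quick way to see point 3 (the VPN link name is a placeholder):

resolvectl status wg0      # shows the private DNS server attached to the VPN link
cat /etc/resolv.conf       # still only "nameserver 127.0.0.53"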

I'm still investigating a blessed configuration that would supply my desired DNS server to Docker via resolv.conf.

Having reviewed the documentation as well as some forums requesting hacks to get around this, I'm pretty convinced that we shouldn't rely on a person's ability to directly curate their /etc/resolv.conf, and that kind should provide a means of passing DNS servers to Docker via the --dns flag or something similar.

The DNS in 0.8+ uses docker's embedded DNS, which actually does resolution on the host in dockerd, proxying your host resolvers.

It's up to the user to ensure that docker can use your resolver.

All of that bit about docker filtering localhost addresses is a red herring; it only applies on the default docker network, which we no longer use. That's not happening here. Instead, docker does not pass your host resolv.conf at all; it injects a listener into the container which is then handled inside dockerd, which does resolution out on the host.

Also, it's not actually using the bridge interface; it is using the embedded DNS listener. https://docs.docker.com/network/bridge/#differences-between-user-defined-bridges-and-the-default-bridge

@BenTheElder - Thanks for clarifying. That gels with my observations. Also, looking at the docs, it does say that it uses /etc/hosts and /etc/resolv.conf to configure the default DNS server for a container. It doesn't directly state that it filters out localhost, but that would explain my problem.

Testing this hypothesis, there are two points at which I can supply additional DNS servers: in daemon.json, or by passing the --dns flag to docker run.

I have updated my /etc/docker/daemon.json to include my private DNS server, and this does fix the problem.

However, I would really prefer not to require my users to alter their /etc/docker/daemon.json in order to use kind as a local dev environment. I would greatly prefer the ability to add a DNS configuration to the KindConfig that adds servers via the --dns flag.

Also, looking at the docs, it does say that it uses /etc/hosts and /etc/resolv.conf to configure the default DNS server for a container. It doesn't directly state that it filters out localhost, but that would explain my problem.

Let me be a bit more specific, docker doesn't do _any_ filtering to supply /etc/resolv.conf inside containers that are on a non-default bridge (as in kind 0.8+). It doesn't pass these through _at all_, instead injecting config specifying the embedded resolver which is just a proxy to your host (well, and a proxy that intercepts requests for container names and returns their IPs). We slightly modify this in kind but it's totally unrelated to what is resolved.
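You can see the injected config directly; in this setup the nameserver it contains appears to be the network gateway seen in the error above (172.18.0.1), which the embedded listener then handles:

# Inspect the resolv.conf injected into the node container
docker exec ricks-local-control-plane cat /etc/resolv.conf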

However, I would really prefer not to require my users to alter their /etc/docker/daemon.json in order to use kind as a local dev environment. I would greatly prefer the ability to add a DNS configuration to the KindConfig that adds servers via the --dns flag.

This isn't really about kind though; for whatever reason dockerd (not the containers!) will not use your resolver by default, and that's a bug somewhere between docker and your resolver.

docker does inject a loopback-filtered config into containers on the default bridge, but that's because the default bridge doesn't implement the embedded listener and a loopback resolver couldn't be reached from inside the container because loopback is local to your network namespace.

The embedded listener, on the other hand, only has the injected socket listening inside the container namespace; the rest of the resolver is out on the host.

Docker literally will not respect the --dns flag of docker run when on a non-default network, and in fact even if you run a container on the default network and then connect it to any user-defined network, it will switch exclusively to the embedded resolver.

This is because the embedded resolver is intended to solve these kinds of problems by ensuring resolution actually occurs on the host, outside the container. It also solves peer discovery.

https://github.com/docker/for-linux/issues/325#issuecomment-397431363

I'm not sure why you needed to modify the daemon settings though, it should be respecting the host configuration.

AFAICT your actual issue is https://github.com/moby/moby/issues/38243
/kind external

I have run into this issue before, largely because systemd-resolved's resolver uses the per-interface DNS internally, but it doesn't set those nameservers in the /etc/resolv.conf. This is one part of my problem, since my private DNS comes from a VPN interface.

This isn't the only problem, though. Despite the documentation stating otherwise, Docker's embedded DNS does not use my host's default DNS resolver; otherwise it would work. If I ping/host/gethostbyname() from my host, the internal DNS names resolve. If I do this from within any container, they do not (unless I set the dns settings in daemon.json).

Within the kind container, they don't resolve unless I set dns in daemon.json. The various symlinkable resolv.conf files provided by systemd-resolved all say the same thing: nameserver 127.0.0.53
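For anyone checking the same thing, these are the usual candidates (paths per the systemd-resolved documentation):

ls -l /etc/resolv.conf                       # where the symlink currently points
cat /run/systemd/resolve/stub-resolv.conf    # stub mode: nameserver 127.0.0.53
cat /run/systemd/resolve/resolv.conf         # non-stub variant, intended to list upstream servers directly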

I can't prove it just yet, but I'm still quite sure that nothing within the container is consulting my host's DNS; instead it is going straight to my upstream DNS provider.

The kind container is not consulting upstream DNS, it is consulting the embedded listener, which then consults whatever dockerd / the docker daemon picks up. That much I am certain of.

Docker not picking up your resolver is a docker bug, and can't be fixed in KIND.

The VPN interface and the loopback resolver would not be reachable inside the container for a container using the default bridge. However since we use a non-default network (like say, docker compose does), if you fix the docker daemon config then the embedded DNS should make it viable for your clusters to reach your resolver.

To summarize:

  • The systemd-resolved loopback resolver will not be directly reachable in any docker container, due to being on loopback, which is in the host's namespace, and is not reachable from within any container.

    • We cannot point the containers at this directly because of that; adding a config option to do so won't change it

  • The systemd-resolved loopback resolver should be reachable via the embedded DNS listener <-> dockerd on the host

    • dockerd seems to not be respecting the systemd-resolved setting on the host

To fix that last point, we'll need work done upstream like https://github.com/moby/moby/issues/38243.

It doesn't make sense for kind to do anything else about this, we're not responsible for the docker daemon on the host, nor the system resolver / config. There isn't any reasonable workaround for this within kind.

@BenTheElder - Thanks for looking into this. I agree with your assessment, even though I don't like it. :)

I don't like it either fwiw :disappointed:
We prefer to work around broken behavior where possible, e.g. https://github.com/kubernetes-sigs/kind/pull/1589 / https://github.com/kubernetes-sigs/kind/issues/1581 / https://github.com/containers/libpod/issues/4318 ...

For future Googlers of this issue:

Workarounds

I have found either of these two approaches to work:

  1. Remove the symlinked /etc/resolv.conf and create your own, then add your nameserver. Then launch kind.
    This is a supported configuration. In this mode, systemd-resolved will read /etc/resolv.conf instead of setting it. Since /etc/resolv.conf only really exists for legacy reasons in the eyes of systemd-resolved, this is a reasonable approach.
cp /etc/resolv.conf /tmp/resolv.conf
echo "nameserver 10.10.0.2" >> /tmp/resolv.conf
sudo rm /etc/resolv.conf
sudo mv /tmp/resolv.conf /etc/
  2. Add a dns entry to your /etc/docker/daemon.json
    I am not sure why this works, but it does. In a user-defined bridge network, like the one that kind uses at 0.8+, a container's /etc/resolv.conf points to the bridged ethernet interface, and iptables then redirects that to a DNS server on some high random port. This is Docker's embedded DNS server, and it acts according to some undocumented magic. At the moment, this works. There is no way to know if this is intended behavior, or if it will last forever.

(If you already have an /etc/docker/daemon.json file:)

cat /etc/docker/daemon.json | jq --arg ns "10.10.0.2" '. + { dns: [$ns] }' > /tmp/daemon.json && sudo mv /tmp/daemon.json /etc/docker 

If you don't have an /etc/docker/daemon.json file, just create one with these contents:

{
  "dns": [
    "10.10.0.2"
  ]
}

Then systemctl restart docker.
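After either workaround, a quick sanity check that Docker-level resolution now works (the image name is just an example):

# A throwaway container should now be able to resolve the private registry
docker run --rm busybox nslookup registry.myprivatedomain.com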
