When running an alpine:latest container attached to a non-default bridge network (which means Docker's internal DNS is enabled), I observe that failed DNS lookups are processed very slowly: resolving an unknown host takes about 10 seconds to fail.
Steps to reproduce:
$ docker network create --driver bridge alpine_test
$ docker network inspect alpine_test
[
{
"Name": "alpine_test",
"Id": "a1bf8d14aa4b6918a6810f93f46dcf953d91a31532544d7eac89760cccfbcdda",
"Created": "2017-07-24T10:36:56.348996363+02:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "172.20.0.0/16",
"Gateway": "172.20.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"Containers": {},
"Options": {},
"Labels": {}
}
]
Run an alpine:latest container attached to this network:
$ docker run -it --network alpine_test alpine
/ #
/ # time getent hosts unknown_host
Command exited with non-zero status 2
real 0m 10.00s
user 0m 0.00s
sys 0m 0.00s
/ # time getent hosts google.com
2a00:1450:4001:820::200e google.com google.com
real 0m 0.05s
user 0m 0.00s
sys 0m 0.00s
/ # time getent hosts $(hostname)
172.20.0.2 846b9c66c58f 846b9c66c58f
real 0m 0.00s
user 0m 0.00s
sys 0m 0.00s
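The three timings above can be collected in one pass with a small helper. This is only a sketch: `time_lookup` and the `LOOKUP_CMD` override are names made up for illustration, and the resolution is whole seconds, which is enough to tell a 10 s failure from a fast answer.

```shell
#!/bin/sh
# Time a single lookup and print "<name> <exit status> <elapsed seconds>".
# LOOKUP_CMD is a hypothetical override so the helper can be tried without
# network access; by default it runs `getent hosts` as in the report above.
time_lookup() {
    name=$1
    start=$(date +%s)
    ${LOOKUP_CMD:-getent hosts} "$name" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$name $status $((end - start))"
}

# Usage inside the container:
#   for n in unknown_host google.com "$(hostname)"; do time_lookup "$n"; done
```

A non-zero status together with a large elapsed time on the first name would match the behaviour described here.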
Running this command under strace shows multiple DNS requests and SERVFAIL replies during this process (see the attached file strace.txt).
Software versions:
$ docker --version
Docker version 17.05.0-ce, build 89658be
$ uname -rv
4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017
$ cat /etc/os-release
NAME="Linux Mint"
VERSION="18.2 (Sonya)"
ID=linuxmint
ID_LIKE=ubuntu
PRETTY_NAME="Linux Mint 18.2"
VERSION_ID="18.2"
HOME_URL="http://www.linuxmint.com/"
SUPPORT_URL="http://forums.linuxmint.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/linuxmint/"
VERSION_CODENAME=sonya
UBUNTU_CODENAME=xenial
This does not happen if the default bridge network is used, where /etc/resolv.conf points to 8.8.8.8 and 8.8.4.4.
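To confirm which resolver a given container is actually using, the nameserver entries can be read from its /etc/resolv.conf. The sketch below assumes the usual Docker behaviour that containers on user-defined networks are pointed at the embedded DNS server on 127.0.0.11; the function names are hypothetical.

```shell
#!/bin/sh
# Print the nameserver addresses listed in a resolv.conf-style file
# (defaults to /etc/resolv.conf when no path is given).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed if the file points at Docker's embedded DNS server.
# Assumption: the embedded server is reachable at 127.0.0.11.
uses_embedded_dns() {
    list_nameservers "$1" | grep -qx '127\.0\.0\.11'
}

# Usage inside the container:
#   list_nameservers
#   uses_embedded_dns && echo "embedded DNS in use"
```

On the default bridge this should list the upstream servers directly (8.8.8.8 and 8.8.4.4 in the setup above), so `uses_embedded_dns` would be expected to fail there.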
Further observations:
centos:latest results in multiple DNS requests as well. However, these get answered much faster, that is, within 0.1 s on my system.
--network host Docker option.
centos:latest and ubuntu:latest do not have this issue.
tcpdump also shows some really weird queries containing apparently random character sequences during container startup and shutdown:
13:52:07.761760 IP 127.0.0.1.32151 > 127.0.1.1.53: 50468+ A? nyqtvpnmnuzwvy. (32)
13:52:07.762060 IP 127.0.0.1.56184 > 127.0.1.1.53: 57235+ A? yuoyizvh. (26)
13:52:07.762296 IP 127.0.1.1.53 > 127.0.0.1.32151: 50468 ServFail 0/0/0 (32)
13:52:07.762399 IP 127.0.0.1.13377 > 127.0.1.1.53: 31969+ A? lbinbfgjcfssof. (32)
13:52:07.762533 IP 127.0.0.1.52391 > 127.0.1.1.53: 5419+ A? nyqtvpnmnuzwvy. (32)
13:52:07.762537 IP 127.0.1.1.53 > 127.0.0.1.56184: 57235 ServFail 0/0/0 (26)
13:52:07.762607 IP 127.0.0.1.39877 > 127.0.1.1.53: 56596+ A? yuoyizvh. (26)
13:52:07.762963 IP 127.0.1.1.53 > 127.0.0.1.13377: 31969 ServFail 0/0/0 (32)
13:52:07.763021 IP 127.0.1.1.53 > 127.0.0.1.52391: 5419 ServFail 0/0/0 (32)
13:52:07.763045 IP 127.0.1.1.53 > 127.0.0.1.39877: 56596 ServFail 0/0/0 (26)
13:52:07.763141 IP 127.0.0.1.37716 > 127.0.1.1.53: 61231+ A? lbinbfgjcfssof. (32)
13:52:07.763547 IP 127.0.0.1.53119 > 127.0.1.1.53: 33293+ A? nyqtvpnmnuzwvy. (32)
13:52:07.763657 IP 127.0.1.1.53 > 127.0.0.1.37716: 61231 ServFail 0/0/0 (32)
13:52:07.763671 IP 127.0.0.1.52634 > 127.0.1.1.53: 14470+ A? yuoyizvh. (26)
13:52:07.763994 IP 127.0.0.1.53713 > 127.0.1.1.53: 5245+ A? lbinbfgjcfssof. (32)
13:52:07.764196 IP 127.0.1.1.53 > 127.0.0.1.53119: 33293 ServFail 0/0/0 (32)
13:52:07.764245 IP 127.0.1.1.53 > 127.0.0.1.52634: 14470 ServFail 0/0/0 (26)
13:52:07.764341 IP 127.0.0.1.43291 > 127.0.1.1.53: 33293+ A? nyqtvpnmnuzwvy. (32)
13:52:07.764381 IP 127.0.0.1.60569 > 127.0.1.1.53: 14470+ A? yuoyizvh. (26)
13:52:07.764494 IP 127.0.1.1.53 > 127.0.0.1.53713: 5245 ServFail 0/0/0 (32)
13:52:07.764632 IP 127.0.0.1.59036 > 127.0.1.1.53: 5245+ A? lbinbfgjcfssof. (32)
13:52:07.764934 IP 127.0.1.1.53 > 127.0.0.1.43291: 33293 ServFail 0/0/0 (32)
13:52:07.764981 IP 127.0.1.1.53 > 127.0.0.1.60569: 14470 ServFail 0/0/0 (26)
13:52:07.765089 IP 127.0.1.1.53 > 127.0.0.1.59036: 5245 ServFail 0/0/0 (32)
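A capture like the one above is easier to read when it is reduced to one line per queried name. The following is a rough sketch that assumes the exact tcpdump text layout shown here (query ID in field 6, record type in field 7, name in field 8); `summarize_dns` is a made-up name, and other tcpdump versions or options may print different columns.

```shell
#!/bin/sh
# Summarize tcpdump DNS text output as "<name> <queries> <ServFail replies>".
# Replies are matched to names through the DNS query ID.
summarize_dns() {
    awk '
        # Query line, e.g.: "... 50468+ A? nyqtvpnmnuzwvy. (32)"
        $7 ~ /\?$/ {
            id = $6
            sub(/\+$/, "", id)      # strip the recursion-desired flag
            name[id] = $8
            queries[$8]++
        }
        # Reply line, e.g.: "... 50468 ServFail 0/0/0 (32)"
        $7 == "ServFail" { fails[name[$6]]++ }
        END { for (n in queries) print n, queries[n], fails[n] + 0 }
    ' "$@" | sort
}

# Usage: save the capture to a file, then run
#   summarize_dns capture.txt
```

The function also reads standard input when no file is given, so tcpdump output can be piped into it directly.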
Looks like musl's DNS resolver is going crazy for some reason.
We are observing the same. We run Kubernetes 1.5/1.6/1.7 with Calico or Weave, and all of them show the same problem with Alpine images. Switching to a Docker image based on another distro, such as Debian, solves our problem. (This happens with Alpine 3.4/3.5/3.6.)
Seems to have resolved itself -- not happening with the latest Alpine. Thanks anyway :)
Facing the same issue even on Alpine 3.7
Alpine Linux uses musl libc for DNS resolution, and musl sends both an A query and an AAAA query concurrently by default.
In my case there is no need to query AAAA by default, so I coded a fix that drops the AAAA query for the AF_UNSPEC family and sends only an A query. It worked for me.
I posted the solution here: https://github.com/kubernetes/kubernetes/issues/56903#issuecomment-409603030
Hope it helps.
Thanks,
harper