Docker-alpine: DNS resolution failures are slow in `alpine:latest` containers attached to a non-default bridge network

Created on 24 Jul 2017 · 5 Comments · Source: gliderlabs/docker-alpine

When running an alpine:latest container attached to a non-default bridge network (that is, one with Docker's internal DNS enabled), I observe that failed DNS lookups take very long to return: about 10 seconds in the example below.

Steps to reproduce:

  1. Create a new bridged network:
$ docker network create --driver bridge alpine_test
$ docker network inspect alpine_test 
[
    {
        "Name": "alpine_test",
        "Id": "a1bf8d14aa4b6918a6810f93f46dcf953d91a31532544d7eac89760cccfbcdda",
        "Created": "2017-07-24T10:36:56.348996363+02:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.20.0.0/16",
                    "Gateway": "172.20.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "Containers": {},
        "Options": {},
        "Labels": {}
    }
]
  2. Run an alpine:latest container attached to this network:
$ docker run -it --network alpine_test alpine
/ #
  3. Try resolving an unknown hostname and compare the timing with known hostnames:
/ # time getent hosts unknown_host
Command exited with non-zero status 2
real    0m 10.00s
user    0m 0.00s
sys     0m 0.00s
/ # time getent hosts google.com
2a00:1450:4001:820::200e  google.com  google.com
real    0m 0.05s
user    0m 0.00s
sys     0m 0.00s
/ # time getent hosts $(hostname)
172.20.0.2        846b9c66c58f  846b9c66c58f
real    0m 0.00s
user    0m 0.00s
sys     0m 0.00s
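
The comparison in step 3 can also be scripted, which makes it easier to repeat across images. The following is a minimal sketch, assuming Python 3 is available inside the container (it is not part of alpine:latest by default, so a Python image built on Alpine would be needed). The helper name `time_lookup` and its `resolver` parameter are hypothetical and exist only for this illustration.

```python
import socket
import time


def time_lookup(host, resolver=socket.getaddrinfo):
    """Return (resolved, seconds) for a single lookup of `host`."""
    start = time.monotonic()
    try:
        resolver(host, None)
        resolved = True
    except socket.gaierror:
        # The name did not resolve (unknown host, SERVFAIL, timeout, ...).
        resolved = False
    return resolved, time.monotonic() - start


if __name__ == "__main__":
    # Same three names as in the transcript above.
    for name in ("unknown_host", "google.com", socket.gethostname()):
        ok, seconds = time_lookup(name)
        print(f"{name}: resolved={ok} in {seconds:.2f}s")
```

Python's `socket.getaddrinfo` goes through the C library's resolver, so on an Alpine-based image this should exercise the same musl code path as `getent hosts`.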

Running this command under strace shows multiple DNS requests and SERVFAIL replies during this process (see the attached file strace.txt).

Software versions:

$ docker --version
Docker version 17.05.0-ce, build 89658be
$ uname  -rv
4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017
$ cat /etc/os-release 
NAME="Linux Mint"
VERSION="18.2 (Sonya)"
ID=linuxmint
ID_LIKE=ubuntu
PRETTY_NAME="Linux Mint 18.2"
VERSION_ID="18.2"
HOME_URL="http://www.linuxmint.com/"
SUPPORT_URL="http://forums.linuxmint.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/linuxmint/"
VERSION_CODENAME=sonya
UBUNTU_CODENAME=xenial

This does not happen if the default bridge network is used, where /etc/resolv.conf points to 8.8.8.8 and 8.8.4.4.
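
To compare the two setups, it can help to print what the resolver configuration looks like inside each container. Below is a small sketch that parses resolv.conf text; the function name `parse_resolv_conf` is made up for this example. On a user-defined network the file typically lists Docker's embedded DNS server at 127.0.0.11, but that value is an assumption here since the report does not show the file.

```python
def parse_resolv_conf(text):
    """Collect nameserver, search and options entries from resolv.conf text."""
    config = {"nameserver": [], "search": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        keyword, *values = line.split()
        if keyword in config:
            config[keyword].extend(values)
    return config


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        print(parse_resolv_conf(handle.read()))
```

Running it once in a container on the default bridge and once in a container on alpine_test should show which nameserver each one queries.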


All 5 comments

Further observations:

  • Running centos:latest results in multiple DNS requests as well. However, these get answered much faster, that is, within 0.1 s on my system.
  • The same happens when using the --network host Docker option.
  • The centos:latest and ubuntu:latest images do not have this issue.
  • tcpdump also shows some really weird queries containing apparently random character sequences during container startup and shutdown:
13:52:07.761760 IP 127.0.0.1.32151 > 127.0.1.1.53: 50468+ A? nyqtvpnmnuzwvy. (32)
13:52:07.762060 IP 127.0.0.1.56184 > 127.0.1.1.53: 57235+ A? yuoyizvh. (26)
13:52:07.762296 IP 127.0.1.1.53 > 127.0.0.1.32151: 50468 ServFail 0/0/0 (32)
13:52:07.762399 IP 127.0.0.1.13377 > 127.0.1.1.53: 31969+ A? lbinbfgjcfssof. (32)
13:52:07.762533 IP 127.0.0.1.52391 > 127.0.1.1.53: 5419+ A? nyqtvpnmnuzwvy. (32)
13:52:07.762537 IP 127.0.1.1.53 > 127.0.0.1.56184: 57235 ServFail 0/0/0 (26)
13:52:07.762607 IP 127.0.0.1.39877 > 127.0.1.1.53: 56596+ A? yuoyizvh. (26)
13:52:07.762963 IP 127.0.1.1.53 > 127.0.0.1.13377: 31969 ServFail 0/0/0 (32)
13:52:07.763021 IP 127.0.1.1.53 > 127.0.0.1.52391: 5419 ServFail 0/0/0 (32)
13:52:07.763045 IP 127.0.1.1.53 > 127.0.0.1.39877: 56596 ServFail 0/0/0 (26)
13:52:07.763141 IP 127.0.0.1.37716 > 127.0.1.1.53: 61231+ A? lbinbfgjcfssof. (32)
13:52:07.763547 IP 127.0.0.1.53119 > 127.0.1.1.53: 33293+ A? nyqtvpnmnuzwvy. (32)
13:52:07.763657 IP 127.0.1.1.53 > 127.0.0.1.37716: 61231 ServFail 0/0/0 (32)
13:52:07.763671 IP 127.0.0.1.52634 > 127.0.1.1.53: 14470+ A? yuoyizvh. (26)
13:52:07.763994 IP 127.0.0.1.53713 > 127.0.1.1.53: 5245+ A? lbinbfgjcfssof. (32)
13:52:07.764196 IP 127.0.1.1.53 > 127.0.0.1.53119: 33293 ServFail 0/0/0 (32)
13:52:07.764245 IP 127.0.1.1.53 > 127.0.0.1.52634: 14470 ServFail 0/0/0 (26)
13:52:07.764341 IP 127.0.0.1.43291 > 127.0.1.1.53: 33293+ A? nyqtvpnmnuzwvy. (32)
13:52:07.764381 IP 127.0.0.1.60569 > 127.0.1.1.53: 14470+ A? yuoyizvh. (26)
13:52:07.764494 IP 127.0.1.1.53 > 127.0.0.1.53713: 5245 ServFail 0/0/0 (32)
13:52:07.764632 IP 127.0.0.1.59036 > 127.0.1.1.53: 5245+ A? lbinbfgjcfssof. (32)
13:52:07.764934 IP 127.0.1.1.53 > 127.0.0.1.43291: 33293 ServFail 0/0/0 (32)
13:52:07.764981 IP 127.0.1.1.53 > 127.0.0.1.60569: 14470 ServFail 0/0/0 (26)
13:52:07.765089 IP 127.0.1.1.53 > 127.0.0.1.59036: 5245 ServFail 0/0/0 (32)

Looks like musl's DNS resolver is going crazy for some reason.

We are observing the same thing, running Kubernetes 1.5/1.6/1.7 with Calico or Weave. All of them show the same problem with Alpine images. Switching to a Docker image based on another distro, such as Debian, solves the problem for us. (This happens with Alpine 3.4/3.5/3.6.)

Seems to have resolved by itself -- not happening with the latest Alpine. Thanks anyway :)

Facing the same issue even on Alpine 3.7

Alpine Linux uses musl libc to resolve DNS names, and by default it sends the A query and the AAAA query concurrently.
In my case there is no need to query AAAA by default, so I coded a fix that removes the AAAA query for the AF_UNSPEC family and sends only an A query. It worked for me.

I posted the solution here https://github.com/kubernetes/kubernetes/issues/56903#issuecomment-409603030

hope it can help

thanks
harper
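
For applications that cannot patch musl, a related idea is to request IPv4 only at the call site. The following is a sketch under that assumption; it is not the patch from the comment above, and the helper name `resolve` is hypothetical. Passing AF_INET to getaddrinfo asks only for IPv4 addresses, whereas AF_UNSPEC asks for both families.

```python
import socket


def resolve(host, family=socket.AF_UNSPEC):
    """Return the sorted unique addresses getaddrinfo reports for `host`."""
    infos = socket.getaddrinfo(host, None, family=family, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print("AF_UNSPEC:", resolve("google.com"))
    print("AF_INET:  ", resolve("google.com", socket.AF_INET))
```

Whether this avoids the slow failures depends on the resolver and the DNS server involved, so it is worth timing both variants in the affected container.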
