Kind: CoreDNS failure in IPv6 kind cluster

Created on 19 Jul 2020 · 9 comments · Source: kubernetes-sigs/kind

What happened:
Unable to perform DNS lookups in an IPv6 cluster.

__PS__: This might be a limitation of the container runtime for IPv6, so this is more of a question, but I liked kind's bug report template since it lets me provide all the related information.

What you expected to happen:
DNS lookups should work.

How to reproduce it (as minimally and precisely as possible):

  • Create the cluster with ipFamily set to ipv6
  • Check the CoreDNS logs
  • Install the dnsutils pod and run kubectl exec -i -t dnsutils -- nslookup www.google.com
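
For reference, a minimal command sequence matching these steps. The filenames kind-ipv6.yaml and dnsutils.yaml are assumptions for the two manifests shown below:

```shell
# Create the IPv6 cluster from the config shown below (filename is an assumption)
kind create cluster --config kind-ipv6.yaml

# Tail the CoreDNS logs (the pods carry the k8s-app=kube-dns label)
kubectl -n kube-system logs -l k8s-app=kube-dns --timestamps

# Deploy the dnsutils pod and attempt a lookup
kubectl apply -f dnsutils.yaml
kubectl exec -i -t dnsutils -- nslookup www.google.com
```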


Kind configuration

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
networking:
  ipFamily: ipv6
  podSubnet: "fd00:10:244::/64"
  serviceSubnet: "fd00:10:96::/112"
```


DNS util pod

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
    - name: dnsutils
      image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
      command:
        - sleep
        - "3600"
      imagePullPolicy: IfNotPresent
  restartPolicy: Always
```

Anything else we need to know?:

Please find the CoreDNS logs below:

```shell
$ ksyslo coredns-66bff467f8-nnch5 --timestamps
2020-07-19T06:05:57.061277847Z .:53
2020-07-19T06:05:57.061385231Z [INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
2020-07-19T06:05:57.061393166Z CoreDNS-1.6.7
2020-07-19T06:05:57.061398375Z linux/amd64, go1.13.6, da7f65b
2020-07-19T06:05:58.061904725Z [ERROR] plugin/errors: 2 2244698976610727589.491498813804393448. HINFO: dial udp 172.18.0.1:53: connect: network is unreachable --> this entry appears right after cluster creation
2020-07-19T06:07:04.701610849Z [ERROR] plugin/errors: 2 www.google.com.lan. A: dial udp 172.18.0.1:53: connect: network is unreachable
2020-07-19T06:07:37.996168762Z [ERROR] plugin/errors: 2 www.google.com.lan. A: dial udp 172.18.0.1:53: connect: network is unreachable
2020-07-19T06:14:51.500618905Z [ERROR] plugin/errors: 2 www.google.com.lan. A: dial udp 172.18.0.1:53: connect: network is unreachable
```
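
`ksyslo` above appears to be a local shell alias; assuming it wraps kubectl, the plain equivalent would be something like:

```shell
# Plain-kubectl equivalent of the aliased command, using the pod name from the capture above
kubectl -n kube-system logs coredns-66bff467f8-nnch5 --timestamps
```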

**Environment:**

- kind version: (use `kind version`): kind v0.9.0-alpha+edecdfee8878ac go1.15beta1 linux/amd64. The same behavior occurs with the latest release, v0.8.1.
- Kubernetes version: (use `kubectl version`):

```shell
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.5", GitCommit:"e6503f8d8f769ace2f338794c914a96fc335df0f", GitTreeState:"clean", BuildDate:"2020-06-26T03:47:41Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-20T01:49:49Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
```

- Docker version: (use `docker info`):

```shell
$ docker info
Client:
 Debug Mode: false

Server:
 Containers: 5
  Running: 2
  Paused: 0
  Stopped: 3
 Images: 38
 Server Version: 19.03.12
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.4.0-40-generic
 Operating System: Linux Mint 20
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 62.82GiB
 Name: linuxmint
 ID: BXVS:IPS6:3P55:RP3K:TFBR:LZCF:RUCC:52SH:ML2N:7ESH:WPO6:HJNJ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: sayboras
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support
```

- OS (e.g. from `/etc/os-release`):

```shell
$ cat /etc/os-release
NAME="Linux Mint"
VERSION="20 (Ulyana)"
ID=linuxmint
ID_LIKE=ubuntu
PRETTY_NAME="Linux Mint 20"
VERSION_ID="20"
HOME_URL="https://www.linuxmint.com/"
SUPPORT_URL="https://forums.ubuntu.com/"
BUG_REPORT_URL="http://linuxmint-troubleshooting-guide.readthedocs.io/en/latest/"
PRIVACY_POLICY_URL="https://www.linuxmint.com/"
VERSION_CODENAME=ulyana
UBUNTU_CODENAME=focal
```

kind/support


All 9 comments

Is that the full kind config? Because the linked issue suggests that this involves using a nonstandard CNI in kind (calico).

> Is that the full kind config? Because the linked issue suggests that this involves using a nonstandard CNI in kind (calico).

@BenTheElder It's the full configuration; I tried my best to provide a minimal configuration and avoid any additional dependencies. Let me know if you cannot replicate the issue.

The linked issue is mainly for my reference.

> `172.18.0.1:53`

It seems CoreDNS has 172.18.0.1 as its upstream DNS server; since CoreDNS is an IPv6-only pod, it can't reach it and fails.

If we dump the CoreDNS config, we can see that it uses resolv.conf to obtain the upstream DNS servers:

```shell
$ kubectl -n kube-system get configmap coredns -o yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        log
        kubernetes cluster.local lan in-addr.arpa ip6.arpa {
           pods insecure
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
           max_concurrent 1000
        }
        cache 30
        reload
        loadbalance
    }
```
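
Since the Corefile forwards to /etc/resolv.conf, the 172.18.0.1 upstream comes from the resolv.conf that the CoreDNS pods inherit from the node, which kind populates from the Docker network. A quick way to confirm (the nameserver line shown is illustrative):

```shell
# Inspect the resolv.conf that the CoreDNS pods inherit from the node
docker exec kind-control-plane cat /etc/resolv.conf
# illustrative output: nameserver 172.18.0.1
```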

You should replace the forward line by editing the ConfigMap, using an IPv6 DNS server that CoreDNS can reach (2003::1 is an example):

```
forward . [2003::1]:53
```
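
A sketch of applying that change; since the Corefile shown above loads the reload plugin, CoreDNS should pick up the edit without a restart:

```shell
# Edit the CoreDNS ConfigMap in place
kubectl -n kube-system edit configmap coredns
# ...changing the forward block to a reachable IPv6 resolver, e.g.:
#     forward . [2003::1]:53 {
#        max_concurrent 1000
#     }
```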

@aojea thanks for your time discussing this on Slack :tada:

I have continued checking this issue following your suggestion. I performed the steps below and got it working.

  • Get the IPv6 gateway of the kind network:

    ```shell
    $ docker inspect --format='{{range .NetworkSettings.Networks}}{{.IPv6Gateway}}{{end}}' kind-control-plane
    fc00:f853:ccd:e793::1
    ```

  • Update the coredns ConfigMap with `forward . [fc00:f853:ccd:e793::1]:5300`
  • Route IPv6 UDP traffic from port 5300 to port 53 on the **host machine**. I just used _socat_ as it's fairly easy for me:

    ```shell
    $ socat UDP6-RECVFROM:5300,fork UDP4-SENDTO:127.0.0.53:53
    ```
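
Before pointing CoreDNS at it, the relay can be sanity-checked from the host itself, assuming dig is installed there (socat binds all IPv6 addresses, so the loopback works too):

```shell
# Query the socat listener over the host's IPv6 loopback on port 5300
dig @::1 -p 5300 www.google.com
```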

Now DNS lookup is working:

```shell
$ kex dnsutils -- nslookup www.google.com
Server:     fd00:10:96::a
Address:    fd00:10:96::a#53

Non-authoritative answer:
Name:   www.google.com
Address: 216.58.199.68
Name:   www.google.com
Address: 2404:6800:4006:806::2004
```
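
The Server address reported by nslookup should match the ClusterIP of the cluster DNS Service, which can be confirmed with:

```shell
# The resolver address reported by nslookup should be the ClusterIP here
kubectl -n kube-system get svc kube-dns
```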

Not sure if you are planning to make any changes in kind as such; otherwise, feel free to close this issue. Thanks again for your kind help @aojea @BenTheElder :tada:. Feel free to let me know if anything is required.

Just a quick note: outgoing traffic from the pod also fails now (e.g. curl www.google.com). Not sure if it's my ISP (no IPv6 support) or something else I have missed.
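
A quick way to tell whether the host itself has IPv6 Internet connectivity at all (a sketch; curl's -6 flag forces IPv6):

```shell
# Force curl over IPv6 to test the host's IPv6 Internet connectivity
curl -6 -sS -o /dev/null https://www.google.com && echo "IPv6 OK"
```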

> Just a quick note: outgoing traffic from the pod also fails now (e.g. curl www.google.com). Not sure if it's my ISP (no IPv6 support) or something else I have missed.

No ISP with IPv6 support, no fun :)
You can get a free IPv6 tunnel from Hurricane Electric if you want to use the IPv6 Internet; there are plenty of tutorials, and I can confirm it works well. Just a piece of advice: managing dual-stack environments is a nightmare, so start by enabling the tunnel on only a few machines until you are comfortable moving it to the whole network ;)

I think we can close it
Thanks
/close

@aojea: Closing this issue.

In response to this:

> I think we can close it
> Thanks
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aojea thanks for your time and kind help :+1:
