Kind: WSLv1: No DNS between pods or to the internet

Created on 16 Jul 2019 · 29 comments · Source: kubernetes-sigs/kind

What happened:

There is no DNS within the cluster: pods cannot resolve services in the cluster or reach the internet by hostname.

What you expected to happen:

DNS working to the internet and to services within the cluster

How to reproduce it (as minimally and precisely as possible):

I am running kind from WSL (v1), set up to run containers in Docker Desktop installed on Windows 10. Docker Desktop exposes the daemon over TCP, and the DOCKER_HOST env var within WSL is set to tcp://127.0.0.1:2375.
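Roughly, the setup from the WSL side looks like this (the config file name kind-config.yaml is just an example, not my actual file name):

export DOCKER_HOST=tcp://127.0.0.1:2375   # Docker Desktop daemon exposed over TCP to WSL
kind create cluster --config kind-config.yaml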

I used the following configuration:

kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane
- role: worker
  extraPortMappings:
  # Home Assistant
  - containerPort: 31123
    hostPort: 31123
    listenAddress: 127.0.0.1
  # Mosquitto
  - containerPort: 31883
    hostPort: 31883
    listenAddress: 127.0.0.1
  # Traefik
  - containerPort: 30080
    hostPort: 30080
    listenAddress: 127.0.0.1
  # Traefik
  - containerPort: 30443
    hostPort: 30443
    listenAddress: 127.0.0.1

Anything else we need to know?:

I am experiencing a very similar situation on my home arm64 cluster, so this might not be related to kind but rather to Kubernetes itself. I have not gotten to the bottom of that issue yet, but the symptoms (no DNS) are the same.

I updated the CoreDNS configmap as follows to enable query logging, and changed the forward address to 9.9.9.9 to rule out a loopback issue via /etc/resolv.conf (link):

apiVersion: v1
data:
  Corefile: |
    .:53 {
        log
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           upstream
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . 9.9.9.9
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2019-07-16T13:00:43Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "1234"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: 862c0677-19d5-48ee-8647-1b962737e2c9
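For reference, a sketch of how such an edit can be applied and verified, assuming the default kubeadm/kind names; the reload plugin picks the change up after a short delay:

kubectl -n kube-system edit configmap coredns
# check that the reload plugin loaded the new Corefile
kubectl -n kube-system logs -l k8s-app=kube-dns | grep reload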

When I start a simple shell pod, drop into a shell, and run ping google.com:

---
# Drop to shell like kubectl exec -it shell -- /bin/sh
apiVersion: v1
kind: Pod
metadata:
  name: shell
spec:
  containers:
  - image: alpine:3.9
    args:
    - sleep
    - "1000000"
    name: shell
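The steps to reproduce with that manifest are roughly the following (shell.yaml is just an example file name for the pod spec above):

kubectl apply -f shell.yaml
kubectl exec -it shell -- /bin/sh
# inside the pod:
ping google.com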

I see the logs in CoreDNS:

2019-07-16T13:27:19.993Z [INFO] 10.244.1.2:50752 - 43950 "AAAA IN google.com.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,aa,rd 150 0.000251s
2019-07-16T13:27:19.993Z [INFO] 10.244.1.2:50752 - 43450 "A IN google.com.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,aa,rd 150 0.0003521s
2019-07-16T13:27:19.994Z [INFO] 172.17.0.1:38702 - 3964 "AAAA IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.0002506s
2019-07-16T13:27:19.994Z [INFO] 172.17.0.1:38702 - 3364 "A IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.000387s
2019-07-16T13:27:22.494Z [INFO] 172.17.0.1:38702 - 3964 "AAAA IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,rd 139 0.0001031s
2019-07-16T13:27:22.494Z [INFO] 172.17.0.1:38702 - 3364 "A IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,rd 139 0.0002011s

But ping returns "bad address". The same happens for an internal service.

More debugging output:

/ # cat /etc/resolv.conf
search automating.svc.cluster.local svc.cluster.local cluster.local intermax.local
nameserver 10.96.0.10
options ndots:5
/ # ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=36 time=10.310 ms
64 bytes from 8.8.8.8: seq=1 ttl=36 time=10.009 ms

kubectl -n kube-system get svc
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   28m

kubectl -n kube-system get endpoints
NAME                      ENDPOINTS                                               AGE
kube-controller-manager   <none>                                                  29m
kube-dns                  10.244.0.4:53,10.244.1.3:53,10.244.0.4:53 + 3 more...   28m
kube-scheduler            <none>                                                  29m

Environment:

  • kind version: v0.4.0
  • Kubernetes version: v1.15.0
  • Docker version: 18.09.2
  • OS (e.g. from /etc/os-release): Windows WSL (v1)
kind/bug triage/needs-information

All 29 comments

Can you check if you have SERVFAIL answers in the CoreDNS logs?

@aojea Thanks for the fast response <3

No SERVFAIL answers in the log, this is the current full log:

CoreDNS-1.3.1
linux/amd64, go1.11.4, 6b56a9c
2019-07-16T13:10:50.710Z [INFO] plugin/reload: Running configuration MD5 = 4535707ba6147e45b0d2cb9e689e1760
2019-07-16T13:11:07.754Z [INFO] 172.17.0.1:52318 - 6580 "AAAA IN dl-cdn.alpinelinux.org.svc.cluster.local. udp 58 false 512" NXDOMAIN qr,rd 151 0.0002369s
2019-07-16T13:10:50.735Z [INFO] 127.0.0.1:52319 - 53664 "HINFO IN 5478662953990634785.3045020278082460155. udp 57 false 512" NXDOMAIN qr,rd,ra 132 0.0249001s
2019-07-16T13:11:07.755Z [INFO] 172.17.0.1:52318 - 6080 "A IN dl-cdn.alpinelinux.org.svc.cluster.local. udp 58 false 512" NXDOMAIN qr,rd 151 0.0006047s
2019-07-16T13:11:00.244Z [INFO] 10.244.1.2:37159 - 52400 "A IN dl-cdn.alpinelinux.org.automating.svc.cluster.local. udp 69 false 512" NXDOMAIN qr,aa,rd 162 0.000206s
2019-07-16T13:27:19.994Z [INFO] 172.17.0.1:38702 - 3964 "AAAA IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.0002506s
2019-07-16T13:27:19.994Z [INFO] 172.17.0.1:38702 - 3364 "A IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.000387s
2019-07-16T13:11:00.244Z [INFO] 10.244.1.2:37159 - 53000 "AAAA IN dl-cdn.alpinelinux.org.automating.svc.cluster.local. udp 69 false 512" NXDOMAIN qr,aa,rd 162 0.0004419s
2019-07-16T13:27:22.494Z [INFO] 172.17.0.1:38702 - 3964 "AAAA IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,rd 139 0.0001031s
2019-07-16T13:11:05.250Z [INFO] 10.244.1.2:53339 - 25600 "AAAA IN dl-cdn.alpinelinux.org.automating.svc.cluster.local. udp 69 false 512" NXDOMAIN qr,rd 162 0.0002075s
2019-07-16T13:27:22.494Z [INFO] 172.17.0.1:38702 - 3364 "A IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,rd 139 0.0002011s
2019-07-16T13:30:38.074Z [INFO] 172.17.0.1:50198 - 15930 "AAAA IN kubernetes.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,aa,rd 150 0.0001651s
2019-07-16T13:11:05.251Z [INFO] 10.244.1.2:53339 - 25100 "A IN dl-cdn.alpinelinux.org.automating.svc.cluster.local. udp 69 false 512" NXDOMAIN qr,rd 162 0.0003713s
2019-07-16T13:30:38.074Z [INFO] 172.17.0.1:50198 - 15030 "A IN kubernetes.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,aa,rd 150 0.0002506s
2019-07-16T13:27:19.993Z [INFO] 10.244.1.2:50752 - 43950 "AAAA IN google.com.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,aa,rd 150 0.000251s
2019-07-16T13:27:19.993Z [INFO] 10.244.1.2:50752 - 43450 "A IN google.com.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,aa,rd 150 0.0003521s
2019-07-16T13:30:40.576Z [INFO] 172.17.0.1:50198 - 15930 "AAAA IN kubernetes.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,rd 150 0.000052s
2019-07-16T13:30:40.576Z [INFO] 172.17.0.1:50198 - 15030 "A IN kubernetes.automating.svc.cluster.local. udp 57 false 512" NXDOMAIN qr,rd 150 0.0000963s

shouldn't it be trying to append all the domains?

search automating.svc.cluster.local svc.cluster.local cluster.local intermax.local

Hmm, I don't know, but it also does not resolve external names, where that shouldn't be an issue.

Edit: I will run the same test on a prod cluster to see CoreDNS output :)

Ran an apk update and ping kubernetes in the same shell pod on a prod cluster:

2019-07-16T14:50:12.113Z [INFO] 10.233.70.174:56125 - 17900 "AAAA IN dl-cdn.alpinelinux.org.default.svc.autotest.local. udp 67 false 512" NXDOMAIN qr,aa,rd,ra 163 0.000251416s
2019-07-16T14:50:12.113Z [INFO] 10.233.70.174:56125 - 17616 "A IN dl-cdn.alpinelinux.org.default.svc.autotest.local. udp 67 false 512" NXDOMAIN qr,aa,rd,ra 163 0.000341572s
2019-07-16T14:50:12.115Z [INFO] 10.233.70.174:50949 - 25837 "AAAA IN dl-cdn.alpinelinux.org.autotest.local. udp 55 false 512" NXDOMAIN qr,aa,rd,ra 151 0.000083053s
2019-07-16T14:50:12.115Z [INFO] 10.233.70.174:50949 - 25537 "A IN dl-cdn.alpinelinux.org.autotest.local. udp 55 false 512" NXDOMAIN qr,aa,rd,ra 151 0.000070021s
2019-07-16T14:50:12.114Z [INFO] 10.233.70.174:46324 - 46638 "AAAA IN dl-cdn.alpinelinux.org.svc.autotest.local. udp 59 false 512" NXDOMAIN qr,aa,rd,ra 155 0.000190704s
2019-07-16T14:50:12.115Z [INFO] 10.233.70.174:46324 - 46348 "A IN dl-cdn.alpinelinux.org.svc.autotest.local. udp 59 false 512" NXDOMAIN qr,aa,rd,ra 155 0.000390738s
2019-07-16T14:50:12.136Z [INFO] 10.233.70.174:53617 - 45145 "AAAA IN dl-cdn.alpinelinux.org. udp 40 false 512" NOERROR qr,rd,ra 440 0.020942952s
2019-07-16T14:50:57.3Z [INFO] 10.233.70.174:47037 - 26066 "AAAA IN kubernetes.default.svc.autotest.local. udp 55 false 512" NOERROR qr,aa,rd,ra 151 0.000140426s
2019-07-16T14:50:57.3Z [INFO] 10.233.70.174:47037 - 25827 "A IN kubernetes.default.svc.autotest.local. udp 55 false 512" NOERROR qr,aa,rd,ra 108 0.00007576s

So the output above seems normal; the issue might be network related? But I have no clue how to debug this.
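(One way to narrow this down, which is effectively what is done further below: query each CoreDNS pod IP directly instead of the 10.96.0.10 service VIP, so a failing replica can be isolated. A sketch, with <coredns-pod-ip> as a placeholder:)

kubectl -n kube-system get endpoints kube-dns        # lists the individual CoreDNS pod IPs
# then, from inside the shell pod:
nslookup google.com <coredns-pod-ip>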

Hmm, when I set the replicas to one in the CoreDNS deployment the issue is partially gone. I can now ping the internet:

/ # ping google.nl
PING google.nl (216.58.208.99): 56 data bytes
64 bytes from 216.58.208.99: seq=0 ttl=36 time=20.553 ms
64 bytes from 216.58.208.99: seq=1 ttl=36 time=13.854 ms
64 bytes from 216.58.208.99: seq=2 ttl=36 time=35.334 ms
^C
--- google.nl ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 13.854/23.247/35.334 ms
/ # ping kubernetes
PING kubernetes (10.96.0.1): 56 data bytes
^C
--- kubernetes ping statistics ---
7 packets transmitted, 0 packets received, 100% packet loss

2019-07-16T15:14:32.882Z [INFO] 10.244.1.5:34450 - 50160 "AAAA IN google.nl.default.svc.cluster.local. udp 53 false 512" NXDOMAIN qr,aa,rd 146 0.0002705s
2019-07-16T15:14:32.883Z [INFO] 10.244.1.5:34450 - 49660 "A IN google.nl.default.svc.cluster.local. udp 53 false 512" NXDOMAIN qr,aa,rd 146 0.0003856s
2019-07-16T15:14:32.883Z [INFO] 10.244.1.5:55131 - 1104 "A IN google.nl.svc.cluster.local. udp 45 false 512" NXDOMAIN qr,aa,rd 138 0.000258s
2019-07-16T15:14:32.883Z [INFO] 10.244.1.5:55131 - 1604 "AAAA IN google.nl.svc.cluster.local. udp 45 false 512" NXDOMAIN qr,aa,rd 138 0.0004112s
2019-07-16T15:14:32.884Z [INFO] 10.244.1.5:54365 - 21620 "AAAA IN google.nl.cluster.local. udp 41 false 512" NXDOMAIN qr,aa,rd 134 0.0001243s
2019-07-16T15:14:32.884Z [INFO] 10.244.1.5:54365 - 21020 "A IN google.nl.cluster.local. udp 41 false 512" NXDOMAIN qr,aa,rd 134 0.000166s
2019-07-16T15:14:32.914Z [INFO] 10.244.1.5:48644 - 18440 "A IN google.nl.intermax.local. udp 42 false 512" NXDOMAIN qr,rd,ra 117 0.0298592s
2019-07-16T15:14:32.914Z [INFO] 10.244.1.5:48644 - 19040 "AAAA IN google.nl.intermax.local. udp 42 false 512" NXDOMAIN qr,rd,ra 117 0.0299696s
2019-07-16T15:14:32.950Z [INFO] 10.244.1.5:46038 - 55740 "A IN google.nl. udp 27 false 512" NOERROR qr,rd,ra 52 0.0354438s
2019-07-16T15:14:32.990Z [INFO] 10.244.1.5:46038 - 56140 "AAAA IN google.nl. udp 27 false 512" NOERROR qr,rd,ra 64 0.0759243s
2019-07-16T15:14:46.290Z [INFO] 10.244.1.5:42557 - 12225 "AAAA IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,rd 147 0.0002117s
2019-07-16T15:14:46.290Z [INFO] 10.244.1.5:42557 - 11625 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,rd 106 0.0003265s

Is that debugging /etc/resolv.conf from within a kind node or?

Kind nodes will pick up the nameserver from the host, and then the pods will get that + what Kubernetes sets.
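(A quick way to see both sides of that, assuming the default kind node name and the shell pod from above; docker exec targets the node container, kubectl exec targets the pod:)

docker exec kind-control-plane cat /etc/resolv.conf   # what the node inherited from the host
kubectl exec -it shell -- cat /etc/resolv.conf        # what Kubernetes set for the pod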

This is a strange failure, I'm not sure if this is WSL2 related or not.

Oh I see, this is WSL v1; that is completely untested (until you, now, as far as I know). WSL2 has been tested, and so has kind on Windows with Docker for Windows.

Is that debugging /etc/resolv.conf from within a kind node or?

No, it is the /etc/resolv.conf within the Alpine pod. The /etc/resolv.conf of the kind nodes looks like this:

nameserver 192.168.65.1
search intermax.local
domain intermax.local

And I can resolve the internet fine from within the kind nodes.

I upped the CoreDNS replicas to two again and DNS stops working. When downscaling to one I have fully functional DNS to the internet and within the cluster:

/ # nslookup google.nl
Server:         10.96.0.10
Address:        10.96.0.10#53

Non-authoritative answer:
Name:   google.nl
Address: 172.217.17.67
Name:   google.nl
Address: 2a00:1450:400e:808::2003

/ # nslookup kubernetes
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1

/ # nslookup mosquitto.automating
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   mosquitto.automating.svc.cluster.local
Address: 10.98.54.254

Scaling it up to two again, it stops working:

/ # nslookup mosquitto.automating
^C
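(For reference, the replica count was toggled with something like the following; coredns is the default deployment name from kubeadm:)

kubectl -n kube-system scale deployment coredns --replicas=1   # DNS works
kubectl -n kube-system scale deployment coredns --replicas=2   # DNS breaks again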

Will open a bug over at Kubernetes to triage this further since it doesn't seem to be kind-specific. On the other hand it could be network related, but we will see what the Kubernetes guys have to say :)
Thanks for the fast response @BenTheElder 👍

I will leave this open until a conclusion is reached, will keep you guys updated!

馃 I can麓t reproduce the issue with your config

possibly interesting comment for anyone else following along: https://github.com/kubernetes/kubernetes/issues/80243#issuecomment-512262392

Followed up over at kubernetes/kubernetes#80243 and the PR with the fix #739

@wilmardo can you post your iptables rules from the nodes (iptables-save) and your pods (kubectl get pods --all-namespaces -o wide) when it is failing?

@aojea Of course, just did a fresh deployment.

# kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                         READY   STATUS    RESTARTS   AGE     IP           NODE                 NOMINATED NODE   READINESS GATES
default       shell                                        1/1     Running   0          19s     10.244.1.2   kind-worker          <none>           <none>
kube-system   coredns-5c98db65d4-6jbkm                     1/1     Running   0          7m37s   10.244.0.3   kind-control-plane   <none>           <none>
kube-system   coredns-5c98db65d4-cx427                     1/1     Running   0          7m37s   10.244.0.2   kind-control-plane   <none>           <none>
kube-system   etcd-kind-control-plane                      1/1     Running   0          6m43s   172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kindnet-lcr6d                                1/1     Running   0          7m22s   172.17.0.3   kind-worker          <none>           <none>
kube-system   kindnet-nr6jt                                1/1     Running   0          7m37s   172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kube-apiserver-kind-control-plane            1/1     Running   0          6m35s   172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kube-controller-manager-kind-control-plane   1/1     Running   0          6m33s   172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kube-proxy-kc6n8                             1/1     Running   0          7m37s   172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kube-proxy-xwv5r                             1/1     Running   0          7m22s   172.17.0.3   kind-worker          <none>           <none>
kube-system   kube-scheduler-kind-control-plane            1/1     Running   0          6m40s   172.17.0.2   kind-control-plane   <none>           <none>

Control-plane

# iptables-save
# Generated by iptables-save v1.6.1 on Tue Jul 30 08:12:28 2019
*filter
:INPUT ACCEPT [899:143525]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [898:164852]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -s 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -d 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Tue Jul 30 08:12:28 2019
# Generated by iptables-save v1.6.1 on Tue Jul 30 08:12:28 2019
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [7:420]
:POSTROUTING ACCEPT [7:420]
:KIND-MASQ-AGENT - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-5ZUVGKEDQRTZFI3V - [0:0]
:KUBE-SEP-6E7XQMQ4RAYOWTTM - [0:0]
:KUBE-SEP-IT2ZTR26TO4XFPTO - [0:0]
:KUBE-SEP-N4G2XR5TDX7PQE7P - [0:0]
:KUBE-SEP-YIL6JZP7A3QYXJU2 - [0:0]
:KUBE-SEP-ZP3FB6NMPNCO4VBJ - [0:0]
:KUBE-SEP-ZXMNUKOKXUTL2MK2 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-JD5MR3NA4I4DYORP - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "kind-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom KIND-MASQ-AGENT chain" -m addrtype ! --dst-type LOCAL -j KIND-MASQ-AGENT
-A KIND-MASQ-AGENT -d 10.244.0.0/16 -m comment --comment "kind-masq-agent: local traffic is not subject to MASQUERADE" -j RETURN
-A KIND-MASQ-AGENT -m comment --comment "ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain)" -j MASQUERADE
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -s 172.17.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -p tcp -m tcp -j DNAT --to-destination 172.17.0.2:6443
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -s 10.244.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -p udp -m udp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SEP-IT2ZTR26TO4XFPTO -s 10.244.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-IT2ZTR26TO4XFPTO -p tcp -m tcp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SEP-N4G2XR5TDX7PQE7P -s 10.244.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-N4G2XR5TDX7PQE7P -p tcp -m tcp -j DNAT --to-destination 10.244.0.2:9153
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -s 10.244.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -p udp -m udp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -s 10.244.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -p tcp -m tcp -j DNAT --to-destination 10.244.0.3:9153
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -s 10.244.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -p tcp -m tcp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-IT2ZTR26TO4XFPTO
-A KUBE-SVC-ERIFXISQEP7F7OF4 -j KUBE-SEP-ZXMNUKOKXUTL2MK2
-A KUBE-SVC-JD5MR3NA4I4DYORP -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-N4G2XR5TDX7PQE7P
-A KUBE-SVC-JD5MR3NA4I4DYORP -j KUBE-SEP-ZP3FB6NMPNCO4VBJ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -j KUBE-SEP-5ZUVGKEDQRTZFI3V
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-YIL6JZP7A3QYXJU2
-A KUBE-SVC-TCOU7JCQXEZGVUNU -j KUBE-SEP-6E7XQMQ4RAYOWTTM
COMMIT
# Completed on Tue Jul 30 08:12:28 2019

Kind-worker

# iptables-save
# Generated by iptables-save v1.6.1 on Tue Jul 30 08:16:00 2019
*filter
:INPUT ACCEPT [27:15269]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [24:1991]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -s 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -d 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Tue Jul 30 08:16:00 2019
# Generated by iptables-save v1.6.1 on Tue Jul 30 08:16:00 2019
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:KIND-MASQ-AGENT - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-5ZUVGKEDQRTZFI3V - [0:0]
:KUBE-SEP-6E7XQMQ4RAYOWTTM - [0:0]
:KUBE-SEP-IT2ZTR26TO4XFPTO - [0:0]
:KUBE-SEP-N4G2XR5TDX7PQE7P - [0:0]
:KUBE-SEP-YIL6JZP7A3QYXJU2 - [0:0]
:KUBE-SEP-ZP3FB6NMPNCO4VBJ - [0:0]
:KUBE-SEP-ZXMNUKOKXUTL2MK2 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-JD5MR3NA4I4DYORP - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "kind-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom KIND-MASQ-AGENT chain" -m addrtype ! --dst-type LOCAL -j KIND-MASQ-AGENT
-A KIND-MASQ-AGENT -d 10.244.0.0/16 -m comment --comment "kind-masq-agent: local traffic is not subject to MASQUERADE" -j RETURN
-A KIND-MASQ-AGENT -m comment --comment "ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain)" -j MASQUERADE
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -s 172.17.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -p tcp -m tcp -j DNAT --to-destination 172.17.0.2:6443
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -s 10.244.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -p udp -m udp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SEP-IT2ZTR26TO4XFPTO -s 10.244.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-IT2ZTR26TO4XFPTO -p tcp -m tcp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SEP-N4G2XR5TDX7PQE7P -s 10.244.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-N4G2XR5TDX7PQE7P -p tcp -m tcp -j DNAT --to-destination 10.244.0.2:9153
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -s 10.244.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -p udp -m udp -j DNAT --to-destination 10.244.0.2:53
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -s 10.244.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -p tcp -m tcp -j DNAT --to-destination 10.244.0.3:9153
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -s 10.244.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -p tcp -m tcp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-IT2ZTR26TO4XFPTO
-A KUBE-SVC-ERIFXISQEP7F7OF4 -j KUBE-SEP-ZXMNUKOKXUTL2MK2
-A KUBE-SVC-JD5MR3NA4I4DYORP -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-N4G2XR5TDX7PQE7P
-A KUBE-SVC-JD5MR3NA4I4DYORP -j KUBE-SEP-ZP3FB6NMPNCO4VBJ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -j KUBE-SEP-5ZUVGKEDQRTZFI3V
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-YIL6JZP7A3QYXJU2
-A KUBE-SVC-TCOU7JCQXEZGVUNU -j KUBE-SEP-6E7XQMQ4RAYOWTTM
COMMIT
# Completed on Tue Jul 30 08:16:00 2019

@wilmardo is it still not working?
I can see that both coredns pods are on the same node

kube-system coredns-5c98db65d4-6jbkm 1/1 Running 0 7m37s 10.244.0.3 kind-control-plane
kube-system coredns-5c98db65d4-cx427 1/1 Running 0 7m37s 10.244.0.2 kind-control-plane

and the iptables rules DNAT the traffic to their internal IP addresses with probability 0.5

-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
...
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-YIL6JZP7A3QYXJU2
-A KUBE-SVC-TCOU7JCQXEZGVUNU -j KUBE-SEP-6E7XQMQ4RAYOWTTM
...
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -s 10.244.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-YIL6JZP7A3QYXJU2 -p udp -m udp -j DNAT --to-destination 10.244.0.2:53
...
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -s 10.244.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -p udp -m udp -j DNAT --to-destination 10.244.0.3:53

maybe I'm missing something but it should work :man_shrugging:

I can see that both coredns pods are on the same node

Yes, this happens most of the time with a kubeadm deployment, since CoreDNS is started with its replicas before any other node has joined. They start on the same node because that is the only option, and they are already scheduled by the time the next node joins, so they stay on one node until something triggers a reschedule.
I did that now by running kubectl delete -n kube-system pods --selector k8s-app=kube-dns

Now that is fixed I can test the traffic between nodes.

all pods:

NAMESPACE     NAME                                         READY   STATUS    RESTARTS   AGE     IP           NODE                 NOMINATED NODE   READINESS GATES
default       shell                                        1/1     Running   0          86s     10.244.1.4   kind-worker          <none>           <none>
kube-system   coredns-5c98db65d4-mg6dt                     1/1     Running   0          2m2s    10.244.0.5   kind-control-plane   <none>           <none>
kube-system   coredns-5c98db65d4-nd5rp                     1/1     Running   0          2m2s    10.244.1.3   kind-worker          <none>           <none>
kube-system   etcd-kind-control-plane                      1/1     Running   0          110s    172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kindnet-4nwxc                                1/1     Running   0          2m50s   172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kindnet-gz2qv                                1/1     Running   0          2m34s   172.17.0.3   kind-worker          <none>           <none>
kube-system   kube-apiserver-kind-control-plane            1/1     Running   0          2m7s    172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kube-controller-manager-kind-control-plane   1/1     Running   0          2m1s    172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kube-proxy-r7clv                             1/1     Running   0          2m50s   172.17.0.2   kind-control-plane   <none>           <none>
kube-system   kube-proxy-z2jhb                             1/1     Running   0          2m34s   172.17.0.3   kind-worker          <none>           <none>
kube-system   kube-scheduler-kind-control-plane            1/1     Running   0          112s    172.17.0.2   kind-control-plane   <none>           <none>

Run within the shell pod:

/ # nslookup google.nl
;; connection timed out; no servers could be reached

/ # nslookup google.nl 10.244.0.5
;; connection timed out; no servers could be reached

/ # nslookup google.nl 10.244.1.3
Server:         10.244.1.3
Address:        10.244.1.3#53

Non-authoritative answer:
Name:   google.nl
Address: 172.217.17.131
Name:   google.nl
Address: 2a00:1450:400e:807::2003

So as soon as I try a DNS lookup against the CoreDNS pod that is not on the same node as the shell pod, it fails. The requests are received by CoreDNS (forgot to save the logs), but nslookup returns a timeout. It seems the response is never received.
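(A sketch of how the missing replies could be confirmed from the host, assuming tcpdump is available or can be installed inside the kind node containers:)

# capture DNS traffic on the node hosting the CoreDNS pod that times out
docker exec kind-control-plane tcpdump -ni any udp port 53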

Just to be sure, my iptables output once more (I needed to recreate the cluster).

Control-plane

# iptables-save
# Generated by iptables-save v1.6.1 on Tue Jul 30 11:56:45 2019
*filter
:INPUT ACCEPT [3228:536047]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [3229:598814]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -s 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -d 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Tue Jul 30 11:56:45 2019
# Generated by iptables-save v1.6.1 on Tue Jul 30 11:56:45 2019
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [17:1020]
:POSTROUTING ACCEPT [17:1020]
:KIND-MASQ-AGENT - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-5ZUVGKEDQRTZFI3V - [0:0]
:KUBE-SEP-EJJ3L23ZA35VLW6X - [0:0]
:KUBE-SEP-FVQSBIWR5JTECIVC - [0:0]
:KUBE-SEP-LASJGFFJP3UOS6RQ - [0:0]
:KUBE-SEP-LPGSDLJ3FDW46N4W - [0:0]
:KUBE-SEP-P6ZV3VC6PB5OMAHT - [0:0]
:KUBE-SEP-R75T7LXI5PWKQPQA - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-JD5MR3NA4I4DYORP - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "kind-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom KIND-MASQ-AGENT chain" -m addrtype ! --dst-type LOCAL -j KIND-MASQ-AGENT
-A KIND-MASQ-AGENT -d 10.244.0.0/16 -m comment --comment "kind-masq-agent: local traffic is not subject to MASQUERADE" -j RETURN
-A KIND-MASQ-AGENT -m comment --comment "ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain)" -j MASQUERADE
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -s 172.17.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -p tcp -m tcp -j DNAT --to-destination 172.17.0.2:6443
-A KUBE-SEP-EJJ3L23ZA35VLW6X -s 10.244.1.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-EJJ3L23ZA35VLW6X -p udp -m udp -j DNAT --to-destination 10.244.1.3:53
-A KUBE-SEP-FVQSBIWR5JTECIVC -s 10.244.0.5/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-FVQSBIWR5JTECIVC -p tcp -m tcp -j DNAT --to-destination 10.244.0.5:9153
-A KUBE-SEP-LASJGFFJP3UOS6RQ -s 10.244.0.5/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-LASJGFFJP3UOS6RQ -p tcp -m tcp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-LPGSDLJ3FDW46N4W -s 10.244.0.5/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-LPGSDLJ3FDW46N4W -p udp -m udp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-P6ZV3VC6PB5OMAHT -s 10.244.1.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-P6ZV3VC6PB5OMAHT -p tcp -m tcp -j DNAT --to-destination 10.244.1.3:9153
-A KUBE-SEP-R75T7LXI5PWKQPQA -s 10.244.1.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-R75T7LXI5PWKQPQA -p tcp -m tcp -j DNAT --to-destination 10.244.1.3:53
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LASJGFFJP3UOS6RQ
-A KUBE-SVC-ERIFXISQEP7F7OF4 -j KUBE-SEP-R75T7LXI5PWKQPQA
-A KUBE-SVC-JD5MR3NA4I4DYORP -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-FVQSBIWR5JTECIVC
-A KUBE-SVC-JD5MR3NA4I4DYORP -j KUBE-SEP-P6ZV3VC6PB5OMAHT
-A KUBE-SVC-NPX46M4PTMTKRN6Y -j KUBE-SEP-5ZUVGKEDQRTZFI3V
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LPGSDLJ3FDW46N4W
-A KUBE-SVC-TCOU7JCQXEZGVUNU -j KUBE-SEP-EJJ3L23ZA35VLW6X
COMMIT
# Completed on Tue Jul 30 11:56:45 2019

Worker

# iptables-save
# Generated by iptables-save v1.6.1 on Tue Jul 30 11:59:05 2019
*filter
:INPUT ACCEPT [68:25196]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [66:5294]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -s 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -d 10.244.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Tue Jul 30 11:59:05 2019
# Generated by iptables-save v1.6.1 on Tue Jul 30 11:59:05 2019
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [4:240]
:POSTROUTING ACCEPT [4:240]
:KIND-MASQ-AGENT - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-5ZUVGKEDQRTZFI3V - [0:0]
:KUBE-SEP-EJJ3L23ZA35VLW6X - [0:0]
:KUBE-SEP-FVQSBIWR5JTECIVC - [0:0]
:KUBE-SEP-LASJGFFJP3UOS6RQ - [0:0]
:KUBE-SEP-LPGSDLJ3FDW46N4W - [0:0]
:KUBE-SEP-P6ZV3VC6PB5OMAHT - [0:0]
:KUBE-SEP-R75T7LXI5PWKQPQA - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-JD5MR3NA4I4DYORP - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -m comment --comment "kind-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom KIND-MASQ-AGENT chain" -m addrtype ! --dst-type LOCAL -j KIND-MASQ-AGENT
-A KIND-MASQ-AGENT -d 10.244.0.0/16 -m comment --comment "kind-masq-agent: local traffic is not subject to MASQUERADE" -j RETURN
-A KIND-MASQ-AGENT -m comment --comment "ip-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain)" -j MASQUERADE
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -s 172.17.0.2/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-5ZUVGKEDQRTZFI3V -p tcp -m tcp -j DNAT --to-destination 172.17.0.2:6443
-A KUBE-SEP-EJJ3L23ZA35VLW6X -s 10.244.1.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-EJJ3L23ZA35VLW6X -p udp -m udp -j DNAT --to-destination 10.244.1.3:53
-A KUBE-SEP-FVQSBIWR5JTECIVC -s 10.244.0.5/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-FVQSBIWR5JTECIVC -p tcp -m tcp -j DNAT --to-destination 10.244.0.5:9153
-A KUBE-SEP-LASJGFFJP3UOS6RQ -s 10.244.0.5/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-LASJGFFJP3UOS6RQ -p tcp -m tcp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-LPGSDLJ3FDW46N4W -s 10.244.0.5/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-LPGSDLJ3FDW46N4W -p udp -m udp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-P6ZV3VC6PB5OMAHT -s 10.244.1.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-P6ZV3VC6PB5OMAHT -p tcp -m tcp -j DNAT --to-destination 10.244.1.3:9153
-A KUBE-SEP-R75T7LXI5PWKQPQA -s 10.244.1.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-R75T7LXI5PWKQPQA -p tcp -m tcp -j DNAT --to-destination 10.244.1.3:53
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LASJGFFJP3UOS6RQ
-A KUBE-SVC-ERIFXISQEP7F7OF4 -j KUBE-SEP-R75T7LXI5PWKQPQA
-A KUBE-SVC-JD5MR3NA4I4DYORP -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-FVQSBIWR5JTECIVC
-A KUBE-SVC-JD5MR3NA4I4DYORP -j KUBE-SEP-P6ZV3VC6PB5OMAHT
-A KUBE-SVC-NPX46M4PTMTKRN6Y -j KUBE-SEP-5ZUVGKEDQRTZFI3V
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-LPGSDLJ3FDW46N4W
-A KUBE-SVC-TCOU7JCQXEZGVUNU -j KUBE-SEP-EJJ3L23ZA35VLW6X
COMMIT
# Completed on Tue Jul 30 11:59:05 2019

I still can't reproduce it, I have followed exactly the same steps, can it be because I'm using Linux?

kubectl get -n kube-system pods --selector k8s-app=kube-dns -o wide
NAME                       READY   STATUS    RESTARTS   AGE     IP           NODE                 NOMINATED NODE   READINESS GATES
coredns-5c98db65d4-qdmhs   1/1     Running   0          4m47s   10.244.1.3   kind-worker          <none>           <none>
coredns-5c98db65d4-x52mq   1/1     Running   0          4m46s   10.244.0.4   kind-control-plane   <none>           <none>
/ # nslookup google.nl
nslookup: can't resolve '(null)': Name does not resolve

Name:      google.nl
Address 1: 172.217.17.3 mad07s09-in-f3.1e100.net
Address 2: 2a00:1450:4003:809::2003 mad08s05-in-x03.1e100.net
/ # nslookup google.nl 10.244.1.3
Server:    10.244.1.3
Address 1: 10.244.1.3 10-244-1-3.kube-dns.kube-system.svc.cluster.local

Name:      google.nl
Address 1: 172.217.17.3 mad07s09-in-f3.1e100.net
Address 2: 2a00:1450:4003:809::2003 mad08s05-in-x03.1e100.net
/ # nslookup google.nl 10.244.0.2
Server:    10.244.0.2
Address 1: 10.244.0.2

Name:      google.nl
Address 1: 172.217.17.3 mad07s09-in-f3.1e100.net
Address 2: 2a00:1450:4003:809::2003 mad08s05-in-x03.1e100.net

I still can't reproduce it, I have followed exactly the same steps, can it be because I'm using Linux?

Yes, could be; I suspect some incompatibility between WSL v1 and Docker for Windows. Will try to get a colleague to reproduce this :)

I'm on Linux and currently having this problem. No DNS resolution, but connecting via IP address works fine.

I'm getting really sporadic results from nslookup:

/ # nslookup mysql-server
Server:         10.96.0.10
Address:        10.96.0.10:53

Name:   mysql-server.default.svc.cluster.local
Address: 10.98.244.179

*** Can't find mysql-server.svc.cluster.local: No answer
*** Can't find mysql-server.cluster.local: No answer
*** Can't find mysql-server.default.svc.cluster.local: No answer
*** Can't find mysql-server.svc.cluster.local: No answer
*** Can't find mysql-server.cluster.local: No answer

/ # nslookup mysql-server
Server:         10.96.0.10
Address:        10.96.0.10:53

Name:   mysql-server.default.svc.cluster.local
Address: 10.98.244.179

*** Can't find mysql-server.svc.cluster.local: No answer
*** Can't find mysql-server.cluster.local: No answer
*** Can't find mysql-server.default.svc.cluster.local: No answer
*** Can't find mysql-server.svc.cluster.local: No answer
*** Can't find mysql-server.cluster.local: No answer

/ # nslookup mysql-server
Server:         10.96.0.10
Address:        10.96.0.10:53

Name:   mysql-server.default.svc.cluster.local
Address: 10.98.244.179

*** Can't find mysql-server.svc.cluster.local: No answer
*** Can't find mysql-server.cluster.local: No answer
*** Can't find mysql-server.default.svc.cluster.local: No answer
*** Can't find mysql-server.svc.cluster.local: No answer
*** Can't find mysql-server.cluster.local: No answer


Meanwhile, here are the logs from CoreDNS...

(cmd)-> k logs -n kube-system -l k8s-app=kube-dns
.:53
2019-08-08T20:29:45.442Z [INFO] CoreDNS-1.2.6
2019-08-08T20:29:45.442Z [INFO] linux/amd64, go1.11.2, 756749c
CoreDNS-1.2.6
linux/amd64, go1.11.2, 756749c
[INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
.:53
2019-08-08T20:29:45.421Z [INFO] CoreDNS-1.2.6
2019-08-08T20:29:45.421Z [INFO] linux/amd64, go1.11.2, 756749c
CoreDNS-1.2.6
linux/amd64, go1.11.2, 756749c
[INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769

Which confuses me, because that seems an awful lot like they're not accepting requests.

@wreed4 Do you have the log plugin enabled in the CoreDNS configmap?
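(The log plugin is what produces the per-query lines shown earlier in this issue; a quick way to check whether it is enabled, assuming the default configmap name:)

kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}'
# the output should contain a "log" line inside the ".:53 { ... }" server block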

/close
this has no activity, we couldn't reproduce it, and e2e DNS tests are running periodically without any issue
Feel free to reopen

@aojea: Closing this issue.

In response to this:

/close
this has no activity, we couldn't reproduce it, and e2e DNS tests are running periodically without any issue
Feel free to reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

I encountered exactly the same issue in my kind env on my MacBook.
As I've had a similar experience from here, simply killing the CoreDNS pods helped, again!

I believe it's a bug somewhere

It may be, but the problem is that there are many moving parts and differences between the environments reported.
kind runs in the Kubernetes CI a considerable number of times per day with a very low failure rate, which makes me think the failure has to be environmental and not in the kind default setup.

I guess I have the same problem on Ubuntu Server arm64 https://github.com/Trackhe/Raspberry64bitKubernetesServerDualstack

Sometimes it works, and the next second it fails on a pod.

master node /etc/resolv.conf

nameserver 127.0.0.53
options edns0
search fritz.box

For debugging I use only one DNS pod on the master node: kubectl logs --namespace=kube-system -l k8s-app=kube-dns

[INFO] 200.200.208.17:33318 - 408 "AAAA IN google.com.fritz.box. udp 38 false 512" NXDOMAIN qr,aa,rd,ra 107 0.000339958s
[INFO] 200.200.208.17:33318 - 65259 "A IN google.com.fritz.box. udp 38 false 512" NOERROR qr,aa,rd,ra 38 0.000587436s
[INFO] 200.200.208.17:60191 - 7669 "AAAA IN google.com.default.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000403772s
[INFO] 200.200.208.17:60191 - 6521 "A IN google.com.default.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000721008s
[INFO] 200.200.208.17:56756 - 57719 "A IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.000352365s
[INFO] 200.200.208.17:56756 - 58756 "AAAA IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.000859247s
[INFO] 200.200.208.17:59421 - 26342 "AAAA IN google.com.cluster.local. udp 42 false 512" NXDOMAIN qr,aa,rd 135 0.000350032s
[INFO] 200.200.208.17:59421 - 25527 "A IN google.com.cluster.local. udp 42 false 512" NXDOMAIN qr,aa,rd 135 0.000672139s
[INFO] 200.200.208.17:36074 - 10831 "AAAA IN google.com.fritz.box. udp 38 false 512" NXDOMAIN qr,aa,rd,ra 107 0.000352976s
[INFO] 200.200.208.17:36074 - 9942 "A IN google.com.fritz.box. udp 38 false 512" NOERROR qr,aa,rd,ra 38 0.000646046s

To be clear: the link above does NOT appear to involve using KIND.
There are lots of ways to wind up with broken cluster networking unrelated to KIND 😅

It's the way to reproduce the problem in my case. I can remove it if you want. I thought it could help.

I really don't understand why it works for a brief moment and then fails the next second...

[INFO] 200.200.208.17:38092 - 5890 "AAAA IN google.de.default.svc.cluster.local. udp 53 false 512" NXDOMAIN qr,aa,rd 146 0.000803937s
[INFO] 200.200.208.17:38092 - 4722 "A IN google.de.default.svc.cluster.local. udp 53 false 512" NXDOMAIN qr,aa,rd 146 0.001110953s
[INFO] 200.200.208.17:45374 - 4306 "A IN google.de.svc.cluster.local. udp 45 false 512" NXDOMAIN qr,aa,rd 138 0.000632586s
[INFO] 200.200.208.17:45374 - 5121 "AAAA IN google.de.svc.cluster.local. udp 45 false 512" NXDOMAIN qr,aa,rd 138 0.000953695s
[INFO] 200.200.208.17:60195 - 55455 "AAAA IN google.de.cluster.local. udp 41 false 512" NXDOMAIN qr,aa,rd 134 0.000715363s
[INFO] 200.200.208.17:60195 - 54715 "A IN google.de.cluster.local. udp 41 false 512" NXDOMAIN qr,aa,rd 134 0.001026694s
[INFO] 200.200.208.17:44190 - 20295 "A IN google.de.fritz.box. udp 37 false 512" NXDOMAIN qr,aa,rd,ra 106 0.004858937s
[INFO] 200.200.208.17:44190 - 20832 "AAAA IN google.de.fritz.box. udp 37 false 512" NOERROR qr,aa,rd,ra 37 0.006597845s
[INFO] 200.200.208.17:50558 - 64294 "AAAA IN google.de. udp 27 false 512" NOERROR qr,rd,ra 64 0.016610308s
[INFO] 200.200.208.17:50558 - 63572 "A IN google.de. udp 27 false 512" NOERROR qr,rd,ra 52 0.017251375s
[INFO] 200.200.208.17:33200 - 50524 "PTR IN 195.16.217.172.in-addr.arpa. udp 45 false 512" NOERROR qr,rd,ra 177 0.00622646s
[INFO] 200.200.208.17:40313 - 20979 "AAAA IN google.com.default.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.000727567s
[INFO] 200.200.208.17:40313 - 20182 "A IN google.com.default.svc.cluster.local. udp 54 false 512" NXDOMAIN qr,aa,rd 147 0.001072305s
[INFO] 200.200.208.17:49206 - 40776 "AAAA IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.000631531s
[INFO] 200.200.208.17:49206 - 40313 "A IN google.com.svc.cluster.local. udp 46 false 512" NXDOMAIN qr,aa,rd 139 0.000975269s
[INFO] 200.200.208.17:39480 - 21779 "AAAA IN google.com.cluster.local. udp 42 false 512" NXDOMAIN qr,aa,rd 135 0.000554291s
[INFO] 200.200.208.17:39480 - 21353 "A IN google.com.cluster.local. udp 42 false 512" NXDOMAIN qr,aa,rd 135 0.000822177s
[INFO] 200.200.208.17:40032 - 46613 "AAAA IN google.com.fritz.box. udp 38 false 512" NXDOMAIN qr,aa,rd,ra 107 0.004968694s
[INFO] 200.200.208.17:40032 - 46058 "A IN google.com.fritz.box. udp 38 false 512" NOERROR qr,aa,rd,ra 38 0.006061092s