BUG REPORT:
Environment:
Minikube version: v0.22.2
What happened:
DNS resolution from pods does not work. For example, connecting to the GitHub API from a pod returns: Error: getaddrinfo EAI_AGAIN api.github.com.
What you expected to happen:
Hostnames should resolve normally.
How to reproduce it (as minimally and precisely as possible):
sudo minikube start --vm-driver=none
kubectl create -f busybox.yaml (busybox manifest from the k8s docs)
kubectl exec -ti busybox -- nslookup kubernetes.default
Returns:
Server: 10.0.0.10
Address 1: 10.0.0.10
nslookup: can't resolve 'kubernetes.default'
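For reference, the busybox manifest from the Kubernetes DNS debugging docs looks roughly like this (the exact image tag may differ):
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox
    command:
      - sleep
      - "3600"
  restartPolicy: Always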
Output of minikube logs (if applicable):
:warning: It looks like dnsmasq is failing to start; tail from minikube logs:
Oct 03 16:34:18 glooming-asteroid localkube[26499]: I1003 16:34:18.653793 26499 kuberuntime_manager.go:457] Container {Name:dnsmasq Image:gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4 Command:[] Args:[-v=2 -logtostderr -configDir=/etc/k8s/dns/dnsmasq-nanny -restartDnsmasq=true -- -k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] WorkingDir: Ports:[{Name:dns HostPort:0 ContainerPort:53 Protocol:UDP HostIP:} {Name:dns-tcp HostPort:0 ContainerPort:53 Protocol:TCP HostIP:}] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:150 scale:-3} d:{Dec:<nil>} s:150m Format:DecimalSI} memory:{i:{value:20971520 scale:0} d:{Dec:<nil>} s:20Mi Format:BinarySI}]} VolumeMounts:[{Name:kube-dns-config ReadOnly:false MountPath:/etc/k8s/dns/dnsmasq-nanny SubPath:} {Name:default-token-dkjg2 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthcheck/dnsmasq,Port:10054,Host:,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:60,TimeoutSeconds:5,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:5,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Oct 03 16:34:18 glooming-asteroid localkube[26499]: I1003 16:34:18.653979 26499 kuberuntime_manager.go:741] checking backoff for container "dnsmasq" in pod "kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)"
Oct 03 16:34:18 glooming-asteroid localkube[26499]: I1003 16:34:18.654088 26499 kuberuntime_manager.go:751] Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)
Oct 03 16:34:18 glooming-asteroid localkube[26499]: E1003 16:34:18.654121 26499 pod_workers.go:182] Error syncing pod 7b11e42b-a79a-11e7-b83c-0090f5ed1486 ("kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)"), skipping: failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)"
Oct 03 16:34:32 glooming-asteroid localkube[26499]: I1003 16:34:32.653745 26499 kuberuntime_manager.go:457] Container {Name:dnsmasq Image:gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4 Command:[] Args:[-v=2 -logtostderr -configDir=/etc/k8s/dns/dnsmasq-nanny -restartDnsmasq=true -- -k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] WorkingDir: Ports:[{Name:dns HostPort:0 ContainerPort:53 Protocol:UDP HostIP:} {Name:dns-tcp HostPort:0 ContainerPort:53 Protocol:TCP HostIP:}] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:150 scale:-3} d:{Dec:<nil>} s:150m Format:DecimalSI} memory:{i:{value:20971520 scale:0} d:{Dec:<nil>} s:20Mi Format:BinarySI}]} VolumeMounts:[{Name:kube-dns-config ReadOnly:false MountPath:/etc/k8s/dns/dnsmasq-nanny SubPath:} {Name:default-token-dkjg2 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthcheck/dnsmasq,Port:10054,Host:,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:60,TimeoutSeconds:5,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:5,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Oct 03 16:34:32 glooming-asteroid localkube[26499]: I1003 16:34:32.653993 26499 kuberuntime_manager.go:741] checking backoff for container "dnsmasq" in pod "kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)"
Oct 03 16:34:32 glooming-asteroid localkube[26499]: I1003 16:34:32.654136 26499 kuberuntime_manager.go:751] Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)
Oct 03 16:34:32 glooming-asteroid localkube[26499]: E1003 16:34:32.654174 26499 pod_workers.go:182] Error syncing pod 7b11e42b-a79a-11e7-b83c-0090f5ed1486 ("kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)"), skipping: failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)"
Oct 03 16:34:44 glooming-asteroid localkube[26499]: I1003 16:34:44.654035 26499 kuberuntime_manager.go:457] Container {Name:dnsmasq Image:gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4 Command:[] Args:[-v=2 -logtostderr -configDir=/etc/k8s/dns/dnsmasq-nanny -restartDnsmasq=true -- -k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] WorkingDir: Ports:[{Name:dns HostPort:0 ContainerPort:53 Protocol:UDP HostIP:} {Name:dns-tcp HostPort:0 ContainerPort:53 Protocol:TCP HostIP:}] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[cpu:{i:{value:150 scale:-3} d:{Dec:<nil>} s:150m Format:DecimalSI} memory:{i:{value:20971520 scale:0} d:{Dec:<nil>} s:20Mi Format:BinarySI}]} VolumeMounts:[{Name:kube-dns-config ReadOnly:false MountPath:/etc/k8s/dns/dnsmasq-nanny SubPath:} {Name:default-token-dkjg2 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath:}] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:/healthcheck/dnsmasq,Port:10054,Host:,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:60,TimeoutSeconds:5,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:5,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:nil Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
Oct 03 16:34:44 glooming-asteroid localkube[26499]: I1003 16:34:44.654621 26499 kuberuntime_manager.go:741] checking backoff for container "dnsmasq" in pod "kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)"
Oct 03 16:34:44 glooming-asteroid localkube[26499]: I1003 16:34:44.655048 26499 kuberuntime_manager.go:751] Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-910330662-r7x4d_kube-system(7b11e42b-a79a-11e7-b83c-0090f5ed1486)
Anything else we need to know:
Some troubleshooting commands with output:
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
kube-dns-910330662-r7x4d 3/3 Running 11 20h
kubectl get svc --namespace=kube-system
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns 10.0.0.10 <none> 53/UDP,53/TCP 20h
kubernetes-dashboard 10.0.0.193 <nodes> 80:30000/TCP 20h
:warning: Endpoint is empty: kubectl get ep kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 20h
Tail from kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
I1003 14:43:04.930205 263 nanny.go:108] dnsmasq[280]: Maximum number of concurrent DNS queries reached (max: 150)
I1003 14:43:14.948913 263 nanny.go:108] dnsmasq[280]: Maximum number of concurrent DNS queries reached (max: 150)
I would be very happy with a workaround so I'm not stuck on this issue.
As a workaround, you may try deploying CoreDNS instead of kube-dns. If you do, take care to disable kube-dns in the addon manager ("minikube addons disable kube-dns").
Deployment: https://github.com/coredns/deployment/tree/master/kubernetes
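A minimal sketch of that workaround; the deploy.sh invocation is assumed from the linked repo, so check its README for the exact arguments (service CIDR and cluster domain):
# disable the built-in kube-dns addon first
minikube addons disable kube-dns
git clone https://github.com/coredns/deployment.git
cd deployment/kubernetes
# generate the CoreDNS manifest and apply it to the cluster
./deploy.sh | kubectl apply -f -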
:wave: thanks for your suggestion @chrisohaver, sadly it didn't solve my issue.
Could you provide any more information on the host you're running the None driver on?
@dlorenc yes, what kind of information would you like?
Is it just a stock ubuntu 16.04 installation? Have you done anything special with the network settings? Is it running on a VM, or a physical machine?
Yes, stock Ubuntu 16.04, running on a physical machine (a development laptop). Nothing special with the network settings as far as I know...
ifconfig output while minikube is running:
docker0 Link encap:Ethernet HWaddr 02:42:05:86:af:80
inet addr:172.17.0.1 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:5ff:fe86:af80/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:53165 errors:0 dropped:0 overruns:0 frame:0
TX packets:58600 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:5366119 (5.3 MB) TX bytes:20857731 (20.8 MB)
enp0s25 Link encap:Ethernet HWaddr 00:90:f5:ed:14:86
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Interrupt:20 Memory:f7e00000-f7e20000
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:1591921 errors:0 dropped:0 overruns:0 frame:0
TX packets:1591921 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:705999303 (705.9 MB) TX bytes:705999303 (705.9 MB)
veth4502c39 Link encap:Ethernet HWaddr e2:e3:46:ed:6d:3a
inet6 addr: fe80::e0e3:46ff:feed:6d3a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:4865 (4.8 KB)
veth5232af3 Link encap:Ethernet HWaddr 06:2c:3d:85:50:ee
inet6 addr: fe80::42c:3dff:fe85:50ee/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:68 errors:0 dropped:0 overruns:0 frame:0
TX packets:1027 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:22908 (22.9 KB) TX bytes:128739 (128.7 KB)
vethebfe5aa Link encap:Ethernet HWaddr ae:77:ef:4f:52:53
inet6 addr: fe80::ac77:efff:fe4f:5253/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:42 errors:0 dropped:0 overruns:0 frame:0
TX packets:63 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:5046 (5.0 KB) TX bytes:25277 (25.2 KB)
vethf27d244 Link encap:Ethernet HWaddr 9e:ae:23:6e:04:81
inet6 addr: fe80::9cae:23ff:fe6e:481/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:14 errors:0 dropped:0 overruns:0 frame:0
TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2257 (2.2 KB) TX bytes:7789 (7.7 KB)
vethf83f683 Link encap:Ethernet HWaddr 66:8c:af:9f:6f:08
inet6 addr: fe80::648c:afff:fe9f:6f08/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:250 (250.0 B) TX bytes:4865 (4.8 KB)
vethf85c67f Link encap:Ethernet HWaddr fa:55:14:f3:72:81
inet6 addr: fe80::f855:14ff:fef3:7281/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:250 (250.0 B) TX bytes:4865 (4.8 KB)
wlp3s0 Link encap:Ethernet HWaddr b4:b6:76:a2:13:b3
inet addr:192.168.3.192 Bcast:192.168.3.255 Mask:255.255.254.0
inet6 addr: fe80::4166:368b:55ff:fbd2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1300 Metric:1
RX packets:4290172 errors:0 dropped:0 overruns:0 frame:0
TX packets:2467188 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:5145628303 (5.1 GB) TX bytes:270743860 (270.7 MB)
It affects me too. DNS works with the VirtualBox driver and doesn't work with the none driver.
System:
Nothing fancy with networking or DNS as far as I know
cat /etc/resolv.conf
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.
nameserver 127.0.0.53
I'm using the newest minikube from GitHub.
kubectl exec -ti busybox -- nslookup kubernetes.default hangs with:
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
The kubedns container seems to start correctly; the logs are the same as in the VirtualBox version.
dnsmasq also seems to start correctly, and after a while I see messages about the concurrent query limit being reached:
I1020 12:24:30.649214 1 main.go:76] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I1020 12:24:30.649315 1 nanny.go:86] Starting dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I1020 12:24:30.705616 1 nanny.go:111]
W1020 12:24:30.705637 1 nanny.go:112] Got EOF from stdout
I1020 12:24:30.705687 1 nanny.go:108] dnsmasq[14]: started, version 2.78-security-prerelease cachesize 1000
I1020 12:24:30.705700 1 nanny.go:108] dnsmasq[14]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I1020 12:24:30.705702 1 nanny.go:108] dnsmasq[14]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I1020 12:24:30.705705 1 nanny.go:108] dnsmasq[14]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I1020 12:24:30.705707 1 nanny.go:108] dnsmasq[14]: using nameserver 127.0.0.1#10053 for domain cluster.local
I1020 12:24:30.705709 1 nanny.go:108] dnsmasq[14]: reading /etc/resolv.conf
I1020 12:24:30.705711 1 nanny.go:108] dnsmasq[14]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I1020 12:24:30.705713 1 nanny.go:108] dnsmasq[14]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I1020 12:24:30.705715 1 nanny.go:108] dnsmasq[14]: using nameserver 127.0.0.1#10053 for domain cluster.local
I1020 12:24:30.705717 1 nanny.go:108] dnsmasq[14]: using nameserver 127.0.0.53#53
I1020 12:24:30.705720 1 nanny.go:108] dnsmasq[14]: read /etc/hosts - 7 addresses
I1020 12:33:07.139512 1 nanny.go:108] dnsmasq[14]: Maximum number of concurrent DNS queries reached (max: 150)
I1020 12:33:17.149489 1 nanny.go:108] dnsmasq[14]: Maximum number of concurrent DNS queries reached (max: 150)
I1020 12:33:27.158616 1 nanny.go:108] dnsmasq[14]: Maximum number of concurrent DNS queries reached (max: 150)
I1020 12:33:37.164004 1 nanny.go:108] dnsmasq[14]: Maximum number of concurrent DNS queries reached (max: 150)
I1020 12:33:47.176527 1 nanny.go:108] dnsmasq[14]: Maximum number of concurrent DNS queries reached (max: 150)
I1020 12:33:57.188216 1 nanny.go:108] dnsmasq[14]: Maximum number of concurrent DNS queries reached (max: 150)
Logs from the sidecar container:
ERROR: logging before flag.Parse: I1020 11:43:05.394883 1 main.go:48] Version v1.14.4-2-g5584e04
ERROR: logging before flag.Parse: I1020 11:43:05.394935 1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
ERROR: logging before flag.Parse: I1020 11:43:05.394965 1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse: I1020 11:43:05.394995 1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse: W1020 11:43:22.399309 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:50271->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W1020 11:43:29.399631 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:50260->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W1020 11:43:36.399957 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:43683->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W1020 11:43:43.400251 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:38300->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W1020 11:43:50.400500 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:45071->127.0.0.1:53: i/o timeout
ERROR: logging before flag.Parse: W1020 11:44:04.187002 1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:60537->127.0.0.1:53: i/o timeout
...
Interestingly, switching to CoreDNS helps: it works, but with errors!
kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.0.0.10
Address 1: 10.0.0.10 coredns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.0.0.1 kubernetes.default.svc.cluster.local
Logs from kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=coredns -o name):
.:53
CoreDNS-011
2017/10/20 12:41:11 [INFO] CoreDNS-011
2017/10/20 12:41:11 [INFO] linux/amd64, go1.9, 1b60688d
linux/amd64, go1.9, 1b60688d
172.17.0.5 - [20/Oct/2017:12:44:25 +0000] "PTR IN 10.0.0.10.in-addr.arpa. udp 40 false 512" NOERROR qr,aa,rd,ra 91 192.043µs
127.0.0.1 - [20/Oct/2017:12:44:27 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 2.649373ms
127.0.0.1 - [20/Oct/2017:12:44:27 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 4.830272ms
127.0.0.1 - [20/Oct/2017:12:44:27 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 8.396219ms
127.0.0.1 - [20/Oct/2017:12:44:27 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 8.569993ms
127.0.0.1 - [20/Oct/2017:12:44:27 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 6.769748ms
127.0.0.1 - [20/Oct/2017:12:44:27 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 9.991105ms
20/Oct/2017:12:44:27 +0000 [ERROR 0 kubernetes.default. AAAA] unreachable backend: no upstream host
[ loooong spam of the same messages with SERVFAIL ]
127.0.0.1 - [20/Oct/2017:12:44:53 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 2.990799414s
172.17.0.5 - [20/Oct/2017:12:44:53 +0000] "AAAA IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 2.990947158s
172.17.0.5 - [20/Oct/2017:12:44:53 +0000] "AAAA IN kubernetes.default.default.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,rd,ra 115 39.579µs
172.17.0.5 - [20/Oct/2017:12:44:53 +0000] "AAAA IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,rd,ra 107 51.747µs
172.17.0.5 - [20/Oct/2017:12:44:53 +0000] "A IN kubernetes.default. udp 36 false 512" SERVFAIL qr,rd 36 33.742µs
20/Oct/2017:12:44:53 +0000 [ERROR 0 kubernetes.default. A] unreachable backend: no upstream host
172.17.0.5 - [20/Oct/2017:12:44:53 +0000] "A IN kubernetes.default.default.svc.cluster.local. udp 62 false 512" NXDOMAIN qr,rd,ra 115 54.224µs
172.17.0.5 - [20/Oct/2017:12:44:53 +0000] "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd,ra 70 84.575µs
172.17.0.5 - [20/Oct/2017:12:44:53 +0000] "PTR IN 1.0.0.10.in-addr.arpa. udp 39 false 512" NOERROR qr,aa,rd,ra 89 84.99µs
172.17.0.4 - [20/Oct/2017:12:53:07 +0000] "AAAA IN grafana.com.kube-system.svc.cluster.local. udp 59 false 512" NXDOMAIN qr,aa,rd,ra 112 155.98µs
172.17.0.4 - [20/Oct/2017:12:53:07 +0000] "A IN grafana.com.kube-system.svc.cluster.local. udp 59 false 512" NXDOMAIN qr,aa,rd,ra 112 60.314µs
172.17.0.4 - [20/Oct/2017:12:53:07 +0000] "A IN grafana.com.svc.cluster.local. udp 47 false 512" NXDOMAIN qr,aa,rd,ra 100 78.454µs
172.17.0.4 - [20/Oct/2017:12:53:07 +0000] "AAAA IN grafana.com.svc.cluster.local. udp 47 false 512" NXDOMAIN qr,aa,rd,ra 100 49.168µs
172.17.0.4 - [20/Oct/2017:12:53:07 +0000] "A IN grafana.com.cluster.local. udp 43 false 512" NXDOMAIN qr,aa,rd,ra 96 54.093µs
172.17.0.4 - [20/Oct/2017:12:53:07 +0000] "AAAA IN grafana.com.cluster.local. udp 43 false 512" NXDOMAIN qr,aa,rd,ra 96 53.235µs
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 3.110907ms
20/Oct/2017:12:53:08 +0000 [ERROR 0 grafana.com. A] unreachable backend: no upstream host
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 4.576001ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 6.993853ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "AAAA IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 107.552µs
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "AAAA IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 2.241218ms
20/Oct/2017:12:53:08 +0000 [ERROR 0 grafana.com. AAAA] unreachable backend: no upstream host
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 8.727964ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 11.217801ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 9.548173ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "AAAA IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 2.931659ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "AAAA IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 4.744248ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 12.459702ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 14.463761ms
127.0.0.1 - [20/Oct/2017:12:53:08 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 15.324802ms
127.0.0.1 - [20/Oct/2017:12:53:09 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 2.368374381s
127.0.0.1 - [20/Oct/2017:12:53:09 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 2.368445804s
172.17.0.4 - [20/Oct/2017:12:53:09 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 2.368663172s
172.17.0.4 - [20/Oct/2017:12:53:09 +0000] "A IN grafana.com. udp 29 false 512" SERVFAIL qr,rd 29 45.614µs
20/Oct/2017:12:53:09 +0000 [ERROR 0 grafana.com. A] unreachable backend: no upstream host
When those log entries appear, nslookup takes longer, but it still returns the correct result.
kubectl exec -ti busybox -- nslookup monitoring-grafana.kube-system
Server: 10.0.0.10
Address 1: 10.0.0.10 coredns.kube-system.svc.cluster.local
Name: monitoring-grafana.kube-system
Address 1: 10.0.0.133 monitoring-grafana.kube-system.svc.cluster.local
So kube-dns doesn't work at all, and CoreDNS works but isn't stable.
I'll test with kubeadm bootstrapper instead of localkube to see how things go.
Edit:
kubeadm with the none driver doesn't seem to work; the cluster doesn't start. I guess that's too many experimental features activated together :) kubeadm generates certificates for 127.0.0.1 and 10.0.0.1, but the components try to use my eth0 interface's IP, 192.168.42.13.
sudo minikube logs
paź 20 15:43:36 my-host kubelet[20915]: W1020 15:43:36.577232 20915 status_manager.go:431] Failed to get status for pod "kube-controller-manager-my-host_kube-system(653e629f9f8a6d3380c54427dbc4d941)": Get https://192.168.42.13:8443/api/v1/namespaces/kube-system/pods/kube-controller-manager-my-host: x509: certificate is valid for 127.0.0.1, 10.0.0.1, not 192.168.42.13
paź 20 15:43:36 my-host kubelet[20915]: W1020 15:43:36.581473 20915 status_manager.go:431] Failed to get status for pod "kube-apiserver-my-host_kube-system(81edacb80dc81e85783254aa3d65d40a)": Get https://192.168.42.13:8443/api/v1/namespaces/kube-system/pods/kube-apiserver-my-host: x509: certificate is valid for 127.0.0.1, 10.0.0.1, not 192.168.42.13
paź 20 15:43:36 my-host kubelet[20915]: E1020 15:43:36.644770 20915 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.42.13:8443/api/v1/pods?fieldSelector=spec.nodeName%3Dmy-host&resourceVersion=0: x509: certificate is valid for 127.0.0.1, 10.0.0.1, not 192.168.42.13
paź 20 15:43:37 my-host kubelet[20915]: E1020 15:43:37.056110 20915 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:422: Failed to list *v1.Node: Get https://192.168.42.13:8443/api/v1/nodes?fieldSelector=metadata.name%3Dmy-host&resourceVersion=0: x509: certificate is valid for 127.0.0.1, 10.0.0.1, not 192.168.42.13
OK, I think I've found the reason and the solution.
It's my /etc/resolv.conf; on Ubuntu 17.04 it contains:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.
nameserver 127.0.0.53
Most likely dnsmasq inherits this file and forwards upstream queries to 127.0.0.53; inside the kube-dns pod that address is served by dnsmasq itself (it binds port 53 on all addresses), so every external query loops back into it until the 150 concurrent query limit is exhausted.
I ran:
sudo systemctl stop systemd-resolved
sudo systemctl disable systemd-resolved
and edited /etc/resolv.conf to contain only:
nameserver 8.8.8.8
After that, the cluster and DNS work!
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
kube-dns-6fc954457d-p7sd9 3/3 Running 0 3m
kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default
Address 1: 10.0.0.1 kubernetes.default.svc.cluster.local
I'm not sure if it's a workaround or a proper solution, though. The /etc/resolv.conf setup might be the Ubuntu default. Should the none driver work with the original configuration, or should the configuration be changed?
It's the same issue as https://github.com/kubernetes/kubernetes/issues/45828, so it's not minikube specific. Unless minikube can implement a workaround so that the none driver works on Ubuntu Desktop out of the box?
@Siilwyn can you check if this is the case for you as well?
I can confirm @jgoclawski found the underlying issue and workaround! I think it would be nice if minikube solved this, since editing the system resolver config is not so nice.
Same issue here. DNS is not working with the none driver but works with the VirtualBox driver.
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
I1126 06:18:35.615019 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:18:45.630824 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:18:55.646374 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:19:05.662111 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:19:15.677589 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:19:25.693250 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:19:35.708908 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:19:45.725092 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:19:55.740858 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:20:05.756598 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:20:15.772095 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:20:25.788192 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:20:35.803835 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:20:45.819531 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
I1126 06:20:55.838375 1 nanny.go:108] dnsmasq[27]: Maximum number of concurrent DNS queries reached (max: 150)
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
I am also facing this issue with --vm-driver=none, where a pod fails to establish a connection with a server on the web. Exact error string: Get https://<the-server-name>.io: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers).
I am waiting for any solution, suggestion, or workaround.
CoreDNS didn't help here.
minikube version: v0.24.0
kubectl version: Client: 1.9.3 Server: 1.8.0
/remove-lifecycle stale
Follow-up on @jgoclawski's suggestion of disabling systemd-resolved:
when the systemd-networkd service is in use, the only steps needed are:
$ sudo rm /etc/resolv.conf
$ sudo ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf
With these, the DNS injected into minikube is the one from DHCP.
for reference take a look at http://xmodulo.com/switch-from-networkmanager-to-systemd-networkd.html
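As a quick sanity check (commands assumed, not from the linked article), /etc/resolv.conf should now bypass the stub resolver:
$ readlink -f /etc/resolv.conf
/run/systemd/resolve/resolv.conf
$ grep nameserver /etc/resolv.conf
# should list the real DHCP nameservers, not 127.0.0.53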
There's also a solution which doesn't involve changing the host system. Instead, you can stop kube-dns from using the host's resolv.conf by applying the following ConfigMap (details: https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/#configmap-options):
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  upstreamNameservers: |-
    ["8.8.8.8", "8.8.4.4"]
@jgoclawski This solved internet access, but now my nodes can't access the services...
Hmm, this solution is not working for me. I'm also facing the same issue. Unfortunately, none of the solutions presented here work for me!
Environment:
Minikube version: v0.25
OS: Ubuntu 18.04
VM Driver: none
Finally fixed this in my instance:
Google Cloud VM - Ubuntu 18.04
Minikube: v0.28.1
Started with:
sudo -E minikube start --vm-driver=none --kubernetes-version=v1.11.3
The answers above and reading https://zwischenzugs.com/2018/08/06/anatomy-of-a-linux-dns-lookup-part-iv/ really helped.
In my case I needed to run
sudo rm /etc/resolv.conf && sudo ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf
and then stop/start minikube.
For reasons I'm still trying to figure out, setting the CoreDNS upstream to 8.8.8.8 using the config below didn't help:
kind: ConfigMap
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           upstream 8.8.8.8
           fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        proxy . /etc/resolv.conf
        cache 30
        reload
    }
metadata:
  creationTimestamp: 2018-09-09T18:24:22Z
  name: coredns
  namespace: kube-system
  resourceVersion: "198"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
If anyone knows why ^ doesn't fix DNS resolution issues, I'd appreciate some pointers.
Change upstream 8.8.8.8 back to a bare upstream.
Change proxy . /etc/resolv.conf to proxy . 8.8.8.8.
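A sketch of making those two edits in place; the label selector is taken from the commands earlier in this thread:
# open the Corefile for editing and apply the two changes above
kubectl edit configmap coredns --namespace=kube-system
# restart CoreDNS so the change takes effect immediately
kubectl delete pod --namespace=kube-system -l k8s-app=coredns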
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Marking as obsolete since dnsmasq and localkube haven't been part of any release in the last 6 months.
Fixed this using a slightly modified YAML file from @bw2; I used this file:
fixcoredns.yml:
kind: ConfigMap
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           upstream 8.8.8.8 8.8.4.4
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
        }
        proxy . 8.8.8.8 8.8.4.4
        cache 30
        reload
    }
metadata:
  creationTimestamp: 2018-09-09T18:24:22Z
  name: coredns
  namespace: kube-system
  resourceVersion: "198"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
Changes from @bw2's yml file:
- upstream 8.8.8.8 became upstream 8.8.8.8 8.8.4.4
- proxy . /etc/resolv.conf became proxy . 8.8.8.8 8.8.4.4
- the prometheus :9153 line was removed
applied as follows:
kubectl apply -f fixcoredns.yml
I restarted all the kube-system pods with:
kubectl delete --all pods --namespace kube-system
then let them regenerate and confirmed the coredns containers were not in the CrashLoopBackOff state:
kubectl get pods --namespace=kube-system