Kops version: 1.8.0 (git-5099bc5)
Kubernetes version: v1.8.6 (6260bb08c46c31eea6cb538b34a9ceb3e406689c)
Cloud: AWS
When I create a brand new cluster, everything appears to be working fine: all the masters and workers are ready and I can deploy application pods. But most of the time the pods cannot resolve public DNS names like www.google.com, or even internal names like myservice.default. When I run ping www.google.com, either the command takes a long time (over 10 seconds) and eventually reports that the name could not be resolved, or it takes a long time and eventually starts pinging Google. It's as if kube-dns is failing most of the time, but not always.
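A quick way to reproduce the symptom from inside the cluster (a sketch; the busybox image and the test pod name are just examples, not part of this cluster):
# Run a throwaway pod and try an external and an internal lookup
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup www.google.com
kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup myservice.default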
How I created the cluster:
kops create cluster \
--api-loadbalancer-type internal \
--associate-public-ip=false \
--cloud=aws \
--dns private \
--image "595879546273/CoreOS-stable-1632.2.1-hvm" \
--master-count 3 \
--master-size t2.small \
--master-zones "us-east-1b,us-east-1c,us-east-1d" \
--name=stg-us-east-1.k8s.local \
--network-cidr 10.0.64.0/22 \
--networking flannel \
--node-count 5 \
--node-size t2.small \
--out . \
--output json \
--ssh-public-key ~/.ssh/mykey.pub \
--state s3://mybucket \
--target=terraform \
--topology private \
--vpc vpc-3153eb2e \
--zones "us-east-1b,us-east-1c,us-east-1d"
Modified subnets (kops edit cluster) as per https://github.com/kubernetes/kops/blob/master/docs/run_in_existing_vpc.md
Updated cluster config (kops update cluster) and deployed everything (terraform apply).
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: 2018-02-05T16:07:43Z
  name: stg-us-east-1.k8s.local
spec:
  api:
    loadBalancer:
      type: Internal
  authorization:
    alwaysAllow: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://mybucket/stg-us-east-1.k8s.local
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1c
      name: c
    - instanceGroup: master-us-east-1d
      name: d
    name: main
  - etcdMembers:
    - instanceGroup: master-us-east-1b
      name: b
    - instanceGroup: master-us-east-1c
      name: c
    - instanceGroup: master-us-east-1d
      name: d
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.8.6
  masterInternalName: api.internal.stg-us-east-1.k8s.local
  masterPublicName: api.stg-us-east-1.k8s.local
  networkCIDR: 10.0.64.0/22
  networkID: vpc-73cfbb0a
  networking:
    flannel:
      backend: vxlan
  nonMasqueradeCIDR: 100.64.0.0/10
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 10.0.65.0/24
    egress: nat-012ee02a09a7830d2
    id: subnet-de86e6f2
    name: us-east-1b
    type: Private
    zone: us-east-1b
  - cidr: 10.0.66.0/24
    egress: nat-012ee02a09a7830d2
    id: subnet-5fb5ef17
    name: us-east-1c
    type: Private
    zone: us-east-1c
  - cidr: 10.0.67.0/24
    egress: nat-012ee02a09a7830d2
    id: subnet-b13da5eb
    name: us-east-1d
    type: Private
    zone: us-east-1d
  - cidr: 10.0.64.32/27
    id: subnet-d68bebfa
    name: utility-us-east-1b
    type: Utility
    zone: us-east-1b
  - cidr: 10.0.64.96/27
    id: subnet-cbb0ea83
    name: utility-us-east-1c
    type: Utility
    zone: us-east-1c
  - cidr: 10.0.64.160/27
    id: subnet-f23ea6a8
    name: utility-us-east-1d
    type: Utility
    zone: us-east-1d
  topology:
    dns:
      type: Private
    masters: private
    nodes: private
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2018-02-05T16:07:43Z
  labels:
    kops.k8s.io/cluster: stg-us-east-1.k8s.local
  name: master-us-east-1b
spec:
  associatePublicIp: false
  image: 595879546273/CoreOS-stable-1632.2.1-hvm
  machineType: t2.small
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1b
  role: Master
  subnets:
  - us-east-1b
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2018-02-05T16:07:43Z
  labels:
    kops.k8s.io/cluster: stg-us-east-1.k8s.local
  name: master-us-east-1c
spec:
  associatePublicIp: false
  image: 595879546273/CoreOS-stable-1632.2.1-hvm
  machineType: t2.small
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1c
  role: Master
  subnets:
  - us-east-1c
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2018-02-05T16:07:43Z
  labels:
    kops.k8s.io/cluster: stg-us-east-1.k8s.local
  name: master-us-east-1d
spec:
  associatePublicIp: false
  image: 595879546273/CoreOS-stable-1632.2.1-hvm
  machineType: t2.small
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1d
  role: Master
  subnets:
  - us-east-1d
---
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: 2018-02-05T16:07:43Z
  labels:
    kops.k8s.io/cluster: stg-us-east-1.k8s.local
  name: nodes
spec:
  associatePublicIp: false
  image: 595879546273/CoreOS-stable-1632.2.1-hvm
  machineType: t2.small
  maxSize: 3
  minSize: 3
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  subnets:
  - us-east-1b
  - us-east-1c
  - us-east-1d
Contents of resolv.conf on a node:
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known DNS servers.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.
nameserver 10.0.64.2
search ec2.internal
Contents of resolv.conf on a system pod:
nameserver 10.0.64.2
search ec2.internal
Contents of resolv.conf on an application pod:
nameserver 100.64.0.10
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
options ndots:5
Things I have tried:
Adding options single-request-reopen to resolv.conf on the application pods, as discussed in https://github.com/kubernetes/kubernetes/issues/56903, but this made no difference.
Removing options ndots:5 from the application pod resolv.conf, as described in other places, but this made no difference.
After further investigation, I have five workers and only two of those worker nodes have the DNS problem in any application pod they run. The two nodes that have this problem in their pods are the two nodes that are running the kube-dns pods.
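One way to confirm that correlation is to compare where kube-dns is scheduled with where the affected application pods run (a sketch; it assumes the standard k8s-app=kube-dns label on the kube-dns pods):
# Which nodes run kube-dns?
kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
# Which nodes run the affected application pods?
kubectl get pods -o wide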
Some logs:
kubedns log (same on both kube-dns pods):
I0205 17:16:00.273097 1 dns.go:48] version: 1.14.4-2-g5584e04
I0205 17:16:00.280277 1 server.go:70] Using configuration read from directory: /kube-dns-config with period 10s
I0205 17:16:00.280336 1 server.go:113] FLAG: --alsologtostderr="false"
I0205 17:16:00.280346 1 server.go:113] FLAG: --config-dir="/kube-dns-config"
I0205 17:16:00.280353 1 server.go:113] FLAG: --config-map=""
I0205 17:16:00.280358 1 server.go:113] FLAG: --config-map-namespace="kube-system"
I0205 17:16:00.280363 1 server.go:113] FLAG: --config-period="10s"
I0205 17:16:00.280369 1 server.go:113] FLAG: --dns-bind-address="0.0.0.0"
I0205 17:16:00.280374 1 server.go:113] FLAG: --dns-port="10053"
I0205 17:16:00.280380 1 server.go:113] FLAG: --domain="cluster.local."
I0205 17:16:00.280387 1 server.go:113] FLAG: --federations=""
I0205 17:16:00.280394 1 server.go:113] FLAG: --healthz-port="8081"
I0205 17:16:00.280398 1 server.go:113] FLAG: --initial-sync-timeout="1m0s"
I0205 17:16:00.280403 1 server.go:113] FLAG: --kube-master-url=""
I0205 17:16:00.280409 1 server.go:113] FLAG: --kubecfg-file=""
I0205 17:16:00.280413 1 server.go:113] FLAG: --log-backtrace-at=":0"
I0205 17:16:00.280421 1 server.go:113] FLAG: --log-dir=""
I0205 17:16:00.280426 1 server.go:113] FLAG: --log-flush-frequency="5s"
I0205 17:16:00.280431 1 server.go:113] FLAG: --logtostderr="true"
I0205 17:16:00.280436 1 server.go:113] FLAG: --nameservers=""
I0205 17:16:00.280440 1 server.go:113] FLAG: --stderrthreshold="2"
I0205 17:16:00.280444 1 server.go:113] FLAG: --v="2"
I0205 17:16:00.280449 1 server.go:113] FLAG: --version="false"
I0205 17:16:00.280457 1 server.go:113] FLAG: --vmodule=""
I0205 17:16:00.291904 1 server.go:176] Starting SkyDNS server (0.0.0.0:10053)
I0205 17:16:00.299519 1 server.go:198] Skydns metrics enabled (/metrics:10055)
I0205 17:16:00.299529 1 dns.go:147] Starting endpointsController
I0205 17:16:00.299532 1 dns.go:150] Starting serviceController
I0205 17:16:00.300109 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0205 17:16:00.300120 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0205 17:16:00.799732 1 dns.go:171] Initialized services and endpoints from apiserver
I0205 17:16:00.799852 1 server.go:129] Setting up Healthz Handler (/readiness)
I0205 17:16:00.799874 1 server.go:134] Setting up cache handler (/cache)
I0205 17:16:00.799894 1 server.go:120] Status HTTP port 8081
dnsmasq logs (same on both kube-dns pods):
I0205 17:16:00.182836 1 main.go:76] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/in6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I0205 17:16:00.186789 1 nanny.go:86] Starting dnsmasq [-k --cache-size=1000 --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/in6.arpa/127.0.0.1#10053]
I0205 17:16:01.053760 1 nanny.go:111]
W0205 17:16:01.053864 1 nanny.go:112] Got EOF from stdout
I0205 17:16:01.054055 1 nanny.go:108] dnsmasq[8]: started, version 2.78-security-prerelease cachesize 1000
I0205 17:16:01.054137 1 nanny.go:108] dnsmasq[8]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I0205 17:16:01.054171 1 nanny.go:108] dnsmasq[8]: using nameserver 127.0.0.1#10053 for domain in6.arpa
I0205 17:16:01.054206 1 nanny.go:108] dnsmasq[8]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0205 17:16:01.054233 1 nanny.go:108] dnsmasq[8]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0205 17:16:01.054318 1 nanny.go:108] dnsmasq[8]: reading /etc/resolv.conf
I0205 17:16:01.054351 1 nanny.go:108] dnsmasq[8]: using nameserver 127.0.0.1#10053 for domain in6.arpa
I0205 17:16:01.054378 1 nanny.go:108] dnsmasq[8]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0205 17:16:01.054445 1 nanny.go:108] dnsmasq[8]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0205 17:16:01.054476 1 nanny.go:108] dnsmasq[8]: using nameserver 10.0.64.2#53
I0205 17:16:01.054538 1 nanny.go:108] dnsmasq[8]: read /etc/hosts - 7 addresses
sidecar log (same on both kube-dns pods):
ERROR: logging before flag.Parse: I0205 17:16:01.163076 1 main.go:48] Version v1.14.4-2-g5584e04
ERROR: logging before flag.Parse: I0205 17:16:01.163230 1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
ERROR: logging before flag.Parse: I0205 17:16:01.163269 1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
ERROR: logging before flag.Parse: I0205 17:16:01.163315 1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:1}
I rebuilt this cluster without specifying the image to use, so the cluster was built with Debian Jessie instead of CoreOS (k8s-1.8-debian-jessie-amd64-hvm-ebs-2018-01-14 (ami-8ec0e1f4) instead of 595879546273/CoreOS-stable-1632.2.1-hvm (ami-a53335df)) and this problem is solved :thinking:
So it seems like it's an issue in CoreOS or the provisioning code for CoreOS.
@joelittlejohn :
I think this has to do with kubernetes/kubernetes#21613
Fix that appears to work for us is to run sudo modprobe br_netfilter on all cluster nodes.
This affected our clusters using CoreOS AMI also!
Symptoms are that DNS responses come from an unexpected source IP. When a DNS lookup is made, the client sends a packet to the kube-dns _Service IP_. It then receives the response from the _Pod IP_, which is dropped because the sender doesn't know it's talking to the pod... just the _Service IP_.
Symptoms are (when doing lookup with dig against _Service IP_):
# Symptoms: dig svc-name.svc.cluster.local returns "reply from unexpected source" error such as:
root@example-pod-64598c547d-z9vb4:/# dig @100.64.0.12 svc-name.svc.cluster.local
;; reply from unexpected source: 100.96.3.3#53, expected 100.64.0.12#53
;; reply from unexpected source: 100.96.3.3#53, expected 100.64.0.12#53
;; reply from unexpected source: 100.96.3.3#53, expected 100.64.0.12#53
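A quick check on an affected node to confirm this is the bridge-netfilter problem (a sketch; the sysctl file only exists once the module is loaded):
lsmod | grep br_netfilter                          # empty output means the module is not loaded
cat /proc/sys/net/bridge/bridge-nf-call-iptables   # should print 1; the file is absent if br_netfilter isn't loaded
sudo modprobe br_netfilter                         # the immediate workaround described above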
Not sure where in the kops provisioning process this should go... but it can probably be solved by dropping a file into /etc/modules-load.d/ to load this kernel module:
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
If using cloud-init or Ignition, there are equivalent options.
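For example, on Container Linux a cloud-config fragment along these lines should persist the module across reboots (a sketch, not something kops generates today):
#cloud-config
write_files:
  - path: /etc/modules-load.d/br_netfilter.conf
    permissions: "0644"
    content: br_netfilter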
@trinitronx Thanks, loading this module completely fixed the problem! I had found a bunch of potential solutions in the Kubernetes issues list, but not this one :joy:
So it looks like kops should add echo br_netfilter > /etc/modules-load.d/br_netfilter.conf (or something else to load this module) as part of provisioning a CoreOS cluster, because right now the CoreOS clusters that kops creates are broken :thinking:
I fixed this in my own kops cluster by editing the cluster:
kops edit cluster stg-us-east-1.k8s.local --state s3://mybucket
and adding a hook:
- manifest: |
    Type=oneshot
    ExecStart=/usr/sbin/modprobe br_netfilter
  name: fix-dns.service
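After editing, the change still has to be applied and the nodes rolled; with the same state store as above that is roughly the following (adjust for the terraform workflow used earlier):
kops update cluster stg-us-east-1.k8s.local --state s3://mybucket --yes
kops rolling-update cluster stg-us-east-1.k8s.local --state s3://mybucket --yes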
/assign @KashifSaadat @gambol99
calling the CoreOS gurus :)
Yeah .. we've hit this one before, it's an old bug (albeit not in kops but in a previous installer we used, fixed with the same hack as above) ... The br_netfilter module needs to be enabled so that iptables forces all packets, even those traversing the bridge, to go through the pre-routing tables .. I'm surprised kube-proxy doesn't try to modprobe this itself, much like here.
We seem to have this enabled already, without explicitly doing a modprobe hack .. but we are using canal, so perhaps either the version of flannel or Calico is doing it for us. Let me do a quick test with CoreOS-stable-1632.2.1-hvm to rule out the OS version.
core@ip-10-250-29-239 /etc/cni $ sudo lsmod | grep br_netfilter
br_netfilter 24576 0
bridge 151552 1 br_netfilter
I noticed in the logs of CoreOS-stable-1632.2.1-hvm
Feb 08 11:26:15 ip-10-250-101-49.eu-west-2.compute.internal kernel: bridge:
filtering via arp/ip/ip6tables is no longer available by default. Update your scripts
to load br_netfilter if you need this
It might be worth raising on the flannel repo to get an official response ..
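For reference, once br_netfilter is loaded it is these sysctls that send bridged traffic through iptables; they default to 1 when the module loads, but they can also be pinned explicitly (a sketch):
sudo sysctl net.bridge.bridge-nf-call-iptables=1
sudo sysctl net.bridge.bridge-nf-call-ip6tables=1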
Like @joelittlejohn, we were able to fix this on cluster create by adding the following hook via kops edit cluster:
Under spec:
hooks:
- name: fix-dns.service
  roles:
  - Node
  - Master
  before:
  - network-pre.target
  - kubelet.service
  manifest: |
    Type=oneshot
    ExecStart=/usr/sbin/modprobe br_netfilter
    [Unit]
    Wants=network-pre.target
    [Install]
    WantedBy=multi-user.target
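The unit that ends up on each node should look roughly like this (an assumption about how kops assembles the hook; the exact rendering may differ):
[Unit]
Wants=network-pre.target
Before=network-pre.target kubelet.service
[Service]
Type=oneshot
ExecStart=/usr/sbin/modprobe br_netfilter
[Install]
WantedBy=multi-user.target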
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
Does anyone who has commented here know whether this problem would still affect newly built clusters using all the latest versions (kops 1.9, CoreOS, Kubernetes 1.9, flannel)?
I'm loath to just close this because it seems like such a massive bug: "DNS broken on brand new cluster". It's not yet clear to me whether this should be fixed in flannel, in Kubernetes, or in kops.
Still seeing the issue, but with Calico instead of flannel:
kops 1.9
kubernetes 1.9.7
CoreOS -> https://coreos.com/dist/aws/aws-stable.json for my region
Calico
Not able to resolve internal DNS entries. I'll try the workaround described above.
I can confirm that on a cluster built with Kops 1.9.0, running CoreOS 1745.4.0 (Stable), br_netfilter is not loaded on boot.
Encountering this issue as well. Tried the 'hook' solution [ https://github.com/kubernetes/kops/issues/4391#issuecomment-364321275 ] though it didn't work.
Verified by deploying a radial/busyboxplus pod onto the node running kube-dns. Pinged a pod on the same node: no response. Pinged a pod on a different node: a response was received. Both pods were pinged by their respective service names, i.e. service-name.namespace.
Also tested the same resources on minikube where everything is hosted on a single node. I had no issues there.
Versioning
Cloud-provider: gce
Networking: kubenet
Kernel Version: 4.4.64+
OS Image: cos-cloud/cos-stable-60-9592-90-0
Container Runtime Version: docker://1.13.1
Kubelet Version: v1.8.7
Kube-Proxy Version: v1.8.7
Operating system: linux
Architecture: amd64
re br_netfilter
modprobe -r br_netfilter
So I did add force-loading of the module in https://github.com/kubernetes/kops/pull/5490. Hopefully that will help with the CoreOS issue.
@RobertDiebels I'm not sure ping works very well anyway between pods. I did try with COS on GCE and wasn't able to reproduce a problem doing curl -v -k https://kubernetes, but I only realized now that you were pinging so I'll have to try that case. But I wouldn't recommend using ping as a health-check anyway.
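A curl-based check in that spirit might look like this (a sketch; the image is the one mentioned earlier in the thread and the service name is just an example):
kubectl run -it --rm curl-test --image=radial/busyboxplus:curl --restart=Never -- curl -v -k https://kubernetes.default.svc.cluster.local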
@justinsb Thanks for the tip. I'll try the same using curl. Will report back here today.
EDIT: I just ran my code again and everything seems to be working. This time I waited 10 minutes before doing anything; before, I waited approx. 5 minutes. So I believe my issue was due to the time it takes to initialize. Disregard my earlier comment.
EDIT2: It appears it wasn't due to the initialization time. It was due to me running 2 clusters in the same gce project. Opening a new ticket for that issue.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
We were experiencing a different symptom, but probably due to the same underlying problem reported in this issue (another related one), and we could fix it by following exactly what @trinitronx shared. All the details can be found here. It would be nice if kops could take care of this automatically, since it took us a large amount of time and effort to figure out what the problem was (and meanwhile, our users were affected by plenty of timeouts).
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Same issue on AWS, Kops v1.10.0. I have tried creating K8s clusters with both Debian Stretch (not the standard Jessie) and Amazon Linux. Both have multiple pods failing due to DNS timeouts. In fact, even one of the DNS pods is failing, though the other is fine:
kube-system kube-dns-5fbcb4d67b-kfccn 0/3 CrashLoopBackOff 53 1h
Errors:
I1214 05:23:24.685877 1 dns.go:219] Waiting for [endpoints services] to be initialized from apiserver...
F1214 05:23:25.185865 1 dns.go:209] Timeout waiting for initialization
I've tried the 'hook' solution without success.
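When kube-dns crash-loops like this, it can help to check which node the failing pod is on compared to the healthy one, and to pull the previous container's logs (a sketch using the pod name from the output above):
kubectl -n kube-system get pods -o wide | grep kube-dns
kubectl -n kube-system logs kube-dns-5fbcb4d67b-kfccn -c kubedns --previous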
Just tried this again after upgrading to Kops beta v1.11 and rebuilding a clean cluster. My cluster is now working without any DNS issues.
curl -Lo kops https://github.com/kubernetes/kops/releases/download/1.11.0-beta.1/kops-linux-amd64
chmod +x ./kops
sudo mv ./kops /usr/local/bin/
I have been unable to get DNS working in a new cluster using Kops 1.11 beta, stable channel. I've tried kube-dns and core-dns with weave as the overlay. No luck. I've tried the hook fix. No luck.
@MCLDG What AMI, DNS provider, and overlay are you using?
Hook fix did not work because...um, the module was already loaded, so that makes sense. So maybe I have a different issue. Been unable to get DNS working reliably. Sometimes it will work after a long delay. Then fail on the same lookup. Sometimes it times out. Sometimes if I kill one of the DNS servers (take replicas down to 1) - things work perfect! Then when I try to reproduce in a new cluster, taking down to one pod does NOT work. Super frustrating not being able to reproduce the issue reliably (or reproduce a fix reliably).
@michaelajr, my create cluster command looks as follows. Default AMI, built-in K8s DNS and AWS VPC networking.
kops create cluster \
--node-count 2 \
--zones ap-southeast-1a,ap-southeast-1b,ap-southeast-1c \
--master-zones ap-southeast-1a,ap-southeast-1b,ap-southeast-1c \
--node-size m5.large \
--master-size t2.medium \
--topology private \
--networking amazon-vpc-routed-eni \
${NAME}
Some of my DNS calls would succeed. The pattern that seemed to work was where pod A called pod B and both were on the same worker node. If the pods were on different nodes the call would fail, though this wasn't consistent.
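One way to pin that pattern down is to note which node each pod landed on and repeat the lookup from a pod on the same node and a pod on a different node (a sketch; the pod and service names are placeholders):
kubectl get pods -o wide                                      # note the NODE column
kubectl exec -it <pod-on-same-node> -- nslookup <service-name>.<namespace>
kubectl exec -it <pod-on-other-node> -- nslookup <service-name>.<namespace>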
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.