Environmental Info:
K3s Version: 1.18.8
k3s -v
k3s version v1.18.8+k3s1 (6b595318)
Node(s) CPU architecture, OS, and Version:
Ubuntu 18.04 2 vCPUs AMD64.
Cluster Configuration:
Single node master
Describe the bug:
If the OS has a static IP assigned and /etc/resolv.conf is empty, the coredns pod goes into CrashLoopBackOff and doesn't start up.
Steps To Reproduce:
This might be tied to a coredns defect that has been fixed in a newer version of coredns:
https://github.com/coredns/coredns/issues/3735
Expected behavior:
coredns shouldn't crash
k3s kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
helm-install-traefik-mh426 0/1 Completed 0 15m
traefik-85565c8c68-zcbnp 1/1 Running 0 15m
coredns-7944c66d8d-w5twl 0/1 CrashLoopBackOff 7 15m
k3s kubectl -n kube-system logs --previous coredns-7944c66d8d-w5twl
plugin/forward: no nameservers found
Actual behavior:
The coredns pod is crashing and doesn't recover.
Additional context / logs:
Even with the coredns fix, this will result in failure - just without a crash loop. You need to populate upstream dns servers in /etc/resolv.conf.
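As a rough illustration of the point above, the check below reports whether a resolv.conf file lists any upstream nameserver at all. It is only a sketch: `has_nameserver` and the `RESOLV_CONF` override are hypothetical names introduced here, not part of k3s or coredns.

```shell
#!/bin/sh
# Hypothetical helper (not part of k3s or coredns): succeeds if the given
# file contains at least one uncommented "nameserver <address>" line.
has_nameserver() {
    grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "$1" 2>/dev/null
}

# RESOLV_CONF is an assumed override for testing; defaults to the real file.
resolv_conf="${RESOLV_CONF:-/etc/resolv.conf}"

if has_nameserver "$resolv_conf"; then
    echo "ok: $resolv_conf lists at least one nameserver"
else
    echo "warning: $resolv_conf has no nameserver entries"
fi
```

Running this before installing k3s on a statically addressed host should show whether the "no nameservers found" error from the forward plugin is to be expected.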
Thanks @brandond - we're currently on 1.18.6 and it works just fine with the same setup (static IP) and an empty /etc/resolv.conf. With 1.18.8, however, the coredns failure means nothing works.
Any chance that for the next release we upgrade to a coredns version where the https://github.com/coredns/coredns/issues/3735 issue is fixed?
@davidnuzik can we validate this? Sounds like a regression we might have picked up when bumping coredns?
Any thoughts on bumping the coredns version? We won't be able to upgrade to newer versions of k3s unless this bug is fixed, and I'd definitely like to be on as current a release as possible. Thanks!
@samirsss Issue seems to be resolved in k3s v1.19.3+k3s1 with empty /etc/resolv.conf and I was able to reproduce the issue with v1.18.8+k3s1
But coredns crashloops with v1.18.6 as well. So I am not sure if it was a regression.
v1.19.3+k3s1
root@ip-172-31-8-188:~# k3s -v
k3s version v1.19.3+k3s1 (974ad30b)
root@ip-172-31-8-188:~# crictl images |grep coredns
docker.io/rancher/coredns-coredns 1.6.9 4e797b3234604 43.3MB
root@ip-172-31-8-188:~# kubectl get pods -A |grep coredns
kube-system coredns-66c464876b-xt62k 1/1 Running 0 26m
v1.18.8+k3s1
root@ip-172-31-8-94:~# k3s -v
k3s version v1.18.8+k3s1 (6b595318)
root@ip-172-31-8-94:~# crictl images |grep coredns
docker.io/rancher/coredns-coredns 1.6.9 4e797b3234604 43.3MB
root@ip-172-31-8-94:~# kubectl get pods -A |grep coredns
kube-system coredns-7944c66d8d-sq6lt 0/1 CrashLoopBackOff 17 66m
v1.18.6+k3s1
root@ip-172-31-4-190:~# k3s -v
k3s version v1.18.6+k3s1 (6f56fa1d)
root@ip-172-31-4-190:~# crictl images |grep coredns
docker.io/rancher/coredns-coredns 1.6.3 c4d3d16fe508b 14.2MB
root@ip-172-31-4-190:~# kubectl get pods -A |grep coredns
kube-system coredns-8655855d6-ffts5 0/1 CrashLoopBackOff 15 56m
Steps followed to reproduce
wget https://github.com/rancher/k3s/releases/download/v1.19.3%2Bk3s1/k3s-airgap-images-amd64.tar
wget https://github.com/rancher/k3s/releases/download/v1.19.3%2Bk3s1/k3s
wget https://raw.githubusercontent.com/rancher/k3s/v1.19.3%2Bk3s1/install.sh
chmod u+x install.sh
chmod u+x k3s
cp k3s /usr/local/bin/
echo "" > /etc/resolv.conf
mkdir -p /var/lib/rancher/k3s/agent/images/
cp k3s-airgap-images-amd64.tar /var/lib/rancher/k3s/agent/images/
INSTALL_K3S_SKIP_DOWNLOAD=true ./install.sh
Update on the verification: by deleting /etc/resolv.conf and creating an empty file in its place, I was able to reproduce the issue, and coredns crashloops on v1.19.3 as well.
v1.18.6+k3s1
rm /etc/resolv.conf
touch /etc/resolv.conf
kubectl get pods -A |grep coredns
kube-system coredns-8655855d6-p2d66 1/1 Running 0 78m
v1.19.3+k3s1
root@ip-172-31-1-169:~# k3s -v
k3s version v1.19.3+k3s1 (974ad30b)
root@ip-172-31-1-169:~# kubectl get pods -A |grep coredns
kube-system coredns-66c464876b-2vxnw 0/1 CrashLoopBackOff 7 14m
root@ip-172-31-1-169:~#
@samirsss Does "empty resolv.conf" mean truncating it to size 0, which retains the symlink, or deleting resolv.conf and creating a blank file, which removes the symlink?
Also, could you help us understand the use case for leaving /etc/resolv.conf empty?
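To tell the two cases in this question apart on a given host, something like the following sketch could be used. `describe_resolv_conf` and the `RESOLV_CONF` override are hypothetical names introduced for illustration. Note that `echo "" > /etc/resolv.conf` writes a single newline through an existing symlink, so the file ends up 1 byte long rather than 0, whereas `rm` followed by `touch` leaves a 0-byte regular file.

```shell
#!/bin/sh
# Hypothetical helper: prints whether the path is a symlink, a regular
# file, or missing, followed by the size of the content it resolves to.
describe_resolv_conf() {
    f="$1"
    if [ -L "$f" ]; then
        kind="symlink"
    elif [ -f "$f" ]; then
        kind="regular file"
    else
        kind="missing"
    fi
    if [ -r "$f" ]; then
        # wc -c may pad its output with spaces on some systems; strip them.
        bytes=$(wc -c < "$f" | tr -d '[:space:]')
    else
        bytes=0
    fi
    echo "$kind, $bytes bytes"
}

# RESOLV_CONF is an assumed override for testing; defaults to the real file.
describe_resolv_conf "${RESOLV_CONF:-/etc/resolv.conf}"
```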
Sorry for the delay - I'll try this out this week with the latest version of 1.19.3 and see if it fixes the issue.
@ShylajaDevadiga - the issue I saw occurred when I had assigned a static IP to my VM and was trying to deploy k3s. I wasn't explicitly setting the size of resolv.conf to 0. I'll still try this out and get back to you. My lab's down today but should be functional tomorrow, so I can give you some data then.