[firewalld] kind doesn't work on Fedora 32

Created on 2 May 2020 · 19 comments · Source: kubernetes-sigs/kind

What happened:

After upgrading to Fedora 32, I can no longer create a kind cluster.

What you expected to happen:

My kind cluster to be created.

How to reproduce it (as minimally and precisely as possible):

kind create cluster --config=config.yaml

Where config.yaml is...

kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
networking:
  disableDefaultCNI: True
  podSubnet: "10.254.0.0/16"
  serviceSubnet: "172.30.0.0/16"
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    listenAddress: 0.0.0.0
  - containerPort: 443
    hostPort: 443
    listenAddress: 0.0.0.0
- role: worker
- role: worker

Anything else we need to know?:

Output/trace of running with -v 10: https://gist.github.com/christianh814/abbf1964b9224c8940864d02b9236128

I figured maybe something was stale and ran docker network rm kind and re-ran the command. This time I looked at the logs on my laptop and saw...

May 01 16:51:17 laptop audit[98494]: SERVICE_STOP pid=98494 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:spc_t:s0 msg='unit=kubelet comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
May 01 16:51:17 laptop audit[98423]: SERVICE_STOP pid=98423 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:spc_t:s0 msg='unit=kubelet comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'

Okay... so I ran docker exec into one of the workers and saw...

May 01 23:44:19 kind-worker systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
May 01 23:44:19 kind-worker systemd[1]: kubelet.service: Failed with result 'exit-code'.
May 01 23:44:20 kind-worker systemd[1]: kubelet.service: Service RestartSec=1s expired, scheduling restart.
May 01 23:44:20 kind-worker systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 217.
May 01 23:44:20 kind-worker systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
May 01 23:44:20 kind-worker systemd[1]: Started kubelet: The Kubernetes Node Agent.
May 01 23:44:20 kind-worker kubelet[3469]: Flag --fail-swap-on has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
May 01 23:44:20 kind-worker kubelet[3469]: F0501 23:44:20.360424    3469 server.go:199] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory

And indeed it's not there

root@kind-worker:/var/lib# ls -1 /var/lib/kubelet/config.yaml
ls: cannot access '/var/lib/kubelet/config.yaml': No such file or directory

Strange that a plain kind create cluster (without this config) DOES work fine.

Environment:

  • kind version: (use kind version):
$ kind version
kind v0.8.0 go1.14.2 linux/amd64
  • Kubernetes version: (use kubectl version):
$ kubectl version --client
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info):
$ docker version
Client:
 Version:           19.03.8
 API version:       1.40
 Go version:        go1.14rc1
 Git commit:        afacb8b
 Built:             Mon Mar 16 15:45:37 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.14rc1
  Git commit:       afacb8b
  Built:            Mon Mar 16 00:00:00 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.3
  GitCommit:        
 runc:
  Version:          1.0.0-rc10+dev
  GitCommit:        fbdbaf85ecbc0e077f336c03062710435607dbf1
 docker-init:
  Version:          0.18.0
  GitCommit:        
  • OS (e.g. from /etc/os-release):
$ cat /etc/fedora-release 
Fedora release 32 (Thirty Two)
$ uname -a
Linux laptop 5.6.7-300.fc32.x86_64 #1 SMP Thu Apr 23 14:13:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Labels: good first issue · help wanted · kind/bug · kind/documentation · kind/external · priority/important-soon

All 19 comments

/var/lib/kubelet/config.yaml does not exist initially; this is normal. I wish kubeadm would make this clearer :/

during kubeadm's bootstrapping the kubelet config does not exist yet, and the kubelet crash-loops until the config is populated.
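
a quick way to watch this from the host is to exec into a node container (the node name "kind-worker" here is taken from the logs above; adjust to your cluster's node names):

docker exec kind-worker systemctl status kubelet --no-pager
# /var/lib/kubelet/config.yaml is written during kubeadm init/join, so it
# only shows up once bootstrapping gets that far:
docker exec kind-worker ls -l /var/lib/kubelet/config.yaml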

can you share the full kind export logs in an archive? there's not a lot to go on here short of getting my hands on an identical host...
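
for reference, producing that archive is just (the output directory name is arbitrary):

kind export logs ./kind-logs
tar -czf kind-logs.tar.gz ./kind-logs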

this config works on ubuntu 20.04 and kind v0.8.1 w/ ipv6 disabled. will have to reboot to sanity-check the more common ipv6-enabled case.

So it failed again with the following config...

kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
networking:
  disableDefaultCNI: True
  podSubnet: "10.254.0.0/16"
  serviceSubnet: "172.30.0.0/16"
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker

So I tried a simpler config...

kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane
- role: worker

So it's network related. I'll try 0.8.1 to see.

I'll also upload the logs

v0.8.1 gave me the same result. I believe it's network-related.

for anyone following along, discussion in this thread: https://kubernetes.slack.com/archives/CEKK1KTN2/p1588378366006900

looking around it sounds like firewalld and docker do not work well together https://github.com/firewalld/firewalld/issues/461

apparently disabling firewalld worked

i'm not sure what we can do here; based on the logs in slack it seems that firewalld breaks containers' ability to reach each other over a docker network, which is standard docker functionality (e.g. compose uses this)
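
you can see the breakage without kind at all: two containers on a user-defined docker network should be able to reach each other by name (the network/container names below are arbitrary):

docker network create testnet
docker run -d --rm --name pong --network testnet alpine sleep 300
# on a healthy host this resolves "pong" via docker's embedded DNS and gets
# replies; on an affected fedora 32 host it times out:
docker run --rm --network testnet alpine ping -c 3 pong
docker rm -f pong && docker network rm testnet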

I think short of fully disabling firewalld, you can do:

# trust traffic on the docker bridge
firewall-cmd --permanent --zone=trusted --add-interface=docker0
# find which zone your uplink interface is in
firewall-cmd --get-zone-of-interface=<your eth interface>
# enable masquerading (NAT) for that zone
firewall-cmd --zone=<zone from above> --add-masquerade --permanent
firewall-cmd --reload

(btw this was from https://github.com/docker/for-linux/issues/955#issuecomment-621141128)
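
For a concrete rendering of the above, assuming the uplink is eth0 and it sits in the default FedoraWorkstation zone (both are assumptions; run the query below and substitute your own):

firewall-cmd --get-zone-of-interface=eth0        # e.g. prints "FedoraWorkstation"
firewall-cmd --permanent --zone=trusted --add-interface=docker0
firewall-cmd --permanent --zone=FedoraWorkstation --add-masquerade
firewall-cmd --reload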

Digging more, this seems to get all the Docker-relevant networking working for our CI with Fedora 32 except the KIND bits 😅. I've only gotten KIND working by disabling firewalld and enabling iptables:

# stop and disable firewalld
sudo systemctl stop firewalld
sudo systemctl disable firewalld
# switch to the classic iptables services
sudo dnf install -y iptables-services
sudo touch /etc/sysconfig/iptables
sudo touch /etc/sysconfig/ip6tables
sudo systemctl start iptables
sudo systemctl start ip6tables
sudo systemctl enable iptables
sudo systemctl enable ip6tables
# flush any leftover filter rules and chains
sudo iptables -t filter -F
sudo iptables -t filter -X
# restart docker so it re-creates its own rules
sudo systemctl restart docker
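
A quick sanity check before recreating the cluster (the cluster name defaults to "kind"; adjust if you named yours):

systemctl is-active firewalld    # should print "inactive"
sudo iptables -S | head          # default ACCEPT policies plus docker's own chains
kind delete cluster
kind create cluster --config=config.yaml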

Update. So on F32, I got it working with firewalld by changing FirewallBackend in /etc/firewalld/firewalld.conf from nftables to iptables and restarting docker.

# grep 'FirewallBackend=iptables' /etc/firewalld/firewalld.conf 
FirewallBackend=iptables

After I did that, my kind deployments started working "as normal".
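
If you want to script that change, something along these lines works (it assumes the stock FirewallBackend=nftables line, keeps a backup, and restarts firewalld since the backend switch only takes effect on restart):

sudo cp /etc/firewalld/firewalld.conf /etc/firewalld/firewalld.conf.bak
sudo sed -i 's/^FirewallBackend=nftables/FirewallBackend=iptables/' /etc/firewalld/firewalld.conf
sudo systemctl restart firewalld
sudo systemctl restart docker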

Seems like somewhere between the upstream projects there's a bug to be fixed here, but this also seems worthy of at least a known-issues entry in our docs with workaround(s).

I think short of fully disabling firewalld, you can do:

firewall-cmd --permanent --zone=trusted --add-interface=docker0
firewall-cmd --get-zone-of-interface=<your eth interface>
firewall-cmd --zone=<zone from above> --add-masquerade --permanent
firewall-cmd --reload

Worked for CentOS 8 too

I'm not a fedora or firewalld user, but if someone wants to form an opinion about which fix to take, we should document it on this page https://kind.sigs.k8s.io/docs/user/known-issues/
https://github.com/kubernetes-sigs/kind/blob/master/site/content/docs/user/known-issues.md

workaround and known issue are now documented
https://github.com/kubernetes-sigs/kind/pull/1672
