Version:
1.17
Describe the bug
I am trying to deploy a k3s cluster on two Raspberry Pi computers. Specifically, I would like to use the Raspberry Pi 4 as the master/server of the cluster and a Raspberry Pi 3 as a worker node/agent.
However, when I create a deployment, the pod is always scheduled on the Raspberry Pi 4 (master) and not on the worker node.
To Reproduce
On both computers:
Add cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory to /boot/cmdline.txt
On master Raspberry Pi 4:
curl -sfL https://get.k3s.io | sh -
sudo kubectl run nginx-sample --image nginx --port 80
On worker node Raspberry Pi 3:
curl -sfL https://get.k3s.io | K3S_URL=https://myserver:6443 K3S_TOKEN=XXX sh -
Expected behavior
The command sudo kubectl get pods -o wide --all-namespaces should show that the pod is deployed on the worker node.
Actual behavior
The command sudo kubectl get pods -o wide --all-namespaces shows that the pod is deployed on the master node.
Additional context
The command sudo kubectl get nodes shows both nodes.
When using the Raspberry Pi 3 as the master and the Raspberry Pi 4 as the worker node, the pod is deployed on the worker node.
Did you taint the master node? I believe by default k3s will not taint it.
kubectl label --overwrite node <MASTER> node-role.kubernetes.io/master=true:NoSchedule
@carpenike I did not specify it explicitly.
However, when I use the Raspberry Pi 3 as the master node and the Raspberry Pi 4 as the worker node, the deployment lands on the worker node, as expected.
@carpenike If you taint the master node like that, what happens if you delete the coredns pod? It wouldn't be able to reschedule on the master node again, would it?
@Kerwood -- correct. You'd have to use a toleration within a deployment to allow deployments on the master node. Also, I mis-wrote that. It's taint, not label.
kubectl taint --overwrite node <MASTER> node-role.kubernetes.io/master=true:NoSchedule
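As a quick sanity check, the applied taint can be inspected, and removed again if needed. A hedged sketch — `<MASTER>` is a placeholder for your actual node name, and the taint key must match whatever you applied:

```shell
# List every node together with its taints; the master should show
# the node-role.kubernetes.io/master=true:NoSchedule entry.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'

# Remove the taint again if you change your mind
# (the trailing "-" after the taint deletes it).
kubectl taint node <MASTER> node-role.kubernetes.io/master=true:NoSchedule-
```

These commands require a live cluster and a working kubeconfig, so run them on the master (or anywhere kubectl is configured).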
My suspicion is that it's choosing the node with the most available resources when doing the node selection, and choosing the Pi4.
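One way to check that suspicion is to compare what each node advertises as allocatable; the Pi 4 should report noticeably more CPU and memory than the Pi 3. The custom-columns paths below are standard node status fields, but your node names will differ:

```shell
# Show allocatable CPU and memory per node. The scheduler scores nodes
# partly on available resources, so the larger Pi 4 tends to win.
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
```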
I have been trying different things to achieve the same result over the last couple of days.
Here's my solution (which, in my opinion, should be the default):
kubectl taint node dev-k3s-master k3s-controlplane=true:NoSchedule
kubectl edit deployments local-path-provisioner -n kube-system
And add the following to the pod spec (spec.template.spec):
spec:
  ...
  template:
    ...
    spec:
      ...
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
Do the same for metrics-server and coredns. The latter will already have a tolerations: key present, so just add the two effects to the list.
It would be nice to have the install script recognize this setup (e.g. --no-schedule-master or some kind of flag) that would grant these the appropriate tolerations automatically.
For the sake of completeness: the merge request above (https://github.com/rancher/k3s/pull/1275/files) takes care of this; however, at the time of writing it has not yet been released. On older versions you can "future proof" the behaviour of the merge request by creating a patch.yaml file with the following contents
spec:
  template:
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
and running
kubectl patch deployment metrics-server -n kube-system --patch "$(cat patch.yaml)"
kubectl patch deployment coredns -n kube-system --patch "$(cat patch.yaml)"
kubectl patch deployment local-path-provisioner -n kube-system --patch "$(cat patch.yaml)"
from the same folder.
The master node must have the following taint: node-role.kubernetes.io/master=true:NoSchedule, which you can apply either at install time or with the override command mentioned above.
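For the install-time variant, k3s supports a --node-taint flag on the server, which the install script can pass through via INSTALL_K3S_EXEC. A sketch, assuming a k3s version that ships the flag — check your version's docs before relying on it:

```shell
# Install the k3s server with the master tainted from the start,
# so regular workloads are never scheduled on it.
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--node-taint node-role.kubernetes.io/master=true:NoSchedule" sh -
```

A taint applied this way is part of the node registration, so it should survive reboots, unlike a patch applied after the fact.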
@alekc It is my understanding that if the k3s master is rebooted, those patches will get overwritten, or am I wrong?
To add to @alekc's answer, I also had to patch the service load balancer:
$ kubectl patch daemonset svclb-traefik -n kube-system --patch "$(cat patch.yaml)"
Right, I've been deploying with the --no-deploy=traefik flag, so I didn't have that one.
I attached a label to the node and used the nodeSelector entry to specify on which node to deploy.
See: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
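For reference, a minimal sketch of that approach, assuming a hypothetical label location=pi3-worker applied beforehand with kubectl label node <WORKER> location=pi3-worker (the label key/value and deployment name are illustrative, not from the thread):

```yaml
# Hypothetical deployment pinned to the labelled worker node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-sample
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-sample
  template:
    metadata:
      labels:
        app: nginx-sample
    spec:
      nodeSelector:
        location: pi3-worker   # only schedules on nodes carrying this label
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
```

Unlike tainting the master, this pins a single workload to a specific node; system pods remain free to schedule anywhere.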