Environmental Info:
k3s version v1.18.6+k3s1 (6f56fa1d)
Node(s) CPU architecture, OS, and Version:
AWS t3a.medium (2 vCPU, 4 GB RAM)
Linux ip-172-20-3-222 5.4.0-1018-aws #18-Ubuntu SMP Wed Jun 24 01:15:00 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
➜ kubectl get node
NAME                                         STATUS   ROLES    AGE   VERSION
ip-172-20-3-222.us-east-2.compute.internal   Ready    single   13d   v1.18.6+k3s1
Describe the bug:
Occasionally the k3s server process hits an internal timeout (will file a separate issue for that) and systemd restarts the k3s service. When this happens, k3s re-applies the node-role.kubernetes.io/master=true label to the node. This "breaks" the AWS CCM, since that service will not add master-labeled nodes to the CCM-managed AWS load balancer pools.
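A quick way to spot the re-applied label (plain kubectl usage, not part of the original report) is to print it as a column:
➜ kubectl get nodes -L node-role.kubernetes.io/master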
Steps To Reproduce:
Add the single role label and remove the master role label:
➜ kubectl label node ip-172-20-3-222.us-east-2.compute.internal node-role.kubernetes.io/single=true
➜ kubectl label node ip-172-20-3-222.us-east-2.compute.internal node-role.kubernetes.io/master-
Validate that only the single role is applied:
➜ kubectl get node
NAME                                         STATUS   ROLES    AGE   VERSION
ip-172-20-3-222.us-east-2.compute.internal   Ready    single   13d   v1.18.6+k3s1
Restart the k3s service:
➜ systemctl restart k3s.service
Expected behavior:
Only the single role label should be applied:
kubectl get node
NAME                                         STATUS   ROLES    AGE   VERSION
ip-172-20-3-222.us-east-2.compute.internal   Ready    single   13d   v1.18.6+k3s1
Actual behavior:
The master label has been re-applied:
kubectl get node
NAME                                         STATUS   ROLES           AGE   VERSION
ip-172-20-3-222.us-east-2.compute.internal   Ready    master,single   13d   v1.18.6+k3s1
@jgreat - semi-unrelated - I'm curious how you are sidestepping this issue with an external CCM: https://github.com/rancher/k3s/issues/1807
(we are introducing a fix and I wonder how it meshes with whatever you are doing to work around it)
I'm adding the "raw" CCM manifest via cloud-init to /var/lib/rancher/k3s/server/manifests and have an install script that wraps the k3s install to provide the provider-id info and re-label the node after it comes up.
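Roughly, the manifest just needs to land in the directory k3s auto-applies manifests from; a minimal shell equivalent of that cloud-init step (sketch only, my real setup does this via cloud-init write_files) is:
# Sketch: k3s applies any manifest dropped into this directory on startup.
install -d -m 0755 /var/lib/rancher/k3s/server/manifests
install -m 0644 00-aws-ccm.yaml /var/lib/rancher/k3s/server/manifests/00-aws-ccm.yaml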
install.sh
#!/bin/bash
# provider-id format expected by the AWS CCM: aws:///<availability-zone>/<instance-id>
provider_id="$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)/$(curl -s http://169.254.169.254/latest/meta-data/instance-id)"

curl -sfL https://get.k3s.io -o /k3s_install.sh
chmod +x /k3s_install.sh

# Disable the built-in cloud controller plus the components the AWS CCM replaces,
# and point the kubelet at the external cloud provider.
/k3s_install.sh \
  --disable-cloud-controller \
  --disable servicelb \
  --disable local-storage \
  --disable traefik \
  --kubelet-arg="cloud-provider=external" \
  --kubelet-arg="provider-id=aws:///${provider_id}"

rm /k3s_install.sh

# Wait for the cluster to come up.
return=1
while [ "${return}" -ne 0 ]; do
  sleep 2
  kubectl get nodes "$(hostname -f)" >/dev/null 2>&1
  return=$?
done

# Re-label if this is a single-node cluster. The AWS CCM doesn't run on "master" nodes.
# NODE_ROLE is expected to be set by the surrounding provisioning environment.
if [ "${NODE_ROLE}" == "single" ]; then
  is_master=$(kubectl get node -o json | jq -r ".items[] | select(.metadata.name == \"$(hostname -f)\") | .metadata.labels.\"node-role.kubernetes.io/master\"")
  if [ "${is_master}" == "true" ]; then
    kubectl label node "$(hostname -f)" node-role.kubernetes.io/master- node-role.kubernetes.io/single="true"
  fi
fi
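Note that install.sh never sets NODE_ROLE itself; it is assumed to come from the provisioning environment (cloud-init in my case). A hypothetical manual run would look like:
# NODE_ROLE must be exported (or passed inline) before running the script.
NODE_ROLE=single bash ./install.sh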
00-aws-ccm.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloud-controller-manager
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: system:cloud-controller-manager
  labels:
    kubernetes.io/cluster-service: "true"
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - ""
  resources:
  - services
  verbs:
  - list
  - watch
  - patch
- apiGroups:
  - ""
  resources:
  - services/status
  verbs:
  - update
  - patch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
  - update
# For leader election
- apiGroups:
  - ""
  resources:
  - endpoints
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - endpoints
  resourceNames:
  - "cloud-controller-manager"
  verbs:
  - get
  - list
  - watch
  - update
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - configmaps
  resourceNames:
  - "cloud-controller-manager"
  verbs:
  - get
  - update
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  verbs:
  - create
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
  - list
- apiGroups:
  - "coordination.k8s.io"
  resources:
  - leases
  verbs:
  - get
  - create
  - update
  - list
# For the PVL
- apiGroups:
  - ""
  resources:
  - persistentvolumes
  verbs:
  - list
  - watch
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: aws-cloud-controller-manager
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:cloud-controller-manager
subjects:
- kind: ServiceAccount
  name: cloud-controller-manager
  namespace: kube-system
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: aws-cloud-controller-manager-ext
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: cloud-controller-manager
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-cloud-controller-manager
  namespace: kube-system
  labels:
    k8s-app: aws-cloud-controller-manager
spec:
  selector:
    matchLabels:
      component: aws-cloud-controller-manager
      tier: control-plane
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        component: aws-cloud-controller-manager
        tier: control-plane
    spec:
      serviceAccountName: cloud-controller-manager
      hostNetwork: true
      # If this is a single node we do not want this selector
      # and we need to remove the node-role.kubernetes.io/master label
      # Maybe set node-role.kubernetes.io/combined: "true"
      # nodeSelector:
      #   node-role.kubernetes.io/master: "true"
      tolerations:
      - key: node.cloudprovider.kubernetes.io/uninitialized
        value: "true"
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: aws-cloud-controller-manager
        image: jgreat/aws-cloud-controller-manager:20200331-095641
Here's a temporary workaround just in case someone else runs into this.
As part of my install, I'm adding a couple of ExecStartPost commands to the k3s systemd service. This should remove the "master" label whenever the service is restarted.
# Remove the master label if this is a single-node cluster. The AWS CCM doesn't run on "master" nodes.
if [ "${NODE_ROLE}" == "single" ]; then
  /usr/local/bin/k3s kubectl label node --all --overwrite node-role.kubernetes.io/master-
  # Add an extra line because of the trailing backslash in the k3s-install-generated systemd unit config.
  echo '' >> /etc/systemd/system/k3s.service
  echo 'ExecStartPost=/usr/bin/sleep 10' >> /etc/systemd/system/k3s.service
  echo 'ExecStartPost=/usr/local/bin/k3s kubectl label node --all --overwrite node-role.kubernetes.io/master-' >> /etc/systemd/system/k3s.service
  systemctl daemon-reload
fi
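To double-check the workaround took effect (plain systemctl/kubectl commands, nothing beyond what the snippet above appends):
# Show the appended ExecStartPost lines, then verify the label stays gone after a restart.
systemctl cat k3s.service | grep ExecStartPost
systemctl restart k3s.service
kubectl get node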