Kubespray: How to use Kubespray to renew certificates after they expire

Created on 17 Dec 2019 · 22 comments · Source: kubernetes-sigs/kubespray


After the certificates expire, how do I renew them with Kubespray? I can't find a method in the Kubespray docs.

kind/support lifecycle/rotten


All 22 comments

Got into this situation today, so I had to run the cluster.yml playbook to trigger certificate generation and then manually reboot the masters for the new certs to take effect.

@daohoangson how did you run it, and did the certificates actually get replaced? Thanks.

I ran this:

ansible-playbook -i inventory/mycluster/hosts.yaml  cluster.yml

It failed at this step:

TASK [kubernetes-apps/cluster_roles : PriorityClass | Create k8s-cluster-critical] ********************************************************************************************************************
skipping: [xxx1]
skipping: [xxx2]
fatal: [xxx3]: FAILED! => {"changed": false, "msg": "error running kubectl (/usr/local/bin/kubectl apply --force --filename=/etc/kubernetes/k8s-cluster-critical-pc.yml) command (rc=1), out='', err='error: unable to recognize \"/etc/kubernetes/k8s-cluster-critical-pc.yml\": Get https://xxx:6443/api?timeout=32s: x509: certificate has expired or is not yet valid\n'"}

I had to ssh into each of the masters (my cluster has 3) and reboot it.
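If you prefer not to ssh into each master by hand, an ad-hoc Ansible command can do the same reboot. A sketch, assuming Ansible 2.7+ (for the reboot module) and the kube-master group and inventory path from your own setup:

# Reboot all masters at once instead of ssh-ing into each one
ansible -b -m reboot -i inventory/mycluster/hosts.yaml kube-master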

@daohoangson so you ran the playbook, rebooted the machines despite the k8s-cluster-critical failure, and the k8s cluster is OK? My version is 2.12, what is yours?

It worked for my cluster. I'm on 2.9 though.

"ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml" did not work for me. I also ran the "ansible-playbook upgrade-cluster.yml" and upgraded from 14.1 to 14.2, but my certs still did not upgrade.

Any new updates on how to renew certs with Kubespray?

the below commands worked for me

Force delete the old configuration and SSL certificates first. I have the feeling that cluster.yml does not regenerate the certificates if /var/lib/kubelet already exists, because of the bootstrap-os role.

ansible -b -m shell -a 'mv /etc/ssl/etcd  /etc/ssl/etcd.old' -i inventory/renew-ssl-test/inventory.ini etcd
ansible -b -m shell -a 'mv /etc/kubernetes /etc/kubernetes.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'mv /var/lib/kubelet /var/lib/kubelet.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'mv /etc/cni /etc/cni.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'mv /etc/calico /etc/calico.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'systemctl stop etcd.service' -i inventory/renew-ssl-test/inventory.ini etcd

Bring the cluster up again:
ansible-playbook -b --flush-cache -i inventory/renew-ssl-test/inventory.ini cluster.yml

Regenerate the default tokens

kubectl delete serviceaccount  --namespace=default default 
kubectl delete serviceaccount --namespace=kube-system default

I had an issue with Ingress that I resolved with the below commands

kubectl get serviceaccount nginx-ingress-serviceaccount -o yaml >nginx-ingress-serviceaccount.yaml
kubectl delete serviceaccount nginx-ingress-serviceaccount
for snip in creationTimestamp resourceVersion selfLink uid kubernetes.io metadata; do sed -i "/$snip/d" nginx-ingress-serviceaccount.yaml; done
kubectl apply -f nginx-ingress-serviceaccount.yaml

@darbiste which nodes did you have in your inventory file? only the master nodes or all of them?

@postedo all of them, below is my inventory file

[all]
k8s02-rhel7.home ansible_host=192.168.56.201 ip=192.168.56.201 access_ip=192.168.56.201
k8s03-rhel7.home ansible_host=192.168.56.202 ip=192.168.56.202 access_ip=192.168.56.202

[kube-master]
k8s02-rhel7.home

[etcd]
k8s02-rhel7.home

[kube-node]
k8s02-rhel7.home
k8s03-rhel7.home

[calico-rr]

[k8s-cluster:children]
kube-master
kube-node

@darbiste What version are you running?

1.16.3

So, any safe suggestions?

I ended up having to rebuild my environment and redeploy everything. Running 1.15 now, which has the added kubeadm features for creating new certs. Make sure to enable kubeadm in the Kubespray config files before running the playbook.

vi roles/kubernetes/master/defaults/main/main.yml
kubeadm_control_plane: true   # change to true
etcd_kubeadm_enabled: true    # change to true

vi roles/kubernetes/kubeadm/defaults/main.yml
etcd_kubeadm_enabled: true    # change to true

vi inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
kubeadm_control_plane: true   # change to true

I tried _darbiste_'s approach for kubectl/kubeadm 1.13 and it seems to work fine. One small change:

After the redeploy, I had to execute the following for kubectl to start functioning again:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

I know that when you create a cluster with kubeadm and later upgrade it with kubeadm, it refreshes your certs. Doesn't Kubespray's upgrade.yml do the same? Need to test this.
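A quick way to see whether a playbook run actually refreshed the certs is to check their expiry dates. A sketch, assuming Kubespray's /etc/kubernetes/ssl location and kubeadm 1.15+:

# Print the expiry date of the apiserver certificate
openssl x509 -enddate -noout -in /etc/kubernetes/ssl/apiserver.crt
# On kubeadm 1.15+ this reports all managed certificates at once
kubeadm alpha certs check-expiration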

@kerOssinas you are right, Kubespray's upgrade-cluster.yml will also rotate the certificates, but this also requires rotating all the service account secrets, as the signing key for service accounts is rotated as well. See the docs for more information.
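If you do take the upgrade-cluster.yml route and the signing key changes, the existing service account token secrets have to be recreated so they are re-signed with the new key. A rough sketch (not from the Kubespray docs; affected pods must be restarted afterwards to pick up the new tokens):

# Delete every service-account token secret; the controller manager re-issues them
kubectl get secrets --all-namespaces --field-selector type=kubernetes.io/service-account-token \
  --no-headers -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name' \
  | while read -r ns name; do kubectl delete secret -n "$ns" "$name"; done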

For a more lightweight approach, run the following commands on every master node. This will only replace the master certificates and preserve the service account signing key.

First backup the current certificates:

cp -R /etc/kubernetes/ssl /etc/kubernetes/ssl.backup
cp /etc/kubernetes/admin.conf /etc/kubernetes/admin.conf.backup
cp /etc/kubernetes/controller-manager.conf /etc/kubernetes/controller-manager.conf.backup
cp /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.backup
cp /etc/kubernetes/scheduler.conf /etc/kubernetes/scheduler.conf.backup

Use kubeadm to renew the certificates:

kubeadm alpha certs renew apiserver-kubelet-client
kubeadm alpha certs renew apiserver
kubeadm alpha certs renew front-proxy-client
kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > /etc/kubernetes/controller-manager.conf
kubeadm alpha kubeconfig user --client-name system:kube-scheduler > /etc/kubernetes/scheduler.conf
kubeadm alpha kubeconfig user --client-name system:node:{nodename} --org system:nodes > /etc/kubernetes/kubelet.conf

Replace {nodename} with the name of the master node (see /etc/kubernetes/kubelet.conf); one way to look it up is shown below.
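If you are unsure of the exact name, it can usually be read from the client name embedded in the backed-up kubelet.conf (a sketch; depending on the kubeadm version the user entry is named system:node:<nodename>):

# Look for the system:node:<nodename> client name in the old kubelet.conf
grep -o 'system:node:[A-Za-z0-9.-]*' /etc/kubernetes/kubelet.conf.backup | head -n 1

Then regenerate the admin certificate: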

kubeadm alpha kubeconfig user --client-name kubernetes-admin --org system:masters > /etc/kubernetes/admin.conf
cp /etc/kubernetes/admin.conf ~/.kube/config

And finally restart the following processes:

  • kube-apiserver
  • kube-controller-manager
  • kube-scheduler

One way to do this is to find the containers with docker ps and kill them with docker rm -f <id>. Kubelet will automatically start a new instance. You can also just reboot the master node.
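A minimal sketch of that restart, assuming a Docker-based control plane running as static pods:

# Remove the control-plane containers; kubelet recreates them with the new certs
docker ps --format '{{.ID}} {{.Names}}' \
  | grep -E 'kube-apiserver|kube-controller-manager|kube-scheduler' \
  | awk '{print $1}' \
  | xargs -r docker rm -f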

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Thanks @bartjkdp for your approach, but I want to know how to deal with the worker nodes when renewing certificates.

@MnifR worker nodes automatically rotate their certificates when the --rotate-certificates flag is used on the kubelet process. This flag is enabled by default in Kubespray.
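One way to verify that is to look at the running kubelet's flags and the Kubespray-generated environment file (a sketch; the exact file path depends on your Kubespray version):

# Check whether the running kubelet was started with --rotate-certificates
ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep -- '--rotate'
# Kubespray typically renders kubelet flags into an env file, e.g.:
grep -i rotate /etc/kubernetes/kubelet.env 2>/dev/null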

@bartjkdp I have rebooted the worker nodes but without success; they are still NotReady. When I ssh into them there is no kubelet container up. I am using Kubespray 2.11.

Oct 17 06:25:29 worker1 kubelet[31245]: W1017 06:25:29.912570 31245 bootstrap.go:158] Error waiting for apiserver to come up: timed out waiting to connect to apiserver
Oct 17 06:25:29 worker1 kubelet[31245]: F1017 06:25:29.915175 31245 server.go:273] failed to run Kubelet: cannot create certificate signing request: Unauthorized
Oct 17 06:25:29 worker1 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Oct 17 06:25:29 worker1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Oct 17 06:25:40 worker1 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Oct 17 06:25:40 worker1 systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 576.
Oct 17 06:25:40 worker1 systemd[1]: Stopped Kubernetes Kubelet Server.
Oct 17 06:25:40 worker1 systemd[1]: Started Kubernetes Kubelet Server.

@MnifR you could try to manually generate a new token on one of the masters with:

kubeadm token create --print-join-command

And then re-join the worker by running the appropriate command on the worker.
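The printed command is then run on the worker; its shape is roughly the following (placeholders only, the real endpoint, token, and hash come from the command above):

# Run on the worker node to re-join it to the cluster
kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>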

See https://github.com/kubernetes/kubeadm/issues/809 for the details.

Otherwise you could try to start with a clean node and re-provision it with Kubespray.

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
