Kubespray: How to use Kubespray to renew certificates after they expire

Created on 17 Dec 2019 · 22 comments · Source: kubernetes-sigs/kubespray


After the certificates expire, how do I renew them with Kubespray? I can't find a method in the Kubespray docs.

kind/support lifecycle/rotten


All 22 comments

Got into this situation today, so I had to run the cluster.yml playbook to trigger certificate generation and then manually reboot the masters for the new certs to take effect.

@daohoangson how did you run it, and did the certificates actually get replaced? Thanks.

I ran this:

ansible-playbook -i inventory/mycluster/hosts.yaml  cluster.yml

It failed at this step:

TASK [kubernetes-apps/cluster_roles : PriorityClass | Create k8s-cluster-critical] ********************************************************************************************************************
skipping: [xxx1]
skipping: [xxx2]
fatal: [xxx3]: FAILED! => {"changed": false, "msg": "error running kubectl (/usr/local/bin/kubectl apply --force --filename=/etc/kubernetes/k8s-cluster-critical-pc.yml) command (rc=1), out='', err='error: unable to recognize \"/etc/kubernetes/k8s-cluster-critical-pc.yml\": Get https://xxx:6443/api?timeout=32s: x509: certificate has expired or is not yet valid\n'"}

I had to ssh into each of the masters (my cluster has 3) and reboot it.
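If you prefer not to ssh into each master by hand, an ad-hoc Ansible command can do the same reboot. A sketch, assuming Ansible 2.7+ (for the reboot module) and the kube-master group and inventory path from your own setup:

# Reboot all masters at once instead of ssh-ing into each one
ansible -b -m reboot -i inventory/mycluster/hosts.yaml kube-master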

@daohoangson so you ran the playbook, rebooted the machines despite the k8s-cluster-critical failure, and the k8s cluster is OK? My version is 2.12, what is yours?

It worked for my cluster. I'm on 2.9 though.

"ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml" did not work for me. I also ran the "ansible-playbook upgrade-cluster.yml" and upgraded from 14.1 to 14.2, but my certs still did not upgrade.

Any new updates on how to renew certs with Kubespray?

the below commands worked for me

Force delete the old configuration and SSL certificates first. I have the feeling that cluster.yml does not regenerate the certificates if /var/lib/kubelet already exists, because of the bootstrap-os role.

ansible -b -m shell -a 'mv /etc/ssl/etcd  /etc/ssl/etcd.old' -i inventory/renew-ssl-test/inventory.ini etcd
ansible -b -m shell -a 'mv /etc/kubernetes /etc/kubernetes.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'mv /var/lib/kubelet /var/lib/kubelet.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'mv /etc/cni /etc/cni.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'mv /etc/calico /etc/calico.old' -i inventory/renew-ssl-test/inventory.ini all
ansible -b -m shell -a 'systemctl stop etcd.service' -i inventory/renew-ssl-test/inventory.ini etcd

Bring the cluster up again:
ansible-playbook -b --flush-cache -i inventory/renew-ssl-test/inventory.ini cluster.yml

Regenerate the default tokens

kubectl delete serviceaccount  --namespace=default default 
kubectl delete serviceaccount --namespace=kube-system default

I had an issue with Ingress that I resolved with the below commands

kubectl get serviceaccount nginx-ingress-serviceaccount -o yaml >nginx-ingress-serviceaccount.yaml
kubectl delete serviceaccount nginx-ingress-serviceaccount
for snip in creationTimestamp resourceVersion selfLink uid kubernetes.io metadata; do sed -i "/$snip/d" nginx-ingress-serviceaccount.yaml; done
kubectl apply -f nginx-ingress-serviceaccount.yaml

@darbiste which nodes did you have in your inventory file? only the master nodes or all of them?

@postedo all of them, below is my inventory file

[all]
k8s02-rhel7.home ansible_host=192.168.56.201 ip=192.168.56.201 access_ip=192.168.56.201
k8s03-rhel7.home ansible_host=192.168.56.202 ip=192.168.56.202 access_ip=192.168.56.202

[kube-master]
k8s02-rhel7.home

[etcd]
k8s02-rhel7.home

[kube-node]
k8s02-rhel7.home
k8s03-rhel7.home

[calico-rr]

[k8s-cluster:children]
kube-master
kube-node

@darbiste What version are you running?

1.16.3

So, any safe suggestions?

I ended up having to rebuild my environment and redeploy everything. Running 1.15 now, which has the added kubeadm features for creating new certs. Make sure to enable kubeadm in the Kubespray config files before running the playbook.

vi roles/kubernetes/master/defaults/main/main.yml
kubeadm_control_plane: true   # change to true
etcd_kubeadm_enabled: true    # change to true

vi roles/kubernetes/kubeadm/defaults/main.yml
etcd_kubeadm_enabled: true    # change to true

vi inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
kubeadm_control_plane: true   # change to true

I tried _darbiste_'s approach for kubectl/kubeadm 1.13 and it seems to work fine. One small change:

After the redeploy, I had to execute the following for kubectl to start functioning again:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

I know that when you create a cluster with kubeadm and later upgrade it with kubeadm, it refreshes your certs. Doesn't Kubespray's upgrade.yml do the same? Need to test this.
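A quick way to see whether a playbook run actually refreshed the certs is to check their expiry dates. A sketch, assuming Kubespray's /etc/kubernetes/ssl location and kubeadm 1.15+:

# Print the expiry date of the apiserver certificate
openssl x509 -enddate -noout -in /etc/kubernetes/ssl/apiserver.crt
# On kubeadm 1.15+ this reports all managed certificates at once
kubeadm alpha certs check-expiration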

@kerOssinas you are right, Kubespray's upgrade-cluster.yml will also rotate the certificates, but this also requires rotating all the service account secrets, as the signing key for service accounts is rotated as well. See the docs for more information.
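If you do take the upgrade-cluster.yml route and the signing key changes, the existing service account token secrets have to be recreated so they are re-signed with the new key. A rough sketch (not from the Kubespray docs; affected pods must be restarted afterwards to pick up the new tokens):

# Delete every service-account token secret; the controller manager re-issues them
kubectl get secrets --all-namespaces --field-selector type=kubernetes.io/service-account-token \
  --no-headers -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name' \
  | while read -r ns name; do kubectl delete secret -n "$ns" "$name"; done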

For a more lightweight approach, run the following commands on every master node. This will only replace the master certificates and preserve the service account signing key.

First backup the current certificates:

cp -R /etc/kubernetes/ssl /etc/kubernetes/ssl.backup
cp /etc/kubernetes/admin.conf /etc/kubernetes/admin.conf.backup
cp /etc/kubernetes/controller-manager.conf /etc/kubernetes/controller-manager.conf.backup
cp /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.backup
cp /etc/kubernetes/scheduler.conf /etc/kubernetes/scheduler.conf.backup

Use kubeadm to renew the certificates:

kubeadm alpha certs renew apiserver-kubelet-client
kubeadm alpha certs renew apiserver
kubeadm alpha certs renew front-proxy-client
kubeadm alpha kubeconfig user --client-name system:kube-controller-manager > /etc/kubernetes/controller-manager.conf
kubeadm alpha kubeconfig user --client-name system:kube-scheduler > /etc/kubernetes/scheduler.conf
kubeadm alpha kubeconfig user --client-name system:node:{nodename} --org system:nodes > /etc/kubernetes/kubelet.conf

Replace {nodename} with the name of the master node (see /etc/kubernetes/kubelet.conf); one way to look it up is shown below.
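If you are unsure of the exact name, it can usually be read from the client name embedded in the backed-up kubelet.conf (a sketch; depending on the kubeadm version the user entry is named system:node:<nodename>):

# Look for the system:node:<nodename> client name in the old kubelet.conf
grep -o 'system:node:[A-Za-z0-9.-]*' /etc/kubernetes/kubelet.conf.backup | head -n 1

Then regenerate the admin certificate: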

kubeadm alpha kubeconfig user --client-name kubernetes-admin --org system:masters > /etc/kubernetes/admin.conf
cp /etc/kubernetes/admin.conf ~/.kube/config

And finally restart the following processes:

  • kube-apiserver
  • kube-controller-manager
  • kube-scheduler

One way to do this is to find the containers with docker ps and kill them with docker rm -f <id>. Kubelet will automatically start a new instance. You can also just reboot the master node.
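A minimal sketch of that restart, assuming a Docker-based control plane running as static pods:

# Remove the control-plane containers; kubelet recreates them with the new certs
docker ps --format '{{.ID}} {{.Names}}' \
  | grep -E 'kube-apiserver|kube-controller-manager|kube-scheduler' \
  | awk '{print $1}' \
  | xargs -r docker rm -f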

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Thanks @bartjkdp for your approach, but I want to know how to deal with the worker nodes when renewing certificates.

@MnifR worker nodes automatically rotate their certificates when the --rotate-certificates flag is used on the kubelet process. This flag is enabled by default in Kubespray.
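One way to verify that is to look at the running kubelet's flags and the Kubespray-generated environment file (a sketch; the exact file path depends on your Kubespray version):

# Check whether the running kubelet was started with --rotate-certificates
ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep -- '--rotate'
# Kubespray typically renders kubelet flags into an env file, e.g.:
grep -i rotate /etc/kubernetes/kubelet.env 2>/dev/null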

@bartjkdp I have rebooted the worker nodes but without success; they are still NotReady. When I ssh into them there is no kubelet container up. I am using Kubespray 2.11.

Oct 17 06:25:29 worker1 kubelet[31245]: W1017 06:25:29.912570 31245 bootstrap.go:158] Error waiting for apiserver to come up: timed out waiting to connect to apiserver
Oct 17 06:25:29 worker1 kubelet[31245]: F1017 06:25:29.915175 31245 server.go:273] failed to run Kubelet: cannot create certificate signing request: Unauthorized
Oct 17 06:25:29 worker1 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Oct 17 06:25:29 worker1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Oct 17 06:25:40 worker1 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Oct 17 06:25:40 worker1 systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 576.
Oct 17 06:25:40 worker1 systemd[1]: Stopped Kubernetes Kubelet Server.
Oct 17 06:25:40 worker1 systemd[1]: Started Kubernetes Kubelet Server.

@MnifR you could try to manually generate a new token on one of the masters with:

kubeadm token create --print-join-command

And then re-join the worker by running the appropriate command on the worker.
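The printed command is then run on the worker; its shape is roughly the following (placeholders only, the real endpoint, token, and hash come from the command above):

# Run on the worker node to re-join it to the cluster
kubeadm join <master-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>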

See https://github.com/kubernetes/kubeadm/issues/809 for the details.

Otherwise you could try to start with a clean node and re-provision it with Kubespray.

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
