Environment:
On-prem HA cluster (3 masters with etcd co-located, 3 worker nodes).
printf "$(uname -srm)\n$(cat /etc/os-release)\n"):Linux 3.10.0-957.21.3.el7.x86_64 x86_64
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
Version of Ansible (ansible --version): ansible 2.7.11
Kubespray version (commit) (git rev-parse --short HEAD):
fb9103acd3efb873f6ad1a76a435a4074a77eb3f
Network plugin used:
Calico.
Copy of your inventory file:
all:
  hosts:
    k8s-master-01.saturn.dc10:
      ip: 10.100.10.10
      etcd_member_name: etcd1
    k8s-master-02.saturn.dc10:
      ip: 10.100.10.11
      etcd_member_name: etcd2
    k8s-master-03.saturn.dc10:
      ip: 10.100.10.12
      etcd_member_name: etcd3
    k8s-node-01.saturn.dc10:
      ip: 10.100.10.13
    k8s-node-02.saturn.dc10:
      ip: 10.100.10.14
    k8s-node-03.saturn.dc10:
      ip: 10.100.10.15
  children:
    kube-master:
      hosts:
        k8s-master-01.saturn.dc10:
        k8s-master-02.saturn.dc10:
        k8s-master-03.saturn.dc10:
    etcd:
      hosts:
        k8s-master-01.saturn.dc10:
        k8s-master-02.saturn.dc10:
        k8s-master-03.saturn.dc10:
    kube-node:
      hosts:
        k8s-node-01.saturn.dc10:
        k8s-node-02.saturn.dc10:
        k8s-node-03.saturn.dc10:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
Command used to invoke ansible:
ansible-playbook -i inventories/dc10/k8s-saturn.yml upgrade_cluster.yml -k --become-user=root -K --user=admin --become -v
Output of ansible run:
TASK [kubernetes/master : kubeadm | Upgrade first master] *************************************************************************************************************
Sunday 28 July 2019 00:27:16 -0700 (0:00:00.048) 0:32:38.149 ***********
fatal: [k8s-master-01.saturn.dc10]: FAILED! => changed=true
cmd:
- timeout
- -k
- 600s
- 600s
- /usr/local/bin/kubeadm
- upgrade
- apply
- -y
- v1.15.0
- --config=/etc/kubernetes/kubeadm-config.yaml
- --ignore-preflight-errors=all
- --allow-experimental-upgrades
- --allow-release-candidate-upgrades
- --etcd-upgrade=false
- --force
delta: '0:00:02.736378'
end: '2019-07-28 00:27:20.958183'
failed_when_result: true
msg: non-zero return code
rc: 1
start: '2019-07-28 00:27:18.221805'
stderr: '[upgrade/apply] FATAL: couldn''t upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [failed to renew certificates for component "kube-apiserver": failed to renew certificate apiserver-kubelet-client: unable to sign certificate: must specify at least one ExtKeyUsage, rename /etc/kubernetes/tmp/kubeadm-backup-manifests-2019-07-28-00-27-20/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml: no such file or directory]'
stderr_lines:
- '[upgrade/apply] FATAL: couldn''t upgrade control plane. kubeadm has tried to recover everything into the earlier state. Errors faced: [failed to renew certificates for component "kube-apiserver": failed to renew certificate apiserver-kubelet-client: unable to sign certificate: must specify at least one ExtKeyUsage, rename /etc/kubernetes/tmp/kubeadm-backup-manifests-2019-07-28-00-27-20/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml: no such file or directory]'
stdout: |-
[upgrade/config] Making sure the configuration is correct:
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/version] You have chosen to change the cluster version to "v1.15.0"
[upgrade/versions] Cluster version: v1.14.3
[upgrade/versions] kubeadm version: v1.15.0
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler]
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.15.0"...
Static pod: kube-apiserver-k8s-master-01.saturn.dc10 hash: 6703699804331f97507d42e9ab125bc3
Static pod: kube-controller-manager-k8s-master-01.saturn.dc10 hash: 4ee6f47611b030fd3861ec4752f266c6
Static pod: kube-scheduler-k8s-master-01.saturn.dc10 hash: fd29bfff9a9c75a09e7573f98e900cd5
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests975079952"
[controlplane] Adding extra host path mount "audit-policy" to "kube-apiserver"
[controlplane] Adding extra host path mount "audit-logs" to "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-tls" to "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-ca-trust" to "kube-apiserver"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
Is there a recommended way to force-regenerate the certs and /etc/kubernetes/manifests?
I think my certs are messed up because I added another host to supplementary_addresses_in_ssl_keys. Can anyone recommend how I can recover?
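For reference, one way to force kubeadm to re-issue just the apiserver serving cert is to move the old pair aside and run the certs phase again. This is only a sketch, assuming kubeadm v1.14+ on the first master and the /etc/kubernetes/kubeadm-config.yaml path shown in the output above; back up /etc/kubernetes/pki before touching anything:

# Sketch: move the current apiserver cert/key aside so kubeadm generates a fresh pair
mkdir -p /root/pki-backup
mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.key /root/pki-backup/
/usr/local/bin/kubeadm init phase certs apiserver --config=/etc/kubernetes/kubeadm-config.yaml
# Check that the new SAN list now includes the host added via supplementary_addresses_in_ssl_keys
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep -A1 'Subject Alternative Name'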
Attempting to upgrade to 1.14.4 with the upgrade_cluster.yml playbook (rough invocation sketched after the output below):
TASK [kubernetes/master : kubeadm | Upgrade first master] *************************************************************************************************************
Sunday 28 July 2019 03:19:54 -0700 (0:00:00.120) 0:32:50.072 ***********
fatal: [k8s-master-01.saturn.dc10]: FAILED! => changed=true
cmd:
- timeout
- -k
- 600s
- 600s
- /usr/local/bin/kubeadm
- upgrade
- apply
- -y
- v1.14.4
- --config=/etc/kubernetes/kubeadm-config.yaml
- --ignore-preflight-errors=all
- --allow-experimental-upgrades
- --allow-release-candidate-upgrades
- --etcd-upgrade=false
- --force
delta: '0:00:29.482534'
end: '2019-07-28 03:20:25.101522'
failed_when_result: true
msg: non-zero return code
rc: 1
start: '2019-07-28 03:19:55.618988'
stderr: '[upgrade/postupgrade] FATAL post-upgrade error: failed to write or validate certificate "apiserver": certificate apiserver is invalid: x509: certificate is valid for k8s-master-01.saturn.dc10, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.saturn-dc10.local, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.saturn-dc10.local, localhost, k8s-master-01.saturn.dc10, k8s-master-02.saturn.dc10, k8s-master-03.saturn.dc10, lb-apiserver.kubernetes.local, lb-k8s-api.saturn.dc10, not k8s-api.saturn.dc10'
stderr_lines:
- '[upgrade/postupgrade] FATAL post-upgrade error: failed to write or validate certificate "apiserver": certificate apiserver is invalid: x509: certificate is valid for k8s-master-01.saturn.dc10, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.saturn-dc10.local, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.saturn-dc10.local, localhost, k8s-master-01.saturn.dc10, k8s-master-02.saturn.dc10, k8s-master-03.saturn.dc10, lb-apiserver.kubernetes.local, lb-k8s-api.saturn.dc10, not k8s-api.saturn.dc10'
stdout: |-
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/version] You have chosen to change the cluster version to "v1.14.4"
[upgrade/versions] Cluster version: v1.14.3
[upgrade/versions] kubeadm version: v1.14.4
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler]
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 3 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.14.4"...
Static pod: kube-apiserver-k8s-master-01.saturn.dc10 hash: 4cd2fe517abb9ebe0c6bb5d21fb88458
Static pod: kube-controller-manager-k8s-master-01.saturn.dc10 hash: c7d3dba1c6c8766d392cb83a916428bb
Static pod: kube-scheduler-k8s-master-01.saturn.dc10 hash: 33c58344b7aaf81a315081d43ec4ee92
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests251045572"
[controlplane] Adding extra host path mount "audit-policy" to "kube-apiserver"
[controlplane] Adding extra host path mount "audit-logs" to "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-tls" to "kube-apiserver"
[controlplane] Adding extra host path mount "etc-pki-ca-trust" to "kube-apiserver"
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-07-28-03-20-08/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-k8s-master-01.saturn.dc10 hash: 929e4c7a9d0e2040cd35f43de6ec7b8b
[apiclient] Found 3 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-07-28-03-20-08/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-k8s-master-01.saturn.dc10 hash: c7d3dba1c6c8766d392cb83a916428bb
Static pod: kube-controller-manager-k8s-master-01.saturn.dc10 hash: c7d3dba1c6c8766d392cb83a916428bb
Static pod: kube-controller-manager-k8s-master-01.saturn.dc10 hash: c7d3dba1c6c8766d392cb83a916428bb
Static pod: kube-controller-manager-k8s-master-01.saturn.dc10 hash: c7d3dba1c6c8766d392cb83a916428bb
Static pod: kube-controller-manager-k8s-master-01.saturn.dc10 hash: c150df8103bcdff08168229044068e0c
[apiclient] Found 3 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-07-28-03-20-08/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-k8s-master-01.saturn.dc10 hash: 33c58344b7aaf81a315081d43ec4ee92
Static pod: kube-scheduler-k8s-master-01.saturn.dc10 hash: a9d474c7aa430f26b5b646fb4cda6b61
[apiclient] Found 3 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[postupgrade]WARNING: failed to backup kube-apiserver cert and key: failed to created backup directory /etc/kubernetes/pki/expired: mkdir /etc/kubernetes/pki/expired: file exists[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
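For context, the 1.14.4 attempt above was the same upgrade_cluster.yml run with the target version overridden on the command line, roughly like this (a sketch only; the -e kube_version override is an assumption, the rest mirrors the invocation listed earlier):

ansible-playbook -i inventories/dc10/k8s-saturn.yml upgrade_cluster.yml \
  -e kube_version=v1.14.4 \
  -k -K --user=admin --become --become-user=root -v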
This fixed it for me: https://github.com/kubernetes/kubeadm/issues/1447#issuecomment-490494999
After adding the cert, I reran upgrade_cluster.yml.
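Since the first failure above was about renewing apiserver-kubelet-client ("must specify at least one ExtKeyUsage"), it can also be worth confirming the regenerated client cert actually carries an Extended Key Usage before rerunning the playbook. A quick check, assuming the default kubeadm pki path:

# The client cert should list 'TLS Web Client Authentication' under Extended Key Usage
openssl x509 -in /etc/kubernetes/pki/apiserver-kubelet-client.crt -noout -text | grep -A1 'Extended Key Usage'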
Upgrade from 2.10.4 to 2.11.0
Same issue here. It was resolved by regenerating more than one cert:
apiserver.*
apiserver-kubelet-client.*
front-proxy-client.*
etc., one by one after each error.
See the last line of the error to understand what needs to be regenerated.
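In practice that means the same move-aside-and-regenerate step per cert, rerunning the upgrade after each pass. A minimal sketch, assuming the default pki directory and the kubeadm config path from the logs above (the three names are just the ones listed in this comment):

# Regenerate the listed certs one at a time; keep the old pairs as a backup
cd /etc/kubernetes/pki
mkdir -p /root/pki-backup
for cert in apiserver apiserver-kubelet-client front-proxy-client; do
  mv "${cert}.crt" "${cert}.key" /root/pki-backup/
  /usr/local/bin/kubeadm init phase certs "${cert}" --config=/etc/kubernetes/kubeadm-config.yaml
done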