BUG REPORT
kubeadm version (use kubeadm version): kubeadm version: &version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:43:08Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Environment:
kubectl version: Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
uname -a: Linux k8s-m01 4.14.67-coreos #1 SMP Mon Sep 10 23:14:26 UTC 2018 x86_64 Intel Core i7 (Nehalem Class Core i7) GenuineIntel GNU/Linux
What happened:
I am trying to upgrade Kubernetes from v1.11.2 to v1.12.1 using the command kubeadm upgrade apply v1.12.1. However, this times out:
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.12.1"...
[upgrade/apply] FATAL: timed out waiting for the condition
Looking at the verbose logs, this appears to be because kubeadm is using the hostname to find the apiserver pod, rather than the Kubernetes node name:
I1015 10:59:20.235573 16962 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.12.1 (linux/amd64) kubernetes/4ed3216" 'https://192.168.1.1:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-HOSTNAME'
I1015 10:59:20.267945 16962 round_trippers.go:405] GET https://192.168.1.1:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-HOSTNAME 404 Not Found in 32 milliseconds
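A quick way to see the mismatch on an affected control-plane node is to compare the hostname with the names the cluster actually uses (a sketch; the component=kube-apiserver label is what kubeadm puts on its static pod manifests, so adjust it if your setup differs):
# the name kubeadm derives the pod suffix from (the hostname)...
hostname
# ...versus the pod and node names actually registered in the cluster
kubectl -n kube-system get pods -l component=kube-apiserver -o name
kubectl get nodes -o name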
What you expected to happen:
The cluster is upgraded successfully.
How to reproduce it:
Upgrade a Kubernetes cluster where the master node has a hostname different from the node name specified when initialising the cluster:
kubeadm init --node-name=NODENAME
kubeadm upgrade apply v1.12.1
I hit this today on multiple clusters.
After digging a bit I noted that you need to pass the node name (the suffix of the kube-apiserver-NODENAME pod) in the InitConfiguration of your kubeadm config. Reusing the old configuration will not work, and kubeadm config view seems to be broken in this version as per #1174.
Here's the configuration I'm using to have this working:
apiEndpoint:
  advertiseAddress: 172.32.10.90
  bindPort: 6443
apiVersion: kubeadm.k8s.io/v1alpha3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: ""
  usages:
  - signing
  - authentication
kind: InitConfiguration
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: ip-172-32-10-90.ec2.internal
  kubeletExtraArgs:
    cloud-provider: "aws"
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
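If you save a configuration like the one above to a file (the file name here is just an example), you can point the upgrade at it explicitly instead of letting kubeadm reuse the stored configuration:
kubeadm upgrade apply v1.12.1 --config kubeadm-config.yaml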
The code referring to this is here: https://github.com/kubernetes/kubernetes/blob/4ed3216f3ec431b140b1d899130a69fc671678f4/cmd/kubeadm/app/phases/upgrade/staticpods.go#L394
I can work on this. @neolit123, wdyt about getting the node name from /var/lib/kubelet/kubeadm-flags.env?
hi @yagonobre
https://github.com/kubernetes/kubernetes/issues/61664
the --hostname-override flag for the kubelet is going away eventually, so we don't want to look it up in /var/lib/kubelet/kubeadm-flags.env and we don't want to bind more logic to that file.
1) one way to do this is to query the kubelet.conf file in "/etc/kubernetes", which is a Config type, and from there look at the users and contexts.
2) another way, for a running kube-system pod (e.g. the api-server), is to get its pod spec and look up nodeName (a rough shell sketch of both options follows after this comment).
if this doesn't get more comments within 2-3 working days just give 1) a go with a PR.
/cc @kubernetes/sig-cluster-lifecycle
opinions on 1 vs 2, or perhaps use another method?
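For reference, here is roughly what the two options amount to, sketched as shell one-liners rather than the actual Go implementation; the system:node:<nodename> credential naming and the component=kube-apiserver label are kubeadm defaults, so treat them as assumptions:
# 1) derive the node name from /etc/kubernetes/kubelet.conf:
#    kubeadm names the kubelet credential "system:node:<nodename>", so strip the prefix
kubectl --kubeconfig /etc/kubernetes/kubelet.conf config view \
  -o jsonpath='{.contexts[0].context.user}' | sed 's/^system:node://'
# 2) read .spec.nodeName from a running control-plane pod
kubectl -n kube-system get pods -l component=kube-apiserver \
  -o jsonpath='{.items[0].spec.nodeName}'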
@rdodev - can you reproduce this?
What is the recommended workaround for people affected by this bug?
This should probably work: https://github.com/kubernetes/kubeadm/issues/1170#issuecomment-430803138
I'm noticing this warning in v1.13.0-alpha.3:
[WARNING Hostname]: hostname "master" could not be reached
[WARNING Hostname]: hostname "master" lookup master on 172.31.0.2:53: no such host
this makes me wonder if this is a thing we want to even support.
Some other error messages I've noticed: for me the error didn't happen at the curl/GET step, but when trying to create the prepull pods; the upgrade-prepull-* pods all had errors like:
Warning FailedCreatePodSandBox 5s kubelet, master Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "13866a6f947c7804057ab448b2ef6e7bf453adab44813e3ce4d6c4b6ec32eb11" network for pod "upgrade-prepull-kube-controller-manager-4ncxx": NetworkPlugin cni failed to set up pod "upgrade-prepull-kube-controller-manager-4ncxx_kube-system" network: no podCidr for node master
Normal SandboxChanged 4s (x9 over 28s) kubelet, master Pod sandbox changed, it will be killed and re-created.
The logs are also full of "noPodCidr for node master" (master was the name I chose; the hostname was a typical AWS ip-172-something-something-something).
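The "no podCidr" part can be checked directly against the Node object (a sketch; "master" is the node name used in the comment above):
# if this prints nothing, the node has no pod CIDR assigned,
# which is what the CNI plugin is complaining about
kubectl get node master -o jsonpath='{.spec.podCIDR}'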
Unfortunately we're coming up to the deadline for v1.13, so we're going to have to punt this out. I suspect there's deeper issues here and I think the advertiseAddress workaround should work for now.
The logs are also full of "noPodCidr for node master" (master was the name I chose; the hostname was a typical AWS ip-172-something-something-something).
I'm pretty sure I saw the same. I unfortunately tore down that cluster already.
Unfortunately we're coming up to the deadline for v1.13, so we're going to have to punt this out. I suspect there's deeper issues here and I think the advertiseAddress workaround should work for now.
I don't know if it's even a work-around rather than it being the happy path :)
Closing this issue: if you modify the node name on a running cluster, this is undefined behavior.
Closing this issue: if you modify the node name on a running cluster, this is undefined behavior.
Why?
kubeadm init supports --node-name, and kubeadm join does too. Why doesn't kubeadm upgrade support it?
i tried reproducing the original problem with a 1.12->1.13 upgrade and i couldn't:
kubeadm init ... --node-name=myhost --kubernetes-version=v1.12.0
(myhost is a custom hostname that does not match the local node hostname; the kubelet runs with --hostname-override=myhost)
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-576cbf47c7-9497k 1/1 Running 0 77s
kube-system coredns-576cbf47c7-lmfnd 1/1 Running 0 77s
kube-system etcd-myhost 1/1 Running 0 17s
kube-system kube-apiserver-myhost 1/1 Running 0 25s
kube-system kube-controller-manager-myhost 1/1 Running 0 17s
kube-system kube-proxy-srsg2 1/1 Running 0 77s
kube-system kube-scheduler-myhost 1/1 Running 0 28s
kube-system weave-net-nrxw8 2/2 Running 0 77s
kubeadm upgrade apply v1.13.0
the upgrade passed fine!
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "myhost" as an annotation
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.13.0". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-86c58d9df4-cd49k 1/1 Running 0 27s
kube-system coredns-86c58d9df4-pvbdq 1/1 Running 0 27s
kube-system etcd-myhost 1/1 Running 0 5m8s
kube-system kube-apiserver-myhost 1/1 Running 0 65s
kube-system kube-controller-manager-myhost 1/1 Running 1 50s
kube-system kube-proxy-bh4nr 1/1 Running 0 13s
kube-system kube-scheduler-myhost 1/1 Running 0 48s
kube-system weave-net-nrxw8 2/2 Running 0 6m8s
please note that versions older than 1.13 are no longer supported, but the kubeadm team still supports 1.12->1.13 upgrades.
We have the following case:
kubeadm init --node-name=$(hostname -f) --config=...
then
# kubeadm upgrade apply --certificate-renewal=false --config /tmp/kubeadm-config.yaml
[upgrade/config] Making sure the configuration is correct:
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/version] You have chosen to change the cluster version to "v1.15.2"
[upgrade/versions] Cluster version: v1.15.2
[upgrade/versions] kubeadm version: v1.15.2
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler]
[upgrade/prepull] Prepulling image for component kube-apiserver.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 2 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 2 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 2 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.15.2"...
[upgrade/apply] FATAL: timed out waiting for the condition
With -v=9:
I0820 06:17:12.947315 10369 request.go:947] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"kube-apiserver-master-0\" not found","reason":"NotFound","details":{"name":"kube-apiserver-master-0","kind":"pods"},"code":404}
But:
# kubectl -n kube-system get pods | grep apiserver
kube-apiserver-master-0.example.org 1/1 Running 0 25h
kube-apiserver-master-1.example.org 1/1 Running 0 25h
Workaround: just set hostname == hostname -f (FQDN) for control plane nodes during the upgrade.
# hostnamectl set-hostname $(hostname -f)
nope, i could not reproduce this problem. the node name was != hostname in my tests.
Also I'm using --hostname-override=$(hostname -f) in the kubelet additional args. Maybe the problem is here?
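One way to double-check which node name the kubelet is actually using (a sketch; /var/lib/kubelet/kubeadm-flags.env is the kubeadm default location and the override may live in a different drop-in on other setups):
# flags the running kubelet was started with
ps -o args= -C kubelet | tr ' ' '\n' | grep -- '--hostname-override'
# flags kubeadm wrote out for the kubelet
grep -- '--hostname-override' /var/lib/kubelet/kubeadm-flags.env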
using the hostname as FQDN is not a problem.
I can confirm: trying kubeadm upgrade apply --dry-run and seeing
[dryrun] The GET request didn't yield any result, the API Server returned a NotFound error.
[dryrun] Would perform action GET on resource "nodes" in API group "core/v1"
[dryrun] Resource name: "<redacted>"
[dryrun] The GET request didn't yield any result, the API Server returned a NotFound error.
[dryrun] Would perform action GET on resource "nodes" in API group "core/v1"
[dryrun] Resource name: "<redacted>"
[dryrun] The GET request didn't yield any result, the API Server returned a NotFound error.
[dryrun] Would perform action GET on resource "nodes" in API group "core/v1"
[dryrun] Resource name: "<redacted>"
[dryrun] The GET request didn't yield any result, the API Server returned a NotFound error.
[dryrun] Would perform action GET on resource "nodes" in API group "core/v1"
The cluster nodes were created with kubeadm init --node-name=, so the kubelet is running under that node name rather than the hostname.
I am facing the same issue while upgrading from 1.11 to 1.12.3. My hostname and nodename are different.
[root@ip-10-0-1-124 centos]# hostname
ip-10-0-1-124.ec2.internal
[root@ip-10-0-1-124 centos]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
10.0.1.124 Ready master 1d v1.11.0
When I run kubeadm upgrade with --dry-run, I see the following issue:
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[dryrun] Would perform action GET on resource "configmaps" in API group "core/v1"
[dryrun] Resource name: "kubelet-config-1.12"
[dryrun] The GET request didn't yield any result, the API Server returned a NotFound error.
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "ip-10-0-1-124.ec2.internal" as an annotation
[dryrun] Would perform action GET on resource "nodes" in API group "core/v1"
[dryrun] Resource name: "ip-10-0-1-124.ec2.internal"
[dryrun] The GET request didn't yield any result, the API Server returned a NotFound error.
[dryrun] Would perform action GET on resource "nodes" in API group "core/v1"
[dryrun] Resource name: "ip-10-0-1-124.ec2.internal"
When I actually run the upgrade with log level -v9 to see what is happening, I can see that it is looking up the kube-apiserver pod by the hostname rather than the node name.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
I0903 06:40:13.781487 51733 apply.go:202] [upgrade/apply] performing upgrade
I0903 06:40:13.781502 51733 apply.go:268] checking if cluster is self-hosted
I0903 06:40:13.781553 51733 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.12.3 (linux/amd64) kubernetes/435f92c" 'https://10.0.1.124:6443/apis/apps/v1/namespaces/kube-system/daemonsets/self-hosted-kube-apiserver'
I0903 06:40:13.783450 51733 round_trippers.go:405] GET https://10.0.1.124:6443/apis/apps/v1/namespaces/kube-system/daemonsets/self-hosted-kube-apiserver 404 Not Found in 1 milliseconds
I0903 06:40:13.783466 51733 round_trippers.go:411] Response Headers:
I0903 06:40:13.783473 51733 round_trippers.go:414] Content-Length: 252
I0903 06:40:13.783478 51733 round_trippers.go:414] Date: Tue, 03 Sep 2019 06:40:13 GMT
I0903 06:40:13.783484 51733 round_trippers.go:414] Content-Type: application/json
I0903 06:40:13.783506 51733 request.go:942] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"daemonsets.apps \"self-hosted-kube-apiserver\" not found","reason":"NotFound","details":{"name":"self-hosted-kube-apiserver","group":"apps","kind":"daemonsets"},"code":404}
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.12.3"...
I0903 06:40:13.784112 51733 round_trippers.go:386] curl -k -v -XGET -H "Accept: application/json, */*" -H "User-Agent: kubeadm/v1.12.3 (linux/amd64) kubernetes/435f92c" 'https://10.0.1.124:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-ip-10-0-1-124.ec2.internal'
I0903 06:40:13.785876 51733 round_trippers.go:405] GET https://10.0.1.124:6443/api/v1/namespaces/kube-system/pods/kube-apiserver-ip-10-0-1-124.ec2.internal 404 Not Found in 1 milliseconds
I0903 06:40:13.785893 51733 round_trippers.go:411] Response Headers:
I0903 06:40:13.785899 51733 round_trippers.go:414] Content-Length: 250
I0903 06:40:13.785905 51733 round_trippers.go:414] Date: Tue, 03 Sep 2019 06:40:13 GMT
I0903 06:40:13.785910 51733 round_trippers.go:414] Content-Type: application/json
Meanwhile, the name of the kube-apiserver pod in my cluster is kube-apiserver-10.0.1.124. Also, I have --hostname-override=10.0.1.124 in my kubelet config.
Okay, I found a workaround for this issue. Follow these steps:
kubectl -n kube-system get cm kubeadm-config -o jsonpath={.data} > config.yaml
In config.yaml, add a nodeRegistration section with your node name:
kubernetesVersion: v1.11.0
networking:
  dnsDomain: cluster.local
  podSubnet: 192.168.12.0/24
  serviceSubnet: 10.96.0.0/12
nodeRegistration:
  name: 10.0.1.124
Now run kubeadm upgrade with the config file you have updated and it works like a charm.
[root@ip-10-0-1-124 centos]# kubeadm upgrade apply v1.12.3 --config config.yaml
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration options from a file: config.yaml
[upgrade/apply] Respecting the --cri-socket flag that is set with higher priority than the config file.
[upgrade/version] You have chosen to change the cluster version to "v1.12.3"
[upgrade/versions] Cluster version: v1.11.0
[upgrade/versions] kubeadm version: v1.12.3
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component etcd.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.12.3"...
Static pod: kube-apiserver-10.0.1.124 hash: 316d1d5d7fd74b652b7bbae02ab843cc
Static pod: kube-controller-manager-10.0.1.124 hash: 587acc27d61cd1cc7819be5de5a9933f
Static pod: kube-scheduler-10.0.1.124 hash: 31eabaff7d89a40d8f7e05dfc971cdbd
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests082154743"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests082154743/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests082154743/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests082154743/kube-scheduler.yaml"
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2019-09-03-06-51-10/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-10.0.1.124 hash: 316d1d5d7fd74b652b7bbae02ab843cc
Static pod: kube-apiserver-10.0.1.124 hash: 316d1d5d7fd74b652b7bbae02ab843cc
Static pod: kube-scheduler-10.0.1.124 hash: 31eabaff7d89a40d8f7e05dfc971cdbd
Static pod: kube-scheduler-10.0.1.124 hash: 633222d4d708cfc5014fefd8c72f9402
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.12" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "10.0.1.124" as an annotation
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.12.3". Enjoy!
unfortunately 1.11 is no longer maintained by the kubeadm team.
if you see a bug with 1.13 -> 1.14 upgrades please let us know.