Environment:
OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
Linux 4.15.0-91-generic x86_64
NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
Version of Ansible (ansible --version):
ansible 2.7.7
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/home/user/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
executable location = /usr/local/bin/ansible
python version = 2.7.12 (default, Oct 8 2019, 14:14:10) [GCC 5.4.0 20160609]
Version of Python (python --version):
Python 2.7.12
Kubespray version (commit) (git rev-parse --short HEAD):
a4e65c7
Network plugin used:
default, calico
Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):
cat inventory/mycluster/hosts.ini
[all]
host1 ansible_host=host1 ansible_user=ops ip=192.168.0.2
host2 ansible_host=host2 ansible_user=ops ip=192.168.0.3
[kube-master]
host1
host2
[etcd]
host1
[kube-node]
host1
host2
[k8s-cluster:children]
kube-master
kube-node
[calico-rr]
Command used to invoke ansible:
ansible-playbook -i inventory/mycluster/hosts.ini --become --become-user=root cluster.yml
Output of ansible run:
TASK [kubernetes-apps/cluster_roles : Kubernetes Apps | Wait for kube-apiserver] *****
Thursday 19 March 2020 02:20:14 -0700 (0:00:00.266) 0:02:31.721 **
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (10 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (9 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (8 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (7 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (6 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (5 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (4 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (3 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (2 retries left).
FAILED - RETRYING: Kubernetes Apps | Wait for kube-apiserver (1 retries left).
fatal: [host1]: FAILED! => {"attempts": 10, "changed": false, "content": "", "msg": "Status code was -1 and not [200]: Request failed:
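In case it helps narrow this down, the endpoint the wait task appears to poll can also be probed by hand on host1. A rough sketch (assumptions on my part: the default apiserver port 6443 and docker as the container runtime deployed by kubespray):
# Is the apiserver container running at all?
docker ps | grep kube-apiserver
# Does the health endpoint answer?
curl -k https://127.0.0.1:6443/healthz
# Kubelet logs usually show why a static pod keeps failing
journalctl -u kubelet --no-pager | tail -n 50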
Anything else do we need to know:
ansible only fails once it reaches: TASK [kubernetes-apps/cluster_roles : Kubernetes Apps | Wait for kube-apiserver]
manually trying to run kubeadm init reveals the issue:
sudo timeout -k 600s 600s /usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all
[init] Using Kubernetes version: v1.13.3
[preflight] Running pre-flight checks
[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[WARNING Port-10250]: Port 10250 is in use
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/ssl"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate authority generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation
[certs] External etcd mode: Skipping etcd/peer certificate authority generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing up-to-date kubeconfig file: "/etc/kubernetes/scheduler.conf"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[controlplane] Adding extra host path mount "usr-share-ca-certificates" to "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[controlplane] Adding extra host path mount "usr-share-ca-certificates" to "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[controlplane] Adding extra host path mount "usr-share-ca-certificates" to "kube-apiserver"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 5m0s
[apiclient] All control plane components are healthy after 6.501441 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "host1" as an annotation
[mark-control-plane] Marking the node host1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node host1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: sbv5tb.bx4ccrwtnbt5r7om
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
error execution phase addon/coredns: unable to update deployment: Deployment.apps "coredns" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"k8s-app":"kube-dns"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
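As far as I understand it, this error means the coredns Deployment already on the cluster has a different spec.selector than the one the newer kubeadm wants to apply (k8s-app: kube-dns), and a Deployment's selector cannot be changed in place. The existing selector can be inspected like this (a sketch; it assumes the admin kubeconfig is at the default kubespray location on the master):
kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system get deployment coredns -o jsonpath='{.spec.selector}'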
I think the original version of kubespray for this installation was: 6b3f7306a416bff5b3a55d71a4a407f021c72830
I did find this bug report, which might be related: https://gitmemory.com/issue/kubernetes/kubeadm/1499/482033556. However, it is not clear how to resolve the problem.
@dabest1 May I ask why you are trying to install such an old k8s version with an old kubespray codebase?
This was indeed changed to kube-dns in a newer release.
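If the old Deployment is what is blocking the upgrade, one possible workaround (not verified on this cluster, and it will cause a short DNS outage) is to delete the existing coredns Deployment so the next run can recreate it with the new selector:
kubectl --kubeconfig /etc/kubernetes/admin.conf -n kube-system delete deployment coredns
ansible-playbook -i inventory/mycluster/hosts.ini --become --become-user=root cluster.yml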
It was installed a long time ago; I was now trying to move to a newer version of kubespray, but that broke the k8s cluster.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.