Kubespray: Installation fail: kubeadm [Initialize first master]

Created on 3 Sep 2019  ·  31 Comments  ·  Source: kubernetes-sigs/kubespray

Environment:

  • Cloud provider or hardware configuration:
    AWS
  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    coreos

  • Version of Ansible (ansible --version):
    ansible 2.7.8

Kubespray version (commit) (git rev-parse --short HEAD):
git checkout release-2.10
git checkout release-2.11

Network plugin used:
cilium 1.3.7

Copy of your inventory file:
inventory.zip

Command used to invoke ansible:

ssh-add  ~/.ssh/tempprivate
eval "$(ssh-agent -s)"
cd contrib/terraform/aws
vi terraform.tfvars
terraform init
terraform apply -var-file=credentials.tfvars
ansible-playbook -i ./inventory/hosts ./cluster.yml -e ansible_ssh_user=core -e bootstrap_os=coreos -b --become-user=root --flush-cache -e ansible_user=core

Output of ansible run:

(screenshot of the Ansible run omitted; the error text follows below)

Error
TASK [kubernetes/master : kubeadm | Initialize first master] ************************************
Tuesday 03 September 2019 07:14:25 +0000 (0:00:00.520) 0:22:02.910

FAILED - RETRYING: kubeadm | Initialize first master (3 retries left).
FAILED - RETRYING: kubeadm | Initialize first master (2 retries left).
FAILED - RETRYING: kubeadm | Initialize first master (1 retries left).
fatal: [kubernetes-dev0210john0903-master0]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "600s", "600s", "/opt/bin/kubeadm", "init", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--skip-phases=addon/coredns", "--experimental-upload-certs", "--certificate-key=ecabe44f2d9ce1b2edbb702c8a9c77d5c84bb9cb4da05eb42fcba3dfe4ec5b5e"], "delta": "0:02:02.449063", "end": "2019-09-03 07:23:13.971380", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2019-09-03 07:21:11.522317", "stderr": "\t[WARNING Port-6443]: Port 6443 is in use\n\t[WARNING Port-10251]: Port 10251 is in use\n\t[WARNING Port-10252]: Port 10252 is in use\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists\n\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/\n\t[WARNING Port-10250]: Port 10250 is in use\nerror execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition", "stderr_lines": ["\t[WARNING Port-6443]: Port 6443 is in use", "\t[WARNING Port-10251]: Port 10251 is in use", "\t[WARNING Port-10252]: Port 10252 is in use", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists", "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". 
Please follow the guide at https://kubernetes.io/docs/setup/cri/", "\t[WARNING Port-10250]: Port 10250 is in use", "error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition"], "stdout": "[init] Using Kubernetes version: v1.14.6\n[preflight] Running pre-flight checks\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Activating the kubelet service\n[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"\n[certs] Using existing ca certificate authority\n[certs] Using existing apiserver certificate and key on disk\n[certs] Using existing apiserver-kubelet-client certificate and key on disk\n[certs] Using existing front-proxy-ca certificate authority\n[certs] Using existing front-proxy-client certificate and key on disk\n[certs] External etcd mode: Skipping etcd/ca certificate authority generation\n[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation\n[certs] External etcd mode: Skipping etcd/server certificate authority generation\n[certs] External etcd mode: Skipping etcd/peer certificate authority generation\n[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation\n[certs] Using the existing \"sa\" key\n[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"\n[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"\n[control-plane] Creating static Pod manifest for \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-controller-manager\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-scheduler\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". 
This can take up to 5m0s\n[apiclient] All control plane components are healthy after 0.010987 seconds\n[upload-config] storing the configuration used in ConfigMap \"kubeadm-config\" in the \"kube-system\" Namespace\n[kubelet] Creating a ConfigMap \"kubelet-config-1.14\" in namespace kube-system with the configuration for the kubelets in the cluster\n[kubelet-check] Initial timeout of 40s passed.", "stdout_lines": ["[init] Using Kubernetes version: v1.14.6", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"", "[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"", "[kubelet-start] Activating the kubelet service", "[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"", "[certs] Using existing ca certificate authority", "[certs] Using existing apiserver certificate and key on disk", "[certs] Using existing apiserver-kubelet-client certificate and key on disk", "[certs] Using existing front-proxy-ca certificate authority", "[certs] Using existing front-proxy-client certificate and key on disk", "[certs] External etcd mode: Skipping etcd/ca certificate authority generation", "[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation", "[certs] External etcd mode: Skipping etcd/server certificate authority generation", "[certs] External etcd mode: Skipping etcd/peer certificate authority generation", "[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation", "[certs] Using the existing \"sa\" key", "[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"", "[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"", "[control-plane] Creating static Pod manifest for \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-controller-manager\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-scheduler\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory 
\"/etc/kubernetes/manifests\". This can take up to 5m0s", "[apiclient] All control plane components are healthy after 0.010987 seconds", "[upload-config] storing the configuration used in ConfigMap \"kubeadm-config\" in the \"kube-system\" Namespace", "[kubelet] Creating a ConfigMap \"kubelet-config-1.14\" in namespace kube-system with the configuration for the kubelets in the cluster", "[kubelet-check] Initial timeout of 40s passed."]}

Anything else we need to know:
Raised on both:
release-2.10
release-2.11

Previously this seemed to run correctly on some versions, but now it always fails.

Is the following the root cause?
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd".

Label: kind/bug

Most helpful comment

I'm really sorry, but in this case it was my own issue: I firewalled the balancer host half a year ago and forgot about it, so the master hosts couldn't initialize because they couldn't connect to the balancer.

All 31 comments

Do your nodes already have Kubernetes services running? The ports are already in use.
It seems some people have already hit the same problem here: https://github.com/kubernetes/kubeadm/issues/1438
Maybe you can find some help there.

same problem on openstack

I received this kind of error yesterday, and I understood that the cause can't be one of the warnings you see in the logs, since they are only warnings - and mostly because the error is thrown only after the timeout is reached (and 3 retries).

In my case I had an issue with the kubeadm join command, but the way I handled it could work for you too.
You can try to run the kubeadm init command manually with the verbose option (-v 5, for example):

# This is the command Ansible is trying to run in your original post
timeout -k 600s 600s \
  /opt/bin/kubeadm init \
  --config=/etc/kubernetes/kubeadm-config.yaml \
  --ignore-preflight-errors=all \
  --skip-phases=addon/coredns \
  --experimental-upload-certs \
  --certificate-key=ecabe44f2d9ce1b2edbb702c8a9c77d5c84bb9cb4da05eb42fcba3dfe4ec5b5e \
  -v 5

This should give you some hints.

Hi,

Had the same issue. In my case it was an IP <> address mismatch in group_vars/all/all.yml:

## External LB example config
apiserver_loadbalancer_domain_name: "kubernetes.tld"
loadbalancer_apiserver:
  address: 10.1.10.127
  port: 443

kubernetes.tld should resolve to 10.1.10.127, so either update the DNS record or change the variable.
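A quick way to confirm the record and the variable actually agree (a sketch using the example values above) is:

```
# What does the load balancer name resolve to?
dig +short kubernetes.tld
# Or via the system resolver:
getent hosts kubernetes.tld
# Compare the result with loadbalancer_apiserver.address (10.1.10.127 in this example)
```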

Hope it will help someone !

any updates? I have the same issue with cloud_provider: aws, the same scenario

same problem on openstack. I've tried the above but the install does not finish the TASK [kubernetes/master : kubeadm | Initialize first master] . Any help would be appreciated.

> same problem on openstack. I've tried the above but the install does not finish the TASK [kubernetes/master : kubeadm | Initialize first master]. Any help would be appreciated.

try to add
kubelet_cgroup_driver: "cgroupfs"
to group_vars/k8s-cluster/k8s-cluster.yaml

> same problem on openstack. I've tried the above but the install does not finish the TASK [kubernetes/master : kubeadm | Initialize first master]. Any help would be appreciated.
>
> try to add
> kubelet_cgroup_driver: "cgroupfs"
> to group_vars/k8s-cluster/k8s-cluster.yaml

Doesn't help in the AWS case: ports are still in use and [kubernetes/master : kubeadm | Initialize first master] still fails.

> same problem on openstack. I've tried the above but the install does not finish the TASK [kubernetes/master : kubeadm | Initialize first master]. Any help would be appreciated.
>
> try to add
> kubelet_cgroup_driver: "cgroupfs"
> to group_vars/k8s-cluster/k8s-cluster.yaml

Still the same issue on OpenStack.

FAILED - RETRYING: kubeadm | Initialize first master (2 retries left).Result was: { "attempts": 2, "changed": true, "cmd": [ "timeout", "-k", "300s", "300s", "/usr/local/bin/kubeadm", "init", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--skip-phases=addon/coredns", "--upload-certs" ], "delta": "0:05:00.007142", "end": "2019-10-02 15:31:43.120801", "failed_when_result": true, "invocation": { "module_args": { "_raw_params": "timeout -k 300s 300s /usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=all --skip-phases=addon/coredns --upload-certs ", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true } }, "msg": "non-zero return code", "rc": 124, "retries": 4, "start": "2019-10-02 15:26:43.113659", "stderr": "\t[WARNING Port-10251]: Port 10251 is in use\n\t[WARNING Port-10252]: Port 10252 is in use\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists\n\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists\n\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". Please follow the guide at https://kubernetes.io/docs/setup/cri/\n\t[WARNING Port-10250]: Port 10250 is in use", "stderr_lines": [ "\t[WARNING Port-10251]: Port 10251 is in use", "\t[WARNING Port-10252]: Port 10252 is in use", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists", "\t[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists", "\t[WARNING IsDockerSystemdCheck]: detected \"cgroupfs\" as the Docker cgroup driver. The recommended driver is \"systemd\". 
Please follow the guide at https://kubernetes.io/docs/setup/cri/", "\t[WARNING Port-10250]: Port 10250 is in use" ], "stdout": "[init] Using Kubernetes version: v1.15.3\n[preflight] Running pre-flight checks\n[preflight] Pulling images required for setting up a Kubernetes cluster\n[preflight] This might take a minute or two, depending on the speed of your internet connection\n[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'\n[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"\n[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"\n[kubelet-start] Activating the kubelet service\n[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"\n[certs] Using existing front-proxy-ca certificate authority\n[certs] Using existing front-proxy-client certificate and key on disk\n[certs] External etcd mode: Skipping etcd/ca certificate authority generation\n[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation\n[certs] External etcd mode: Skipping etcd/server certificate authority generation\n[certs] External etcd mode: Skipping etcd/peer certificate authority generation\n[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation\n[certs] Using existing ca certificate authority\n[certs] Using existing apiserver certificate and key on disk\n[certs] Using existing apiserver-kubelet-client certificate and key on disk\n[certs] Using the existing \"sa\" key\n[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"\n[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"\n[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"\n[control-plane] Creating static Pod manifest for \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-controller-manager\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[control-plane] Creating static Pod manifest for \"kube-scheduler\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"\n[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"\n[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". 
This can take up to 5m0s\n[kubelet-check] Initial timeout of 40s passed.", "stdout_lines": [ "[init] Using Kubernetes version: v1.15.3", "[preflight] Running pre-flight checks", "[preflight] Pulling images required for setting up a Kubernetes cluster", "[preflight] This might take a minute or two, depending on the speed of your internet connection", "[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'", "[kubelet-start] Writing kubelet environment file with flags to file \"/var/lib/kubelet/kubeadm-flags.env\"", "[kubelet-start] Writing kubelet configuration to file \"/var/lib/kubelet/config.yaml\"", "[kubelet-start] Activating the kubelet service", "[certs] Using certificateDir folder \"/etc/kubernetes/ssl\"", "[certs] Using existing front-proxy-ca certificate authority", "[certs] Using existing front-proxy-client certificate and key on disk", "[certs] External etcd mode: Skipping etcd/ca certificate authority generation", "[certs] External etcd mode: Skipping etcd/healthcheck-client certificate authority generation", "[certs] External etcd mode: Skipping etcd/server certificate authority generation", "[certs] External etcd mode: Skipping etcd/peer certificate authority generation", "[certs] External etcd mode: Skipping apiserver-etcd-client certificate authority generation", "[certs] Using existing ca certificate authority", "[certs] Using existing apiserver certificate and key on disk", "[certs] Using existing apiserver-kubelet-client certificate and key on disk", "[certs] Using the existing \"sa\" key", "[kubeconfig] Using kubeconfig folder \"/etc/kubernetes\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/admin.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/kubelet.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/controller-manager.conf\"", "[kubeconfig] Using existing kubeconfig file: \"/etc/kubernetes/scheduler.conf\"", "[control-plane] Using manifest folder \"/etc/kubernetes/manifests\"", "[control-plane] Creating static Pod manifest for \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-controller-manager\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[control-plane] Creating static Pod manifest for \"kube-scheduler\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"usr-share-ca-certificates\" to \"kube-apiserver\"", "[controlplane] Adding extra host path mount \"cloud-config\" to \"kube-controller-manager\"", "[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory \"/etc/kubernetes/manifests\". This can take up to 5m0s", "[kubelet-check] Initial timeout of 40s passed." ] }

@rstriedl5c you know, I faced a lot of problems with kubespray+terraform+openstack.
For example: init can't succeed because the master can't connect to openstack - it can't resolve the hostname.
Try connecting to the master via SSH and running journalctl -u kubelet - you will see why kubelet can't start.

In the future you will also face problems in openstack like insufficient rules in the SG groups and so on...
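A minimal sketch of that check (user and host are placeholders, not values from this thread):

```
# SSH to the first master and read the most recent kubelet logs
ssh <user>@<master1> 'sudo journalctl -u kubelet --no-pager -n 100'
# Also check whether the kubelet service is up at all
ssh <user>@<master1> 'sudo systemctl status kubelet --no-pager'
```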

@ppcololo Thanks for the information.

What's odd is that it can't talk to the API endpoint on 6443 on my master. I've opened the security group to the world; see the logs below. I'm trying flannel instead of calico as the CNI to start.

Also, you can see that my /etc/hosts file is updated with the private IPs but not the floating IPs.

# Ansible inventory hosts BEGIN
10.0.0.1 my-cluster-master-nf-1.k8s-os-lab.cluster.local my-cluster-master-nf-1
10.0.0.2 my-cluster-master-nf-2.k8s-os-lab.cluster.local my-cluster-master-nf-2
10.0.0.3 my-cluster-master-nf-3.k8s-os-lab.cluster.local my-cluster-master-nf-3
10.0.0.4 my-cluster-node-nf-1.k8s-os-lab.cluster.local my-cluster-node-nf-1
10.0.0.5 my-cluster-node-nf-2.k8s-os-lab.cluster.local my-cluster-node-nf-2
# Ansible inventory hosts END

Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.081591 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.0.0.1:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-os-la Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.116793 28559 kubelet.go:2169] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.163476 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.264056 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.281876 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://10.0.0.1:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8s-os-lab-k8s- Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.364658 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.431705 28559 controller.go:125] failed to ensure node lease exists, will retry in 7s, error: Get https://10.0.0.1:6443/apis/coordination.k8s.io/v1beta1/namespaces/kube-node-lease/leases/ Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.464956 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.481289 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://10.0.0.1:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.0.5 Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.565172 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.665444 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.681411 28559 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Get https://10.0.0.1:6443/apis/node.k8s.io/v1beta1/runtimeclasses?limit=500&res Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.765792 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.866077 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.881913 28559 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Get https://10.0.0.1:6443/apis/storage.k8s.io/v1beta1/csidrivers?limit=500&resourc Oct 02 17:09:14 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:14.966325 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.066548 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.082374 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.0.0.1:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-os-la Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.166787 28559 
kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.267103 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.283370 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://10.0.0.1:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8s-os-lab-k8s- Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.367429 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.467807 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.482494 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://10.0.0.1:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.0.5 Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.568113 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.668455 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: I1002 17:09:15.676684 28559 kubelet_node_status.go:286] Setting node annotation to enable volume controller attach/detach Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: I1002 17:09:15.676988 28559 setters.go:73] Using node IP: "10.0.0.1" Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: I1002 17:09:15.680066 28559 kubelet_node_status.go:471] Recording NodeHasSufficientMemory event message for node my-cluster-master-nf-1 Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: I1002 17:09:15.680122 28559 kubelet_node_status.go:471] Recording NodeHasNoDiskPressure event message for node my-cluster-master-nf-1 Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: I1002 17:09:15.680140 28559 kubelet_node_status.go:471] Recording NodeHasSufficientPID event message for node my-cluster-master-nf-1 Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.681140 28559 pod_workers.go:190] Error syncing pod bcb3aff273a63df587968bf0c241649e ("kube-apiserver-my-cluster-master-nf-1_kube-system(bcb3aff273a63df587968bf0c241649e)"), skipping: Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.682221 28559 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Get https://10.0.0.1:6443/apis/node.k8s.io/v1beta1/runtimeclasses?limit=500&res Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.768769 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.869116 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.882533 28559 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Get https://10.0.0.1:6443/apis/storage.k8s.io/v1beta1/csidrivers?limit=500&resourc Oct 02 17:09:15 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:15.969459 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.029084 28559 event.go:249] Unable to write event: 'Patch 
https://10.0.0.1:6443/api/v1/namespaces/default/events/my-cluster-master-nf-1.15c9de871039b848: dial tcp 10.0.0.1:6443: conne Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.070139 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.083251 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.0.0.1:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-os-la Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.170395 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.270668 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.284576 28559 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://10.0.0.1:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8s-os-lab-k8s- Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.371004 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found Oct 02 17:09:16 my-cluster-master-nf-1 kubelet[28559]: E1002 17:09:16.471221 28559 kubelet.go:2248] node "my-cluster-master-nf-1" not found
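Given the repeated "dial tcp" errors against 10.0.0.1:6443 in those logs, a quick check directly on the master (a sketch; the address is the one appearing in the logs above) would be:

```
# Is anything listening on the apiserver port?
sudo ss -tlnp | grep 6443
# Can the address the kubelet is trying to use be reached at all?
curl -k https://10.0.0.1:6443/healthz
```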

Do you use the flannel CNI?
I had this problem. Try checking it this way: create the file /etc/cni/net.d/10-flannel.conflist on your master1
with this content:

{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

and then check the journal and systemctl status kubelet

@ppcololo I did not have this file; I've added it to my master. I don't see any change in my journal or kubelet status. Do I need to run the ansible playbook?

What network plugin do you use - flannel/calico/canal?
After adding the file, send the kubelet status and the last journal logs, and maybe try to restart kubelet: systemctl restart kubelet

@ppcololo I'm using flannel. I restarted kubelet; see below.

```
$ sudo systemctl status kubelet
● kubelet.service - Kubernetes Kubelet Server
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2019-10-02 20:46:59 UTC; 1min 14s ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Main PID: 6481 (kubelet)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/kubelet.service
└─6481 /usr/local/bin/kubelet --logtostderr=true --v=2 --node-ip=10.0.0.1 --hostname-override=my-cluster-master-nf-1 --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/etc/kubernetes/kubelet-config.yaml --kubeconfig=/etc/kubernetes

Oct 02 20:48:12 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:12.567131 6481 kubelet.go:2248] node "my-cluster-master-nf-1" not found
Oct 02 20:48:12 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:12.667356 6481 kubelet.go:2248] node "my-cluster-master-nf-1" not found
Oct 02 20:48:12 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:12.725590 6481 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://10.0.0.1:6443/api/v1/pods?fieldSelector=spec.nodeName%3D-k8s-os-lab
Oct 02 20:48:12 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:12.767684 6481 kubelet.go:2248] node "my-cluster-master-nf-1" not found
Oct 02 20:48:12 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:12.868096 6481 kubelet.go:2248] node "my-cluster-master-nf-1" not found
Oct 02 20:48:12 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:12.925268 6481 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Get https://10.0.0.1:6443/apis/node.k8s.io/v1beta1/runtimeclasses?limit=500&reso
Oct 02 20:48:12 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:12.968432 6481 kubelet.go:2248] node "my-cluster-master-nf-1" not found
Oct 02 20:48:13 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:13.068759 6481 kubelet.go:2248] node "my-cluster-master-nf-1" not found
Oct 02 20:48:13 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:13.125289 6481 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://10.0.0.1:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.0.0.1
Oct 02 20:48:13 my-cluster-master-nf-1 kubelet[6481]: E1002 20:48:13.168976 6481 kubelet.go:2248] node "my-cluster-master-nf-1" not found
```

@rstriedl5c kubelet is running,
but I need to see the full journalctl -u kubelet.
And just for fun, try the canal driver - with this network plugin I have no problems on openstack.

@olehhordiienko Thanks. We are both using OpenStack. The fix appears to be for AWS.

@rstriedl5c all the changes I made:
in k8s-cluster.yml:

kube_version: v1.14.5
kube_network_plugin: canal
resolvconf_mode: host_resolvconf
node_volume_attach_limit: 26
kubelet_cgroup_driver: "cgroupfs"

all.yml

cloud_provider: openstack
upstream_dns_servers:
  - x.x.x.x
  - x.x.x.x
  • SG rules in the compute module (because some checks fail without these rules):
resource "openstack_networking_secgroup_rule_v2" "k8s_icmp" {
  direction         = "ingress"
  ethertype         = "IPv4"
  protocol          = "icmp"
  port_range_min    = "0"
  port_range_max    = "0"
  remote_ip_prefix  = "0.0.0.0/0"
  security_group_id = "${openstack_networking_secgroup_v2.k8s.id}"
}

resource "openstack_networking_secgroup_rule_v2" "k8s_master_etcd" {
  direction         = "ingress"
  ethertype         = "IPv4"
  protocol          = "tcp"
  port_range_min    = "2370"
  port_range_max    = "2380"
  remote_ip_prefix  = "0.0.0.0/0"
  security_group_id = "${openstack_networking_secgroup_v2.k8s_master.id}"
}

resource "openstack_networking_secgroup_rule_v2" "k8s_master_kube" {
  direction         = "ingress"
  ethertype         = "IPv4"
  protocol          = "tcp"
  port_range_min    = "10240"
  port_range_max    = "10260"
  remote_ip_prefix  = "0.0.0.0/0"
  security_group_id = "${openstack_networking_secgroup_v2.k8s_master.id}"
}

Now I have a properly installed k8s cluster in openstack. But you know - I tried another tool (kops) and got a cluster without a lot of pain (they added openstack support with an LB).

@ppcololo I believe you're using CentOS, correct?

Here are my configs, since I'm using Ubuntu.

# Can be docker_dns, host_resolvconf or none
# Default:
resolvconf_mode: docker_dns
# For Container Linux by CoreOS:
# resolvconf_mode: host_resolvconf

in my k8s-cluster.yml:

kube_version: v1.14.5
kube_network_plugin: flannel
resolvconf_mode: host_resolvconf
node_volume_attach_limit: 26
kubelet_cgroup_driver: "cgroupfs"

I've added the SG rules in compute module.

I've used Kops on AWS and other clouds before. Can you send me the Kops command you ran to create the OpenStack cluster, plus any other things to consider with a Kops OpenStack K8s cluster? Thanks in advance.

I will try the above Kubespray changes and let you know if it works for me. Thanks again.

@ppcololo silly question: how are you generating your inventory.ini file? Does yours look similar to the following... I'm trying to set up GlusterFS nodes too. At this time, I'm not using a bastion host.


# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
# ## We should set etcd_member_name for etcd cluster. The node that is not a etcd member do not need to set the value, or can set the empty string value.
[all]
my-cluster-k8s-master-nf-1 ansible_host={floating_ip} ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.7 etcd_member_name=etcd1
my-cluster-k8s-master-nf-2 ansible_host={floating_ip} ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.8 etcd_member_name=etcd2
my-cluster-k8s-master-nf-3 ansible_host={floating_ip} ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.9 etcd_member_name=etcd3
my-cluster-k8s-node-nf-1 ansible_host={floating_ip} ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.10
my-cluster-k8s-node-nf-2 ansible_host={floating_ip} ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.11
my-cluster-gfs-node-nf-1 ansible_host=10.0.0.3 ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.3
my-cluster-gfs-node-nf-2 ansible_host=10.0.0.4 ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.4
my-cluster-gfs-node-nf-3 ansible_host=10.0.0.5 ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo ip=10.0.0.5

[all:vars]
ansible_python_interpreter=/usr/bin/python3

# ## configure a bastion host if your nodes are not directly reachable
# bastion ansible_host=x.x.x.x ansible_user=some_user
# [bastion]
# my-cluster-bastion-1 ansible_host={floating_ip} ip=10.0.0.7 ansible_become=yes ansible_user=ubuntu ansible_become_method=sudo

# [bastion:vars]
# ansible_python_interpreter=/usr/bin/python3

[kube-master]
my-cluster-k8s-master-nf-1
my-cluster-k8s-master-nf-2
my-cluster-k8s-master-nf-3

[kube-master:vars]
# ansible_ssh_extra_args="-o StrictHostKeyChecking=no"
ansible_python_interpreter=/usr/bin/python3

[kube-node]
my-cluster-k8s-node-nf-1
my-cluster-k8s-node-nf-2

[kube-node:vars]
# ansible_ssh_extra_args="-o StrictHostKeyChecking=no"
ansible_python_interpreter=/usr/bin/python3

[etcd]
my-cluster-k8s-master-nf-1
my-cluster-k8s-master-nf-2
my-cluster-k8s-master-nf-3

[etcd:vars]
ansible_python_interpreter=/usr/bin/python3

[gfs-cluster]
my-cluster-gfs-node-nf-1
my-cluster-gfs-node-nf-2
my-cluster-gfs-node-nf-3

[gfs-cluster:vars]
ansible_python_interpreter=/usr/bin/python3

[network-storage]
my-cluster-gfs-node-nf-1
my-cluster-gfs-node-nf-2
my-cluster-gfs-node-nf-3

[network-storage:vars]
ansible_python_interpreter=/usr/bin/python3

[calico-rr]

[k8s-cluster:children]
kube-master
kube-node
calico-rr

@rstriedl5c kubespray uses a python script which parses the terraform.tfstate file - that is the inventory for ansible; I don't have another one.
When you deploy VMs via terraform, the hosts have their group names in metadata; that's how ansible knows which group each host belongs to.
I can share the command for kops, but not here - because this is the kubespray repo and issue :)
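(As an aside, one way to see which groups a dynamic inventory actually resolved - assuming the ./inventory/hosts path from the original post - is:)

```
# Print the resolved group/host tree without running any play
ansible-inventory -i ./inventory/hosts --graph
```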

hello,

I got a very similar issue with kubespray: the first master fails every time with the following error. I have the hosts in DNS and I can ping them. Also, the kubelet service errors out due to a missing file (/etc/kubernetes/ssl/ca.crt):

fatal: [lvpaldbsvm28]: FAILED! => {"attempts": 3, "changed": true, "cmd": ["timeout", "-k", "300s", "300s", "/usr/local/bin/kubeadm", "init", "--config=/etc/kubernetes/kubeadm-config.yaml", "--ignore-preflight-errors=all", "--skip-phases=addon/coredns", "--upload-certs"], "delta": "0:00:00.108208", "end": "2019-10-20 15:59:24.267983", "failed_when_result": true, "msg": "non-zero return code", "rc": 3, "start": "2019-10-20 15:59:24.159775", "stderr": "[apiServer.certSANs: Invalid value: \"lvpaldbsvm28.pal.sap.corp\u00a0\": altname is not a valid IP address, DNS label or a DNS label

kubelet service shows the following error.

erver.go:251] unable to load client CA file /etc/kubernetes/ssl/ca.crt: open /etc/kubernetes/ssl/ca.crt: no such ... or directory
Hint: Some lines were ellipsized, use -l to show in full.
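(Worth noting: the quoted altname in the error above, "lvpaldbsvm28.pal.sap.corp\u00a0", ends in a \u00a0 non-breaking space, which is likely what makes it an invalid DNS label. A quick way to look for such characters in an inventory or config file - the path below is only an example - is:)

```
# Find lines containing a UTF-8 non-breaking space (bytes 0xC2 0xA0)
grep -n $'\xc2\xa0' inventory/hosts
```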

@johnzheng1975

Were you able to resolve the error that caused the failure on TASK [kubernetes/master : kubeadm | Initialize first master]?
I have a similar error on my Kubernetes install. I created a GitHub issue for the same: https://github.com/kubernetes-sigs/kubespray/issues/5404.

Please let us know, it might help me with my issue. Thanks

Hit the same issue, but only on my second and third test installs. I'll see if I can find which config option I'm using that triggers it.

any news here?

any news here?

You have a similar problem?

Yeah, I've found that if I disable the loadbalancer_apiserver option

#loadbalancer_apiserver:
#  address: 1.2.3.4
#  port: 443

the setup completes successfully.
And the setup gets stuck when I uncomment it (but it 100% worked half a year ago).

I use the latest release of kubespray from github and kubernetes 1.16.6

I'm really sorry, but in this case it was my own issue: I firewalled the balancer host half a year ago and forgot about it, so the master hosts couldn't initialize because they couldn't connect to the balancer.
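(For anyone in the same situation: a quick way to confirm the masters can actually reach the balancer before rerunning the playbook - the address and port below are just the placeholder values from the loadbalancer_apiserver example above - is something like:)

```
# Run from a master node: can a TCP connection be opened to the load balancer?
nc -zv 1.2.3.4 443
```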

Closing this since it is an old issue.

I'm really sorry, but in this case it was my own issue: I firewalled the balancer host half a year ago and forgot about it, so the master hosts couldn't initialize because they couldn't connect to the balancer.

I believe the kubespray scripts ought to make sure that all hosts (including the load balancer, where an external one is being used) are reachable. In my case I was getting this same error using KVM VMs: the load balancer VM was stuck during boot, waiting for an fsck (or Ctrl+D) to complete booting.
