Environment:
Cloud provider: AWS
OS: CentOS Linux 7
Ansible version: 2.7.10
Kubespray version: master (tag v2.9.0)
Trying to install Kubernetes on AWS using Kubespray v2.9.0.
The playbook freezes on tasks such as "initialize first master" and "join to cluster".
Logging in to the worker nodes, I found the error below:
unable to load client CA file /etc/kubernetes/ssl/ca.crt: open /etc/kubernetes/ssl/ca.crt: no such file or directory
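When hitting this error, a quick sanity check on the failing node is to look at both paths kubelet might use (the paths below are taken from the error messages in this thread; run as root on the worker):

```shell
# Check both locations the CA file is expected in; Kubespray normally
# makes /etc/kubernetes/pki a symlink to /etc/kubernetes/ssl.
for f in /etc/kubernetes/ssl/ca.crt /etc/kubernetes/pki/ca.crt; do
  if [ -e "$f" ]; then
    echo "$f: present"
  else
    echo "$f: MISSING"
  fi
done
# Show whether /etc/kubernetes/pki is the usual symlink
ls -ld /etc/kubernetes/pki 2>/dev/null || echo "/etc/kubernetes/pki does not exist"
```

If both are missing, the certs were never distributed to the node, which points at the master being unreachable rather than at kubelet itself.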
I have the same issue.
All the nodes have the same config and package versions.
This issue causes the kubeadm cluster join to fail.
I can confirm I'm having the same issue when trying to join the cluster from a node running Ubuntu 18.04, bare metal.
Had the same issue; the firewall on the master node was blocking incoming traffic, so the nodes could not connect to the master.
Disabling the firewall fixed the issue.
How can disabling the firewall fix a missing file?
Thanks @IduVlad for sharing what helped you. Unfortunately, in my case there is no firewall; both hosts are on the same (intranet) subnet and communicate over local IPs. I am kind of lost and do not know how to tackle this problem.
@bclermont I haven't checked the code, but I assume there is an OpenSSL job on the nodes that imports the certificate from the master. If it cannot reach the master endpoint (blocked by the firewall), it won't create the missing files.
My troubleshooting scenario:
Was anyone whose firewall was disabled able to fix this?
curl <master>
is never going to work, because the master doesn't listen on port 80.
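A connectivity test that actually matches what the nodes need is to probe the API server port (6443) instead. A minimal sketch; `MASTER_IP` below is a placeholder address, substitute your first master's IP:

```shell
# The kube-apiserver listens on 6443, not 80, so test that port.
MASTER_IP=192.0.2.10   # placeholder; replace with your master's address

# -k skips certificate verification (the node doesn't trust the cluster
# CA yet); any HTTP response, even 401/403, proves the port is reachable.
curl -k --connect-timeout 5 "https://${MASTER_IP}:6443/healthz" \
  || echo "cannot reach ${MASTER_IP}:6443"
```

If this times out from a worker but succeeds from the master itself, a firewall between the hosts is the likely culprit.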
Same issue here.
There is no firewall and all nodes are on the same subnet.
I don't know how to solve this problem.
Does /etc/kubernetes/ssl/ca.crt exist? What about /etc/kubernetes/pki/ca.crt?
/etc/kubernetes/pki is a symlink to /etc/kubernetes/ssl, so it's empty
I had the same issue and I can confirm it's a firewall issue.
ca.crt was not copied to the nodes because firewalld was running on the master node(s).
After disabling the firewall I ran the playbook again; ca.crt was propagated to the nodes and they were able to join the cluster.
You can check the firewalld status with
systemctl status firewalld
and then stop it:
systemctl stop firewalld
It would be nice if the playbook checked whether a firewall is preventing the certs from being propagated and exited with a clear error.
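Rather than disabling firewalld entirely, an alternative is to open just the ports a control-plane node needs. A sketch using the standard kubeadm port list (adjust for your CNI plugin):

```shell
# Open the control-plane ports instead of stopping firewalld outright.
firewall-cmd --permanent --add-port=6443/tcp       # kube-apiserver
firewall-cmd --permanent --add-port=2379-2380/tcp  # etcd server/client
firewall-cmd --permanent --add-port=10250/tcp      # kubelet API
firewall-cmd --permanent --add-port=10251/tcp      # kube-scheduler
firewall-cmd --permanent --add-port=10252/tcp      # kube-controller-manager
firewall-cmd --reload
```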
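A sketch of the kind of preflight check the playbook could run on each master before the cert-distribution tasks (hypothetical, not part of Kubespray today):

```shell
# Warn early if firewalld is active on a master, since it may block
# the ports the workers need to fetch certs and join.
if systemctl is-active --quiet firewalld 2>/dev/null; then
  echo "WARNING: firewalld is active on this master; required ports may be blocked" >&2
else
  echo "no active firewalld detected"
fi
```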
I'm also facing the same issue while trying to set up Kubernetes on Fedora 30:
[root@localhost kubelet]# systemctl status kubelet.service
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Mon 2020-01-13 23:01:57 IST; 2s ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Process: 10229 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255/EXCEPTION)
Main PID: 10229 (code=exited, status=255/EXCEPTION)
CPU: 155ms
Jan 13 23:01:57 localhost.localdomain kubelet[10229]: F0113 23:01:57.664572 10229 server.go:253] unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
Jan 13 23:01:57 localhost.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
Jan 13 23:01:57 localhost.localdomain systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 13 23:01:57 localhost.localdomain systemd[1]: kubelet.service: Consumed 155ms CPU time.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Try running kubeadm init with --upload-certs:
kubeadm init --config=kubeadm-config.yaml --upload-certs
This is kubeadm-config.yaml:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
controlPlaneEndpoint: "192.168.170.188:6443"
networking:
  # This CIDR is a Calico default. Substitute or remove for your CNI provider.
  podSubnet: "172.22.0.0/16"
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
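For context on why --upload-certs helps here: it stores the control-plane certificates (including the CA) as an encrypted Secret in the cluster, so additional control-plane nodes can pull them during join instead of relying on files being copied over by the playbook. A sketch of the resulting flow (token, hash, and key are placeholders printed by kubeadm init, not real values):

```shell
# On additional control-plane nodes, join using the certificate key
# that `kubeadm init ... --upload-certs` printed:
#   kubeadm join 192.168.170.188:6443 --token <token> \
#     --discovery-token-ca-cert-hash sha256:<hash> \
#     --control-plane --certificate-key <key>

# The uploaded certs expire after two hours; if the key is stale,
# re-upload them from an existing master and get a fresh key:
kubeadm init phase upload-certs --upload-certs
```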
@freesaber how can this be added in Kubespray?
@fejta-bot @k8s-ci-robot please reopen the issue