Environment:
Cloud provider: AWS
OS: CentOS Linux 7
Ansible version: 2.7.10
Kubespray version: master (tag v2.9.0)
Trying to install Kubernetes on AWS using Kubespray v2.9.0.
The playbook freezes on tasks such as "initialize first master" and "join to cluster".
Logging in to the worker nodes, I found the error below:
unable to load client CA file /etc/kubernetes/ssl/ca.crt: open /etc/kubernetes/ssl/ca.crt: no such file or directory
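When hitting this error, a quick sanity check on the failing node is to look at both paths kubelet might use (the paths below are taken from the error messages in this thread; run as root on the worker):

```shell
# Check both locations the CA file is expected in; Kubespray normally
# makes /etc/kubernetes/pki a symlink to /etc/kubernetes/ssl.
for f in /etc/kubernetes/ssl/ca.crt /etc/kubernetes/pki/ca.crt; do
  if [ -e "$f" ]; then
    echo "$f: present"
  else
    echo "$f: MISSING"
  fi
done
# Show whether /etc/kubernetes/pki is the usual symlink
ls -ld /etc/kubernetes/pki 2>/dev/null || echo "/etc/kubernetes/pki does not exist"
```

If both are missing, the certs were never distributed to the node, which points at the master being unreachable rather than at kubelet itself.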
I have the same issue.
All the nodes have the same config and package versions.
This issue causes the kubeadm cluster join to fail.
I can confirm I'm having the same issue when trying to join the cluster from a node running Ubuntu 18.04, bare metal.
Had the same issue; the firewall on the master node was blocking incoming traffic, so the nodes could not connect to the master.
Disabling the firewall fixed the issue.
How can disabling the firewall fix a missing file?
Thanks @IduVlad for sharing what helped you. Unfortunately, in my case there is no firewall; both hosts are on the same (intranet) subnet and communicate over local IPs. I am kind of lost and do not know how to tackle this problem.
@bclermont I haven't checked the code, but I assume there is an OpenSSL job on the nodes that imports the certificate from the master. If it cannot reach the master endpoint (blocked by the firewall), it won't create the missing files.
My troubleshooting scenario:
Was anyone whose firewall was disabled able to fix this?
curl <master>
is never going to work, because the master doesn't listen on port 80.
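A connectivity test that actually matches what the nodes need is to probe the API server port (6443) instead. A minimal sketch; `MASTER_IP` below is a placeholder address, substitute your first master's IP:

```shell
# The kube-apiserver listens on 6443, not 80, so test that port.
MASTER_IP=192.0.2.10   # placeholder; replace with your master's address

# -k skips certificate verification (the node doesn't trust the cluster
# CA yet); any HTTP response, even 401/403, proves the port is reachable.
curl -k --connect-timeout 5 "https://${MASTER_IP}:6443/healthz" \
  || echo "cannot reach ${MASTER_IP}:6443"
```

If this times out from a worker but succeeds from the master itself, a firewall between the hosts is the likely culprit.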
Same issue here.
There is no firewall and all nodes are on the same subnet.
I don't know how to solve this problem.
Does /etc/kubernetes/ssl/ca.crt exist? What about /etc/kubernetes/pki/ca.crt?
/etc/kubernetes/pki is a symlink to /etc/kubernetes/ssl, so it's empty
I had the same issue and I can confirm it's a firewall issue.
ca.crt was not copied to the nodes because firewalld was running on the master node(s).
After disabling the firewall I ran the playbook again; ca.crt was propagated to the nodes and they were able to join the cluster.
You can check the firewalld status with
systemctl status firewalld
and then stop it:
systemctl stop firewalld
It would be nice if the playbook checked whether a firewall is preventing the certs from being propagated and exited with a clear error.
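Rather than disabling firewalld entirely, an alternative is to open just the ports a control-plane node needs. A sketch using the standard kubeadm port list (adjust for your CNI plugin):

```shell
# Open the control-plane ports instead of stopping firewalld outright.
firewall-cmd --permanent --add-port=6443/tcp       # kube-apiserver
firewall-cmd --permanent --add-port=2379-2380/tcp  # etcd server/client
firewall-cmd --permanent --add-port=10250/tcp      # kubelet API
firewall-cmd --permanent --add-port=10251/tcp      # kube-scheduler
firewall-cmd --permanent --add-port=10252/tcp      # kube-controller-manager
firewall-cmd --reload
```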
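A sketch of the kind of preflight check the playbook could run on each master before the cert-distribution tasks (hypothetical, not part of Kubespray today):

```shell
# Warn early if firewalld is active on a master, since it may block
# the ports the workers need to fetch certs and join.
if systemctl is-active --quiet firewalld 2>/dev/null; then
  echo "WARNING: firewalld is active on this master; required ports may be blocked" >&2
else
  echo "no active firewalld detected"
fi
```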
I'm also facing the same issue while trying to set up Kubernetes on Fedora 30:
[root@localhost kubelet]# systemctl status kubelet.service
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Mon 2020-01-13 23:01:57 IST; 2s ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Process: 10229 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255/EXCEPTION)
Main PID: 10229 (code=exited, status=255/EXCEPTION)
CPU: 155ms
Jan 13 23:01:57 localhost.localdomain kubelet[10229]: F0113 23:01:57.664572 10229 server.go:253] unable to load client CA file /etc/kubernetes/pki/ca.crt: open /etc/kubernetes/pki/ca.crt: no such file or directory
Jan 13 23:01:57 localhost.localdomain systemd[1]: kubelet.service: Main process exited, code=exited, status=255/EXCEPTION
Jan 13 23:01:57 localhost.localdomain systemd[1]: kubelet.service: Failed with result 'exit-code'.
Jan 13 23:01:57 localhost.localdomain systemd[1]: kubelet.service: Consumed 155ms CPU time.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Try running kubeadm init with --upload-certs:
kubeadm init --config=kubeadm-config.yaml --upload-certs
This is kubeadm-config.yaml:
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
controlPlaneEndpoint: "192.168.170.188:6443"
networking:
  # This CIDR is a Calico default. Substitute or remove for your CNI provider.
  podSubnet: "172.22.0.0/16"
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
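For context on why --upload-certs helps here: it stores the control-plane certificates (including the CA) as an encrypted Secret in the cluster, so additional control-plane nodes can pull them during join instead of relying on files being copied over by the playbook. A sketch of the resulting flow (token, hash, and key are placeholders printed by kubeadm init, not real values):

```shell
# On additional control-plane nodes, join using the certificate key
# that `kubeadm init ... --upload-certs` printed:
#   kubeadm join 192.168.170.188:6443 --token <token> \
#     --discovery-token-ca-cert-hash sha256:<hash> \
#     --control-plane --certificate-key <key>

# The uploaded certs expire after two hours; if the key is stale,
# re-upload them from an existing master and get a fresh key:
kubeadm init phase upload-certs --upload-certs
```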
@freesaber how can this be added in Kubespray?
@fejta-bot @k8s-ci-robot please reopen the issue