Kops: x509: certificate signed by unknown authority when installing cluster with kops

Created on 21 Nov 2016 · 22 Comments · Source: kubernetes/kops

Hi,

Getting this error when executing any kubectl command:
Unable to connect to the server: x509: certificate signed by unknown authority

Did some digging around and found that it is because of self-signed certificates. This can be solved by adding --insecure-skip-tls-verify=true to every kubectl command, or (the preferred way) by adding:

--kubelet-certificate-authority=/srv/kubernetes/ca.crt \
--kubelet-client-certificate=/var/run/kubernetes/kubelet.crt \
--kubelet-client-key=/var/run/kubernetes/kubelet.key 

to the kube-apiserver startup shell script.

My question: how can I get these configuration options added automatically to the kube-apiserver startup script when I install the cluster with kops?

(Or is there another way of dealing with these certificates?)
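For what it's worth, I believe kops can set apiserver flags through the cluster spec rather than by editing the startup script on the node; something like the following might be the right direction (the field names under kubeAPIServer are my guess and would need checking against the kops docs):

  kops edit cluster $CLUSTER_NAME
  # under spec:, something like (field names are assumptions, verify against the kops documentation):
  #   kubeAPIServer:
  #     kubeletCertificateAuthority: /srv/kubernetes/ca.crt
  #     kubeletClientCertificate: /var/run/kubernetes/kubelet.crt
  #     kubeletClientKey: /var/run/kubernetes/kubelet.key
  kops update cluster $CLUSTER_NAME --yes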

Most helpful comment

Update:
Removing the embedded root certificate from ~/.kube/config and running this config command:

kubectl config set-cluster ${KUBE_CONTEXT} --insecure-skip-tls-verify=true \
--server=${KUBE_CONTEXT}

is the equivalent of adding --insecure-skip-tls-verify=true to every kubectl command.

All 22 comments

Update:
Removing the embedded root certificate from ~/.kube/config and running this config command:

kubectl config set-cluster ${KUBE_CONTEXT} --insecure-skip-tls-verify=true \
--server=${KUBE_CONTEXT}

is the equivalent of adding --insecure-skip-tls-verify=true to every kubectl command.
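A less drastic alternative, if you still have the cluster CA file, is to embed it back into the kubeconfig instead of disabling verification (a sketch; the ca.crt path below is a placeholder for wherever your cluster CA lives):

  kubectl config set-cluster ${KUBE_CONTEXT} --insecure-skip-tls-verify=false
  kubectl config set-cluster ${KUBE_CONTEXT} --certificate-authority=/path/to/ca.crt --embed-certs=true

(Setting the CA and the insecure flag in one invocation may be rejected by kubectl, hence the two steps.)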

You really shouldn't have to do this. The kubecfg configuration includes the (self-signed) CA certificate and this ensures that you aren't being MITM-ed.

This sounds more like an installation problem when running kops. Were you doing anything unusual?

This should not happen. Closing pending further details, but if we get them, we should reopen.

I have been able to reproduce it lots of times. I think it happens when I destroy a cluster and, within a few minutes, re-create the same cluster again.

Finally found the source of this error. Essentially my .kube/config was getting lost, resulting in this error.
It was a bug in my scripts, outside of kops. Now that I have fixed it, I do not expect to see this error again.

if you are facing this error, try
kops export kubecfg --name $CLUSTER_NAME
hopefully that should fix it.
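For completeness, the full sequence looks roughly like this (state-store bucket and cluster name below are placeholders):

  export KOPS_STATE_STORE=s3://my-kops-state
  export CLUSTER_NAME=example.mycompany.com
  kops export kubecfg --name $CLUSTER_NAME
  kubectl get nodes   # should succeed without --insecure-skip-tls-verify once the kubeconfig is rewritten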

This will happen if you are intentionally MITMing, e.g. if you are putting your cluster behind an external system that does SSL termination with a different CA than the cluster uses.

For example, you create the cluster with the kube-generated CA but need to put the UI and API behind an IT-managed CA. Not sure how to deal with this yet.

This will happen if you recreate a cluster and you do not copy the new configuration to the regular user.

When you create a new cluster it prompts the following:

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Be sure to execute the line sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config every time you recreate your cluster.
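If you are unsure whether the copied kubeconfig still matches the cluster, a quick check is to compare the fingerprint of the CA embedded in it with the CA kubeadm generated (paths below are the kubeadm defaults; adjust if yours differ):

  # CA embedded in the kubeconfig
  grep certificate-authority-data $HOME/.kube/config | awk '{print $2}' | base64 -d | openssl x509 -noout -fingerprint
  # CA the cluster actually uses
  sudo openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -fingerprint

If the two fingerprints differ, the kubeconfig is stale and re-copying admin.conf as above fixes it.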

Getting the same issue on a fresh / blank install of kops 1.9.0.

$ kops create cluster --name=example.mycompany.com --state=s3://myproj-kubestate
I0511 12:22:20.648461   38921 create_cluster.go:1318] Using SSH public key: /Users/me/.ssh/id_rsa.pub

error reading cluster configuration "example.mycompany.com": error reading s3://myproj-kubestate/example.mycompany.com/config: error fetching s3://myproj-kubestate/example.mycompany.com/config: RequestError: send request failed
caused by: Get https://myproj-kubestate.s3.amazonaws.com/example.mycompany.com/config: x509: certificate signed by unknown authority

I'm not doing anything unusual as far as I'm aware, just following the tutorial / steps. What other info can I provide?
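Worth noting that in this case the untrusted certificate comes from the S3 endpoint, not from the cluster, which usually means a corporate proxy or other TLS-intercepting middlebox sits between the workstation and AWS. One way to check is to look at who signs the S3 certificate (bucket name taken from the error above):

  openssl s_client -connect myproj-kubestate.s3.amazonaws.com:443 </dev/null 2>/dev/null | openssl x509 -noout -issuer
  # the issuer should be an Amazon/DigiCert CA; a company-internal CA here means interception,
  # and that CA would need to be added to the local trust store (or the proxy bypassed)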

Just got the same error after installing a single node of Kubernetes using Docker Edge for Mac, and then typing $ kubectl version. It returned the client info (version, etc.), but not the server.

I've been getting this going through the Terraform demos on EKS (launching from OS X, but in theory that shouldn't matter... right?).

This one was the latest; solved (temporarily) via the --insecure-skip-tls-verify=true flag.

demo here:
https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html
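A cleaner fix on EKS may be to regenerate the kubeconfig with the AWS CLI so it embeds the cluster's current CA (cluster name and region below are placeholders):

  aws eks update-kubeconfig --name my-eks-cluster --region us-west-2
  kubectl get nodes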

I got this now on EKS

If you're using a self-signed certificate and --insecure-skip-tls-verify=true doesn't work, there is a chance that your network doesn't allow insecure self-signed certs. Try doing it over a VPN.

Run "gcloud container clusters get-credentials standard-cluster-1 --zone us-central1-a --project devops1-218400".
Here devops1-218400 is my project name; replace it with your project name.

appending --insecure-skip-tls-verify=true to the end of kubectl get all did the trick...

thanks @mayank-dixit

I resolved it by performing these steps on the /root/.kube directory:

  1. sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  2. sudo chown $(id -u):$(id -g) $HOME/.kube/config

I had tried to start the cluster using kubeadm init many times but had not updated the /root/.kube directory with the new entries, so I performed these steps when I got a certificate error. Remember that after running the first command you must answer "yes" at the overwrite prompt; otherwise it will not work.

Finally found the source of this error. Essentially my .kube/config was getting lost, resulting in this error.
It was a bug in my scripts, outside of kops. Now that I have fixed it, I do not expect to see this error again.

if you are facing this error, try
kops export kubecfg --name $CLUSTER_NAME
hopefully that should fix it.

Thank you! That's the only thing that has worked for me

To whom it might help: I got this issue because I managed to paste in the certificate (under ${HOME}/.kube/certs/...) with color codes from pygmentize in my cat command, so the cert wasn't actually correct. Entering the plain-text cert resolved the issue for me.
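If you suspect a pasted certificate got mangled like this, two quick checks are to make any non-printing characters visible and to confirm openssl can still parse the file (the path below is a placeholder, since the original one was truncated):

  cat -v $HOME/.kube/certs/ca.pem | grep -c '\^\['    # a non-zero count means stray ANSI color codes
  openssl x509 -in $HOME/.kube/certs/ca.pem -noout -subject -dates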

Error from server (Forbidden): error when retrieving current configuration of:
Resource: "/v1, Resource=serviceaccounts", GroupVersionKind: "/v1, Kind=ServiceAccount"
Name: "spinnaker-service-account", Namespace: "spinnaker"

In my case, I got this error with "kubectl version". I had installed minikube on my Linux machine, and kubectl was configured to use minikube. It got resolved when I added the minikube server (192.168.99.101 in the kube config file below) to the NO_PROXY env variable:

cat ~/.kube/config
apiVersion: v1
clusters:
- cluster:
    certificate-authority: /home/ssriram/.minikube/ca.crt
    server: https://192.168.99.101:8443
    ...
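The fix, in shell form (the IP is the minikube server address from the kubeconfig above):

  export NO_PROXY=$NO_PROXY,192.168.99.101
  kubectl version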

I noticed my problem was resolved by regenerating the certificate, which had expired/changed. It's just a matter of regenerating the kube config file. I was using an AKS (Azure Kubernetes Service) cluster, so the below command regenerated the config file.

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
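If a stale entry for the same cluster is still in ~/.kube/config, adding --overwrite-existing (as documented for az aks get-credentials) replaces it instead of leaving the old CA in place:

  az aks get-credentials --resource-group myResourceGroup --name myAKSCluster --overwrite-existing
  kubectl get nodes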

Hi, I'm getting this, but it's in the (journalctl) logs:

 k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://10.97.236.121:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dtinkerboard&limit=500&resourceVersion=0: x509: certificate signed by unknown authority

I'm intentionally trying to corrupt the kubernetes configuration / certificates, to learn how to restore everything back.

So I started changing certificate-authority-data content, then I ran:

kubeadm alpha certs renew all
mv /etc/kubernetes/admin.conf /etc/kubernetes/admin.conf.bak
kubeadm init phase kubeconfig admin

With diff I got confirmation that the only difference between the old and the new conf file was in the certificate-* entries data.

I'm stuck here though. What should I do?

I don't want to resort to --insecure-skip-tls-verify=true or similar tricks; I want to restore the cluster functionality as it was in the beginning.
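As far as I understand it, what has to happen is that every kubeconfig the control plane and kubelet use gets regenerated against the restored/renewed CA, not just admin.conf, and the components restarted. Roughly, on a kubeadm host (a sketch only; back up /etc/kubernetes first):

  # kubeadm will not overwrite existing kubeconfig files, so move the stale ones aside first
  sudo mv /etc/kubernetes/admin.conf /etc/kubernetes/admin.conf.bak
  sudo mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.bak
  sudo mv /etc/kubernetes/controller-manager.conf /etc/kubernetes/controller-manager.conf.bak
  sudo mv /etc/kubernetes/scheduler.conf /etc/kubernetes/scheduler.conf.bak
  # regenerate all of them from the current CA
  sudo kubeadm init phase kubeconfig all
  # point kubectl at the fresh admin kubeconfig
  sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
  # restart the kubelet; the static control-plane pods may also need a restart
  # (e.g. by briefly moving their manifests out of /etc/kubernetes/manifests and back)
  sudo systemctl restart kubelet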

/open
