[apiclient] Created API client, waiting for the control plane to become ready
BUG REPORT
kubeadm version (use kubeadm version):
kubeadm version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:33:17Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Environment:
Kubernetes version (use kubectl version): Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:33:17Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
OS: CentOS Linux release 7.3.1611 (Core)
Kernel (uname -a): Linux tme-lnx1-centos 3.10.0-514.21.1.el7.x86_64 #1 SMP Thu May 25 17:04:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
What happened: The API server (kube-apiserver) failed to start when issuing kubeadm init.
What you expected to happen: kubeadm init completes successfully.
On the network I'm on, a number of subdomains have 'localhost' registered as a hostname. When the API server starts, it resolves localhost through the search domain as localhost.foo.domain.com and ends up using that IP address instead of the loopback address:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ab6b449d952c gcr.io/google_containers/kube-apiserver-amd64@sha256:6d5aa429c2b0806e4b6d1d179054d6deee46eec0aabe7bd7bd6abff97be36ae7 "kube-apiserver --all" About a minute ago Exited (255) About a minute ago k8s_kube-apiserver_kube-apiserver-tme-lnx1-centos_kube-system_3a03da482c18faa7691e3f59fcfc1189_10
$ docker logs ab6b449d952c
E0613 19:23:20.246384 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *rbac.RoleBinding: Get https://localhost:6443/apis/rbac.authorization.k8s.io/v1beta1/rolebindings?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246417 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.LimitRange: Get https://localhost:6443/api/v1/limitranges?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246431 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.Namespace: Get https://localhost:6443/api/v1/namespaces?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246392 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.ResourceQuota: Get https://localhost:6443/api/v1/resourcequotas?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246372 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.ServiceAccount: Get https://localhost:6443/api/v1/serviceaccounts?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246405 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *storage.StorageClass: Get https://localhost:6443/apis/storage.k8s.io/v1beta1/storageclasses?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246681 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.Secret: Get https://localhost:6443/api/v1/secrets?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246703 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *rbac.ClusterRoleBinding: Get https://localhost:6443/apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246806 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *rbac.Role: Get https://localhost:6443/apis/rbac.authorization.k8s.io/v1beta1/roles?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
E0613 19:23:20.246880 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *rbac.ClusterRole: Get https://localhost:6443/apis/rbac.authorization.k8s.io/v1beta1/clusterroles?resourceVersion=0: dial tcp 10.12.180.36:6443: getsockopt: connection refused
[restful] 2017/06/13 19:23:20 log.go:30: [restful/swagger] listing is available at https://10.21.34.95:6443/swaggerapi/
[restful] 2017/06/13 19:23:20 log.go:30: [restful/swagger] https://10.21.34.95:6443/swaggerui/ is mapped to folder /swagger-ui/
I0613 19:23:20.381159 1 serve.go:79] Serving securely on 0.0.0.0:6443
W0613 19:23:20.383807 1 storage_extensions.go:127] third party resource sync failed: Get https://localhost:6443/apis/extensions/v1beta1/thirdpartyresources: dial tcp 10.12.180.36:6443: getsockopt: connection refused
F0613 19:23:20.383841 1 controller.go:128] Unable to perform initial IP allocation check: unable to refresh the service IP block: Get https://localhost:6443/api/v1/services: dial tcp 10.12.180.36:6443: getsockopt: connection refused
10.12.180.36 is localhost.foo.domain.com, and 10.21.34.95 is the correct public-facing IP address of the VM.
Removing the search and/or domain entries in /etc/resolv.conf that have a 'localhost' hostname registered resolves the problem, and kubeadm init succeeds after a kubeadm reset.
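For reference, this is roughly what the misresolution looks like from the shell (output abbreviated; the nameserver address is only illustrative, the rest matches the values above):

$ nslookup localhost
Server:         10.12.180.1
Address:        10.12.180.1#53

Non-authoritative answer:
Name:   localhost.foo.domain.com
Address: 10.12.180.36

On a healthy host, localhost resolves to 127.0.0.1 (or ::1), not to a routable address.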
I can confirm this issue: I hit it on three FirstVDS.ru KVMs (two with Ubuntu 16.04 and one with CentOS 7.3). @albpal also had it on myhosting.com. We got it fixed by removing the provider's DNS servers and installing dnsmasq. More details in https://github.com/kubernetes/kubeadm/issues/228#issuecomment-307158412 and the following comments.
I hope there is a fix on the kubeadm side, because sorting out what's going on after you're stuck at [apiclient] Created API client, waiting for the control plane to become ready is too hard.
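In case it helps others, the rough shape of the dnsmasq approach looks something like this (only a sketch, not the exact steps from the linked comment; 8.8.8.8 is just an example upstream, and details differ per distro, e.g. resolvconf may rewrite /etc/resolv.conf):

$ sudo apt-get install -y dnsmasq
$ echo 'server=8.8.8.8' | sudo tee -a /etc/dnsmasq.conf    # forward to a resolver that doesn't know the provider's bogus 'localhost.<domain>' records
$ sudo systemctl restart dnsmasq
$ echo 'nameserver 127.0.0.1' | sudo tee /etc/resolv.conf  # resolve through the local dnsmasq
$ nslookup localhost                                       # should now come back as 127.0.0.1

By default dnsmasq answers names it finds in /etc/hosts, so 'localhost' is served locally and the bogus upstream record never gets a chance.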
@kachkaev I'm thinking of adding a preflight check for this, i.e. fail fast if
nslookup localhost, nslookup localhost.$(hostname -d), or nslookup $(hostname) returns a non-loopback address or an address that doesn't exist in ip addr.
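In rough shell terms the idea is something like this (purely a sketch to show the intent, not actual kubeadm code, and the nslookup parsing is deliberately naive):

for name in localhost "localhost.$(hostname -d)" "$(hostname)"; do
  # first answer address; the DNS server's own "Address: x#53" line is skipped via the '#'
  addr=$(nslookup "$name" 2>/dev/null | awk '/^Address/ && $2 !~ /#/ { print $2; exit }')
  [ -z "$addr" ] && continue
  case "$addr" in 127.*|::1) continue ;; esac   # loopback is fine
  if ! ip addr | grep -qF " ${addr}/"; then
    echo "WARNING: ${name} resolves to ${addr}, which is neither loopback nor an address on this host"
  fi
done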
How does that sound to you?
Thanks to chatting with @drajen I now have a clearer understanding of the problem domain; I had never encountered such a setup myself before.
@luxas my k8s experience is only a couple of weeks, so I'm not sure I can be a good adviser here. But I agree that a preflight check would be a good start! It's also important to point people who fail the check to a good explanation of what's happening, so that they can either install dnsmasq (like I did) or configure their DNS server (if they have access to it).
If you fancy experimenting, I can share root access to two small KVMs on FirstVDS.ru, one with Ubuntu and the other with CentOS 7.3. I rented them to experiment with kubeadm and they are paid up for another couple of weeks in any case. Just DM me on Twitter or send an email.
Bump, also experiencing this.
Also, if 'localhost' appears in DNS after kubeadm has initted, the cluster behaves unexpectedly in communication between hosts.
I ran into this too. I am using Ubuntu, but it's pretty much the same problem.
One thing maybe worth pointing out is that it's not kubeadm that's failing per se; this seems to be a Kubernetes-in-general thing. I'm not even sure it is a bug in the API server, since it just uses the normal Go resolver, which takes (and indeed should take) /etc/resolv.conf into account.
If you have something invalid in there, it will obviously fail.
However, kubeadm could be more user-friendly by detecting such a misconfiguration and failing fast; that's something we're definitely going to take into account.
@luxas is there any open issue that you know of on the Kubernetes side?
@vascofg Not any that I know of, but there might be.
You can ask in #sig-api-machinery on Slack...
This will be fixed in v1.7 thanks to https://github.com/kubernetes/kubernetes/pull/46772 :tada:
Happy to pick this one up.
This is fixed in the latest v1.6 release thanks to https://github.com/kubernetes/kubernetes/pull/48875 and in v1.7 thanks to https://github.com/kubernetes/kubernetes/pull/46772
I'm leaving a comment about this here as I ran into the same issue with a machine called 'localhost' registered on the enterprise network. In my case, all I had to do was change the 'search' line in /etc/resolv.conf to start with 'localdomain' before any corporate domain names. This makes 'nslookup localhost' always search localhost.localdomain first, which resolves to the loopback address.
Although this is fixed in later releases, the software I was installing is a black-box K8s solution with no option for upgrading Kubernetes or altering any of the config.
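For anyone else stuck on a build without the fix, the change is literally just reordering the search line in /etc/resolv.conf (corp.example.com stands in for the real corporate domain and the nameserver is illustrative):

Before:
search corp.example.com
nameserver 10.0.0.2

After:
search localdomain corp.example.com
nameserver 10.0.0.2

With localdomain listed first, a lookup for the bare name localhost tries localhost.localdomain before localhost.corp.example.com, and (as described above) that name resolves to the loopback address, so the bogus corporate record is never reached.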