BUG REPORT:
kubeadm version (use kubeadm version):
1.6.1
Environment:
ubuntu
## What happened?
When joining the cluster via `kubeadm join --token d77f50.ccc501bafbaa4179 myip.118.240.130:6443` on a kube worker, I get this message and it hangs here:
```
[discovery] Created cluster-info discovery client, requesting info from "https://myip.118.240.130:6443"
```
Port 6443 should be open on the master; I telnetted to it from the worker and it connected.
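The telnet check above can be scripted. This is a minimal sketch, not part of kubeadm; the function name `port_is_open` and the 3-second default timeout are assumptions made for illustration.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    Hypothetical helper: it performs the same check as `telnet host port`.
    """
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_is_open("<master-ip>", 6443)` from the worker should return `True` when the API server port is reachable. Note that a successful connect only proves the TCP handshake completes; it says nothing about whether the TLS and HTTP exchange that follows will succeed.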
Looking at journalctl on the worker, I see this:
```
Apr 11 22:26:23 phil-ubu-worker-1 kubelet[13231]: error: failed to run Kubelet: invalid kubeconfig: stat /etc/kubernetes/kubelet.conf: no such file or directory
Apr 11 22:26:23 phil-ubu-worker-1 kubelet[13231]: I0411 22:26:23.111145 13231 feature_gate.go:144] feature gates: map[]
Apr 11 22:26:22 phil-ubu-worker-1 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Apr 11 22:26:22 phil-ubu-worker-1 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Apr 11 22:26:22 phil-ubu-worker-1 systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Apr 11 22:26:12 phil-ubu-worker-1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Apr 11 22:26:12 phil-ubu-worker-1 systemd[1]: kubelet.service: Unit entered failed state.
Apr 11 22:26:12 phil-ubu-worker-1 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
```
On the master:
```
kubectl --kubeconfig ./admin.conf get nodes
NAME STATUS AGE
phil-ubu NotReady 1h
```
Master kubelet log:
```
Apr 11 22:39:42 phil-ubu kubelet[10694]: E0411 22:39:42.727630 10694 kubelet.go:2067] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Apr 11 22:39:42 phil-ubu kubelet[10694]: W0411 22:39:42.726912 10694 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
```
I expected the worker to join the cluster.
I followed the instructions here: https://kubernetes.io/docs/getting-started-guides/kubeadm/
@pswenson Couple of things to check: What pod networking provider and what version are you using?
Flannel, via `kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml`
which has: quay.io/coreos/flannel:v0.7.1-amd64
Looks like kube-dns is not running. What is the output of `kubectl get pods -n kube-system --kubeconfig ./admin.conf`?
Well, kube-dns won't run until the network is ready. That flannel manifest is not the one you should use with 1.6, but I don't know where the 1.6 manifest for flannel is. You can get Weave Net with `kubectl apply -f https://git.io/weave-kube-1.6`.
Looks like you need a separate file to handle RBAC: https://github.com/tomdee/flannel/blob/743bafee48b69a3a3f79e37bc806d741715f1dd2/Documentation/kube-flannel-rbac.yml
@pswenson Did that work for you?
@coeki Trying with weave first... @errordeveloper I still have the same issue with weave:
```
[discovery] Trying to connect to API Server "MYIP.118.240.157:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://MYIP.118.240.157:6443"
```
It hangs.
This is a curl to the master from the worker:
curl -v --insecure https://MYIP.118.240.157:6443
* Rebuilt URL to: https://MYIP.118.240.157:6443/
* Trying MYIP.118.240.157...
* Connected to MYIP.118.240.157 (MYIP.118.240.157) port 6443 (#0)
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
* found 692 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
* server certificate verification SKIPPED
* server certificate status verification SKIPPED
* common name: kube-apiserver (matched)
* server certificate expiration date OK
* server certificate activation date OK
* certificate public key: RSA
* certificate version: #3
* subject: CN=kube-apiserver
* start date: Fri, 21 Apr 2017 19:38:41 GMT
* expire date: Sat, 21 Apr 2018 19:38:41 GMT
* issuer: CN=kubernetes
* compression: NULL
* ALPN, server accepted to use http/1.1
> GET / HTTP/1.1
> Host: MYIP.118.240.157:6443
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Content-Type: text/plain
< X-Content-Type-Options: nosniff
< Date: Fri, 21 Apr 2017 19:56:25 GMT
< Content-Length: 57
<
* Connection #0 to host 96.118.240.157 left intact
So they can talk to each other, but something is going wrong in that call...
What is the workflow? What happens during the call below?
[discovery] Created cluster-info discovery client, requesting info from "https://myip.118.240.130:6443"
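The thread never answers this question directly. As far as I understand token-based discovery in kubeadm 1.6, this step requests the `cluster-info` ConfigMap from the `kube-public` namespace over HTTPS and then validates it against the join token. The sketch below only reproduces the request part by hand so it can be tested outside kubeadm. The function names are hypothetical, and skipping certificate verification mirrors the `curl --insecure` call above rather than what kubeadm itself does.

```python
import ssl
import urllib.request
from urllib.parse import urlunsplit

# Assumed path of the ConfigMap that token discovery reads.
CLUSTER_INFO_PATH = "/api/v1/namespaces/kube-public/configmaps/cluster-info"


def cluster_info_url(api_endpoint: str) -> str:
    """Build the cluster-info URL for an endpoint given as 'host:port'."""
    return urlunsplit(("https", api_endpoint, CLUSTER_INFO_PATH, "", ""))


def fetch_cluster_info(api_endpoint: str, timeout: float = 10.0) -> bytes:
    """Fetch the ConfigMap anonymously, without verifying the server cert.

    Debugging aid only: a hang or timeout here points at the network path,
    not at the token.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    url = cluster_info_url(api_endpoint)
    with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
        return resp.read()
```

If `fetch_cluster_info("<master-ip>:6443")` returns JSON quickly while `kubeadm join` still hangs, the network path is probably fine and the token or the clocks deserve a look. If it stalls after the connection is established, suspect something on the path dropping larger packets.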
@coeki I got the same result with your suggestions... Is the pod network even needed to get past this step?
[discovery] Created cluster-info discovery client, requesting info from "https://myip.118.240.130:6443"
Update: I just tested kubeadm with Weave Net on our old OpenStack env, which is configured with the exact same network configuration. It works fine.
So there is something non-obvious that is different about this new environment. My problem is that I don't see a way to debug this...
Closing... It turned out that the MTU was set up inconsistently in our OpenStack, so the API join packet was being dropped. kubeadm had nothing to do with the problem.
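For anyone debugging a similar setup, one common way to probe for an MTU mismatch is to send pings with the "don't fragment" bit set and a payload sized to exactly fill the expected MTU. The sketch below only computes the payload size and builds the command line for Linux `ping` (iputils). The helper names are hypothetical, and the header sizes assume IPv4 without options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    overhead = IPV4_HEADER_BYTES + ICMP_HEADER_BYTES
    if mtu <= overhead:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return mtu - overhead


def ping_probe_command(host: str, mtu: int, count: int = 3) -> list[str]:
    """Build a Linux ping command that fails if the path MTU is below `mtu`.

    `-M do` forbids fragmentation, `-s` sets the payload size.
    """
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(max_ping_payload(mtu)), host]
```

With an interface MTU of 1500, `max_ping_payload(1500)` gives 1472. If the resulting ping fails between worker and master while smaller payloads succeed, some hop on the path has a smaller MTU than the endpoints assume.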
Hi guys, although this is two years late, I still ran into the problem, and I fixed it. I think it is easy to hit this problem when working with multiple machines.
It's not about the network, it's about the time on the different machines. A clock difference sometimes makes the [token] out of date. So, try synchronizing the time across your cluster.
Thanks @MichaleWong! That's the right answer for my failing case. Clocks need to be synchronized before joining.
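A quick way to spot clock drift between a worker and the master, without logging in to both, is to compare the worker's clock with the `Date` header the API server returns (it is visible in the curl output earlier in the thread). This is a rough sketch: the function name is hypothetical, and the header only has one-second resolution, so it is useful for detecting minutes of drift, not milliseconds.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Return local time minus server time, in seconds.

    `date_header` is an HTTP Date value such as
    'Fri, 21 Apr 2017 19:56:25 GMT'; `local_now` must be timezone-aware.
    A positive result means the local clock is ahead of the server.
    """
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


if __name__ == "__main__":
    # Example with the header from the curl output and a made-up local time.
    header = "Fri, 21 Apr 2017 19:56:25 GMT"
    now = datetime(2017, 4, 21, 19, 56, 55, tzinfo=timezone.utc)
    print(clock_skew_seconds(header, now))
```

In practice `local_now` would be `datetime.now(timezone.utc)` taken right after the request. If the skew is large, syncing the nodes with NTP (for example via chrony or systemd-timesyncd) before running `kubeadm join` is the fix the comments above point to.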