Kubeadm: Kubeadm join fail

Created on 12 Jul 2017 · 12 Comments · Source: kubernetes/kubeadm

What happened?

Hello, I have an issue when I try to do a kubeadm join. It appears to succeed, because I see:

[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "X.X.X.X:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://X.X.X.X:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://X.X.X.X:6443"
[discovery] Successfully established connection with API Server "X.X.X.X:6443"
[bootstrap] Detected server version: v1.7.0
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"

Node join complete:

  • Certificate signing request sent to master and response
    received.
  • Kubelet informed of new secure connection details.

But on the master node, when I do a kubectl get nodes, I only see my master node.

Versions

kubeadm version (use kubeadm version): v1.7.0

Environment:

  • Kubernetes version (use kubectl version): v1.7.0
  • OS (e.g. from /etc/os-release): CentOS 7
  • Kernel (e.g. uname -a): 3.10.0-327.36.3.el7.x86_64

What you expected to happen?

The worker node to connect to the master.

Thanks for your help

All 12 comments

Can you output the logs of the kubelet?
journalctl -xeu kubelet

Problem solved ! :)
With journalctl -xeu kubelet I see:
error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"

I simply changed my Docker systemd unit file to add:

ExecStart=
ExecStart=/usr/bin/dockerd --exec-opt native.cgroupdriver=systemd

After restarting Docker and retrying kubeadm join, the worker appears when I do kubectl get nodes.

Thanks for your help !
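For anyone hitting the same mismatch: the ExecStart override normally lives in a systemd drop-in for docker.service rather than the main unit file. A minimal sketch, assuming a hypothetical drop-in name (any *.conf under docker.service.d works):

# /etc/systemd/system/docker.service.d/10-cgroup-driver.conf (hypothetical name)
[Service]
# The empty ExecStart= clears the command inherited from docker.service;
# the second line starts dockerd with the systemd cgroup driver so it
# matches the kubelet's cgroup driver.
ExecStart=
ExecStart=/usr/bin/dockerd --exec-opt native.cgroupdriver=systemd

Then run systemctl daemon-reload and systemctl restart docker so the override takes effect.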

@piersbarrios I have created an issue for a problem like that: https://github.com/kubernetes/kubernetes/issues/48798

But after removing $KUBELET_NETWORK_ARGS in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf it seems to be working.
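For reference, that edit amounts to deleting $KUBELET_NETWORK_ARGS from the kubelet's ExecStart line in the drop-in. A sketch, assuming a v1.7-era 10-kubeadm.conf (the exact variable list varies by kubeadm version, and note that dropping this variable removes the CNI network-plugin arguments):

# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Before (illustrative):
#   ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS
# After removing $KUBELET_NETWORK_ARGS:
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_DNS_ARGS $KUBELET_AUTHZ_ARGS $KUBELET_EXTRA_ARGS

followed by systemctl daemon-reload and systemctl restart kubelet.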

@PLoic: I believe I have another issue here, because my logs now show:

Jul 13 15:23:52 k8s-Node171 kubelet[5443]: I0713 15:23:52.666752    5443 kubelet_node_status.go:247] Setting node annotation to enable volume controller attach/detach
Jul 13 15:23:52 k8s-Node171 kubelet[5443]: I0713 15:23:52.669988    5443 kubelet_node_status.go:82] Attempting to register node k8s-node171
Jul 13 15:23:52 k8s-Node171 kubelet[5443]: E0713 15:23:52.671312    5443 kubelet_node_status.go:106] Unable to register node "k8s-node171" with API server: nodes "k8s-node171" is forbidden: node k8s-Node171 cannot modify node k8s-node171
Jul 13 15:23:59 k8s-Node171 kubelet[5443]: I0713 15:23:59.671580    5443 kubelet_node_status.go:247] Setting node annotation to enable volume controller attach/detach

etc...

I am using weave btw
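Note the case mismatch in that log: the node's credential is for k8s-Node171 while the kubelet registers the lowercased name k8s-node171. A speculative check and fix, assuming the mixed-case hostname is the cause:

# Check the current hostname (the kubelet lowercases it when registering):
hostnamectl status
# Speculative fix: set an all-lowercase hostname before joining
hostnamectl set-hostname k8s-node171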

Same error as @piersbarrios:

Jul 14 13:23:27 SZV1000204813 kubelet[120170]: I0714 13:23:27.314297 120170 kubelet_node_status.go:247] Setting node annotation to enable volume controller attach/detach
Jul 14 13:23:27 SZV1000204813 kubelet[120170]: I0714 13:23:27.316602 120170 kubelet_node_status.go:82] Attempting to register node szv1000204813
Jul 14 13:23:27 SZV1000204813 kubelet[120170]: E0714 13:23:27.318670 120170 kubelet_node_status.go:106] Unable to register node "szv1000204813" with API server: nodes "szv1000204813" is forb
Jul 14 13:23:27 SZV1000204813 kubelet[120170]: E0714 13:23:27.571840 120170 eviction_manager.go:238] eviction manager: unexpected err: failed GetNode: node 'szv1000204813' not found

I am using Ubuntu 16.04 and Flannel.

This should be fixed now. Please use v1.7.1 and reopen if you can still reproduce the issue...

Dear all,

For v1.7.1 we are facing a problem joining a node to the cluster by IP address, using the command "kubeadm --token 8c2350.f55343444a6ffc46 join X.X.X.X:6443", with an error like below:

kubeadm join kubernetes-ms:6443 --token 8c2350.f55343444a6ffc46
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.06.0-ce. Max validated version: 1.12
[preflight] WARNING: hostname "" could not be reached
[preflight] WARNING: hostname "" lookup : no such host
[preflight] Some fatal errors occurred:
hostname "" a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is 'a-z0-9?(.a-z0-9?)*')
[preflight] If you know what you are doing, you can skip pre-flight checks with --skip-preflight-checks

Lab description (Docker, kubelet, kubectl, and kubeadm are installed on all nodes):

Machine name   Role     IP address
kubeserve-ms   Master   192.168.99.200
kubeserve-1    Node     192.168.99.201
kubeserve-2    Node     192.168.99.202

  1. (kubeserve-ms) Initialize the cluster (as root):
    kubeadm init --pod-network-cidr=10.244.0.0/16 --token 8c2350.f55343444a6ffc46

  2. (kubeserve-ms) Set up cluster access (as a regular user):
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

  3. (kubeserve-ms) Init the cluster (as root):
    sudo su -
    kubeadm init --pod-network-cidr=10.244.0.0/16 --token 8c2350.f55343444a6ffc46

  4. (kubeserve-ms) Apply the Weave network add-on:
    kubectl apply -n kube-system -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

  5. (kubeserve-1, kubeserve-2) Join the nodes:
    kubeadm --token 8c2350.f55343444a6ffc46 join 192.168.99.200:6443
    Result
    kubeadm join kubernetes-ms:6443 --token 8c2350.f55343444a6ffc46
    [kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
    [preflight] Running pre-flight checks
    [preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.06.0-ce. Max validated version: 1.12
    [preflight] WARNING: hostname "" could not be reached
    [preflight] WARNING: hostname "" lookup : no such host
    [preflight] Some fatal errors occurred: hostname "" a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')
    [preflight] If you know what you are doing, you can skip pre-flight checks with --skip-preflight-checks

Currently my only workaround is to switch back to v1.7.0, which works fine.

@praparn See: https://github.com/kubernetes/kubeadm/issues/347
It will be fixed in v1.7.2
Meanwhile, you can just set --skip-preflight-checks
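With the token and master IP from the lab above, the workaround join command would look like:

kubeadm join --token 8c2350.f55343444a6ffc46 192.168.99.200:6443 --skip-preflight-checks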

@luxas Noted with thanks.

@luxas : This wasn't fixed in v1.7.2.
(I don't know if that was intentional or not)

Not fixed for me either.

kubeadm version: &version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.2", GitCommit:"922a86cfcd65915a9b2f69f3f193b8907d741d9c", GitTreeState:"clean", BuildDate:"2017-07-21T08:08:00Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

uname -a
Linux k8-node-01 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Ditto... Running 1.7.2 and the join fails. Passing --skip-preflight-checks seems to indicate that the join was successful, but the master never detects the node and none of the required images get downloaded on the worker nodes, so I imagine it must also be a bug.

PS: I am joining an RPi3 to an x86 master. I went ahead and created the cluster with Kubernetes v1.7.0, which seems to work OK.

I met this error too, but in my case it was not related to the cgroup driver.

When the kubelet service is configured with

--feature-gates=RotateKubeletClientCertificate=true,RotateKubeletClientCertificate=true

the command kubeadm join looks like it succeeds, but the master node cannot find the joined node,

and journalctl -efu kubelet gives the following messages:

Aug 25 04:14:12 storage3 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Aug 25 04:14:12 storage3 systemd[1]: Starting kubelet: The Kubernetes Node Agent...
Aug 25 04:14:12 storage3 kubelet[14516]: I0825 04:14:12.663893   14516 feature_gate.go:144] feature gates: map[RotateKubeletClientCertificate:true]
Aug 25 04:14:12 storage3 kubelet[14516]: I0825 04:14:12.682075   14516 certificate_manager.go:355] Requesting new certificate.

It looks like it's hanging while requesting new certificates.

It's not an emergency, as those features are in alpha; after disabling them everything is OK. Just recording this for reference.
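For completeness, disabling the gate just means removing it from the kubelet flags or setting it to false; an illustrative drop-in line (the file and variable name are assumptions, kubeadm setups commonly route extra flags through KUBELET_EXTRA_ARGS):

# e.g. in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (illustrative)
Environment="KUBELET_EXTRA_ARGS=--feature-gates=RotateKubeletClientCertificate=false"

followed by systemctl daemon-reload and systemctl restart kubelet.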
