Problem:
Cannot create HA cluster by following steps provided under "Stacked control plane nodes."
Proposed Solution:
I am in the process of learning Kubernetes, so I don't know exactly what is wrong or how to fix it.
Page to Update:
https://kubernetes.io/...
Kubernetes Version:
1.12.0
First attempt:
Setting up the second control plane failed at this step:
kubeadm alpha phase kubelet write-env-file --config kubeadm-config.yaml
Output:
didn't recognize types with GroupVersionKind: [kubeadm.k8s.io/v1alpha3, Kind=ClusterConfiguration]
As a workaround, I replaced kubeadm.k8s.io/v1alpha3 with kubeadm.k8s.io/v1alpha2 and ClusterConfiguration with MasterConfiguration in the kubeadm-config.yaml
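For anyone applying the same workaround, a runnable sketch of that substitution (the stand-in config below is illustrative, not the poster's actual file):

```shell
# Build a tiny stand-in kubeadm config so the sketch is self-contained;
# on a real node you would edit your existing kubeadm-config.yaml.
printf '%s\n' \
  'apiVersion: kubeadm.k8s.io/v1alpha3' \
  'kind: ClusterConfiguration' \
  'kubernetesVersion: v1.12.0' > kubeadm-config.yaml

# Downgrade to the v1alpha2 schema, as described in the workaround above.
sed -i \
  -e 's|kubeadm.k8s.io/v1alpha3|kubeadm.k8s.io/v1alpha2|' \
  -e 's|ClusterConfiguration|MasterConfiguration|' \
  kubeadm-config.yaml

cat kubeadm-config.yaml
```

Note that v1alpha2 is the older (v1.11-era) config schema, so this is a stopgap rather than a fix.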
Second attempt:
kubeadm alpha phase mark-master --config kubeadm-config.yaml times out. I tried this several times while debugging and noticed that everything seems to break after running:
kubectl exec -n kube-system etcd-${CP0_HOSTNAME} -- etcdctl \
  --ca-file /etc/kubernetes/pki/etcd/ca.crt \
  --cert-file /etc/kubernetes/pki/etcd/peer.crt \
  --key-file /etc/kubernetes/pki/etcd/peer.key \
  --endpoints=https://${CP0_IP}:2379 \
  member add ${CP2_HOSTNAME} https://${CP2_IP}:2380
Running docker ps on the first control plane, I notice that new containers keep getting created (I assume they keep crashing).
Third attempt:
Tried the proposed solution in #9526. This time there are no errors or timeouts from executing the commands, but the kube controller managers and schedulers on the second and third control planes appear to be crashing:
kubectl get pods --all-namespaces
NAMESPACE     NAME                               READY   STATUS             RESTARTS   AGE
kube-system   calico-node-279cq                  2/2     Running            0          22m
kube-system   calico-node-4ggxh                  2/2     Running            0          22m
kube-system   calico-node-pzz6x                  2/2     Running            0          22m
kube-system   coredns-576cbf47c7-jf9v8           1/1     Running            0          48m
kube-system   coredns-576cbf47c7-t6x8j           1/1     Running            0          48m
kube-system   etcd-REDACTED                      1/1     Running            1          47m
kube-system   etcd-REDACTED                      1/1     Running            0          41m
kube-system   etcd-REDACTED                      1/1     Running            0          25m
kube-system   kube-apiserver-REDACTED            1/1     Running            0          47m
kube-system   kube-apiserver-REDACTED            1/1     Running            0          41m
kube-system   kube-apiserver-REDACTED            1/1     Running            0          25m
kube-system   kube-controller-manager-REDACTED   1/1     Running            1          47m
kube-system   kube-controller-manager-REDACTED   0/1     CrashLoopBackOff   13         42m
kube-system   kube-controller-manager-REDACTED   0/1     CrashLoopBackOff   9          26m
kube-system   kube-proxy-5z77x                   1/1     Running            0          48m
kube-system   kube-proxy-fljtd                   1/1     Running            0          26m
kube-system   kube-proxy-tlc2s                   1/1     Running            0          42m
kube-system   kube-scheduler-REDACTED            1/1     Running            1          47m
kube-system   kube-scheduler-REDACTED            0/1     CrashLoopBackOff   13         42m
kube-system   kube-scheduler-REDACTED            0/1     CrashLoopBackOff   10         26m
It looks like they have no configuration:
kubectl --namespace=kube-system logs kube-controller-manager-REDACTED
Flag --address has been deprecated, see --bind-address instead.
I1001 18:39:51.847438 1 serving.go:293] Generated self-signed cert (/var/run/kubernetes/kube-controller-manager.crt, /var/run/kubernetes/kube-controller-manager.key)
invalid configuration: no configuration has been provided
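That "no configuration has been provided" message is what client-go emits when the kubeconfig it is pointed at is empty or missing, so one thing worth checking on the crashing nodes is whether the component kubeconfigs exist. The paths below are the kubeadm defaults; treat this as a hedged sketch, not a confirmed diagnosis:

```shell
# Check that the kubeconfigs kubeadm normally writes for the controller
# manager and scheduler exist and are non-empty on this node.
for f in /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf; do
  if [ -s "$f" ]; then
    echo "ok: $f"
  else
    echo "missing or empty: $f"
  fi
done
```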
Anyone know any quick fixes for my issues? I'd be happy to provide more information, but I'm not sure where to look.
/kind bug
@kubernetes/sig-cluster-lifecycle
/sig cluster-lifecycle
I ran into the same issue using debian 9, docker 17.3.3, and flannel.
I looked at these docs to see if I could figure out more about the file. I ultimately skipped the step, as I wasn't able to find any kubelet config files written to cp0 that weren't also on the other two hosts.
I was able to get through the rest of it and ended up with an almost-working cluster: the control plane and etcd were working, and cp0 ran the local manifests but stayed in "NotReady" state because the CNI config was not present.
Tried again on CoreOS 1855.4.0 and had a similar experience. It seems like a circular dependency between the first host and the CNI plugin. It works if you add the file /etc/cni/net.d/10-flannel.conflist to cp0 manually.
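For reference, here is roughly what that file contains. This mirrors the conflist the flannel DaemonSet normally installs; treat it as a sketch and compare it against the kube-flannel manifest for your version before using it:

```json
{
  "name": "cbr0",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}
```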
I had the same issue.
My context:
As a workaround, I did that step (kubeadm alpha phase kubelet write-env-file --config kubeadm-config.yaml) manually, using the file generated on cp0:
instead of running the step, copy the file /var/lib/kubelet/kubeadm-flags.env from cp0 to cp1 at the same location.
For the record, my file looks like this:
/var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS=--cgroup-driver=cgroupfs --network-plugin=cni --resolv-conf=/run/systemd/resolve/resolv.conf
@eturpin @ztec @gclyatt: Just to clarify, you are seeing this when using v1.12.0 of kubeadm?
@detiber for me, yes.
@detiber
Yes. Using the package from http://apt.kubernetes.io/ kubernetes-xenial main.
os: Ubuntu 16.04.5
arch: amd64
Kubernetes version: 1.12.0
@detiber I saw this on v1.12.0 of kubeadm and just tried again with v1.12.1 and had same result with the write-env-file step.
I have the same issue on CentOS 7.5 with v1.12.1.
Can I just copy the file from CP0 to CP1 and move on?
I get a timeout when promoting CP1 to master if I do.
Also on CentOS 7.5 and v1.12.1, same issue as mentioned above.
I created /var/lib/kubelet/kubeadm-flags.env from cp0's copy, and now I can continue.
NAME                    STATUS     ROLES    AGE     VERSION
lt1-k8s-04.blah.co.za   Ready      master   37m     v1.12.1
lt1-k8s-05.blah.co.za   NotReady   master   5m25s   v1.12.1
/assign
I had the same issue :
OS: Ubuntu 16.04.4 LTS
Arch: amd64
Kubernetes version: 1.12.1
Issue on second master
I'm experiencing the same problem with Kubernetes 1.12.1 and Ubuntu 18.04.
Facing same issue with CentOS & Kubernetes 1.12.1
Please advise.
@pankajpandey9, the workaround outlined by @ztec a few posts up works.
For me, the workaround (manually creating the /var/lib/kubelet/kubeadm-flags.env file on the other nodes) allowed the rest of the commands in the documentation to complete.
However, the resulting cluster was still (mostly) broken. kube-controller-manager and kube-scheduler on the other nodes were in a continual crashing loop.
closing in favor of:
https://github.com/kubernetes/kubeadm/issues/1171
^ has the actual cause + solution defined too.
this is a kubeadm issue and we shouldn't track it in the website repo.
/close
@neolit123: Closing this issue.
In response to this:
closing in favor of:
https://github.com/kubernetes/kubeadm/issues/1171
^ has the actual cause + solution defined too.
this is a kubeadm issue and we shouldn't track it in the website repo.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.