Kind: ERROR: failed to create cluster: failed to init node with kubeadm

Created on 23 Mar 2020 · 19 comments · Source: kubernetes-sigs/kind

What happened:
I ran kind create cluster and it printed:
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --ignore-preflight-errors=all --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

What you expected to happen:
The cluster should be created successfully.

How to reproduce it (as minimally and precisely as possible):
Mac vagrant centos/7
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:27:04 2020
OS/Arch: linux/amd64
Experimental: false

Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b
Built: Wed Mar 11 01:25:42 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
Anything else we need to know?:

Environment:

  • kind version: (use kind version): kind v0.7.0 go1.13.6 linux/amd64
  • Kubernetes version: (use kubectl version): Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.4", GitCommit:"8d8aa39598534325ad77120c120a22b3a990b5ea", GitTreeState:"clean", BuildDate:"2020-03-12T21:03:42Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
    The connection to the server localhost:8080 was refused - did you specify the right host or port?
  • Docker version: (use docker info): 19.03.8
  • OS (e.g. from /etc/os-release): CentOS Linux release 7.6.1810 (Core)
Labels: kind/support, triage/needs-information

Most helpful comment

kind create cluster --retain -v 1
kind export logs
lots of log files there.

check against https://kind.sigs.k8s.io/docs/user/known-issues/

All 19 comments

can you run kind with -v 1 when creating the cluster?

Mac vagrant centos/7

is that docker inside vagrant on mac? any reason not to use docker desktop?

do you have enough disk space in the VM? this error usually occurs when most docker things would have failed because the host is not capable.

Same error

ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --ignore-preflight-errors=all --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

I'm going to need more information about your failure and your host environment to know why it isn't working for you, but common reasons are:

  • using an unsupported version of docker (I mean supported by docker upstream... some of the older versions etc. have issues)
  • not having enough disk / memory / CPU available to docker to run a Kubernetes node (generally disk, occasionally memory; you need something like 600MB of memory for one node and at least 2GB of disk space to pull the node image and have some room to work afterwards); a quick sanity check is sketched after this comment
  • using a filesystem that doesn't work out of the box with docker in docker https://github.com/kubernetes-sigs/kind/issues/1416#issuecomment-600438973
  • using a host environment that cannot nest containers like this (e.g. crostini) https://github.com/kubernetes-sigs/kind/issues/763

please check the known issues

This error just means that kubeadm was unable to bring up the control plane successfully, which generally means the host environment is unhealthy.
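
A quick way to sanity-check the resources docker actually has before retrying (a rough sketch; the paths and numbers here are illustrative, not thresholds kind enforces):

docker info --format 'CPUs: {{.NCPU}}  Memory: {{.MemTotal}} bytes'   # CPU and memory visible to docker
docker system df                                                      # space already consumed by images and containers
df -h /var/lib/docker    # free disk where docker stores data (path may differ, e.g. inside the Docker Desktop VM)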

@BenTheElder I appreciate your help.

I'm getting that error when I'm trying to create a cluster (one master node and two worker nodes) using a config file.

  • Logs on master-node
    INFO: ensuring we can execute /bin/mount even with userns-remap
    INFO: remounting /sys read-only
    INFO: making mounts shared
    INFO: fix cgroup mounts for all subsystems
    INFO: clearing and regenerating /etc/machine-id
    Initializing machine ID from random generator.
    INFO: faking /sys/class/dmi/id/product_name to be "kind"
    INFO: faking /sys/class/dmi/id/product_uuid to be random
    INFO: faking /sys/devices/virtual/dmi/id/product_uuid as well
    Failed to find module 'autofs4'
    systemd 242 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
    Detected virtualization docker.
    Detected architecture x86-64.
    Failed to create symlink /sys/fs/cgroup/cpu: File exists
    Failed to create symlink /sys/fs/cgroup/cpuacct: File exists
    Failed to create symlink /sys/fs/cgroup/net_cls: File exists
    Failed to create symlink /sys/fs/cgroup/net_prio: File exists
    Welcome to Ubuntu 19.10!
    Set hostname to .
    Failed to bump fs.file-max, ignoring: Invalid argument
    Configuration file /kind/systemd/kubelet.service is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
    Configuration file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
    [ OK ] Listening on Journal Socket (/dev/log).
    [ OK ] Reached target Swap.
    [ OK ] Reached target Slices.
    [ OK ] Listening on Journal Socket.
    Mounting FUSE Control File System...
    Mounting Kernel Debug File System...
    Starting Apply Kernel Variables...
    Mounting Huge Pages File System...
    [ OK ] Listening on Journal Audit Socket.
    [ OK ] Reached target Sockets.
    Starting Journal Service...
    Starting Remount Root and Kernel File Systems...
    [UNSUPP] Starting of Arbitrary Exec…Automount Point not supported.
    [ OK ] Started Dispatch Password …ts to Console Directory Watch.
    [ OK ] Reached target Local Encrypted Volumes.
    [ OK ] Reached target Paths.
    Starting Create list of re…odes for the current kernel...
    [ OK ] Started Create list of req… nodes for the current kernel.
    [ OK ] Started Remount Root and Kernel File Systems.
    Starting Update UTMP about System Boot/Shutdown...
    Starting Create System Users...
    [ OK ] Started Update UTMP about System Boot/Shutdown.
    [ OK ] Mounted Kernel Debug File System.
    [ OK ] Mounted FUSE Control File System.
    [ OK ] Started Apply Kernel Variables.
    [ OK ] Mounted Huge Pages File System.
    [ OK ] Started Create System Users.
    Starting Create Static Device Nodes in /dev...
    [ OK ] Started Create Static Device Nodes in /dev.
    [ OK ] Reached target Local File Systems (Pre).
    [ OK ] Reached target Local File Systems.
    [ OK ] Started Journal Service.
    Starting Flush Journal to Persistent Storage...
    [ OK ] Reached target System Initialization.
    [ OK ] Started Daily Cleanup of Temporary Directories.
    [ OK ] Reached target Timers.
    [ OK ] Reached target Basic System.
    [ OK ] Started kubelet: The Kubernetes Node Agent.
    Starting containerd container runtime...
    [ OK ] Started containerd container runtime.
    [ OK ] Reached target Multi-User System.
    [ OK ] Reached target Graphical Interface.
    Starting Update UTMP about System Runlevel Changes...
    [ OK ] Started Flush Journal to Persistent Storage.
    [ OK ] Started Update UTMP about System Runlevel Changes.

  • docker info results
    Client:
    Debug Mode: false
    Server:
    Containers: 25
    Running: 21
    Paused: 0
    Stopped: 4
    Images: 29
    Server Version: 19.03.5
    Storage Driver: overlay2
    Backing Filesystem: extfs
    Supports d_type: true
    Native Overlay Diff: true
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
    Volume: local
    Network: bridge host ipvlan macvlan null overlay
    Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
    Swarm: inactive
    Runtimes: runc
    Default Runtime: runc
    Init Binary: docker-init
    init version: fec3683
    Security Options:
    seccomp
    Profile: default
    Kernel Version: 4.19.76-linuxkit
    Operating System: Docker Desktop
    OSType: linux
    Architecture: x86_64
    CPUs: 6
    Total Memory: 7.777GiB
    Name: docker-desktop
    Docker Root Dir: /var/lib/docker
    Debug Mode: true
    File Descriptors: 157
    Goroutines: 146
    System Time: 2020-03-24T08:10:00.574751724Z
    EventsListeners: 3
    HTTP Proxy: gateway.docker.internal:3128
    HTTPS Proxy: gateway.docker.internal:3129
    Registry: https://index.docker.io/v1/
    Labels:
    Experimental: false
    Insecure Registries:
    127.0.0.0/8
    Live Restore Enabled: false
    Product License: Community Engine

  • kind version results
    kind v0.7.0 go1.13.6 darwin/amd64

  • system_profiler SPSoftwareDataType results
    System Version: macOS 10.15.3 (19D76)
    Kernel Version: Darwin 19.3.0

After changing the containerPath from /etc/containerd/config.toml to /dev/mapper/config.toml, it worked.

@Seshirantha I would recommend using something like this in the kind config instead of trying to mount the config file:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches: 
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."localhost:5000"]
    endpoint = ["http://blah:5000"]

(replace the patch contents with whatever you're setting in this config file)
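
For completeness, the config above is then passed at create time (the file name is just an example):

kind create cluster --config kind-config.yaml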

closing due to lack of follow-up details.

How can I debug this?
The last log message it is stuck on is:

[  OK  ] Started Update UTMP about System Runlevel Changes.

How can I check whether the kubeadm command runs after the init process, and where are the init logs?

kind create cluster --retain -v 1
kind export logs
lots of log files there.

check against https://kind.sigs.k8s.io/docs/user/known-issues/
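
As a rough sketch of what that flow looks like end to end (the output directory is an example name, and the exact file layout can vary between kind versions):

kind create cluster --retain -v 1      # keep the failed node container around
kind export logs ./kind-logs           # write logs to ./kind-logs instead of a temp dir
# each node gets its own directory with kubelet, containerd and serial console output, e.g.:
grep -i error ./kind-logs/kind-control-plane/kubelet.log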

@BenTheElder thanks for the command.

I see the logs. It seems kubelet could not be started.

Because the cluster was not created, I could not get the kubelet logs:

 $ kind export logs           
ERROR: unknown cluster "kind"

Is there any other way to extract logs during the boot?

Another warning from the logs:

I0416 20:58:34.895732      93 checks.go:649] validating whether swap is enabled or not
    [WARNING Swap]: running with swap on is not supported. Please disable swap

--retain will not clean up; you need that flag when creating the cluster, then export will work.
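
In other words, the order matters; a minimal sketch:

kind create cluster --retain -v 1    # --retain must be passed here, at create time
kind export logs                     # the failed node still exists, so this now succeeds
kind delete cluster                  # clean up the retained node when you are done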


warnings are not failures.

swap is a normal warning; kubernetes doesn't support swap, but we configure kubelet to allow it and it works, except for memory limits

you will also see failures early on due to cni not being configured.

kubeadm has many "normal" kubelet crashes early on; it's part of the design of kubeadm + kubelet that kubelet just restarts many times.

please open a new support issue or join the slack for more interactive support help, I do not monitor closed issues as reliably.

please check that the name of your cluster does not contain any underscores...
I bumped into the same error when I included an '_' in my cluster name
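
For illustration (hypothetical cluster names), the difference is only the character used in the name:

kind create cluster --name my_cluster    # fails: underscores are not valid in the derived node/host names
kind create cluster --name my-cluster    # works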

PRs welcome 🙃


Hello,
I faced a similar issue (I'm using Fedora 32) and solved it with:
sysctl net.bridge.bridge-nf-call-iptables=0
sysctl net.bridge.bridge-nf-call-arptables=0
sysctl net.bridge.bridge-nf-call-ip6tables=0
systemctl restart docker

Note: modify sysctl.conf to make the kernel settings persistent
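
One way to persist those settings (a sketch; the drop-in file name is just a convention I chose):

cat <<'EOF' | sudo tee /etc/sysctl.d/99-kind-bridge.conf
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
EOF
sudo sysctl --system    # reload all sysctl configuration files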

please check that the name of your cluster does not contain any underscores...
I bumped into the same error when I included an '_' in my cluster name

this one worked for me, thank you!
