kind create cluster fails with "failed to remove master taint"

Created on 8 Mar 2020  ·  12 Comments  ·  Source: kubernetes-sigs/kind

What happened:

...
Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.17.0.2:6443 --token <value withheld> \
    --discovery-token-ca-cert-hash sha256:4ce86a584bd9bef02b7576f32c65e3a97bc5c2a844eb94a904c7f780f61ba33d 
 ✗ Starting control-plane 🕹️ 
ERROR: failed to create cluster: failed to remove master taint: command "docker exec --privileged kind0-control-plane kubectl --kubeconfig=/etc/kubernetes/admin.conf taint nodes --all node-role.kubernetes.io/master-" failed with error: exit status 1

Output:
node/kind0-control-plane untainted
error: taint "node-role.kubernetes.io/master" not found

Stack Trace: 
sigs.k8s.io/kind/pkg/errors.WithStack
        /src/pkg/errors/errors.go:51
sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run
        /src/pkg/exec/local.go:116
sigs.k8s.io/kind/pkg/cluster/internal/providers/docker.(*nodeCmd).Run
        /src/pkg/cluster/internal/providers/docker/node.go:130
sigs.k8s.io/kind/pkg/cluster/internal/create/actions/kubeadminit.(*action).Execute
        /src/pkg/cluster/internal/create/actions/kubeadminit/init.go:107
sigs.k8s.io/kind/pkg/cluster/internal/create.Cluster
        /src/pkg/cluster/internal/create/create.go:136
sigs.k8s.io/kind/pkg/cluster.(*Provider).Create
        /src/pkg/cluster/provider.go:100
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.runE
        /src/pkg/cmd/kind/create/cluster/createcluster.go:86
sigs.k8s.io/kind/pkg/cmd/kind/create/cluster.NewCommand.func1
        /src/pkg/cmd/kind/create/cluster/createcluster.go:52
github.com/spf13/cobra.(*Command).execute
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:826
github.com/spf13/cobra.(*Command).ExecuteC
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:914
github.com/spf13/cobra.(*Command).Execute
        /go/pkg/mod/github.com/spf13/[email protected]/command.go:864
sigs.k8s.io/kind/cmd/kind/app.Run
        /src/cmd/kind/app/main.go:53
sigs.k8s.io/kind/cmd/kind/app.Main
        /src/cmd/kind/app/main.go:35
main.main
        /src/main.go:25
runtime.main
        /usr/local/go/src/runtime/proc.go:203
runtime.goexit
        /usr/local/go/src/runtime/asm_amd64.s:1357

What you expected to happen:

The cluster starts normally.

How to reproduce it (as minimally and precisely as possible):

$ cat ${HOME}/kind/conf/kind-kind0.yaml
kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane
  extraMounts:
  - containerPath: /var/lib/etcd
    hostPath: /tmp/kind/etcd
$ kind create cluster --name=kind0 --config ${HOME}/kind/conf/kind-kind0.yaml --loglevel=debug

Anything else we need to know?:

Environment:

- kind version: (use `kind version`):
$ kind version
kind v0.7.0 go1.13.6 linux/amd64

- Kubernetes version: (use `kubectl version`):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T09:23:26Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server 127.0.0.1:32768 was refused - did you specify the right host or port?
- Docker version: (use `docker info`):
$ docker info
Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 46
 Server Version: 19.03.7
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.15.0-88-generic
 Operating System: Ubuntu 18.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.61GiB
 Name: dt1
 ID: SWX4:5N53:7OIQ:2TCX:D4B3:LQVG:HURP:2DED:S4S7:SX6N:BTOR:RTV2
 Docker Root Dir: /home/dt1/3rdp/apps/docker/docker-data
 Debug Mode: true
  File Descriptors: 22
  Goroutines: 41
  System Time: 2020-03-08T17:31:50.491465466-04:00
  EventsListeners: 0
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
WARNING: No swap limit support
- OS (e.g. from `/etc/os-release`):
$ cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.4 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.4 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
Labels: kind/support

All 12 comments

This seems super similar to this issue.

I also tried not passing the configuration file, and it works:

kind create cluster --name=kind0 --loglevel=debug

uh, are you persisting etcd? That's not currently supported.

aside: per the release notes, --loglevel is deprecated and v1alpha4 is the current config format, FYI.
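
For reference, here's a minimal sketch of the repro config migrated to the v1alpha4 API, with the unsupported etcd mount dropped and --loglevel swapped for the -v verbosity flag (flag and apiVersion names per the release notes referenced above; a sketch, not verified output):

$ cat ${HOME}/kind/conf/kind-kind0.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4  # replaces the deprecated kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane               # no extraMounts: persisting /var/lib/etcd is unsupported
$ kind create cluster --name=kind0 --config ${HOME}/kind/conf/kind-kind0.yaml -v 3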

it probably works without passing the config because etcd is not persisted in that case; kubeadm and kind do not expect etcd to have existing state.

Confirmed with a kubeadm dev that this is not expected to be supported on their end, and I don't think kind supports it either. That extra host mount is very likely the problem.

cc @mauilion also fyi @neolit123

Yes, persisting etcd is not supported by kubeadm. The /var/lib/etcd directory has to be clean for a new node.

Hi folks, thank you so much for your help! Really much appreciated.

I remember a while ago my machine was thrashing the HDD hard, so I tracked it to this issue/comment. When I remove the etcd line, I can still hear the HDD being thrashed like before.

from the kind page:

If you have go (1.11+) and docker installed GO111MODULE="on" go get sigs.k8s.io/kind@v0.7.0 && kind create cluster is all you need!

I am confused. I was under the impression that 0.7.0 was the latest version. How did that command line go away? I never updated my binary until now.

> I am confused. I was under the impression that 0.7.0 was the latest version. How did that command line go away? I never updated my binary until now.

Huh? What command line?
EDIT: "If you have go (1.11+) and docker installed GO111MODULE="on" go get sigs.k8s.io/kind@v0.7.0 && kind create cluster is all you need!" is very much supported; see the release notes for details about what has changed. I'm not sure what you mean here: https://github.com/kubernetes-sigs/kind/releases

https://github.com/kubernetes-sigs/kind/issues/845#issuecomment-529154066 is fine for one run but not for repeated runs; the directory needs to be cleared. That comment does not constitute a supported feature 😅
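
If you do want repeated runs with that workaround anyway, a rough sketch of the cleanup between runs (assuming the /tmp/kind/etcd hostPath from the repro above):

$ kind delete cluster --name=kind0                            # tear down the previous cluster
$ sudo rm -rf /tmp/kind/etcd && sudo mkdir -p /tmp/kind/etcd  # kubeadm needs /var/lib/etcd to start empty
$ kind create cluster --name=kind0 --config ${HOME}/kind/conf/kind-kind0.yaml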

My apologies for not being clear.

I meant the --loglevel command line option. Maybe it was already deprecated in 0.7.0 and I never noticed!

I can confirm cleaning up the directory works!!! @BenTheElder, thanks a million!!!

Yes, the logging changed in v0.6.0; we've not removed that flag yet, but it should print a warning.

Glad it's working!
