What happened: I'm seeing the same issue as described in #1149. Here is what I'm seeing:
```
+ kind create cluster --wait 30m --image kindest/node:v1.14.9@sha256:bdd3731588fa3ce8f66c7c22f25351362428964b6bca13048659f68b9e665b72
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.14.9) 🖼
 ✓ Preparing nodes 📦
 ✗ Writing configuration 📜
ERROR: failed to create cluster: failed to get IPs for node: kind-control-plane: file should only be one line, got 2 lines
```
What you expected to happen: The cluster starts up normally.
How to reproduce it (as minimally and precisely as possible): I haven't been able to get it to repro reliably, but once it happens, it's stuck that way.
Anything else we need to know?: The solution proposed in #1149 of blowing away the `.docker` directory doesn't work for us. It occurs during CI, so babysitting the agents and killing them once this happens is not an option. It seems to be a problem with the construction of the base image.
Environment:
- kind version: (use `kind version`): kind v0.6.0 go1.13.4 linux/amd64
- Kubernetes version: (use `kubectl version`): Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:13:54Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
- Docker version: (use `docker info`):

```
Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 14
 Server Version: 18.09.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.15.0-1052-gcp
 Operating System: Ubuntu 16.04.6 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 58.97GiB
 Name: bk-6dd1738aeec68778a4e320f52a0193781e61e8d2-3vkq
 ID: NWBL:2PDV:FRB2:TXXZ:4LS5:FER2:FOLR:QG5T:4QF2:FEQJ:7ERT:6MTY
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine
```

- OS (e.g. from `/etc/os-release`):

```
NAME="Ubuntu"
VERSION="16.04.6 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.6 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
```
Hi, in https://github.com/kubernetes-sigs/kind/issues/1149 the issue was that `docker` was aliased to `sudo docker`. I don't think anyone actually proposed blowing away `.docker`, though the author did do this.
Can you share your docker ~~info /~~ docker config?
/assign
Er, also noting that v0.6.0 is not the latest, any bugfixes we may have already made would be in v0.7.0.
What does `docker network inspect bridge` give you?
Thanks for the reply. I noticed that we weren't on latest so I'm currently trying on v0.7.0. Time will tell if that solves this since there's no consistent repro.
My agent doesn't have a docker config; at least, when I run `docker config ls` I get:
```
+ docker config ls
Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.
```
Not sure if that helps.
Er sorry, the docker daemon config is in a JSON file on the host; the `docker config` command is actually, confusingly, unrelated.
It appears we don't have one, so we're just using the default. Or at least there's none at /etc/docker/daemon.json.
That's surprising.
So this error means that, for some reason, when we ask docker to inspect the container we created with a format string that lists the IPs, we got more output than we expected...
That shouldn't happen 🤔
> That's surprising. So this error means that, for some reason, when we ask docker to inspect the container we created with a format string that lists the IPs, we got more output than we expected... That shouldn't happen 🤔
@BenTheElder should we be able to see the docker inspect error with more verbosity?
https://github.com/kubernetes-sigs/kind/blob/5cf3257f5bb5fe11828b4f310f8882f349753234/pkg/cluster/internal/providers/docker/node.go#L54-L64
@strican can you add a `-v7` flag, for example, to kind in your CI, so that next time we have more data? I'm curious about those extra lines in the output of the command:
`docker inspect -f {{range .NetworkSettings.Networks}}{{.IPAddress}},{{.GlobalIPv6Address}}{{end}}`
@aojea it's NOT a docker inspect error in that the command did succeed and exit 0. The output is just unexpected. There shouldn't be more lines.
More specifically, we pass a format to docker that includes no newlines, so multiple lines should not be possible under normal circumstances.
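For anyone following the thread, here is a minimal, self-contained sketch of that flow — not kind's actual code (see the node.go link above; per the later discussion, the real implementation at the time captured stdout and stderr combined). The `nodeIPs` helper name here is hypothetical:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// nodeIPs is a hypothetical helper mirroring the approach described above:
// ask docker to print the container's IPs via a Go template that contains
// no newlines, so the result should always be exactly one line.
func nodeIPs(container string) (ipv4, ipv6 string, err error) {
	format := `{{range .NetworkSettings.Networks}}{{.IPAddress}},{{.GlobalIPv6Address}}{{end}}`
	out, err := exec.Command("docker", "inspect", "-f", format, container).Output()
	if err != nil {
		return "", "", fmt.Errorf("docker inspect failed: %w", err)
	}
	lines := strings.Split(strings.TrimSpace(string(out)), "\n")
	if len(lines) != 1 {
		// This is the condition behind "file should only be one line, got 2 lines".
		return "", "", fmt.Errorf("file should only be one line, got %d lines", len(lines))
	}
	parts := strings.Split(lines[0], ",")
	if len(parts) != 2 {
		return "", "", fmt.Errorf("unexpected output: %q", lines[0])
	}
	return parts[0], parts[1], nil
}

func main() {
	ipv4, ipv6, err := nodeIPs("kind-control-plane")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("IPv4: %s IPv6: %s\n", ipv4, ipv6)
}
```

Because the template contains no newline, anything beyond a single line in the captured output, for example a warning docker prints alongside the result, trips that error.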
@aojea I've reverted to v0.6.0, added the `-v7` flag, and have been running builds all day. I haven't hit the issue yet, and I don't think we've hit it since updating to v0.7.0. Unfortunately this might be a case of "no repro", but I'll keep trying and let you know if I hit anything. Thanks all for jumping on this.
going to close for now as not reproducible but please /reopen with more information if you spot this again!
FWIW I hit this error when using kind v0.7.0 in a docker:19.03.8-dind image in a GitHub Actions workflow, which is using this docker version:
```
Client:
 Version: 3.0.10+azure
 API version: 1.40
 Go version: go1.12.14
 Git commit: 99c5edceb48d64c1aa5d09b8c9c499d431d98bb9
 Built: Tue Nov 5 00:55:15 2019
 OS/Arch: linux/amd64
 Experimental: false

Server:
 Engine:
  Version: 3.0.10+azure
  API version: 1.40 (minimum version 1.12)
  Go version: go1.12.14
  Git commit: ea84732a77
  Built: Fri Jan 24 20:08:11 2020
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: v1.2.11
  GitCommit: f772c10a585ced6be8f86e8c58c2b998412dd963
 runc:
  Version: 1.0.0-rc10
  GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version: 0.18.0
  GitCommit: fec3683
```
... installing and running kind in the workflow steps directly, the cluster is created without any issues. There are also no issues when running kind in a dind environment on my PC (Ubuntu 18.04 with docker 19.03.7), so I'm suspecting the issue is with the host's docker version.
Looking over this now, I think CombinedOutputLines was just handy, and perhaps what's actually happening here is docker printing some error to stderr...? I'm going to patch this to use just stdout. Not sure if that's actually the issue (you'd expect the command to fail anyhow?), but sending the PR anyhow.
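To make that distinction concrete, here is a rough sketch, with assumed helper names rather than the actual patch, of combined-output capture versus stdout-only capture:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// combinedOutputLines captures stdout AND stderr interleaved, so a warning
// that docker prints to stderr (e.g. about a problematic config file) shows
// up as an extra line alongside the IP line, even though the command exits 0.
func combinedOutputLines(name string, args ...string) ([]string, error) {
	out, err := exec.Command(name, args...).CombinedOutput()
	return strings.Split(strings.TrimSpace(string(out)), "\n"), err
}

// stdoutLines captures only stdout, so stderr noise cannot pollute the
// parsed result; this is the spirit of the fix described above.
func stdoutLines(name string, args ...string) ([]string, error) {
	out, err := exec.Command(name, args...).Output()
	return strings.Split(strings.TrimSpace(string(out)), "\n"), err
}

func main() {
	format := `{{range .NetworkSettings.Networks}}{{.IPAddress}},{{.GlobalIPv6Address}}{{end}}`
	lines, err := stdoutLines("docker", "inspect", "-f", format, "kind-control-plane")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("lines:", len(lines), lines)
}
```

With the stdout-only variant, anything docker writes to stderr can no longer show up as an extra line in the parsed output.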
@BenTheElder Facing the same error when using kind v0.7.0 with this docker version:
```
Client: Docker Engine - Community
 Version: 19.03.4
 API version: 1.40
 Go version: go1.12.10
 Git commit: 9013bf5
 Built: Thu Oct 17 23:44:48 2019
 OS/Arch: darwin/amd64
 Experimental: false

Server: Docker Engine - Community
 Engine:
  Version: 19.03.4
  API version: 1.40 (minimum version 1.12)
  Go version: go1.12.10
  Git commit: 9013bf5
  Built: Thu Oct 17 23:50:38 2019
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: v1.2.10
  GitCommit: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version: 1.0.0-rc8+dev
  GitCommit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version: 0.18.0
  GitCommit: fec3683
```
... saw the same error as with kind v0.6.0 installed. Not sure if it's worth reopening, but it would be good to know if I'm doing something wrong.
We've since found that there are circumstances where docker spits out an error on ~all commands (e.g. due to bad ownership of the docker config); you should check that. But we've already filed https://github.com/kubernetes-sigs/kind/pull/1415, which has been merged into master, to stop reading from stderr.
Same here. I am also seeing this error:
```
$ minikube start --driver=docker
😄  minikube v1.12.2 on Ubuntu 20.04
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🔄  Restarting existing docker container for "minikube" ...
🤦  StartHost failed, but will try again: IPs output should only be one line, got 2 lines
🏃  Updating the running docker "minikube" container ...
😿  Failed to start docker container. "minikube start" may fix it: provision: Temporary Error: error getting ip during provisioning: IPs output should only be one line, got 2 lines
💣  error provisioning host: Failed to start host: provision: Temporary Error: error getting ip during provisioning: IPs output should only be one line, got 2 lines
😿  minikube is exiting due to an error. If the above message is not useful, open an issue:
👉  https://github.com/kubernetes/minikube/issues/new/choose
```
@vyom-soft That is a bug in minikube's forked usage of kind. You should file an issue with minikube.
kind has not had this bug since March / v0.8.0.
https://kind.sigs.k8s.io/docs/user/quick-start/ can guide you to create a similar one-node cluster with kind without this bug.