What happened:
Running kind delete cluster gives the following output:
ERROR: failed to delete cluster: failed to delete nodes: command "docker rm -f -v kind-control-plane2 kind-control-plane,ingress-proxy-80/target,ingress-proxy-443/target kind-external-load-balancer kind-control-plane3 kind-worker2 kind-worker kind-worker3" failed with error: exit status 1
What you expected to happen:
For the cluster to get deleted without error
How to reproduce it (as minimally and precisely as possible):
kind create cluster --config=config.yaml
The contents of config.yaml:
kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
networking:
  disableDefaultCNI: True
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
- role: worker
kubeadmConfigPatches:
- |
  apiVersion: kubeadm.k8s.io/v1beta2
  kind: ClusterConfiguration
  metadata:
    name: config
  networking:
    serviceSubnet: "172.30.0.0/16"
    podSubnet: "10.254.0.0/16"
docker exec -ti kind-control-plane sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-control-plane2 sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-control-plane3 sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-worker sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-worker2 sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-worker3 sysctl -w net.ipv4.conf.all.rp_filter=0
curl -so manifests/calico.yaml https://docs.projectcalico.org/v3.8/manifests/calico.yaml
sed -i 's/192\.168/10\.254/g' manifests/calico.yaml
kubectl apply -f manifests/calico.yaml
for worker in kind-worker kind-worker{2..3}
do
kubectl label node ${worker} node-role.kubernetes.io/worker=''
done
kubectl create ns ingress-nginx
helm repo add stable https://kubernetes-charts.storage.googleapis.com/
helm repo update
helm install ingress-nginx stable/nginx-ingress --namespace ingress-nginx \
--set rbac.create=true --set controller.image.pullPolicy="Always" --set controller.extraArgs.enable-ssl-passthrough="" \
--set controller.stats.enabled=true --set controller.service.type="NodePort"
for port in 80 443
do
node_port=$(kubectl get svc -n ingress-nginx ingress-nginx-nginx-ingress-controller -o=jsonpath="{.spec.ports[?(@.port == ${port})].nodePort}")
docker run -d --name ingress-proxy-${port} \
--publish 127.0.0.1:${port}:${port} \
--link kind-control-plane:target \
alpine/socat -dd \
tcp-listen:${port},fork,reuseaddr tcp-connect:target:${node_port}
done
Anything else we need to know?:
I need to run the following in order to successfully run kind delete cluster:
docker rm --link /ingress-proxy-80/target
docker rm --link /ingress-proxy-443/target
Another thing is that I never had to do this in v0.5.0.
Environment:
kind version: kind v0.6.0 go1.13.4 linux/amd64
Kubernetes version (kubectl version): Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Docker version (docker info):
Containers: 3
Running: 3
Paused: 0
Stopped: 0
Images: 10
Server Version: 1.13.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Authorization: rhel-push-plugin
Swarm: inactive
Runtimes: oci runc
Default Runtime: oci
Init Binary: /usr/libexec/docker/docker-init-current
containerd version: (expected: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1)
runc version: N/A (expected: 9df8b306d01f59d3a8029be411de015b7304dd8f)
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
seccomp
WARNING: You're not using the default seccomp profile
Profile: /etc/docker/seccomp.json
selinux
Kernel Version: 5.3.11-300.fc31.x86_64
Operating System: Fedora 31 (Workstation Edition)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 12
Total Memory: 62.47 GiB
Name: laptop
ID: EPXO:U3CU:UTC3:2OMV:HEXZ:KE2X:X4BW:M35G:IUY2:UTMY:LKOK:NJF4
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
default-route-openshift-image-registry.apps.ocp4.cloud.chx
default-route-openshift-image-registry.apps.openshift4.cloud.chx
127.0.0.0/8
Live Restore Enabled: true
Registries: docker.io (secure), registry.fedoraproject.org (secure), quay.io (secure), registry.access.redhat.com (secure), registry.centos.org (secure), docker.io (secure)
OS (/etc/os-release):
NAME=Fedora
VERSION="31 (Workstation Edition)"
ID=fedora
VERSION_ID=31
VERSION_CODENAME=""
PLATFORM_ID="platform:f31"
PRETTY_NAME="Fedora 31 (Workstation Edition)"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:31"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f31/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=31
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=31
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation
More verbose output:
$ kind delete cluster -v 9
Deleting cluster "kind" ...
ERROR: failed to delete cluster: failed to delete nodes: command "docker rm -f -v kind-control-plane,ingress-proxy-80/target,ingress-proxy-443/target kind-control-plane2 kind-worker kind-control-plane3 kind-worker3 kind-worker2 kind-external-load-balancer" failed with error: exit status 1
Output:
kind-control-plane2
kind-worker
kind-control-plane3
kind-worker3
kind-worker2
kind-external-load-balancer
Error response from daemon: No such container: kind-control-plane,ingress-proxy-80/target,ingress-proxy-443/target
Stack Trace:
sigs.k8s.io/kind/pkg/errors.WithStack
/src/pkg/errors/errors.go:51
sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run
/src/pkg/exec/local.go:116
sigs.k8s.io/kind/pkg/internal/cluster/providers/docker.(*Provider).DeleteNodes
/src/pkg/internal/cluster/providers/docker/provider.go:130
sigs.k8s.io/kind/pkg/internal/cluster/delete.Cluster
/src/pkg/internal/cluster/delete/delete.go:42
sigs.k8s.io/kind/pkg/cluster.(*Provider).Delete
/src/pkg/cluster/provider.go:105
sigs.k8s.io/kind/pkg/cmd/kind/delete/cluster.runE
/src/pkg/cmd/kind/delete/cluster/deletecluster.go:58
sigs.k8s.io/kind/pkg/cmd/kind/delete/cluster.NewCommand.func1
/src/pkg/cmd/kind/delete/cluster/deletecluster.go:44
github.com/spf13/cobra.(*Command).execute
/go/pkg/mod/github.com/spf13/[email protected]/command.go:826
github.com/spf13/cobra.(*Command).ExecuteC
/go/pkg/mod/github.com/spf13/[email protected]/command.go:914
github.com/spf13/cobra.(*Command).Execute
/go/pkg/mod/github.com/spf13/[email protected]/command.go:864
sigs.k8s.io/kind/cmd/kind/app.Run
/src/cmd/kind/app/main.go:53
sigs.k8s.io/kind/cmd/kind/app.Main
/src/cmd/kind/app/main.go:35
main.main
/src/main.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:203
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
i'm going to assume that if you do kind get nodes it also lists your ingress containers?
ideally the filter should not catch these.
Indeed it does
$ kind get nodes
kind-control-plane3
kind-worker2
kind-external-load-balancer
kind-worker
kind-control-plane,ingress-proxy-80/target,ingress-proxy-443/target
kind-control-plane2
kind-worker3
@christianh814
in this function you can see the docker command that is used for retrieving the list of nodes:
https://github.com/kubernetes-sigs/kind/blob/226c290cdd1f8f39a948d436ac9c96deeb1ae1ef/pkg/internal/cluster/providers/docker/provider.go#L92
deprecatedClusterLabelKey is defined here:
https://github.com/kubernetes-sigs/kind/blob/383279e348eed288a3ad68a984b355ddd4a7254a/pkg/internal/cluster/providers/docker/constants.go#L24
one potential fix is to always split after the first , on a line that the command returns. [1]
the problem here seems to be the linking.
i don't know if there is a way to exclude this from the output of the docker command, but you can play with that.
the alternative is to go with [1], assuming , must not be part of the node name.
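for illustration, here is what [1] looks like at the shell level — a sketch only, assuming the deprecated label key io.k8s.sigs.kind.cluster from the constants.go linked above. when a container is linked against a node, docker reports the node's name together with the link aliases as one comma-separated NAMES field (exactly the mangled name in the error output), so taking everything before the first comma recovers the real node name:

```sh
# docker ps prints link aliases in the NAMES field, comma-separated, e.g.
#   kind-control-plane,ingress-proxy-80/target,ingress-proxy-443/target
# Keeping only the text before the first comma recovers the node name.
docker ps -a --format '{{.Names}}' \
  --filter 'label=io.k8s.sigs.kind.cluster' \
  | cut -d, -f1
```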
looks like the cluster names already forbid ",".
cluster names must match
^[a-zA-Z0-9_.-]+$
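since "," is outside that character class, splitting on it can never truncate a legitimate name. a tiny illustrative check (hypothetical, just to demonstrate the regex):

```sh
# ',' is not in [a-zA-Z0-9_.-], so a NAMES entry containing a comma
# cannot itself be a valid cluster/node name.
echo 'kind-control-plane' | grep -Eq '^[a-zA-Z0-9_.-]+$' && echo valid
echo 'kind,oops'          | grep -Eq '^[a-zA-Z0-9_.-]+$' || echo invalid
```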
potential fix here:
https://github.com/kubernetes-sigs/kind/pull/1117
/priority important-longterm
- Run the workaround to make calico work
docker exec -ti kind-control-plane sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-control-plane2 sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-control-plane3 sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-worker sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-worker2 sysctl -w net.ipv4.conf.all.rp_filter=0
docker exec -ti kind-worker3 sysctl -w net.ipv4.conf.all.rp_filter=0
@christianh814
this is no longer needed in 0.6 https://github.com/kubernetes-sigs/kind/pull/897
Nice!
So:
Warning: The --link flag is a legacy feature of Docker. It may eventually be removed. Unless you absolutely need to continue using it, we recommend that you use user-defined networks to facilitate communication between two containers instead of using --link. One feature that user-defined networks do not support that you can do with --link is sharing environment variables between containers. However, you can use other mechanisms such as volumes to share environment variables between containers in a more controlled way.
https://docs.docker.com/network/links/
I'm not sure we should invest in this. In addition, @neolit123's patch doesn't work in all cases: the output is not sorted and the format is not documented :(
+1 i will close my PR.
--link might have to be added to "known issues".
Given that this is the only instance I've heard of someone trying to use this with kind and docker upstream strongly discourages using it with bold red text, I think we can even leave out the "known issues" and focus on the user defined networks solution.
https://github.com/kubernetes-sigs/kind/issues/1124 takes priority but we definitely need to look at user defined networks more.
I think you can accomplish this kind of ingress stuff without using --link as well.
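For example, here is a rough sketch of the same socat proxy on a user-defined network instead of --link (untested; the network name kind-net and the alias target are illustrative, not something kind creates):

```sh
# A user-defined bridge network gives DNS-based aliasing without --link,
# so `docker ps` names stay clean and `kind delete cluster` is unaffected.
docker network create kind-net
docker network connect --alias target kind-net kind-control-plane

for port in 80 443; do
  node_port=$(kubectl get svc -n ingress-nginx ingress-nginx-nginx-ingress-controller \
    -o=jsonpath="{.spec.ports[?(@.port == ${port})].nodePort}")
  docker run -d --name ingress-proxy-${port} \
    --network kind-net \
    --publish 127.0.0.1:${port}:${port} \
    alpine/socat -dd \
    tcp-listen:${port},fork,reuseaddr tcp-connect:target:${node_port}
done
```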
FWIW if we need a more complete fix in the future, I think the answer is to walk the list of names and keep the one without a / in it. I'd rather not add unnecessary complexity for a feature that is going away though.
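At the shell level that walk would look something like this sketch (link aliases always contain a /, while real node names never do, per the name regex above):

```sh
# For each container, split the comma-separated NAMES field and keep the
# first entry without a '/', i.e. the real node name.
docker ps -a --format '{{.Names}}' | awk -F, '{
  for (i = 1; i <= NF; i++) if ($i !~ "/") { print $i; break }
}'
```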