I am trying to use kind on GitLab CI, which uses DIND. I am trying to set up a cluster inside a Docker container. I have tried the following:
$ docker run --privileged --rm -d --name dind docker:dind
$ docker run --rm -t -i --link dind:docker -e DOCKER_HOST=tcp://docker:2375 ubuntu:artful
Inside container:
$ apt-get update
$ apt-get install --yes golang-go git curl unzip wget apt-transport-https curl ca-certificates
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ curl -s https://download.docker.com/linux/ubuntu/gpg | apt-key add -
$ echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable" > /etc/apt/sources.list.d/docker.list
$ apt-get update
$ apt-get install --yes kubectl docker-ce
$ export GOPATH=/usr/local/go
$ export PATH="${GOPATH}/bin:${PATH}"
$ go get sigs.k8s.io/kind
$ kind create
$ export KUBECONFIG="/root/.kube/kind-config-1"
$ kubectl cluster-info
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The connection to the server localhost:32771 was refused - did you specify the right host or port?
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
898cd8aeedce kindest/node:v1.11.3 "/usr/local/bin/entr…" 2 minutes ago Up 2 minutes 0.0.0.0:32771->6443/tcp kind-1-control-plane
$ docker exec kind-1-control-plane ps
PID TTY TIME CMD
1 ? 00:00:00 systemd
55 ? 00:00:00 systemd-journal
71 ? 00:00:12 dockerd
88 ? 00:00:00 docker-containe
812 ? 00:00:08 kubelet
1755 ? 00:00:00 ps
$ kind delete
Hmm, we're running kind in our DIND setup; I've not tried it in this particular fashion yet, though.
/assign
er if I'm reading this correctly, you installed a new version of docker _after_ starting a dind container? that seems like a bad idea. investigating locally with a dind container
edit: nevermind, reread that 🙃
So the problem here appears to be the network connection from your linked container to kind: the cluster is actually running, but you can't talk to it, since it's actually running over in the dind container.
In our CI we do it like:
(host vm / kubernetes node, running docker)
|-> [a kubernetes pod, in which we run docker in docker, run `kind` in this container]
|-> [kind "node" container is a sub container, which itself runs docker inside]
|-> [kubernetes / docker containers for things running on kind]
But with your setup it appears to be more like:
(host vm, running docker presumably?)
|-> [dind container, running docker]
| |-> [kind "node" container, running docker itself]
| |-> [kubernetes pod container(s)]
|
|-> [ubuntu container, linked to the dind container]
|-> [running `kind`, talking to docker in dind container]
Would it be possible for you to avoid the dind container if you can already run docker containers? Can you give more details on your setup? Also note that --link appears to be a legacy feature docker may remove.
So I am using --link just for debugging purposes to simulate what I believe GitLab CI is doing. Otherwise I am targeting Docker executor with docker-in-docker. They have a concept of services and then you connect to those services.
Thanks, taking a look.
This job runs kind in a docker-in-docker pod on kubernetes to run the conformance tests, so kind in docker in docker is definitely a supported use case, however we are also mounting some things to the pod (read only /lib/modules and /sys/fs/cgroup) ...
When you run in the docker executor, are you running everything with the dind container, or are you running the dind container alongside another container there as well?
I think the docker executor is actually a thin layer over kubernetes, cc @munnerz who I believe was involved here...
edit: nope, but it has very similar config, we can mount the volumes if necessary but they shouldn't strictly be necessary
It looks like we can mimic our pod setup if needed, per:
https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-docker-section
One of our actual pods looks like this: https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-kind-conformance/508/artifacts/prow_podspec.yaml
When you run in the docker executor, are you running everything with the dind container, or are you running the dind container alongside another container there as well?
So currently I run dind as a service container, and then I have my main container in which I would like to run my tests on the kubernetes cluster. I have done this setup in the past for using regular Docker images/containers and it works well.
Have you done it while talking to a networked service running over in the dind container before? We should just need to fix up:
localhost:some-port as where the kubernetes API server is running. I can't tell from these details what that address is, though; given what you've shown and local replication of this setup as best as I can tell, things should otherwise be working fine.
Have you done it while talking to a networked service running over in the dind container before?
Oh, I remember. I think I had issues with that in the past. The issue was that from the outside I could see only the dind container, and not any network behind it. So I had to publish ports on Docker containers running through dind so that they became available on the dind container. So the dind container is like a host, and you do not have direct access to the containers behind it.
Which might also be an additional problem for me, because I then want to run pods on the cluster, which again might not be reachable from my testing container, because they would again be behind the dind container/host.
Yes exactly, it may be possible to forward ports from the dind container but it might be tricky to manage, and we'd need to possibly add some small feature to kind to inform it of the expected address instead of localhost.
Alternatively, if you can run your other code + kind within a container or host running docker (eg dind), it will be a bit simpler. We know this works.
If we can get to the "kubernetes API server is forwarded through dind, and we've told kind to sign the certs for this, and point the kubeconfig at it", then we could use the kubernetes api server proxy functionality to talk to pods on the cluster...
but it will again likely be simpler and more robust if we can avoid the kind cluster being in a different "host" docker / network space.
OK. So I do not know about gitlab.com CI, but on our private GitLab instance I discovered that it seems I am given docker.sock mounted into my container from host, I guess. (I have some thoughts about such setup and security of it, but I will not complain at the moment.) So I can simply have one Docker container inside which I do everything. I do the following (in an image with Go and Docker client already installed):
go get sigs.k8s.io/kind
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
apt-get update -q -q
apt-get install --yes kubectl
kind create
export KUBECONFIG="/root/.kube/kind-config-1"
kubectl cluster-info || true
docker ps
docker exec kind-1-control-plane ps
So I just install kind and kubectl and then test it out. Sadly, it still does not work, but I think this is closer. The output of the final commands is as follows:
$ export KUBECONFIG="/root/.kube/kind-config-1"
$ kubectl cluster-info || true
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The connection to the server localhost:32768 was refused - did you specify the right host or port?
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4aaf4aaa8acf kindest/node:v1.11.3 "/usr/local/bin/entr…" 2 minutes ago Up About a minute 0.0.0.0:32768->6443/tcp kind-1-control-plane
83494fdd744b a37a729110ce "sh -c 'if [ -x /usr…" 3 minutes ago Up 3 minutes runner-b938861c-project-880-concurrent-0-build
$ docker exec kind-1-control-plane ps
PID TTY TIME CMD
1 ? 00:00:01 systemd
52 ? 00:00:00 systemd-journal
66 ? 00:00:38 dockerd
97 ? 00:00:00 docker-containe
814 ? 00:00:06 kubelet
922 ? 00:00:00 docker-containe
923 ? 00:00:00 docker-containe
925 ? 00:00:00 docker-containe
926 ? 00:00:00 docker-containe
992 ? 00:00:00 pause
995 ? 00:00:00 pause
1003 ? 00:00:00 pause
1005 ? 00:00:00 pause
1069 ? 00:00:00 docker-containe
1088 ? 00:00:01 kube-scheduler
1097 ? 00:00:00 docker-containe
1118 ? 00:00:32 kube-apiserver
1119 ? 00:00:00 docker-containe
1135 ? 00:00:00 docker-containe
1168 ? 00:00:06 etcd
1180 ? 00:00:03 kube-controller
1412 ? 00:00:00 docker-containe
1431 ? 00:00:00 pause
1453 ? 00:00:00 docker-containe
1470 ? 00:00:00 kube-proxy
1555 ? 00:00:00 docker-containe
1591 ? 00:00:00 pause
1695 ? 00:00:00 exe
1707 ? 00:00:00 ps
So you see that docker ps now shows my CI container alongside the kind-1-control-plane container. I think this is better, because it means I should be able to connect directly to stuff in the kind-1-control-plane container. But I am not yet able to. Any suggestions here?
I think you just need your second container to use --network=host when creating it; I think I've got this setup replicated locally and that works for me.
From within a dind container, I first tried what I think was your setup and confirmed the issue (still not the same network, but I can see the "cluster" running), then I created another container from within the dind container with:
docker run -it --network=host -v /var/run/docker.sock:/var/run/docker.sock ubuntu /bin/bash
and proceeded to install docker + kubectl, copy the kind config over from the other container, etc., and I can talk to the cluster, listing pods etc.
Also:
(I have some thoughts about such setup and security of it, but I will not complain at the moment.)
Absolutely! Any dind solution should be a major security concern, including this one. Please be careful.
kind and friends are cheap, fast, and work in situations where nested VMs are not available; but any current docker-in-docker solution is effectively ~root on the host, regardless of whether the docker socket is passed: it needs either the socket (in which case it can create arbitrary containers on the host...) or to be run with --privileged to run docker itself (which is also ~root). We make sure to run this on CI VMs that don't have any particularly sensitive credentials. More sensitive workloads are on entirely separate CI cluster(s).
I of course also run kind locally, but I don't schedule any untrusted workloads on it, unless you count Kubernetes itself to some degree.
Hm, running with --network=host is sadly not possible for me. I think the issue is simply to update the address of where Kubernetes is running. I tried:
$ docker run --rm -t -i --privileged -v /var/run/docker.sock:/var/run/docker.sock ubuntu:artful
$ apt-get update
$ apt-get install --yes golang-go git curl unzip wget apt-transport-https curl ca-certificates
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
$ curl -s https://download.docker.com/linux/ubuntu/gpg | apt-key add -
$ echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable" > /etc/apt/sources.list.d/docker.list
$ apt-get update
$ apt-get install --yes kubectl docker-ce
$ export GOPATH=/usr/local/go
$ export PATH="${GOPATH}/bin:${PATH}"
$ go get sigs.k8s.io/kind
$ kind create
$ sed -i "s/localhost:32781/$(docker inspect --format '{{.NetworkSettings.IPAddress}}' kind-1-control-plane):6443/" /root/.kube/kind-config-1
$ export KUBECONFIG="/root/.kube/kind-config-1"
$ kubectl cluster-info
But I am guessing certificates do not match? I still get connection refused. How can I be sure that the other container really runs properly? I can ping it now from my container.
I did an nmap port scan and only port 10250/tcp is open on the container.
Are you sure --network=host is not doable? It should work if we're talking about another container in the dind: it would use the network of the dind container, not the actual host network.
EG it looks like network_mode = "host" will tell an executor to do this.
https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-runners-docker-section
The certificates may not match if you use another address: when kind inits kubernetes, it requests localhost as an additional address for the API server certs so it can authenticate, besides the auto-detected IP(s). We can add a field to add another address to sign, but we'd have to specify it ahead of time.
So currently I am not using dind anymore, but the Docker socket from the host, so the container I am in runs alongside the others. I could try to create another container inside and then go inside it and so on, but to me it looks like the issue is somewhere else, because the container runs and I can ping it (it is just not on localhost), but no ports besides 10250 are open in the container.
There should be one randomly allocated port (allocated by docker) open on the container forwarding to the secure API server port (6443), and the exported kubeconfig will match localhost:${THE_PORT}. If you run a process on the same level as the docker daemon it should be able to talk to it. 10250 is probably the random port from that session.
If you run something in a nested container, that container ideally needs to use --network=host to avoid going into another network namespace.
Or we can point at the container IP instead (which the cert should actually already be signed for if it's just the docker container IP), the config with that IP is currently obtainable with docker cp kind-1-control-plane:/etc/kubernetes/admin.conf admin.conf.
EDIT: adjacent -> nested. for adjacent we just want to use the actual node container IP, which is actually in the default config, when we export the config to the host we rewrite this to match the forwarded port. I'm thinking about ways we could better expose that...
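To make the adjacent-container case concrete, here is a minimal sketch combining the `docker cp` and container-IP ideas discussed above. The container name `kind-1-control-plane` and kubeconfig path match the transcripts in this thread; the sed pattern assumes the exported kubeconfig's `server:` line uses `localhost:<port>` as shown earlier (inspect your file first):

```shell
# Option 1: grab the in-container admin config, which already points at
# the node container's IP (which the API server cert is signed for).
docker cp kind-1-control-plane:/etc/kubernetes/admin.conf admin.conf

# Option 2: rewrite the exported config to target the node container
# directly on the real API server port (6443), instead of the
# docker-forwarded localhost port.
NODE_IP="$(docker inspect --format '{{.NetworkSettings.IPAddress}}' kind-1-control-plane)"
sed "s|localhost:[0-9]*|${NODE_IP}:6443|" /root/.kube/kind-config-1 > admin.conf

export KUBECONFIG="$PWD/admin.conf"
kubectl cluster-info
```

Either way this only helps when the rewritten address is actually routable from where kubectl runs, which is the crux of this whole thread.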
So something else is wrong. I am connecting directly to the container, bypassing Docker port mapping. Connecting to 6443 does not work. 10250 is the port on which the kubelet is listening. But why are the other things in the container not running correctly?
I managed to get it working with the following:
$ docker run --rm -t -i --privileged ubuntu:artful
$ apt-get update
$ apt-get install --yes golang-go git curl unzip wget apt-transport-https curl ca-certificates
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
$ curl -s https://download.docker.com/linux/ubuntu/gpg | apt-key add -
$ echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable" > /etc/apt/sources.list.d/docker.list
$ apt-get update
$ apt-get install --yes kubectl docker-ce
$ echo '{"storage-driver": "vfs"}' > /etc/docker/daemon.json
$ service docker start
$ export GOPATH=/usr/local/go
$ export PATH="${GOPATH}/bin:${PATH}"
$ go get sigs.k8s.io/kind
$ kind create
$ export KUBECONFIG="/root/.kube/kind-config-1"
$ kubectl cluster-info
So instead of using host's Docker socket, I do simply a proper dind inside my container.
Awesome!
You can likely avoid some performance loss (I don't have numbers currently) by making /var/lib/docker a volume of some kind (eg tmpfs) instead of switching to the vfs driver.
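For instance, adapting the `docker run` line from the transcript above (a sketch, not from this thread — both flags are standard `docker run` options):

```shell
# Anonymous volume over /var/lib/docker: the inner dockerd then writes
# to a real filesystem instead of stacking overlayfs on overlayfs.
docker run --rm -t -i --privileged -v /var/lib/docker ubuntu:artful

# Or a tmpfs, trading durability for speed:
docker run --rm -t -i --privileged --tmpfs /var/lib/docker ubuntu:artful
```

With either of these, the `{"storage-driver": "vfs"}` daemon.json workaround should be unnecessary.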
If you use kind with defaults this should continue to work as is for the foreseeable future. The config is not yet stable (PR #36) and logging etc needs work. I'll be stabilizing it and looking into multinode this quarter though, we intend to use it for more CI ourselves. :-)
Please let me know if you have any more feedback or issues. I know user and development guides are very high on my list currently besides UX and stability fixes.
So while I was able to make this work, it would be great if this also worked on gitlab.com. It would be useful to try it there as well.
And thanks for all this work and thank you for all the help.
OK, as a note to my future self and others. I had issues running Docker inside my own privileged container so that I could run kind inside, and the reason was that I wanted to use overlay2 on top of overlay2 already. The solution was to define VOLUME /var/lib/docker so that Docker files went directly to host's volume and not overlay inside the container.
Yes exactly, I tried to mention this above but failed, I think. kind does something like this for its own docker in docker on the "node"s for the same reason :upside_down_face:
Doing that has worked flawlessly for dind in our CI at least. Overlay filesystems don't stack, but it works fine if you just make sure the docker graph (/var/lib/docker) is a volume for any dind containers.
I'll be sure to add this to the docs soon!
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
After fixing the cluster name issue from #619 I think I hit the same issue.
In my CI I get:
kind cluster name is kind947216
Creating cluster "kind947216" ...
• Ensuring node image (kindest/node:v1.14.2) 🖼 ...
....
✓ Joining worker nodes 🚜
Cluster creation complete. You can now use the cluster with:
export KUBECONFIG="$(kind get kubeconfig-path --name="kind947216")"
kubectl cluster-info
+ kubectl cluster-info
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The connection to the server localhost:40781 was refused - did you specify the right host or port?
I am not sure I understand in which container /var/lib/docker must be specified as a volume.
Is it the one I use as image for my gitlab job?
Since this issue is quite old now, are there some bits of docs about that?
@TheErk to run docker in docker that path must be a volume in the container you run docker in.
There are no docs for this because we don't have any gitlab CI and nobody has contributed any 😅
As mentioned previously, any contributions to https://github.com/kind-ci/examples would also be extremely welcome; we aim to eventually have starter configs etc. for use everywhere.
xref: https://github.com/kubernetes-sigs/kind/issues/620#issuecomment-503652473
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I've had success running kind with the docker:dind service by using the Networking section of the config, e.g. setting networking.apiServerAddress to the IP address of the docker service.
Using something like apiServerAddress: 0.0.0.0 won't work because this single setting is used for three things:
Example:
cat >>kind.yaml <<END
networking:
  apiServerAddress: $(host docker)
END
See my full setup here: https://gitlab.com/ViDA-NYU/reproserver/commit/4e9e8adfca37ca091e5c02ad3a3b070736e3b0ec
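Putting that together, here is a sketch of what a job script could look like, under the assumption (from the linked setup) that the dind service resolves by the hostname `docker`. `getent hosts` is used here instead of `host` to get a bare IP, and the `kind:`/`apiVersion:` header is what newer kind releases expect in a config file:

```shell
# Resolve the dind service's IP and tell kind to bind/sign/advertise
# the API server on that address instead of localhost.
API_ADDR="$(getent hosts docker | awk '{print $1}')"
cat > kind.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  apiServerAddress: ${API_ADDR}
EOF
kind create cluster --config kind.yaml
```

The exported kubeconfig then points at the dind service's address, which is reachable from the job container.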
SOLUTION:
@mitar You can run KIND inside the docker container by allowing it to use the host network, using the following command: