What happened:
$ kind create cluster
$ export KUBECONFIG="$(kind get kubeconfig-path --name="kind")"
$ docker pull fedora:latest
$ kind load docker-image fedora:latest
# Exits successfully
$ kubectl run --attach --rm --restart=Never test-metrics --image=fedora:latest -- sleep 1
# Do some complicated e2e cluster stuff
$ ...
# Returns in error
$ kubectl run --attach --rm --restart=Never test-metrics --image=fedora:latest -- sleep 1
$ kubectl describe pod test-metrics
...
Events:
  Type     Reason     Age                From                         Message
  ----     ------     ----               ----                         -------
  Normal   Scheduled  15s                default-scheduler            Successfully assigned default/test-metrics to kind-control-plane
  Normal   Pulling    12s (x2 over 14s)  kubelet, kind-control-plane  Pulling image "fedora:latest"
  Normal   Pulled     12s (x2 over 13s)  kubelet, kind-control-plane  Successfully pulled image "fedora:latest"
  Warning  Failed     12s (x2 over 13s)  kubelet, kind-control-plane  Error: failed to create containerd container: error unpacking image: content digest sha256:871804803c6076dd10a7163887a26c6bc6cc33e0989dfbe3a45c5851490ee064: not found
What you expected to happen:
Idempotent kubectl run.
How to reproduce it (as minimally and precisely as possible):
See above, or https://github.com/containerd/containerd/issues/3401.
Anything else we need to know?:
This seems to be a bug in containerd that was fixed in https://github.com/containerd/containerd/issues/3401, and released in v1.2.8. However the base image for node, Ubuntu 19.04, only has up to v1.2.6 at the time of writing. Because kind depends heavily on containerd at runtime, I suggest installing containerd from its latest version in the base Dockerfile so bugs that affect kind can be patched more quickly.
I can try modifying how containerd is installed if desired.
Environment:
kind version: v0.5.1
kubectl version: v1.14.6
docker info: Server Version: 18.09.6
/etc/os-release: Fedora 30 (Workstation Edition)
using :latest with kind load ... is usually a bug btw, we should consider detecting and warning on that.
https://kind.sigs.k8s.io/docs/user/quick-start/#loading-an-image-into-your-cluster
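For context on why :latest is risky here: Kubernetes defaults imagePullPolicy to Always for images tagged :latest (or with no tag at all), so the kubelet may pull from the registry instead of using the image that was side-loaded with kind load. Below is a minimal sketch of the kind of detection suggested above, in plain shell. is_latest_tag is a hypothetical helper for illustration, not something kind ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of kind): succeeds if an image reference
# resolves to the "latest" tag, either explicitly or because no tag is given.
is_latest_tag() {
  ref="$1"
  case "$ref" in
    *@*) return 1 ;;        # pinned by digest, so the tag is irrelevant
  esac
  name="${ref##*/}"         # last path component, so a registry port is not mistaken for a tag
  case "$name" in
    *:latest) return 0 ;;   # explicit :latest
    *:*)      return 1 ;;   # some other explicit tag
    *)        return 0 ;;   # no tag means :latest
  esac
}

# Warn before loading. The kind command is left commented out so the
# sketch has no side effects when run as-is.
image="fedora:latest"
if is_latest_tag "$image"; then
  echo "warning: $image uses the latest tag; the kubelet may try to pull it instead of using the loaded image" >&2
fi
# kind load docker-image "$image"
```

Pinning a specific tag (for example fedora:30), or setting imagePullPolicy to IfNotPresent or Never on the pod, avoids the extra pull.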
I intend to have us on containerd 1.2.9 or 1.3 for the next kind release.
> I suggest installing containerd from its latest version in the base Dockerfile so bugs that affect kind can be patched more quickly.
Yes, the problem is that their binary releases are only built for linux/amd64, and the Ubuntu / Debian packages are not at the latest version. We are using the latest available packages until we have cross-platform builds of arbitrarily new versions. @aojea was looking into this.
cc also @Random-Liu ... can we get the upstream containerd releases to include tarballs for more architectures?
@BenTheElder glad to hear you're on top of this.
Is it not feasible to cross-compile the containerd binary from source in a builder container as a pre-build step, then copy that binary into the base image? I'm likely missing some detail from your release process.
Roughly that's the plan. Compiling and packaging it (e.g. with runc) is not as easy as you might think: it's not a single pure Go binary, it's a few binaries that need system C libraries etc.
We already have our own cross compile of ctr (the client) on Cloud Build (search the repo), but that was simpler. @aojea is currently working on the cross compile.
this should get us on 1.2.9 in the meantime https://github.com/kubernetes-sigs/kind/pull/875
New node images should have containerd 1.2.9. We're also hopefully getting multi-arch releases upstream: https://github.com/containerd/containerd/pull/3702
closing this for now as we should have the root cause fixed, however you may need to build a node image from the latest base image (install kind from head and then kind build node-image is the easiest route) until we release again.
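To check whether a node image already carries the fix, one option is to compare the containerd version on the node against v1.2.8, the release named in the report. This is a rough sketch only: version_ge is a hypothetical helper, it assumes a sort that supports -V (as GNU coreutils does), and kind-control-plane is the node name from the events in the report.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 >= version $2.
# Assumes a sort that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
  a="${1#v}"; b="${2#v}"    # tolerate a leading "v"
  [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# Example with the versions discussed in this issue.
for v in v1.2.6 v1.2.9; do
  if version_ge "$v" v1.2.8; then
    echo "containerd $v: has the fix"
  else
    echo "containerd $v: predates the fix"
  fi
done

# On a real cluster the version can be read from the node container, e.g.:
#   docker exec kind-control-plane containerd --version
```

The exact format of the containerd --version output varies between packaged and upstream builds, so extracting the version string from it is left out of the sketch.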