Tell us about your request
What do you want us to build?
Support for Containerd CRI
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently, EKS nodes run dockerd. Containerd is a popular CRI-compatible container runtime that is more efficient.
Are you currently working around this issue?
How are you currently solving this problem?
AL2 nodes fail when containerd is installed.
Additional context
Anything else we should know?
This will enable customers to customise and configure kubelet parameters to select their preferred container runtime.
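For context, on the current Amazon Linux AMI these kubelet parameters would be passed through the EKS bootstrap script's --kubelet-extra-args. A rough sketch of what selecting containerd could look like (the cluster name is a placeholder, and this assumes containerd is already installed and running on the node, which the AMI does not do today):
# Illustrative only: pass CRI-related kubelet flags via the EKS AMI bootstrap script.
# Assumes containerd is already installed and running; "my-cluster" is a placeholder.
/etc/eks/bootstrap.sh my-cluster \
  --kubelet-extra-args '--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock'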
Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)
No updates in the past 2 months, not even an acknowledgement. This looks bad.
Yep, time to think about it, right?
We're excited to support containerd, too. We are making sure that we have the right test, security, and release tools in place before we officially recommend it to our customers.
It would be useful to know if folks have specific thoughts about how the runtime should be configured:
Great @jtoberon. If there is just one AMI, then both runtimes with runtimeClassName sounds like a good transition plan. But for us the end goal is for the cluster/AMIs to be docker-free; the past grief caused by the unstable development practices, the kitchen-sinking of swarm, the ‘moby’ mess and other junk changes makes us wary of that upstream project, and we are ready to leave it behind.
:+1: to being able to run both runtimes with runtimeClassName. Is there an ETA for this support?
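Worth noting how runtimeClassName actually works: the kubelet talks to a single CRI endpoint, and a RuntimeClass selects a handler configured in that node's CRI runtime, so "both runtimes" would most likely mean handlers within one CRI rather than dockerd and containerd side by side. A rough sketch of the mechanics (the handler name runc matches containerd's default named runtime; everything here is illustrative, not an EKS plan):
# Sketch: a RuntimeClass maps to a handler configured in the node's CRI runtime,
# and pods opt in via runtimeClassName.
kubectl apply -f - <<'EOF'
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: standard
handler: runc
---
apiVersion: v1
kind: Pod
metadata:
  name: runtimeclass-demo
spec:
  runtimeClassName: standard
  containers:
    - name: demo
      image: busybox
      command: ["sleep", "3600"]
EOF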
Is it as simple as making changes to the worker node AMI https://github.com/awslabs/amazon-eks-ami to run containerd? Or does the control plane also need changes?
Yes, we intend to change that AMI build to install the containerd software.
To support that change, we need to do a bunch of other things. Here are a few examples:
Thanks @jtoberon, yes, I imagine it is not trivial. And I expect it will need a healthy period of 'developer preview' too.
One other possible chore for your list is ensuring that container logging and log rotation work well, and that you have a solid fluentd/CloudWatch configuration, since that works quite differently for containerd compared to dockerd.
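For anyone experimenting ahead of official support: with a CRI runtime the kubelet writes and rotates container logs under /var/log/pods in the CRI text format, so fluentd needs a CRI parser instead of the Docker json parser, and rotation is controlled by kubelet flags rather than the json-file log driver. Illustrative kubelet flags (the values are just examples):
# Log rotation for a CRI runtime such as containerd is handled by the kubelet
# (with dockerd it is normally the json-file log driver instead).
--container-log-max-size=10Mi
--container-log-max-files=5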
Hello, I am trying to get runsc (gVisor) running on EKS workers. Is there a way to do it today? I'd appreciate any pointers.
Also when will the containerd support be released?
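Not an official answer, but on a worker that already runs containerd, runsc can in principle be registered as an extra containerd handler and then selected per pod with a RuntimeClass whose handler is runsc. A rough sketch, assuming gVisor's runsc and containerd-shim-runsc-v1 binaries are already on the node and /etc/containerd/config.toml uses the version 2 format:
# Sketch only: add runsc as an additional containerd handler, then restart containerd.
cat >> /etc/containerd/config.toml <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc]
  runtime_type = "io.containerd.runsc.v1"
EOF
systemctl restart containerd
# A RuntimeClass with handler: runsc (see the RuntimeClass sketch above) then lets
# individual pods opt into the sandboxed runtime.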
Now that Docker Enterprise has been acquired, could this also be scoped out for the ECS platform? Not really sure what goes into that other than maybe updating the ECS-optimized AMI and the ecs-agent itself? What are everyone's thoughts on that?
We would be interested in switching out docker for containerd in our EKS nodes as well, in hopes that it might help with things like https://github.com/awslabs/amazon-eks-ami/issues/195
By design, Fargate nodes seem to use containerd as their container runtime.
With a custom AMI setup, at first sight it seems to work correctly 👍
# Kubernetes 1.16, EKS platform version eks.1
# Install containerd
~# wget -P /tmp https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.3.4.linux-amd64.tar.gz
~# tar --no-overwrite-dir -C / -xzf /tmp/cri-containerd-1.3.4.linux-amd64.tar.gz
~# mkdir -p /etc/containerd && containerd config default > /etc/containerd/config.toml
~# systemctl daemon-reload
~# systemctl enable --now containerd
# Added a couple of necessary flags to the kubelet
--container-runtime-endpoint=unix:///run/containerd/containerd.sock
--container-runtime=remote
# Node info
System Info:
Machine ID: ec24d43f57c1054dbf44887269f36c5a
System UUID: ec24d43f-57c1-054d-bf44-887269f36c5a
Boot ID: 976f4d4e-07df-4d7a-94a4-9ff7a661ed70
Kernel Version: 5.4.0-1009-aws
OS Image: Ubuntu 20.04 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.3.4
Kubelet Version: v1.16.8
Kube-Proxy Version: v1.16.8
# Ready event
Normal NodeReady 9m1s kubelet, ip-10-0-0-1.eu-west-1.compute.internal Node ip-10-0-0-1.eu-west-1.compute.internal status is now: NodeReady
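For anyone reproducing this, a quick way to confirm the kubelet really ended up on containerd (crictl ships in the same cri-containerd tarball; the endpoint matches the kubelet flag above):
# Verify the CRI endpoint responds and list the sandboxes the kubelet has started
~# crictl --runtime-endpoint unix:///run/containerd/containerd.sock info
~# crictl --runtime-endpoint unix:///run/containerd/containerd.sock pods
# From a workstation, the runtime also shows up per node:
kubectl get nodes -o wide   # CONTAINER-RUNTIME column should read containerd://1.3.4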
EKS with some NodeGroups as Docker and some as ContainerD? Is this possible with K8S? Would be nice . . .
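Kubernetes itself is fine with mixed runtimes, since the container runtime is a per-node detail reported by each kubelet. Nothing here is EKS-specific or confirmed, but workloads that care could be pinned to a runtime-specific node group with an ordinary label and nodeSelector (the label key/value below are made up for illustration):
# Illustrative only: label the nodes of a containerd-based node group yourself,
# then pin workloads to them with a nodeSelector.
kubectl label nodes ip-10-0-0-1.eu-west-1.compute.internal runtime=containerd
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: needs-containerd
spec:
  nodeSelector:
    runtime: containerd
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
EOF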
Hi EKS team, it's 2020 and containerd is a CNCF graduated project. It shouldn't just be installed on the EKS AMIs; it should be the default runtime.
When will we see containerd support on EKS?
Any updates on that?
EKS/Fargate uses the containerd runtime, so that is a production ready option today.
Our plan for containerd on worker nodes is to add official EKS support for Bottlerocket, once the project graduates from a public preview.
Bottlerocket is a Linux based operating system purpose-built to run containers. We encourage you to try out the public preview and leave any feedback on the GitHub project. You can learn more about Bottlerocket here.
Hi @mikestef9 and/or EKS team - we are trying to work out which 'sandboxing' features (e.g. gVisor, Firecracker) are available in EKS - as far as I can tell, none are currently supported (but please correct me if that is wrong).
Bottlerocket 1.0.0 was released ~15 hours ago - what does the timeline look like for it being available in EKS? Will it be supported when using managed node groups? And - the big question for me - given that it uses containerd, does that mean we'll be able to plug in Firecracker/similar?
@ecrousseau can you clarify what you mean by "sandboxing"? I'd argue EKS/Fargate does provide "sandboxing" in that each Kubernetes pod runs in its own dedicated OS/kernel (or VM/instance, if you will). When the user deploys a pod, we source a dedicated VM/instance on the fly from the Fargate pool and use that dedicated VM/instance to run that specific pod. Rinse and repeat. This VM/instance could be an EC2 instance that is part of the Fargate pool or a micro-VM running on Firecracker. This is an implementation detail and not something a user should need to be aware of. Both implement the same deployment pattern (one pod per VM/instance).
Is this the "sandboxing" you are alluding to?
Thanks @mreferre - yes, that kind of separation is what I was talking about. I will have a look at Fargate.
EKS/Fargate isn't a perfect solution for sandboxing in my environment. We use the Datadog agent running as a daemonset to collect logs and metrics (and maybe soon APM) for our workloads.
I haven't tested this yet: to accomplish those collections with EKS/Fargate, we would need to run the Datadog agent as a sidecar container in every EKS/Fargate pod. This is based on the assumption that the Datadog agent could still do the collection as a non-privileged container (e.g. access /var/log/pods, possibly as a readOnly mount).
Assuming that we could access the logs and metrics, we'd still have to substantially increase our resource utilization (i.e. costs) to run the sidecars versus the daemonset.
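To make the trade-off concrete, the sidecar pattern under discussion looks roughly like this, duplicated into every pod instead of once per node (the agent image tag, secret name, and resource numbers are placeholders, and Datadog's Fargate-specific agent settings are omitted):
# Sketch of the per-pod sidecar pattern (vs. one DaemonSet pod per node).
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: app-with-agent
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
    - name: datadog-agent
      image: datadog/agent:7
      env:
        - name: DD_API_KEY
          valueFrom:
            secretKeyRef:
              name: datadog-secret   # placeholder secret
              key: api-key
      resources:
        requests:
          cpu: 200m                  # this overhead repeats in every single pod
          memory: 256Mi
EOF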
@matthewhembree refer to the documentation Datadog has on how it's implemented in detail. You are correct that, with the current model, you'd need to have an agent sidecar in each pod. Unless you are consuming nearly all the resources you are sizing your pod for, I would speculate that resource consumption isn't as relevant as having to change the operational model to inject these sidecars to make logging work on EKS/Fargate. We want to solve for this by way of this feature that we are working on. The idea would be to have a router embedded into the Fargate service that you transparently use to ship logs (and more) to an external endpoint with a centralized configuration. Not only would you not have to inject a sidecar into every pod, but you would not have to deal with DaemonSets either (given that with Fargate there are no nodes by definition). If that would be of interest to you, please subscribe to that issue to get updates as we progress.
Docker (dockershim) is now deprecated (v1.20).
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.20.md#changes-by-kind-1
Calm down... it's not like Docker will be broken:
https://github.com/kubernetes/kubernetes/pull/94624#issuecomment-737486287
As I understand it, both Fargate and Bottlerocket are generally available, use containerd as their CRI runtime, and are supported by tools like eksctl (cf. Bottlerocket and Fargate).
Managed Nodegroups don't specifically support Bottlerocket yet. That's a different issue, #950. There is support for custom AMIs through a Launch Template, which could certainly _be_ a Bottlerocket AMI, if you want this combination working _now_.
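For anyone who wants containerd-backed nodes right away, a self-managed Bottlerocket node group can be declared with eksctl. A sketch only: the cluster name, region, and sizing are placeholders, and it assumes a recent eksctl release with amiFamily: Bottlerocket support:
# Sketch: unmanaged Bottlerocket node group via eksctl (placeholders throughout)
cat > bottlerocket-ng.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: eu-west-1
nodeGroups:
  - name: ng-bottlerocket
    amiFamily: Bottlerocket
    instanceType: m5.large
    desiredCapacity: 2
EOF
eksctl create nodegroup --config-file=bottlerocket-ng.yaml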
Is there anything left that is going to happen in the future for this ticket? Some things trawled from the comments on this ticket:
If those aren't part of the current plan, then the plan at https://github.com/aws/containers-roadmap/issues/313#issuecomment-656212703 has been achieved, so maybe resolve this ticket and track requests like the two above as separate issues.
Calm down... it's not like Docker will be broken:
What if you are running dind in your CI? This will not work anymore, right?
(yes, I know you should use something like kaniko for building images)
What if you are running dind in your CI? This will not work anymore, right?
You can still run Docker on the node for ~~dind~~ DooD use; it's just that you'll also need a containerd CRI (or cri-o, or another CRI implementation) on the node for k8s to use. It's like a GPU, then: it's up to the cluster administrator to make Docker available as a system resource if necessary.
For a dind use-case, you should be able to run dockerd as a privileged non-sandboxed daemonset image, so you don't have to install it manually on your nodes, like we do with kube-proxy and many CSI/CNI plugins for example. (It's possible that even a fully-privileged pod doesn't have the access it needs for this, I acknowledge).
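As a rough illustration of that DaemonSet idea (purely a sketch using the stock docker:dind image; the namespace, host socket path, and TLS setting are arbitrary choices, and the privileged/host-access caveats above still apply):
# Sketch only: run dockerd itself as a privileged DaemonSet so DooD-style CI jobs
# still have a Docker daemon, independent of the node's CRI runtime.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dockerd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: dockerd
  template:
    metadata:
      labels:
        app: dockerd
    spec:
      containers:
        - name: dockerd
          image: docker:dind
          securityContext:
            privileged: true
          env:
            - name: DOCKER_TLS_CERTDIR
              value: ""              # socket-only access in this sketch, no TLS listener setup
          volumeMounts:
            - name: docker-run
              mountPath: /var/run    # dockerd creates docker.sock here
      volumes:
        - name: docker-run
          hostPath:
            path: /run/dockerd-ci    # arbitrary host directory CI pods can hostPath-mount
            type: DirectoryOrCreate
EOF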
It's also possible that by the time this happens, Docker's bundled containerd (on Linux) will have CRI available, so you could still install Docker on the node, and then point k8s at the Docker-bundled containerd's CRI. I'm not _certain_ that will work, but if you have no choice but ~~dind~~ DooD, then it'll be worth exploring sooner rather than later.
Of course, if your workflow was assuming that the node would have access to ~~dind~~ DooD-built images without pushing them to a registry, _that_ will break irretrievably. It's already a pretty risky approach now though, so hopefully no one's still relying on that a year from now.
Edit: Actually, I think we're talking about DooD, "Docker outside of Docker", where a container has access to the host's /var/run/docker.sock. That's the flow that breaks when users are relying on the k8s install _automatically_ including a Docker daemon. That flow also breaks now if you use Bottlerocket today, which doesn't use Docker, or Fargate, where you can't run privileged containers at all (and is also not using Docker). So this was already a bad idea on k8s clusters.
Docker-in-Docker is where you have a Docker daemon running in a privileged container with one of the docker:dind container images. I believe that will still work, as it just needs privileged access to the host and doesn't rely on the external container runtime _also_ being Docker, so I expect that will work on Bottlerocket today.
So in short: move to Bottlerocket and test your stuff. If anything breaks around Docker, move back, and you have 3 or so k8s releases to fix it before dockershim is removed and the whole ecosystem looks like that. And you'll have a less-brittle pipeline as a bonus.
Calm down... it's not like Docker will be broken:
kubernetes/kubernetes#94624 (comment)
What if you are running dind in your CI? This will not work anymore, right?
(yes, I know you should use something like kaniko for building images)
https://kubernetes.io/blog/2020/12/02/dont-panic-kubernetes-and-docker/
Thanks. Great stuff. So it seems like dind will be broken then (if you are mounting the socket):
One thing to note: If you are relying on the underlying docker socket (/var/run/docker.sock) as part of a workflow within your cluster today, moving to a different runtime will break your ability to use it. This pattern is often called Docker in Docker. There are lots of options out there for this specific use case including things like kaniko, img, and buildah
Kaniko looks great, so we will probably move to that.
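For anyone else heading that way, a minimal kaniko sketch (the Git repository is a placeholder, and --no-push keeps it self-contained; a real pipeline would add --destination plus registry credentials):
# Sketch: build an image in-cluster with kaniko, no Docker socket required.
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: kaniko-build
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: kaniko
          image: gcr.io/kaniko-project/executor:latest
          args:
            - --context=git://github.com/example/app.git   # placeholder repository
            - --dockerfile=Dockerfile
            - --no-push                                     # real pipelines push with --destination
EOF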
I _think_ you can still make use of the Docker socket by having the docker runtime on your host OS but using a different runtime with Kubernetes. I'd only go for that if updating your applications to something like Kaniko isn't possible right now though.