Instances running tasks in awsvpc networking mode will have greater ENI allotments, allowing for greater task densities.
Will this benefit EKS workers also?
@MarcusNoble EKS uses secondary IPs, so it already allows for much higher pod density on each node.
If we bring this new density to EC2, maybe it can benefit EKS too, but right now it is a much smaller issue there than it is for ECS.
@MarcusNoble Can you please tell us more about your EKS pods-per-node density requirements?
It'd be great if we could make use of some of the smaller instance types (in terms of CPU and memory) but still benefit from being able to run a large number of pods. When we were picking the right instance type we had to provision far more resources than we need because of the IP limitation, balanced against the cost of running more, smaller instances.
Yes please, this would be valuable: increasing container density on ECS/EKS (no matter if IP- or port-based). A one-pager listing max containers per instance flavor would be useful too.
An acceptable level of ENI density would be about 1 ENI per 0.5 vCPU, scaling linearly with instance size rather than only at every other size step as it does today.
I would say 1 ENI / 0.5 vCPU would be on the low end. Honestly, at that rate we probably still wouldn't bother with awsvpc networking mode. We regularly run 10-16 tasks on hosts with as few as 2 vCPUs.
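To put rough numbers on that (my own back-of-the-envelope arithmetic, assuming one ENI stays reserved as the instance's primary interface):

# At the proposed ratio of 1 ENI per 0.5 vCPU, a 2-vCPU host would get
# 2 / 0.5 = 4 ENIs; minus the instance's own primary ENI, that leaves
# roughly 3 awsvpc tasks -- still far short of the 10-16 tasks we run today.
echo $(( 2 * 2 - 1 ))   # 3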
I would point out that on other providers this limit is not in place. So, coming in with purely a k8s background, I expected the hard limit to be the usual 110 pods per node.
This one caught us a bit off guard. We started migrating from GCP and chose machines in AWS as close to the same size as we could. We started the migration and suddenly pods weren't starting.
It was only because we happened to remember reading about IPs per ENI that we were able to figure this out.
I can definitely understand the context switching for the CPU and other factors being an issue with traditional EC2. But with much smaller jobs running, it would be nice to at least be able to acknowledge these risks and do it anyway.
Especially with EKS, where we can, and are responsible for, setting resource requests so k8s can best schedule across our node capacity.
I can explain a good use case for this. We currently have an EKS cluster on AWS and an AKS cluster on Azure.
On the Azure cluster we run many small pods (approx. 80 pods per node): they are so small that they can easily fit on the equivalent of an m5.xlarge. Unfortunately, the m5.xlarge allows only 59 pods per node (of which at least 2 are needed by the system itself).
So we are basically using the Azure cluster for cost optimization.
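For context, the EKS pods-per-node ceiling falls out of the usual VPC CNI arithmetic; a quick sketch, using the published m5.xlarge limits of 4 ENIs and 15 IPv4 addresses per ENI:

# max pods = ENIs * (IPv4 addresses per ENI - 1) + 2
echo $(( 4 * (15 - 1) + 2 ))   # 58 -- in the same ballpark as the ~59 ceiling above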
Any news on when we can expect an update? We are planning to move workloads to ECS using awsvpc but are currently blocked by this issue. We could use the bridge networking mode for now, but it would be good to know whether an update to this issue is imminent or rather something for next year (both are fine, but information on this would be great).
@peterjuras We are currently actively working on this feature. Targeting a release soon, this year.
@emanuelecasadio Please note this issue tracks the progress of ENI density increases for ECS. We are also working on networking improvements for EKS, just not as part of this issue.
@ofiliz Does this mean "calendar year" (i.e., 2019)? We were initially under the impression this feature would be shipping months ago. Until it does ship, awsvpc (and thus App Mesh) is not usable for us.
I second this, I struggle to see AppMesh working for the majority of use cases with ECS given the current ENI limitations and sole support for awsvpc networking mode. It's a shame there is so much focus on EKS support when K8s already has tons of community support and tooling around service-mesh architectures. Meanwhile today, for ECS, all service-mesh deployments have to be more or less home-rolled due to limited support.
I've been patiently waiting, but I'm about to just roll Linkerd out across all of our clusters because the feature set of AppMesh as is right now is still very limited, and this ENI density issue is a non-starter for us. It seems AppMesh was prematurely announced, since it's just now GA 6 months after announcement, and is still effectively unusable for any reasonably sophisticated ECS deployments.
AWS tend to release services as soon as they are useful for some subset of their intended customer base. If you are running reasonably memory-heavy containers then, depending on the instance type you use, you won't hit the ENI limits when using awsvpc networking.
While this is a problem for you (and myself), there are clearly going to be some people for whom this is useful, so it's obviously good to release it to them before solving the much harder problem of ENI density, or reworking awsvpc networking on ECS to use secondary IPs as EKS does, with network policies on top of security groups.
There's certainly a nice level of simplicity in awsvpc networking: each task gets its own ENI, so you can use AWS networking primitives such as security groups natively. EKS's use of secondary IPs for pods sits on top of the already well-established network policies used by overlay networks in Kubernetes, but for a lot of people this is more complexity than necessary.
I personally prefer the simplicity of ECS over Kubernetes for exactly these types of decisions.
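As a concrete illustration of that simplicity (hypothetical cluster, service, subnet and security-group IDs), this is roughly what wiring a security group directly to a service's tasks looks like in awsvpc mode:

# Each task launched by this service gets its own ENI in the given subnet,
# with the given security group applied natively to that ENI.
# Assumes the task definition itself declares networkMode "awsvpc".
aws ecs create-service \
  --cluster my-cluster \
  --service-name my-service \
  --task-definition my-task:1 \
  --desired-count 2 \
  --launch-type EC2 \
  --network-configuration 'awsvpcConfiguration={subnets=[subnet-0abc1234],securityGroups=[sg-0def5678],assignPublicIp=DISABLED}'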
I've said this before in multiple places: having native SGs per ENI is a huge benefit for any org.
Powered by Nitro technology, it should be possible to create a new instance family that removes the ENI-per-vCPU/core limit that currently constrains EC2.
That's pretty outrageous speculation there.
Whatever you do, you're still restricted by the physical limitations of the actual tin, and part of that ENI-per-core thing is just how instances are divided up across the physical kit. Even if the networking is entirely virtualised or offloaded, there's still some cost to it, and AWS needs to be able to portion that out to every user of the tin as fairly as possible.
True, @tomelliff, but it would lift this entire problem to a different scale.
@joshuabaird @mancej Yes, this calendar year, coming soon. We appreciate your feedback. We are aware that this issue impacts App Mesh on ECS and are working hard to increase the task density without requiring our customers to make any disruptive changes or lose any functionality that they expect from VPC networking using awsvpc mode.
Hi everyone: I'm on the product management team for ECS. We're going to be doing an early access period soon for this feature prior to being generally available.
In the event you're interested in participating: can you please email me at bsheck [at] amazon with your AWS account ID(s). I'll ensure your accounts get access and follow up with more specific instructions when the early access period is opened up.
With the Amazon ECS agent v1.28.0 released today, support for high-density awsvpc tasks was announced. What's the new limit? Is it more ENIs per EC2 instance? More IP addresses per ENI?
We have instances running as many as 120 tasks on them, wondering where the limit is now.
Thanks!
@mfortin The agent release today is staged in anticipation of opening up the feature for general availability relatively soon. At that point, we'll publish the documentation with all the various ENI increases on a per-instance basis, and I'll report back here.
@Bensign I sent you an email last month from my corporate address asking to be part of the beta test; we love being guinea pigs ;) If you prefer, I can make this request more official through our TAM.
@mfortin Sending you a note momentarily on this.
When is it planned to go live in production, and when/how can I use it?
@Bensign Any chance of seeing the documentation and/or the feature availability date?
Such information would be great for planning, especially with the vacation period coming up.
It is still in beta, but the GA release is coming soon. I can't share more specifics right now, but we will update this issue once the feature is generally available.
@abby-fuller Is this limited to the specific families listed in the docs, or does it also include sub-families like c5d?
It is currently limited to the specific instance types listed in the docs. We are working on adding additional instance types.
How does this work? Is there any reason why we wouldn't opt into this mode? Are there any limitations?
Is this actually working for anyone? I have the account setting defined and am running the newest ECS AMI (with the 1.28.1 ECS agent, etc.), but I can still only run 3 tasks on an m5.2x. I don't see that the trunk interface is being provisioned. Talking to support now, but I think they may be stumped as well.
An update: I enabled awsvpcTrunking for the account using a non-root account/role. This role was also used to provision the ECS container instance and the ECS service, but ENI trunking was still not working/available. We then logged into the ECS console using the root account and enabled the setting (which sets the default setting for the entire account). After doing this, ENI trunking started working as expected.
@joshuabaird Yup. I had the same issue. You need to enable awsvpcTrunking as the root user. It's not obvious.
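For anyone else debugging this, a couple of standard ECS CLI calls help confirm what the account actually resolved to (add --region/--profile as needed):

# Show the effective awsvpcTrunking setting for the calling principal
aws ecs list-account-settings --name awsvpcTrunking --effective-settings

# Set the account-wide default (run as a principal allowed to change account defaults)
aws ecs put-account-setting-default --name awsvpcTrunking --value enabled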
Does this apply just to ECS or also to EKS? I was directed here by a couple of AWS solution architects before this was closed, and was under the impression it would be usable by EKS as well. The announcement doesn't mention it though.
Hi @geekgonecrazy, this feature is currently only for ECS. Do you want more pods per node in EKS? Or do you want VPC security groups for each EKS pod? If you can tell us more about your requirements, we can suggest solutions or consider adding such a feature in our roadmap.
@ofiliz To quote my initial comment here from 4 months ago:
I would point out that on other providers this limit is not in place. So, coming in with purely a k8s background, I expected the hard limit to be the usual 110 pods per node.
This one caught us a bit off guard. We started migrating from GCP and chose machines in AWS as close to the same size as we could. We started the migration and suddenly pods weren't starting.
It was only because we happened to remember reading about IPs per ENI that we were able to figure this out.
I can definitely understand the context switching for the CPU and other factors being an issue with traditional EC2. But with much smaller jobs running, it would be nice to at least be able to acknowledge these risks and do it anyway.
Especially with EKS, where we can, and are responsible for, setting resource requests so k8s can best schedule across our node capacity.
On every other provider we can use the k8s default of 110 pods per node. With EKS we have to get a machine with more interfaces and way more specs than we need just to reach 110 pods per node.
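For what it's worth, you can see the ENI-derived ceiling each node actually advertises with plain kubectl (nothing EKS-specific assumed here):

# Pod capacity each node reports to the scheduler; on EKS this reflects the
# ENI/IP-based max-pods value rather than the upstream default of 110.
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.capacity.pods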
Are there any plans to also bring this to the smallest instance types (e.g. t2/t3.micro)? I would mainly use this feature for DEV environments, where we bin-pack as much as possible; in production environments I don't see as much need.
@ofiliz we have a workload running on a different cloud provider that we would like to move to EKS, but the fact that we cannot allocate 110 pods on a t3.medium or t3.large node is a no-go for us.
@geekgonecrazy @emanuelecasadio Thanks for your feedback. We are working on significantly improving the EKS pods-per-node density, as well as adding other exciting new networking features. We have created a new item in our EKS roadmap: https://github.com/aws/containers-roadmap/issues/398
ENI trunking doesn't work when opting in via the console as a non-root user. You need to opt in as the root user via the console, or run the following command (as either the root or a non-root user):
aws ecs put-account-setting-default --name awsvpcTrunking --value enabled --region <region>
ENI trunking doesn't work for instances launched in a shared VPC subnet: https://docs.aws.amazon.com/vpc/latest/userguide/vpc-sharing.html
Instances fail to register with the cluster when launched in a shared VPC with the ENI trunking feature enabled.
Bumping @peterjuras's question: will you ever support the t2/t3 family?
Running at least the c5 family in dev/qa/preprod environments costs way too much.
Due to technical constraints with how ENI trunking works, we do not currently have plans to support t2 and t3.