Containers-roadmap: [ECS] : Capacity Strategy to Fall back to OD only When No More Spot Capacity Available

Created on 26 Feb 2020 · 8 Comments · Source: aws/containers-roadmap

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request
Ability to create a capacity strategy that allows you to use spot instances as long as the spot capacity is available, and fall back to on-demand instances only when there is no capacity available for spot.
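To make the requested behaviour concrete, here is a minimal sketch of the placement rule being asked for. It is not an existing ECS feature or API; the function name `plan_spot_first` and the `spot_available` parameter are hypothetical and only illustrate the intended semantics.

```python
def plan_spot_first(desired_count: int, spot_available: int) -> dict:
    """Place as many tasks as possible on Spot, then fall back to on-demand.

    Hypothetical model of the requested strategy, not an ECS API.
    """
    if desired_count < 0 or spot_available < 0:
        raise ValueError("counts must be non-negative")
    on_spot = min(desired_count, spot_available)
    # Whatever Spot cannot absorb goes to on-demand, so desired_count is always met.
    return {"spot": on_spot, "on_demand": desired_count - on_spot}


print(plan_spot_first(5, 5))  # {'spot': 5, 'on_demand': 0}
print(plan_spot_first(5, 2))  # {'spot': 2, 'on_demand': 3}
print(plan_spot_first(5, 0))  # {'spot': 0, 'on_demand': 5}
```

The key property is that the total always equals the desired count, and on-demand is used only for the part that Spot cannot cover.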

Which service(s) is this request for?
ECS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
I was hoping that "base" in a capacity provider strategy would be more of a "strategy", but it seems to be a "constraint". In my use case, I set the base to 5 (which is also the total number of tasks in my service) for capprovider1, which consists entirely of spot instances, and used a 1:1 weight. I expected the base to be met as long as spot instances are available, and otherwise hoped ECS would ignore the base and fall back to capprovider2, which has OD instances. But even when capprovider2 has instances, the service fails to place tasks because it is still trying to satisfy the base.

Are you currently working around this issue?
Using Lambda.
Please let me know if more information is required or in case there is a better alternative.

Proposed

aliabas7

👍50

Most helpful comment

We also observe a similar problem that I will describe below. If it sounds like a separate issue please let me know.

We run our ECS cluster with the following default providers:
FARGATE_SPOT base=0 weight=50
FARGATE base=0 weight=50

Now let's say we run a service that uses the default providers and uses autoscaling.

If the service has a desired_count=10 and the fargate_spot capacity is not available, ECS will not use the available fargate capacity to honour desired_count. The service will run with only 5 tasks instead.

I consider this almost a bug, as it is very counterintuitive that ECS allocates by provider first and consistently ignores desired_count.
We would prefer an integrated spot/non-spot scaling approach like EC2 Fleet does.
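The behaviour described above can be reproduced with a simplified model. This is a sketch under stated assumptions, not the actual ECS scheduler: the `Provider` class, its `available` field, and the largest-remainder rounding are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Provider:
    name: str
    base: int
    weight: int
    available: int  # hypothetical: how many tasks this provider can place right now


def split_by_strategy(desired: int, providers: List[Provider]) -> Dict[str, int]:
    """Target count per provider: satisfy base first, then split the rest by weight."""
    targets = {p.name: 0 for p in providers}
    remaining = desired
    for p in providers:
        take = min(p.base, remaining)
        targets[p.name] += take
        remaining -= take
    total_weight = sum(p.weight for p in providers)
    if remaining and total_weight:
        shares = [(remaining * p.weight / total_weight, p) for p in providers]
        for share, p in shares:
            targets[p.name] += int(share)
        # Assumption: leftover tasks go to the largest fractional shares.
        leftover = remaining - sum(int(share) for share, _ in shares)
        by_fraction = sorted(shares, key=lambda sp: sp[0] - int(sp[0]), reverse=True)
        for _, p in by_fraction[:leftover]:
            targets[p.name] += 1
    return targets


def running_tasks(desired: int, providers: List[Provider]) -> Dict[str, int]:
    """Each provider runs at most its own target; no capacity is borrowed across providers."""
    targets = split_by_strategy(desired, providers)
    return {p.name: min(targets[p.name], p.available) for p in providers}


providers = [
    Provider("FARGATE_SPOT", base=0, weight=50, available=0),   # no Spot capacity
    Provider("FARGATE", base=0, weight=50, available=100),
]
print(running_tasks(10, providers))  # {'FARGATE_SPOT': 0, 'FARGATE': 5}
```

Under the same model, the original poster's setup (capprovider1 with base=5 and no available Spot instances, capprovider2 with weight 1 and spare capacity, desired count 5) yields zero running tasks, because the whole desired count is assigned to the base of the unavailable provider.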

dactp on 4 Mar 2020

👍11

All 8 comments


I consider this almost a bug, as it is very counterintuitive that ECS allocates by provider first and consistently ignores desired_count.

I fully agree with this. There should be an option to prioritize the desired count over the capacity provider split. It would open the door to more flexible use of spot capacity, including for long-running services.

nikovirtala on 14 Apr 2020

Couldn't agree more. I asked about this when SPOT was launched and had a chat with our TAM and also the service team. I don't think it was on the agenda for any time soon back then. Personally, I doubt this will be a priority for AWS, as it makes SPOT just too easy: everyone would choose SPOT instead of FARGATE, and where is the fun in that...

jitesh88 on 8 Jul 2020

@dactp I'm confused. Does this setting:

FARGATE_SPOT base=0 weight=50
FARGATE base=0 weight=50

allow OD to be used only if SPOT is not available?

twigs67 on 26 Aug 2020

It just means ECS will run 50% of tasks on FARGATE and 50% on FARGATE_SPOT. There is no failover if one of them is not available.

nathanielram on 26 Aug 2020

👍4

StevePavlin on 5 Sep 2020

Sanyambansal76 on 4 Nov 2020

How would one handle this with Lambda?

Trigger a Lambda function on a spot allocation failure event, which then makes a RunTask API call with the FARGATE capacity provider?
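One way the idea in this question could look is sketched below. It is not a confirmed workaround from the thread. It assumes an EventBridge rule forwards "ECS Service Action" events with `eventName` set to `SERVICE_TASK_PLACEMENT_FAILURE` to the function (verify the event shape against the ECS documentation), and that the hypothetical environment variables `TASK_DEFINITION` and `SUBNETS` are configured on the function. The `ecs_client` parameter is an addition so the handler can be exercised without AWS access.

```python
import os


def should_fall_back(event: dict) -> bool:
    """True for ECS service placement-failure events (assumed event shape)."""
    detail = event.get("detail", {})
    return (
        event.get("detail-type") == "ECS Service Action"
        and detail.get("eventName") == "SERVICE_TASK_PLACEMENT_FAILURE"
    )


def run_on_fargate(ecs_client, cluster, task_definition, subnets, count=1):
    """Start tasks on the on-demand FARGATE capacity provider via RunTask."""
    return ecs_client.run_task(
        cluster=cluster,
        taskDefinition=task_definition,
        count=count,
        capacityProviderStrategy=[{"capacityProvider": "FARGATE", "weight": 1}],
        networkConfiguration={
            "awsvpcConfiguration": {"subnets": subnets, "assignPublicIp": "DISABLED"}
        },
    )


def lambda_handler(event, context, ecs_client=None):
    if not should_fall_back(event):
        return {"launched": 0}
    if ecs_client is None:
        import boto3  # imported lazily so the module loads without boto3 installed

        ecs_client = boto3.client("ecs")
    response = run_on_fargate(
        ecs_client,
        cluster=event["detail"]["clusterArn"],
        task_definition=os.environ["TASK_DEFINITION"],  # hypothetical setting
        subnets=os.environ["SUBNETS"].split(","),       # hypothetical setting
    )
    return {"launched": len(response.get("tasks", []))}
```

Note that tasks started with RunTask are standalone: the service does not manage or count them, so something would also have to stop them once Spot capacity returns.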

seanturner026 on 4 Dec 2020
