Containers-roadmap: [ECS] Full support for Capacity Providers in CloudFormation.

Created on 5 Dec 2019  ·  89 Comments  ·  Source: aws/containers-roadmap

CloudFormation does not currently have support for capacity providers in any of the ECS resource types. We will be adding this support in the near future.

ECS Work in Progress

Most helpful comment

Any ETA on this?

All 89 comments

Related to this, in order to support capacity providers with managedTerminationProtection, we also need to be able to set the new-instances-protected-from-scale-in property when creating the ASG via CloudFormation. This latter property was added 4 years ago to the AWS SDK / AWS CLI, but is still not supported in CF -- hopefully full support for CP in CF is added a bit faster.

Has there been any progress made on this?

Add support for Capacity providers #1

We are working on it and will provide updates as soon as more information is available.

Related to this, in order to support capacity providers with managedTerminationProtection, we also need to be able to set the new-instances-protected-from-scale-in property when creating the ASG via CloudFormation. This latter property was added 4 years ago to the AWS SDK / AWS CLI, but is still not supported in CF -- hopefully full support for CP in CF is added a bit faster.

Additionally, when the new-instances-protected-from-scale-in property is set on the ASG, scheduled actions that scale in instances cannot be executed. A feature like force-scale-in for scheduled actions would be useful if, for example, we have a dev environment and want to turn instances off for the night and back on in the morning.

+1

When this is implemented, will it be possible to do a rolling update to the launch template under autoscaling and a change to a service in ecs, such that the new tasks run on instances from the new launch template while the old ones stay on the old instances as they roll over?

I'm struggling to achieve this with custom resources at the moment, partly as the dependencies are all in funny directions. Would be great to have it all defined declaratively in cfn.

Any ETA on this?

Does this depend on #632?

Does this depend on #632?

I don't think so.

Sadly, that's the reason why using CloudFormation is becoming more and more frustrating.

FWIW, Terraform has supported this since shortly after the API was released: https://github.com/terraform-providers/terraform-provider-aws/pull/11151

Of course, it can't delete capacity providers since there's no API:
https://www.terraform.io/docs/providers/aws/r/ecs_capacity_provider.html

I don't want to use, rely on and support third-party software if I have a chance to use the official product.

any update?

same here, any updates?

any update?

The lack of CFN support for this, six months in, is really disappointing. It puts the burden on anyone building CI/CD with CFN to add extra, silly custom CLI/SDK pieces to actually tie in capacity providers, which then have to be ripped out once support that should have shipped in a point release is in place.
You can do better. Communicating timeframes would help as well.

Have you had a deeper look into Capacity Providers and Cluster Auto Scaling? Does not match with my requirements at all. Does not scale down properly. Does not work with CloudFormation rolling updates for the ASG. So missing CloudFormation support is not the only problem here. :)

Have you had a deeper look into Capacity Providers and Cluster Auto Scaling? Does not match with my requirements at all. Does not scale down properly. Does not work with CloudFormation rolling updates for the ASG. So missing CloudFormation support is not the only problem here. :)

Thanks for the feedback - can you explain more what you mean by "does not scale down properly"?

coultn: Here's what I think is a common use case: a CI/CD pipeline where services are spun up on an ASG-backed EC2 cluster.
Services do not pre-exist; the CI/CD pipeline creates them.
Currently, you cannot use CFN to create a capacity-provider-enabled service.
If the underlying cluster doesn't have the memory or CPU, I would expect that when a new service is deployed, it would add another EC2 instance and deploy the new service... but there's no way to do that currently. I suppose what might work right now is: deploy the service with no capacity provider, perhaps with a desired count of 0 so it stabilizes; then, via the CLI, update the service to use a capacity provider; then make another CLI call to increase the count to 1... but that seems like hoop-jumping.
With regards to scaling down, the documentation seems a bit unclear on exactly how this is meant to work: if the goal is to optimize resources, I would actually want the CP to be intelligent enough to (a) determine that the cluster is currently overprovisioned and, (b) if so, drain EC2 instances accordingly and have the ASG terminate the drained instances, all with standard, appropriate cooldown periods, etc.
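For what it's worth, a rough sketch of that CLI hoop-jumping (the cluster, service, and capacity provider names are hypothetical, and this assumes the installed CLI supports `--capacity-provider-strategy` on `update-service`):

```shell
# Step 1: attach a capacity provider strategy to the (desired-count 0) service.
aws ecs update-service \
    --cluster my-cluster \
    --service my-service \
    --capacity-provider-strategy capacityProvider=my-capacity-provider,weight=1 \
    --force-new-deployment

# Step 2: scale the service up so the capacity provider can add instances.
aws ecs update-service \
    --cluster my-cluster \
    --service my-service \
    --desired-count 1
```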

Currently, you cannot use CFN to create a capacity-provider-enabled service.

Thanks for the feedback! We are working on full support for capacity providers in CloudFormation, and we definitely understand the need for that. However, I do want to point out that you can actually create a capacity-provider enabled service in CloudFormation today. You can accomplish this by first configuring a default capacity provider strategy for the cluster. This default capacity provider strategy will be used by any service you create that does not specify a launch type. Next, when you create your service in CloudFormation, do not include the LaunchType parameter. The service will use the capacity provider strategy defined by the cluster, and will auto-scale from zero instances if necessary.

With regards to scaling down, the documentation seems a bit unclear on exactly how this is meant to work: if the goal is to optimize resources, I would actually want the CP to be intelligent enough to (a) determine that the cluster is currently overprovisioned and, (b) if so, drain EC2 instances accordingly and have the ASG terminate the drained instances, all with standard, appropriate cooldown periods, etc.

Understood. In the first version of ECS cluster auto scaling, we took a more conservative route where instances would not scale in unless no tasks are running on them. We are looking at the idea of automating an "instance drainer" that will automatically find underutilized instances and set them to draining. With ECS cluster auto scaling, those instances would automatically shut down once no tasks are running on them. It's possible to do this already today, but you would need to implement your own Lambda function (or similar) to do the evaluation of the instance and call the ECS API to set the instance to the DRAINING state.
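A rough sketch of what that Lambda (or cron job) would call; picking which instance is "underutilized" is the part you'd have to implement yourself, so the instance ARN below is a placeholder:

```shell
# List the container instances in the cluster, then set a chosen
# underutilized one to DRAINING; with ECS cluster auto scaling enabled,
# the instance terminates once no tasks remain on it.
aws ecs list-container-instances --cluster my-cluster

aws ecs update-container-instances-state \
    --cluster my-cluster \
    --container-instances <CONTAINER_INSTANCE_ARN> \
    --status DRAINING
```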

Really awesome feedback, thank you. As far as the workaround of setting it at cluster creation, I'll take a look at that... easy enough to implement for QA/dev, a little trickier for existing prod environments.

Trying to avoid custom tooling since...this seems sooo close to being a solid solution.

Any timing on better cfn support? I know that's a different, probably very overwhelmed team, but would be nice to see some improvements here. ECS rocks, and once this is dialed in, it's going to really round out the offering.

Will keep checking for ECS updates!

Dear colleagues,
Please provide, in CF, the ability to fine-tune some of the Capacity Provider's auto-generated parameters. Currently, in addition to the existing parameters, we need to adjust the Cooldown in the Auto Scaling Plan manually, as well as the alarm datapoints, all after the Capacity Provider is created. It would be great to put all of this together in the CF template. This is a must for us. Thank you very much!

Regarding timeline - we can't share specific timelines but we will share updates here as soon as they are available.

coultn:
Because this is such a useful feature for so many of my clients, I decided to re-tool things today.
Unfortunately, capacity providers still don't seem to work.
The cluster default CP is in place.
I re-created services without the LaunchType reference, and it clearly shows the services are using the capacity provider strategy.
However, when I deploy services and exhaust the memory, it throws the usual message saying it can't find a container instance with the resources.
Interestingly, and probably to the point: the CloudWatch metric for the CP that is assigned to this cluster (CapacityProviderReservation) isn't reporting any metrics at all.
I have seen this metric chart more appropriately in previous tests a few weeks ago with another client... no idea why it's not reporting anything. I spun up about 5-8 services today on this cluster using the CP strategy.
I'll just keep checking back for updates...hopefully some good changes coming soon.

+1

This is definitely a showstopper for our CDK-powered automation workflows. Setting the capacity provider at the cluster level is something the CloudFormation team is looking into: https://github.com/aws-cloudformation/aws-cloudformation-coverage-roadmap/issues/301

In the meantime our workaround is to run the following aws-cli command in our CI/CD workflow:

aws ecs put-cluster-capacity-providers \
    --cluster CLUSTER_NAME \
    --capacity-providers FARGATE \
    --default-capacity-provider-strategy capacityProvider=FARGATE

I really hope this ships soon. 🤞

+2

Deletion is now supported by the API. Will this accelerate the implementation of this feature addition?

https://aws.amazon.com/jp/about-aws/whats-new/2020/06/amazon-ecs-capacity-providers-support-delete-functionality/

+1

Saw this earlier today, but the resources don’t seem to have been updated yet: https://twitter.com/aws_doc/status/1273943424849383424?s=21

I have implemented the new CloudFormation resources in one of my stacks and can confirm it works 👍

There's still a missing link, though, which might be (part of) the reason why it was not announced yet:

AWS::ECS::CapacityProvider AutoScalingGroupProvider requires the parameter AutoScalingGroupArn which accepts only an ARN (which contains a UUID part so you cannot "guess" it).

Unfortunately AWS::AutoScaling::AutoScalingGroup does not expose its ARN so there's no way to reference this in the AutoScalingGroupProvider for now.

Either hardcoding an existing ARN or, once more, hacking around with a Custom Resource to get the ARN works.

Ah AWS, where just the C is an acceptable MVP for CRUD. Oh well, glad it's finally getting released.

I have implemented the new CloudFormation resources in one of my stacks and can confirm it works

There's still a missing link, though, which might be (part of) the reason why it was not announced yet:

AWS::ECS::CapacityProvider AutoScalingGroupProvider requires the parameter AutoScalingGroupArn which accepts only an ARN (which contains a UUID part so you cannot "guess" it).

Unfortunately AWS::AutoScaling::AutoScalingGroup does not expose its ARN so there's no way to reference this in the AutoScalingGroupProvider for now.

Either hardcoding an existing ARN or, once more, hacking around with a Custom Resource to get the ARN works.

What about termination protection on Auto Scaling and managed termination on the CapacityProvider? I believe the AutoScaling resource needs to be updated to support that.

A typical scenario of having a template with an ASG and a capacity provider defined in the same template (which rhlarora84 alluded to) is not possible, because the AWS::AutoScaling::AutoScalingGroup resource only returns the name, but the capacity provider requires an ARN. That's kind of a miss on the ASG resource as well (why does it not have an Arn attribute?).
At the least, it would be nice if the capacity provider could accept either the name or the ARN. A number of other resources support that.

@coultn Hello, is there a way to do an ASG rolling update (for an AMI refresh or the like) while using a capacity provider with managed termination? At present the combination (ECS cluster, capacity provider, ASG, CloudFormation) does not support rolling updates, since the ASG's termination protection must be on for the CP's managed termination to work, so for now we are sacrificing the CP's managed termination in favor of rolling updates. It would be great if all of these could be accommodated together.

@manokaran3529 be careful, on scale down we saw container instances being terminated with managed termination protection off, when there was a better choice available (instance not running any container). You raise a good point regarding the rolling update though, I'm intending on using that and haven't tested yet...

How do you manage the circular dependency?

The ECS cluster _needs_ the capacity provider.
The capacity provider needs the ASG (because of the ARN).

On stack deletion, the ECS cluster will get deleted first and fail because its ASG is still alive.

Error occurred during operation 'DeleteClusters SDK Error: The Cluster cannot be deleted while Container Instances are active or draining. (Service: Ecs, Status Code: 400, Request ID: 5751e46b-d3d4-4f0c-ad2f-ca7e072184c7, Extended Request ID: null)'.

Hello! We are actively working on a few things to provide more comprehensive capacity provider support in CloudFormation.

  1. Ability to reference the ASG name in the AWS::ECS::CapacityProvider resource
  2. Ability to specify a custom capacity provider strategy in the AWS::ECS::Service resource
  3. Ability to enable scale-in protection in the AWS::AutoScaling::AutoScalingGroup resource

Hello! We are actively working on a few things to provide more comprehensive capacity provider support in CloudFormation.

  1. Ability to reference the ASG name in the AWS::ECS::CapacityProvider resource
  2. Ability to specify a custom capacity provider strategy in the AWS::ECS::Service resource
  3. Ability to enable scale-in protection in the AWS::AutoScaling::AutoScalingGroup resource

ETA?

@manokaran3529 be careful, on scale down we saw container instances being terminated with managed termination protection off, when there was a better choice available (instance not running any container). You raise a good point regarding the rolling update though, I'm intending on using that and haven't tested yet...

Yes, it terminated an instance which had most of the tasks. As a hack, we changed the termination policy of the ASG to 'Newest', so on termination it picked the newest instance, where we only had the scaled-up tasks.
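In template form, that hack is just the ASG's TerminationPolicies property (a fragment; all other ASG properties are omitted):

```yaml
AutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    # Terminate the most recently launched instances first on scale-in,
    # as a workaround while managed termination protection is off.
    TerminationPolicies:
      - NewestInstance
    # ... MinSize, MaxSize, LaunchTemplate, etc. as usual ...
```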

Capacity Provider for Cloudformation is now available: https://d201a2mn26r7lk.cloudfront.net/latest/gzip/CloudFormationResourceSpecification.json

(or more friendly changelog: https://github.com/aws/aws-cdk/commit/4ce27f4195c70bd9e365ec0e0df5c0ede863bc8a)

Capacity Provider for Cloudformation is now available: https://d201a2mn26r7lk.cloudfront.net/latest/gzip/CloudFormationResourceSpecification.json

(or more friendly changelog: aws/aws-cdk@4ce27f4)

What does this mean? This is old news here; it looks like the same thing to me. I still check here every day to see whether the fixes are done or not.

Sorry, I missed that this was released 12 days ago. Will wait for the fixes above.

I was doing some testing today, and I noticed that I could pass the AutoScalingGroup name as the autoScalingGroupArn in the CreateCapacityProvider API call, when previously it would error out.

Armed with this knowledge I tried this:

  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      DesiredCapacity: 0
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MaxSize: 2
      MinSize: 0
      VPCZoneIdentifier:
        - !Ref SubnetId

  CapacityProvider:
    Type: AWS::ECS::CapacityProvider
    Properties:
      AutoScalingGroupProvider:
        AutoScalingGroupArn: !Ref AutoScalingGroup
        ManagedScaling:
          Status: DISABLED
        ManagedTerminationProtection: DISABLED

And it worked! I only tested this in the ap-southeast-2 region. So I assume the reason this change wasn't announced is because it isn't live everywhere yet?

Good news for everyone tracking this issue, though. I'll wait for this to be confirmed here before I use it in production, but it saves me from using a rather ugly custom resource to extract the ARN like I was planning to do.

Indeed, documentation has been updated to "The Amazon Resource Name (ARN) or short name that identifies the Auto Scaling group."

Hi all, confirming that

  1. Ability to reference the ASG name in the AWS::ECS::CapacityProvider resource

is now available in all regions.

@anoopkapoor any ETA on (3), scale-in protection on Auto Scaling?

@anoopkapoor any ETA on (2), the ability to specify a custom capacity provider strategy in the AWS::ECS::Service resource?

I was doing some testing today, and I noticed that I could pass the AutoScalingGroup name as the autoScalingGroupArn in the CreateCapacityProvider API call, when previously it would error out.

Armed with this knowledge I tried this:

  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      DesiredCapacity: 0
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MaxSize: 2
      MinSize: 0
      VPCZoneIdentifier:
        - !Ref SubnetId

  CapacityProvider:
    Type: AWS::ECS::CapacityProvider
    Properties:
      AutoScalingGroupProvider:
        AutoScalingGroupArn: !Ref AutoScalingGroup
        ManagedScaling:
          Status: DISABLED
        ManagedTerminationProtection: DISABLED

And it worked! I only tested this in the ap-southeast-2 region. So I assume the reason this change wasn't announced is because it isn't live everywhere yet?

Good news for everyone tracking this issue, though. I'll wait for this to be confirmed here before I use it in production, but it saves me from using a rather ugly custom resource to extract the ARN like I was planning to do.

How do you still manage the circular dependency?

The ECS cluster needs the capacity provider.
The capacity provider needs the ASG (because of the Ref).

On stack deletion, the ECS cluster will get deleted first and fail because its ASG is still alive.

Error occurred during operation 'DeleteClusters SDK Error: The Cluster cannot be deleted while Container Instances are active or draining. (Service: Ecs, Status Code: 400, Request ID: 5751e46b-d3d4-4f0c-ad2f-ca7e072184c7, Extended Request ID: null)'.

How do you still manage the circular dependency?

The ECS cluster needs the capacity provider.
The capacity provider needs the ASG (because of the Ref).

On stack deletion, the ECS cluster will get deleted first and fail because its ASG is still alive.

Error occurred during operation 'DeleteClusters SDK Error: The Cluster cannot be deleted while Container Instances are active or draining. (Service: Ecs, Status Code: 400, Request ID: 5751e46b-d3d4-4f0c-ad2f-ca7e072184c7, Extended Request ID: null)'.

To be honest, I didn't manage deletion back then. Now, we can either leave this as is and just deal with it like the "non-empty bucket" problem for AWS::S3::Bucket, or we can add logic into the Delete workflow for AWS::ECS::Cluster to drain all existing services and tasks.

I'm a fan of the second approach because, unlike with an AWS::S3::Bucket, we typically don't have permanent data loss if the cluster is deleted. Unfortunately it'll take time to get a teardown process that works for most circumstances.

I can probably make a generic custom resource/resource provider that can be used for clean-up if you're running ephemeral workloads and need a solution to this right now?
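Until then, a bare-bones teardown for ephemeral workloads might look like this (cluster name hypothetical, error handling omitted):

```shell
# Scale each service to zero and delete it, then delete the cluster
# once no container instances are active or draining.
for svc in $(aws ecs list-services --cluster my-cluster \
                 --query 'serviceArns[]' --output text); do
  aws ecs update-service --cluster my-cluster --service "$svc" --desired-count 0
  aws ecs delete-service --cluster my-cluster --service "$svc"
done

aws ecs delete-cluster --cluster my-cluster
```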

Without specifying a capacity provider on a service, is it actually possible to deploy an ECS service into a Fargate-only ECS cluster (with the out-of-the-box capacity providers plus a default capacity provider set)?

Would assume it would "just work", but in the current implementation we're seeing errors along the lines of:

There are no capacity providers in the capacity provider strategy with a weight value greater than zero. Specify a weight value greater than zero for at least one capacity provider and try again.

I presume adding the ability to set them (with weighting) on the service (similar to the command line) fixes that, but from my understanding it appears it's impossible to do that at the moment?

Without specifying a capacity provider on a service, is it actually possible to deploy an ECS service into a Fargate-only ECS cluster (with the out-of-the-box capacity providers plus a default capacity provider set)?

Would assume it would "just work", but in the current implementation we're seeing errors along the lines of:

There are no capacity providers in the capacity provider strategy with a weight value greater than zero. Specify a weight value greater than zero for at least one capacity provider and try again.

I presume adding the ability to set them (with weighting) on the service (similar to the command line) fixes that, but from my understanding it appears it's impossible to do that at the moment?

How did you define the ECS Cluster to make a Fargate only cluster? Can you either describe it using DescribeClusters or provide the CloudFormation template snippet you used?

@taylorb-syd

We haven't updated to use the new capacity provider support in CFN, but here are the steps we followed to make the cluster (which we have been running tasks in fine, without specifying anything).

Cluster itself is just a raw AWS::ECS::Cluster in CloudFormation (tags set, nothing else), and after creation we ran:

aws ecs put-cluster-capacity-providers \
    --cluster "my-cluster-name" \
    --capacity-providers FARGATE FARGATE_SPOT \
    --default-capacity-provider-strategy capacityProvider=FARGATE_SPOT \
    --region ap-southeast-2

If that makes sense?

So there are no ECS instances / other capacity providers, just the above configuration - and attempting to deploy a service into that causes the given issue.

I think the root of the issue is that a service ignores the default-capacity-provider-strategy.

Personally, in CFn, I used

ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      CapacityProviders:
        - FARGATE_SPOT
      DefaultCapacityProviderStrategy:
        - Base: 0
          CapacityProvider: FARGATE_SPOT
          Weight: 1

And my ECS cluster is pure FARGATE_SPOT.

And before the CFn support, I did exactly like Sutto above.

Ah, interesting - odd that it works without that; sounds like I've just gone down the wrong path (thanks @PierreKiwi). The above statement previously created it (a few months ago) with no weight set, and that value was hidden in the UI.

Turns out the default configuration above ignores giving it a weight, which seems counterintuitive; the CFN version is much more explicit, and much nicer.

To be perfectly correct, actually before CFn support, I was doing this

aws ecs put-cluster-capacity-providers \
    --cluster <CLUSTER_NAME> \
    --capacity-providers FARGATE_SPOT \
    --default-capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=1,base=0 \
    --region <REGION>

So it was an easy translation to CFn :)

Ah, interesting - odd that it works without that; sounds like I've just gone down the wrong path (thanks @PierreKiwi)

Turns out the default configuration above ignores giving it a weight, which seems counterintuitive; the CFN version is much more explicit, and much nicer.

Ahh yes, it defaults to weight : 0, which I think is not desirable. I'll report this up the chain internally. For now, make sure you set a non-zero weight if you're only specifying one capacity provider.

aws ecs create-cluster --cluster-name testing --capacity-providers FARGATE FARGATE_SPOT --default-capacity-provider-strategy capacityProvider=FARGATE_SPOT
{
    "cluster": {
        "clusterArn": "arn:aws:ecs:ap-southeast-2:<redacted>:cluster/testing",
        "clusterName": "testing",
        "status": "PROVISIONING",
        "registeredContainerInstancesCount": 0,
        "runningTasksCount": 0,
        "pendingTasksCount": 0,
        "activeServicesCount": 0,
        "statistics": [],
        "tags": [],
        "settings": [
            {
                "name": "containerInsights",
                "value": "enabled"
            }
        ],
        "capacityProviders": [
            "FARGATE",
            "FARGATE_SPOT"
        ],
        "defaultCapacityProviderStrategy": [
            {
                "capacityProvider": "FARGATE_SPOT",
                "weight": 0,
                "base": 0
            }
        ],
        "attachmentsStatus": "UPDATE_IN_PROGRESS"
    }
}

@taylorb-syd @PierreKiwi thanks for the help with this - that wasn't a fun one to try to work around. Might also be worth pushing up the chain in the ECS team to show the weight when looking in the console - at the moment it's hidden until you click edit/add new, and there appears to be no way to modify the weight of an existing entry (short of adding and removing in the same request).

@taylorb-syd @PierreKiwi thanks for the help with this - that wasn't a fun one to try to work around. Might also be worth pushing up the chain in the ECS team to show the weight when looking in the console - at the moment it's hidden until you click edit/add new, and there appears to be no way to modify the weight of an existing entry (short of adding and removing in the same request).

I have noted this in my internal request. Thanks.

Is launching a task on Fargate Spot using CloudFormation supported at the moment? We are trying to do this, and it seems like there is some initial support for capacity providers, but not for attaching tasks to those capacity providers.

You don't need to do anything special to attach tasks to the capacity providers. If you have this line in your Service definition, you'll want to remove it:

LaunchType: FARGATE

Then you'll see the Capacity provider listed for your task:

Task definition console

You don't need to do anything special to attach tasks to the capacity providers. If you have this line in your Service definition, you'll want to remove it:

LaunchType: FARGATE

Then you'll see the Capacity provider listed for your task:

Task definition console

The assumption here is that a DCPS (default capacity provider strategy) has been set on your cluster, similar to this:

ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      CapacityProviders:
        - FARGATE_SPOT
      DefaultCapacityProviderStrategy:
        - Base: 0
          CapacityProvider: FARGATE_SPOT
          Weight: 1

Without a DCPS, removing the LaunchType will default to attempting to launch the task under EC2 instead of FARGATE_SPOT. Other than that @adamkeim-pwr it should work as expected.
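For completeness, a minimal sketch of such a service, with no LaunchType so it inherits the cluster's DCPS (the task definition and subnet parameter are assumed to exist elsewhere in the template):

```yaml
Service:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref ECSCluster
    TaskDefinition: !Ref TaskDefinition   # assumed to be defined elsewhere
    DesiredCount: 1
    # No LaunchType: the task falls back to the cluster's default
    # capacity provider strategy (FARGATE_SPOT in the snippet above).
    NetworkConfiguration:
      AwsvpcConfiguration:
        Subnets:
          - !Ref SubnetId                  # assumed parameter
```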

If I have a cluster with FARGATE and FARGATE_SPOT as capacity providers, how can I select which capacity provider a task uses? I know I can set weights, but I would like to be able to select the capacity provider at the task level. Do I just need to use two clusters?

I was doing some testing today, and I noticed that I could pass the AutoScalingGroup name as the autoScalingGroupArn in the CreateCapacityProvider API call, when previously it would error out.

Armed with this knowledge I tried this:

  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      DesiredCapacity: 0
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MaxSize: 2
      MinSize: 0
      VPCZoneIdentifier:
        - !Ref SubnetId

  CapacityProvider:
    Type: AWS::ECS::CapacityProvider
    Properties:
      AutoScalingGroupProvider:
        AutoScalingGroupArn: !Ref AutoScalingGroup
        ManagedScaling:
          Status: DISABLED
        ManagedTerminationProtection: DISABLED

And it worked! I only tested this in the ap-southeast-2 region. So I assume the reason this change wasn't announced is because it isn't live everywhere yet?

Good news for everyone tracking this issue, though. I'll wait for this to be confirmed here before I use it in production, but it saves me from using a rather ugly custom resource to extract the ARN like I was planning to do.

This is still not going to work. In a non-trivial deployment of ECS you will want to pass the ECS cluster ID into the launch configuration/template so that the agent knows which cluster to join. This means that the dependency chain goes like this:

ECSCluster -> Capacity Provider -> Autoscaling Group -> Launch Config -> ECSCluster

What is needed is a resource to manage the attachment of the Capacity provider to the ECSCluster and break the dependency loop. If that is not provided by AWS you are going to need a custom resource to add the capacity provider.

What I'm doing is passing the cluster name as a parameter of the stack. This way you can use the parameter both in the ClusterName property of the cluster and in the UserData of the LaunchTemplate. Circular dependency gone!
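A sketch of that trick (resource and parameter names are made up):

```yaml
Parameters:
  ClusterName:
    Type: String

Resources:
  ECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: !Ref ClusterName   # same value the instances join below

  LaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        # The agent joins the cluster by name, so nothing here
        # references the cluster resource itself -- no cycle.
        UserData:
          Fn::Base64: !Sub |
            #!/bin/bash
            echo ECS_CLUSTER=${ClusterName} >> /etc/ecs/ecs.config
```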

What I'm doing is passing the cluster name as a parameter of the stack. This way you can use the parameter both in the ClusterName property of the cluster and in the UserData of the LaunchTemplate. Circular dependency gone!

Thanks, but that is not a solution. At best it is a workaround that solves the circular dependency but not the operational aspects.

It will only work if you set your capacity to 0 at create time so the stack completes, then run a second pass to update the minimum or desired count. If you start the Auto Scaling group with anything but zero, your nodes will not start the ECS agent because the cluster will not exist yet, and you will need to replace the instances. If you use a CreationPolicy and cfn-signal, it will prevent the ASG from starting at all. You should not need two passes to make this work.

Any instances that launch ahead of the cluster creation do end up getting registered with the cluster as well, once it's created.

That is only because systemd restarts the service that spins up the Docker container. It will eventually back off if it does not see it run successfully. I guess it will eventually join, but that is not the point. We follow the same pattern with a service controlling a Docker container on our Ubuntu AMI using Ansible on bootstrap; the bootstrap will fail if the service does not start, though, and we get a non-zero return and a failure signal is sent from cfn-signal. We want to know that the instance is ready to serve when it signals.

We are talking about operating this in a Cloudformation environment are we not? If you do rolling AMI updates for your cluster you surely want to know that the instance has joined the cluster before you signal for Cloudformation to move to the next instance (along with a lifecycle hook to drain instances before termination)?

We might be running in a slightly niche way, sure... but do you think a new user should have to find this thread to discover this workaround to get this working?
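For reference, the termination lifecycle hook mentioned above might be declared like this (the hook only pauses termination; something, e.g. a Lambda, still has to do the actual draining and then complete the lifecycle action):

```yaml
DrainHook:
  Type: AWS::AutoScaling::LifecycleHook
  Properties:
    AutoScalingGroupName: !Ref AutoScalingGroup   # assumed ASG resource
    LifecycleTransition: autoscaling:EC2_INSTANCE_TERMINATING
    HeartbeatTimeout: 900     # seconds the instance is held for draining
    DefaultResult: CONTINUE   # proceed with termination if nothing responds
```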

Hey, sorry if I'm being a bit daft: I've been following the thread but I'm not 100% sure what the status of this is. Is the functionality out, but somewhat experimental and not yet documented?

I Googled "cloudformation" capacity provider and this was the only thing that came back:

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-cluster-capacityproviderstrategyitem.html

Is this enough to get up and running with Capacity Providers + CloudFormation?

Thanks!

@EdwardIII I found the same thing; Google is not returning the CloudFormation documentation for this yet, but it is there. Some of it is working with workarounds, but IMO it is currently half-baked and needs work.

I was doing some testing today, and I noticed that I could pass the AutoScalingGroup name as the autoScalingGroupArn in the CreateCapacityProvider API call, when previously it would error out.
Armed with this knowledge I tried this:

  AutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      DesiredCapacity: 0
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplate
        Version: !GetAtt LaunchTemplate.LatestVersionNumber
      MaxSize: 2
      MinSize: 0
      VPCZoneIdentifier:
        - !Ref SubnetId

  CapacityProvider:
    Type: AWS::ECS::CapacityProvider
    Properties:
      AutoScalingGroupProvider:
        AutoScalingGroupArn: !Ref AutoScalingGroup
        ManagedScaling:
          Status: DISABLED
        ManagedTerminationProtection: DISABLED

And it worked! I have only tested this in the ap-southeast-2 region, so I assume the reason this change wasn't announced is that it isn't live everywhere yet?
Good news for everyone tracking this issue, though. I'll wait for this to be confirmed here before I use it in production, but it saves me from using a rather ugly custom resource to extract the ARN like I was planning to do.

This is still not going to work. In a non-trivial deployment of ECS you will want to pass the ECS cluster name into the launch configuration/template so that the agent knows which cluster to join. This means the dependency chain goes like this:

ECSCluster -> Capacity Provider -> Autoscaling Group -> Launch Config -> ECSCluster

What is needed is a resource to manage the attachment of the Capacity provider to the ECSCluster and break the dependency loop. If that is not provided by AWS you are going to need a custom resource to add the capacity provider.

This is exactly the issue that I’m currently having, I’ve no idea how to break that dependency chain.


After getting tired of waiting for this to be released and following up, I ended up doing exactly the same.
Inside CFN I create everything, but I don't attach the Capacity Provider to the ECS Cluster:

  • LaunchConfiguration -> ECS Cluster (implicit dependency, since the instance joins the cluster)
  • AutoScalingGroup -> LaunchConfiguration (implicit dependency)
  • CapacityProvider -> AutoScalingGroup (implicit dependency)
  • CustomResourceToAttachCP -> ECS Cluster & Capacity Provider (explicit DependsOn)

Inside the Lambda behind CustomResourceToAttachCP, I call put_cluster_capacity_providers on create, and again with empty lists on delete.
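A minimal sketch of that custom-resource Lambda, assuming the cluster and capacity provider names are passed in as `ClusterName` and `CapacityProviderName` resource properties (these names are my own, not from the original comment). A real handler must also send a success/failure signal back to CloudFormation (e.g. via the cfnresponse module), which is omitted here for brevity.

```python
def attach_request(cluster_name, capacity_provider, request_type):
    """Build the put_cluster_capacity_providers arguments for one request type.

    On Delete we pass empty lists, which detaches every capacity
    provider from the cluster."""
    if request_type == "Delete":
        return {
            "cluster": cluster_name,
            "capacityProviders": [],
            "defaultCapacityProviderStrategy": [],
        }
    return {
        "cluster": cluster_name,
        "capacityProviders": [capacity_provider],
        "defaultCapacityProviderStrategy": [
            {"capacityProvider": capacity_provider, "weight": 1}
        ],
    }


def handler(event, context):
    # boto3 is available in the Lambda runtime
    import boto3

    props = event["ResourceProperties"]
    boto3.client("ecs").put_cluster_capacity_providers(
        **attach_request(
            props["ClusterName"],
            props["CapacityProviderName"],
            event["RequestType"],
        )
    )
```

Keeping the argument-building logic in a pure function makes it easy to unit-test the attach/detach shapes without touching AWS.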


I really hope AWS can release an update to solve this problem; if I have to create a Lambda just to attach a Capacity Provider to an ECS Cluster, that seems like such a hack!

At the moment, in order to work around the circular dependency, you unfortunately need to name the Cluster and cannot use an auto-generated name. That way you can specify the name of the cluster as a string in your Launch Configuration / Launch Template. I recommend the following naming convention:

ECSCluster:
  Type: AWS::ECS::Cluster
  Properties:
    ClusterName: !Sub ${AWS::StackName}-ECSCluster

LaunchConfiguration:
  Type: AWS::AutoScaling::LaunchConfiguration
  Properties:
    UserData:
      Fn::Base64: !Sub |
          #!/bin/bash
          echo ECS_CLUSTER=${AWS::StackName}-ECSCluster >> /etc/ecs/ecs.config

I will work internally to see if we can get a separate resource to break the dependency in CloudFormation. However, naming the cluster seems like the best solution for now. Fortunately, every property of an AWS::ECS::Cluster resource is mutable apart from ClusterName, which means that a static name will not have the usual consequences that discourage the use of static names.


Sounds great!

This issue has had the "coming soon" state for 4 months now, but we still don't have a release that fixes it, nor are the dependency problems solved. It's a must-have for production ECS deployments, since the ECS agent needs to know which cluster to join. @srrengar

Hi!
Ability to specify a custom capacity provider strategy in the AWS::ECS::Service resource is now available.
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-ecs-service.html
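For anyone looking for the shape of the newly supported property, a minimal sketch (resource names are placeholders; LaunchType must be omitted when a capacity provider strategy is set):

```yaml
Service:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref ECSCluster
    TaskDefinition: !Ref TaskDefinition
    DesiredCount: 2
    # Omit LaunchType when using a capacity provider strategy
    CapacityProviderStrategy:
      - CapacityProvider: !Ref CapacityProvider
        Weight: 1
        Base: 0
```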

@anoopkapoor, the circular dependency issue (ECS Cluster -> Capacity Provider -> ASG -> Launch Config -> ECS Cluster) remains. An option to !Ref the ECS Cluster from the Capacity Provider via a separate attachment resource might help, or any other proper way to solve it. Please close this soon...

@belangovan ack. This is the list I'm tracking now:

  1. [complete] Ability to reference the ASG name in the AWS::ECS::CapacityProvider resource
  2. [complete] Ability to specify a custom capacity provider strategy in the AWS::ECS::Service resource
  3. [coming soon] Ability to enable scale-in protection in the AWS::AutoScaling::AutoScalingGroup resource
  4. Break circular dependency so that unnamed clusters can be created
  5. Stack deletion fails since the cluster deletion comes ahead of ASG deletion. Cluster cannot be deleted if instances in ASG are still active.
  6. Ability to update parameters in the AWS::ECS::CapacityProvider resource without interruption including ASG warm-up time.

@anoopkapoor, thanks for the status; it really helps us track this. If possible, please share a tentative ETA?

Yeah, bumping the issue. We've been stuck on this for a while.

Hi!
Ability to enable scale-in protection (NewInstancesProtectedFromScaleIn) in the AWS::AutoScaling::AutoScalingGroup resource is now available:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-group.html#cfn-as-group-newinstancesprotectedfromscalein

  1. [complete] Ability to reference the ASG name in the AWS::ECS::CapacityProvider resource
  2. [complete] Ability to specify a custom capacity provider strategy in the AWS::ECS::Service resource
  3. [complete] Ability to enable scale-in protection in the AWS::AutoScaling::AutoScalingGroup resource
  4. Break circular dependency so that unnamed clusters can be created
  5. Stack deletion fails since the cluster deletion comes ahead of ASG deletion. Cluster cannot be deleted if instances in ASG are still active.
  6. Ability to update parameters in the AWS::ECS::CapacityProvider resource without interruption including ASG warm-up time.
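With items 1–3 complete, the managed pieces can now be wired together in a template, e.g. (a sketch; resource names are placeholders, and ManagedTerminationProtection ENABLED requires scale-in protection on the ASG):

```yaml
AutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    NewInstancesProtectedFromScaleIn: true  # newly supported in CloudFormation
    MinSize: 0
    MaxSize: 4
    LaunchTemplate:
      LaunchTemplateId: !Ref LaunchTemplate
      Version: !GetAtt LaunchTemplate.LatestVersionNumber
    VPCZoneIdentifier:
      - !Ref SubnetId

CapacityProvider:
  Type: AWS::ECS::CapacityProvider
  Properties:
    AutoScalingGroupProvider:
      AutoScalingGroupArn: !Ref AutoScalingGroup
      ManagedScaling:
        Status: ENABLED
        TargetCapacity: 100
      ManagedTerminationProtection: ENABLED
```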

By clicking the link I saw "Not currently supported by AWS CloudFormation."

So is the document not up-to-date?



Thanks for pointing out. Document has now been updated.

Until this is implemented, I've been using Spot instances in ECS with Docker Compose by specifying an existing ECS cluster that is already configured with SPOT as the provider (x-aws-cluster).

@srrengar, @anoopkapoor, any update on:

  1. Break circular dependency so that unnamed clusters can be created
  2. Stack deletion fails since the cluster deletion comes ahead of ASG deletion. Cluster cannot be deleted if instances in ASG are still active.
  3. Ability to update parameters in the AWS::ECS::CapacityProvider resource without interruption, including ASG warm-up time.

We are blocked on these for our production release.

Any news on this issue? More than a year has passed since this problem was raised. Frameworks like Terraform have been able to deal with Capacity Providers since day 0. I really like many CloudFormation / AWS CDK features, but the time AWS takes to support its own resources is really frustrating.
