Reproduce:
Run copilot init and answer 'Yes' to the deploy prompt (or run copilot env init after answering 'No' to it). The CFn stack for the new env then fails with an error like:
Invalid request provided: CreateCluster Invalid Request: Unable to assume the service linked role. Please verify that the ECS service linked role exists. (Service: Ecs, Status Code: 400, Request ID: a973c2ca-204c-4715-9e2e-6097da7eb8c1, Extended Request ID: null)
Here is the CLI output:
~ snip ~
All right, you're all set for local development.
Deploy: Yes
✘ Failed to create the infrastructure for the test environment.
- Virtual private cloud on 2 availability zones to hold your services [Failed]
Resource creation cancelled
- Internet gateway to connect the network to the internet [Failed]
Resource creation cancelled
- Public subnets for internet facing services [In Progress]
- Private subnets for services that can't be reached from the internet [In Progress]
- Routing tables for services to talk with each other [In Progress]
- ECS Cluster to hold your services [Failed]
Invalid request provided: CreateCluster Invalid Request: Unable to assume the service linked role
- Application load balancer to distribute traffic [In Progress]
✘ wait until stack prod-ready-copilot-test create is complete: ResourceNotReady: failed waiting for successful resource state
$ copilot --version
copilot version: v0.3.0
I think we need some documentation and/or clearer CLI output that tells users how to solve this problem: remove the existing failed stack and run copilot env init again.
Yea - this is a weird one. You can just run env init again - and it'll work (copilot will clean up failed stacks) - but it's an odd race condition between the cluster and SLR being created. I'll also bring this up with the service team - since I don't think this is the behavior we're expecting.
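For anyone retrying as described above, here is a minimal sketch for checking what state the failed stack is in before running env init again. It assumes the AWS CLI is installed and configured for the same account and region; the helper name stack_status is hypothetical, and the stack name is the one from the error output above.

```shell
# Hypothetical helper: print the CloudFormation status of a stack,
# or "NOT_FOUND" if describe-stacks fails (for example, the stack was deleted).
stack_status() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].StackStatus' --output text 2>/dev/null || echo "NOT_FOUND"
}
```

Usage: run stack_status prod-ready-copilot-test to see the current state, then run copilot env init again.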
@kohidave Thanks! I tried the repro three times with different new AWS accounts and got the same result each time (successfully?). Just FYI 😉
Were the accounts brand new too? I wonder if accounts that weren't new, but had never used ECS, would have the same issue. But either way, we should figure out a better way to handle this!
Ah update - I think I understand now that this might be a bit of a bug in the CF resource itself. It creates a role, but doesn't wait for the eventual consistency delay of the SLR.
Ah, guess we need a dependency on it
We may be able to create a custom resource that sets it up (it's not a normal role) and then have the cluster have a dependency on that custom resource. I'll work with the service team in the meantime to see if there are other ways to work around this.
Ok, a better workaround is to create the SLR explicitly in our env stack: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-iam-servicelinkedrole.html
That should alleviate this issue and is pretty simple!
I ran into this as well and had to manually create the ECS role. Running init again didn't help.
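A minimal sketch of the manual workaround, assuming the AWS CLI is installed and the caller has IAM permissions to create service-linked roles. The function name ensure_ecs_slr is hypothetical; AWSServiceRoleForECS is the name of the ECS service-linked role, and ecs.amazonaws.com is the service name passed to IAM.

```shell
# Hypothetical helper: create the ECS service-linked role only if it is missing.
ensure_ecs_slr() {
  # get-role succeeds only when the role already exists in this account.
  if aws iam get-role --role-name AWSServiceRoleForECS >/dev/null 2>&1; then
    echo "exists"
  else
    # Ask IAM to create the service-linked role for ECS.
    aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com >/dev/null \
      && echo "created"
  fi
}
```

Usage: run ensure_ecs_slr, and once it prints "exists" or "created", run copilot env init again. Because the role is eventually consistent, a retry shortly after "created" may still be needed.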
This fix was just released in v1.1.0: https://github.com/aws/copilot-cli/releases/tag/v1.1.0