terraform-plan doesn't properly detect deletion of ecs_service

Created on 21 Oct 2015  ·  14 Comments  ·  Source: hashicorp/terraform

I manually removed an ECS service definition via the AWS console.

I ran terraform plan and got:

~ aws_ecs_service.registry
    desired_count: "0" => "1"

What I expected was a new service to be created.

After running apply (just for kicks):

aws_ecs_service.registry: Modifying...
  desired_count: "0" => "1"
Error applying plan:

1 error(s) occurred:

* aws_ecs_service.registry: ServiceNotActiveException: Service was not ACTIVE.
    status code: 400, request id: aea58c2e-7811-11e5-b8d8-a5d3e2f4e7bb
Labels: bug, provider/aws

All 14 comments

Hey @davedash – do you have any configuration that demonstrates this? Was it a Task Definition that you removed? Or the actual thing that aws_ecs_service.registry refers to (instead of references)? Apologies if I'm mixing terms, I'm not too fluent in ECS :smile:

Hi @catsby I really have to tag my repo when I file a bug next time ;)

I removed on the AWS console the actual ECS Service, not the task definition.

I had the same problem (deleted the service, terraform got confused) - I created a stub service with the same name, did a terraform refresh, and then terraform apply and I think I'm operational again.

I just reproduced this:

resource "aws_ecs_cluster" "sleep" {
  name = "helloworld-del-test"
}

resource "aws_ecs_task_definition" "sleep" {
  family = "tf-helloworld-del-test"
  container_definitions = <<TASK_DEFINITION
[
  {
    "name": "sleep",
    "image": "busybox",
    "cpu": 10,
    "command": ["sleep","360"],
    "memory": 10,
    "essential": true
  }
]
TASK_DEFINITION
}

resource "aws_ecs_service" "sleep" {
  name = "sleep"
  cluster = "${aws_ecs_cluster.sleep.id}"
  task_definition = "${aws_ecs_task_definition.sleep.arn}"
  desired_count = 1
}

$ aws ecs update-service --service arn:aws:ecs:us-west-2:12060895217:service/sleep --cluster helloworld-del-test --desired-count 0
$ aws ecs delete-service --service arn:aws:ecs:us-west-2:12060895217:service/sleep --cluster helloworld-del-test

The solution is to treat an existing ECS service in the INACTIVE state as a deleted (non-existent) service. I will send a patch for this.

Here's a recap of my chat session with AWS support which helped me understand better how this workflow works:

When you delete a service in ECS, it will mark the service as INACTIVE and clean up the events.
Since ECS marks the service as INACTIVE instead of deleting it, you should treat services in the INACTIVE state as deleted.

See #3828

Hello,

I believe this isn't entirely fixed, or I'm missing something.

I have also manually "deleted" (via the AWS Console) a task definition, which switched it to the "inactive" state.

However, when I try to run Terraform, plan seems to correctly indicate it needs to be created, but apply fails to create it because it's already in "inactive" state. If I understand correctly, the above fix should have made Terraform treat "inactive" task definitions as deleted.

Is this a bug? Am I doing something wrong?

Output:

$ terraform plan

[...]

+ aws_ecs_service.myapp
    cluster:                            "arn:aws:ecs:eu-central-1:XXXXXXXXXX:cluster/default"
    deployment_maximum_percent:         "200"
    deployment_minimum_healthy_percent: "100"
    desired_count:                      "1"
    name:                               "myapp"
    task_definition:                    "arn:aws:ecs:eu-central-1:XXXXXXXXXX:task-definition/myapp:1"


Plan: 1 to add, 0 to change, 0 to destroy.

$ terraform apply

[...]

aws_ecs_service.myapp: Creating...
  cluster:                            "" => "arn:aws:ecs:eu-central-1:XXXXXXXXXX:cluster/default"
  deployment_maximum_percent:         "" => "200"
  deployment_minimum_healthy_percent: "" => "100"
  desired_count:                      "" => "1"
  name:                               "" => "myapp"
  task_definition:                    "" => "arn:aws:ecs:eu-central-1:XXXXXXXXXX:task-definition/myapp:1"
Error applying plan:

1 error(s) occurred:

* aws_ecs_service.myapp: ClientException: TaskDefinition is inactive
  status code: 400, request id: 4a59b26b-08b3-11e7-83cc-1bd0ec181350 "myapp"

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
$ 

Same as @elad, with Terraform ver0.99.

Does anybody know how to fix this?

I was able to solve the inactive task definition issue with the example in the ECS task definition data source. You set up the ECS service resource to use the max revision of either what your Terraform resource has created, or what is in the AWS console, which the data source retrieves.

The one downside to this is if someone changes the task definition, Terraform will not realign that to what's defined in code.
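For anyone looking for what that looks like, here is a minimal sketch of the pattern. It is modelled on the example in the aws_ecs_task_definition data source docs and written in the same pre-0.12 interpolation style as the configs in this thread. The "sleep" names are just borrowed from the reproduction config above for illustration, so treat them as assumptions and adjust for your own setup.

```hcl
# Sketch only: reuses the "sleep" cluster and task definition resources
# from the reproduction config earlier in this thread (assumed names).

# Look up the latest ACTIVE revision of the family as AWS currently reports it.
data "aws_ecs_task_definition" "sleep" {
  task_definition = "${aws_ecs_task_definition.sleep.family}"
}

resource "aws_ecs_service" "sleep" {
  name          = "sleep"
  cluster       = "${aws_ecs_cluster.sleep.id}"
  desired_count = 1

  # Point the service at "family:revision", picking whichever revision is
  # newer: the one Terraform created or the one the data source found in AWS.
  task_definition = "${aws_ecs_task_definition.sleep.family}:${max("${aws_ecs_task_definition.sleep.revision}", "${data.aws_ecs_task_definition.sleep.revision}")}"
}
```

Because the service references the family plus a revision number instead of a fixed ARN, it should not end up pinned to a revision that has since been marked inactive, as long as a newer active revision exists.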

still happening
Terraform v0.11.0
provider.aws v1.4.0

I've got the same issue using Terraform 0.11.3, AWS provider 1.11.0 and terragrunt 0.14.2. Having to make the ECS service configuration static is a shame, IMHO. Just trying to use the ECS task definition data source is not working either (I'm using the count pattern in this cluster too):

Error: Error refreshing state: 1 error(s) occurred:

* module.edge.data.aws_ecs_task_definition.task_definition: 1 error(s) occurred:

* module.edge.data.aws_ecs_task_definition.task_definition[0]: data.aws_ecs_task_definition.task_definition.0: Failed getting task definition ClientException: Unable to describe task definition.
    status code: 400, request id: 160a43b4-25e5-11e8-b560-29264574469c "api-gateway-td-edge-fon-hw-dev"

But taking a look at the state...

[terragrunt] 2018/03/12 12:06:53 Running command: terraform state show module.edge.aws_ecs_task_definition.task_definition[0]
id                         = api-gateway-td-edge-fon-hw-dev
arn                        = arn:aws:ecs:eu-west-1:312497795905:task-definition/api-gateway-td-edge-fon-hw-dev:2
container_definitions      = [{"cpu":0,"environment":[{"name":"JAVA_OPTS","value":"-Xss256k -Xms64m -Xmx256m -XX:+UseG1GC"},{"name":"SPRING_PROFILES_ACTIVE","value":"development"}],"essential":true,"image":"312497795905.dkr.ecr.eu-west-1.amazonaws.com/api-gateway:lastest","logConfiguration":{"logDriver":"awslogs","options":{"awslogs-group":"edge-fon-hw-dev/api-gateway","awslogs-region":"eu-west-1"}},"memory":256,"mountPoints":[],"name":"api-gateway","portMappings":[{"containerPort":9000,"hostPort":9000,"protocol":"tcp"}],"volumesFrom":[]}]
cpu                        =
execution_role_arn         =
family                     = api-gateway-td-edge-fon-hw-dev
memory                     =
network_mode               =
placement_constraints.#    = 0
requires_compatibilities.# = 0
revision                   = 2
task_role_arn              =

The inactive task definition problem was resolved for me by https://github.com/terraform-providers/terraform-provider-aws/pull/5565

Still happening with Terraform v0.12.9 (installed with Homebrew on Mac OS).

When I manually delete, in the AWS interface, an ECS service that was created with aws_ecs_service and aws_ecs_task_definition resource definitions, I get this plan:

  # module.whatever.aws_ecs_service.api will be updated in-place

And then applying that plan renders this error:

module.whatever.aws_ecs_service.api: Modifying... [id=arn:aws:ecs:eu-west-1:774908103135:service/search-cluster/search-api-service]

Error: error updating ECS Service (arn:aws:ecs:eu-west-1:774908103135:service/search-cluster/search-api-service): ServiceNotActiveException: Service was not ACTIVE.
        status code: 400, request id: 34080a32-9955-483d-b99a-4a1413025468

The particular workaround in my case is to just make a plan and apply it again, without having to do anything else.

Someone let me know if you need more context.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
