Terraform: Terraform doesn't reuse an AWS Role it just created and fails?

Created on 15 Aug 2016  ·  14 comments  ·  Source: hashicorp/terraform

Hi,

I have a neat Terraform structure where each environment (dev, stg, prod) has its very own state file. To execute each plan, I need to apply each environment individually (see the dev dir below).

I then have a series of modules for handling AWS ECS and ECS services, as well as all the necessary infra to support that. Those are obviously reused between all the environments.

In theory this seemed like a very good idea, but I've hit something that I don't know whether it's an anti-pattern of some sort, a shortcoming of Terraform, or just something I'm missing. The ecs-services dir holds the modules for the different services. These essentially contain the service-specific config, e.g. container_definitions, which varies from service to service. Also inside ecs-services is a shared dir, a module used by the service modules (a nested module) that encapsulates everything common to actually creating an ECS service (creating the task, the service, the ELB for that service, adding roles, etc.). Each service module uses this nested module and passes the necessary config to it.

The problem I have is that this nested module apparently tries, for every service, to create the shared ecs-service-role that was already created by the first service module that ran:

Error creating IAM Role ecs-service-role: EntityAlreadyExists: Role with name ecs-service-role already exists.

I would like Terraform to reuse a role that already exists instead of trying to create it again. Is there any way to do this?

I know I can just flatten everything out and it would be fine, but that kind of misses the point of modularising this for proper reuse. I would be grateful for some help, and if this is a Terraform shortcoming, does it make sense to file it as a feature request? Below is the file structure I'm using for this.

├── README.md
├── dev
│   ├── dev.tf
│   ├── main.tf
│   ├── qa.tf
│   ├── terraform.tfstate
│   ├── terraform.tfstate.backup
│   └── terraform.tfvars
├── modules
│   ├── ecs
│   │   ├── main.tf
│   │   └── roles.tf
│   ├── ecs-services
│   │   ├── broker
│   │   │   └── main.tf
│   │   ├── elasticsearch
│   │   │   └── main.tf
│   │   ├── mysql
│   │   │   └── main.tf
│   │   ├── nginx
│   │   │   └── main.tf
│   │   ├── php-fpm
│   │   │   └── main.tf
│   │   └── shared
│   │       ├── main.tf
│   │       └── roles.tf
│   ├── shared
│   │   └── variables.tf
│   ├── vpc-private
│   │   └── main.tf
│   └── vpc-public
│       └── main.tf
├── prod
└── stg
    └── demo.tf

Most helpful comment

Hi Folks! :wave: Sorry you are running into trouble.

It seems like there might be some confusion here about how the Terraform state mechanism is designed to work or potentially how to handle a single API object across multiple portions of configuration. Terraform by design is meant to ensure that only one resource declaration is used across a configuration to prevent management conflicts.

At its core, the Terraform state is designed to individually track resources in their current location in the Terraform configuration so it can properly manage the lifecycle of that resource. Multiple declarations of the same resource (e.g. an IAM policy resource in a module that is declared multiple times) are not tied together or automatically merged as they could have conflicting configuration.

When trying to work with a resource that is more "global", generally speaking, we recommend a setup declaring the resource once (e.g. outside a module that is declared multiple times) and/or utilizing data sources to reference existing resources without trying to manage them.

More documentation is available for both these concepts here:
โ€ข Terraform state: https://www.terraform.io/docs/state/index.html
โ€ข Data sources: https://www.terraform.io/docs/configuration/data-sources.html

If you feel the documentation is lacking clarity about these concepts, or you have a use case that you feel is not met by either of the solutions mentioned above, we would love to hear your feedback in a new issue with details about what you are trying to accomplish, so we can point you towards potentially existing solutions or work towards future improvements.
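For example, a module that needs an existing role can look it up with the `aws_iam_role` data source rather than declaring the resource again. A minimal sketch in modern syntax (this data source postdates the issue, and the `var.*` names and task definition are illustrative):

```hcl
# Reference a role managed elsewhere: Terraform reads it, never creates it.
data "aws_iam_role" "ecs_service" {
  name = "ecs-service-role"
}

resource "aws_ecs_service" "this" {
  name            = var.service_name
  cluster         = var.cluster_id
  task_definition = var.task_definition_arn
  desired_count   = 1

  # The role is consumed by ARN; the data source never conflicts with
  # whichever configuration actually owns the aws_iam_role resource.
  iam_role = data.aws_iam_role.ecs_service.arn
}
```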

All 14 comments

Hi @matamouros

This is a really interesting problem that I faced myself recently. The way I was able to get around it was to ask: do I want the same service role in dev, prod, and staging? If not, I could pass a different prefix to the ecs module, allowing an environment-specific role.

This means that each role is separate and Terraform can manage each one independently.

Would this help?

Paul

I'd say this is generally a question related to reusing/sharing resources across environments.

In addition to what @stack72 said:

Assuming each environment has a separate tfstate file (and it really _should_) and has remote state enabled, you can expose the resource ARN/ID/anything else via an output and then reference it via the terraform_remote_state data source.
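Concretely, that wiring looks roughly like this. A sketch in modern syntax (the bucket, key, region, and `var.*` names are all made up for illustration):

```hcl
# --- In the configuration that owns the role: export its ARN ---
output "ecs_service_role_arn" {
  value = aws_iam_role.ecs_service_role.arn
}

# --- In a consuming environment: read that state, reference the output ---
data "terraform_remote_state" "shared" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"        # illustrative
    key    = "shared/terraform.tfstate"  # illustrative
    region = "eu-west-1"                 # illustrative
  }
}

resource "aws_ecs_service" "app" {
  name            = "app"
  cluster         = var.cluster_id          # illustrative
  task_definition = var.task_definition_arn # illustrative
  desired_count   = 1

  # The role is only ever declared in the owning configuration.
  iam_role = data.terraform_remote_state.shared.outputs.ecs_service_role_arn
}
```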


Don't take this as _"the only right solution"_ :tm: but I would personally discourage using the same AWS account for all environments, let alone reusing the same IAM roles in the same account. It feels too risky if you think of environments as isolated environments for the same resources/team.

^ This is also the motivation behind https://www.terraform.io/docs/providers/aws/index.html#allowed_account_ids and https://www.terraform.io/docs/providers/aws/index.html#forbidden_account_ids: to prevent the user (including myself) from doing stupid things like executing destroy actions against the wrong environment(s).

Hi, thanks both. I do have separate Amazon accounts for each of the three environments, since this is indeed the only way of having true separation of concerns. And given that terraform apply needs to run inside ./dev or ./stg or ./prod, it also generates separate state files for each.

Thus, the problem I exposed is not about reusing the role between environments. The problem is about reusing one generic ecs-service-role role (which gathers the required permissions for a service to register with an ECS cluster, etc.) within the same environment. This role is generic, so I wanted to apply it to every service I create in that environment. But what happens is that Terraform always tries to create this generic role for every service being created (because each service reuses the nested module that creates the role), rather than moving on if the role is already there.

For the time being, @stack72 did hint at the stopgap solution: allow the creation of these generic ecs-service-role roles by having each service name its own differently. So I will have something like what's in the picture (and many more once I have all the services configured). The problem is that it really is a generic role, and I wanted to just reuse it inside that same environment.

_(screenshot: 2016-08-15 12:29:09)_
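The workaround amounts to deriving the role name from the service, so each module instance manages its own, distinct role. A sketch (the variable name and role naming scheme are illustrative):

```hcl
variable "service_name" {
  type = string
}

# One role per service instead of one shared role: names never collide,
# at the cost of several near-identical roles in the account.
resource "aws_iam_role" "ecs_service" {
  name = "ecs-service-role-${var.service_name}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ecs.amazonaws.com" }
    }]
  })
}
```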

Hi @matamouros

Got it - so the way i handled this in my env was as follows:

```hcl
module "ecs_cluster" {
  source = "../modules/ecs_cluster"

  env_prefix = "dev"

  ami_id           = "${var.ecs_ami}"
  cluster_name     = "dev_ecs_cluster"
  desired_capacity = "3"
  max_size         = "5"
  instance_type    = "t2.large"

  vpc_id     = "${module.dev_vpc.vpc_id}"
  subnet_ids = ["${module.dev_vpc.private_subnets}"]
  key_name   = "${aws_key_pair.dev_keypair.key_name}"
}

module "search_service" {
  source = "../modules/service"

  environment = "${var.environment}"

  //service
  desired_count  = 1
  cluster        = "${module.ecs_cluster.ecs_cluster_name}"
  iam_role       = "${module.ecs_cluster.ecs_iam_role_id}"
  container_port = 8085
}
```

Notice that I create the role in the ECS cluster module, and the IAM role ARN is then an output from the cluster module that I pass to each service definition: a single generic role for ECS in the environment.
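In other words, the cluster module declares the role exactly once and exports it for the service modules to consume. A minimal sketch (file path, names, and the assume-role policy body are illustrative):

```hcl
# modules/ecs_cluster/roles.tf: the single declaration of the service role
resource "aws_iam_role" "ecs_service" {
  name = "${var.env_prefix}-ecs-service-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ecs.amazonaws.com" }
    }]
  })
}

# Consumed by every service module as module.ecs_cluster.ecs_iam_role_id,
# so no service module ever declares the role itself.
output "ecs_iam_role_id" {
  value = aws_iam_role.ecs_service.id
}
```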

P.

Got it, makes sense. It's really either that or repeating the role creation with a slightly different name for each service. Still, is there any way to have Terraform not fail if the role is already there? I can see a few more situations where we wouldn't want it to fail when the role already exists, especially if it was created just seconds earlier in that same plan execution.

So technically Terraform failing is correct: you are telling it to create a role, and because the AWS API fails when asked to create a duplicate IAM role, Terraform returns that error. This is a very specific use case where Terraform is managing globally scoped infra (e.g. IAM and Route53).

I can also suggest having a specific Terraform project that manages these types of resources; you can then use remote state to pass the IDs around.

P.

Ok, it seems a bit of an anti-pattern on my part to want this as a feature request. I'll look into remote state later; for now I'll go with creating distinct roles. I realised I won't be able to create the role at cluster-creation time as you last suggested, since I will have 2-3 different ECS clusters in the same environment and the whole issue would happen again. Many thanks!

Nps - will close this off for now :) Any other questions then please do shout!

Hi - I hate to beat a dead horse, but I am having this issue when creating _policies_ (getting EntityAlreadyExists) and when trying to create _Lambda permissions_ (getting ResourceConflictException). I thought the whole point of the state file mechanism was to track what already exists and to not attempt to recreate resources that are already created. In my case, this is isolated to a single environment (no cross-environment resources).

I read the fine manual, but there's still a gap in my understanding of how Terraform tracks resources to create vs. resources that have already been created, and whether there's such a thing as an idempotent creation mechanism or not.

Terraform is garbage when it comes to this, I miss cloudformation a whole bunch now

This is actually a pretty serious issue for aws usability that requires hacks to work around. I don't think the fact that aws's api works a certain way is a justification for terraform to necessarily conform to the same behavior given its fundamentally declarative nature. Definitely hope there can be some sort of official workaround found for this kind of issue (even something as simple as having the option of ignoring these errors flat out)

It's quite terrible that this happens. Intuitive behavior would have devs think that the state file tells terraform "hey, this resource exists and it hasn't been changed/edited so don't try to create it". Honestly, this hinders usability a lot.


I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
