Terraform-provider-aws: Cycle error for replacement of aws_api_gateway_deployment with lifecycle create_before_destroy set to true and API Gateway resources in depends_on section

Created on 18 Dec 2019 · 28 comments · Source: hashicorp/terraform-provider-aws

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.12.18
+ provider.aws v2.42.0

Affected Resource(s)

  • aws_api_gateway_deployment

Terraform Configuration Files

I'm not copying the configuration of all the API Gateway resources as it's pretty standard, but I'm happy to share the whole API Gateway configuration if requested.

resource "aws_api_gateway_deployment" "deployment" {
  depends_on = [
    aws_api_gateway_rest_api.api,
    aws_api_gateway_resource.api_email_health,
    aws_api_gateway_method.api_email_health_get,
    aws_api_gateway_integration.api_email_health_get_integration,
    aws_api_gateway_method.api_email_health_options,
    aws_api_gateway_integration.api_email_health_options_integration,
    aws_api_gateway_integration_response.api_email_health_options_integration_response,
    aws_api_gateway_method_response.api_email_health_options_response,
    aws_api_gateway_resource.api_email_templates,
    aws_api_gateway_method.api_email_templates_get,
    aws_api_gateway_integration.api_email_templates_get_integration,
    aws_api_gateway_method.api_email_templates_options,
    aws_api_gateway_integration.api_email_templates_options_integration,
    aws_api_gateway_integration_response.api_email_templates_options_integration_response,
    aws_api_gateway_method_response.api_email_templates_options_response,
    aws_api_gateway_resource.api_email_emails,
    aws_api_gateway_method.api_email_emails_post,
    aws_api_gateway_integration.api_email_emails_post_integration,
    aws_api_gateway_method.api_email_emails_options,
    aws_api_gateway_integration.api_email_emails_options_integration,
    aws_api_gateway_integration_response.api_email_emails_options_integration_response,
    aws_api_gateway_method_response.api_email_emails_options_response,
    aws_api_gateway_resource.api_email
  ]

  rest_api_id = aws_api_gateway_rest_api.api.id

  stage_description = "Deployed at ${timestamp()}"

  stage_name = var.aws_spotlight_environment

  lifecycle {
    create_before_destroy = true
  }
}

Expected Behavior


As aws_api_gateway_deployment is configured with depends_on covering all API Gateway resources/methods/integrations/responses, it shouldn't be created before all of those resources are provisioned. The outcome should therefore be (and was this way until recently): old API Gateway resources are destroyed, new ones are created, the new deployment is created, and the old deployment is destroyed.
We force replacement of aws_api_gateway_deployment so the current API Gateway state is always deployed to the main stage.

This was the behaviour in Terraform 0.11.x.

Actual Behavior

Cycle Error

Error: Cycle: aws_api_gateway_integration.api_email_health_get_integration (destroy), aws_api_gateway_integration.api_email_health_options_integration (destroy), aws_api_gateway_integration_response.api_email_health_options_integration_response (destroy),
aws_api_gateway_method_response.api_email_health_options_response (destroy), aws_api_gateway_method.api_email_health_options (destroy), aws_api_gateway_resource.api_email_health (destroy), aws_api_gateway_deployment.deployment, aws_api_gateway_deployment.deployment (destroy deposed 359e79c1),
aws_api_gateway_method.api_email_health_get (destroy)

Removing create_before_destroy = true from the lifecycle block of the aws_api_gateway_deployment resource helps, but the apply still fails with a different error:

Error: error deleting API Gateway Deployment (bdq86u): BadRequestException: Active stages pointing to this deployment must be moved or deleted

If I remove the depends_on section instead, the deployment sometimes happens before all API methods are properly configured. Example:

Error: Error creating API Gateway Deployment: BadRequestException: No integration defined for method

I tried adding a separate aws_api_gateway_stage resource for the stage, but the problem persists.

Steps to Reproduce

  1. Create an API Gateway with an aws_api_gateway_deployment that depends on the API Gateway resources and is recreated with every terraform apply
  2. Run terraform apply
  3. Change one or more API Gateway resources in a way that forces them to be destroyed and recreated (i.e. change an API Gateway resource path)
  4. Run terraform apply

Most helpful comment

Hi all! 👋 Just a quick note to let you know this is on our radar and we will be taking a look in the near future to arrive at a resolution.

All 28 comments

We also ran into this problem, and solved it by removing create_before_destroy from the deployment, and manually running terraform taint on the stage resource to force it to be recreated, which got rid of the other error you mention.

We also ran into this problem, and solved it by removing create_before_destroy from the deployment, and manually running terraform taint on the stage resource to force it to be recreated, which got rid of the other error you mention.

If you taint the resource, does that mean the deployment will be destroyed before a new one is created, and so the API will be unavailable for the period between destroy and create?

We also ran into this problem, and solved it by removing create_before_destroy from the deployment, and manually running terraform taint on the stage resource to force it to be recreated, which got rid of the other error you mention.

Isn't that manual wrangling to solve the problem? We use CD software to deploy our TF code, so we would prefer to avoid such workarounds. Also, our stage is active as it's attached to a Custom Domain Name, so we can't have it destroyed or pointing at a non-existent deployment.

Currently we use a null resource with a sleep command, and the deployment resource is explicitly set to depend on that null resource as a workaround. The deployment resource itself isn't set to depend on any API Gateway resources, but the delay gives all of the required resources (methods, integrations and so on) time to be provisioned before the deployment is created (the example below uses PowerShell for the command because that's what we mostly use in our company).

resource "null_resource" "wait_for_all_resources" {
  triggers = {
    timestamp = timestamp()
  }
  provisioner "local-exec" {
    command     = "Start-Sleep -Seconds 60"
    interpreter = ["PowerShell", "-Command"]
  }
}
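
For completeness, this is roughly how the deployment resource would then be wired up: a sketch based on the configuration in the original report, with only the depends_on changed to reference the null resource.

resource "aws_api_gateway_deployment" "deployment" {
  # Depend only on the sleep above, not on the individual API Gateway resources
  depends_on = [null_resource.wait_for_all_resources]

  rest_api_id       = aws_api_gateway_rest_api.api.id
  stage_description = "Deployed at ${timestamp()}"
  stage_name        = var.aws_spotlight_environment

  lifecycle {
    create_before_destroy = true
  }
}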

Having same issue with

Terraform v0.12.19
+ provider.aws v2.45.0

Having same issue with

Terraform v0.12.19
+ provider.aws v2.45.0

Same with
Terraform v0.12.19
+ provider.aws v2.46.0

Does anyone know if there is any work on this issue?

Same issue here, with terraform 0.11

I am experiencing the same issue with Terraform 0.12 and with the new triggers argument.
Removing create_before_destroy solves the problem, even though it's not an ideal solution.
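
For reference, a minimal sketch of the triggers-based wiring that comment refers to, assuming the resource names from the original report and a provider version recent enough to support triggers on aws_api_gateway_deployment:

resource "aws_api_gateway_deployment" "deployment" {
  rest_api_id = aws_api_gateway_rest_api.api.id

  # Redeploy whenever any of the referenced resource ids change
  triggers = {
    redeployment = sha1(jsonencode([
      aws_api_gateway_resource.api_email_health.id,
      aws_api_gateway_method.api_email_health_get.id,
      aws_api_gateway_integration.api_email_health_get_integration.id,
    ]))
  }

  lifecycle {
    # Per the comment above, the cycle can still appear with this set to true
    # when the referenced resources are replaced
    create_before_destroy = true
  }
}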

I also encountered the same issue. I tried two possible compromise solutions.

  1. Wait for a while until all the dependent resources are created

    I tried the following solution, and I could at least change a method and a resource. The drawback is that this will trigger a deployment every time you apply, even if you don't have any change in the dependent resources.

    resource "aws_api_gateway_deployment" "deployment" {
    - depends_on = [
    -   module.method.lambda-integration
    - ]
    
      rest_api_id = aws_api_gateway_rest_api.api.id
    
      triggers = {
    -   redeployment = sha1(join(",", list(
    -     jsonencode(module.method.lambda-integration), # I was using lambda integration as a trigger of deployment.
    -   )))
    +   redeployment = timestamp()
      }
    
      provisioner "local-exec" {
        command = "sleep 30"
      }
    
      lifecycle {
        create_before_destroy = true
      }
    }
    
  2. Pass a variable for the trigger
    This way we can control when to recreate the deployment, but you need to separate the resource update from the deployment trigger. If you put them in one apply, creating and destroying the deployment will start before the dependent resources have finished updating. (A sketch of the variable declaration follows after this list.)

    resource "aws_api_gateway_deployment" "deployment" {
      rest_api_id = aws_api_gateway_rest_api.api.id
    
      triggers = {
        redeployment = var.release-date
      }
    
      lifecycle {
        create_before_destroy = true
      }
    }
    

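For completeness, the variable-driven trigger above assumes something like the following declaration, bumped by hand in a separate apply (a sketch, not part of the original comment):

variable "release-date" {
  type        = string
  description = "Set to a new value (e.g. today's date) in a separate apply to force a new API Gateway deployment"
}
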
So the bug is still there? Is there no fix? Do we have to rely on workarounds?

I'm using 0.12.26 and having the same issue.

I also had this issue; the following solution worked well for me.
I'm using a random_uuid resource to produce a value that is passed to the triggers block of the aws_api_gateway_deployment resource.
The random_uuid is re-generated when the keepers values change, and those can be set to anything, e.g. jsonencode(aws_api_gateway_method.method) and jsonencode(aws_api_gateway_integration.integration). It is important to make sure that aws_api_gateway_deployment is created after everything else; I achieved this by extracting it into a module and using a mandatory variable.

variable "required_resources" {
  type        = list(string)
  description = "Change in these values trigger redeployment"
}

resource "aws_api_gateway_deployment" "deployment" {
  rest_api_id = var.rest_api_id
  stage_name  = var.stage

  # hack to force redeployment every time this hash changes
  triggers = {
    redeployment = sha1(join(",", var.required_resources))
  }

  # false by default, just for clarity
  lifecycle {
    create_before_destroy = false
  }
}

The above resource is placed in its own module.

resource "random_uuid" "deployment_trigger" {
  depends_on = [aws_api_gateway_integration.integration, aws_api_gateway_method.method]
  keepers = {
    # Generate a new id every time something happens to these resources
    method      = jsonencode(aws_api_gateway_method.method)
    integration = jsonencode(aws_api_gateway_integration.integration)
    path        = var.resource_path
  }
}

# some other gateway stuff...

module "deployment" {
  source      = "../modules/api-gateway-deployment"
  rest_api_id = aws_api_gateway_rest_api.api.id
  stage       = var.stage

  required_resources = [
    random_uuid.deployment_trigger.id,
    random_uuid.deployment_trigger_for_another_method.id,
    # add a random_uuid for each method/integration
  ]
}

I placed the stuff required for adding a new method into its own module as well, so I don't have to write "random_uuid" "deployment_trigger" multiple times. This seems to be working fine for consecutive deployments and changes to API Gateway integrations/methods.

I published the modules I use; they are very basic and might not work for all projects, but the code can be adapted to your needs.
https://github.com/vladcar/terraform-aws-serverless-common-api-gateway-method
https://github.com/vladcar/terraform-aws-serverless-common-api-gateway-deployment

Hello everybody, I found a solution.
Terraform handles resources in singleton mode: a resource with a specific name should exist only once in a tfstate. In the case of an API Gateway deployment, a deployment can't be modified; it's a particularity of AWS, and that is quite normal, it's a bit like a tag.
My solution is to remove the resource from the tfstate after each apply:
terraform state rm aws_api_gateway_deployment.gw_deploy_dev
Now I can see the history of Terraform deployments on my API.
I hope it will help you.
The coronavirus is a mess, but thanks to the time it gave me I was able to reverse engineer the API Gateway behaviour.
In the end, though, I think Terraform should add a new type of resource based on the Prototype design pattern.

This is not a valid solution. One, you're doing manual work around the configuration. Two, when you remove this from state, the deployment will be created automatically on the next apply (even if you don't need/want it to).
What I see here is a way of tainting/abandoning the deployment on destroy. Can't we have some parameter that simply removes the deployment from state instead of running an API call to delete the deployment?

Guys, this bug means that Terraform CANNOT work with API Gateway in Production. Is there ANY view on when this CRITICAL defect in the AWS Provider will be fixed? Otherwise we will have to move away from Terraform.

I don't agree. You can split your infra into two modules, one dedicated to the deployment.
Each time you ask for a deployment it's a new one, like the PROTOTYPE design pattern.

That completely wipes out the value of declarative Infrastructure as Code. If we have to manually do a whole bunch of extra work at the top level every single time some minor change happens at a lower level, what's the point of Terraform?

Since it seems this code has zero value, I will post the code that is not working. We have made numerous changes to try and get this working, and not one has worked. This particular variation builds the API Gateway just fine, but any slight change (e.g. to what parameters we validate) results in "Error: error deleting API Gateway Deployment (ufn1gl): BadRequestException: Active stages pointing to this deployment must be moved or deleted"

The only way to make this work in tooling (fully automated) is to entirely destroy the entire API gateway and recreate it, resulting in a completely new URL. I would not be happy with that solution in a Development environment; in a Production one it's a joke.

The defects related to our issue are:

locals {
  private_config_map = { type = "PRIVATE", vpc_endpoint_ids = var.vpc_endpoint_ids }
  regional_config_map = { type = "REGIONAL", vpc_endpoint_ids = null }
}

/* ---------------------------
 * API GATEWAY
 * --------------------------- */
resource "aws_api_gateway_rest_api" "main" {
  name            = var.name

  dynamic "endpoint_configuration" {
    for_each = var.private == true ? list(local.private_config_map) : list(local.regional_config_map)

    content {
      types             = [endpoint_configuration.value["type"]]
      vpc_endpoint_ids  = endpoint_configuration.value["vpc_endpoint_ids"]
    }
  }

  api_key_source  = "HEADER"
  body            = var.body
  tags            = var.tags

  lifecycle {
    ignore_changes = [
      policy
    ]
  }
}

/* ---------------------------
 * SETTINGS
 * --------------------------- */
resource "aws_api_gateway_method_settings" "main" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  stage_name  = aws_api_gateway_deployment.deploy.stage_name
  method_path = "*/*"

  settings {
    metrics_enabled    = true
    logging_level      = "INFO"
    data_trace_enabled = true
  }
}

/* ---------------------------
 * MAIN STAGE
 * --------------------------- */
resource "aws_api_gateway_stage" "main" {
  stage_name            = "main"
  description           = "Main Stage for deploying functionality"
  rest_api_id           = aws_api_gateway_rest_api.main.id
  deployment_id         = aws_api_gateway_deployment.deploy.id
  xray_tracing_enabled  = var.xray_tracing_enabled

  variables             = var.variables

  access_log_settings {
    destination_arn = var.cloudwatch_log_arn
    format          = "\"{\"requestId\":\"$context.requestId\",\"ip\":\"$context.identity.sourceIp\",\"caller\":\"$context.identity.caller\",\"user\":\"$context.identity.user\",\"requestTime\":$context.requestTimeEpoch,\"httpMethod\":\"$context.httpMethod\",\"resourcePath\":\"$context.resourcePath\",\"status\":$context.status,\"protocol\":\"$context.protocol\",\"path\":\"$context.path\",\"stage\":\"$context.stage\",\"xrayTraceId\":\"$context.xrayTraceId\",\"userAgent\":\"$context.identity.userAgent\",\"responseLength\":$context.responseLength}\""
  }

  lifecycle {
    ignore_changes = [
      deployment_id
    ]
  }

  tags = var.tags

  depends_on = [aws_api_gateway_deployment.deploy]
}

/* ---------------------------
 * DEPLOYMENT
 * --------------------------- */
resource "aws_api_gateway_deployment" "deploy" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  stage_name  = "deploy"
  stage_description = "Deployed at ${timestamp()}"

  triggers = {
    redeployment = sha1(join(",", list(
      jsonencode(var.body)
    )))
  }

  lifecycle {
    create_before_destroy = true
  }
}
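
A hedged variation on the configuration above, offered as a sketch rather than a confirmed fix: if the stage resource alone owns the stage (no stage_name on the deployment and no ignore_changes on deployment_id), the deposed deployment has no active stage attached when Terraform deletes it, avoiding the "Active stages pointing to this deployment" error quoted above. Abridged sketch:

resource "aws_api_gateway_deployment" "deploy" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  # No stage_name here: the stage below is the only stage pointing at this deployment

  triggers = {
    redeployment = sha1(jsonencode(var.body))
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_api_gateway_stage" "main" {
  stage_name    = "main"
  rest_api_id   = aws_api_gateway_rest_api.main.id
  deployment_id = aws_api_gateway_deployment.deploy.id
  # deployment_id is allowed to track the newest deployment, so no ignore_changes block
}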

@mitchellh Are you aware of this issue? Essentially, Terraform does not support AWS API Gateway. How exactly do we get developer focus on this, seems like an issue like this just gets closed and reopened and closed and reopened in cycle forever.

@mitchellh Are you aware of this issue? Essentially, Terraform does not support AWS API Gateway. How exactly do we get developer focus on this, seems like an issue like this just gets closed and reopened and closed and reopened in cycle forever.

Terraform does support API Gateway. I think the solution is not to be found in Terraform itself, but in the AWS Terraform Provider. Rather than have all of the elements of the API Gateway as separate Terraform resources, they should be blocks on the aws_api_gateway_rest_api resource; then, when any part of the gateway resource changes, the provider would know to create a new deployment. I would do the work, but I don't code in Go; I could design the solution, though.

resource "aws_api_gateway_rest_api" "my_gateway" {
  name        = "my-gateway"

  endpoint_configuration {
    types = ["REGIONAL"]
  }

  resource {
    path_part = "products"
    method {
        http_method = "GET"
        authorization = "NONE"
        integration = {
            integration_http_method = "POST"
            type                  = "AWS_PROXY"
            uri                     = aws_lambda_alias.example.invoke_arn
        }
    }
    resource {
        path_part = "toys"
        method {
            http_method = "GET"
            authorization = "NONE"
            integration = {
                integration_http_method = "POST"
                type                  = "AWS_PROXY"
                uri                     = aws_lambda_alias.example.invoke_arn
            }
        }
    }
  }
}

@Glen-Moonpig your solution sounds interesting. The one piece I would dispute is that Terraform supports API Gateway. Terraform is supposed to be a tool to manage infrastructure as code - this is a production focused tool. If Terraform cannot create and manage components like API Gateway without causing production outages not required in normal operation of the component, then I would argue quite vehemently that it is not in fact supported.

Especially since this has been unresolved in one shape or form for over a year.

@Glen-Moonpig your solution sounds interesting. The one piece I would dispute is that Terraform supports API Gateway. Terraform is supposed to be a tool to manage infrastructure as code - this is a production focused tool. If Terraform cannot create and manage components like API Gateway without causing production outages not required in normal operation of the component, then I would argue quite vehemently that it is not in fact supported.

Especially since this has been unresolved in one shape or form for over a year.

I am using Terraform to deploy and maintain API Gateways in numerous projects. I have not had any production outages. There are very simple ways to handle this particular scenario. You can just break your changes down into multiple applies and they will go through fine.
Terraform 0.13.3/0.14 might resolve the cycle issue as there are various changes around cycles and plans.

We're using Terraform Cloud for this, which does not appear to support multiple applies as you suggest; and even if I try this manually, the multiple applies always result in the same underlying error. I believe it may be because we are using OpenAPI import instead of manually specifying each resource independently, but that's a key feature of API Gateway.

@shederman My team is having the same issue (posting from my personal GitHub, however); most of these workarounds are not ideal, and some won't work if you have both a deployment and a stage resource.

We currently workaround by

  • adding a timestamp() trigger to the deployment resource
  • tainting our stage (essentially adding a taint command to our deploy script) and destroying/recreating both the deployment and the stage on every apply (see the sketch below)

This should not be necessary, though. I hope that this is indeed resolved in 0.13.
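
For concreteness, a rough sketch of that two-part workaround (resource names are assumptions; the taint command lives in the deploy script, not in the Terraform configuration):

resource "aws_api_gateway_deployment" "deployment" {
  rest_api_id = aws_api_gateway_rest_api.api.id

  # timestamp() changes on every run, so the deployment is replaced on every apply
  triggers = {
    redeployment = timestamp()
  }
}

# In the deploy script, before terraform apply:
#   terraform taint aws_api_gateway_stage.main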

@riley-clarkson Do you get any service interruptions that way? We have mission-critical services running on API Gateway, and the idea of destroying stages on every deploy is not a popular one, I can tell you!

@shederman We do have service interruptions, which is okay for us, but still not ideal (and will not be possible for some projects/teams). We tried most of the workarounds in this thread before resorting to tainting the stage every deployment. Would like to see this fixed

Yeah, that clearly shows that this is not production-ready for mission-critical systems.

Does anyone know how long it will be until the Hashicorp bot autocloses the issue (as happened to the previous few)?

Hi all! 👋 Just a quick note to let you know this is on our radar and we will be taking a look in the near future to arrive at a resolution.
