Terraform-provider-aws: AWS API Gateway VPC Link timeout before AVAILABILITY is completed.

Created on 7 Oct 2019 · 17 Comments · Source: hashicorp/terraform-provider-aws

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.12.6
+ provider.aws v2.31.0

Affected Resource(s)

aws_api_gateway_vpc_link

Terraform Configuration Files

resource "aws_api_gateway_vpc_link" "main" {
  name        = "${var.application}"
  description = "Provides access to ${var.application}"
  target_arns = ["${aws_lb.svc_nlb.arn}"]
}

Debug Output

https://gist.github.com/philippevidal80/435a44e73134ae4a3db7a36e3da9cce9

Panic Output

N.A

Expected Behavior

The aws_api_gateway_vpc_link resource should wait long enough for the VPC Link to reach the AVAILABLE status.

Actual Behavior

It can take more than 8 minutes, which seems to be a hardcoded timeout (https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_api_gateway_vpc_link.go#L66), for an AWS VPC Link to be up and running.
When 8 minutes are not enough, the Terraform state does not record the resource, although it is still being created and most likely ends up created.
The next plan and apply phases then fail with the following errors:

aws_api_gateway_vpc_link.main: Creating...

Error: Error waiting for APIGateway Vpc Link status to be "AVAILABLE": unexpected state 'FAILED', wanted target 'AVAILABLE'. last error: %!s(<nil>)

This happens because the resource was already created during the first apply phase.

At this point, the AWS Web Console shows both the correct AWS VPC Link and the one that failed (because the NLB is already used by the first one).

Steps to Reproduce

Simply create an aws_api_gateway_vpc_link in the same Terraform project as an NLB load balancer.
Sometimes creation takes more than 8 minutes, which seems to be a hardcoded timeout (https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_api_gateway_vpc_link.go#L66), and Terraform returns the error mentioned in Debug Output.

Important Factoids

N.A

References

#10407

bug service/apigateway

Source

philippevidal80

👍24

All 17 comments

My suggestion would be to support a timeouts block for this resource, as below:

resource "aws_api_gateway_vpc_link" "main" {
  name        = "${var.application}"
  description = "Provides access to ${var.application}"
  target_arns = ["${aws_lb.svc_nlb.arn}"]

  timeouts {
    create = "20m"
    delete = "20m"
  }
}

Currently, this block is not available.

Error: Unsupported block type

  on path_to_tf_ptoject/nlb.tf line 67, in resource "aws_api_gateway_vpc_link" "main":
  67:   timeouts {

Blocks of type "timeouts" are not expected here.

philippevidal80 on 7 Oct 2019

👍5

What is the status of this? It seems it hasn't moved since 2019?

rgardam on 14 Feb 2020

👍1

Got the same problem. How can we set a timeout here?

T00mm on 19 Mar 2020

👍2

@philippevidal80 Hi, how were you able to cope with this? Since Terraform still doesn't have a timeouts block for aws_api_gateway_vpc_link, what was the workaround?

htaidirt on 2 Apr 2020

I'm also hitting this issue with Terraform 0.12.16 and can't find a way around it. At this point I'm going to have to call a bash script and set it up with the AWS CLI.

billheinson on 7 May 2020

Hi, same with Terraform 0.12.24 and AWS provider 2.60.0.
And when it passes, it's very close to the timeout.

sdesousa86 on 18 May 2020

Hi, I got this same issue, and I added a sleep command as a null resource:

resource "null_resource" "sleep" {
  provisioner "local-exec" {
    command = "sleep 480" 
  }
}

The sleep works properly, but after 8 minutes the error still occurs.
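
One likely reason the sleep has no effect: a standalone null_resource has no dependency relationship with the VPC link, so Terraform does not order the two, and a sleep outside the provider cannot lengthen the provider's internal 8-minute wait. If the goal is instead to delay the start of VPC link creation (for example, to give the NLB more time), the delay has to be wired into the dependency graph. A minimal sketch, assuming the aws_lb.svc_nlb and var.application names from the original report; the wait_for_nlb name and the 120-second delay are arbitrary illustrations, not values from this thread:

```hcl
# Hypothetical delay resource: starts only after the NLB has been created.
resource "null_resource" "wait_for_nlb" {
  depends_on = [aws_lb.svc_nlb]

  provisioner "local-exec" {
    # Arbitrary delay; requires a shell with `sleep` on the machine running Terraform.
    command = "sleep 120"
  }
}

resource "aws_api_gateway_vpc_link" "main" {
  name        = var.application
  target_arns = [aws_lb.svc_nlb.arn]

  # Creation of the VPC link starts only after the sleep has finished.
  depends_on = [null_resource.wait_for_nlb]
}
```

This only postpones when creation begins. It does not change how long the provider waits for the AVAILABLE status once creation has started.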

PZ973 on 20 May 2020

I know there is a PR out there to allow the timeouts section to be added, but that seems to me to be best left for circumstances when the time is outside normal expectations, generally due to something that can't be predicted. For example, a DynamoDB table that might take a long time to update (say, adding secondary indices) because it's got a lot of rows. I think the better solution would just be to update the default timeout if it's expected under normal circumstances to take longer than 8 min.

richardgavel on 31 Jul 2020

Hi folks 👋 The maintainers typically agree with @richardgavel on this topic, where customizable timeouts should only be provided in situations where there is a scalable factor involved. Updating the existing deletion timeout to be 20 or 30 minutes (having confirmation from the API Gateway service team on the longest expected time for the operation) feels like a more appropriate solution in this case.

bflad on 31 Jul 2020

@bflad To clarify, the timeout in question is not a deletion timeout. It's the wait for the status to go from pending to available: https://github.com/terraform-providers/terraform-provider-aws/blob/master/aws/resource_aws_api_gateway_vpc_link.go#L71

richardgavel on 1 Aug 2020

The fix for this, to increase the timeouts, has been merged and will release with version 3.3.0 of the Terraform AWS Provider, shortly. 👍

bflad on 20 Aug 2020

This has been released in version 3.3.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!
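
To make sure a configuration picks up the release mentioned above, the provider version can be constrained. A minimal sketch, assuming Terraform 0.13 or later (for the source syntax) and that no other version constraint in the configuration conflicts with it:

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # 3.3.0 is the release this thread names as containing the longer timeouts.
      version = ">= 3.3.0"
    }
  }
}
```

After changing the constraint, running terraform init -upgrade lets Terraform select a provider version that satisfies it.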

hashibot[bot] on 21 Aug 2020

I've just upgraded to AWS provider 3.4.0 and am still experiencing this issue:

module.user-preferences-api-gateway.aws_api_gateway_vpc_link.user_preferences_gateway_api_vpc_link[0]: Creating...
module.user-preferences-api-gateway.aws_route53_record.user_preferences_gateway_api_dns_record[0]: Still creating... [10s elapsed]
module.user-preferences-api-gateway.aws_route53_record.user_preferences_gateway_api_dns_record[0]: Still creating... [20s elapsed]
module.user-preferences-api-gateway.aws_route53_record.user_preferences_gateway_api_dns_record[0]: Still creating... [30s elapsed]
module.user-preferences-api-gateway.aws_route53_record.user_preferences_gateway_api_dns_record[0]: Creation complete after 40s [id=Z2468EQCVINKGX_user-preferences-gateway-api-test.data.XXXX.cloud_A]

Error: Error waiting for APIGateway Vpc Link status to be "AVAILABLE": unexpected state 'FAILED', wanted target 'AVAILABLE'. last error: %!s(<nil>)

Although I'm not very familiar with Go code, if I look at the related PR merged to solve this issue, I only see an update of the waitForApiGatewayVpcLinkDeletion call:

https://github.com/terraform-providers/terraform-provider-aws/pull/10407/files

but not for any other actions. Have these been forgotten?

gijzelaerr on 1 Sep 2020

👎1 👍1

Hi @gijzelaerr ,

Your issue is not linked to the "too short" timeout problem; it's just the normal behavior when VPC link creation fails.
For example, this happens when the NLB is already associated with another VPC Endpoint Service Configuration (I had this error one time).

I suggest checking the related error message in CloudTrail for more details on your creation issue ;-)

sdesousa86 on 1 Sep 2020

Thank you for your answer. It might not be related, but just to clarify: the load balancer (also created by TF) is not ready yet, and I can see it is still provisioning in the AWS console. If I wait a couple of minutes and run the same TF script again, the run completes successfully. So the VPC link is being created too quickly, while the load balancer is not ready yet.

gijzelaerr on 2 Sep 2020

Hmm... probably just a "missing dependency" issue.
But TF is supposed to add the dependency automatically when you reference another resource's attributes, like this:

resource "aws_api_gateway_vpc_link" "nlb" {
  name        = "my-vpc-link-name"
  target_arns = [aws_lb.nlb.arn]
}

Try adding the dependency manually if that's not working for you.
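
A manual dependency can be declared with the depends_on meta-argument. A minimal sketch, reusing the aws_lb.nlb name from the snippet above (the rest of the configuration is assumed to exist):

```hcl
resource "aws_api_gateway_vpc_link" "nlb" {
  name        = "my-vpc-link-name"
  target_arns = [aws_lb.nlb.arn]

  # Explicit dependency: do not start creating the VPC link
  # until Terraform has finished creating the load balancer.
  depends_on = [aws_lb.nlb]
}
```

Because target_arns already references aws_lb.nlb.arn, Terraform should infer the same ordering on its own, so the explicit entry is most likely redundant and serves mainly to rule out a dependency problem.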

sdesousa86 on 2 Sep 2020

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

hashibot[bot] on 20 Sep 2020
