Terraform-provider-aws: Terraform errors out on plan if defined NACL rule missing

Created on 14 Nov 2017  ·  13 comments  ·  Source: hashicorp/terraform-provider-aws

Terraform errors out if a defined NACL rule is not present when terraform plan is run.

This appears to be due to a bug in the findNetworkAclRule function in the referenced .go file. I only found it because I searched for the phrase "Expected the Network ACL to have Entries". I suspect the correct behavior in this case is for the function to return nil, nil rather than falling all the way through to the "I give up" error case. I can't suggest the exact code fix, only that the existing function doesn't correctly handle the case where the expected NACL rule is missing while other NACL rules are present.
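The suggested behavior can be sketched as follows. This is a minimal illustration, not the provider's code: the Entry type and the findRule function are hypothetical stand-ins, and the error return is kept only to mirror the (value, error) shape the report describes.

```go
package main

import "fmt"

// Entry is a hypothetical, simplified stand-in for one NACL entry.
// It is not the AWS SDK type.
type Entry struct {
	RuleNumber int
	Egress     bool
}

// findRule is a hypothetical lookup, not the provider's findNetworkAclRule.
// It returns (nil, nil) when the ACL has entries but the requested rule is
// absent, so the caller can treat "missing" as "needs to be created".
func findRule(entries []Entry, ruleNumber int, egress bool) (*Entry, error) {
	for i := range entries {
		if entries[i].RuleNumber == ruleNumber && entries[i].Egress == egress {
			return &entries[i], nil
		}
	}
	// Not found among the existing entries: report absence, not an error.
	return nil, nil
}

func main() {
	entries := []Entry{{RuleNumber: 100}, {RuleNumber: 200, Egress: true}}

	rule, err := findRule(entries, 190, false)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	if rule == nil {
		fmt.Println("rule 190 is missing; a plan could propose creating it")
	}
}
```

With this shape, a caller distinguishes three outcomes: a real lookup failure (non-nil error), a missing rule (nil, nil), and a found rule.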

Terraform Version

0.10.8

Affected Resource(s)

  • aws_network_acl_rule

This appears to affect any aws_network_acl.

Terraform Configuration Files

https://www.dropbox.com/s/d3a772eddq1nzz4/sample.zip.gpg?dl=0

Debug Output

https://gist.github.com/AVALARA-WESPAYNE/f5c2ad4c6ac9759e7137d42da94e287e

Expected Behavior

terraform plan should include creation of the missing NACL rule.

Actual Behavior

Terraform exits, reporting an error that includes output from the AWS API DescribeNetworkAcls request.

Steps to Reproduce

  1. Some time in the distant past, create a VPC.
  2. Using other means, such as the AWS web console, remove one or more rules from a NACL.
  3. Run terraform plan.
  4. Terraform errors out with output similar to what is in the gist link.

References

bug servicec2

All 13 comments

I'm having the same issue...

Terraform Version

Terraform v0.11.7
+ provider.aws v1.30.0

I believe I am experiencing the same problem. The result is that I can't renumber my rules to manage their order. I also can't destroy them so that they are re-created, unless I revert the code, destroy, and then apply the new code. Consequently, I can't upgrade my NACLs in production, because I can't change them in place unless I build completely new ones and move subnets to those.

Related? https://github.com/hashicorp/terraform/pull/13608

Steps to Reproduce

  1. Some time in the distant past, create a VPC.
  2. Renumber one or more rules in a NACL.
  3. Run terraform apply.
  4. Terraform errors out with the output below.

I also tried tainting the NACL to force rebuild, but that fails because it can't delete the rules it can't find and throws straight to an error.

Error: Error applying plan:

2 error(s) occurred:

* aws_network_acl_rule.private_everyone_ingress (destroy): 1 error(s) occurred:

* aws_network_acl_rule.private_everyone_ingress: Error Deleting Network Acl Rule: InvalidNetworkAclEntry.NotFound: no ingress entry with number 190 in network ACL acl-061a9e753a3f9b6d6
        status code: 400, request id: 3d866b1f-10a7-41f8-bde7-382d6057342d
* aws_network_acl_rule.private_everyone_egress (destroy): 1 error(s) occurred:

* aws_network_acl_rule.private_everyone_egress: Error Deleting Network Acl Rule: InvalidNetworkAclEntry.NotFound: no egress entry with number 500 in network ACL acl-061a9e753a3f9b6d6
        status code: 400, request id: 4fb0fe07-c8c7-4dc7-8c47-5b4b160310e2

I'm having the same issue.

My case is:

  1. Created an ACL with Terraform.
  2. In the AWS console, manually changed a rule number from 1 to 10.
  3. Ran terraform plan and got the same error.

I have seen this exact error as per the OP, and I think it is due to network ACLs being stateless. They can't be processed in the same way as other resources. Yes, they do get their own rule ID per NACL, but Terraform can't tell whether a rule that was removed, or that has reappeared under a different number (and thus a different NACL rule ID), is related to the original or even the same rule.

State can diverge as soon as you change any of the rules outside of Terraform, and plan or apply then fails immediately.

I have also noticed it in this circumstance:

  1. Have 2 rules applied by Terraform with rule numbers 100 and 500.
  2. Change the Terraform config to swap those resources: 500 -> 100, 100 -> 500.
  3. Plan, then apply. It will crash with the same problem, saying 100 already exists.

My thinking is that Terraform could use an intermediary rule number that isn't occupied by any other rule while transitioning, then move the rules to their rightful numbers before cleaning up the old rules.
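The intermediary-number idea could look roughly like the sketch below. It is an in-memory model only: the acl type, its methods, and the maxRuleNumber bound are assumptions made for illustration, and nothing here calls AWS or reflects how the provider actually behaves.

```go
package main

import (
	"errors"
	"fmt"
)

// maxRuleNumber is an assumed upper bound for usable rule numbers.
const maxRuleNumber = 32766

// acl is a hypothetical in-memory model: rule number -> rule description.
type acl map[int]string

// create refuses to overwrite an existing entry, mimicking the
// "already exists" failure described in the comment above.
func (a acl) create(n int, rule string) error {
	if _, taken := a[n]; taken {
		return fmt.Errorf("entry %d already exists", n)
	}
	a[n] = rule
	return nil
}

// freeNumber returns a rule number that no entry currently occupies.
func (a acl) freeNumber() (int, error) {
	for n := maxRuleNumber; n >= 1; n-- {
		if _, taken := a[n]; !taken {
			return n, nil
		}
	}
	return 0, errors.New("no free rule number")
}

// swap exchanges the rules at x and y by parking one of them at a
// temporary, unoccupied number so no create ever hits a taken slot.
func (a acl) swap(x, y int) error {
	tmp, err := a.freeNumber()
	if err != nil {
		return err
	}
	if err := a.create(tmp, a[x]); err != nil { // park x's rule
		return err
	}
	delete(a, x)
	if err := a.create(x, a[y]); err != nil { // move y's rule to x
		return err
	}
	delete(a, y)
	if err := a.create(y, a[tmp]); err != nil { // move parked rule to y
		return err
	}
	delete(a, tmp) // clean up the temporary entry
	return nil
}

func main() {
	a := acl{100: "allow-ssh", 500: "allow-https"}
	if err := a.swap(100, 500); err != nil {
		fmt.Println("swap failed:", err)
		return
	}
	fmt.Println(a[100], a[500])
}
```

One caveat with this approach: while a rule sits at the temporary number, the evaluation order of the ACL differs from both the old and the new configuration, so a real implementation would need to consider traffic during the transition.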

Having the same issue here:

  1. Applied terraform and it created acl and acl rules
  2. Deleted acl and acl rules from aws console
  3. Re-applied terraform
  4. Hit Error below
* module.infrastructure.aws_network_acl_rule.private[2]: aws_network_acl_rule.private.2: Error Finding Network Acl Rule 110: InvalidNetworkAclID.NotFound: The networkAcl ID 'acl-01b66f4bce40732dd' does not exist
    status code: 400, request id: d7afd905-0d84-4672-be5d-8afc2718ff87

If anyone needs it, to recover from this you can simply do terraform state rm 'module.infrastructure.aws_network_acl_rule.private' if applicable.

I've noticed the same issue as @troxil. We create our rule numbers programmatically, i.e.:

rule_number    = "${100 + 10 * length(local.availability_zones) + 10 * count.index}"

If the number of availability zones changes, then Terraform fails when doing a plan refresh or apply, for the same reasons described above.

Granted, we're not going to be changing the number of AZs we assigned to a VPC that often, but the flexibility to be able to make those changes is part of the reason why we chose Terraform.
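To see why a change in AZ count collides with existing rules, the rule_number expression above can be evaluated for two AZ counts. The sketch below only reproduces the arithmetic; the ruleNumber function name is made up for illustration.

```go
package main

import "fmt"

// ruleNumber mirrors the expression from the comment above:
// 100 + 10 * length(local.availability_zones) + 10 * count.index
func ruleNumber(azCount, index int) int {
	return 100 + 10*azCount + 10*index
}

func main() {
	// Print the rule numbers produced for 2 AZs and then for 3 AZs.
	for _, azCount := range []int{2, 3} {
		for i := 0; i < azCount; i++ {
			fmt.Printf("AZs=%d index=%d rule_number=%d\n",
				azCount, i, ruleNumber(azCount, i))
		}
	}
}
```

Going from 2 to 3 AZs moves index 0 from 120 to 130, which is the number index 1 held before, so every existing rule is renumbered onto a slot that may still be occupied.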

Having this same issue causing HUGE problems intermittently in Production. We cannot rely on Terraform to manage ACLs for us; there are just FAR too many issues like this.

Can confirm this is still an issue in 0.11.13. We are not currently in a position to upgrade to 0.12.x.

Still an issue in 0.12.6. An alternative workaround to state rm is to manually delete all network ACL rules for the problematic network ACL and then re-run apply.

The fix for this has been merged and will release with version 2.25.0 of the Terraform AWS Provider, next week. 👍

This has been released in version 2.25.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!
