Terraform errors out if a defined NACL rule is not present when terraform plan is run.
This appears to be caused by a bug in the findNetworkAclRule function in the referenced .go file. I only became aware of this after googling the phrase "Expected the Network ACL to have Entries". I suspect the correct behavior in this case is for the function to return nil, nil rather than falling all the way through to the "I give up" error case. I can't suggest the exact code fix, only point out that the existing function does not handle the case where the expected NACL rule is missing while other NACL rules are present.
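To make the suggested behavior concrete, here is a minimal sketch in Python. It is not the provider's code (which is Go and is not reproduced here); the names find_rule and needs_create and the dictionary fields are hypothetical. The only point is that a rule missing from a NACL that still has other entries yields None, the analogue of returning nil, nil, so the caller can plan a create instead of failing.

```python
from typing import Dict, List, Optional


def find_rule(entries: List[Dict], rule_number: int, egress: bool) -> Optional[Dict]:
    """Return the matching entry, or None when the rule is absent.

    Hypothetical stand-in for the lookup described in the report.
    """
    for entry in entries:
        if entry["rule_number"] == rule_number and entry["egress"] == egress:
            return entry
    # Assumed fix: other entries exist but ours does not, so report
    # "not found" instead of raising an error.
    return None


def needs_create(entries: List[Dict], rule_number: int, egress: bool) -> bool:
    """A missing rule means the plan should include creating it."""
    return find_rule(entries, rule_number, egress) is None


# A NACL that has other entries, but not ingress rule 190.
entries = [
    {"rule_number": 100, "egress": False},
    {"rule_number": 32767, "egress": False},
]
print(needs_create(entries, 190, False))
```

With this shape, "rule not present" and "API call failed" stay distinguishable: the first is a normal result, the second would still be raised as an error.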
Terraform Version
0.10.8
This appears to affect any aws_network_acl resource.
https://www.dropbox.com/s/d3a772eddq1nzz4/sample.zip.gpg?dl=0
https://gist.github.com/AVALARA-WESPAYNE/f5c2ad4c6ac9759e7137d42da94e287e
Expected Behavior
Creation of the missing NACL rule is added to the terraform plan.
Actual Behavior
Terraform exits, reporting an error that includes the output of the AWS API DescribeNetworkAcls request.
Steps to Reproduce
terraform plan
I'm having the same issue...
Terraform Version
Terraform v0.11.7
+ provider.aws v1.30.0
I believe I am experiencing the same problem. The result is that I can't renumber my rules to manage their order. I also can't destroy them so they are re-created unless I revert code, destroy and then apply the new code. Consequently, I can't upgrade my NACLs in production because I can't change them in place unless I build completely new ones and move subnets to those.
Related? https://github.com/hashicorp/terraform/pull/13608
Steps to Reproduce
I also tried tainting the NACL to force rebuild, but that fails because it can't delete the rules it can't find and throws straight to an error.
Error: Error applying plan:
2 error(s) occurred:
* aws_network_acl_rule.private_everyone_ingress (destroy): 1 error(s) occurred:
* aws_network_acl_rule.private_everyone_ingress: Error Deleting Network Acl Rule: InvalidNetworkAclEntry.NotFound: no ingress entry with number 190 in network ACL acl-061a9e753a3f9b6d6
status code: 400, request id: 3d866b1f-10a7-41f8-bde7-382d6057342d
* aws_network_acl_rule.private_everyone_egress (destroy): 1 error(s) occurred:
* aws_network_acl_rule.private_everyone_egress: Error Deleting Network Acl Rule: InvalidNetworkAclEntry.NotFound: no egress entry with number 500 in network ACL acl-061a9e753a3f9b6d6
status code: 400, request id: 4fb0fe07-c8c7-4dc7-8c47-5b4b160310e2
I'm having the same issue.
my case is:
I have seen this exact error as per the OP, and I think it is due to network ACLs being stateless. They can't be processed in the same way as other resources. Yes, each rule gets its own rule ID per NACL, but Terraform can't tell whether a rule that was removed, or that has reappeared under a different number (and thus a different NACL rule ID), is related to the original or even the same rule.
State diverges as soon as you change any of the rules outside of Terraform, and plan or apply then fails immediately.
I have noticed also under this circumstance.
My thinking is that Terraform could use an intermediary rule number that isn't occupied by any other rule while transitioning, then move the rules into their rightful numbers before cleaning up the old rules.
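The staged renumbering idea can be sketched roughly as follows. This is a hypothetical Python illustration, not something Terraform or the AWS provider does; free_rule_numbers and plan_renumber are made-up helper names, and the 1 to 32766 range for custom entries is an assumption you should check against the AWS documentation.

```python
from typing import Dict, List, Set, Tuple

Step = Tuple[str, int, int]  # ("copy", source, destination) or ("delete", number, number)


def free_rule_numbers(used: Set[int], count: int, low: int = 1, high: int = 32766) -> List[int]:
    """Pick `count` unused rule numbers, searching downward from `high`."""
    free: List[int] = []
    for number in range(high, low - 1, -1):
        if number not in used:
            free.append(number)
            if len(free) == count:
                return free
    raise ValueError("not enough free rule numbers")


def plan_renumber(moves: Dict[int, int], existing: Set[int]) -> List[Step]:
    """Order the steps so that no rule is ever written onto an occupied number."""
    used = set(existing) | set(moves.values())
    temps = free_rule_numbers(used, len(moves))
    pairs = list(zip(moves.items(), temps))
    steps: List[Step] = []
    # Stage 1: copy every moving rule to an unoccupied temporary number.
    steps += [("copy", old, tmp) for (old, _new), tmp in pairs]
    # Stage 2: remove the rules at their old numbers.
    steps += [("delete", old, old) for (old, _new), _tmp in pairs]
    # Stage 3: copy each rule from its temporary number to its final number.
    steps += [("copy", tmp, new) for (_old, new), tmp in pairs]
    # Stage 4: clean up the temporary rules.
    steps += [("delete", tmp, tmp) for _pair, tmp in pairs]
    return steps


# Swap rules 100 and 110 while rule 200 stays where it is.
for step in plan_renumber({100: 110, 110: 100}, {100, 110, 200}):
    print(step)
```

Between stages 2 and 3 the moving rules only exist at their temporary numbers, so their evaluation order differs briefly; whether that window is acceptable is a design question this sketch does not settle.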
Having the same issue here:
* module.infrastructure.aws_network_acl_rule.private[2]: aws_network_acl_rule.private.2: Error Finding Network Acl Rule 110: InvalidNetworkAclID.NotFound: The networkAcl ID 'acl-01b66f4bce40732dd' does not exist
status code: 400, request id: d7afd905-0d84-4672-be5d-8afc2718ff87
If anyone needs it, to recover from this you can simply do terraform state rm 'module.infrastructure.aws_network_acl_rule.private' if applicable.
I've noticed the same issue as @troxil. We create our rule numbers programmatically, i.e.:
rule_number = "${100 + 10 * length(local.availability_zones) + 10 * count.index}"
If the number of availability zones changes, then terraform fails when doing a plan refresh or apply for the same reasons described above.
Granted, we're not going to be changing the number of AZs we assigned to a VPC that often, but the flexibility to be able to make those changes is part of the reason why we chose Terraform.
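To see why a change in the AZ count trips the refresh, the arithmetic in the rule_number expression above can be replayed outside Terraform. The Python below is only an illustration; rule_numbers is a hypothetical helper that mirrors the interpolation, and it assumes one rule per availability zone (count equal to the number of AZs), which the comment does not state.

```python
from typing import List


def rule_numbers(az_count: int) -> List[int]:
    """Mirror 100 + 10 * length(availability_zones) + 10 * count.index.

    Assumes count equals the number of availability zones.
    """
    return [100 + 10 * az_count + 10 * index for index in range(az_count)]


before = rule_numbers(2)  # two AZs
after = rule_numbers(3)   # three AZs

# Every existing resource index now maps to a different rule number.
moved = [(i, old, new) for i, (old, new) in enumerate(zip(before, after)) if old != new]
# Numbers that one index is moving onto while another index still holds them.
collisions = sorted(set(before) & set(after))

print("before:", before)
print("after:", after)
print("moved:", moved)
print("collisions:", collisions)
```

Under that assumption, going from two to three AZs renumbers every existing rule, and the new number for index 0 is the number index 1 still holds, which matches the kind of mismatch between state and the real NACL described in this thread.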
I'm having this same issue, and it is causing HUGE problems intermittently in Production. We cannot rely on Terraform to manage ACLs for us; there are just FAR too many issues like this.
Can confirm this is still an issue in 0.11.13; we are not currently in a position to upgrade to 0.12.x.
Still an issue in 0.12.6. An alternative workaround to state rm is to manually delete all network ACL rules for the problematic network ACL and then re-run apply.
The fix for this has been merged and will release with version 2.25.0 of the Terraform AWS Provider, next week. 👍
This has been released in version 2.25.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.
For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!