Creating a resource with an interpolated count works on the very first apply. Subsequent plans and applies that modify the resource change its name. This can leave references to resources that Terraform doesn't believe exist, even though they do, just under a slightly different name.
From my debugging it appears that EvalCountFixZeroOneBoundary is executed in two passes. After modification of a resource containing a count, the first pass will always return a count of 1 (and count appears in the RawConfig's unknownKeys). The second has the correct value, but by this point, the resource name has already been replaced with a string that has trimmed the ".0" from the end.
Terraform v0.9.3
Affects core
resource "aws_autoscaling_group" "asg" {
  ...
  count = "${length(data.aws_availability_zones.available.names)}"
  ...
}
Resource names should be consistent as per the initial apply.
aws_autoscaling_group.asg.0: Refreshing state... (ID: ...)
aws_autoscaling_group.asg.1: Refreshing state... (ID: ...)
aws_autoscaling_group.asg.2: Refreshing state... (ID: ...)
Resource names are modified on later applies/plans. The ".0" suffix is removed from the first instance:
aws_autoscaling_group.asg: Refreshing state... (ID: ...)
aws_autoscaling_group.asg.1: Refreshing state... (ID: ...)
aws_autoscaling_group.asg.2: Refreshing state... (ID: ...)
Please list the steps required to reproduce the issue, for example:
1. terraform apply
2. terraform plan/apply after a modification that causes an update to the resource with a count

I attempted to fix this by ignoring a rename if there's another resource in the state that ends in ".1" (basically assume that the resource.Count() value of 1 must be incorrect).
However, this means that if we actually do change the count to 1, the other resources are deleted and one resource is created under the name without the ".0" suffix (aws_autoscaling_group.asg). The apply then errors, because the ".0" resource (aws_autoscaling_group.asg.0) hasn't been deleted.
This renaming business seems error-prone. Maybe we could just always have a .0 whenever a count is defined, even if it happens to equal 1.
Sounds like another case possibly resolved by #13793.
@saracen - if you are hacking the source directly - can you see if adding the transformer to the plan graph builder fixes the issue for you?
Also, I think having count imply a list by default, regardless of the count value, is a good idea. One consideration, though: it is currently a common pattern to use count to toggle singular resources in the graph (i.e. ${var.enabled ? 1 : 0}). Those configs may rely on something like aws_autoscaling_group.foo.id (without splat) and might break under that change (not 100% sure on that one, just something to check).
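For reference, a minimal sketch of the toggle pattern I mean, in 0.9-style syntax. The variable name `enabled` and the output names are placeholders made up for illustration, and the resource arguments are elided:

```hcl
# Hypothetical variable used to switch the resource on or off.
variable "enabled" {
  default = "true"
}

resource "aws_autoscaling_group" "foo" {
  # Toggle: one instance when enabled, none otherwise.
  count = "${var.enabled ? 1 : 0}"

  # ... (remaining arguments elided)
}

# Non-splat reference: this is the form that might break if a
# resource with count were always treated as a list.
output "asg_id" {
  value = "${aws_autoscaling_group.foo.id}"
}

# Splat reference: already treats the resource as a list.
output "asg_ids" {
  value = "${join(",", aws_autoscaling_group.foo.*.id)}"
}
```

Configs written like the first output are the ones that would need checking before changing how counted resources are named.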
@vancluever Early indications are that it's working great! I added a bunch of weird workarounds and static values to my configs because I've got a deadline, but I'll be gradually removing more of those today and testing this out.
Thank you so much for this solution and spotting this issue at a weekend!
@saracen no problem and happy to hear things are working for you!
@vancluever Something still seems to be a problem somewhere, but I haven't tracked down what's causing it yet.
I can do a bunch of updates, and then I eventually hit a cycle error. When listing the state, I see that an instance of a resource without the count suffix has been introduced:
resource.dynamic
resource.dynamic[0]
resource.dynamic[1]
resource.dynamic[2]
Removing it fixes the cycle problem. When I'm free I'll try to come up with something that's repeatable.
@saracen there definitely seems to be a case where this is still happening (see #13828 for a repro). Looking into it a bit more I'd imagine that there's still something a little amiss where on refreshes the resource does not exist for interpolation like it should... I'm still in the process of tracking it down but probably won't be able to look that much more into it until the evening.
@saracen I think the stuff in #13828 might be due to a different issue actually, but you might want to check the updates I put in that ticket to see if applying that stuff helps with the cycles. Cheers!
I just got bit badly by this bug. It was while developing a plugin which made it worse because I kept assuming it was a bug in the plugin...
It seems it can manifest itself in 2 ways:
Error reading aws_instance.web count: strconv.ParseInt: parsing "${length(var.clusters) * var.total_workers_per_cluster}": invalid syntax
or with the .0 disappearing, which in my case resulted in
cabot_check_graphite.web_disk_critical.0: diffs didn't match during apply. This is a bug with Terraform and should be reported as a GitHub Issue.
I don't have much to add except that I hope it's fixed soon.
Hi folks,
I am going to close this issue because we approach this in a very different way in Terraform 0.12. The relevant code base has changed significantly.
Anyone experiencing a similar-looking issue should please open a new GitHub issue and fill out the issue template in its entirety.
Thank you!
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.