Folks,
It seems that order of environment variables is not being preserved. Also, mountPoints and volumesFrom are being showed in the plan, even when they're not defined in TF.
Large number of moving parts like environmental variables make working with big task definitions rather unpleasant :)
Terraform v0.11.2
aws_ecs_task_definition
resource "aws_ecs_task_definition" "service2" {
family = "service"
container_definitions = "${file("service.json")}"
}
service.json:
[{
"cpu": 10,
"environment": [
{
"name": "APP_VERSION",
"value": "dev-26758"
},
{
"name": "DATADOG_ENV",
"value": "ee1-llll-pppppppp"
},
{
"name": "DATADOG_PATCH_MODULES",
"value": "celery:true,elasticsearch:true,flask:true,httplib:true,mysql:true,redis:true,requests:true"
},
{
"name": "DATADOG_SERVICE_NAME",
"value": "translation-service"
},
{
"name": "DATADOG_TRACE_AGENT_HOSTNAME",
"value": "172.17.0.1"
},
{
"name": "DATADOG_TRACE_AGENT_PORT",
"value": "8126"
},
{
"name": "DATADOG_TRACE_ENABLED",
"value": "false"
}
],
"essential": true,
"image": "wordpress1",
"memory": 500,
"name": "wordpress",
"portMappings": [{
"containerPort": 80,
"hostPort": 0,
"protocol": "tcp"
},
{
"containerPort": 81,
"hostPort": 0,
"protocol": "tcp"
},
{
"containerPort": 82,
"hostPort": 0,
"protocol": "tcp"
}
]
}]
If I bump APP_VERSION
from dev-26758
to dev-26760
I get:
-/+ aws_ecs_task_definition.service2 (new resource required)
id: "service" => <computed> (forces new resource)
arn: "arn:aws:ecs:eu-west-1:000000000000:task-definition/service:14" => <computed>
container_definitions: "[{\"cpu\":10,\"environment\":[{\"name\":\"DATADOG_ENV\",\"value\":\"ee1-llll-pppppppp\"},{\"name\":\"DATADOG_TRACE_AGENT_HOSTNAME\",\"value\":\"172.17.0.1\"},{\"name\":\"DATADOG_SERVICE_NAME\",\"value\":\"translation-service\"},{\"name\":\"DATADOG_PATCH_MODULES\",\"value\":\"celery:true,elasticsearch:true,flask:true,httplib:true,mysql:true,redis:true,requests:true\"},{\"name\":\"DATADOG_TRACE_AGENT_PORT\",\"value\":\"8126\"},{\"name\":\"DATADOG_TRACE_ENABLED\",\"value\":\"false\"},{\"name\":\"APP_VERSION\",\"value\":\"dev-26758\"}],\"essential\":true,\"image\":\"wordpress1\",\"memory\":500,\"mountPoints\":[],\"name\":\"wordpress\",\"portMappings\":[{\"containerPort\":80,\"hostPort\":0,\"protocol\":\"tcp\"},{\"containerPort\":81,\"hostPort\":0,\"protocol\":\"tcp\"},{\"containerPort\":82,\"hostPort\":0,\"protocol\":\"tcp\"}],\"volumesFrom\":[]}]" => "[{\"cpu\":10,\"environment\":[{\"name\":\"APP_VERSION\",\"value\":\"dev-26760\"},{\"name\":\"DATADOG_ENV\",\"value\":\"ee1-llll-pppppppp\"},{\"name\":\"DATADOG_PATCH_MODULES\",\"value\":\"celery:true,elasticsearch:true,flask:true,httplib:true,mysql:true,redis:true,requests:true\"},{\"name\":\"DATADOG_SERVICE_NAME\",\"value\":\"translation-service\"},{\"name\":\"DATADOG_TRACE_AGENT_HOSTNAME\",\"value\":\"172.17.0.1\"},{\"name\":\"DATADOG_TRACE_AGENT_PORT\",\"value\":\"8126\"},{\"name\":\"DATADOG_TRACE_ENABLED\",\"value\":\"false\"}],\"essential\":true,\"image\":\"wordpress1\",\"memory\":500,\"name\":\"wordpress\",\"portMappings\":[{\"containerPort\":80,\"hostPort\":0,\"protocol\":\"tcp\"},{\"containerPort\":81,\"hostPort\":0,\"protocol\":\"tcp\"},{\"containerPort\":82,\"hostPort\":0,\"protocol\":\"tcp\"}]}]" (forces new resource)
family: "service" => "service"
network_mode: "" => <computed>
revision: "14" => <computed>
or in more human friendly form:
-/+ aws_ecs_task_definition.service2 (new resource required)
id: "service" => "<computed>" (forces new resource)
arn: "arn:aws:ecs:eu-west-1:000000000000:task-definition/service:14" => "<computed>"
container_definitions: [
{
"cpu": 10,
"environment": [
{
+ "name": "APP_VERSION",
+ "value": "dev-26760"
+ },
+ {
"name": "DATADOG_ENV",
"value": "ee1-llll-pppppppp"
},
{
- "name": "DATADOG_TRACE_AGENT_HOSTNAME",
- "value": "172.17.0.1"
+ "name": "DATADOG_PATCH_MODULES",
+ "value": "celery:true,elasticsearch:true,flask:true,httplib:true,mysql:true,redis:true,requests:true"
},
{
"name": "DATADOG_SERVICE_NAME",
"value": "translation-service"
},
{
- "name": "DATADOG_PATCH_MODULES",
- "value": "celery:true,elasticsearch:true,flask:true,httplib:true,mysql:true,redis:true,requests:true"
+ "name": "DATADOG_TRACE_AGENT_HOSTNAME",
+ "value": "172.17.0.1"
},
{
"name": "DATADOG_TRACE_AGENT_PORT",
"value": "8126"
},
{
"name": "DATADOG_TRACE_ENABLED",
"value": "false"
- },
- {
- "name": "APP_VERSION",
- "value": "dev-26758"
}
],
"essential": true,
"image": "wordpress1",
"memory": 500,
- "mountPoints": [
-
- ],
"name": "wordpress",
"portMappings": [
{
"containerPort": 80,
"hostPort": 0,
{
"containerPort": 82,
"hostPort": 0,
"protocol": "tcp"
}
- ],
- "volumesFrom": [
-
]
}
] (forces new resource)
network_mode: "" => "<computed>"
revision: "14" => "<computed>"
It's possible to work around this by making sure your local task definition has the same order as stored on Amazon, which can be queried using the AWS cli (or just copy, paste, and sanitise the json that Terraform returns when planning). The order will remain the same until you add a new variable key-pair. Annoying, but it's the only way to avoid your plans getting junked.
There is nice a nice function https://github.com/terraform-providers/terraform-provider-aws/blob/fae04dfedfbd653d6a0bdbcc5d7c04f3d54e3048/aws/ecs_task_definition_equivalency.go#L56 used to reorder (diff suppression) task definition during diff but it's used only there. All values go raw to the state
https://github.com/terraform-providers/terraform-provider-aws/blob/fae04dfedfbd653d6a0bdbcc5d7c04f3d54e3048/aws/resource_aws_ecs_task_definition.go#L265
https://github.com/terraform-providers/terraform-provider-aws/blob/401dc017401659015fa0afdde5336b891c695fe6/aws/structure.go#L621
@radeksimko added task definition migration from sha to json (thanks million for that) plus further fixes. I was wondering if you could help us again to get this fixed :)
Hello,
Just wondering if there's any update on this? This is unfortunately causing our ECS Task Definition to be replaced every time despite zero changes.
The ordering once it gets to AWS certainly seems very random!
Can we please have an update, we are having lots of issues with this bug @s-maj
In the initial report, your example diff output shows:
Is this bug report related to the display when a value does change, or idempotence of subsequent plan/apply operations?
@ctd, the original example isn’t great as some values have changed. However I can confirm that with TF 0.12 and latest AWS provider, the fact that AWS does not store the environment and secret variables in the order provided, results in Terraform wanting to change the order in the Terraform scripts.
@m13t thanks for the clarification.
I'm new to this particular code, so please excuse any mistakes in my thinking.
From an initial skim, it looks like the local/remote JSON strings are unmarshaled, a few specific fields are normalised, then the data is marshaled to JSON for comparison. This happens inside the DiffSuppressFunc though, which might lead to some surprising diff
output as the remarshalled JSON isn't used for display purposes.
Could you check your output for any other value changes between the local and remote task definition JSON values? I'll take a look at reproducing this issue on my end as well.
@ctd, that is more or less my suspicion too. I’ll dig out the most complex task definition we have to see if there are any other fields that have this problem.
I guess if this was a native Terraform resource rather than being serialised to JSON, it would make use of the array item hash to ascertain whether or not it had just moved index rather than thinking it’s been removed and added.
So it looks like there's also issues with volumes too:
- volume {
- name = "gocd_data" -> null
- docker_volume_configuration {
- autoprovision = false -> null
- driver = "local" -> null
- driver_opts = {
- "device" = ":/"
- "o" = "addr=fs-xxxxxxxx.efs.eu-west-2.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
- "type" = "nfs"
} -> null
- labels = {} -> null
- scope = "task" -> null
}
}
+ volume {
+ name = "gocd_data"
+ docker_volume_configuration {
+ autoprovision = false
+ driver = "local"
+ driver_opts = {
+ "device" = ":/"
+ "o" = "addr=fs-xxxxxxxx.efs.eu-west-2.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
+ "type" = "nfs"
}
+ scope = "task"
}
}
- volume {
- name = "gocd_home" -> null
- docker_volume_configuration {
- autoprovision = false -> null
- driver = "local" -> null
- driver_opts = {
- "device" = ":/"
- "o" = "addr=fs-xxxxxxxx.efs.eu-west-2.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
- "type" = "nfs"
} -> null
- labels = {} -> null
- scope = "task" -> null
}
}
+ volume {
+ name = "gocd_home"
+ docker_volume_configuration {
+ autoprovision = false
+ driver = "local"
+ driver_opts = {
+ "device" = ":/"
+ "o" = "addr=fs-xxxxxxxx.efs.eu-west-2.amazonaws.com,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"
+ "type" = "nfs"
}
+ scope = "task"
}
}
However, if there are no other changes the volumes don't show changes either. It appears that if there are other changes then the volumes get added to the diff also. i.e. if the container_definitions
have changes, then so does volumes.
I haven't been able to reproduce the idempotency issue using a very minimal ECS stub service (https://gist.github.com/ctd/bfba7b6b9e6c50d4cba0b4fe930597b2).
Are you able to confirm under what circumstances you're seeing a resource update/replacement - e.g. does running two subsequent apply
operations result in an update on both? A full output log would be useful as well if you can provide one (obfuscated resource name/IDs are fine, of course).
@m13t @s-maj
Just following up, is this still an issue for you? If so, can you provide any more details (such as in my last comment) to help me reproduce the problem?
Not saying this is the explanation for this case, but I have found similar problems arise if the terraform task definition contains two environment variables with the same name.
Use the DEBUG output from the provider to determine the actual source of your diffs.
Inconsistent order between environment
objects is highly unlikely to be the cause of resource-level diffs.
diffs shown in plan
output are not canonicalized.
Therefore, there can be visual differences in their element ordering.
I lacked default values for my healthcheck:
s.
Adding the following values resolved my persistent diffs issue:
interval: 30
retries: 3
timeout: 5
I'd suggest these missing values should not have caused diffs.
Perhaps there's an existing feature request?
If someone knows of one, please mention/link here.
Debug journal:
terraform: v0.11.14
with plugin.terraform-provider-aws_v2.20.0_x4:
.
My first guess was that I had diffs because the value of this field is an opaque JSON string, and differences in object sort returned by the AWS API would be the cause of diffs.
Turns out, there's an entire go file dedicated to comparing ECS task definitions:
That looks good, and the tests look good; it appears to effectively test that un-ordered environment
objects compare as "equal to" ordered environment
objects.
The tested function is actually being called, so it's active code:
I'm now convinced object order is not my problem.
Now I want to see the canonicalized JSON and check it for diffs.
Checking TF_LOG=DEBUG output, I was able to see the canonicalized First:
and Second:
DEBUG values and was able to compare them for actual diffs.
No surprise, there were actual diffs.
My issue was due to not including any values for a few healthcheck:
defaults which the AWS API happily inserts.
Is there a plan to add a feature that lets the user say "Ignore the order of JSON lists for this particular JSON field value", so the AWS ENVIRONMENT
JSON list doesn't get detected as a diff when AWS chooses a stupid, arbitrary order that is different from what was submitted by Terraform?
In my case, the ECS tasks are getting recreated every time because AWS re-orders JSON lists unpredictably, but the reordering is static.
It's hard to see what the course of action is from this ticket.
@nhooey In my last comment, I tried to explain that the order of ENVIRONMENT
elements in terraform plan
output is misleading.
The aws_provider
is 100% properly canonicalizing all ENVIRONMENT variable lists before calculating diffs. I even link to the code where that happens.
At the same time, terraform plan
output is not and does not canonicalize that struct before emitting it, giving an appearance of being the source of a diff, when, if only order is different, it is not a diff.
It is my strong opinion that there is something else in your ECS task resource's definition that is different.
Have you checked the provider's output under TF_LOG=DEBUG
?
@tamsky is correct. However, for clarity, this should still be considered a bug since it renders a confusing diff - as evidenced by this thread. :)
@tamsky, @internetstaff:
I haven't reviewed the debug output, but in my case I kept re-ordering the environment
JSON list items in my resource "aws_ecs_task_definition"
until running terraform plan
said that the aws_ecs_task_definition
didn't have to be replaced anymore.
All I changed was the ordering of the JSON list items. Once they were in some _magic_ order, Terraform no longer wanted to replace the ECS task.
That seems like proof that Terraform is detecting and applying a diff when there conceptually isn't one.
@nhooey Do you have any steps to reliably reproduce the issue you've described?
I either need a reliable way to reproduce (I've tried without success), or debug output of the issue occurring to have a better idea of what's going on.
As mentioned, there _is_ a bug in the Terraform AWS Provider that may give deceptive output about the detected resource changes in a plan - at least until that is fixed we need to rely on the debugging output for determining if other issues exist.
same issue here, ordering of the environment variables is different everytime
It's far from ideal but with the help of terraform 0.12 you can re-order the environment variables to match the order given by AWS.
locals {
unordered_env = {
ENV_A = "${var.my_var}"
ENV_1 = "value_for_1"
}
}
aws ecs describe-task-definition --task-definition $TASK_DEF | jq '[ .taskDefinition.containerDefinitions[0].environment[] | .name]'
locals {
aws_order = [ .... ]
}
locals {
environments_variables = [for order in local.aws_order : map(order, local.unordered_env[order])]
}
environments_variables = [
{
"ENV_A" = "."............
},
{
"ENV_1" = "................."
},
]
data "template_file" "environments_variables" {
template = " { \"name\" : \"$${name}\", \"value\": \"$${value}\" }"
count = length(var.environments_variables)
vars = {
value = "${element(values(var.environments_variables[count.index]), 0)}"
name = "${element(keys(var.environments_variables[count.index]), 0)}"
}
}
data "template_file" "environments_variables_json" {
template = "[ $${value} ]"
vars = {
value = "${join(",", data.template_file.environments_variables.*.rendered)}"
}
}
it doesn't work with 0.11 because the for each interpolation syntax is from 0.12 only
you need to re-run the aws cli command and update the
aws_order
list each time you add/remove an env
This seems to have generally improved in the latest versions of the provider and TF itself .. but some improvements would be great ..
It doesn't seem like the ordering causes the task definitions to need to be updated any more ..
But when there is changes the diff is very nonsensical since it seems to be diffing against the raw order.
Could the provider not manage the order of the fields as they're coming from aws and when they're persisted to state? This would then just make the diff alphabetical and make the whole thing much more pleasant .. right now its a pain to actually analyze the changes in the variables ..
~ environment = [
- {
- name = "AWS_REGION"
- value = "..."
},
- {
- name = "PORT"
- value = "8001"
},
- {
- name = "POD_SERVICE"
- value = "..."
},
- {
- name = "POD_NAME"
- value = "..."
},
{
name = "APP_NAME"
value = "..."
},
~ {
~ name = "POD_ENVIRONMENT" -> "AWS_REGION"
~ value = "..." -> "..."
},
{
name = "LANG"
value = "en_US.UTF-8"
},
+ {
+ name = "POD_ENVIRONMENT"
+ value = "..."
},
+ {
+ name = "POD_NAME"
+ value = "..."
},
+ {
+ name = "POD_SERVICE"
+ value = "..."
},
+ {
+ name = "PORT"
+ value = "..."
},
{
name = "SSM_PATHS"
value = "..."
},
]
What was my actual change here? NONE .. I didn't add any env .. I made other changes on the task definition ...
@piotrb yours is the only discussion I found of this problem. Just so you know, https://github.com/terraform-providers/terraform-provider-aws/pull/11463 should fix that.
Improvements to this difference handling have been merged and will release with version 2.68.0 of the Terraform AWS Provider, later this week. Thanks to @jbergknoff-rival for the implementation. 👍
This has been released in version 2.68.0 of the Terraform AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.
For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template for triage. Thanks!
I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!
Most helpful comment
@piotrb yours is the only discussion I found of this problem. Just so you know, https://github.com/terraform-providers/terraform-provider-aws/pull/11463 should fix that.