Terraform: Confusing downstream error on reliant resources when an upstream resource is invalid.

Created on 10 May 2017  ·  20 Comments  ·  Source: hashicorp/terraform

Terraform Version

0.9.4

Affected Resource(s)

All? Ran into it using aws_vpc.

Terraform Configuration Files

resource "aws_vpc" "main" {                                                                                                  
  cidr_block = "11.${data.template_file.environment_number.rendered}.0.0/16"                                                 

  tags {                                                                                                                     
    Name        = "${var.name}"                                                                                              
    Environment = "${var.name}"                                                                                              
    Terraform   = "Yes"                                                                                                      
  }                                                                                                                          
} 

resource "aws_subnet" "a" {                                                                                                
  vpc_id     = "${aws_vpc.main.id}"                                                                                        
  cidr_block = "11.${data.template_file.environment_number.rendered}.1.0/24"                                               

  tags {                                                                                                                   
    Name        = "${var.name}-a"                                                                                          
    Environment = "${var.name}"                                                                                            
    Terraform   = "Yes"                                                                                                    
  }                                                                                                                        
}                             

Expected Behavior

Admittedly this was an operator error, but the error message I got led me down a very confusing path, and the root cause wasn't clear until I'd commented out half of my module.

The VPC definition above was invalid because of an error with the inline template I was using. I'd have _expected_ this to throw an error such as:

Error running plan: 1 error(s) occurred:

* module.staging1.aws_vpc.main: "cidr_block" must contain a valid CIDR, got error parsing: invalid CIDR address: 11..0.0/16

And I did get this error, but ...

Actual Behavior

The downstream dependencies, in this case the aws_subnet, errored out, so instead of the above error I got the following:

Error running plan: 1 error(s) occurred:

* module.staging1.aws_route53_zone.env: 1 error(s) occurred:

* module.staging1.aws_route53_zone.env: Resource 'aws_vpc.main' not found for variable 'aws_vpc.main.id'

It would be nice if Terraform would output the root error rather than saying aws_vpc.main wasn't found. It's declared, but invalid. Only after commenting out all dependent resources did I get the correct error message.

Labels: bug, core

All 20 comments

👍 I've seen something like this when adding module outputs. The new output I just added doesn't appear after a Terraform run that otherwise seems 100% successful.

The problem? I mistyped something in the output value.

Hunting through debug output showed me the error of my ways the first couple times this happened and now I know to look for it.

But that's good news, right? If it's in the debug log, surely parse errors like this could just be output at higher urgency levels to address the issue...

@leftathome what exactly did you see in the debug log?

Hi @joestump! Sorry for this confusing error message.

Could you clarify what exactly the problem was with the template that led you here? I assume there was something going wrong in this data.template_file.environment_number resource, but it'd help to reproduce this if we had some more info on what was wrong with the template resource and what you did to make the error go away after all of this debugging work.

This bit me today. registry.json had invalid json content.

Error running plan: 1 error(s) occurred:

* module.registry.aws_ecs_service.registry: 1 error(s) occurred:

* module.registry.aws_ecs_service.registry: Resource 'aws_ecs_task_definition.registry' not found for variable 'aws_ecs_task_definition.registry.arn'

data "template_file" "registry" {
  template = "${file("${path.module}/task-definitions/registry.json")}"

  vars {
    REGISTRY_VERSION = "${var.registry_version}"
  }
}

resource "aws_ecs_task_definition" "registry" {
  family                = "registry"
  container_definitions = "${data.template_file.registry.rendered}"
}

/* container and task definitions for running the actual Docker registry */
resource "aws_ecs_service" "registry" {
  name            = "registry"
  cluster         = "${var.cluster_id}"
  task_definition = "${aws_ecs_task_definition.registry.arn}"
  desired_count   = 1
}

@apparentlymart it doesn't really matter what the upstream template is. I'd assume just hardcoding an invalid value would cause the same behavior. All that was wrong was that I was passing an empty variable to something that needed to be non-empty.

I didn't call it out, but the CIDR block passed was 11..0.0/16, which is clearly invalid, yet I didn't get an invalid CIDR block error or an error saying the aws_vpc couldn't be created.
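For illustration, a minimal configuration that reproduces the same masking behaviour might look like this (a hypothetical sketch, not the original module; the hardcoded invalid CIDR stands in for the broken template interpolation):

resource "aws_vpc" "main" {
  # Deliberately invalid CIDR ("11..0.0/16"), mimicking an empty
  # template variable being interpolated into the block.
  cidr_block = "11..0.0/16"
}

resource "aws_subnet" "a" {
  # This reference is what surfaces as the misleading
  # "Resource 'aws_vpc.main' not found for variable 'aws_vpc.main.id'" error.
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "11.0.1.0/24"
}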

I just hit this in a different situation, with an aws_iam_role resource that was referenced by an aws_lambda_function resource. The JSON of the aws_iam_role's assume_role_policy was invalid, and any complaint about that was masked by "Resource 'aws_iam_role.abc' not found for variable 'aws_iam_role.abc.arn' " (names changed to protect the innocent). I had to completely comment out the lambda before I got an error message about the broken JSON.
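The shape of that configuration would be roughly as follows (a hedged sketch with placeholder names and values, not the commenter's actual code; the stray trailing comma is what makes the policy invalid JSON):

resource "aws_iam_role" "abc" {
  name = "abc"

  # The trailing comma after "sts:AssumeRole" makes this invalid JSON,
  # so the role fails validation, but the error gets masked by the
  # lambda's reference below.
  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole",
    }
  ]
}
POLICY
}

resource "aws_lambda_function" "abc" {
  function_name = "abc"
  filename      = "lambda.zip"
  handler       = "index.handler"
  runtime       = "nodejs8.10"

  # This reference is what gets reported as
  # "Resource 'aws_iam_role.abc' not found for variable 'aws_iam_role.abc.arn'".
  role = "${aws_iam_role.abc.arn}"
}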

+1 I ran into this too with a missing ":" in a JSON file. The JSON was a .tpl for an aws_emr_security_configuration.

I'm encountering this issue if an invalid JSON string is passed into the container_definitions property of aws_ecs_task_definition.

Terraform raises an error about aws_ecs_task_definition being missing, rather than a JSON parsing exception. If I remove all of the task's dependents, Terraform (correctly) raises the JSON error.

I just experienced this problem. The root cause was that my aws_elb.name was too long, but the error message I got was unknown resource 'aws_elb.xxx' referenced in variable aws_elb.xxx.dns_name. I had to comment out all dependent resources before I could see the root error.
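A sketch of that situation (hypothetical names and values; ELB names are limited to 32 characters, which is the validation that fails here):

resource "aws_elb" "xxx" {
  # Longer than the 32-character limit on ELB names, so the resource
  # fails validation at plan time.
  name               = "this-load-balancer-name-is-far-too-long-to-be-valid"
  availability_zones = ["us-east-1a"]

  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }
}

# Referencing the broken ELB elsewhere is what produces the
# "unknown resource 'aws_elb.xxx' referenced in variable aws_elb.xxx.dns_name" message.
output "elb_dns_name" {
  value = "${aws_elb.xxx.dns_name}"
}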

@robyoung One useful workaround can be to use the -target argument to limit the planning to the resource in question, eg terraform plan -target=aws_elb.xxx. That avoids having to manually comment out the dependent resources, and should have the same effect on Terraform's behaviour. It's still not an obvious step, but it can help diagnose the root problem with a resource until this issue is fixed.

EDIT: I forgot to include the = in the -target= option.
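Applied to the original report, where the failing resource lives inside a module, that would be something like the following (module path taken from the error output above):

terraform plan -target=module.staging1.aws_vpc.main

Because -target only pulls in the target and its own dependencies, the dependent subnet and Route 53 zone are left out of the plan and the underlying CIDR validation error is shown directly.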

Painful debugging... The trick is to turn on TF_LOG=debug, track back from the [ERROR] output, and ignore all the "not found" messages; that will reveal the root error.
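For reference, that workflow amounts to something like this (the grep is just one convenient way to filter the log; -F treats the bracketed pattern literally):

TF_LOG=debug terraform plan -no-color 2> plan.log
grep -nF '[ERROR]' plan.log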

@apparentlymart to sum up the issue:

Errors on resources can be hidden by a dependent resource. Say, for example, we have an ECS task definition and an ECS service:

resource "aws_ecs_task_definition" "cavalcade" {
  family        = "${var.name}"

  container_definitions = <<TASK
[
Invalid JSON
]
TASK
}

resource "aws_ecs_service" "cavalcade" {
  name = "${var.name}"
  cluster = "${var.ecs_service_cluster_name}"
  desired_count = "${var.cavalcade_desired_count}"
  task_definition = "${aws_ecs_task_definition.cavalcade.family}:${aws_ecs_task_definition.cavalcade.revision}"

  lifecycle {
    create_before_destroy = true
  }
}

The task definition will fail to be created, but that error won't be displayed. The error aws_ecs_service.cavalcade: Resource 'aws_ecs_task_definition.cavalcade' not found for variable 'aws_ecs_task_definition.cavalcade.family' will be. The only way to find out there's an error in the task definition is to comment out the service definition and run a plan.

This also bit me and it was quite hard to debug. In my case a field in an ECS task definition that expects an integer was given a string. Using Terraform's debug mode and looking for the ERROR lines (thanks @trung) helped a lot. But yes, this issue feels like it needs an improvement in user experience.

TF_LOG=debug terraform plan -no-color 2> plan.log

The offending line in the task definition template (an integer field rendered as a quoted string):

    "memoryReservation": "${nginx_memory_reservation}"

And the root error buried in the debug log:

2018/06/15 10:57:28 [ERROR] root.ecs_service: eval: *terraform.EvalValidateResource, err: Warnings: []. Errors: [ECS Task Definition container_definitions is invalid: Error decoding JSON: json: cannot unmarshal string into Go struct field ContainerDefinition.MemoryReservation of type int64]

Also happened here with Terraform v0.11.7. It is quite annoying since AWS specifically uses a lot of JSON, so the potential for invalid JSON is pretty big.

Same happened here with google_container_cluster and google_container_node_pool: specifying an invalid CIDR for master_ipv4_cidr_block swallowed the original error message and showed a confusing message for the dependent resource.

When both resources are uncommented I get this error message:

Error: Error running plan: 1 error(s) occurred:

* module.environment.module.gke_cluster.google_container_node_pool.main_node_pool: 1 error(s) occurred:

* module.environment.module.gke_cluster.google_container_node_pool.main_node_pool: Resource 'google_container_cluster.project_cluster' not found for variable 'google_container_cluster.project_cluster.id'

and when I comment out the node pool I get this one:

Error: Error running plan: 1 error(s) occurred:

* module.environment.module.gke_cluster.google_container_cluster.project_cluster: expected master_ipv4_cidr_block to contain a valid CIDR, got: with err: invalid CIDR address:
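A rough sketch of that configuration (hypothetical names and values; depending on the google provider version, master_ipv4_cidr_block is either a top-level argument or nested inside a private_cluster_config block, and a real private cluster needs additional networking arguments):

resource "google_container_cluster" "project_cluster" {
  name               = "project-cluster"
  zone               = "europe-west1-b"
  initial_node_count = 1

  # An empty or malformed value here is what produced the
  # "expected master_ipv4_cidr_block to contain a valid CIDR" error.
  master_ipv4_cidr_block = ""
}

resource "google_container_node_pool" "main_node_pool" {
  name       = "main-node-pool"
  zone       = "europe-west1-b"
  cluster    = "${google_container_cluster.project_cluster.name}"
  node_count = 1

  # When the cluster fails validation, this dependency is what surfaces as
  # "Resource 'google_container_cluster.project_cluster' not found".
}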

I had this issue as well. It would have saved time if Terraform had surfaced the root error.

  • Initially, when I ran terraform apply, I got:

Error: Error running plan: 1 error(s) occurred:
* output.vmss_public_ip: Resource 'azurerm_public_ip.vmss' not found for variable 'azurerm_public_ip.vmss.fqdn'

  • When I commented out the output, I saw the underlying error:

Error: Error running plan: 1 error(s) occurred:
* azurerm_public_ip.vmss: only lowercase alphanumeric characters and hyphens allowed in "domain_name_label": "D0823-TF-VMSS-Packer"

  • The Terraform configuration causing the issue is below; "${azurerm_resource_group.vmss.name}" was resolving to "D0823-TF-VMSS-Packer".

resource "azurerm_public_ip" "vmss" {
  name                         = "vmss-public-ip"
  location                     = "${var.location}"
  resource_group_name          = "${azurerm_resource_group.vmss.name}"
  public_ip_address_allocation = "static"
  domain_name_label            = "${azurerm_resource_group.vmss.name}"
  tags                         = "${var.tags}"
}
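One possible fix for that last case (my suggestion, not something the commenter stated) is to lower-case the label with Terraform's lower() interpolation function:

  domain_name_label = "${lower(azurerm_resource_group.vmss.name)}"

which satisfies the "only lowercase alphanumeric characters and hyphens" rule as long as the resource group name contains no other disallowed characters.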

Still present in 0.11.8.

I got this when the JSON was valid, but it represented invalid container definitions.

Excerpted terraform:

resource "aws_ecs_service" "kibana" {
  ...
  task_definition = "${aws_ecs_task_definition.kibana.arn}"
}

resource "aws_ecs_task_definition" "kibana" {
  ...
  container_definitions = "${data.template_file.kibana_container_definitions.rendered}"
}

data "template_file" "kibana_container_definitions" {
  template = "${file("${path.module}/kibana_container_definitions.tpl.json")}"

  vars {
    container_name = "${var.container_name}"
    aws_account_id = "${var.aws_account_id}"
    aws_region = "${var.aws_region}"
    repository_name = "${var.repository_name}"
    container_version = "${var.container_version}"
    port = "${var.container_port}"
    host_port = "${module.constants.ecs_ephemeral_host_port}"
    log_group = "/ecs/kibana"
    log_stream_prefix = "ecs"
  }
}

Original kibana_container_definitions.tpl.json:

[
  {
    "name": "${container_name}",
    "image": "${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com/${repository_name}:${container_version}",
    "essential": true,
    "portMappings": [
      {
        "containerPort": ${port},
        "hostPort": ${host_port}
      }
    ],
    "healthCheck": [
      "CMD-SHELL",
      "curl -f http://localhost:${port}/ || exit 1"
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${log_group}",
        "awslogs-region": "${aws_region}",
        "awslogs-stream-prefix": "${log_stream_prefix}"
      }
    }
  }
]

(Note that the "healthCheck" definition is incorrect. This property does not accept an array. It accepts an object.)

Error resulting from terraform plan:

Error: Error running plan: 1 error(s) occurred:

* module.elasticstack.module.kibana.aws_ecs_service.kibana: 1 error(s) occurred:

* module.elasticstack.module.kibana.aws_ecs_service.kibana: Resource 'aws_ecs_task_definition.kibana' not found for variable 'aws_ecs_task_definition.kibana.arn'

Updating kibana_container_definitions.tpl.json to the following allowed the plan to proceed:

[
  {
    "name": "${container_name}",
    "image": "${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com/${repository_name}:${container_version}",
    "essential": true,
    "portMappings": [
      {
        "containerPort": ${port},
        "hostPort": ${host_port}
      }
    ],
    "healthCheck": {
      "command": ["CMD-SHELL", "curl -f http://localhost:${port}/ || exit 1"],
      "interval": 30,
      "timeout": 10,
      "retries": 2
    },
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${log_group}",
        "awslogs-region": "${aws_region}",
        "awslogs-stream-prefix": "${log_stream_prefix}"
      }
    }
  }
]

Hi all! Sorry for the long silence here.

This seems to be the same root problem as #18129, so I'm going to close this one just to consolidate discussion over there. As you can see in my comment on that issue, the problem is still not quite resolved but we plan to deal with it prior to the v0.12.0 final release.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
