Terraform: Confusing downstream error on reliant resources when an upstream resource is invalid.

Created on 10 May 2017  ·  20 Comments  ·  Source: hashicorp/terraform

Terraform Version

0.9.4

Affected Resource(s)

All? Ran into it using aws_vpc.

Terraform Configuration Files

resource "aws_vpc" "main" {                                                                                                  
  cidr_block = "11.${data.template_file.environment_number.rendered}.0.0/16"                                                 

  tags {                                                                                                                     
    Name        = "${var.name}"                                                                                              
    Environment = "${var.name}"                                                                                              
    Terraform   = "Yes"                                                                                                      
  }                                                                                                                          
} 

resource "aws_subnet" "a" {                                                                                                
  vpc_id     = "${aws_vpc.main.id}"                                                                                        
  cidr_block = "11.${data.template_file.environment_number.rendered}.1.0/24"                                               

  tags {                                                                                                                   
    Name        = "${var.name}-a"                                                                                          
    Environment = "${var.name}"                                                                                            
    Terraform   = "Yes"                                                                                                    
  }                                                                                                                        
}                             

Expected Behavior

Admittedly this was an operator error, but the error message I got led me down a very confusing path, and the root cause wasn't clear until I'd commented out half of my module.

The VPC definition above was invalid because of an error with the inline template I was using. I'd have _expected_ this to throw an error such as:

Error running plan: 1 error(s) occurred:

* module.staging1.aws_vpc.main: "cidr_block" must contain a valid CIDR, got error parsing: invalid CIDR address: 11..0.0/16

And I did get this error, but ...

Actual Behavior

The downstream dependencies, in this case the aws_subnet, errored out, so instead of the above error I got the following:

Error running plan: 1 error(s) occurred:

* module.staging1.aws_route53_zone.env: 1 error(s) occurred:

* module.staging1.aws_route53_zone.env: Resource 'aws_vpc.main' not found for variable 'aws_vpc.main.id'

It would be nice if Terraform would output the root error rather than saying aws_vpc.main wasn't found. It's declared, but invalid. Only after commenting out all dependent resources did I get the correct error message.

Labels: bug, core

All 20 comments

👍 I've seen something like this when adding module outputs. The new output I just added doesn't appear after a Terraform run that otherwise seems 100% successful.

The problem? I mistyped something in the output value.

Hunting through debug output showed me the error of my ways the first couple times this happened and now I know to look for it.

But that's good news, right? If it's in the debug log, surely parse errors like this could just be output at higher urgency levels to address the issue...

@leftathome what exactly did you see in the debug log?

Hi @joestump! Sorry for this confusing error message.

Could you clarify what exactly the problem was with the template that led you here? I assume there was something going wrong in this data.template_file.environment_number resource, but it'd help to reproduce this if we had some more info on what was wrong with the template resource and what you did to make the error go away after all of this debugging work.

This bit me today. registry.json had invalid json content.

Error running plan: 1 error(s) occurred:

* module.registry.aws_ecs_service.registry: 1 error(s) occurred:

* module.registry.aws_ecs_service.registry: Resource 'aws_ecs_task_definition.registry' not found for variable 'aws_ecs_task_definition.registry.arn'

data "template_file" "registry" {
  template = "${file("${path.module}/task-definitions/registry.json")}"

  vars {
    REGISTRY_VERSION = "${var.registry_version}"
  }
}

resource "aws_ecs_task_definition" "registry" {
  family                = "registry"
  container_definitions = "${data.template_file.registry.rendered}"
}

/* container and task definitions for running the actual Docker registry */
resource "aws_ecs_service" "registry" {
  name            = "registry"
  cluster         = "${var.cluster_id}"
  task_definition = "${aws_ecs_task_definition.registry.arn}"
  desired_count   = 1
}

@apparentlymart it doesn't really matter what the upstream template is. I'd assume just hardcoding an invalid value would cause the same behavior. All that was wrong was that I was passing an empty variable to something that needed to be non-empty.

I didn't call it out, but the CIDR block passed was 11..0.0/16, which is clearly invalid, yet I didn't get an invalid CIDR block error or an error saying the aws_vpc couldn't be created.
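For illustration, a minimal configuration that reproduces the same masking behaviour might look like this (a hypothetical sketch, not the original module; the hardcoded invalid CIDR stands in for the broken template interpolation):

resource "aws_vpc" "main" {
  # Deliberately invalid CIDR ("11..0.0/16"), mimicking an empty
  # template variable being interpolated into the block.
  cidr_block = "11..0.0/16"
}

resource "aws_subnet" "a" {
  # This reference is what surfaces as the misleading
  # "Resource 'aws_vpc.main' not found for variable 'aws_vpc.main.id'" error.
  vpc_id     = "${aws_vpc.main.id}"
  cidr_block = "11.0.1.0/24"
}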

I just hit this in a different situation, with an aws_iam_role resource that was referenced by an aws_lambda_function resource. The JSON of the aws_iam_role's assume_role_policy was invalid, and any complaint about that was masked by "Resource 'aws_iam_role.abc' not found for variable 'aws_iam_role.abc.arn' " (names changed to protect the innocent). I had to completely comment out the lambda before I got an error message about the broken JSON.
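The shape of that configuration would be roughly as follows (a hedged sketch with placeholder names and values, not the commenter's actual code; the stray trailing comma is what makes the policy invalid JSON):

resource "aws_iam_role" "abc" {
  name = "abc"

  # The trailing comma after "sts:AssumeRole" makes this invalid JSON,
  # so the role fails validation, but the error gets masked by the
  # lambda's reference below.
  assume_role_policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": "sts:AssumeRole",
    }
  ]
}
POLICY
}

resource "aws_lambda_function" "abc" {
  function_name = "abc"
  filename      = "lambda.zip"
  handler       = "index.handler"
  runtime       = "nodejs8.10"

  # This reference is what gets reported as
  # "Resource 'aws_iam_role.abc' not found for variable 'aws_iam_role.abc.arn'".
  role = "${aws_iam_role.abc.arn}"
}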

+1 I ran into this too with a missing ":" in a JSON file. The JSON was a .tpl for an aws_emr_security_configuration.

I'm encountering this issue if an invalid JSON string is passed into the container_definitions property of aws_ecs_task_definition.

Terraform raises an error about aws_ecs_task_definition being missing, rather than a JSON parsing exception. If I remove all of the task's dependents, Terraform (correctly) raises the JSON error.

I just experienced this problem. The root cause was that my aws_elb.name was too long, but the error message I got was unknown resource 'aws_elb.xxx' referenced in variable aws_elb.xxx.dns_name. I had to comment out all dependent resources before I could see the root error.
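A sketch of that situation (hypothetical names and values; ELB names are limited to 32 characters, which is the validation that fails here):

resource "aws_elb" "xxx" {
  # Longer than the 32-character limit on ELB names, so the resource
  # fails validation at plan time.
  name               = "this-load-balancer-name-is-far-too-long-to-be-valid"
  availability_zones = ["us-east-1a"]

  listener {
    instance_port     = 80
    instance_protocol = "http"
    lb_port           = 80
    lb_protocol       = "http"
  }
}

# Referencing the broken ELB elsewhere is what produces the
# "unknown resource 'aws_elb.xxx' referenced in variable aws_elb.xxx.dns_name" message.
output "elb_dns_name" {
  value = "${aws_elb.xxx.dns_name}"
}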

@robyoung One useful workaround can be to use the -target argument to limit the planning to the resource in question, eg terraform plan -target=aws_elb.xxx. That avoids having to manually comment out the dependent resources, and should have the same effect on Terraform's behaviour. It's still not an obvious step, but it can help diagnose the root problem with a resource until this issue is fixed.

EDIT: I forgot to include the = in the -target= option.
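Applied to the original report, where the failing resource lives inside a module, that would be something like the following (module path taken from the error output above):

terraform plan -target=module.staging1.aws_vpc.main

Because -target only pulls in the target and its own dependencies, the dependent subnet and Route 53 zone are left out of the plan and the underlying CIDR validation error is shown directly.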

Painful debugging... The trick is to turn on TF_LOG=debug, track back from the [ERROR] output, and ignore all the "not found" messages; that will reveal the root error.
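For reference, that workflow amounts to something like this (the grep is just one convenient way to filter the log; -F treats the bracketed pattern literally):

TF_LOG=debug terraform plan -no-color 2> plan.log
grep -nF '[ERROR]' plan.log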

@apparentlymart to sum up the issue:

Errors on resources can be hidden by a dependent resource. Say, for example, we have an ECS task definition and an ECS service:

resource "aws_ecs_task_definition" "cavalcade" {
  family        = "${var.name}"

  container_definitions = <<TASK
[
Invalid JSON
]
TASK
}

resource "aws_ecs_service" "cavalcade" {
  name = "${var.name}"
  cluster = "${var.ecs_service_cluster_name}"
  desired_count = "${var.cavalcade_desired_count}"
  task_definition = "${aws_ecs_task_definition.cavalcade.family}:${aws_ecs_task_definition.cavalcade.revision}"

  lifecycle {
    create_before_destroy = true
  }
}

The task definition will fail to be created, but that error won't be displayed. The error aws_ecs_service.cavalcade: Resource 'aws_ecs_task_definition.cavalcade' not found for variable 'aws_ecs_task_definition.cavalcade.family' will be. The only way to find out there's an error in the task definition is to comment out the service definition and run a plan.

This also bit me and it was quite hard to debug. In my case a field in an ECS task definition that expects an integer was given a string. Using Terraform's debug mode and looking for the ERROR lines (thanks @trung) helped a lot. But yes, this issue feels like it needs an improvement in user experience.

TF_LOG=debug terraform plan -no-color 2> plan.log

The offending line in the task definition template (an integer field rendered as a quoted string):

    "memoryReservation": "${nginx_memory_reservation}"

And the root error buried in the debug log:

2018/06/15 10:57:28 [ERROR] root.ecs_service: eval: *terraform.EvalValidateResource, err: Warnings: []. Errors: [ECS Task Definition container_definitions is invalid: Error decoding JSON: json: cannot unmarshal string into Go struct field ContainerDefinition.MemoryReservation of type int64]

Also happened here with Terraform v0.11.7. It is quite annoying since AWS specifically uses a lot of JSON, so the potential for invalid JSON is pretty big.

Same happened here with google_container_cluster and google_container_node_pool: specifying an invalid CIDR for master_ipv4_cidr_block swallowed the original error message and showed a confusing message for the dependent resource.

When both resources are uncommented I get this error message:

Error: Error running plan: 1 error(s) occurred:

* module.environment.module.gke_cluster.google_container_node_pool.main_node_pool: 1 error(s) occurred:

* module.environment.module.gke_cluster.google_container_node_pool.main_node_pool: Resource 'google_container_cluster.project_cluster' not found for variable 'google_container_cluster.project_cluster.id'

and when I comment out the node pool I get this one:

Error: Error running plan: 1 error(s) occurred:

* module.environment.module.gke_cluster.google_container_cluster.project_cluster: expected master_ipv4_cidr_block to contain a valid CIDR, got: with err: invalid CIDR address:
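A rough sketch of that configuration (hypothetical names and values; depending on the google provider version, master_ipv4_cidr_block is either a top-level argument or nested inside a private_cluster_config block, and a real private cluster needs additional networking arguments):

resource "google_container_cluster" "project_cluster" {
  name               = "project-cluster"
  zone               = "europe-west1-b"
  initial_node_count = 1

  # An empty or malformed value here is what produced the
  # "expected master_ipv4_cidr_block to contain a valid CIDR" error.
  master_ipv4_cidr_block = ""
}

resource "google_container_node_pool" "main_node_pool" {
  name       = "main-node-pool"
  zone       = "europe-west1-b"
  cluster    = "${google_container_cluster.project_cluster.name}"
  node_count = 1

  # When the cluster fails validation, this dependency is what surfaces as
  # "Resource 'google_container_cluster.project_cluster' not found".
}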

I had this issue as well. It would have saved time if Terraform had surfaced the root error.

  • Initially, when I ran terraform apply, I got:

Error: Error running plan: 1 error(s) occurred:
* output.vmss_public_ip: Resource 'azurerm_public_ip.vmss' not found for variable 'azurerm_public_ip.vmss.fqdn'

  • When I commented out the output, I saw the underlying error:

Error: Error running plan: 1 error(s) occurred:
* azurerm_public_ip.vmss: only lowercase alphanumeric characters and hyphens allowed in "domain_name_label": "D0823-TF-VMSS-Packer"

  • The Terraform configuration causing the issue is below; "${azurerm_resource_group.vmss.name}" was resolving to "D0823-TF-VMSS-Packer".

resource "azurerm_public_ip" "vmss" {
  name                         = "vmss-public-ip"
  location                     = "${var.location}"
  resource_group_name          = "${azurerm_resource_group.vmss.name}"
  public_ip_address_allocation = "static"
  domain_name_label            = "${azurerm_resource_group.vmss.name}"
  tags                         = "${var.tags}"
}
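One possible fix for that last case (my suggestion, not something the commenter stated) is to lower-case the label with Terraform's lower() interpolation function:

  domain_name_label = "${lower(azurerm_resource_group.vmss.name)}"

which satisfies the "only lowercase alphanumeric characters and hyphens" rule as long as the resource group name contains no other disallowed characters.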

Still present in 0.11.8.

I got this when the JSON was valid, but it represented invalid container definitions.

Excerpted terraform:

resource "aws_ecs_service" "kibana" {
  ...
  task_definition = "${aws_ecs_task_definition.kibana.arn}"
}

resource "aws_ecs_task_definition" "kibana" {
  ...
  container_definitions = "${data.template_file.kibana_container_definitions.rendered}"
}

data "template_file" "kibana_container_definitions" {
  template = "${file("${path.module}/kibana_container_definitions.tpl.json")}"

  vars {
    container_name = "${var.container_name}"
    aws_account_id = "${var.aws_account_id}"
    aws_region = "${var.aws_region}"
    repository_name = "${var.repository_name}"
    container_version = "${var.container_version}"
    port = "${var.container_port}"
    host_port = "${module.constants.ecs_ephemeral_host_port}"
    log_group = "/ecs/kibana"
    log_stream_prefix = "ecs"
  }
}

Original kibana_container_definitions.tpl.json:

[
  {
    "name": "${container_name}",
    "image": "${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com/${repository_name}:${container_version}",
    "essential": true,
    "portMappings": [
      {
        "containerPort": ${port},
        "hostPort": ${host_port}
      }
    ],
    "healthCheck": [
      "CMD-SHELL",
      "curl -f http://localhost:${port}/ || exit 1"
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${log_group}",
        "awslogs-region": "${aws_region}",
        "awslogs-stream-prefix": "${log_stream_prefix}"
      }
    }
  }
]

(Note that the "healthCheck" definition is incorrect. This property does not accept an array. It accepts an object.)

Error resulting from terraform plan:

Error: Error running plan: 1 error(s) occurred:

* module.elasticstack.module.kibana.aws_ecs_service.kibana: 1 error(s) occurred:

* module.elasticstack.module.kibana.aws_ecs_service.kibana: Resource 'aws_ecs_task_definition.kibana' not found for variable 'aws_ecs_task_definition.kibana.arn'

Updating kibana_container_definitions.tpl.json to the following allowed the plan to proceed:

[
  {
    "name": "${container_name}",
    "image": "${aws_account_id}.dkr.ecr.${aws_region}.amazonaws.com/${repository_name}:${container_version}",
    "essential": true,
    "portMappings": [
      {
        "containerPort": ${port},
        "hostPort": ${host_port}
      }
    ],
    "healthCheck": {
      "command": ["CMD-SHELL", "curl -f http://localhost:${port}/ || exit 1"],
      "interval": 30,
      "timeout": 10,
      "retries": 2
    },
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${log_group}",
        "awslogs-region": "${aws_region}",
        "awslogs-stream-prefix": "${log_stream_prefix}"
      }
    }
  }
]

Hi all! Sorry for the long silence here.

This seems to be the same root problem as #18129, so I'm going to close this one just to consolidate discussion over there. As you can see in my comment on that issue, the problem is still not quite resolved but we plan to deal with it prior to the v0.12.0 final release.

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
