Terragrunt: Terragrunt should detect missing module and provider download errors and automatically re-run init to fix them

Created on 2 Dec 2017  路  8Comments  路  Source: gruntwork-io/terragrunt

Terraform has two types of frustrating errors:

  1. You add a module or modify its version and then next time you run plan or apply, you get an error that you need to re-run get. If Terraform knows you need to do that, why doesn't it just do it for you?

  2. You run init to download the code for providers and get a TLS handshake error or similar timeout. This happens especially often when using xxx-all commands, so many downloads are happening concurrently. Not sure if this is a bug in Terraform's download code or throttling on their servers, but I've had many xxx-all commands fail because of this.

Terragrunt should detect these errors and retry the previous command automatically.

enhancement help wanted

Most helpful comment

Adding a second motivation for this issue... I'm running across this when I want to upgrade the version of a provider module I'm using.

We have a boilerplate.tf that I symlink into all of my modules that has this clause:

provider "azurerm" {
  version = "1.0.0"
}

We do this so that we can test and coordinate upgrades of the provider at our leisure rather than having them randomly happen.

Unfortunately, when we update the version to 1.0.1, terragrunt plan-all fails because all of the existing Terragrunt work folders have init-ed 1.0.0 and need to be re-init-ed to get 1.0.1.

A stopgap measure could be to implement a terragrunt init-all that allows us to re-run init in all the modules the other -all commands would run. That might be easier to implement than auto-detection?

ETA: I figured out that adding the --terragrunt-source-update flag after plan-all provides a workaround for this issue, at least for us who use solely remote tfstate.

All 8 comments

Adding a second motivation for this issue... I'm running across this when I want to upgrade the version of a provider module I'm using.

We have a boilerplate.tf that I symlink into all of my modules that has this clause:

provider "azurerm" {
  version = "1.0.0"
}

We do this so that we can test and coordinate upgrades of the provider at our leisure rather than having them randomly happen.

Unfortunately, when we update the version to 1.0.1, terragrunt plan-all fails because all of the existing Terragrunt work folders have init-ed 1.0.0 and need to be re-init-ed to get 1.0.1.

A stopgap measure could be to implement a terragrunt init-all that allows us to re-run init in all the modules the other -all commands would run. That might be easier to implement than auto-detection?

ETA: I figured out that adding the --terragrunt-source-update flag after plan-all provides a workaround for this issue, at least for us who use solely remote tfstate.

init-all sounds like a great interim solution! PR very welcome for that :)

Here's one of the timeouts I saw just recently:

Failed to load state: RequestError: send request failed
caused by: Get https://<BUCKET_NAME>.us-west-2.amazonaws.com/?prefix=env%3A%2F: dial tcp 54.231.176.160:443: i/o timeout

And one more:

 Error: 1 error(s) occurred:

 * provider.external: new or changed plugin executable

And two more:

Error installing provider "aws": error fetching checksums: Get https://releases.hashicorp.com/terraform-provider-aws/0.1.4/terraform-provider-aws_0.1.4_SHA256SUMS: net/http: TLS handshake timeout.
Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...

Error installing provider "template": error fetching checksums: Get https://releases.hashicorp.com/terraform-provider-template/1.0.0/terraform-provider-template_1.0.0_SHA256SUMS: net/http: TLS handshake timeout.

One more!

Failed to load backend: 
Error configuring the backend "s3": RequestError: send request failed
caused by: Post https://sts.amazonaws.com/: net/http: TLS handshake timeout

And yet another one that comes up all the time (it's a known Terraform bug that has been ignored for months):

* module.high_memory_usage_alarms.aws_cloudwatch_metric_alarm.asg_high_memory_utilization: 1 error(s) occurred:

* aws_cloudwatch_metric_alarm.asg_high_memory_utilization: Creating metric alarm failed: ValidationError: A separate request to update this alarm is in progress.
    status code: 400, request id: 94309fbd-7e09-11e8-a5f8-1de9e697c6bf

Hi @brikis98, thank you for capturing all these errors!

With new releases of Terraform and new added-value services such as Terraform Registry a few new types of errors (or a whole class, rather) has been introduced. I am going capture some of them as follows:

Error: Module version requirements have changed

  on main.tf line 10, in module "vpc":
  10:   source  = "terraform-aws-modules/vpc/aws"

The version requirements have changed since this module was installed and the
installed version (2.47.0) is no longer acceptable. Run "terraform init" to
install all modules required by this configuration.

[terragrunt] 2020/09/10 11:09:42 Hit multiple errors:
exit status 1
Error: Could not satisfy plugin requirements


Plugin reinitialization required. Please run "terraform init".

Plugins are external binaries that Terraform uses to access and manipulate
resources. The configuration provided requires plugins which can't be located,
don't satisfy the version constraints, or are otherwise incompatible.

Terraform automatically discovers provider requirements from your
configuration, including providers used in child modules. To see the
requirements and constraints from each module, run "terraform providers".



Error: provider.aws: no suitable version installed
  version requirements: " < 4.0,3.5.0,>= 2.68"
  versions installed: "3.4.0"



[terragrunt] 2020/09/10 11:10:34 Hit multiple errors:
exit status 1



md5-ecaafb0627d821648637ecd7cc7a943c



Error: Error accessing remote module registry

Failed to retrieve available versions for module "vpc" (main.tf:9) from
registry.terraform.io: Failed to request discovery document: Get
https://registry.terraform.io/.well-known/terraform.json: dial tcp
151.101.2.49:443: connect: connection refused.

[terragrunt] 2020/09/10 11:15:40 Hit multiple errors:
exit status 1



md5-ecaafb0627d821648637ecd7cc7a943c



Error: Error accessing remote module registry

Failed to retrieve available versions for module "vpc" (main.tf:9) from
registry.terraform.io: Failed to request discovery document: Get
https://registry.terraform.io/.well-known/terraform.json: remote error: tls:
internal error.

[terragrunt] 2020/09/10 11:18:36 Hit multiple errors:
exit status 1



md5-ecaafb0627d821648637ecd7cc7a943c



Error: Error accessing remote module registry

Failed to retrieve available versions for module "vpc" (main.tf:9) from
registry.terraform.io: Discovery URL has a malformed Content-Type "".

[terragrunt] 2020/09/10 11:22:13 Hit multiple errors:
exit status 1
Was this page helpful?
0 / 5 - 0 ratings