Terragrunt: Errors with tflock, when deploying multiple terragrunt files

Created on 5 Nov 2019 · 4 Comments · Source: gruntwork-io/terragrunt

Hi all, I'm new to Terraform and Terragrunt, and I just read Terraform: Up and Running, 2nd edition.

I have some problems with Terragrunt, but maybe that's because I don't understand how it works. Hopefully someone can look at my structure and my usage of Terraform and tell me whether this is the right way.

I have two repositories infra-modules and infra-live.

Infra-live repo:

├── gcp-project
│   ├── cluster-name
│   │   ├── cronjob
│   │   │    ├── cronjob-1
│   │   │    │   ├── terragrunt.hcl
│   │   │    ├── cronjob-2
│   │   │    │   ├── terragrunt.hcl
│   ├── terragrunt.hcl

Infra-modules repo:

├── cronjob
│   ├── README.md
│   ├── variables.tf
│   ├── main.tf
│   ├── outputs.tf

My cronjob-1 terragrunt.hcl and cronjob-2 terragrunt.hcl look like this (only the name input differs):

terraform {
    source = "[email protected]:my-team/terraform-modules.git//cronjob?ref=v0.1.1"
}

include {
    path = find_in_parent_folders()
}

inputs = {
    name = "cronjob-1"
    namespace = "cronjob"
}

My main.tf in the cronjob module looks like this:

provider "google" {
    region = var.gcp_region
}

terraform {
    backend "gcs" {}
    required_version = ">= 0.12.0"
}

resource "kubernetes_cron_job" "cronjob" {
  metadata {
    name = var.name
    namespace = var.namespace
  }
  spec {
    concurrency_policy            = "Replace"
    failed_jobs_history_limit     = 5
    schedule                      = "1 0 * * *"
    starting_deadline_seconds     = 10
    successful_jobs_history_limit = 10
    suspend                       = true
    job_template {
      metadata {}
      spec {
        backoff_limit = 2
        template {
          metadata {}
          spec {
            container {
              name    = "hello"
              image   = "busybox"
              command = ["/bin/sh", "-c", "date; echo Hello from the Kubernetes cluster"]
            }
            restart_policy = "OnFailure"
          }
        }
      }
    }
  }
}

My variables.tf in cronjob module looks like this:

variable "gcp_region" {
    description = "The GCP region to deploy to (e.g. europe-west1)"
    type        = string
}

variable "name" {
    description = "The name of the CronJob"
    type        = string
}

variable "namespace" {
    description = "The namespace to deploy the CronJob"
    type        = string
}

As you can see, I want to have multiple cronjobs in different directories. If I run terragrunt plan in one of the cronjob directories in the live repo, it works as it should. But if I run terragrunt plan-all in the cluster-name directory to deploy both cronjobs, I get this error:

Error: Error locking state: Error acquiring the state lock: writing "gs://terraform-state/my-project/default.tflock" failed: googleapi: Error 412: Precondition Failed, conditionNotMet
Lock Info:

I understand that the issue might be that cronjob-1 is planned first, which locks the state, so Terragrunt can't proceed with cronjob-2.
Can anyone help me understand what I'm doing wrong here? Should I create both cronjob resources in the module and load it once in the live repo?

Terraform v0.12.12
Terragrunt v0.21.1

question

All 4 comments

Can you share the contents of gcp-project/terragrunt.hcl, specifically the gcs bucket remote state config?

Yes, here it is:

remote_state {
    backend = "gcs"
    config = {
      bucket = "terraform-state-my-team"
      prefix = "my-team-cluster-name"
    }
}

locals {
    default_yaml_path = find_in_parent_folders("empty.yaml")
}

inputs = merge(
    yamldecode(
        file("${get_terragrunt_dir()}/${find_in_parent_folders("env.yaml", local.default_yaml_path)}"),
    ),
    {
        gcp_region = "europe-west1"
    },
    {
        gcp_project = "my-project"
    },
)

env.yaml in cluster-name:

environment: "cluster-name"

The main issue here is:

prefix = "my-team-cluster-name"

Terragrunt will reuse this GCS remote state configuration for all the modules that have:

include {
    path = find_in_parent_folders()
}

By hard-coding the prefix, you are telling all the modules to use the same state file, which is what causes the lock conflict: plan-all runs the modules concurrently, and each one tries to acquire the same lock. It will most likely cause further problems as well, because each module's code will not match the shared state.

To avoid this, you want to set a prefix based on path_relative_to_include(), which will be unique per module because each module will have a different directory.
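
A minimal sketch of what that change could look like in gcp-project/terragrunt.hcl, reusing the bucket name from the config posted above. Using the relative path on its own as the prefix is an assumption made for illustration; a fixed leading segment could be prepended if the bucket is shared with other projects.

```hcl
remote_state {
    backend = "gcs"
    config = {
        bucket = "terraform-state-my-team"
        # Assumption: the relative path alone is used as the prefix.
        # path_relative_to_include() is evaluated per child module, so for
        # the layout above it resolves to
        #   cluster-name/cronjob/cronjob-1
        #   cluster-name/cronjob/cronjob-2
        # giving each module its own state and lock object.
        prefix = path_relative_to_include()
    }
}
```

With this in place, each cronjob directory should get its own default.tfstate and default.tflock under a distinct prefix, so plan-all no longer has the two modules competing for a single lock.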

Thanks @yorinasub17! Of course that wouldn't work, nice catch :)
