Terraform: Mechanism for creating and using a local temporary directory

Created on 14 May 2019 · 7 comments · Source: hashicorp/terraform

Use-cases

Although we generally prefer to work primarily in memory and with remote APIs when working with Terraform, sometimes for pragmatic reasons we need to interact with local files on disk. In some cases those files need to exist only temporarily to support something that a Terraform module is doing, such as writing out a file to disk in order to access it from a provisioner.

Today we have a mechanism for building paths under the module's own directory, intended mainly for reading supporting files that are included along with the module source code using the file function, but we don't have a good answer for creating temporary files in such a way that they can be automatically cleaned up after the Terraform run is complete.
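For reference, the existing mechanism looks like this (the file name is illustrative):

```hcl
# Read a supporting file that ships alongside the module's source code.
locals {
  startup_script = file("${path.module}/files/startup.sh.tpl")
}
```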

It would be convenient for each module to be able to allocate and use a temporary directory for such temporary files, ensuring that:

  • Each Terraform run sees a clean, empty directory, without risk of inadvertently picking up files left behind on a previous run.
  • Terraform can automatically clean up the directory after the run is complete.
  • Temporary files don't pollute the main configuration directory, allowing modules to treat their own directory as immutable.

Existing Solutions

Today it's most common to write temporary files into either path.module or path.root and just pick names that are unlikely to collide with other uses.

While this works okay, it does leave the files behind after the run completes, making them potentially visible to subsequent runs and leaving the module directory in an unclean state where version control tools may think the new file needs to be committed.

Proposal

One way to address this would be to support a new path.temp reference, with some special behavior compared to the other path. references.

Each module's separate terraform.EvalContext would have a field storing that module's temporary directory path, but it would start off empty indicating that no temporary path is allocated.

When preparing the variable scope for evaluating expressions, the scope builder would detect any references to path.temp and, if present, would call into the EvalContext to find the path to use. If no path is already allocated, one is allocated just in time (e.g. by calling ioutil.TempDir) and retained to be used for any future references to path.temp in the same module.

Once the graph walk is fully complete, Terraform would check whether a temporary directory was allocated for each module and attempt to delete it if so. The deletion might fail if e.g. the module created something with restrictive permissions in there, in which case it would emit a warning about it and continue.

Each module instance has its own separate path.temp so that different modules in the same configuration can have full control over their own path.temp namespaces and not need to coordinate with other modules.

If a particular file needs to be shared across a module boundary, the module which initially creates it would place it in its _own_ path.temp and then pass that path to another module either as an output value or as an input variable:

output "temporary_config_file" {
  value = "${path.temp}/temporary.conf"
}

This can work because Terraform would not attempt any directory cleanup until the whole graph walk has completed, so as long as the generating module doesn't delete or overwrite that file itself after returning its path, any other module can make use of it.
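On the consuming side this is just an ordinary string value. A minimal sketch, with hypothetical module and variable names:

```hcl
# Hypothetical wiring in the root module: pass the generating module's
# output value into another module as an input variable.
module "consumer" {
  source           = "./consumer"
  config_file_path = module.generator.temporary_config_file
}

# Inside ./consumer, the path arrives as a plain string:
variable "config_file_path" {
  type = string
}
```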

Temporary files only exist for the duration of one operation

An important caveat with this design is that temporary files cannot persist between operations. In particular, it would not work to create a temporary file during the plan phase and then access it during the apply phase, because the temporary file would've been deleted at the end of the plan phase.

To represent this, path.temp during the plan walk would return cty.UnknownVal(cty.String) -- or (known after apply), as the plan output calls it -- forcing any temporary-path-dependent work to be done during the apply step instead.

This means that the temporary path mechanism would not, in particular, be a good fit for the archive_file data source. That data source is not well-behaved because it has externally-visible side-effects during its read operation (creating the archive file), and thus it is generally not a good citizen in Terraform's workflow. The fact that it doesn't work well with path.temp is just a logical extension of its existing quirky behavior, and not something this proposal aims to address:

data "archive_file" "example" {
  type        = "zip"
  source_dir  = "${path.module}/archive_src"
  output_path = "${path.temp}/archive.zip"
}

The above would work, but it would cause a non-empty plan to be created every time, with the archive created during apply. Non-convergence is typical for archive_file anyway in the common case where the filesystem isn't preserved between subsequent Terraform runs, but when using path.temp it would hold even if terraform plan and the subsequent terraform apply were run on the same machine and in the same directory. This is a general design flaw of archive_file, and addressing it is outside the scope of this proposal.

path.temp can, however, be used with any files created during the apply step, such as when using the local_file resource or when redirecting local-exec provisioner output to disk so it can be consumed by downstream resources. Naturally, any use of it in resource attributes would be non-convergent because a new file would need to be created on every run.
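For example, under this proposal a module could render a kubeconfig into its temporary directory during apply and consume it before cleanup. A sketch, assuming path.temp existed as proposed (it does not exist today):

```hcl
# path.temp is the proposed, hypothetical feature.
resource "local_file" "kubeconfig" {
  content  = var.kubeconfig_content
  filename = "${path.temp}/kubeconfig"
}

resource "null_resource" "bootstrap" {
  provisioner "local-exec" {
    # The file exists only for the duration of the apply walk; Terraform
    # would delete the directory after the run completes.
    command = "kubectl --kubeconfig ${local_file.kubeconfig.filename} apply -f manifests/"
  }
}
```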

Labels: config · enhancement · proposal · thinking


All 7 comments

Could you please send me your pull request?

And honestly I'm nobody's fool to release my personal acct info so make sure it's written up along guidelines already set forth just incorporate new info....

Agreed - this would be highly valuable. Has anyone come up with a viable workaround in the meantime? Since .terraform is already in all .gitignore files, I'm considering using something like ${path.root}/.terraform/tmp. I think we would probably still have to initialize the folder as a one-time manual step.

Update: The workaround of ${path.root}/.terraform/tmp seems to be working well, with the exception that there's no automated cleanup.
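As a concrete sketch of that workaround (file name and content are illustrative; note that local_file creates missing parent directories in current provider versions, though processes writing into the folder themselves may still need a mkdir -p first):

```hcl
# Write an intermediate artifact into the gitignored .terraform directory
# rather than polluting the module directory itself.
resource "local_file" "rendered_policy" {
  content  = jsonencode({ Version = "2012-10-17", Statement = [] })
  filename = "${path.root}/.terraform/tmp/policy.json"
}
```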

Thanks @aaronsteers for the hint about the .terraform/tmp folder. What's still missing, of course, is the cleanup. We also have the issue that each run (e.g. terraform plan, done with AWS CodeBuild, hence on a fresh machine each time) creates the temporary files even if nothing else has changed, so we never get a No changes. Infrastructure is up-to-date.

Did I understand correctly that this proposal might address such cases? Are there other workarounds available today?

@herrLierb - I can vouch for the approach I described above (${path.module}/.terraform/tmp) as a method that has been working for us without a hitch for several months now, including in CI/CD environments. With a couple of caveats:

  1. I'm not using this, or recommending it, as a resource location, but as a temporary output directory for local processes, such as a provisioner-backed null resource building a pip package or zip file which will then be uploaded.
  2. Change detection should not rely on state within the tmp files, for the reason you call out: this would force an apply in each new CI/CD environment.
  3. To get around #2, I generally use an md5 hash of my input files (which are checked into source control and thus always available), and I use the md5 both for change detection and as a prefix in the local zip subfolder. By using the input files' collective md5, I ensure that each unique run will not contain extra files from previous runs.
  4. Per #3, I am able to work around the lack of cleanup, since my source files' md5 creates a unique subfolder on each run.
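Points 3 and 4 might be sketched like this (paths and names are hypothetical; filemd5 requires Terraform 0.12 or later):

```hcl
locals {
  # Collective hash of the checked-in input files.
  src_hash = md5(join("", [
    filemd5("${path.module}/src/main.py"),
    filemd5("${path.module}/src/requirements.txt"),
  ]))

  # A unique output subfolder per input-file state, so a run can never
  # pick up stale artifacts from a previous run.
  build_dir = "${path.root}/.terraform/tmp/${local.src_hash}"
}

resource "null_resource" "build_package" {
  triggers = {
    src_hash = local.src_hash # change detection keyed on the same hash
  }

  provisioner "local-exec" {
    command = "mkdir -p '${local.build_dir}' && zip -j '${local.build_dir}/package.zip' ${path.module}/src/*"
  }
}
```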

While I'm happy with my solution, I don't know if it works for all use cases. Here are my own use cases for reference:

  1. As an output folder for pip or zip archive creation. (Can be done in a way that avoids unnecessary apply steps during plan.)
  2. As an intermediate location to dump a JSON policy document or other intermediate artifacts, for debugging purposes. (Will probably generate extra apply steps when run in CI/CD.)

For my part, the spec in this proposal (including cleanup) would reduce the reliance on md5s in the subfolder path, and would also formalize the approach to designating the "proper/best" tmp directory - I think it has strong merit as written for those reasons.

On a pragmatic note, the caveats I would add are that Terraform "resources" in the strict sense probably shouldn't be created in the tmp folder (as discussed, to ensure portability), and that users may still need or want more advanced file-hashing techniques to detect changes in input files (using a method similar to what I've described above).

I'm curious if this is helpful for other folks, and if there are other use cases I'm not thinking of.

For anyone looking (and @herrLierb and @aaronsteers ), you can implement a cleanup with a null_resource.

In my case, I use local_file to create a file at ${path.root}/.terraform/tmp/${timestamp()}.txt. Then I execute a third-party binary using null_resource and a local-exec provisioner. I added another null_resource with the same trigger as the one used to execute the binary and had it run rm -rf ${local_file.script.filename}, with a depends_on on the first null_resource to ensure the order of execution.

resource "local_file" "script" {
  content  = var.script_content
  filename = "${path.root}/.terraform/tmp/${timestamp()}.txt"
}

resource "null_resource" "scriptExec" {
  triggers = {
    once = "${timestamp()}"
  }

  provisioner "local-exec" {
    command = "/usr/bin/third/party/binary -f ${local_file.script.filename}"
  }
}

resource "null_resource" "deleteLocalFile" {
  triggers = {
    once = timestamp()
  }

  depends_on = [
    null_resource.scriptExec,
  ]

  provisioner "local-exec" {
    command = "rm -rf ${local_file.script.filename}"
  }
}

Hope this helps until the feature is implemented.

Hi @chamilad, thanks for sharing!
What I'd actually like to have is that, if a terraform plan would only create temporary files, Terraform decides to do nothing and shows No changes. Infrastructure is up-to-date.
Rationale: sometimes temporary files (like in your example) are only needed to execute a second resource that should be tracked. (In your case you trigger it by timestamp, hence on each run. But imagine a local-exec that does not need to run with every terraform apply.)
I guess what I'm trying to say is that temporary files maybe should not be part of Terraform's state, and only need to be created (silently) if the state (i.e. all other resources) is about to change.

My current workaround is to create the file in each local-exec that needs it to be in place, but that is a lot of duplication if the same file is needed by a lot of resources (e.g. a kubeconfig).

resource "null_resource" "scriptExec" {
  triggers = {
    filename    = "/tmp/scriptExec_tempfile" // needed as trigger for destroy local-exec
    filecontent = var.content                // needed as trigger for destroy local-exec
  }


  provisioner "local-exec" {
    command = <<-EOF
      echo ${self.triggers.filecontent} > ${self.triggers.filename}  # or any other way to create the file
      <run create command>
    EOF
  }

  provisioner "local-exec" {
    when    = destroy
    command = <<-EOF
      echo ${self.triggers.filecontent} > ${self.triggers.filename}  # or any other way to create the file
      <run destroy command>
    EOF
  }
}

Note that I don't care about deleting the file, since it will be gone by the next run anyway.
Of course, you could add a rm ${self.triggers.filename} after each command.
