Terraform v0.12.3
Interpolation functions should not run until the resource named in `depends_on` has completed, because that resource may change the environment.
resource "null_resource" "lambda" {
provisioner "local-exec" {
command = "cd lambda; zip -u git_hook.zip git_hook.py"
}
}
resource "aws_lambda_function" "git_hook" {
filename = "lambda/git_hook.zip"
function_name = "git_hook_sqs"
role = aws_iam_role.iam_for_lambda.arn
handler = "lambda_handler"
source_code_hash = filebase64sha256("lambda/git_hook.zip")
runtime = "python3.7"
environment {
variables = {
foo = "bar"
}
}
depends_on = [null_resource.lambda]
}
This immediately fails with:

```
Call to function "filebase64sha256" failed: no file exists at lambda/git_hook.zip
```
`depends_on` should be transitive.

Current plan:
```
Terraform will perform the following actions:

  # aws_lambda_function.git_hook will be created
  + resource "aws_lambda_function" "git_hook" {
      + arn                            = (known after apply)
      + filename                       = "lambda/git_hook.zip"
      + function_name                  = "git_hook_sqs"
      + handler                        = "lambda_handler"
      + id                             = (known after apply)
      + invoke_arn                     = (known after apply)
      + last_modified                  = (known after apply)
      + memory_size                    = 128
      + publish                        = false
      + qualified_arn                  = (known after apply)
      + reserved_concurrent_executions = -1
      + role                           = "arn:aws:iam::121613305665:role/iam_for_lambda"
      + runtime                        = "python3.7"
      + source_code_hash               = "7QPosc5Dyd/EDhlFOdc25BY0ToF++NO78ARgkQnsK4s="
      + source_code_size               = (known after apply)
      + timeout                        = 3
      + version                        = (known after apply)

      + environment {
          + variables = {
              + "foo" = "bar"
            }
        }

      + tracing_config {
          + mode = (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.
```
Should be:

```
Terraform will perform the following actions:

  # aws_lambda_function.git_hook will be created
  + resource "aws_lambda_function" "git_hook" {
      + arn                            = (known after apply)
      + filename                       = "lambda/git_hook.zip"
      + function_name                  = "git_hook_sqs"
      + handler                        = "lambda_handler"
      + id                             = (known after apply)
      + invoke_arn                     = (known after apply)
      + last_modified                  = (known after apply)
      + memory_size                    = 128
      + publish                        = false
      + qualified_arn                  = (known after apply)
      + reserved_concurrent_executions = -1
      + role                           = "arn:aws:iam::121613305665:role/iam_for_lambda"
      + runtime                        = "python3.7"
      + source_code_hash               = (known after apply)
      + source_code_size               = (known after apply)
      + timeout                        = 3
      + version                        = (known after apply)

      + environment {
          + variables = {
              + "foo" = "bar"
            }
        }

      + tracing_config {
          + mode = (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.
```
Hi @mutt13y,
The various functions with `file` in their names that read files from disk are intended for use with files that are delivered as part of the configuration, such as being checked in to version control alongside the `.tf` files that reference them. They are _not_ for files that are generated during the `terraform apply` step.
We generally recommend against using Terraform to generate temporary artifacts locally, since that isn't really what it is for. We offer the facilities to do so because we're pragmatic and want to enable users to do some things that are slightly outside of Terraform's scope when needed, but the experience when doing so won't necessarily be smooth.
If generating the zip file as part of the terraform apply is important to your use case (rather than generating the artifact as a separate build step prior to running Terraform, which we'd recommend for most cases), I'd suggest also generating a hash of the file at the same time using the same mechanism (shell commands), rather than trying to mix work done by an external program with work done by Terraform.
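A minimal sketch of that suggestion (hypothetical; it assumes `zip` and `openssl` are available on the machine running Terraform, and the name of the hash file is my own invention) would build the archive and its hash in the same shell step:

```hcl
resource "null_resource" "lambda" {
  provisioner "local-exec" {
    # Build the zip and record its base64-encoded SHA-256 next to it, so no
    # Terraform function ever needs to hash a file that doesn't exist yet.
    command = <<-EOT
      cd lambda
      zip -u git_hook.zip git_hook.py
      openssl dgst -sha256 -binary git_hook.zip | base64 > git_hook.zip.sha256
    EOT
  }
}
```

The hash file would then be consumed by whatever tooling runs after Terraform, rather than fed back into the same plan.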
Hi @apparentlymart,
I take your point; I am just wondering what the use case for the `local-exec` provisioner is if it is not executed first (or last).
If there is a command that needs to be executed locally, I think you would mostly want to run it either before or after the apply.
So perhaps `local-exec` could gain an option to control when it runs?
Stuart
Provisioners in general are a sort of "last resort" feature for doing small fixups after an object is created that don't otherwise fit into Terraform's declarative model. For example, in some environments it's impractical to customize machine images so that compute instances can immediately start their work on boot, and provisioners can fill that gap by allowing last-moment initialization to happen on the remote host. As an example for local-exec in particular, it is sometimes used to run the official CLI tool of whatever remote system the user is working with, in order to trigger some non-declarative side-effects that are needed to get an object fully up and running.
Where possible, though, Terraform prefers to think of infrastructure objects as a sort of "appliance" that just starts doing its job as soon as it's created. For managed services that sort of behavior tends to come for free. For services you deploy yourself into a generic virtual machine, it will generally require a custom machine image and a feature like EC2's `user_data` to pass custom settings to that image.
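As a hedged illustration of that "appliance" model (the AMI ID, script, and service name below are placeholders, not anything from this thread):

```hcl
resource "aws_instance" "worker" {
  ami           = "ami-0123456789abcdef0" # a custom image baked ahead of time
  instance_type = "t3.micro"

  # Last-moment settings handed to the image at boot, instead of a provisioner.
  user_data = <<-EOT
    #!/bin/bash
    echo "QUEUE_URL=https://sqs.example.com/queue" >> /etc/myapp/env
    systemctl start myapp
  EOT
}
```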
They can also be used for things that I might claim Terraform _shouldn't_ be used for, such as generating artifacts for deployment, because that's just the nature of general features like that. The Terraform team is generally pragmatic about folks using these features to get things done even if it wasn't something the feature was intended for, but that doesn't mean that these unintended uses will come without friction.
Another feature in Terraform that exists to be pragmatic about this sort of unintended use case is the `local_file` data source, which offers a way to read a file from disk while obeying the usual lifecycle rules for data resources. Since data resources _can_ participate in the dependency graph, that can be used for certain dynamic file creation use-cases. However, it doesn't currently have a mechanism for reading a hash of a file rather than reading the file itself, so in order to work for your use-case here it would need some new features.
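A sketch of that pattern applied to the configuration above (with the caveat just mentioned: there is no hash attribute, only the file contents):

```hcl
data "local_file" "lambda_zip" {
  filename = "lambda/git_hook.zip"

  # Data resources participate in the dependency graph, so this read is
  # deferred until the provisioner has produced the file.
  depends_on = [null_resource.lambda]
}

# data.local_file.lambda_zip.content_base64 is then available, but nothing
# currently exposes a base64-encoded SHA-256 of the file.
```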
I think it would still be better to separate artifact creation from provisioning, though, because that has other benefits: you can build and test those artifacts using a CI system as is normally done for code to be deployed, you can keep a historical trail of older artifacts to roll back to in the event of a problem, and so on. There's a more specific suggestion for one way to set this up in the guide Serverless Applications with AWS Lambda and API Gateway. Even if your use-case doesn't include an API portion, the part about deploying the Lambda function could still be relevant/useful.
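One hedged sketch of that separation (the bucket name and key layout are placeholders; the linked guide describes the approach in full): CI builds, tests, and uploads a versioned artifact, and Terraform deploys a pinned version of it:

```hcl
variable "app_version" {
  description = "Artifact version to deploy, as produced by the CI pipeline."
}

resource "aws_lambda_function" "git_hook" {
  function_name = "git_hook_sqs"
  role          = aws_iam_role.iam_for_lambda.arn
  handler       = "git_hook.lambda_handler"
  runtime       = "python3.7"

  # CI uploads artifacts here; rolling back is just re-applying with an
  # older version number.
  s3_bucket = "example-lambda-artifacts"
  s3_key    = "git_hook/${var.app_version}/git_hook.zip"
}
```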
I would also like to see this kind of behaviour added. We currently use a `null_resource` with a `local-exec` provisioner to copy some files from a GCS bucket to the local machine. The contents of the copied files are used in a later step to create some Kubernetes secrets.
Although we specify that the Kubernetes secret resource depends on the local-exec command, Terraform doesn't wait for the local-exec to finish. This results in an error that the file needed to create the Kubernetes secret resource does not exist.
We don't necessarily create artifacts during the Terraform run, but we are very reliant on certain remote files which need to be pulled in during the run.
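A hypothetical way around the local copy entirely, assuming your version of the google provider offers the `google_storage_bucket_object_content` data source (the bucket, object, and secret names below are placeholders), would be to read the object through the provider so that normal data-resource ordering applies:

```hcl
data "google_storage_bucket_object_content" "secret_source" {
  bucket = "example-config-bucket"
  name   = "secrets/app.env"
}

resource "kubernetes_secret" "app" {
  metadata {
    name = "app-secrets"
  }

  # The data resource is read before this secret is created, with no
  # local-exec step in between.
  data = {
    "app.env" = data.google_storage_bucket_object_content.secret_source.content
  }
}
```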
I ended up writing a Makefile; you could use Concourse as a better alternative. I think that if you need something done before or after the apply, it is reasonable to use some other tooling.
Would it make more sense to use Vault for your secrets? I am sure the Vault provider will have proper dependencies.
I do end up wondering what the actual use case for local-exec is if we can't control when it runs. Well, as I read the comment above, it is only intended for small fixups, not for actual resource creation.
We looked at using Vault, but it currently is overkill to set up and maintain a whole client/server application just for our secrets.
There's going to be a lot of people wanting this use case when working with lambda, exactly as the OP is doing.
The original request:

> interpolation functions should not run until the depended_on resource has completed

seems pretty fair and straightforward to me.
Edit: although, I notice `source_code_hash` seems to be optional for `aws_lambda_function`. The documentation says it's:

> (Optional) Used to trigger updates.

I have no idea what "Used to trigger updates" is supposed to mean, but my Terraform does apply without it.
Edit: also, for the OP's use case, the archive_file data source (https://www.terraform.io/docs/providers/archive/d/archive_file.html) would be better.
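A minimal sketch of that alternative, reusing the paths from the OP's configuration (the remaining `aws_lambda_function` arguments are elided):

```hcl
data "archive_file" "git_hook" {
  type        = "zip"
  source_file = "lambda/git_hook.py"
  output_path = "lambda/git_hook.zip"
}

resource "aws_lambda_function" "git_hook" {
  # The data source both builds the zip and exposes its hash, so there is
  # no plan-time read of a file that doesn't exist yet.
  filename         = data.archive_file.git_hook.output_path
  source_code_hash = data.archive_file.git_hook.output_base64sha256
  # ...
}
```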
I personally think the expectation that developers should run additional build steps in an infrastructure repository, besides `terraform apply`, is a bit ugly, perhaps unpolished. I'd prefer to keep Terraform as the only application that performs tasks in the repo before pushing changes to infrastructure.
The syntax of `depends_on`, and the examples showing how it's used, definitely lead one to believe it could be used for exactly this: what the OP (and I) are after. This feels like an ugly gotcha in Terraform that I'd have to explain away to colleagues I'm trying to sell it to.
I managed to make this work with an empty zip file.
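Presumably something like the following (my hypothetical reading of that workaround, not the commenter's exact configuration):

```hcl
# Assumption: a valid placeholder lambda/git_hook.zip (e.g. built once from a
# stub file) is committed to the repo, so filebase64sha256() succeeds at plan
# time; the local-exec provisioner then overwrites its contents during apply.
resource "aws_lambda_function" "git_hook" {
  filename         = "lambda/git_hook.zip"
  source_code_hash = filebase64sha256("lambda/git_hook.zip")
  depends_on       = [null_resource.lambda]
  # ...
}
```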