_This issue was originally opened by @joerggross as hashicorp/terraform#20152. It was migrated here as a result of the provider split. The original body of the issue is below._
Terraform v0.11.11
data "aws_s3_bucket_object" "lambda_jar_hash" {
bucket = "${var.lambda_s3_bucket}"
key = "${var.lambda_s3_key}.sha256"
}
resource "aws_lambda_function" "lambda_function_s3" {
s3_bucket = "${var.lambda_s3_bucket}"
s3_key = "${var.lambda_s3_key}"
s3_object_version = "${var.lambda_s3_object_version}"
function_name = "${var.lambda_function_name}"
role = "${var.lambda_execution_role_arn}"
handler = "${var.lambda_function_handler}"
source_code_hash = "${base64encode(data.aws_s3_bucket_object.lambda_jar_hash.body)}"
runtime = "java8"
memory_size = "${var.lambda_function_memory}"
timeout = "${var.lambda_function_timeout}"
description = "${var.description}"
reserved_concurrent_executions = "${var.reserved_concurrent_executions}"
}
...
~ module.comp-price-import-data-reader-scheduled-lambda.aws_lambda_function.lambda_function_s3
last_modified: "2019-01-30T11:58:32.826+0000" =>
source_code_hash: "6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=" => "ZTg3NTRjMjI0ZWFmYzZmMDcyZTAwMDI5OTg3NmQwOGFjZTQwYmY2YjkwNzkyMjYxZGQ3NDY4YjI2MmFkYmY0NQ=="
Plan: 0 to add, 1 to change, 0 to destroy.
We generate an additional file in the S3 bucket along with the lambda jar file to be deployed. The additional file contains a SHA-256 hash of the deployed jar file. The hash value from that file is assigned to the source_code_hash property of the lambda function using the base64encode function.
We would expect the hash to be stored in the tfstate and reused when applying the scripts, so that the lambda jar file is not redeployed unless the hash changes.
We applied the scripts several times without changing the jar or hash file in S3. Nevertheless, Terraform always redeploys the jar. The output (see above) is always the same ("6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=" => "ZTg3NTRjMjI0ZWFmYzZmMDcyZTAwMDI5OTg3NmQwOGFjZTQwYmY2YjkwNzkyMjYxZGQ3NDY4YjI2MmFkYmY0NQ=="). It seems that the given hash is never stored in the tfstate.
I do the same for my Go lambda function, but I don't use base64encode since the sha256 sum is already in the file.
Try this:
source_code_hash = "${data.aws_s3_bucket_object.lambda_jar_hash.body}"
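For this to converge, the .sha256 object's body must already be the base64 encoding of the raw SHA-256 digest (the representation AWS returns), not the hex output of sha256sum. A minimal sketch reusing the variables from the issue above (the hash-file format is my assumption):
data "aws_s3_bucket_object" "lambda_jar_hash" {
  bucket = "${var.lambda_s3_bucket}"
  key    = "${var.lambda_s3_key}.sha256"
}
resource "aws_lambda_function" "lambda_function_s3" {
  s3_bucket     = "${var.lambda_s3_bucket}"
  s3_key        = "${var.lambda_s3_key}"
  function_name = "${var.lambda_function_name}"
  role          = "${var.lambda_execution_role_arn}"
  handler       = "${var.lambda_function_handler}"
  runtime       = "java8"
  # The object body is used verbatim, so it must contain the base64-encoded
  # raw digest (e.g. "6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=") with no
  # trailing newline -- no base64encode() wrapper around a hex string.
  source_code_hash = "${data.aws_s3_bucket_object.lambda_jar_hash.body}"
}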
Dear all,
we have the issue described above, and our code looks similar. Each time we run terraform apply, the Lambda function is redeployed, even if nothing has changed. I have looked at the Terraform output and can confirm that the hash in source_code_hash is not updated in the state file.
Hi all,
the case here is pretty much the same: my Lambda function fetches its source code from an S3 object. I'm struggling to generate the proper source code hash string, which causes the Lambda to be updated on every single run (even when the source code is the same):
~ aws_lambda_function.lambda_fuctions
last_modified: "2019-04-25T09:59:53.110+0000" => <computed>
qualified_arn: "arn:aws:lambda:us-east-2:*:function:Test:1" => <computed>
source_code_hash: "QkHfqU5xHUNfKaRgSj4t5cSqPBZeI70Ga+b8H8QwlWk=" => "NDI0MWRmYTk0ZTcxMWQ0MzVmMjlhNDYwNGEzZTJkZTVjNGFhM2MxNjVlMjNiZDA2NmJlNmZjMWZjNDMwOTU2OQ=="
version: "1" => <computed>
To get the checksum of the *.jar file, I create another txt file that contains the sha256sum of the file:
sha256sum lambda.jar
4241dfa94e711d435f29a4604a3e2de5c4aa3c165e23bd066be6fc1fc4309569
If I use the built-in Terraform filebase64sha256 function, I see the same checksum as the one that Terraform gets for the S3 object:
filebase64sha256("../lambda.jar")
QkHfqU5xHUNfKaRgSj4t5cSqPBZeI70Ga+b8H8QwlWk=
But when I generate the base64-encoded checksum locally, the string I get is different:
echo -n "4241dfa94e711d435f29a4604a3e2de5c4aa3c165e23bd066be6fc1fc4309569" | base64
NDI0MWRmYTk0ZTcxMWQ0MzVmMjlhNDYwNGEzZTJkZTVjNGFhM2MxNjVlMjNiZDA2NmJlNmZjMWZj
NDMwOTU2OQ==
Result from the terraform console:
base64encode(filesha256("../lambda.jar"))
NDI0MWRmYTk0ZTcxMWQ0MzVmMjlhNDYwNGEzZTJkZTVjNGFhM2MxNjVlMjNiZDA2NmJlNmZjMWZjNDMwOTU2OQ==
The documentation for 0.11.11 explicitly says:
base64sha256(string) - Returns a base64-encoded representation of raw SHA-256 sum of the given string. This is not equivalent of base64encode(sha256(string)) since sha256() returns hexadecimal representation.
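To make the distinction concrete, a minimal sketch using the same file as the console session above (path illustrative):
# Both expressions hash the same file; only the first matches what AWS reports.
output "aws_style_hash" {
  # base64 of the raw 32-byte digest, e.g. "QkHfqU5xHUNfKaRgSj4t5cSqPBZeI70Ga+b8H8QwlWk="
  value = filebase64sha256("../lambda.jar")
}
output "hex_then_base64" {
  # base64 of the 64-character hex string -- a longer, different value
  value = base64encode(filesha256("../lambda.jar"))
}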
So the question here is how to generate the properly encoded SHA-256 string in a Linux shell on the host. The Terraform config looks like this:
data "aws_s3_bucket_object" "jar_hash" {
bucket = "essilor-lambda"
key = "lambda-functions/xxx/xxx/xxxx/lambda.txt"
}
output "test" {
value = "${base64encode(data.aws_s3_bucket_object.jar_hash.body)}"
}
resource "aws_lambda_function" "lambda_fuction" {
s3_bucket = "essilor-lambda"
s3_key = "lambda-functions/xxx/xxxx/xxxx/lambda.jar"
function_name = "Test"
description = "test"
handler = "xxxx.xxxx.xxxx.xxx.xxxxx"
role = "arn:aws:iam::xxxxx:role/xxxxxxxx"
runtime = "java8"
timeout = "5"
publish = "true"
source_code_hash = "${base64encode(data.aws_s3_bucket_object.jar_hash.body)}"
}
@uzun0v @cosots
Try something like this to generate the hash locally:
python3.7 -c "import base64;import hashlib;print(base64.b64encode(hashlib.sha256(open('$FILE','rb').read()).digest()).decode(), end='')"
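(This one-liner base64-encodes the raw digest, so its output matches both Terraform's filebase64sha256 and the CodeSha256 value AWS reports.)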
We're seeing the exact same issue: source_code_hash is never updated in the tfstate when applying, so the lambda resource always requires updating no matter how many times we apply:
~ source_code_hash = "83TsTFxfrLQJvQ8Re1YdXiGX2eQm1a1uX8Sc0bKeC3w=" -> "p6F5Wk4naphwng6ZQRNahuvJ7BUEFfHnMR9wQQpVkCM="
An easier (alternative) way to update a Lambda function on code change, when the code is sourced from S3, is to enable S3 bucket versioning and set the Lambda zip object version:
data "aws_s3_bucket_object" "lambda_zip" {
bucket = "bucket_name"
key = "lambda.zip"
}
resource "aws_lambda_function" "run_hll_lambda" {
s3_bucket = data.aws_s3_bucket_object.lambda_zip.bucket
s3_key = data.aws_s3_bucket_object.lambda_zip.key
s3_object_version = data.aws_s3_bucket_object.lambda_zip.version_id
function_name = "Lambda_name"
role = aws_iam_role.lambda_iam.arn
handler = "lambda_function.lambda_handler"
runtime = "python3.7"
}
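One caveat: the data source's version_id attribute is only populated when versioning is enabled on the bucket. A minimal sketch of that prerequisite, assuming the 2.x provider syntax used elsewhere in this thread:
resource "aws_s3_bucket" "lambda_bucket" {
  bucket = "bucket_name"

  # Without this, the aws_s3_bucket_object data source's version_id is empty.
  versioning {
    enabled = true
  }
}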
Using a version id will not work for us, because we want to use snapshot versions during development without always deploying and referencing a new version number.
I'm also experiencing this issue on v0.12.25, aws provider v2.62.0, with a jar that is uploaded directly to the lambda.
I've tried different hashing algorithms but the ones generated on apply never match the ones in state.
EDIT:
I noticed that the differing hashes are referenced in the debug logs, related to the warning:
Provider "registry.terraform.io/-/aws" produced an unexpected new value for but we are tolerating it because it is using the legacy plugin SDK.
I'm using:
Terraform v0.12.25
AWS v2.62.0
I'm experiencing this with v0.12.20 and aws provider v2.65.0 with a zip file that's referenced from an s3 bucket.
data "aws_s3_bucket_object" "lambda" {
bucket = aws_s3_bucket.lambda.id
key = "lambda.zip"
}
resource "aws_lambda_function" "lambda" {
s3_bucket = aws_s3_bucket.lambda.id
s3_key = data.aws_s3_bucket_object.lambda.key
function_name = "lambda"
role = aws_iam_role.lambda.arn
handler = "lambda.handler"
timeout = 300
memory_size = 256
source_code_hash = base64sha256(data.aws_s3_bucket_object.lambda.etag)
runtime = "python3.8"
}
I'm using the etag from the S3 object as the input for the hash, which shouldn't change unless we upload a new version.
When I run apply twice in a row, the input hash is always the same, but the new hash is not being persisted to the state and the next run shows the same output.
~ source_code_hash = "FxFe/pitsCj4XL/F+VORZASkGZdejRgNc7OABiKaWpg=" -> "oE4rN1nboxBBF64fQl8Q0GPtAE7bLqOofP/ACZPPz2A="
I am experiencing the same issue, specifically inside a CI/CD pipeline. It does not occur on OSX and it does not occur in Docker on OSX when the project directory is mounted from OSX.
resource "aws_lambda_function" "index" {
filename = "../lambda/index.zip"
function_name = "${var.project}_${var.environment}_redirect2index"
role = "${aws_iam_role.iam_for_basic_permission.arn}"
handler = "index.handler"
source_code_hash = filebase64sha256("../lambda/index.zip")
runtime = "nodejs12.x"
publish = true
provider = "aws.east"
}
However, with the same Docker image, TF version, and AWS provider version, the hashes in the CI pipeline never match. The one generated by filebase64sha256("../lambda/index.zip") matches between runs; however, the ones stored in state are completely different each time.
I thought this was an issue of something else getting hashed, such as a timestamp or similar, but the generated hash is the same. Somehow, the hash that gets computed doesn't get stored under source_code_hash.
This is actually quite a nasty problem, because when the Lambda is used with CloudFront, the latter redeploys each time, since AWS thinks a new version of the Lambda has been created. This adds at least 3, and often 10+, minutes to the CD pipeline.
I've done a little digging into this issue as I recently encountered it.
In my use case I generate the zip files frequently; even if the underlying contents don't change, metadata changes in the zip file cause a different hash.
To get around this, I tried to generate the hash of the contents outside of the zip and set it as the source code hash.
From my observations, it appears that the source_code_hash field gets set in the state file from the filename field regardless of the value supplied to it, i.e. filebase64sha256(aws_lambda_function.func.filename).
I have found a workaround that works for my case: using Terraform's built-in archive_file data source to generate the zip. Generating the zip outside Terraform seems to cause issues, even though it shouldn't.
Something like this works fine for me and doesn't cause the lambda to be updated between subsequent runs of the CI/CD pipeline, even days apart.
data "archive_file" "lambda" {
type = "zip"
source_file = "../lambda/index.js"
output_path = "../lambda/index.zip"
}
resource "aws_lambda_function" "index" {
filename = data.archive_file.lambda.output_path
function_name = "${var.project}_${var.environment}_lambda"
role = "${aws_iam_role.iam_for_basic_permission.arn}"
handler = "index.handler"
source_code_hash = data.archive_file.lambda.output_base64sha256
runtime = "nodejs12.x"
publish = true
provider = "aws.east"
}
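A plausible explanation for why this helps: archive_file writes the archive with normalized file metadata, so byte-identical sources yield a byte-identical zip and therefore a stable output_base64sha256, whereas externally built zips often embed fresh timestamps that change the hash on every build.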
I'm reporting the same concern, too.
The main problem is that the purpose of source_code_hash isn't clear. The documentation of aws_lambda_function presents source_code_hash as an argument that affects deployment, but that doesn't seem to be the case.
Looking at the source code, it is a computed field. After a successful deploy, the value of source_code_hash is overwritten by the response from AWS's API (code) via resourceAwsLambdaFunctionRead().
In short, the value assigned to source_code_hash doesn't affect deployment and is always overwritten by the hash returned by the AWS API, unless it already matches.
We need a way to deterministically trigger lambda deployments (e.g. after a code change is detected) without presuming that everyone uses the same process to package their code.
Is source_code_hash the correct attribute to use for this? Yes and no. It would be nice to keep the hash returned by AWS's API, but we would probably need another attribute, similar to source_code_hash, that meets our need.
My suggestion:
- Keep source_code_hash clearly defined as an output, and drop Optional: true from the schema for source_code_hash.
- Add a new attribute, change_trigger_hash, that is optional and not computed. Suggestions for a better name are welcome.
- If change_trigger_hash is null, plan and apply work as they do today.
- If change_trigger_hash is not null, compare the current value to the previous value. If they differ, include the change in the plan; otherwise, ignore the resource change.
@aeschright does this sound like something that we can do? I'll submit a PR if yes
===========================
Update: upon looking further, source_code_hash indeed triggers a change, which makes my suggestion invalid. I'll try out an idea that I hope will work.
I had a very similar problem where the state file was not getting an updated source_code_hash after an apply. @Miggleness pointed me in the right direction by noting that the value in source_code_hash is overwritten by AWS. This means that the hash you use in your lambda resource definition must be computed the same way AWS computes the hash; otherwise, you will always have a different value in source_code_hash, and your lambda will always be redeployed.
So when you see something like:
~ source_code_hash = "QuYMcyiptpzreIVxuq8AL+UWobBp3pDq045f2ISoKB0=" -> "42e60c7328a9b69ceb788571baaf002fe516a1b069de90ead38e5fd884a8281d" # forces replacement
The value on the left is the hash calculated by AWS; the value on the right is the value you are providing to Terraform in your lambda definition.
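If the build artifact is available locally at plan time, Terraform's built-in helper already produces AWS's representation; a minimal sketch (file name, role, and handler are hypothetical):
resource "aws_lambda_function" "example" {
  filename      = "lambda.zip"
  function_name = "example"
  role          = aws_iam_role.lambda.arn   # hypothetical IAM role resource
  handler       = "lambda_function.lambda_handler"
  runtime       = "python3.8"

  # filebase64sha256() base64-encodes the raw SHA-256 digest -- the same
  # representation AWS returns in CodeSha256 -- so the state converges.
  source_code_hash = filebase64sha256("lambda.zip")
}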
If you calculate the hash yourself in a shell, use the following command:
openssl dgst -sha256 -binary ${FILE_NAME}.zip | openssl enc -base64
If you calculate it with a python script, use something like the following:
import base64
import hashlib

def get_aws_hash(zip_file):
    '''Compute the base64-encoded SHA-256 hash of a zip archive, the way AWS does.'''
    with open(zip_file, "rb") as f:
        sha256_hash = hashlib.sha256()
        # Read and update the hash in blocks of 4K
        while byte_block := f.read(4096):
            sha256_hash.update(byte_block)
    # base64-encode the raw digest (not its hex representation)
    hash_value = base64.b64encode(sha256_hash.digest()).decode('utf-8')
    return hash_value

# Example usage:
#   print(get_aws_hash("lambda.zip"))