_This issue was originally opened by @joerggross as hashicorp/terraform#20152. It was migrated here as a result of the provider split. The original body of the issue is below._
Terraform v0.11.11
data "aws_s3_bucket_object" "lambda_jar_hash" {
bucket = "${var.lambda_s3_bucket}"
key = "${var.lambda_s3_key}.sha256"
}
resource "aws_lambda_function" "lambda_function_s3" {
s3_bucket = "${var.lambda_s3_bucket}"
s3_key = "${var.lambda_s3_key}"
s3_object_version = "${var.lambda_s3_object_version}"
function_name = "${var.lambda_function_name}"
role = "${var.lambda_execution_role_arn}"
handler = "${var.lambda_function_handler}"
source_code_hash = "${base64encode(data.aws_s3_bucket_object.lambda_jar_hash.body)}"
runtime = "java8"
memory_size = "${var.lambda_function_memory}"
timeout = "${var.lambda_function_timeout}"
description = "${var.description}"
reserved_concurrent_executions = "${var.reserved_concurrent_executions}"
}
...
~ module.comp-price-import-data-reader-scheduled-lambda.aws_lambda_function.lambda_function_s3
last_modified: "2019-01-30T11:58:32.826+0000" =>
source_code_hash: "6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=" => "ZTg3NTRjMjI0ZWFmYzZmMDcyZTAwMDI5OTg3NmQwOGFjZTQwYmY2YjkwNzkyMjYxZGQ3NDY4YjI2MmFkYmY0NQ=="
Plan: 0 to add, 1 to change, 0 to destroy.
We generate an additional file in the S3 bucket along with the lambda jar file to be deployed. The additional file contains a SHA-256 hash of the deployed jar file. The hash value from that file is assigned to the source_code_hash property of the lambda function using the base64encode function.
We would expect the hash to be stored in the tfstate and reused when applying the scripts, so that the lambda jar file is not redeployed unless the hash changes.
We applied the scripts several times without changing the jar or hash file in S3. Nevertheless, Terraform always redeploys the jar. The output (see above) is always the same ("6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=" => "ZTg3NTRjMjI0ZWFmYzZmMDcyZTAwMDI5OTg3NmQwOGFjZTQwYmY2YjkwNzkyMjYxZGQ3NDY4YjI2MmFkYmY0NQ=="). It seems that the given hash is never stored in the tfstate.
I do the same for my Go lambda function, but I don't use base64encode since the sha256 sum is already in the file.
Try this:
source_code_hash = "${data.aws_s3_bucket_object.lambda_jar_hash.body}"
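For this to converge, the .sha256 object's body must already be the base64 encoding of the raw SHA-256 digest (the representation AWS returns), not the hex output of sha256sum. A minimal sketch reusing the variables from the issue above (the hash-file format is my assumption):
data "aws_s3_bucket_object" "lambda_jar_hash" {
  bucket = "${var.lambda_s3_bucket}"
  key    = "${var.lambda_s3_key}.sha256"
}
resource "aws_lambda_function" "lambda_function_s3" {
  s3_bucket     = "${var.lambda_s3_bucket}"
  s3_key        = "${var.lambda_s3_key}"
  function_name = "${var.lambda_function_name}"
  role          = "${var.lambda_execution_role_arn}"
  handler       = "${var.lambda_function_handler}"
  runtime       = "java8"
  # The object body is used verbatim, so it must contain the base64-encoded
  # raw digest (e.g. "6HVMIk6vxvBy4AApmHbQis5Av2uQeSJh3XRosmKtv0U=") with no
  # trailing newline -- no base64encode() wrapper around a hex string.
  source_code_hash = "${data.aws_s3_bucket_object.lambda_jar_hash.body}"
}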
Dear all,
we have the issue described above, and our code looks similar. Each time we run terraform apply, the Lambda function is redeployed, even if nothing has changed. I have looked at the Terraform output and can confirm that the hash in source_code_hash is not updated in the state file.
Hi all,
the case here is pretty much the same: my Lambda function fetches its source code from an S3 object. I'm struggling to generate the proper source code hash string, which causes the Lambda to be updated on every single run (even when the source code is the same):
~ aws_lambda_function.lambda_fuctions
last_modified: "2019-04-25T09:59:53.110+0000" => <computed>
qualified_arn: "arn:aws:lambda:us-east-2:*:function:Test:1" => <computed>
source_code_hash: "QkHfqU5xHUNfKaRgSj4t5cSqPBZeI70Ga+b8H8QwlWk=" => "NDI0MWRmYTk0ZTcxMWQ0MzVmMjlhNDYwNGEzZTJkZTVjNGFhM2MxNjVlMjNiZDA2NmJlNmZjMWZjNDMwOTU2OQ=="
version: "1" => <computed>
To get the checksum of the *.jar file, I create another txt file that contains the sha256sum of the file:
sha256sum lambda.jar
4241dfa94e711d435f29a4604a3e2de5c4aa3c165e23bd066be6fc1fc4309569
If I use the built-in Terraform filebase64sha256 function, I see the same checksum as the one that Terraform gets for the S3 object:
filebase64sha256("../lambda.jar")
QkHfqU5xHUNfKaRgSj4t5cSqPBZeI70Ga+b8H8QwlWk=
But when I generate the base64-encoded checksum locally, the string I get is different:
echo -n "4241dfa94e711d435f29a4604a3e2de5c4aa3c165e23bd066be6fc1fc4309569" | base64
NDI0MWRmYTk0ZTcxMWQ0MzVmMjlhNDYwNGEzZTJkZTVjNGFhM2MxNjVlMjNiZDA2NmJlNmZjMWZj
NDMwOTU2OQ==
Result from the terraform console:
base64encode(filesha256("../lambda.jar"))
NDI0MWRmYTk0ZTcxMWQ0MzVmMjlhNDYwNGEzZTJkZTVjNGFhM2MxNjVlMjNiZDA2NmJlNmZjMWZjNDMwOTU2OQ==
The documentation for 0.11.11 explicitly says:
base64sha256(string) - Returns a base64-encoded representation of raw SHA-256 sum of the given string. This is not equivalent of base64encode(sha256(string)) since sha256() returns hexadecimal representation.
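To make the distinction concrete, a minimal sketch using the same file as the console session above (path illustrative):
# Both expressions hash the same file; only the first matches what AWS reports.
output "aws_style_hash" {
  # base64 of the raw 32-byte digest, e.g. "QkHfqU5xHUNfKaRgSj4t5cSqPBZeI70Ga+b8H8QwlWk="
  value = filebase64sha256("../lambda.jar")
}
output "hex_then_base64" {
  # base64 of the 64-character hex string -- a longer, different value
  value = base64encode(filesha256("../lambda.jar"))
}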
So the question here is how to generate the properly encoded SHA-256 string in a Linux shell on the host. The Terraform config looks like this:
data "aws_s3_bucket_object" "jar_hash" {
bucket = "essilor-lambda"
key = "lambda-functions/xxx/xxx/xxxx/lambda.txt"
}
output "test" {
value = "${base64encode(data.aws_s3_bucket_object.jar_hash.body)}"
}
resource "aws_lambda_function" "lambda_fuction" {
s3_bucket = "essilor-lambda"
s3_key = "lambda-functions/xxx/xxxx/xxxx/lambda.jar"
function_name = "Test"
description = "test"
handler = "xxxx.xxxx.xxxx.xxx.xxxxx"
role = "arn:aws:iam::xxxxx:role/xxxxxxxx"
runtime = "java8"
timeout = "5"
publish = "true"
source_code_hash = "${base64encode(data.aws_s3_bucket_object.jar_hash.body)}"
}
@uzun0v @cosots
Try something like this to generate the hash locally:
python3.7 -c "import base64;import hashlib;print(base64.b64encode(hashlib.sha256(open('$FILE','rb').read()).digest()).decode(), end='')"
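(This one-liner base64-encodes the raw digest, so its output matches both Terraform's filebase64sha256 and the CodeSha256 value AWS reports.)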
We're seeing the exact same issue: source_code_hash is never updated in the tfstate when applying, so the lambda resource always requires updating no matter how many times we apply:
~ source_code_hash = "83TsTFxfrLQJvQ8Re1YdXiGX2eQm1a1uX8Sc0bKeC3w=" -> "p6F5Wk4naphwng6ZQRNahuvJ7BUEFfHnMR9wQQpVkCM="
An easier (alternative) way to update a Lambda function on code change, when the code is sourced from S3, is to enable S3 bucket versioning and set the Lambda zip object version:
data "aws_s3_bucket_object" "lambda_zip" {
bucket = "bucket_name"
key = "lambda.zip"
}
resource "aws_lambda_function" "run_hll_lambda" {
s3_bucket = data.aws_s3_bucket_object.lambda_zip.bucket
s3_key = data.aws_s3_bucket_object.lambda_zip.key
s3_object_version = data.aws_s3_bucket_object.lambda_zip.version_id
function_name = "Lambda_name"
role = aws_iam_role.lambda_iam.arn
handler = "lambda_function.lambda_handler"
runtime = "python3.7"
}
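One caveat: the data source's version_id attribute is only populated when versioning is enabled on the bucket. A minimal sketch of that prerequisite, assuming the 2.x provider syntax used elsewhere in this thread:
resource "aws_s3_bucket" "lambda_bucket" {
  bucket = "bucket_name"

  # Without this, the aws_s3_bucket_object data source's version_id is empty.
  versioning {
    enabled = true
  }
}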
Using a version id will not work for us, because we want to use snapshot versions during development without always deploying and referencing a new version number.
I'm also experiencing this issue on v0.12.25, aws provider v2.62.0, with a jar that is uploaded directly to the lambda.
I've tried different hashing algorithms but the ones generated on apply never match the ones in state.
EDIT:
I noticed that the differing hashes are referenced in the debug logs, related to the warning:
Provider "registry.terraform.io/-/aws" produced an unexpected new value for but we are tolerating it because it is using the legacy plugin SDK.
I'm using:
Terraform v0.12.25
AWS v2.62.0
I'm experiencing this with v0.12.20 and aws provider v2.65.0 with a zip file that's referenced from an s3 bucket.
data "aws_s3_bucket_object" "lambda" {
bucket = aws_s3_bucket.lambda.id
key = "lambda.zip"
}
resource "aws_lambda_function" "lambda" {
s3_bucket = aws_s3_bucket.lambda.id
s3_key = data.aws_s3_bucket_object.lambda.key
function_name = "lambda"
role = aws_iam_role.lambda.arn
handler = "lambda.handler"
timeout = 300
memory_size = 256
source_code_hash = base64sha256(data.aws_s3_bucket_object.lambda.etag)
runtime = "python3.8"
}
I'm using the etag from the S3 object as the input for the hash, which shouldn't change unless we upload a new version.
When I run apply twice in a row, the input hash is always the same, but the new hash is not being persisted to the state and the next run shows the same output.
~ source_code_hash = "FxFe/pitsCj4XL/F+VORZASkGZdejRgNc7OABiKaWpg=" -> "oE4rN1nboxBBF64fQl8Q0GPtAE7bLqOofP/ACZPPz2A="
I am experiencing the same issue, specifically inside a CI/CD pipeline. It does not occur on OSX and it does not occur in Docker on OSX when the project directory is mounted from OSX.
resource "aws_lambda_function" "index" {
filename = "../lambda/index.zip"
function_name = "${var.project}_${var.environment}_redirect2index"
role = "${aws_iam_role.iam_for_basic_permission.arn}"
handler = "index.handler"
source_code_hash = filebase64sha256("../lambda/index.zip")
runtime = "nodejs12.x"
publish = true
provider = "aws.east"
}
However, with the same Docker image, TF version, and AWS provider version, the hashes in the CI pipeline never match. The one generated by filebase64sha256("../lambda/index.zip") matches between runs; however, the ones stored in state are completely different each time.
I thought this was an issue of something else getting hashed, such as a timestamp or similar, but the generated hash is the same. Somehow, the hash that gets computed doesn't get stored under source_code_hash.
This is actually quite a nasty problem, because when the Lambda is used with CloudFront, the latter redeploys each time, since AWS thinks a new version of the Lambda has been created. This adds at least 3, and often 10+, minutes to the CD pipeline.
I've done a little digging into this issue as I recently encountered it.
In my use case I generate the zip files frequently; even if the underlying contents don't change, metadata changes in the zip file cause a different hash.
To get around this, I tried to generate the hash of the contents outside of the zip and set it as the source code hash.
From my observations, it appears that the source_code_hash field gets set in the state file from the filename field regardless of the value supplied to it, i.e. filebase64sha256(aws_lambda_function.func.filename).
I have found a workaround that works for my case: using Terraform's built-in archive_file data source to generate the zip. Generating the zip outside Terraform seems to cause issues, even though it shouldn't.
Something like this works fine for me and doesn't cause the lambda to be updated between subsequent runs of the CI/CD pipeline, even days apart.
data "archive_file" "lambda" {
type = "zip"
source_file = "../lambda/index.js"
output_path = "../lambda/index.zip"
}
resource "aws_lambda_function" "index" {
filename = data.archive_file.lambda.output_path
function_name = "${var.project}_${var.environment}_lambda"
role = "${aws_iam_role.iam_for_basic_permission.arn}"
handler = "index.handler"
source_code_hash = data.archive_file.lambda.output_base64sha256
runtime = "nodejs12.x"
publish = true
provider = "aws.east"
}
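A plausible explanation for why this helps: archive_file writes the archive with normalized file metadata, so byte-identical sources yield a byte-identical zip and therefore a stable output_base64sha256, whereas externally built zips often embed fresh timestamps that change the hash on every build.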
I'm reporting the same concern, too.
The main problem is that the purpose of source_code_hash isn't clear. The documentation of aws_lambda_function presents source_code_hash as an argument that affects deployment, but that doesn't seem to be the case.
Looking at the source code, it is a computed field. After a successful deploy, the value of source_code_hash is overwritten by the response from AWS's API (code) via resourceAwsLambdaFunctionRead().
In short, the value assigned to source_code_hash doesn't affect deployment and is always overwritten by the hash returned by the AWS API, unless it already matches.
We need a way to deterministically trigger lambda deployments (e.g. after a code change is detected) without presuming that everyone uses the same process to package their code.
Is source_code_hash the correct attribute to use for this? Yes and no. It would be nice to keep the hash returned by AWS's API, but we would probably need another attribute, similar to source_code_hash, that meets our need.
My suggestion:
- Keep source_code_hash clearly defined as an output, and drop Optional: true from the schema for source_code_hash.
- Add a new attribute, change_trigger_hash, that is optional and not computed. Suggestions for a better name are welcome.
- If change_trigger_hash is null, plan and apply work as they do today.
- If change_trigger_hash is not null, compare the current value to the previous value. If they differ, include the change in the plan; otherwise, ignore the resource change.
@aeschright does this sound like something that we can do? I'll submit a PR if yes
===========================
Update: upon looking further, source_code_hash indeed triggers a change, which makes my suggestion invalid. I'll try out an idea that I hope will work.
I had a very similar problem where the state file was not getting an updated source_code_hash after an apply. @Miggleness pointed me in the right direction by noting that the value in source_code_hash is overwritten by AWS. This means that the hash you use in your lambda resource definition must be computed the same way AWS computes the hash; otherwise, you will always have a different value in source_code_hash, and your lambda will always be redeployed.
So when you see something like:
~ source_code_hash = "QuYMcyiptpzreIVxuq8AL+UWobBp3pDq045f2ISoKB0=" -> "42e60c7328a9b69ceb788571baaf002fe516a1b069de90ead38e5fd884a8281d" # forces replacement
The value on the left is the hash calculated by AWS; the value on the right is the value you are providing to Terraform in your lambda definition.
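If the build artifact is available locally at plan time, Terraform's built-in helper already produces AWS's representation; a minimal sketch (file name, role, and handler are hypothetical):
resource "aws_lambda_function" "example" {
  filename      = "lambda.zip"
  function_name = "example"
  role          = aws_iam_role.lambda.arn   # hypothetical IAM role resource
  handler       = "lambda_function.lambda_handler"
  runtime       = "python3.8"

  # filebase64sha256() base64-encodes the raw SHA-256 digest -- the same
  # representation AWS returns in CodeSha256 -- so the state converges.
  source_code_hash = filebase64sha256("lambda.zip")
}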
If you calculate the hash yourself in a shell, use the following command:
openssl dgst -sha256 -binary ${FILE_NAME}.zip | openssl enc -base64
If you calculate it with a python script, use something like the following:
import base64
import hashlib

def get_aws_hash(zip_file):
    '''Compute the base64-encoded SHA-256 hash of a zip archive, the way AWS does.'''
    with open(zip_file, "rb") as f:
        sha256_hash = hashlib.sha256()
        # Read and update the hash in blocks of 4K
        while byte_block := f.read(4096):
            sha256_hash.update(byte_block)
    # base64-encode the raw digest (not its hex representation)
    hash_value = base64.b64encode(sha256_hash.digest()).decode('utf-8')
    return hash_value

# Example usage:
#   print(get_aws_hash("lambda.zip"))