Aws-cdk: Lambda Code Hash changes after every git repository clone

Created on 30 May 2020 · 15 comments · Source: aws/aws-cdk

:question: General Issue

The Question

I have an issue with our current CDK app.
Whenever a colleague clones the repo containing the CDK app and runs a cdk diff, all Lambda functions are rebuilt. Somehow, the hash which is calculated for every Lambda function in the CDK app changes.

This issue confuses our staff since they do not expect anything to change at all when they are not explicitly editing the Lambda functions.

Is it possible to make the hash computation more deterministic?
Actually, I do not know why the hash changes on every full rebuild of the CDK app (i.e. after removing the node_modules, cdk.out and bin directories).

However, I guess a solution would be to derive the Lambda's code hash from the version attribute in each Lambda's package.json file. Is this currently possible in the CDK?

Environment

  • CDK CLI Version: 1.39.0 (build 5d727c1)
  • Module Version: 1.39.0
  • OS: Ubuntu
  • Language: TypeScript

Other information

Example diff after the repository with the CDK app was checked out:

[~] AWS::Lambda::Function CoreBackend/CoreDeploymentHookConfirmationLambda CoreBackendCoreDeploymentHookConfirmationLambda46C41C7C
 ├─ [~] Code
 │   ├─ [~] .S3Bucket:
 │   │   └─ [~] .Ref:
 │   │       ├─ [-] AssetParametersfa10a908065450dae7afcd3ef15c0cf60cb75272385a7fdcac361366cb354d56S3Bucket95C78B44
 │   │       └─ [+] AssetParameters700428d8041f177f9e37b4a632fde92deb0dd5d8315ccacd9b16905249cb515dS3Bucket872367FE
 │   └─ [~] .S3Key:
 │       └─ [~] .Fn::Join:
 │           └─ @@ -8,7 +8,7 @@
 │              [ ]   "Fn::Split": [
 │              [ ]     "||",
 │              [ ]     {
 │              [-]       "Ref": "AssetParametersfa10a908065450dae7afcd3ef15c0cf60cb75272385a7fdcac361366cb354d56S3VersionKeyC85B62DB"
 │              [+]       "Ref": "AssetParameters700428d8041f177f9e37b4a632fde92deb0dd5d8315ccacd9b16905249cb515dS3VersionKeyCDD40801"
 │              [ ]     }
 │              [ ]   ]
 │              [ ] }
 │              @@ -21,7 +21,7 @@
 │              [ ]   "Fn::Split": [
 │              [ ]     "||",
 │              [ ]     {
 │              [-]       "Ref": "AssetParametersfa10a908065450dae7afcd3ef15c0cf60cb75272385a7fdcac361366cb354d56S3VersionKeyC85B62DB"
 │              [+]       "Ref": "AssetParameters700428d8041f177f9e37b4a632fde92deb0dd5d8315ccacd9b16905249cb515dS3VersionKeyCDD40801"
 │              [ ]     }
 │              [ ]   ]
 │              [ ] }
 └─ [~] Metadata
     └─ [~] .aws:asset:path:
         ├─ [-] asset.fa10a908065450dae7afcd3ef15c0cf60cb75272385a7fdcac361366cb354d56
         └─ [+] asset.700428d8041f177f9e37b4a632fde92deb0dd5d8315ccacd9b16905249cb515d
Labels: @aws-cdk/assets, closed-for-staleness, guidance, response-requested


All 15 comments

The code hash is computed based on the contents of the asset source -

https://github.com/aws/aws-cdk/blob/1755cf274b4da446272f109b55b20680beb34fe7/packages/%40aws-cdk/core/lib/asset-staging.ts#L64

If the asset source changes for any reason, a new hash will be computed (including hidden files).

Can you check whether the act of 'cloning' is causing the asset source to change?

If you're still seeing this, please provide more information about how you're defining lambda assets, the contents of the asset, and what exactly you mean when you say 'clone the repo'.
Preferably, if you can provide an example with which I can directly reproduce this situation, that would be great.

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

A simple repro : https://github.com/joraycorn/cdk-lambda-absolutepath

  1. Clone & build/diff & deploy
  2. Move the whole cloned folder somewhere else
  3. Diff again without changing anything.

I would like to know if there's any workaround for that case.
A more real-world case would be two devs working on the same environment. At the moment, it takes ~10 min to deploy our stack changes even though we haven't modified any files in them. The same goes for layers.

Thanks !

I gave this a shot and cannot reproduce. The steps I took were -

  1. git clone git@github.com:joraycorn/cdk-lambda-absolutepath.git at two different locations in my workspace
  2. npm install && npm run build && cdk synth

Running a diff against cdk.out/CdkLambdaAbsolutepathStack.template.json produces no differences, and the asset id generated is exactly the same - asset.c453798e3c6d784d5e30554df84a4ac7b08867a02c78d208eb8ee634a8d109b4

I also get the same behavior and the same asset hash (asset.c453798e3c6d784d5e30554df84a4ac7b08867a02c78d208eb8ee634a8d109b4) with:

  • cdk: 1.45.0 (build 0cfab15)
  • node: v10.19.0
  • npm: 6.13.4

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

Oh no this should not have been closed. This IS an issue =(

I will provide more information once I get back to work and have the time.

Here is what I think is happening: if the lambda function code that is checked in is TypeScript, the asset hash might change depending on the TypeScript version used to compile the .ts files to .js.
Can you verify you are using the same TypeScript version?

@NetaNir That is a good hint, I thought of the same recently. I will update to the next typescript version, observe what happens and report back.

I think this could explain why it is difficult to share a CDK repository with the team. If people a) use a globally installed TypeScript binary, or b) the package.json contains a TypeScript version range that is not pinned (e.g. one using the npm caret), the hash changes for each individual who issues a deploy.

@MartinLoeper - if this is the case, then the CDK is behaving as designed. Different versions of typescript can produce slightly different variations of the output javascript, which causes a different asset hash to be produced.

If the javascript handler for a lambda function changes, I believe it is correct to deploy those changes to the lambda function.

Was/is there any actual impact when the assets change (due to different typescript versions)? If not, I would leave it as is.

However, if this is not desirable, you could compute your own source hash and pass it to the asset API.
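A sketch of that approach, along the lines of the question above about deriving the hash from the package.json version. The `hashFromVersion` helper is hypothetical, and the CDK usage in the comment assumes a CDK release whose asset options accept a custom `assetHash` (check your version's API docs):

```typescript
import * as crypto from "crypto";

// Hypothetical helper: derive a deterministic asset hash from a version
// string, so the hash only changes when the version is bumped - not when
// a different machine produces slightly different compiled output.
function hashFromVersion(version: string): string {
  return crypto.createHash("sha256").update(version, "utf8").digest("hex");
}

// Sketch of wiring it into the asset API (assumes AssetOptions.assetHash
// is available in your CDK version):
//
//   new lambda.Function(this, "Handler", {
//     runtime: lambda.Runtime.NODEJS_12_X,
//     handler: "index.handler",
//     code: lambda.Code.fromAsset("lambda/dist", {
//       assetHash: hashFromVersion(lambdaPkg.version), // lambdaPkg = parsed package.json
//     }),
//   });
```

The trade-off discussed below applies: with a custom hash, changed handler code is not redeployed until the version is bumped.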

Yes @nija-at you are correct that this would be expected behaviour of the cdk in the first place.

However, people who are responsible for building functional repositories must somehow find a solution to handle this case.

The issue is the following: If you have a cdk app with a lot of TypeScript lambda functions and another person checks it out and deploys it, a lot of assets are uploaded. This leads to the following two inconveniences:

a) The deployment is slowed down.
b) People who request a diff via "cdk diff" often come to me confused that the cdk wants to roll out more changes than they intentionally caused.

Solution: Shouldn't it be possible to add an attribute to the NodejsFunction construct [1] that instructs the asset hash to be computed based on the underlying sources only? This attribute could be entirely optional and default to false. What do you think? Does it make sense?

Let me put it another way: does it make sense not to care about the generated lambda handler, but instead to specify a list of files to use for change detection (e.g. package.json and the dist/ folder)?

[1] [https://docs.aws.amazon.com/cdk/api/latest/docs/aws-lambda-nodejs-readme.html](https://docs.aws.amazon.com/cdk/api/latest/docs/aws-lambda-nodejs-readme.html)

The code that gets executed in the lambda function is the generated JavaScript file, not the original TypeScript source.
If we went down the route you're suggesting, we would mislead the user whenever cdk diff reports that there are no changes to deploy while, in fact, there are changes to the handler code (i.e., the generated javascript files) that will not be deployed.

One reason folks use range dependencies is so that they can receive future bug fixes without any additional effort.
If, for example, typescript fixes a bug, then by not re-deploying the newly generated lambda function, the bug fix is never deployed. This is bound to be confusing, since version lock files will claim to use the correct/latest version of typescript.

The better way to fix this would be for you to depend on a fixed version of typescript. This way there is no variation in the code generated. When you're ready to upgrade, you can do so in a controlled manner (with a single commit) knowing that assets are going to change across the board.
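For instance, a pinned devDependency in package.json (the version number here is purely illustrative) ensures every machine compiles with the identical compiler:

```json
{
  "devDependencies": {
    "typescript": "3.9.5"
  }
}
```

With a caret range such as "^3.9.5", npm may resolve a newer minor or patch release on a fresh install, which can change the emitted javascript and therefore the asset hash.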

I think you are right @nija-at. I will double check if the typescript range in the package.json is the cause of this whole inconvenience.

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

I have had this issue as well: two developers with the exact same TypeScript and Node versions, whose compiled output was actually identical, were still getting different hashes every single time on their two machines.

