Aws-cdk: Support retry strategy on Lambda LogRetention

Created on 28 May 2020 · 4 Comments · Source: aws/aws-cdk

Deployment of one of our CDK projects intermittently fails with 'Rate exceeded' errors. These errors occur when CDK creates the LogRetention resources for our Lambda functions.

Reproduction Steps

The issue occurs when deploying multiple CDK stacks that contain many Lambda functions with log retention resources.

I created a small test project to reproduce the issue: https://github.com/jaapvanblaaderen/log-retention-rate-limit. With this simple setup, I wasn't able to reproduce the issue when deploying a few stacks sequentially (which is what we do in our actual project). The issue can, however, be observed when deploying the stacks in parallel.

Error Log

128/101 | 9:04:29 AM | CREATE_IN_PROGRESS   | Custom::LogRetention        | hello_5/LogRetention (hello5LogRetention5D258C6A) Resource creation Initiated
 129/101 | 9:04:29 AM | CREATE_FAILED        | Custom::LogRetention        | hello_5/LogRetention (hello5LogRetention5D258C6A) Failed to create resource. Rate exceeded
    new LogRetention (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/log-retention.ts:67:22)
    \_ new Function (/repos/logretention-rate-limit/node_modules/@aws-cdk/aws-lambda/lib/function.ts:537:28)
    \_ new LogRetentionRateLimitStack (/repos/logretention-rate-limit/lib/log-retention-rate-limit-stack.ts:17:18)
    \_ Object.<anonymous> (/repos/logretention-rate-limit/bin/log-retention-rate-limit.ts:8:3)
    \_ Module._compile (internal/modules/cjs/loader.js:1151:30)
    \_ Module.m._compile (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:858:23)
    \_ Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
    \_ Object.require.extensions.<computed> [as .ts] (/repos/logretention-rate-limit/node_modules/ts-node/src/index.ts:861:12)
    \_ Module.load (internal/modules/cjs/loader.js:1000:32)
    \_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
    \_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
    \_ main (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:227:14)
    \_ Object.<anonymous> (/repos/logretention-rate-limit/node_modules/ts-node/src/bin.ts:513:3)
    \_ Module._compile (internal/modules/cjs/loader.js:1151:30)
    \_ Object.Module._extensions..js (internal/modules/cjs/loader.js:1171:10)
    \_ Module.load (internal/modules/cjs/loader.js:1000:32)
    \_ Function.Module._load (internal/modules/cjs/loader.js:899:14)
    \_ Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
    \_ /usr/local/lib/node_modules/npm/node_modules/libnpx/index.js:268:14

Environment

  • CLI Version: 1.41.0 (build 9e071d2)
  • Framework Version: 1.41.0
  • OS: OSX 10.14

Analysis

It fails when creating CloudWatch log groups. The issue could be fixed by relaxing the retry options for the CloudWatch Logs SDK client. I tested this locally by changing it to:

const cloudwatchlogs = new AWS.CloudWatchLogs({ apiVersion: '2014-03-28', maxRetries: 6, retryDelayOptions: { base: 300 }});
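To give a feel for what these options change, here is a rough sketch of the worst-case wait they allow. It assumes the AWS SDK for JavaScript v2 default backoff, which as far as I understand it waits a random time between 0 and 2^retryCount * base milliseconds before each retry. The function names are made up for this illustration, and the "default" values of 3 retries and a 100 ms base are an assumption about the SDK defaults, not something taken from this issue.

```typescript
// Upper bound (in ms) of the delay before retry number `retryCount` (0-based),
// assuming the backoff formula: random() * (2 ** retryCount * base).
function maxRetryDelayMs(retryCount: number, baseMs: number): number {
  return Math.pow(2, retryCount) * baseMs;
}

// Worst-case total time (in ms) spent waiting across all retries.
function maxTotalWaitMs(maxRetries: number, baseMs: number): number {
  let total = 0;
  for (let i = 0; i < maxRetries; i++) {
    total += maxRetryDelayMs(i, baseMs);
  }
  return total;
}

// Assumed SDK defaults: 3 retries, base 100 ms.
console.log(maxTotalWaitMs(3, 100)); // 700

// Options tested above: 6 retries, base 300 ms.
console.log(maxTotalWaitMs(6, 300)); // 18900
```

Under these assumptions the relaxed options let the client keep retrying for up to roughly 19 seconds instead of under a second, which gives the throttled API calls much more room to succeed.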

Another solution might be increasing a service limit. Unfortunately, I have no clue which rate limit is being hit here. It's not clear from the documentation:


This is a Bug Report

@aws-cdk/aws-lambda effort/small feature-request good first issue in-progress

All 4 comments

Added PR with a change that fixes the issue. Code was inspired by a similar fix in: https://github.com/aws/aws-cdk/pull/2053/files

Not sure if this is the right approach though. Can imagine this can be better managed in one central location and/or be configurable.

Re-classified this as a feature request.

This is my config for logRetentionRetryOptions.
[Screenshot: Selection_318]

I'm still testing but it seems to fix the 'Rate exceeded' error.
[Screenshot: Selection_315]

@jaapvanblaaderen thank you for developing this feature!
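The screenshots from the comment above did not survive, so the actual values used there are unknown. For readers looking for a starting point, a config of this kind might look roughly like the sketch below. It assumes a CDK v1 project with a release that includes the logRetentionRetryOptions property on lambda.Function. The stack name, function id, inline handler, and the values 200 ms / 10 retries are placeholders chosen for illustration, not the commenter's settings.

```typescript
import * as cdk from '@aws-cdk/core';
import * as lambda from '@aws-cdk/aws-lambda';
import * as logs from '@aws-cdk/aws-logs';

// Hypothetical stack, used only to show where the retry options go.
export class ExampleStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    new lambda.Function(this, 'Hello', {
      runtime: lambda.Runtime.NODEJS_12_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline('exports.handler = async () => "hello";'),
      // Setting logRetention is what creates the Custom::LogRetention resource.
      logRetention: logs.RetentionDays.ONE_WEEK,
      // Illustrative values: retry more often and with a longer base delay.
      logRetentionRetryOptions: {
        base: cdk.Duration.millis(200),
        maxRetries: 10,
      },
    });
  }
}
```

Suitable values probably depend on how many functions are deployed in parallel, so it is worth starting modestly and raising them only if 'Rate exceeded' still shows up.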

Mine seems fixed too. Thanks @georstoy!
