Hello all,
I'm trying to get the safe deployment mode working, as described at https://github.com/awslabs/serverless-application-model/blob/master/docs/safe_lambda_deployments.rst.
However, it's not clear how the referenced alarms AliasErrorMetricGreaterThanZeroAlarm and LatestVersionErrorMetricGreaterThanZeroAlarm should be set up.
Below is my SAM definition file:
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  # create a function and API gateway to return hello world
# global variable definitions
Globals:
  Function:
    Runtime: python2.7
    Timeout: 10
Resources:
  # helloworld function <- this is the main, actual function
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: aahelloworld-safemode
      Description: aahelloworld-safemode
      Handler: helloworld.lambda_handler
      Role: arn:aws:iam::XXXXX:role/aalambdainvoke-role
      # set up versioning and an alias
      AutoPublishAlias: prod
      CodeUri: ./
      Events:
        GetResource:
          Type: Api
          # the event properties below have to match and point to the more detailed API Gateway spec
          Properties:
            Path: /
            Method: get
            RestApiId: !Ref HelloWorldAPIGateway
      Tags:
        Owner: aahelloworld
        Status: active
        Environment: development
        Name: aahelloworld
      Tracing: Active
      # safe mode, ie rolling deployments
      DeploymentPreference:
        Type: AllAtOnce
        Alarms:
          # A list of alarms that you want to monitor
          - !Ref AliasErrorMetricGreaterThanZeroAlarm
          - !Ref LatestVersionErrorMetricGreaterThanZeroAlarm
  # CW Alarm to monitor the new Lambda version for errors
  LatestVersionErrorMetricGreaterThanZeroAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: !Sub aahelloworld-safemode-latestversion-${HelloWorldFunction}
      AlarmDescription: "pre-deployment alarm to check for errors in the function"
      #AlarmActions:
      #  - !Ref AlarmTopic
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Dimensions:
        - Name: Version
          Value: !Ref HelloWorldFunction.Version
      EvaluationPeriods: 1
      MetricName: Errors
      Namespace: AWS/Lambda
      Period: '60'
      Statistic: Sum
      Threshold: '1'
  # CW Alarm to monitor the new Lambda version for errors
  AliasErrorMetricGreaterThanZeroAlarm:
    #DependsOn: HelloWorldFunction.Alias
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmName: !Sub aahelloworld-safemode-alias-${HelloWorldFunction}
      AlarmDescription: "pre-deployment alarm to check for errors in the function"
      #AlarmActions:
      #  - !Ref AlarmTopic
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Dimensions:
        - Name: Alias
          Value: !Ref HelloWorldFunction.Alias
      EvaluationPeriods: 1
      MetricName: Errors
      Namespace: AWS/Lambda
      Period: '60'
      Statistic: Sum
      Threshold: '1'
  # API Gateway setup, points to the helloworld Lambda
  # note !Ref function in LambdaFunction definition point to this resource, otherwise 2 API gateways will be created
  HelloWorldAPIGateway:
    Type: AWS::Serverless::Api
    Properties:
      Name: aahelloworld-safemode
      StageName: prod
      DefinitionBody:
        swagger: 2.0
        info:
          title:
            Ref: AWS::StackName
        paths:
          "/":
            get:
              x-amazon-apigateway-integration:
                httpMethod: POST
                type: aws_proxy
                passthroughBehavior: when_no_match
                uri:
                  Fn::Sub: arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${HelloWorldFunction.Alias}/invocations
              responses: {}
I can't find any examples of how these alarms should be set up. I based the alarm config above on the example at https://docs.aws.amazon.com/lambda/latest/dg/with-sched-event-example-use-app-spec.html and modified it using the CloudWatch metrics for Lambda listed at https://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-metrics.html.
Packaging and deploying the template results in the following error:
Failed to create the changeset: Waiter ChangeSetCreateComplete failed: Waiter encountered a terminal failure state Status: FAILED. Reason: Circular dependency between resources: [AliasErrorMetricGreaterThanZeroAlarm, HelloWorldAPIGatewayprodStage, HelloWorldAPIGatewayDeployment0c192927b1, HelloWorldFunctionAliasprod, HelloWorldFunctionGetResourcePermissionprod, HelloWorldAPIGateway, HelloWorldFunctionDeploymentGroup, HelloWorldFunctionGetResourcePermissionTest]
I tried to create the alarms manually, but it appears that the alarm metric doesn't accept a reference to the .Version or .Alias qualifiers for the FunctionName, nor can one search in CloudWatch Metrics for the dimensions Alias or Version. It's not clear how the alarm's Dimensions key should be set up.
Any suggestions, examples or documentation welcome please!
regards
Corne
Update:
I realised that the
    DeploymentPreference:
      Type: AllAtOnce
needs to be changed to one of the rolling deployment methods, otherwise the Alias will always point to the latest Version.
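As an illustration of that change, here is a minimal sketch of the DeploymentPreference block with a rolling type. Canary10Percent5Minutes is just one of the predefined SAM deployment types (others include the Linear10PercentEvery... variants); which one suits this function is an assumption, and the alarm names are the ones from the template above.

```yaml
# Fragment of the HelloWorldFunction Properties; only Type differs from the original.
DeploymentPreference:
  # Assumed choice: shift 10% of traffic to the new version, then the rest after 5 minutes.
  Type: Canary10Percent5Minutes
  Alarms:
    # Alarms CodeDeploy watches while traffic is shifting
    - !Ref AliasErrorMetricGreaterThanZeroAlarm
    - !Ref LatestVersionErrorMetricGreaterThanZeroAlarm
```

With a gradual type like this, the alias keeps serving the previous version for part of the traffic during the shift, so the alarms have something to compare against before the rollout completes.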
The alarm definition below seems to work now. I cobbled together the Dimensions values after creating manual alarms in CloudWatch.
  # CW Alarm to monitor the new Lambda version for errors
  AliasErrorMetricGreaterThanZeroAlarm:
    Type: AWS::CloudWatch::Alarm
    DependsOn: HelloWorldFunction
    Properties:
      AlarmName: !Sub aahelloworld-safemode-alias-${HelloWorldFunction}
      AlarmDescription: "pre-deployment alarm to check for errors in the function"
      #AlarmActions:
      #  - !Ref AlarmTopic
      ComparisonOperator: GreaterThanOrEqualToThreshold
      Dimensions:
        - Name: FunctionName
          Value: !Ref HelloWorldFunction
      EvaluationPeriods: 1
      MetricName: Errors
      Namespace: AWS/Lambda
      Period: '60'
      Statistic: Sum
      Threshold: '1'
  # CW Alarm to monitor the new Lambda version for errors
  LatestVersionErrorMetricGreaterThanZeroAlarm:
    Type: AWS::CloudWatch::Alarm
    DependsOn: HelloWorldFunction
    Properties:
      AlarmName: !Sub aahelloworld-safemode-latestversion-${HelloWorldFunction}
      AlarmDescription: "pre-deployment alarm to check for errors in the function"
      #AlarmActions:
      #  - !Ref AlarmTopic
      ComparisonOperator: GreaterThanOrEqualToThreshold
      TreatMissingData: missing
      Dimensions:
        - Name: FunctionName
          Value: !Ref HelloWorldFunction
        - Name: Resource
          Value: !Join [":", [!Ref HelloWorldFunction, !Select ["7", !Split [":", !Ref HelloWorldFunction.Version]]]]
      EvaluationPeriods: 1
      MetricName: Errors
      Namespace: AWS/Lambda
      Period: '60'
      Statistic: Sum
      Threshold: '1'
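For comparison, a possible alias-scoped variant of the alias alarm. Lambda publishes per-alias and per-version metrics under a Resource dimension of the form function-name:qualifier, so adding that dimension should limit the alarm to invocations made through the alias instead of the whole function. This is an untested sketch: the alias name prod is taken from the AutoPublishAlias setting in the template, and the remaining values are copied from the alarms in this thread.

```yaml
# Sketch only: alias-scoped error alarm (goes under Resources).
AliasErrorMetricGreaterThanZeroAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: "alarm on errors for invocations through the prod alias"
    ComparisonOperator: GreaterThanOrEqualToThreshold
    Dimensions:
      - Name: FunctionName
        Value: !Ref HelloWorldFunction
      # Assumption: "prod" matches the AutoPublishAlias value on the function.
      - Name: Resource
        Value: !Sub "${HelloWorldFunction}:prod"
    EvaluationPeriods: 1
    MetricName: Errors
    Namespace: AWS/Lambda
    Period: 60
    Statistic: Sum
    Threshold: 1
```

Because the Resource value is built from the function name and a literal alias rather than a reference to HelloWorldFunction.Alias, the alarm does not depend on the alias resource, which should avoid the circular dependency reported earlier.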
Glad to see you got it working 🙌 Is there anything that could be added to the docs to have made this simpler/clearer? Would love to see a PR! 🙏