Aws-cdk: [aws-eks] Stack breaks when upgrading an EKS Cluster

Created on 5 May 2020 · 5 Comments · Source: aws/aws-cdk

When upgrading an EKS cluster, the CloudFormation stack breaks: it ends up in a failed state and cannot be recovered. I'm managing my own EC2 nodes.

Reproduction Steps

Here's a code example of how to cause this issue:
```ts
// CDK v1 module paths (the report is against CLI 1.36.0)
import * as cdk from '@aws-cdk/core';
import * as ec2 from '@aws-cdk/aws-ec2';
import * as eks from '@aws-cdk/aws-eks';
import * as autoscaling from '@aws-cdk/aws-autoscaling';

export class EKSCluster extends cdk.Stack {
    public readonly eksCluster: eks.Cluster;

    constructor(app: cdk.App, id: string, props?: cdk.StackProps) {
        super(app, id, props);
        // Control plane and worker node versions are bumped separately in the steps below
        const clusterVersion = "1.14";
        const workerNodesVersion = "1.14";

        const vpc = new ec2.Vpc(this, 'VPC');
        this.eksCluster = new eks.Cluster(this, 'Cluster', {
            defaultCapacity: 0,  // no default capacity; the worker nodes are self-managed
            version: clusterVersion,
            vpc: vpc
        });

        const onDemandASG = new autoscaling.AutoScalingGroup(this, 'OnDemandASG', {
            vpc: vpc,
            minCapacity: 2,
            maxCapacity: 10,
            instanceType: new ec2.InstanceType('m5.xlarge'),
            machineImage: new eks.EksOptimizedImage({
                kubernetesVersion: workerNodesVersion,
                nodeType: eks.NodeType.STANDARD  // without this, an incorrect SSM parameter for the AMI is resolved
            }),
            updateType: autoscaling.UpdateType.ROLLING_UPDATE,
            rollingUpdateConfiguration: {
                maxBatchSize: 1,
                minInstancesInService: 2,
                waitOnResourceSignals: true,
                pauseTime: cdk.Duration.minutes(1),
                minSuccessfulInstancesPercent: 100
            }
        });
        // Register the self-managed nodes with the cluster
        this.eksCluster.addAutoScalingGroup(onDemandASG, {
            bootstrapEnabled: true,
            mapRole: true
        });
    }
}
```

Now do the following:
1. Set `workerNodesVersion` to 1.15.
2. Deploy the stack. It succeeds, and the stack is still healthy.
3. Set `clusterVersion` to 1.15.
4. Deploy the stack. The deployment fails, and the stack cannot roll back to 1.14 because the cluster is already on 1.15.

### Error Log

Error from cfn:

```
CustomResource attribute error: Vendor response doesn't contain CertificateAuthorityData key in object arn:aws:cloudformation:eu-central-1:XXXX:stack/XXXX/95d476f0-8e18-11ea-98f7-02433c861a1c|Cluster9EE0221C|50b09ee6-82a7-43c3-ae99-523a068c48b5 in S3 bucket cloudformation-custom-resource-storage-eucentral1

CustomResource attribute error: Vendor response doesn't contain Arn key in object arn:aws:cloudformation:eu-central-1:XXXX:stack/XXXX/95d476f0-8e18-11ea-98f7-02433c861a1c|Cluster9EE0221C|50b09ee6-82a7-43c3-ae99-523a068c48b5 in S3 bucket cloudformation-custom-resource-storage-eucentral1

CustomResource attribute error: Vendor response doesn't contain Endpoint key in object arn:aws:cloudformation:eu-central-1:XXXX:stack/XXXX/95d476f0-8e18-11ea-98f7-02433c861a1c|Cluster9EE0221C|50b09ee6-82a7-43c3-ae99-523a068c48b5 in S3 bucket cloudformation-custom-resource-storage-eucentral1

The following resource(s) failed to update: [ClusterEndpoint352C929A, ClusterCertificate0B8F68BF, ClusterArnSSM9C28FFC5].
```
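All three errors have the same shape: a resource in the stack reads an attribute (`CertificateAuthorityData`, `Arn`, `Endpoint`) from the cluster custom resource's response, and the response to the update did not contain it. The following is a simplified model of that lookup, not CloudFormation's actual implementation; `CustomResourceResponse` and `resolveAttribute` are hypothetical names used only for illustration.

```typescript
// Simplified model of attribute lookup on a custom resource response.
// All names are hypothetical; this is not CloudFormation's code.
interface CustomResourceResponse {
  PhysicalResourceId: string;
  Data?: Record<string, string>;
}

// Look up one attribute in the response's Data map, failing the way the log above does.
function resolveAttribute(response: CustomResourceResponse, key: string): string {
  const value: string | undefined = response.Data?.[key];
  if (value === undefined) {
    throw new Error(`Vendor response doesn't contain ${key} key`);
  }
  return value;
}

// An update response that carries no attributes at all.
const upgradeResponse: CustomResourceResponse = { PhysicalResourceId: 'example-cluster' };

// Collect every attribute the stack needs but the response does not provide.
const missing = ['CertificateAuthorityData', 'Arn', 'Endpoint'].filter((key) => {
  try {
    resolveAttribute(upgradeResponse, key);
    return false;
  } catch {
    return true;
  }
});

console.log(missing);
```

Under this model, every resource that depends on one of the missing keys fails to update, which matches the final line of the log.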

Environment

  • CLI Version: 1.36.0
  • Framework Version:
  • OS:
  • Language: TypeScript

Other


This is :bug: Bug Report

@aws-cdk/aws-eks bug p1

All 5 comments

This might be caused by PR #7526, since I had no issues upgrading on 1.35.x.

Thanks for reporting.

I think @moatazelmasry2 is correct. The new code path does not return these attributes during a version upgrade, and therefore CFN is unable to resolve them. I am curious how this did not turn up in our testing... Investigating...
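To picture the failure mode described in this comment, here is a minimal sketch. It is not the actual aws-cdk handler nor the eventual fix: `ClusterInfo`, `describeCluster`, `onUpdateBroken`, and `onUpdateFixed` are hypothetical stand-ins, and the cluster values are placeholders. It contrasts an update handler whose version-upgrade branch returns no attributes with one that always returns them.

```typescript
// Hypothetical sketch only; none of these names come from aws-cdk.
interface ClusterInfo {
  arn: string;
  endpoint: string;
  certificateAuthorityData: string;
}

interface HandlerResult {
  Data?: Record<string, string>;
}

// Stand-in for a describe call against the EKS API; returns placeholder values.
function describeCluster(): ClusterInfo {
  return {
    arn: 'arn:aws:eks:eu-central-1:111122223333:cluster/example',
    endpoint: 'https://example.invalid',
    certificateAuthorityData: 'BASE64-PLACEHOLDER',
  };
}

// Map the described cluster onto the attribute names the stack reads.
function toAttributes(info: ClusterInfo): Record<string, string> {
  return {
    Arn: info.arn,
    Endpoint: info.endpoint,
    CertificateAuthorityData: info.certificateAuthorityData,
  };
}

// Version-upgrade branch returns early without attributes (the reported behavior).
function onUpdateBroken(oldVersion: string, newVersion: string): HandlerResult {
  if (oldVersion !== newVersion) {
    return {};
  }
  return { Data: toAttributes(describeCluster()) };
}

// Every update path returns the attributes, so dependent resources can resolve them.
function onUpdateFixed(_oldVersion: string, _newVersion: string): HandlerResult {
  return { Data: toAttributes(describeCluster()) };
}

console.log(onUpdateBroken('1.14', '1.15'));
console.log(onUpdateFixed('1.14', '1.15'));
```

In this sketch the difference is only whether the upgrade branch shares the same return shape as the other update paths.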

@eladb sorry, I only got to test this yesterday. It worked great!!! So thank you for that; it is now possible to do a cluster upgrade without breaking the stack AND without cfn returning too early. Good work!!!

LOL, eventually we'll turn this module into a decent thing.
