When upgrading an EKS cluster, the CloudFormation stack breaks: it ends up in a failed state and cannot be restored. I'm managing my own EC2 nodes.
Here's a code example that reproduces the issue:
```ts
// CDK v1 module paths (the issue was seen after 1.35.x)
import * as autoscaling from '@aws-cdk/aws-autoscaling';
import * as ec2 from '@aws-cdk/aws-ec2';
import * as eks from '@aws-cdk/aws-eks';
import * as cdk from '@aws-cdk/core';

export class EKSCluster extends cdk.Stack {
  public readonly eksCluster: eks.Cluster

  constructor(app: cdk.App, id: string, props?: cdk.StackProps) {
    super(app, id, props);

    // Control plane and worker node versions are bumped separately in the steps below
    const clusterVersion = "1.14"
    const workerNodesVersion = "1.14"

    const vpc = new ec2.Vpc(this, 'VPC')

    // No default capacity: the worker nodes come from the self-managed ASG below
    this.eksCluster = new eks.Cluster(this, 'Cluster', {
      defaultCapacity: 0,
      version: clusterVersion,
      vpc: vpc
    });

    const onDemandASG = new autoscaling.AutoScalingGroup(this, 'OnDemandASG', {
      vpc: vpc,
      minCapacity: 2,
      maxCapacity: 10,
      instanceType: new ec2.InstanceType('m5.xlarge'),
      machineImage: new eks.EksOptimizedImage({
        kubernetesVersion: workerNodesVersion,
        nodeType: eks.NodeType.STANDARD // without this, an incorrect SSM parameter for the AMI is resolved
      }),
      updateType: autoscaling.UpdateType.ROLLING_UPDATE,
      rollingUpdateConfiguration: {
        maxBatchSize: 1,
        minInstancesInService: 2,
        waitOnResourceSignals: true,
        pauseTime: cdk.Duration.minutes(1),
        minSuccessfulInstancesPercent: 100
      }
    });

    // Register the ASG with the cluster and map its instance role
    this.eksCluster.addAutoScalingGroup(onDemandASG, {
      bootstrapEnabled: true,
      mapRole: true
    })
  }
}
```

Now do the following:

1. Set `workerNodesVersion` to `"1.15"`.
2. Deploy the stack. It will succeed, and the stack is still in a good state.
3. Set `clusterVersion` to `"1.15"`.
4. Deploy the stack. The update fails, and the stack cannot roll back to 1.14 because the cluster is already on 1.15.

### Error Log

Error from cfn:

```
CustomResource attribute error: Vendor response doesn't contain CertificateAuthorityData key in object arn:aws:cloudformation:eu-central-1:XXXX:stack/XXXX/95d476f0-8e18-11ea-98f7-02433c861a1c|Cluster9EE0221C|50b09ee6-82a7-43c3-ae99-523a068c48b5 in S3 bucket cloudformation-custom-resource-storage-eucentral1
CustomResource attribute error: Vendor response doesn't contain Arn key in object arn:aws:cloudformation:eu-central-1:XXXX:stack/XXXX/95d476f0-8e18-11ea-98f7-02433c861a1c|Cluster9EE0221C|50b09ee6-82a7-43c3-ae99-523a068c48b5 in S3 bucket cloudformation-custom-resource-storage-eucentral1
CustomResource attribute error: Vendor response doesn't contain Endpoint key in object arn:aws:cloudformation:eu-central-1:XXXX:stack/XXXX/95d476f0-8e18-11ea-98f7-02433c861a1c|Cluster9EE0221C|50b09ee6-82a7-43c3-ae99-523a068c48b5 in S3 bucket cloudformation-custom-resource-storage-eucentral1
The following resource(s) failed to update: [ClusterEndpoint352C929A, ClusterCertificate0B8F68BF, ClusterArnSSM9C28FFC5].
```
This is a :bug: Bug Report.
This might be caused by PR #7526, since I had no issues upgrading on 1.35.x.
Thanks for reporting.
I think @moatazelmasry2 is correct. The new code path does not return these attributes during a version upgrade, and therefore CFN is unable to resolve them. I am curious how this did not turn up in our testing... Investigating...
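
To illustrate the failure mode, here is a minimal sketch of a custom resource handler response that carries the same attribute set on every update, including a version-only upgrade. This is not the actual CDK handler code: the `ClusterInfo` and `HandlerResponse` types and the `buildResponse` and `missingAttributes` functions are hypothetical names made up for this example. Only the attribute keys (`Arn`, `Endpoint`, `CertificateAuthorityData`) come from the error log above.

```typescript
// Hypothetical shapes for illustration; the real CDK cluster handler differs.
type RequestType = 'Create' | 'Update' | 'Delete';

interface ClusterInfo {
  arn: string;
  endpoint: string;
  certificateAuthorityData: string;
}

interface HandlerResponse {
  PhysicalResourceId: string;
  Data?: Record<string, string>;
}

// Attribute keys that CloudFormation tried to resolve in the error log.
const REQUIRED_ATTRIBUTES = ['Arn', 'Endpoint', 'CertificateAuthorityData'];

// Maps cluster details to the attribute keys the template reads.
function toAttributes(cluster: ClusterInfo): Record<string, string> {
  return {
    Arn: cluster.arn,
    Endpoint: cluster.endpoint,
    CertificateAuthorityData: cluster.certificateAuthorityData,
  };
}

// Builds the handler response. Create and every kind of Update
// (including a version-only upgrade) return the full attribute set,
// so dependent resources can still resolve them after the update.
function buildResponse(
  requestType: RequestType,
  physicalId: string,
  cluster: ClusterInfo,
): HandlerResponse {
  if (requestType === 'Delete') {
    return { PhysicalResourceId: physicalId };
  }
  return { PhysicalResourceId: physicalId, Data: toAttributes(cluster) };
}

// Lists the required attribute keys that a response does not provide.
function missingAttributes(response: HandlerResponse): string[] {
  const data = response.Data || {};
  return REQUIRED_ATTRIBUTES.filter((key) => !(key in data));
}

// Usage with placeholder values: an update response should miss nothing.
const demoCluster: ClusterInfo = {
  arn: 'placeholder-arn',
  endpoint: 'https://placeholder.example',
  certificateAuthorityData: 'placeholder-ca-data',
};
const updateResponse = buildResponse('Update', 'demo-cluster', demoCluster);
console.log(missingAttributes(updateResponse));
```

A check like `missingAttributes` could be used in a unit test for each update path, which is one way a gap like this might be caught before deployment.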
@eladb sorry, I only got to test this yesterday. It worked great!!! So thank you for that, now it is possible to do a cluster upgrade without breaking the stack AND without cfn returning too early. Good work!!!
LOL, eventually we'll turn this module into a decent thing.