Aws-cdk: Unable to recover from UPDATE_ROLLBACK_COMPLETE

Created on 22 Nov 2019 · 17 Comments · Source: aws/aws-cdk

CloudFormation is able to update stacks that are in UPDATE_ROLLBACK_COMPLETE but the CDK CLI blocks these with the error:

The stack named XXX is in a failed state: UPDATE_ROLLBACK_COMPLETE

Environment

  • CLI Version : 0.24.1
  • Framework Version: 0.24.1
  • OS :
  • Language :

Other


This is :bug: Bug Report

bug, good first issue, in-progress, p1, package/tools

All 17 comments

@eladb - is that a misfire on the CLI version and framework version? Or is there some significance to 0.24.1?

will dig into it to write up some repro steps and get a fix going.

No, this is the version on which I experienced this (our ops project is still using it). It could be that this is resolved in later versions, but I think I've seen some related issues that are still open.

@eladb - got around to reproducing a stack into the UPDATE_ROLLBACK_COMPLETE state (by adding an invalid tag as a property). We do display that message as part of the output of a deploy, but don't prevent further executions because of the state.

You can still update the stack and attempt subsequent deployments successfully. I also verified that the stack can be destroyed. I'm wondering if this was fixed at some point?

Was there some other action that you were trying to take other than deploy/destroy?

Got the same error today. Got around it by generating the cfn template and applying it manually

I'm seeing this too, cdk deploy is blocking the deploy:

โŒ<STACK-NAME> failed: Error: The stack named <STACK-NAME> is in a failed state: UPDATE_ROLLBACK_COMPLETE
The stack named <STACK-NAME> is in a failed state: UPDATE_ROLLBACK_COMPLETE

I'm on version 1.22.0 cli/sdk.

It would seem this check https://github.com/aws/aws-cdk/blob/4aabeb8605de18bf2afa40a7dcbd5c40e339ce83/packages/aws-cdk/lib/api/util/cloudformation.ts#L164 with isSuccess implemented (here https://github.com/aws/aws-cdk/blob/4aabeb8605de18bf2afa40a7dcbd5c40e339ce83/packages/aws-cdk/lib/api/util/cloudformation/stack-status.ts#L36) to exclude all ROLLBACK states is the cause. Why is UPDATE_ROLLBACK_COMPLETE excluded as a success state?

If I hack my local node_modules code to remove this check I am able to deploy successfully.

Is this just a bug? Should UPDATE_ROLLBACK_COMPLETE be treated as a success state?
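
The distinction being asked about can be sketched roughly as follows. This is a hypothetical classifier, not the CDK's actual `stack-status.ts`; the function names are made up for illustration. It assumes that only `*_FAILED` statuses are real failures, and that `UPDATE_ROLLBACK_COMPLETE` is a stable state from which CloudFormation accepts a new update.

```typescript
// Hypothetical helpers; names and structure are illustrative, not the CDK's code.

// True while CloudFormation is still working on the stack.
function isInProgress(status: string): boolean {
  return /_IN_PROGRESS$/.test(status);
}

// True when an operation failed outright and the stack needs attention.
function isFailure(status: string): boolean {
  return /_FAILED$/.test(status);
}

// True when the stack is at rest and a new update can be submitted.
// ROLLBACK_COMPLETE (a failed *create*) is deliberately excluded:
// such a stack has to be deleted before it can be deployed again.
function canBeUpdated(status: string): boolean {
  const updatable = ['CREATE_COMPLETE', 'UPDATE_COMPLETE', 'UPDATE_ROLLBACK_COMPLETE'];
  return updatable.indexOf(status) !== -1;
}
```

Under these assumptions, `canBeUpdated('UPDATE_ROLLBACK_COMPLETE')` is true while `canBeUpdated('ROLLBACK_COMPLETE')` is false, which matches what CloudFormation itself allows.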

Actually, I think this is just a race condition. At https://github.com/aws/aws-cdk/blob/435d81014a481d0828bddbf10a0a155f6efc2e7e/packages/aws-cdk/lib/api/deploy-stack.ts#L107 the stack may not yet have transitioned out of UPDATE_ROLLBACK_COMPLETE to an _IN_PROGRESS state before waitForStack is called, which will then throw an error.
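
To make the suspected race concrete, here is a minimal sketch, assuming the waiter polls the stack status repeatedly after requesting an update. The function `terminalStatusAfterUpdate` is hypothetical and is not how the CDK implements its wait; it only shows one way to avoid mistaking a leftover status for the result of the new deployment.

```typescript
// Hypothetical sketch of the suspected race; not the CDK's implementation.
// `observed` is the sequence of statuses seen while polling after an update
// was requested.
function terminalStatusAfterUpdate(observed: string[]): string | undefined {
  let sawProgress = false;
  for (const status of observed) {
    if (/_IN_PROGRESS$/.test(status)) {
      sawProgress = true;
    } else if (sawProgress) {
      // First stable status reached after the update actually started.
      return status;
    }
    // A stable status seen before any progress is left over from the
    // previous deployment, so it is skipped rather than reported.
  }
  // The update has not finished (or has not started) yet.
  return undefined;
}
```

With this approach, a first poll that still returns `UPDATE_ROLLBACK_COMPLETE` from the previous attempt is ignored, and only the status reached after an `_IN_PROGRESS` state counts as the outcome.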

I thought I had managed to get around this by logging into the web console and "updating" the template manually. That appeared to put the stack back into a healthy state, but after it finished updating it went back to the failed state.

I had no other choice than to delete the stack and start over... this really does not feel like a solution... is there any other workaround?

Still happening on 1.41.
This is a severe issue; it could easily leave people unable to resolve a critical situation in production.

I agree with @neg3ntropy this seems like a really severe issue. I am currently running into this in my multi-stack project. There are some references between the resources of the stacks, and it doesn't seem smart enough to understand how to update the stacks on subsequent deployments. Here's the call-stack I get in the console:

eboozer-app-data-storage
eboozer-app-data-storage: deploying...
eboozer-app-data-storage: creating CloudFormation changeset...
 0/1 | 1:17:56 PM | UPDATE_ROLLBACK_IN_P | AWS::CloudFormation::Stack | eboozer-app-data-storage Export eboozer-app-data-storage:ExportsOutputFnGetAttAppDataBucket857FA106ArnDA77A1F5 cannot be deleted as it is in use by eboozer-formulary-management
 0/1 | 1:18:16 PM | UPDATE_ROLLBACK_COMP | AWS::CloudFormation::Stack | eboozer-app-data-storage 
 1/1 | 1:18:17 PM | UPDATE_ROLLBACK_COMP | AWS::CloudFormation::Stack | eboozer-app-data-storage 

 โŒ  eboozer-app-data-storage failed: Error: The stack named eboozer-app-data-storage is in a failed state: UPDATE_ROLLBACK_COMPLETE
    at /Users/eboozer/Workspaces/next-backend/node_modules/aws-cdk/lib/api/util/cloudformation.ts:247:13
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at waitFor (/Users/eboozer/Workspaces/next-backend/node_modules/aws-cdk/lib/api/util/cloudformation.ts:157:20)
    at Object.deployStack (/Users/eboozer/Workspaces/next-backend/node_modules/aws-cdk/lib/api/deploy-stack.ts:248:26)
    at CdkToolkit.deploy (/Users/eboozer/Workspaces/next-backend/node_modules/aws-cdk/lib/cdk-toolkit.ts:181:24)
    at main (/Users/eboozer/Workspaces/next-backend/node_modules/aws-cdk/bin/cdk.ts:237:16)
    at initCommandLine (/Users/eboozer/Workspaces/next-backend/node_modules/aws-cdk/bin/cdk.ts:170:9)
The stack named eboozer-app-data-storage is in a failed state: UPDATE_ROLLBACK_COMPLETE

I don't think UPDATE_ROLLBACK_COMPLETE should be treated as a "failed" state. And if I manually hack the CDK code to not treat rollbacks as unsuccessful, then I can get past this error, or if I manually tweak the template. Otherwise, I have to delete the stacks in the AWS console and start over (which is not always an option).

Happened to me on 1.51.0 moments ago.

hey @cmckni3 the fix was released in 1.52.0
I'm hoping you don't see that happen again when you upgrade!

@shivlaks sweet, I missed the PR mention above. 😂

I will upgrade to 1.53 this week and see how it goes.

@cmckni3 great! i hope we've sorted it out with this attempt :)

Resolving this issue as we've had the fix merged and released since 1.52.0
Please comment/reopen as needed

I got stuck on 1.45.0,
but the result was the same after I updated to 1.56.0:

MyECSDev: creating CloudFormation changeset...
[··························································] (0/34)


 โŒ  MyECSDev failed: Error: The stack named MyECSDev failed to deploy: UPDATE_ROLLBACK_COMPLETE
    at Object.waitForStackDeploy (/Users/apple/GitProject/hello-cdk/node_modules/aws-cdk/lib/api/util/cloudformation.ts:294:11)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
    at Object.deployStack (/Users/apple/GitProject/hello-cdk/node_modules/aws-cdk/lib/api/deploy-stack.ts:266:26)
    at CdkToolkit.deploy (/Users/apple/GitProject/hello-cdk/node_modules/aws-cdk/lib/cdk-toolkit.ts:181:24)
    at main (/Users/apple/GitProject/hello-cdk/node_modules/aws-cdk/bin/cdk.ts:268:16)
    at initCommandLine (/Users/apple/GitProject/hello-cdk/node_modules/aws-cdk/bin/cdk.ts:188:9)
The stack named MyECSDev failed to deploy: UPDATE_ROLLBACK_COMPLETE

@JoHuang are you unable to update your stack?

If your stack is in UPDATE_ROLLBACK_COMPLETE, that just means that your last deployment failed and was rolled back; the stack should still be updatable.

I found that the root cause was not the stack state. I had set two different ALB rules with the same priority.
However, the CLI didn't show the corresponding error; it only showed "failed to deploy: UPDATE_ROLLBACK_COMPLETE".

After solving the priority issue, both 1.45.0 and 1.56.0 worked.
So maybe we should improve the error explanation?
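
One way a clearer explanation could be produced is sketched below. It assumes the stack's events have already been fetched from CloudFormation's DescribeStackEvents API. The `StackEventLike` interface mirrors only a subset of the fields that API returns, and `failureReasons` is a hypothetical helper, not something the CDK CLI provides.

```typescript
// Subset of the fields returned by CloudFormation's DescribeStackEvents.
interface StackEventLike {
  LogicalResourceId: string;
  ResourceStatus: string;
  ResourceStatusReason?: string;
}

// Hypothetical helper: collect human-readable reasons for resource-level
// failures, so they can be printed next to the final stack status.
function failureReasons(events: StackEventLike[]): string[] {
  return events
    .filter((e) => /_FAILED$/.test(e.ResourceStatus) && e.ResourceStatusReason !== undefined)
    .map((e) => `${e.LogicalResourceId}: ${e.ResourceStatusReason}`);
}
```

Printing the returned lines alongside the UPDATE_ROLLBACK_COMPLETE message would point at the resource that caused the rollback instead of only at the final stack state.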
