Aws-cdk: Thoughts about resource IDs

Created on 23 Dec 2018 · 18Comments · Source: aws/aws-cdk

@abrooksv provided some feedback on resource IDs:

Construct name must be unique within a subtree. When joined with the parent construct it will also be unique. Is there a reason why we append random characters to the end of a logical name?

If my parent Construct is called “CreateReealseEAP” and my construct is “BuildNotificationFailed”, why do we append “4402183C”? It leads to harder to read resource names that are already pretty long. For human readable resources such as CodeBuild project and Pipelines this leads me to using the name override features. According to the docs, it mentions that it is done due to the delimiter can’t be added into the logical ID. If I am understanding this right, it seems like a rare edge case between two separate constructs Foo-BarLambda vs FooBar-Lambda causing a collision. This seems like such a small edge case that we could handle it by adding in the suffix in that case only, and not in the common case.

One of the main value propositions of the CDK is to enable composition of abstract cloud architectures into higher level abstractions, stacks and apps. Since CloudFormation's logical ID namespace is flat, we needed a way to make sure that resource IDs do not collide when reusable components from multiple uncoordinated sources (i.e. 3rd parties, etc) are composed together and synthesized into the same eventual CloudFormation template.

The natural solution would have been to use a delimited full path as the logical ID of each resource. However, this wouldn't have worked because CloudFormation puts rather strict constraints on logical IDs (only alphanumeric, limited in length).

We couldn't only add a suffix in the case where there is a conflict because this means that logical IDs would stop being stable: users might add a resource and suddenly some other resource would have their logical ID change. That's unacceptable behavior, since a change of logical IDs mean recreation of physical resources - definitely something we can't afford.

We've explored options with CloudFormation, and realized they have good reasons for these constraints and that changing them would not be possible without a huge breakage potential across their huge surface area. To overcome those limitations we came up with this model where we render logical IDs that contain both a “human” part and a hash of the full delimited path. The human part is rendered by concatenating the path and making it “nice” while the hash is basically a short MD5 hash, to ensure uniqueness within the template.

CreateReleaseEAPBuildNotificationFailed4402183C vs CreateReleaseEAPBuildNotificationFailed
The concatenating leads to resource names that are arguably “worthless”. Some of this is CFN’s fault due to the stack prefix, but the long logical IDs don’t help.
: Example: I have 2 sibling Lambdas that have the Construct tree Stack->Pipeline->Stage->Action->Lambda since that is logically how it is laid out. This leads to two Lambdas:

sam-dev-AWSToolk-AwsToolkitJetBrainsRelea-XX8M2X3P1ET7

sam-dev-AWSToolk-AwsToolkitJetBrainsRelea-1EHAWJ25E4M58
There is no way to differentiate the Lambdas without adding in descriptions. Descriptions are not useful when attempting to find the metrics in CloudWatch or the log streams though.

Yes, definitely not a very useful set of names. Would love to explore any ideas you might have on helping resolve this. Things that come to mind:

Auto-tag the resources using construct full path so people can at least look them up as tags?
Allow you to override the human part of the logical ID. We still need the hash suffix for the reasons I outlined above, but the human part can be anything. What would be a nice API to do that? I am not sure.
Render the human part of the logical ID in reverse order - this might make it easier to distinguish between multiple resources that are in the same level of the tree...

The full Logical name for one of the Lambdas is:

AwsToolkitJetBrainsReleasePreReleaseCloseReleaseGateCloseReleaseGateLambdaC27581F5. It is full of duplicates because the logical name for each construct should be descriptive enough on its own for it to understand what it does.

Pipeline: AwsToolkitJetBrainsRelease

Stage: PreRelease

Action: CloseReleaseGate

Lambda: CloseReleaseGateLambda

I'd argue that the “full logical ID” is:

AwsToolkitJetBrainsRelease/PreRelease/CloseReleaseGate/CloseReleaseGateLambda

and I wonder, for example, why not use shorter logical IDs? (I know it won't solve your problem, but maybe eliviate it in certain cases):

AwsToolkitJetBrains/PreRelease/CloseGate/Lambda*

This makes me think the Construct tree should never be more than 2-3 layers deep at the most in order to prevent the long name. What are the teams’ thoughts on this?

As mentioned above, this will create a limitation that we can't have. Being able to use arbitrarily deep trees is probably the single most important part of the CDK, as this is how abstractions are composed.

If we use a name override (such as codebuild Project.withProjectName()) in order to alleviate issues mentioned above, and changing the parent constructs name lead to a logical name change, would this lead to a stack that can no longer be deployed due to the logical names getting altered but the resource name override remaining the same? Due to the long names mentioned above, I could see people commonly overriding the name for resources such as Lambdas and Projects.

Yes, that's a CloudFormation limitation that we can't really get around. If you use a physical name, CFN won't be able to replace it with another resource with the same physical name.

Again, this brings up the idea of perhaps allowing you to override only the human part of the logical ID... Definitely worth exploring this direction.

@aws-cdcore guidance

Source

eladb

👍3

Most helpful comment

This strategy continues to be a huge burden to an iterative development process. Creating a construct tree path of Stack -> MyLambda produces something like StackMyLambdaABC123 as a logical ID. Once we've rolled this out to production and have come back to development we are left with a problem.

If we wrap MyLambda with a construct which does nothing except create a new MyLambda(this) construct, the path has changed to StackMyConstructMyLambdaABC123, resulting in cloud formation destroying the resource and creating a new one. This is frequent in iterative development as refactors happen continuously. This sounds like its fine for a lambda and questionable for other resources but its actually problematic even for lambdas as well as other resources which don't contain any state. The reason for that is because cloud formation will create the new resource first before deleting to make rollbacks easier. This leads to a name conflict, since the "primary key" in cloud formation changed, but the "primary key" in the service did not. The root cause of this issue is that cloud formation tracks a different primary identifier than the underlying services do, the difference between Logical ID and Physical ID, and I understand CDK can't change that, but this strategy magnifies the issue more than if you were setting logical IDs yourself manually like you would with JSON or YAML. The only workaround is to put it back to how it was (not often a viable choice) or to rename your resource to be called something else (which sometimes can be an issue too). The last workaround is to roll out a change which deletes the resource (by commenting it out) followed by one which recreates it (by uncommenting it out); this is also not a viable strategy most times as it leaves the production system without a component of the system for some period of time.

I don't think any one solution works here, as has been discussed both have drawbacks to one side or another. I believe the default behavior should be that the ID (second parameter to all constructs) should be used as the logical ID, since they're typically hard coded or something hard coded plus some suffix for most end-user CDK code which is written. This leaves the issue of when "reusable components from multiple uncoordinated sources (i.e. 3rd parties, etc) are composed together". This is where the generation strategy which is the current default should be applied. The only time this issue comes into play again is when you wrap a construct which contains the third party construct and so happens way less frequently. Other strategies could be introduced potentially, but I believe that there is no one size fits all here and that each construct should decide how it sets the logical Ids for its children with the stack's default to being just returning the ID of the construct.

mgwidmann on 21 Jul 2020

👍7 ❤3

All 18 comments

I’d want to go further and allow users to override the entire logical ID as a standard feature of constructs, with the caveat that if another resource ends up using the same ID, it’s up to the user to deal with that.

This would have the same effect as the renaming “hack”, without needing to put the renames at the top of the stack.

I see it as a the stack author signposting “I care about the root CFN resource this construct creates outside of this app”

mipearson on 13 Jan 2019

👍5

Yes. The current experience is not good enough.

How about something like resource.overrideLogicalId(newId). We can implement something like that if logicalId would be a lazy token instead of a concrete value.

eladb on 15 Jan 2019

👍3

I think I can achieve want by providing my own naming system instead of using overrides everywhere but I have not tried it yet:

https://awslabs.github.io/aws-cdk/refs/_aws-cdk_cdk.html?highlight=iaddressingscheme#logicalids

But is there a way to get the actual Resource from inside of https://awslabs.github.io/aws-cdk/refs/_aws-cdk_cdk.html?highlight=iaddressingscheme#@aws-cdk/cdk.IAddressingScheme ?

abrooksv on 15 Jan 2019

@eladb yep, I like that as a solution - better than having to put them all at the top of the stack.

Potential use case is using CDK to generate templates for other teams to consume, and in that case "clean" resource IDs will matter.

mipearson on 21 Jan 2019

https://github.com/awslabs/aws-cdk/pull/1670 and #1595 should provide a pretty good solution for customizing logical IDs

eladb on 4 Mar 2019

1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

cal5barton on 14 Mar 2020

1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

skinny85 on 14 Mar 2020

1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

Thanks for the link to that. Unfortunately, it doesn't produce the behavior that I want. I'd like to strip or remove the auto-generated hash completely and allow for the logical id to be the id that I pass in the constructor of the resource. Basically, by overriding that method, I'm replacing the auto-generated hash with a hash of my own. It would be nice to disable that functionality altogether and let us assume the risk of collisions of logicalIds.

cal5barton on 14 Mar 2020

👍4

1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

Thanks for the link to that. Unfortunately, it doesn't produce the behavior that I want. I'd like to strip or remove the auto-generated hash completely and allow for the logical id to be the id that I pass in the constructor of the resource. Basically, by overriding that method, I'm replacing the auto-generated hash with a hash of my own. It would be nice to disable that functionality altogether and let us assume the risk of collisions of logicalIds.

I'm not sure what you mean by that... here's an example implementation:

export class TestStack extends Stack {
    protected allocateLogicalId(cfnElement: CfnElement): string {
        return cfnElement.node.id;
    }
}

Note you need to be aware the separation between L2s (which you create explicitly) and L1 classes (those starting with the Cfn prefix), which are created implicitly for you with logical IDs like Resource.

skinny85 on 16 Mar 2020

1670 and #1595 should provide a pretty good solution for customizing logical IDs

@eladb do you know if there are any plans to make this easier to override on a global level? It would be nice to not have to go down to the Cfn construct level for every single resource that we create.

You can also do it by subclassing Stack and overriding the allocateLogicalId method: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_core.Stack.html#protected-allocate-wbr-logical-wbr-idcfnelement

Thanks for the link to that. Unfortunately, it doesn't produce the behavior that I want. I'd like to strip or remove the auto-generated hash completely and allow for the logical id to be the id that I pass in the constructor of the resource. Basically, by overriding that method, I'm replacing the auto-generated hash with a hash of my own. It would be nice to disable that functionality altogether and let us assume the risk of collisions of logicalIds.

I'm not sure what you mean by that... here's an example implementation:
export class TestStack extends Stack {
    protected allocateLogicalId(cfnElement: CfnElement): string {
        return cfnElement.node.id;
    }
}
Note you need to be aware the separation between L2s (which you create explicitly) and L1 classes (those starting with the Cfn prefix), which are created implicitly for you with logical IDs like Resource.

@skinny85 I've tried and tried to override this method with the code you provided but cfnElement.node.id just returns Resource, S3VersionKey, CFDistribution, etc.... I would expect the id to be what was passed via the constructor when creating a certain construct. I've debugged several other properties on the node object and don't see where the id I pass is located.

cal5barton on 24 Apr 2020

👍1

Yes. That's why I said:

Note you need to be aware the separation between L2s (which you create explicitly) and L1 classes (those starting with the Cfn prefix), which are created implicitly for you with logical IDs like Resource.

You can try something like:

    protected allocateLogicalId(cfnElement: CfnElement): string {
        const allScopes = cfnElement.node.scopes;
        const stackIndex = allScopes.indexOf(cfnElement.stack);
        const resourceScopes = allScopes.slice(stackIndex + 1);
        return resourceScopes
            .map(cons => cons.node.id)
            .reduce((id, acc) => id + acc, '');
    }

to try and get rid of the hashes at the end of the logical IDs, while having a better chance of them being unique.

Feel free to come up with your own algorithm as you see fit (for example, you might want to filter out generic IDs in the scopes like Resource, Default, etc.).

skinny85 on 27 Apr 2020

@skinny85 This is also an issue for me because I want to migrate to CDK from a simple CloudFormation template. I was under the impression that I would be able to describe a CDK stack which would generate a template similar to what I have. Because of the hash codes at the end of the names of the resources, the tables/buckets will be replaced, which I want to avoid. So, I don't want to generate a new unique logical id, I just want to use the one I gave in the constructor.

RoKish on 29 Apr 2020

@RoKish for that, you have a few options:

If you use the Layer 1 classes (those that start with Cfn*), and you give Stack as the parent, the logical ID will be preserved:

import * as cdk from '@aws-cdk/core';
import s3 = require('@aws-cdk/aws-s3');

export class MyStack extends cdk.Stack {
    constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
        super(scope, id, props);

        new s3.CfnBucket(this, 'MyBucket')
    }
}

results in the template:

Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    Metadata:
      aws:cdk:path: MyStack/MyBucket

If you'd rather use the Layer 2 classes, you can use the overrideLogicalId method on the underlying L1 classes that the the L2s create:

import * as cdk from '@aws-cdk/core';
import s3 = require('@aws-cdk/aws-s3');

export class MyStack extends cdk.Stack {
    constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
        super(scope, id, props);

        const bucket = new s3.Bucket(this, 'MyBucket');
        const cfnBucket = bucket.node.defaultChild as cdk.CfnResource;
        cfnBucket.overrideLogicalId('MyBucket');
    }
}

results in the template:

Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    UpdateReplacePolicy: Retain
    DeletionPolicy: Retain
    Metadata:
      aws:cdk:path: MyStack/MyBucket/Resource

Hope this helps!

skinny85 on 29 Apr 2020

@skinny85 Thank you! The second should work for me.

RoKish on 30 Apr 2020

NP, glad I could help 🙂

skinny85 on 30 Apr 2020

@skinny85 How would you go about overriding allocateLogicalId in Python? Directly translating the code doesn't seem to work for me.