Chalice: [proposal] AWS Managed Resources for Chalice

Created on 5 Sep 2017 · 7 Comments · Source: aws/chalice

Abstract

This issue proposes a mechanism for managing additional AWS resources
that a serverless Chalice application may rely on. In terms of
management, it will handle creation, updates, and deletion of the resources
upon deployment of the application and allow users to easily interact
with these resources within their application. The only AWS resource that
will be added in this proposal is a DynamoDB table, but the mechanism should
be able to support any future AWS resource.

Motivation

Currently, Chalice does not manage any AWS resources that are part of
the core business logic of a Chalice application but are not part of
the Chalice decorator interface (e.g. app.lambda_function(), app.route()).
An application may rely on many such AWS resources, such as
an S3 bucket or a DynamoDB table. However, these resources must be created
out of band of the actual Chalice deployment process, which is inconvenient
because:

- There is no way to hook into the chalice deploy command, so users have
  to deploy the resources themselves outside of chalice deploy, either
  manually or by relying on out of band deployment scripts.
- If users are using CloudFormation and chalice package, they will need to
  modify the CloudFormation template to add the resources they need.
- Users will need to manage the deployment of resources per stage if they
  want the resources in each stage to be fully isolated from
  the other stages.
- Within the core logic of the Chalice application, nothing
  guarantees that the AWS resource being accessed exists. The
  user needs to be extra careful to make sure the out of band resources they
  deploy match up with what is used in the Chalice application.

Therefore, it is a much friendlier user story if Chalice handles the
deployment of these resources for the user. Also, since Chalice performs the
deployment, it can easily provide references to those deployed resources
from within the Chalice application.

Specification

This section goes into detail about the interfaces for adding these
managed resources, code samples of how users will interact with the interfaces,
and the logic for deploying these managed resources. Since
a DynamoDB table is the only resource this proposal suggests adding,
this section is specific to DynamoDB tables.

To have Chalice manage AWS resources, users first declare
their resources in code via a resources.py file and may then configure these
resources using the Chalice config file.

Code Interface

The top level interface into these managed resources is a resources.py
file. This is used for declaring all additional AWS resources to be
managed by Chalice in an application. The resources.py file will live
alongside the app.py file in a Chalice application:

myapp$ tree .
.
|-- app.py
|-- requirements.txt
|-- resources.py

Inside of the resources.py is where the various managed AWS resources are
declared and registered to the application. To better explain how the
resources.py file works, here is an example of the contents of the file:

from chalice.resources.dynamodb import Table

def register_resources(app):
    app.resource(MyTable)


class MyTable(Table):
    name = 'mytable'
    key_schema = [
        {
            'AttributeName': 'username',
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'rank',
            'KeyType': 'RANGE'
        }
    ]
    attribute_definitions = [
        {
            'AttributeName': 'username',
            'AttributeType': 'S'
        },
        {
            'AttributeName': 'rank',
            'AttributeType': 'N'
        }
    ]
    provisioned_throughput = {
        'ReadCapacityUnits': 20,
        'WriteCapacityUnits': 10
    }

The resources.py file requires a module level register_resources() function
to include any additional resources for Chalice to manage for the application.
The register_resources() function only accepts an app object representing
the Chalice application. Within the register_resources() function, users
must use the app.resource() method to include the resource in the
application. Currently, app.resource() accepts only one argument:
the resource class to be registered. Furthermore, all resources registered
must have a unique logical name. The logical name for a Chalice resource
is either the value of the name property of the resource class or, if that
is not set, the class name of the resource.
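The uniqueness rule can be sketched as follows. This is only an illustration under stated assumptions, not Chalice's implementation: ResourceRegistry, logical_name(), and the stand-in Table base class are hypothetical names introduced here.

```python
class Table(object):
    """Stand-in for the proposed chalice.resources.dynamodb.Table."""
    name = None


def logical_name(resource_cls):
    # The logical name is the ``name`` property if set, else the class name.
    return getattr(resource_cls, 'name', None) or resource_cls.__name__


class ResourceRegistry(object):
    """Hypothetical registry that could back app.resource()."""

    def __init__(self):
        self._resources = {}

    def resource(self, resource_cls):
        key = logical_name(resource_cls)
        if key in self._resources:
            raise ValueError('Duplicate logical resource name: %s' % key)
        self._resources[key] = resource_cls

    def names(self):
        return sorted(self._resources)


class MyTable(Table):
    name = 'mytable'


class OtherTable(Table):
    pass  # No ``name`` set, so the logical name is 'OtherTable'.


registry = ResourceRegistry()
registry.resource(MyTable)
registry.resource(OtherTable)
```

Registering a second class whose name property is also 'mytable' would raise ValueError in this sketch.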

To actually declare a managed resource, users must first import the
appropriate resource class from the chalice.resources.<service-name> module.
Then they must subclass from the desired resource class and provide the
appropriate class properties to configure the resource.

In the original example, the user first imports the Table class in order to
declare a DynamoDB table for their Chalice application. The user then creates
a new class MyTable from the Table class to flesh out the properties
of the DynamoDB table they want. The configurable class
properties of a DynamoDB table are as follows:

- name: The logical name of the table in the application.
  It is important to note that the name of the deployed DynamoDB table will
  not actually match this value, in order to support stages. If this value is
  not provided, the name of the class will be used as the logical name
  for the table in the application. In general, all resource classes
  must allow users to set the name property.
- key_schema: The KeySchema parameter to the DynamoDB CreateTable API.
  This defines the hash key and, optionally, the range key for the table.
  This value is required.
- attribute_definitions: The AttributeDefinitions parameter to the DynamoDB
  CreateTable API. This defines the types of the specified keys.
  This value is required.
- provisioned_throughput: The ProvisionedThroughput parameter to the
  DynamoDB CreateTable API. This defines the read and write capacity
  for a DynamoDB table. This value is required.
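To make the mapping between class properties and CreateTable parameters concrete, here is a rough sketch. The create_table_kwargs() helper and the stand-in Table base class are hypothetical; only the CreateTable parameter names (TableName, KeySchema, AttributeDefinitions, ProvisionedThroughput) come from the DynamoDB API.

```python
class Table(object):
    """Stand-in for the proposed chalice.resources.dynamodb.Table."""
    name = None
    key_schema = None
    attribute_definitions = None
    provisioned_throughput = None


class MyTable(Table):
    name = 'mytable'
    key_schema = [{'AttributeName': 'username', 'KeyType': 'HASH'}]
    attribute_definitions = [
        {'AttributeName': 'username', 'AttributeType': 'S'}]
    provisioned_throughput = {
        'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}


def create_table_kwargs(table_cls, deployed_table_name):
    # Hypothetical helper: translate the class properties into the
    # keyword arguments expected by the DynamoDB CreateTable API.
    return {
        'TableName': deployed_table_name,
        'KeySchema': table_cls.key_schema,
        'AttributeDefinitions': table_cls.attribute_definitions,
        'ProvisionedThroughput': table_cls.provisioned_throughput,
    }


kwargs = create_table_kwargs(MyTable, 'myapp-dev-mytable')
```

A deployer could then pass the result to boto3.client('dynamodb').create_table(**kwargs).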

With the resources.py fully fleshed out, Chalice will then deploy all of the
resources registered to the application in the register_resources()
function.

The resources then can be accessed from within the Chalice application. With
the addition of the resources.py file, the chalice.Chalice app object will
be updated to include a resources property.

class Chalice(object):
    def __init__(self, app_name):
        ...
        self.resources = Resources()

The resources property serves as a way of referencing values for deployed
resources.

The Resources() class interface will be the following:

class Resources(object):
    def get_service(self, resource_name):
        # type: (str) -> str
        pass

    def get_resource_type(self, resource_name):
        # type: (str) -> str
        pass

    def get_deployed_values(self, resource_name):
        # type: (str) -> Dict[str, Any]
        pass

For the Resources class, its methods are the following:

- get_service(): Returns the name of the service the resource falls
  under. The service name should match the service name used in boto3.
  For example, a DynamoDB table will return a value of dynamodb.
- get_resource_type(): Returns the type of the resource. This should
  match the name of the resource class under the chalice.resources.<service>
  module, which should match the name of the boto3 resource (assuming there
  is a boto3 resource available for this AWS resource type). For
  example, a DynamoDB table will return a value of Table.
- get_deployed_values(): Returns a dictionary of the deployed values
  of a resource. The deployed values are typically identifiers for the
  resource. The key names in this dictionary should match the parameters a
  user would typically use in a boto3 client call for that service's API.
  For a DynamoDB table, the deployed values dictionary will be the following:
  {"TableName": "<name-of-deployed-table>"}

To interact with the deployed resources in the application, refer to the
previous resources.py and the following app.py:

from chalice import Chalice
import boto3


app = Chalice(app_name='myapp')
dynamodb = boto3.resource('dynamodb')


@app.route('/users/{username}')
def get_user(username):
    deployed_table_name = app.resources.get_deployed_values(
        'mytable')['TableName']
    table = dynamodb.Table(deployed_table_name)
    response = table.get_item(Key={'username': username})
    return response['Item']

In the above example, the application was able to retrieve the name
of the deployed DynamoDB table by calling the get_deployed_values() method.

Furthermore, if a user wants to programmatically create a client or
resource object for a particular deployed resource, they could
write the following helper functions, respectively:

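def get_boto3_client(resource_name):
    # Create a client for the service the resource belongs to.
    return boto3.client(app.resources.get_service(resource_name))


def get_boto3_resource(resource_name, *resource_identifiers):
    # Look up the boto3 resource class (e.g. Table) on the service resource
    # and instantiate it with the given identifiers.
    service_resource = boto3.resource(
        app.resources.get_service(resource_name))
    resource_cls = getattr(
        service_resource, app.resources.get_resource_type(resource_name))
    return resource_cls(*resource_identifiers)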

Config Interface

In the case a user wants the configuration options to vary by stage,
users can specify configuration of the managed resources through the Chalice
config file. To configure a resource, the user would need to specify the
following general configuration:

"resources": {
  "<logical-resource-name>": {
    "<option-name>": "<option-value>"
  }
}

As it relates to DynamoDB tables, the only options available will be
configuring the provisioned capacity. Below is a sample configuration for
the previously provided DynamoDB table in the resources.py:

"resources": {
  "mytable": {
    "provisioned_throughput": {
       "ReadCapacityUnits": 50,
       "WriteCapacityUnits": 10
    }
  }
}

For DynamoDB tables, only read capacity and write capacity can be specified.

The "resources" configuration can be specified as a top level key and on a
per stage basis, where the values in the stage completely replace any that
exist in the top level key configuration. In addition, any resource specific
configuration provided in the config file will replace whatever values
were specified in code.

For example, take the following table defined in the resources.py file:

from chalice.resources.dynamodb import Table

def register_resources(app):
    app.resource(MyTable)


class MyTable(Table):
    name = 'mytable'
    key_schema = [
        {
            'AttributeName': 'username',
            'KeyType': 'HASH'
        },
    ]
    attribute_definitions = [
        {
            'AttributeName': 'username',
            'AttributeType': 'S'
        },
    ]
    provisioned_throughput = {
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }

With the following Chalice config file:

{
  "version": "2.0",
  "app_name": "myapp",
  "resources": {
    "mytable": {
      "provisioned_throughput": {
        "ReadCapacityUnits": 10,
        "WriteCapacityUnits": 10
      }
    }
  },
  "stages": {
    "dev": {},
    "prod": {
      "resources": {
        "mytable": {
          "provisioned_throughput": {
            "ReadCapacityUnits": 100,
            "WriteCapacityUnits": 20
          }
        }
      }
    }
  }
}

With this Chalice config file, the mytable table for each stage will have
the following configuration values:

- "dev": Read capacity of 10 and write capacity of 10. Both values are
  sourced from the top level configuration.
- "prod": Read capacity of 100 and write capacity of 20. Both values are
  sourced from the "prod" stage configuration.

However, if there were no top-level "resources" key in the config file,
the dev stage would use the values specified in the resources.py file,
which were a read capacity of 5 and a write capacity of 5.
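The precedence described above (stage config, then top level config, then the value declared in code) can be sketched as a small lookup. The resolve_option() function is hypothetical, and it assumes that replacement happens per option of a resource, which the proposal does not spell out.

```python
config = {
    'version': '2.0',
    'app_name': 'myapp',
    'resources': {
        'mytable': {
            'provisioned_throughput': {
                'ReadCapacityUnits': 10, 'WriteCapacityUnits': 10}
        }
    },
    'stages': {
        'dev': {},
        'prod': {
            'resources': {
                'mytable': {
                    'provisioned_throughput': {
                        'ReadCapacityUnits': 100, 'WriteCapacityUnits': 20}
                }
            }
        }
    }
}

# Value declared on the class in resources.py.
code_value = {'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}


def resolve_option(config, stage, logical_name, option, code_value):
    # Hypothetical helper: the stage level wins over the top level,
    # which wins over the value declared in code. Values are replaced
    # whole, never merged.
    stage_resources = (
        config.get('stages', {}).get(stage, {}).get('resources', {}))
    top_level_resources = config.get('resources', {})
    for resources in (stage_resources, top_level_resources):
        options = resources.get(logical_name, {})
        if option in options:
            return options[option]
    return code_value


dev = resolve_option(
    config, 'dev', 'mytable', 'provisioned_throughput', code_value)
prod = resolve_option(
    config, 'prod', 'mytable', 'provisioned_throughput', code_value)
```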

Deployment Logic

In terms of deployment logic, both chalice deploy and chalice package
will be supported.

When it comes to the chalice deploy command, Chalice will look at all
of the resources created under the Chalice.resources property and
individually deploy and make any updates to the resource using the
service's API directly. It is important to note that if the user changes
the logical Chalice name of the resource, it will be deleted on Chalice
redeploys.
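One way to picture this behavior is as a diff between the logical names declared in resources.py and the logical names recorded from the previous deployment. The plan_deployment() function below is a hypothetical sketch, not the proposed deployer.

```python
def plan_deployment(declared_names, previously_deployed):
    # Hypothetical helper: compare declared logical names against the
    # logical names saved by the previous deployment.
    declared = set(declared_names)
    deployed = set(previously_deployed)
    return {
        'create': sorted(declared - deployed),
        'update': sorted(declared & deployed),
        'delete': sorted(deployed - declared),
    }


# The table was previously deployed as 'mytable' and is now declared
# with the logical name 'users' instead.
plan = plan_deployment(
    declared_names=['users'],
    previously_deployed={'mytable': {'service': 'dynamodb'}},
)
```

In this sketch, renaming the resource shows up as one delete ('mytable') and one create ('users'), which is why a rename removes the existing table.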

Once deployed, it will save all of the deployed resources under the
"resources" key in the deployed.json whose value will be a dictionary
that contains values related to the various managed resources. The format of
the dictionary will be as follows:

{
    "<logical-resource-name>": {
       "service": "<service-name>",
       "resource_type": "<resource-type>",
       "properties": {
           ... various identifiers and properties of the resource ...
       }
    }
}

As it relates to the specific keys:

- The top level key is the name of the resource that got registered to the
  application.
- The "service" key is the name of the service module the resource falls
  under. In general, the service module should match the name used to
  instantiate a botocore client.
- The "resource_type" key is the name of the resource class. In general, the
  name of the resource class should match the name of the class used by
  the boto3 resource.
- The "properties" key contains identifying values related to the
  resource. The keys and values should match up with the values used in the
  botocore client calls.

Taking the previous Table example, the value of the "resources" key will
look like the following in the deployed.json:

{
    "mytable": {
        "service": "dynamodb",
        "resource_type": "Table",
        "properties": {
            "TableName": "myapp-dev-mytable"
        }
    }
}

Making the entire deployed.json look like the following:

{
  "dev": {
    "api_handler_name": "myapp-dev",
    "api_handler_arn": "arn:aws:lambda:us-west-2:934212987125:function:myapp-dev",
    "resources": {
      "mytable": {
        "service": "dynamodb",
        "resource_type": "Table",
        "properties": {
          "TableName": "myapp-dev-mytable"
        }
      }
    },
    "lambda_functions": {},
    "backend": "api",
    "chalice_version": "1.0.0b1",
    "rest_api_id": "448qxrx2vj",
    "api_gateway_stage": "dev",
    "region": "us-west-2"
  }
}
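Given a deployed.json in this format, the Resources interface could be backed directly by the "resources" dictionary of the current stage. The sketch below is one possible shape; the constructor argument and the _lookup() helper are assumptions, and the proposal does not say how these values reach the Lambda runtime.

```python
class Resources(object):
    def __init__(self, deployed_resources):
        # ``deployed_resources`` is the value of the "resources" key for
        # the current stage (an assumption made for this sketch).
        self._deployed = deployed_resources

    def _lookup(self, resource_name):
        try:
            return self._deployed[resource_name]
        except KeyError:
            raise ValueError('Unknown resource: %s' % resource_name)

    def get_service(self, resource_name):
        return self._lookup(resource_name)['service']

    def get_resource_type(self, resource_name):
        return self._lookup(resource_name)['resource_type']

    def get_deployed_values(self, resource_name):
        # Return a copy so callers cannot mutate the stored values.
        return dict(self._lookup(resource_name)['properties'])


deployed = {
    'dev': {
        'resources': {
            'mytable': {
                'service': 'dynamodb',
                'resource_type': 'Table',
                'properties': {'TableName': 'myapp-dev-mytable'},
            }
        }
    }
}

resources = Resources(deployed['dev']['resources'])
table_name = resources.get_deployed_values('mytable')['TableName']
```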

For the chalice package command, Chalice will take the resources in the
application and add them to the CloudFormation template. The generated
CloudFormation template will use the AWS::DynamoDB::Table resource type to
create the DynamoDB resource.
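As a rough sketch of what the packaging step might emit, the function below turns a table class into an AWS::DynamoDB::Table entry. The table_to_cloudformation() helper, the stand-in Table class, and the use of the class name as the template's logical ID are assumptions; the table name follows the "<app-name>-<stage-name>-<logical-table-name>" scheme this proposal describes.

```python
import json


class Table(object):
    """Stand-in for the proposed chalice.resources.dynamodb.Table."""
    name = None


class MyTable(Table):
    name = 'mytable'
    key_schema = [{'AttributeName': 'username', 'KeyType': 'HASH'}]
    attribute_definitions = [
        {'AttributeName': 'username', 'AttributeType': 'S'}]
    provisioned_throughput = {
        'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5}


def table_to_cloudformation(table_cls, app_name, stage_name):
    # Hypothetical helper: build the CloudFormation resource entry.
    logical_name = table_cls.name or table_cls.__name__
    return {
        'Type': 'AWS::DynamoDB::Table',
        'Properties': {
            'TableName': '%s-%s-%s' % (app_name, stage_name, logical_name),
            'KeySchema': table_cls.key_schema,
            'AttributeDefinitions': table_cls.attribute_definitions,
            'ProvisionedThroughput': table_cls.provisioned_throughput,
        },
    }


template = {
    'Resources': {
        # The class name is used as the template logical ID here because
        # CloudFormation logical IDs must be alphanumeric.
        MyTable.__name__: table_to_cloudformation(MyTable, 'myapp', 'dev'),
    }
}
template_json = json.dumps(template, indent=2)
```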

It is also important to note that the actual name of the DynamoDB table that
will be created for both deployment methods will be
"<app-name>-<stage-name>-<logical-table-name>". So if the user adds a table
called "mytable" to the application "myapp", the
deployed table will be called "myapp-dev-mytable" when deployed to the
"dev" stage.

Rationale/FAQ

Q: How do you imagine the interface will grow for future resources?

A lot of the future managed resources will be able to follow the same
pattern of the DynamoDB table resource. In general to add support for
a new resource, the following changes will be needed:

- Add a new service module under chalice.resources if needed and add a
  base class for that resource for users to subclass from.
- Add the necessary chalice deploy and chalice package logic for the
  resource.
- Allow for any necessary configurations in the Chalice config.

To get a better understanding of potential future interfaces, here are some
rough sketches for future AWS resources.

S3 Bucket

Here is a sample application showing how a user may rely on Chalice to manage
and interact with their S3 buckets:

The resources.py file would be the following:

# Example of making thumbnails from a source bucket:
# http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
from chalice.resources.s3 import Bucket

def register_resources(app):
    app.resource(SourceBucket)
    app.resource(TargetBucket)


class SourceBucket(Bucket):
    pass


class TargetBucket(Bucket):
    pass

Then the app.py would be the following:

# Example of making thumbnails from a source bucket:
# http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
import io

from chalice import Chalice
import boto3

app = Chalice(app_name='myapp')

s3 = boto3.resource('s3')
source_bucket = s3.Bucket(
    app.resources.get_deployed_values('SourceBucket')['Bucket'])
target_bucket = s3.Bucket(
    app.resources.get_deployed_values('TargetBucket')['Bucket'])


# This is just an example of how an S3 event may look in the future.
# There are no guarantees on this interface.
@app.s3_event(source_bucket.name, event_type='ObjectCreated')
def save_thumbnail(event, context):
    image_stream = io.BytesIO()
    key = event['Records'][0]['s3']['object']['key']
    source_bucket.download_fileobj(key, image_stream)
    # Rewind so the downloaded bytes can be read from the start.
    image_stream.seek(0)
    # resize_image() is a user-defined helper that is not shown here.
    resized_image_stream = resize_image(image_stream)
    target_bucket.upload_fileobj(resized_image_stream, key)

SNS Topic

Here is a sample use of managing and interacting with an SNS topic:

The resources.py would be:

from chalice.resources.sns import Topic

def register_resources(app):
    app.resource(MyTopic)


class MyTopic(Topic):
    pass

And then in the app.py, users could publish messages to this SNS topic:

import json

from chalice import Chalice
import boto3

app = Chalice(app_name='myapp')
sns = boto3.client('sns')


@app.lambda_function()
def publish(event, context):
    arn = app.resources.get_deployed_values('MyTopic')['TopicArn']
    # Publish the event passed to the Lambda function.
    sns.publish(TopicArn=arn, Message=json.dumps(event))

Q: Why have users specify the resources in code (instead of a config file)?

It is a much more intuitive and user friendly interface. The other option
would be for the user to specify the resource in some configuration file; the
resource would then be automatically created and could be used in the Lambda
function right away. So something like:

from chalice import Chalice

import boto3

app = Chalice(app_name='myapp')
dynamodb = boto3.resource('dynamodb')


@app.route('/users/{username}')
def get_user(username):
    # Note: There is no code that actually adds the resource
    table = dynamodb.Table(
        app.resources.get_deployed_values('mytable')['TableName'])
    response = table.get_item(Key={'username': username})
    return response['Item']

The problems with this approach are the following:

- Python developers typically prefer to be working in Python code as opposed
  to JSON or YAML. Also, Python is easier to write, validate, and extend.
- It just seems too implicit that the DynamoDB table will be automatically
  available with no explicit actions in the code that created the table,
  especially if the resource is a core part of their application logic.

Q: Why separate the resources into a resources.py file?

The main reason is that, from a user's perspective, it adds a nice layer of
separation between the core logic in the app.py and the additional AWS
resources it may require. Putting all of the resource declarations in
the app.py file makes the app.py file bloated, especially if the user
has a lot of resources. Furthermore, the resource declarations are only really
needed for the deployment of the application, so it does not make sense to
have these classes in the runtime when they are not going to be used
directly by the application's core logic.

Q: Why have specific classes for each resource type instead of a general Resource class?

Neither the resources.py nor the chalice.resources package will be
included in the deployment package. Since the resources are not included
in the deployment package, the number of resources is not constrained by
deployment performance or Lambda package size. Having a specific class for
each resource allows for:

- Better readability and code completion for users as there is a specific
  base class for each resource that they can import.
- Better validation. Instead of relying on validation of the declared classes
  in the deployer, we can validate declared classes through specific
  metaclasses for that resource class.
- Better extensibility. In general, not being coupled to a single Resource
  class allows us to add specific functionality for a resource if needed.
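The metaclass-based validation mentioned above might look roughly like this. TableMeta and the stand-in Table class are hypothetical names, and the sketch is written for Python 3; it only checks that the required properties are present when a subclass is defined.

```python
class TableMeta(type):
    """Hypothetical metaclass that validates Table subclasses."""

    REQUIRED = (
        'key_schema', 'attribute_definitions', 'provisioned_throughput')

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Only validate subclasses, not the base Table class itself.
        if not any(isinstance(base, TableMeta) for base in bases):
            return
        missing = [prop for prop in TableMeta.REQUIRED
                   if getattr(cls, prop, None) is None]
        if missing:
            raise TypeError('%s is missing required properties: %s'
                            % (name, ', '.join(missing)))


class Table(metaclass=TableMeta):
    """Stand-in for the proposed chalice.resources.dynamodb.Table."""
    name = None
    key_schema = None
    attribute_definitions = None
    provisioned_throughput = None


error = None
try:
    # Declaring a table without all required properties fails as soon
    # as the class statement runs, before any deployment starts.
    class IncompleteTable(Table):
        key_schema = [{'AttributeName': 'username', 'KeyType': 'HASH'}]
except TypeError as exc:
    error = str(exc)
```

One limitation of validating this eagerly is that an intermediate base class that sets only some of the properties would also be rejected, so a real implementation would likely need a way to mark such classes as abstract.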

Q: Why have users subclass from a resource class and then define class properties instead of having them instantiate the class directly?

This was chosen for a couple of reasons:

- It follows the declarative style of writing a Chalice application.
  Instantiating an instance of a resource class is a more imperative style.
- It allows users to leverage inheritance when declaring resources. For
  example, they would be able to create a Table resource class with a default
  provisioned throughput that can be subclassed by any other table class
  and inherit the provisioned throughput configurations.
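The inheritance case could look like the sketch below. The Table class here is a plain stand-in for the proposed base class, and the class names LowThroughputTable and UsersTable are made up for illustration.

```python
class Table(object):
    """Stand-in for the proposed chalice.resources.dynamodb.Table."""
    name = None
    provisioned_throughput = None


class LowThroughputTable(Table):
    # Shared default that concrete tables inherit.
    provisioned_throughput = {
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }


class UsersTable(LowThroughputTable):
    name = 'users'
    key_schema = [{'AttributeName': 'username', 'KeyType': 'HASH'}]
    attribute_definitions = [
        {'AttributeName': 'username', 'AttributeType': 'S'}]


# UsersTable never sets provisioned_throughput itself; normal Python
# attribute lookup finds the value on LowThroughputTable.
inherited = UsersTable.provisioned_throughput
```

Only UsersTable would be passed to app.resource(); the intermediate class exists purely to share defaults.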

Q: Why can't resources managed by a Chalice application share the same logical name in a Chalice application?

It is a combination of making it easier to interact with the resources
declared in the resources.py and there not being a strong reason for wanting
the ability to share the same logical name in Chalice. Specifically:

- If resources could share the same name, the app.resources methods would
  require the service name and resource type to be specified along with
  the resource name to get the deployed values. The Chalice config file
  would also require another level of nesting.
- Given that Chalice resources are declared by defining classes, resource
  classes in general cannot share the same name, as one may clobber a
  previously declared class in the resources.py file. The only way to make the
  logical name the same would be by setting the name property of two declared
  resource classes to be the same.
- Currently, users are unable to explicitly set the name of the resources
  that Chalice deploys on their behalf. This is because, with the
  existence of stages, deployed resources already need to have different
  names so that resources can be partitioned by stage and there are no
  shared resources between stages.
- AWS resources of the same type generally cannot share the same name.
  So sharing the same name would only be for sharing the same name across
  resource types. However, to reiterate, the exact name of the deployed
  resource cannot be explicitly set by the user.

Q: What if users require further configuration (e.g. secondary indexes)?

That is not currently supported. We would need to expose deployment
hooks or add the class property to the base class. However, it may be possible
in the future for users to define their own resource classes and register
their custom resources to their application.

Future Work

This section talks about ideas that could be potentially pursued in the future
but will not be addressed in this initial implementation.

Custom Resource Classes

This idea would enable users to define their own resources that can be
managed by Chalice. This would be useful if a user wants
Chalice to manage a resource that currently does not have first class
support, or wants to add logic on top of an existing
resource type. In order to support this, the general resource interface
will need to be solidified, and we will need to figure out how users would
plumb in deployment logic for that resource.

Simplified Resource Classes

This idea would allow users to specify resources that have a simplified
configuration. The purpose of adding these is to help users that are either
new to AWS or do not necessarily need all of the different
resource parameters. This is ultimately done by reducing and simplifying
the configuration parameters a user would have to specify in a class.
Potential resource classes include the serverless SimpleTable and an S3
bucket (if an S3 bucket resource gets exposed) that exposes configuration
parameters solely for the purpose of hosting content for a website.

proposals

kyleknap

👍11 ❤4 👀1

Most helpful comment

We still don't have an ETA on when the implementation will be complete. I have done some work to get a rough POC together, but I also made changes to the design that I originally wrote. So I would also like to get a draft together of those proposed changes before getting a formal PR. However because we have support for experimental features in chalice, it should be a lot easier to add as this would likely be an experimental feature.

kyleknap on 18 Feb 2019

👍3

All 7 comments

This looks awesome. Any idea when this might be supported?

aalvrz on 18 Dec 2017

This is pending the work on https://github.com/aws/chalice/issues/604, which essentially makes the deployer code less API Gateway/Lambda specific. This is going to make it easier to support new resource types. I'm actively working on #604 but don't have a concrete ETA.

jamesls on 28 Dec 2017

👍1

Now that the new deployer has been finished and merged into master, what would be the starting point to begin adding functionalities described in this issue? I would love to start contributing on being able to add necessary resources for a Chalice app.

aalvrz on 27 Mar 2018

SQS triggers are showing up in what I assume is a soft-release in the console and in botocore, hopefully we can use these in Chalice soon! :)

kadrach on 29 Jun 2018

👍1

When do you think the functionality described in this issue will be completed? Would love to be able to manage my s3 buckets and dynamodb tables within my chalice code.

vbloise3 on 15 Feb 2019

Could one have some sort of simpler intermediate solution to address the points raised in the Motivation (one that still makes sense / provides value after this feature is done)? E.g., allowing the user to specify a cloudformation include file that chalice package would automatically pull in? Or does it make sense to allow chalice to generate parametric stacks that could be nested in some bigger stack that has extra resources like the DB etc?

I haven't used either of those CF features, but maybe there's some way to get low-level, but full, coverage across other AWS resources types that way -- if nothing else to get everything deployed/updated together with 1 or 2 CLI commands, but perhaps even with some uni- or bidirectional ability to reference by name between chalice and non-chalice resources.