(Work in progress)
There have been multiple requests for a new mechanism to define and query custom indexes for @model types. This will allow users to implement much more specific and efficient access patterns for data in their applications and fill a number of holes that are not yet supported. This design takes a holistic view of improving the usability of @model types and suggests changes to multiple directives.
First, a little background on DynamoDB indexes. A DynamoDB table by default has a single primary index made up of attributes called the partition key and, optionally, the sort key. The partition key for a row designates which partition (i.e. which physical host) will store the data under the hood, and the sort key optionally specifies how the data should be sorted within that partition. The primary partition/sort key combination identifies a unique row in the table.
DynamoDB exposes three primary operations to read data from a table (or index): GetItem, Query, and Scan.
DynamoDB's design makes it efficient at scale but also makes it relatively inflexible. A single table without any secondary indexes partitions and sorts information by at most two attributes. To design new access patterns into an existing DynamoDB table you have two options:
DynamoDB leverages its internal data structures to provide efficient access at any scale, but, this also comes at the trade-off of flexibility. When designing a DynamoDB table you first need to think about your access patterns and design the index structures around those requirements.
The goal of this design is to allow users to easily define and query custom index structures for @model types. Here are a few primary use cases.
Stretch goals:
https://github.com/aws-amplify/amplify-cli/issues/152
https://github.com/aws-amplify/amplify-cli/issues/487
https://github.com/aws-amplify/amplify-cli/issues/956
https://github.com/aws-amplify/amplify-cli/issues/853
https://github.com/aws-amplify/amplify-cli/issues/902
In PR: #1358
There is currently no support for passing sort key conditions to query operations created by the @connection directive. Key conditions offer a different set of operators than filter expressions, so we are unable to reuse the ModelXFilterInput types that are currently generated. Read more about key conditions in the Amazon DynamoDB docs: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/LegacyConditionalParameters.KeyConditions.html
Given a schema:
```graphql
type Post @model {
  id: ID!
  name: String!
  comments: [Comment] @connection(name: "PostComments", keyField: "postId", sortField: "statusDate")
}

type Comment @model(queries: null) {
  id: ID!
  post: Post! @connection(name: "PostComments", keyField: "postId", sortField: "statusDate")
  postId: ID!
  statusDate: String!
}
```
This work would generate additional input types and update the generated connection resolver.
```graphql
# The new input types.
input ModelStringKeyConditionInput {
  eq: String
  le: String
  lt: String
  ge: String
  gt: String
  beginsWith: String
  between: [String]
}

input ModelIDKeyConditionInput {
  eq: ID
  le: ID
  lt: ID
  ge: ID
  gt: ID
  beginsWith: ID
  between: [ID]
}

input ModelIntKeyConditionInput {
  eq: Int
  le: Int
  lt: Int
  ge: Int
  gt: Int
  between: [Int]
}

input ModelFloatKeyConditionInput {
  eq: Float
  le: Float
  lt: Float
  ge: Float
  gt: Float
  between: [Float]
}

# The updated Post type
type Post {
  id: ID!
  name: String!
  # new statusDate sort key condition input
  comments(
    statusDate: ModelStringKeyConditionInput,
    filter: ModelCommentFilterInput,
    sortDirection: ModelSortDirection,
    limit: Int,
    nextToken: String
  ): ModelCommentConnection
}
```
With this in place, you can create comments with a "statusDate" such as "PUBLISHED_2019-04-25" and then run a query like the one shown below to get a post and its comments published in April 2019.
```graphql
query {
  getPost(id: "1") {
    comments(statusDate: { beginsWith: "PUBLISHED_2019-04" }) {
      items {
        ...
      }
    }
  }
}
```
```graphql
directive @key(name: String, fields: [String!]!) on OBJECT
```
This @key directive simplifies the syntax for creating indexes on @model types. The fields argument specifies a list of fields that will be included as part of the key for the index. The first element in this list always denotes the hash key, and all subsequent fields are joined into a composite sort key. If a name is provided, the directive creates a GSI (or LSI); otherwise the key is assumed to be the primary key.
Use `@key` without supplying a name to specify the table's primary key. You may have one per `@model`.
```graphql
# Table(HashKey = 'email')
type User
  @model
  @key(fields: ["email"]) # add a second field to specify the sort key.
{
  email: String!
  fname: String
  lname: String
}
```
When overriding the primary key for a model, the default `Query.getX` resolver will include a required argument for each member of `fields`. If we define a type
```graphql
type Comment
  @model
  @key(fields: ["postId", "commentId"])
{
  postId: ID!
  commentId: String!
  content: String!
}
```
then the generated `Query.getComment` field will look like:
```graphql
type Query {
  getComment(postId: ID!, commentId: String!): Comment
}
```
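To see why both arguments are required, here is an illustrative sketch (plain Python with hypothetical data): a table with a composite primary key behaves roughly like a dict keyed by a (partition, sort) tuple, so neither half of the key can be omitted on a get.

```python
# Illustrative only: a table with a composite primary key behaves like
# a dict keyed by (partition_key, sort_key). Both parts are required.
comment_table = {
    ("post-1", "comment-1"): {"content": "first!"},
    ("post-1", "comment-2"): {"content": "nice post"},
    ("post-2", "comment-1"): {"content": "hello"},
}

def get_comment(post_id, comment_id):
    # Mirrors Query.getComment(postId: ..., commentId: ...): both
    # arguments are required because together they form the full key.
    return comment_table.get((post_id, comment_id))

print(get_comment("post-1", "comment-2"))  # {'content': 'nice post'}
```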
Use `@key` and supply a name to create a secondary index.
```graphql
# Table(HashKey = 'id', GSIs = [{ HashKey = 'publicationId', SortKey = 'date' }])
type Post
  @model
  @key(name: "ByPublicationByDate", fields: ["publicationId", "date"])
{
  id: ID!
  publicationId: String!
  date: String!
}
```
When building applications with DynamoDB, we use composite keys to create query patterns that filter by more than two fields. Let's say you wanted all published posts for a given publication in 2018. If we were using a relational database, we could store objects like:
```json
{
  "id": "post-1",
  "publicationId": "publication-1",
  "status": "published",
  "date": "2018-12-25"
}
```
and then query them as expected using SQL.
```sql
SELECT *
FROM PostTable
WHERE status = 'published'
  AND publicationId = 'publication-1'
  AND date > '2018' AND date < '2019'
```
DynamoDB, unlike a relational database, is a distributed hash table that scales horizontally but can be efficiently queried by at most two attributes at a time. Composite keys allow us to build more complex query patterns despite this restriction. To implement the same query pattern described above, you can create a composite key called statusDate from the two existing fields status and date. After making the change, we would store objects like:
```json
{
  "id": "post-1",
  "publicationId": "publication-1",
  "status": "published",
  "date": "2018-12-25",
  "statusDate": "published|2018-12-25"
}
```
To get all published posts for a given publication in 2018, we need an index using publicationId as the partition key and statusDate as the sort key. Using this index, we can run the following DynamoDB query to get the published posts for a publication in 2018.
```json
{
  "expression": "publicationId = :pubId AND begins_with(statusDate, :statusDateFilter)",
  "expressionValues": {
    ":pubId": "publication-1",
    ":statusDateFilter": "published|2018"
  }
}
```
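The write-side and read-side halves of this pattern can be sketched in plain Python (illustrative only; in the proposal, the generated resolvers maintain the composite attribute for you):

```python
# Sketch: maintain a composite "statusDate" attribute on write, then
# query it with a begins_with prefix on read. Illustrative only.
def with_composite_key(item):
    # Join the composite members with a separator, as the generated
    # resolver would do behind the scenes.
    item = dict(item)
    item["statusDate"] = f'{item["status"]}|{item["date"]}'
    return item

posts = [with_composite_key(p) for p in [
    {"id": "post-1", "publicationId": "publication-1",
     "status": "published", "date": "2018-12-25"},
    {"id": "post-2", "publicationId": "publication-1",
     "status": "draft", "date": "2018-11-02"},
    {"id": "post-3", "publicationId": "publication-1",
     "status": "published", "date": "2019-01-03"},
]]

def query(publication_id, status_date_prefix):
    # Emulates: publicationId = :pubId AND begins_with(statusDate, :prefix)
    return [p for p in posts
            if p["publicationId"] == publication_id
            and p["statusDate"].startswith(status_date_prefix)]

print([p["id"] for p in query("publication-1", "published|2018")])
```

The separator is what lets the prefix "published|2018" match only 2018 items for that status without also matching, say, a hypothetical "published2" status.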
The fields argument was designed to simplify the process of creating and managing composite keys. The first string in fields specifies the partition key, and all subsequent strings are joined into a composite sort key that is managed automatically by the AppSync resolvers. To implement the query pattern described above, we can use this SDL definition:
```graphql
type Post
  @model
  @key(
    name: "ByPublicationByStatusByDate",
    fields: ["publicationId", "status", "date"]
  )
{
  id: ID!
  publicationId: String!
  status: String!
  date: String
}
```
This will create an index named ByPublicationByStatusByDate on the publicationId attribute and a new statusDate attribute that will be managed for you behind the scenes. The directive will also create a top-level query field:
```graphql
type Query {
  queryPostByPublicationByStatusByDate(
    publicationId: String!,
    status: String,
    date: StringKeyConditionInput,
    filter: ModelPostFilterInput,
    nextToken: String,
    limit: Int
  ): PostConnection
}
```
that is configured to effectively use the custom index. The arguments may seem a bit strange at first glance. Why is publicationId of type String!, status of type String, and date of type StringKeyConditionInput? The publicationId is required because it is the index's partition key and cannot be omitted. The status is the first component of the composite key named statusDate and only supports strict equality. The date is the last component of the composite key named statusDate and supports the full set of key condition operators.
```graphql
query {
  queryPostByPublicationByStatusByDate(
    publicationId: "publication-1",
    status: "published",
    date: { beginsWith: "2018" }
  ) { items { id status date } }
}
```
To enable this query, the queryPostByPublicationByStatusByDate resolver will have this basic shape of a mapping template:
```json
{
  "operation": "Query",
  "index": "ByPublicationByStatusByDate",
  "query": {
    "expression": "publicationId = :pubId AND begins_with(statusDate, :statusDatePrefix)",
    "expressionValues": {
      ":pubId": "$ctx.args.publicationId",
      ":statusDatePrefix": "$ctx.args.status|$ctx.args.date"
    }
  }
}
```
Other Considerations

- Key condition arguments map directly onto DynamoDB key condition operators. In a query such as `queryPostByPublicationByStatusByDate(publicationId: "publication-1", date: { beginsWith: "2018" })`, passing `{ eq: "2018-12-25" }` for `date` instead would result in the `=` operator being used rather than `begins_with`. The same is true for `gt`, `ge`, `lt`, and `le`.
- Changing `fields: ["status", "date"]` to `fields: ["geohash", "status", "date"]` will require an explicit back-fill of existing records. We also need to be careful about how we name attributes in the index and be explicit about the behavior when the fields change.
- CFN templates cannot "update" a GSI in place, and you also cannot remove an existing GSI in the same operation as creating a new GSI (as we have seen before in previous issues). My recommendation would be that in the future you opt towards creating new `@key`s, back-filling them with data using the existing `@key` and whatever new attributes are needed, and then removing the old `@key` when you are finished and certain that the new access pattern is working.

(WIP. Update with `@connection`s or replacement to view connected behavior.)
Let's try to implement a non-trivial example of an application with these access patterns:
```graphql
type Customer
  @model
  @key(fields: ["email"])
{
  email: String!
  orders: OrderConnection
}

type Order
  @model
  @key(name: "ByCustomerByDate", fields: ["customerEmail", "orderDate"])
  @key(name: "ByVendorByDate", fields: ["vendorId", "orderDate"])
{
  id: ID!
  customerEmail: String!
  orderDate: AWSDateTime!
  vendorId: String!
}

type Driver
  @model
  @key(fields: ["email"])
  @key(name: "ByStatusByLocation", fields: ["status", "location"])
{
  email: ID!
  location: String!
  status: String!
}

type Vendor @model {
  id: ID!
  name: String!
}

type Item
  @model
  @key(name: "ByDriverByDate", fields: ["driverEmail", "orderDate"])
{
  id: ID!
  driverEmail: String!
  orderDate: String!
}
```
Compare this against option a. This design comes from an earlier version.
The first problem to tackle is that of defining the indexes themselves. To solve this problem, I propose introducing a new directive @index.
```graphql
directive @index(
  name: String!,
  key: IndexKey!,
  projection: IndexProjection = { type: ALL },
  # If provided, we generate a top-level query field for the index.
  queryField: String
) on OBJECT

input IndexKey {
  partition: String!
  sort: String
}

input IndexProjection {
  type: IndexProjectionType!
  # If using the INCLUDE projection type, specify which attributes to include in the index.
  attributes: [String]
}

enum IndexProjectionType {
  ALL
  KEYS_ONLY
  INCLUDE
}
```
This new directive and its input types can be used to configure the secondary index structures of an @model type.
```graphql
type Post
  @model
  @index(
    name: "ByBlog",
    key: { partition: "blogId", sort: "createdAt" }
  )
  @index(
    name: "ByCategory",
    key: { partition: "category", sort: "createdAt" },
    queryField: "listPostsByCategory"
  )
  @index(
    name: "RecentlyUpdated",
    key: { partition: "blogId", sort: "updatedAt" },
    queryField: "listPostsByRecentlyUpdated"
  )
{
  id: ID!
  blogId: ID!
  title: String
  category: String
  createdAt: String!
  updatedAt: String!
  rating: Int!
}
```
```graphql
type Query {
  # Would query the index & filter on the auth rules in the response mapping template.
  listPostsByCategory(
    category: String!,
    createdAt: StringModelSortKeyInput,
    filter: StringModelFilterInput,
    nextToken: String,
    limit: Int
  ): PostConnection

  # Would query the index & filter on auth rules in the response mapping template.
  listPostsByRecentlyUpdated(
    blogId: ID!,
    updatedAt: StringModelSortKeyInput,
    filter: StringModelFilterInput,
    nextToken: String,
    limit: Int
  ): PostConnection
}
```
The above definition would create the following:
Alongside the power to define indexes must come the power to query them. To do this, I suggest adding a new argument to the @connection directive
```graphql
directive @connection(
  name: String,
  useIndex: String,
  keyField: String,
  sortField: String
) on FIELD_DEFINITION
```
This directive allows you to create connection fields that target a custom index instead of creating an index on your behalf.
For example, you might have a schema like this:
```graphql
type Comment
  @model
  @index(name: "ByPost", key: { partition: "postId", sort: "createdAt" })
{
  id: ID!
  postId: ID!
  content: String!
  createdAt: String
}

type Post @model {
  id: ID!
  comments: [Comment] @connection(useIndex: "ByPost")
}
```
The compiled output of this schema would include the following pieces:
```graphql
type Post {
  id: ID!
  comments(
    createdAt: StringSortKeyExpression,
    filter: ModelCommentFilterExpression,
    nextToken: String,
    limit: Int
  ): CommentConnection
}
```
The @connection directive will automatically pass the `id` value from the parent object as the partition key in the underlying query resolver. The Post.comments field would configure a DynamoDB Query resolver against the Comment table and would automatically pass the post's `id` value as the `postId` (i.e. the partition key) in the query.
Alongside this work, we will be sprucing up the @auth directive to support more advanced authorization use cases. One of the first things to change will be that @auth will now be supported on FIELD_DEFINITIONs which enables you to specify auth rules for @connection fields.
For example, you could protect the Post.comments field so that you could only see them if you were the owner of the “Post” by doing the following:
```graphql
type Comment
  @model
  @index(name: "RecentCommentsByPost", key: { partition: "postId", sort: "createdAt" })
  @index(name: "CommentsByOwnerByPost", key: { partition: "postId", sort: "owner" })
{
  id: ID!
  postId: ID!
  content: String!
  createdAt: String
  owner: String
  # You can omit the useIndex on the single side of the relationship because
  # the key is stored on this object. Providing the useIndex value is allowed
  # as long as it is the same as the other side of the connection.
  post: Post @connection
}
```
```graphql
type Post @model {
  id: ID! # This id will be forwarded as the hash key of the comments connection query.
  owner: String
  # The comments resolver will only run if the logged-in user is the owner of the post.
  comments: [Comment]
    @connection(useIndex: "RecentCommentsByPost")
    @auth(rules: [{ allow: owner }])
}
```
With this configuration, if the currently logged in user is not the owner of the post then the comments resolver will return null.
We have seen a number of GitHub issues resulting from a DynamoDB limit that specifies that you cannot remove a GSI and create a new GSI in a single CloudFormation stack update. In terms of the transform this means that you get an error when moving from:
```graphql
type Post @model {
  id: ID!
  title: String
  comments: [Comment] @connection
}

type Comment @model {
  id: ID!
  content: String
}
```
to
```graphql
type Post @model {
  id: ID!
  title: String
  comments: [Comment] @connection(name: "PostComments")
}

type Comment @model {
  id: ID!
  content: String
  post: Post @connection(name: "PostComments")
}
```
The reason for the error is that the details of the underlying GSI have changed and although there will still only be 1 GSI defined on the CommentTable, the structure of this GSI is different. From DynamoDB's perspective this operation looks like a GSI needs to be removed and a new GSI needs to be created in a single stack update and thus the update fails due to the limitation. To avoid this issue from happening in the future, I want to propose a variation to how the @connection directive could work.
This is the current definition of the @connection directive.
```graphql
directive @connection(name: String, keyField: String, sortField: String) on FIELD_DEFINITION
# name - If provided, used to name the underlying GSI
# keyField - If provided, specifies the partition key of the GSI
# sortField - If provided, specifies the sort key of the GSI
```
With the introduction of the @index directive, there is technically no need for the @connection directive to configure GSIs. Instead, we can allow the @index directive to define the indexes and the @connection directive to wire up resolvers that use the index.
If we were to consider going down the path where @connection no longer configures indexes, I would suggest a new definition for the @connection directive:
```graphql
directive @connection(useIndex: String, sourceField: String) on FIELD_DEFINITION
# useIndex - Specify the index to query to find results. Only valid when a @connection returns a list, e.g. [Comment].
# sourceField - Specify which field on $ctx.source should be forwarded as the HASH key to the query on the index. Only valid when also providing "useIndex".
```
Alternative names for sourceField might be "hashField", "hashExpression", or "indexHash". Another option is to allow VTL expressions so that you can create compound hash keys from multiple fields for overloading reasons, e.g. something like `Post-${ctx.source.id}`.
The purpose of the @connection directive is to allow you to define relationships between @model types. In DynamoDB there are two primary ways to achieve this:
At its simplest, the @connection directive can help configure a situation as described in the first bullet point above. For example, we might have a schema with a one-directional connection such that a Post object contains pointers to a list of items.
```graphql
type Post @model {
  id: ID!
  title: String
  comments: [Comment] @connection
}

type Comment @model {
  id: ID!
  content: String
}
```
Assuming that we are operating under the new definition, the Post.comments field would store an inline list of primary keys in the Post table. To DynamoDB, a Post object might look like:
```json
{
  "id": { "S": "post1" },
  "title": { "S": "some title" },
  "comments": { "SS": ["comment1", "comment2"] }
}
```
This can be generalized to work with DynamoDB tables that use a compound primary key as well. For example, we could have:
```json
{
  "id": { "S": "post1" },
  "title": { "S": "some title" },
  "comments": { "L": [{ "M": { "HashKey": "hashkey1", "SortKey": "sortkey1" } }] }
}
```
With this setup, the Post.comments field can be resolved using a BatchGetItem operation. DynamoDB's BatchGetItem operation is limited to 100 items and does not include any pagination concept, so we would either need to build one in resolver logic OR introduce the limitation that the simple version of @connection only works if you are connecting fewer than 100 items (I think we can figure out the pagination scenario, but this is worth bringing up). The compiled output of this schema would include a Post type like this:
```graphql
type Post {
  id: ID!
  title: String
  # Note there are no filter arguments here.
  comments(
    limit: Int,
    nextToken: String
  ): CommentConnection
}
```
Here is pseudo-code for the new Post.comments resolver:
```vtl
## Post.comments.req.vtl
## Custom pagination logic that slices $ctx.source.comments based on
## $ctx.args.limit & $ctx.args.nextToken goes here ...
#set($keys = $ctx.source.comments)
{
  "operation": "BatchGetItem",
  "tables": {
    "CommentTable": {
      "keys": $util.toJson($keys),
      "consistentRead": true
    }
  }
}
```

```vtl
## Post.comments.res.vtl
$util.toJson($ctx.result.data.CommentTable)
```
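One way the custom pagination logic could work is sketched below in plain Python. It assumes nextToken is simply a stringified offset into the stored key list; that is an assumption of this sketch rather than a settled design (a real implementation would likely use an opaque token):

```python
# Sketch of paginating an inline key list before a BatchGetItem call.
# Assumes nextToken is a stringified offset; illustrative only.
def page_keys(all_keys, limit, next_token=None):
    start = int(next_token) if next_token else 0
    end = min(start + limit, len(all_keys))
    page = all_keys[start:end]
    # Only hand back a token when there are more keys to fetch.
    new_token = str(end) if end < len(all_keys) else None
    return page, new_token

keys = [f"comment{i}" for i in range(1, 8)]  # 7 connected comments
page1, token = page_keys(keys, limit=3)
page2, token = page_keys(keys, limit=3, next_token=token)
print(page1, page2, token)
```

Each page of keys would then be passed to a BatchGetItem request, keeping every request safely under the 100-item limit.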
In the case where the @connection field returns a single value, we use a GetItem operation as we do today.
In a situation where an object might be connected with a large number of other objects then you would want to use an index to store information about the relationship. Again, there are a bunch of ways to configure an index to store relationship info, but even a simple setup can serve you well. The goal is to have an index where the hash key contains a value that points back to the source of the relationship.
```graphql
type Post @model {
  id: ID!
  title: String
  comments: [Comment] @connection(useIndex: "ByPostId")
}

type Comment @model @index(name: "ByPostId", key: { partition: "postId", sort: "createdAt" }) {
  id: ID!
  content: String
  postId: ID!
  createdAt: String
}
```
In this example, the CommentTable has an index named "ByPostId" with a hash key "postId". To get all the comments for a single post, we can use a DynamoDB Query operation against the "ByPostId" index and pass the id of the post as the hash key in the query. When using an index, the pagination and filter expressions on @connection fields will work like they do today, but we can do even better by adding a sort key condition argument as well. The compiled output of the above schema would include a Post type like this:
```graphql
type Post {
  id: ID!
  title: String
  comments(
    # Sort key conditions specify where within a partition a Query operation should start reading.
    # This is different than "filter", which is applied after reading from the partition.
    createdAt: StringModelSortKeyInput,
    filter: StringModelFilterInput,
    nextToken: String,
    limit: Int
  ): CommentConnection
}
```
The last thing to discuss is the "sourceField" argument. The purpose of this argument is to allow you to specify which value should be passed as the hash key in the @connection field's Query operation. By default, @connection will always use `$ctx.source.id`, but this may not be the desired behavior. Here is a quick example.
```graphql
type User @model {
  id: ID!
  username: String!
  posts: [Post] @connection(useIndex: "ByUsername", sourceField: "username")
}

type Post
  @model
  @auth(rules: [{ allow: owner }])
  @index(name: "ByUsername", key: { partition: "owner", sort: "createdAt" })
{
  id: ID
  # contains the username
  owner: String
  createdAt: String
}
```
By default, the User.posts resolver would pass `$ctx.source.id` as the hash key, but we have told it to instead use `$ctx.source.username`. This allows us to use the Post.owner attribute to implement the ownership authorization check as well as act as the joining attribute in the GSI.
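The hash key selection described above can be sketched as follows (a hypothetical helper in plain Python; the real logic would live in the generated VTL):

```python
# Sketch: how @connection might choose the hash key value to forward
# into the index query. "source" plays the role of $ctx.source.
def connection_hash_key(source, source_field=None):
    # By default the parent's "id" is forwarded; sourceField overrides it.
    return source[source_field] if source_field else source["id"]

user = {"id": "user-1", "username": "mike"}
print(connection_hash_key(user))                            # user-1
print(connection_hash_key(user, source_field="username"))   # mike
```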
Please leave your comments and concerns below. Thanks.
Thanks for your time and effort, but I believe none of these changes would tackle issue #1039. Please let me re-explain my issue.
Let's say I have a "Post" type as following:
```graphql
type Post @model {
  id: ID!
  title: String!
  editors: [PostEditor] @connection(name: "PostEditors")
  categories: [PostCategory] @connection(name: "PostCategories")
  likes: [PostLike] @connection(name: "PostLikes")
}
```
The "editors", "categories" and "likes" have MANY-TO-MANY relationships with "Writer", "Category" and "Reader" types respectively.
I can easily get:
with something like this:
```graphql
query {
  listWriters(filter: { name: { eq: "X" } }) {
    items {
      posts {
        items {
          post {
            title
          }
        }
      }
    }
  }
}
```
But my issue is how I can combine these filters: I need to get the posts that are written by writer X, in category Y, and liked by reader Z.
I believe this use case is very common in many apps and websites, think about things like marketplaces which have many filters based on many criteria.
My example has three filters but you can think of any number of filters someone may have.
It's possible (and I hope) that the issue is much simpler than how I am thinking about it.
I really appreciate your effort.
As DynamoDB supports indexes only on String, Number, and Binary attributes, I would like to know how Boolean is handled.
```graphql
type Comment
  @model
  @index(name: "BySomeCondition", key: { partition: "isSomeCondition", sort: "createdAt" })
{
  id: ID!
  postId: ID!
  content: String!
  isSomeCondition: Boolean!
  createdAt: String
}
```
One possibility I can imagine is to use a so-called "sparse index" that writes the string "true" when SomeCondition is true; in the case of Boolean!, write "false" if SomeCondition is false too.
The @index proposal is exactly what we were looking for. Just want to make sure that @model will also allow the partition and sort key to be specified.
I am not sure how creating a GSI out of

```graphql
input IndexKey {
  partition: String!
  sort: String
}
```

helps? We need to specify a sort key when creating the primary key. GSIs and local secondary indexes are good-to-haves, but being able to fully specify a primary key (partition key + sort key) is the minimum to have a useful DynamoDB table.
I think many people will be disappointed if the result of all those people asking for a sort key ends up being a system that auto creates GSI but does not allow for composite Primary Key.
This proposal looks great, really looking forward to this!
@abualsamid @dougmlee I will update the RFC with more details for options to enable custom primary key structures.
@kftsehk Do you have a requirement for boolean fields in keys/indexes? The plan was to fail at build time when trying to use an invalid attribute type in an index.
@ialmoqren I will update the RFC to address these issues. In general, I would like to improve this proposal to handle using compound keys in indexes for advanced use cases.
For example, in your case you want the posts written by writer X, in category Y, and liked by reader Z. In DynamoDB, you could configure this in a few ways, but one option is to create a table like this:
```
Table: Objects
PK:
  - HashKey: ID
GSI1:
  - HashKey: LikedByReader
  - SortKey: WriterId/Category/PostId
```
The table might have contents that look like:
| ID | LikedByReader | WriterId/Category/PostId | Writer | Category |
| --- | --- | --- | --- | --- |
| Post1 | - | - | Writer1 | Fiction |
| Post2 | - | - | Writer2 | Suspense |
| Like1 | User1 | Writer1/Fiction/Post1 | - | - |
| Like2 | User1 | Writer2/Suspense/Post2 | - | - |
You would be able to efficiently get the posts you want by querying GSI1 for rows where the HashKey = "liking-user-id" and the SortKey begins_with "WriterId/Category". The goal is to support these use cases, and I will update the RFC with more details on how this will work.
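That access pattern can be sketched in plain Python (illustrative rows matching the table above, not a real DynamoDB call):

```python
# Sketch of querying the overloaded GSI1: HashKey = LikedByReader,
# SortKey = "WriterId/Category/PostId". A begins_with condition on the
# sort key narrows by writer and category in a single query.
rows = [
    {"ID": "Like1", "LikedByReader": "User1",
     "SortKey": "Writer1/Fiction/Post1"},
    {"ID": "Like2", "LikedByReader": "User1",
     "SortKey": "Writer2/Suspense/Post2"},
]

def liked_posts(reader, sort_key_prefix):
    # Emulates: LikedByReader = :reader AND begins_with(SortKey, :prefix)
    return [r["SortKey"].split("/")[-1] for r in rows
            if r["LikedByReader"] == reader
            and r["SortKey"].startswith(sort_key_prefix)]

print(liked_posts("User1", "Writer1/Fiction"))  # ['Post1']
```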
@ialmoqren It sounds like your use case is stretching the limits of NoSQL and DynamoDB. For the type of advanced filtering that you want to do, you might want to consider a relational database, and/or using a Lambda/server, and/or client-side filtering.
I agree that you're talking about a common use case, but I think that people solve that use case by using SQL and relational databases. Also, it might be your lucky day because Amplify just announced support for Aurora (https://aws.amazon.com/about-aws/whats-new/2019/04/aws-amplify-announces-new-amazon-aurora-serverless--graphql--and/)
Is there an idea of when custom indexes will be available? We need this very urgently.
For those interested in relational data modeling in DynamoDB I recommend this presentation from re:invent 2018. https://www.youtube.com/watch?v=HaEPXoXVf2k
> For those interested in relational data modeling in DynamoDB I recommend this presentation from re:invent 2018. https://www.youtube.com/watch?v=HaEPXoXVf2k
Thanks Michael, this presentation is exactly why we need custom indexes in Amplify. All the advanced DynamoDB sessions at re:Invent tell you how you should/can model your whole application using a single Dynamo table by leveraging adjacency lists and overloading indexes. But all the Amplify examples go in the opposite direction by creating a separate table, with a single primary key, for every object in your model.
@abualsamid - see https://github.com/aws-amplify/amplify-cli/issues/431#issuecomment-444303514 for @mikeparisstuff's answer on why multiple dynamo db tables are created.
Eager to see this feature accepted 🥇
Exactly what I have been looking for
I would pay to get this feature now. How can we move this faster?
> I would pay to get this feature now.

Getting it now most likely would mean building a sort of polyfill for @index yourself. Or as a workaround, you can extend the functionality of the @model transform to take additional parameters that pass the information needed for the GSI to be created by CloudFormation. Here is precisely where the CloudFormation template for the @model table is created:
https://github.com/aws-amplify/amplify-cli/blob/84b3d93b67afa99dec482224d33238e7012ed389/packages/graphql-dynamodb-transformer/src/DynamoDBModelTransformer.ts#L131-L134
which generates CloudFormation template:
You would want to pass IndexName, KeySchema, and ProjectionType, and add a GlobalSecondaryIndexes key to the DynamoDB.Table above, so that the generated CloudFormation template will have the GlobalSecondaryIndexes attribute look the following way:
"GlobalSecondaryIndexes": [
{
"IndexName": "GSI",
"KeySchema": [
{
"AttributeName": "TicketSales",
"KeyType": "HASH"
}
],
"Projection": {
"ProjectionType": "KEYS_ONLY"
},
"ProvisionedThroughput": {
"ReadCapacityUnits": 5,
"WriteCapacityUnits": 5
}
}]
To make this easier, one good way would be to write a unit test in `graphql-dynamodb-transformer/src/__tests__` and debug the jest runner in VS Code with a `launch.json` like:
```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Debug DynamoDB Transformer Tests",
      "type": "node",
      "request": "launch",
      "runtimeArgs": [
        "--inspect-brk",
        "${workspaceRoot}/packages/graphql-dynamodb-transformer/node_modules/.bin/jest",
        "--runInBand",
        "--no-watchman"
      ],
      "console": "integratedTerminal",
      "internalConsoleOptions": "neverOpen",
      "cwd": "${workspaceRoot}/packages/graphql-dynamodb-transformer"
    }
  ]
}
```
Hope this helps
> How can we move this faster?

I personally see it this way: the community (us) should create a PR on plugin support for graphql-transformers. Then we can develop a set of stand-alone transformers outside of core amplify-cli. Given that the missing functionality has such high demand, maybe it makes sense for the community to create an umbrella repository under something like amplify-community, with a community laid-out roadmap for the graphql transformer. Then as a community we can manage and develop a set of transformers and use them until (or in addition to when) they land in amplify-cli.
All - We are actively working on this, and it's high on our radar. We appreciate your patience.
I have updated the design with a new section under the header "Item 2: The @key directive". Please take a look and offer feedback. @ambientlight @hisham @abualsamid @davekiss @buggy @nagad814 @dougmlee et al.
We are trying to develop a solution that solves all the necessary use cases while removing as much implementation specific detail and conceptual overload as possible. The new design tackles the use case of defining custom primary keys and goes a step further to simplify the process of creating composite keys and configuring resolvers that use them effectively. We are not yet planning to move from a multiple table setup to a single table setup (for reasons mentioned here https://github.com/aws-amplify/amplify-cli/issues/431#issuecomment-444303514) but can work towards offering an option that allows you to move to the single table design once your application's access patterns are fully understood.
@mikeparisstuff I'm certainly not a pro db architect, so pardon any missteps here.
I read through the @key directive proposal twice, and the first time through it all felt fine up to the Composite Keys section. I've used an RDS on most of my projects, so I'm still relatively new to the technical challenges that come along with querying a NoSQL store, but I'll say that storing/managing duplicate data on a record to effectively query it doesn't feel very natural.
I recently came off of a Google Cloud Firestore project, and querying against their document store (while not GraphQL) still seemed to work in a clean manner without discernible changes to the document fields. Dunno what their implementation looks like but might be worth taking a look.
There are probably good reasons against this approach, but I wonder what it'd look like to have the records fetched by the first argument (assuming that takes care of most of the performance concerns/grunt work) and then filtered down by pipeline resolvers afterwards. For example, queryPostByPublicationByStatusByDate would grab the posts by publication, filter them down by status in a pipeline resolver while updating the record count, pass that to the next resolver to filter down by date and update the record count and nextToken, and then return the result as a response.
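The pipeline idea described above could look roughly like this in Python. This is a hypothetical illustration of chained resolver-style filters, not AppSync's actual pipeline resolver API:

```python
from typing import Callable, Iterable

# A fetched record, e.g. {"publication": "...", "status": "...", "date": "..."}
Post = dict

def fetch_by_publication(posts: Iterable[Post], publication: str) -> list:
    """Stage 1: the indexed fetch -- cheap because it hits the partition key."""
    return [p for p in posts if p["publication"] == publication]

def make_filter_stage(field: str, value: str) -> Callable[[dict], dict]:
    """Build a pipeline stage that narrows the items and refreshes the count."""
    def stage(ctx: dict) -> dict:
        items = [p for p in ctx["items"] if p[field] == value]
        # A real resolver would also recompute nextToken here.
        return {"items": items, "count": len(items)}
    return stage

def run_pipeline(ctx: dict, stages: Iterable[Callable[[dict], dict]]) -> dict:
    for stage in stages:
        ctx = stage(ctx)
    return ctx
```

The trade-off this hides is that every stage after the first filters data that has already been read (and paid for), which is exactly the cost that up-front key design tries to avoid.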
However, the second read-through left me with a different sentiment. If AppSync is managing all of this stuff behind the scenes, is effective and performant, and I don't have to touch anything, then _who cares what it does?_ My app gets the data it needs, my customers go along their way, everyone is happy. Rick even demonstrated that point in his presentation.
The tricky part is that AppSync seems to be at an inflection point: it allows or suggests that you use the defaults, and gives you the ability to override those defaults when you need greater customization, which typically sounds like a huge plus. But as the defaults become more and more complex, it becomes somewhat difficult to manage them manually any time you wish to make a small (or larger) tweak.
The VTL templates start looking pretty gnarly as more of these directives are added to the schema; some of them I'm already apprehensive to touch, and I'm only a month into my AppSync project.
@mikeparisstuff Item 2: The @key directive (option a) looks very very nice :)
Any idea when it will be implemented?
@mikeparisstuff Could we add support for modifiers to field values? I imagine something like "field|modifier1|modifier2|...". This would allow "field|lowercase" or "field|uppercase" to create case insensitive indexes which makes lists appear natural (all the A's and a's together) plus it makes queries easier.
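One way the suggested "field|modifier" chain could be realized under the hood (a hypothetical sketch, not part of the actual @key implementation) is to store a normalized shadow attribute alongside the original field and build the index on the shadow. The `__key` suffix and modifier names are assumptions taken from the comment above:

```python
def apply_modifiers(value: str, spec: str) -> str:
    """Apply a 'field|mod1|mod2' style modifier chain to a field value.

    Only 'lowercase' and 'uppercase' are sketched here, purely as an
    illustration of the proposal above.
    """
    mods = {"lowercase": str.lower, "uppercase": str.upper}
    _, *chain = spec.split("|")
    for mod in chain:
        value = mods[mod](value)
    return value

def put_with_shadow_key(item: dict, field: str, spec: str) -> dict:
    """Keep the original field untouched and add a normalized, indexable copy."""
    item = dict(item)
    item[f"{field}__key"] = apply_modifiers(item[field], spec)
    return item
```

Indexing the shadow attribute gives case-insensitive sorting ("Apple" and "apple" land together) while responses still return the original value.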
We launched support for custom indexes today. You can find the docs here: https://aws-amplify.github.io/docs/cli/graphql#key
Please let us know if you're still not able to solve your problem through this solution and we'll re-open this issue for you.
this implementation is exactly what i needed, thank you
The main issue I'm running into is that I have 125 GSIs across 35 tables in my schema.graphql.
I have defined them all with the new @key directive using the latest version of the Amplify CLI (1.7.0). It compiles successfully with "amplify api gql-compile", but when I push it to CloudFormation it fails with "Cannot perform more than one GSI creation or deletion in a single update".
This is the major issue I'm running into: there does not appear to be any way within Amplify or CloudFormation to pause the deployment while waiting for a GSI creation to complete. You can use a DependsOn attribute for a resource (i.e. a table), but a GSI is defined within a resource, so you can't pause the creation of GSIs waiting for previous GSIs to complete. Something like this is what I'd want to be able to express:
```json
"GlobalSecondaryIndexes": [
  {
    "IndexName": "gsi-PlantAsset"
  },
  {
    "IndexName": "gsi-PlantPhoto",
    "DependsOn": "gsi-PlantAsset"
  },
  {
    "IndexName": "gsi-PlantPackage",
    "DependsOn": "gsi-PlantPhoto"
  }
]
```
All ideas are gratefully received. We can define everything in Amplify, but none of it is deployable because you can't pause the deployment of the GSIs that are defined.
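One workaround sketch (an assumption on my part, not an Amplify feature) is to apply GSI changes outside CloudFormation: diff the desired indexes against the existing ones and submit one UpdateTable call per index, waiting for each to finish before the next. The planning half of that is pure and shown below; the apply half is only hinted at in comments:

```python
def plan_gsi_updates(existing: set, desired: set) -> list:
    """Return one (action, index_name) step per UpdateTable call.

    DynamoDB allows only one GSI creation or deletion per update, so each
    step must be applied and waited on (index status ACTIVE, or the index
    gone after a delete) before the next step is submitted.
    """
    steps = [("delete", name) for name in sorted(existing - desired)]
    steps += [("create", name) for name in sorted(desired - existing)]
    return steps

# Each step would then be applied roughly like this (boto3, not executed here):
#   client.update_table(TableName=..., AttributeDefinitions=...,
#                       GlobalSecondaryIndexUpdates=[<one Create or Delete>])
#   then poll client.describe_table(...) until the index reaches ACTIVE.
```

Deletes are planned before creates so a rename (delete old, create new) never trips the one-change-per-update limit mid-sequence.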
Do you have a requirement for boolean fields in keys/indexes?
@mikeparisstuff I have a case where I want an isArchived: Boolean field on an object and to only return objects where it is false. I know I can filter on the client, but that's a lot of data to pay for (AWS bill and payload) that will never be used.
@sacrampton your issue is discussed, here: #922
@aireater - native DynamoDB (without AppSync) can't use a boolean attribute as an index key, and since AppSync is just a layer on top of DynamoDB, there is no way to do it there either. The solution I use for this is to make the table (type) @searchable so that the data streams into Elasticsearch, then run a search query, and it works fine. As a general rule, any time I need to filter I get the data from Elasticsearch.
@aireater Isn't this ideal scenario for sparse indexes?
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-general-sparse-indexes.html
Thanks @sceptyk, that makes sense. I'm going to change my boolean to a string that tracks an enum status field.
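The sparse-index pattern suggested above can be sketched like this (attribute names are hypothetical). The key fact is that DynamoDB only places an item into a GSI when the index key attribute exists on the item:

```python
def to_item(post: dict) -> dict:
    """Build the DynamoDB item, setting the sparse GSI key only on active posts.

    Because archived posts never get the 'active' attribute, a GSI keyed on
    'active' contains only the non-archived posts; querying that index never
    reads or pays for archived data.
    """
    item = {"id": post["id"], "title": post["title"]}
    if not post.get("isArchived"):
        item["active"] = "ACTIVE"  # hypothetical sparse GSI partition key
    return item
```

Dave's enum-status string works the same way but keeps all items in the index, partitioned by status value; the sparse variant drops the unwanted items from the index entirely.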