Is your feature request related to a problem? Please describe.
The major draw of GraphQL is that you can query for only the data that you need. This still saves bandwidth for data going in and out of AWS, because all scan-filtering and querying happens on the server side, but there is no such optimization for the AppSync <-> DynamoDB communication.
Because Amplify generates GSIs that project all attributes, reading from or writing to an index consumes more RCUs and WCUs than necessary. For a customer at scale, this missing optimization could become very costly.
Describe the solution you'd like
Add a list field to specify projected attributes on @key directives.
Describe alternatives you've considered
There are not many options besides creating a GSI outside of Amplify and overriding all Amplify-generated resolvers to reference that index.
@jkeys-ecg-nmsu could you please give an example of what exactly you are thinking of? @key supports a list of fields that will be used to build up the keys for the GSI.
@attilah sorry if my feature request was unclear. Basically, whenever the @key directive provisions a GSI, it projects all attributes to that index.
Since most access patterns relying on custom keys won't need to project all attributes, there should be another field that allows the user to set which fields to project.
Let's say we have this schema:
type User @model
@key(fields: ["id"])
@key(name: "userById", fields: ["id"])
@key(name: "userByExternalPrimaryKey", fields: ["externalPrimaryKey"]) {
id: ID!
externalPrimaryKey: String!
largeBlob1: AWSJSON #only internal APIs will need field
largeBlob2: AWSJSON #only external APIs ("externalPrimaryKey") will need field
}
So here id is a unique identifier for a user in our internal system, and externalPrimaryKey is a UUID for the same user provided by an external system. We know our access pattern is that anyone querying by id will never need largeBlob2, and we know that anyone querying by externalPrimaryKey will never need largeBlob1.
The way Amplify currently works, it will project attribute largeBlob2 to the userById index, and it will project largeBlob1 to the userByExternalPrimaryKey index. This leads to consuming more RCUs than necessary when querying, and more WCUs than necessary when mutating (my assumption is that when the table is projecting to indices, it consumes WCUs proportional to the amount of data in the projected attributes for that GSI).
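To put rough numbers on that, here is a back-of-the-envelope sketch. It relies on the documented DynamoDB rounding rules (GSI reads are eventually consistent and cost 0.5 RCU per 4 KB read; writes cost 1 WCU per 1 KB). The attribute sizes are made up for illustration, and the write figure only models putting one new item into the index, ignoring per-item index overhead and updates that change an index key.

```python
import math

KB = 1024


def gsi_query_rcus(returned_bytes: int) -> float:
    """RCUs for a GSI Query: the total size of the items read is rounded up
    to the next 4 KB, and GSI reads are eventually consistent (half cost)."""
    return math.ceil(returned_bytes / (4 * KB)) * 0.5


def gsi_write_wcus(index_item_bytes: int) -> int:
    """WCUs to put one new item into a GSI: 1 WCU per 1 KB, rounded up."""
    return math.ceil(index_item_bytes / KB)


# Hypothetical attribute sizes, for illustration only.
SMALL_ATTRS = 200        # id + externalPrimaryKey
LARGE_BLOB_1 = 40 * KB   # largeBlob1
LARGE_BLOB_2 = 40 * KB   # largeBlob2

project_all = SMALL_ATTRS + LARGE_BLOB_1 + LARGE_BLOB_2
project_needed = SMALL_ATTRS + LARGE_BLOB_1  # what userById actually needs

for label, size in [("ALL", project_all), ("INCLUDE", project_needed)]:
    print(f"{label:8} read: {gsi_query_rcus(size)} RCU, "
          f"write: {gsi_write_wcus(size)} WCU")
```

With these assumed sizes, a single-item query on userById drops from 10.5 to 5.5 RCUs and the index write from 81 to 41 WCUs once largeBlob2 is no longer projected.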
The way it should work is this:
type User @model
@key(fields: ["id"])
@key(name: "userById", fields: ["id"], projectedAttributes: ["id","externalPrimaryKey", "largeBlob1"])
@key(name: "userByExternalPrimaryKey", fields: ["externalPrimaryKey"], projectedAttributes: ["id","externalPrimaryKey", "largeBlob2"]) {
id: ID!
externalPrimaryKey: String!
largeBlob1: AWSJSON #only internal APIs will need field
largeBlob2: AWSJSON #only external APIs ("externalPrimaryKey") will need field
}
Then the GraphQL transformer only projects the attributes listed in projectedAttributes to the created GSIs.
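For what it's worth, here is a minimal sketch of how such an argument could map onto the GSI definition in the generated CloudFormation template. This is not the transformer's actual code: build_gsi and its parameters are hypothetical names, and only the output shape (IndexName, KeySchema, Projection with ProjectionType / NonKeyAttributes) follows the AWS::DynamoDB::Table resource. Table and index key attributes are always projected by DynamoDB, so the sketch filters them out of NonKeyAttributes. It only handles one or two key fields.

```python
import json


def build_gsi(index_name, key_fields, projected_attributes=None,
              table_key_fields=("id",)):
    """Build a CloudFormation-style GSI definition (hypothetical helper).

    projected_attributes=None keeps today's behaviour (project everything).
    """
    if not 1 <= len(key_fields) <= 2:
        raise ValueError("this sketch supports a hash key and an optional range key")

    key_schema = [
        {"AttributeName": field, "KeyType": key_type}
        for field, key_type in zip(key_fields, ("HASH", "RANGE"))
    ]

    if projected_attributes is None:
        projection = {"ProjectionType": "ALL"}
    else:
        # Key attributes are always projected, so they are not listed again.
        always_projected = set(key_fields) | set(table_key_fields)
        non_key = [a for a in projected_attributes if a not in always_projected]
        if non_key:
            projection = {"ProjectionType": "INCLUDE", "NonKeyAttributes": non_key}
        else:
            projection = {"ProjectionType": "KEYS_ONLY"}

    return {"IndexName": index_name, "KeySchema": key_schema, "Projection": projection}


gsi = build_gsi(
    "userByExternalPrimaryKey",
    ["externalPrimaryKey"],
    projected_attributes=["id", "externalPrimaryKey", "largeBlob2"],
)
print(json.dumps(gsi, indent=2))
```

Leaving projectedAttributes off falls back to ProjectionType ALL, so existing schemas would keep their current behaviour.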
Does that use case make sense?
Edit: fix typo.
Maybe this AWS resource can explain what I am asking for better than I can.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.Projections
"If you need to access just a few attributes with the lowest possible latency, consider projecting only those attributes into a global secondary index. The smaller the index, the less that it costs to store it, and the less your write costs are."
[...]
"If you need to access most of the non-key attributes on a frequent basis, you can project these attributes—or even the entire base table— into a global secondary index. This gives you maximum flexibility. However, your storage cost would increase, or even double."
So projecting the entire table into an index can be very costly. I hope that helps explain the ask.
I'm one more user needing a GSI with projectedAttributes. Right now I have had to create it "on the side" in the AWS console, but it would be great to include it in the Amplify AppSync stack.
@oyof8376 I think one way the Amplify team prioritizes their backlog is by tracking number of +1s on issues, so please +1 this issue.
We need this feature too. Here is our scenario:
We have a large field of about 40 KB that is rarely used, and we don't want to keep this field in all the indexes. We would like to exclude it from the indexes. The current workaround is to store this field in S3.
We could use this feature as well. Projecting all attributes is fine when the data set is small but at scale this is going to cost way more money for data that we don't need...
+1 for this feature. I have a large data set and would like to not project all attributes into each index.
+1. For now this can be achieved through the DynamoDB web console by manually creating global secondary indexes, but it would be nice to achieve it directly in the schema definition.
+100
+1000
"+1. For now this can be achieved through the DynamoDB web console by manually creating global secondary indexes, but it would be nice to achieve it directly in the schema definition."
How do I do that in the console please?
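Not a console walkthrough, but the same index can also be created outside Amplify through the API. A rough sketch, assuming boto3 is installed and credentials are configured: update_table with a GlobalSecondaryIndexUpdates "Create" entry is the real DynamoDB call, while the helper names and the table name "User-EXAMPLE" are placeholders (use the table Amplify actually created for the model). It also assumes the index key is a string attribute.

```python
def gsi_create_request(table_name, index_name, hash_key, non_key_attributes,
                       throughput=None):
    """Build the update_table parameters that add a GSI with an INCLUDE projection.

    throughput is an optional (read_units, write_units) tuple; leave it as
    None for on-demand (PAY_PER_REQUEST) tables.
    """
    create = {
        "IndexName": index_name,
        "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
        "Projection": {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(non_key_attributes),
        },
    }
    if throughput is not None:
        create["ProvisionedThroughput"] = {
            "ReadCapacityUnits": throughput[0],
            "WriteCapacityUnits": throughput[1],
        }
    return {
        "TableName": table_name,
        # Assumption: the index key is stored as a string ("S").
        "AttributeDefinitions": [{"AttributeName": hash_key, "AttributeType": "S"}],
        "GlobalSecondaryIndexUpdates": [{"Create": create}],
    }


def create_index(params):
    """Send the request. Needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # third-party dependency, imported lazily

    return boto3.client("dynamodb").update_table(**params)


# "User-EXAMPLE" is a placeholder table name.
params = gsi_create_request(
    "User-EXAMPLE", "userByExternalPrimaryKey", "externalPrimaryKey", ["largeBlob2"]
)
```

Calling create_index(params) would start the index build. Keep in mind that the Amplify-generated resolvers still point at the Amplify-generated index until they are overridden.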
This is a duplicate of
https://github.com/aws-amplify/amplify-cli/issues/2135
@attilah
I made a git patch for this request.
+1 Any updates on this?