Amplify-cli: Add list field to specify projected attributes on @key directives

Created on 14 Oct 2019 · 14 Comments · Source: aws-amplify/amplify-cli

Is your feature request related to a problem? Please describe.
The main draw of GraphQL is that you can query for only the data that you need. This still saves bandwidth for data going in and out of AWS, because all scan filtering and querying happens on the server side, but there is no such optimization for the AppSync <-> DynamoDB communication.

Because Amplify generates GSIs that project all attributes, reading from or writing to a key (e.g. a GSI) consumes more RCUs and WCUs than necessary. For a customer at scale, this missed optimization could become very costly.

Describe the solution you'd like
Add a list field to specify projected attributes on @key directives.

Describe alternatives you've considered
There are few options besides creating a GSI outside of Amplify and overriding all Amplify-generated resolvers to reference that index.

enhancement graphql-transformer

jkeys-ecg-nmsu

👍13

All 14 comments

@jkeys-ecg-nmsu could you please give an example of what exactly you have in mind? @key supports a list of fields that will be used to build up the keys for the GSI.

attilah on 14 Oct 2019

@attilah sorry if my feature request was unclear. Basically, whenever the @key directive provisions a GSI, it projects all attributes to that index.

Since most access patterns relying on custom keys won't need to project all attributes, there should be another field that allows the user to set which fields to project.

Let's say we have this schema:

type User @model 
@key(fields: ["id"])
@key(name: "userById", fields: ["id"])
@key(name: "userByExternalPrimaryKey", fields: ["externalPrimaryKey"]) {
  id: ID!
  externalPrimaryKey: String!
  largeBlob1: AWSJSON                    #only internal APIs will need field
  largeBlob2: AWSJSON                    #only external APIs ("externalPrimaryKey") will need field
}

Here, id is a unique identifier for a user in our internal system, and externalPrimaryKey is a UUID for the same user provided by an external system. We know from our access patterns that anyone querying by id will never need largeBlob2, and that anyone querying by externalPrimaryKey will never need largeBlob1.

As Amplify currently works, it projects the attribute largeBlob2 into the userById index and largeBlob1 into the userByExternalPrimaryKey index. This consumes more RCUs than necessary when querying and more WCUs than necessary when mutating (my assumption is that when the table projects into indices, it consumes WCUs proportional to the amount of data in the projected attributes for that GSI).

The way it should work is this:

type User @model 
@key(fields: ["id"])
@key(name: "userById", fields: ["id"], projectedAttributes: ["id", "externalPrimaryKey", "largeBlob1"])
@key(name: "userByExternalPrimaryKey", fields: ["externalPrimaryKey"], projectedAttributes: ["id", "externalPrimaryKey", "largeBlob2"]) {
  id: ID!
  externalPrimaryKey: String!
  largeBlob1: AWSJSON                    #only internal APIs will need field
  largeBlob2: AWSJSON                    #only external APIs ("externalPrimaryKey") will need field
}

The GraphQL transformer would then project only the attributes listed in projectedAttributes into the GSIs it creates.
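To make the proposal concrete, here is a minimal sketch of how such a directive could be mapped onto a DynamoDB GSI definition. The helper build_gsi is hypothetical and is not part of Amplify; the only real pieces are the DynamoDB field names (IndexName, KeySchema, Projection, ProjectionType, NonKeyAttributes). DynamoDB always projects the index keys and the table's primary key, so the sketch leaves those out of NonKeyAttributes. It also assumes at most one hash and one range field per @key.

```python
from typing import Dict, List


def build_gsi(name: str, key_fields: List[str], table_key_fields: List[str],
              projected_attributes: List[str]) -> Dict:
    """Hypothetical helper: turn a @key with projectedAttributes into a GSI definition."""
    # Simplifying assumption: the first field is the hash key, the optional second is the range key.
    if not 1 <= len(key_fields) <= 2:
        raise ValueError("this sketch supports one hash key and an optional range key")
    key_types = ["HASH", "RANGE"]
    key_schema = [{"AttributeName": field, "KeyType": key_type}
                  for field, key_type in zip(key_fields, key_types)]

    # Index keys and table keys are always projected, so they are not listed explicitly.
    always_projected = set(key_fields) | set(table_key_fields)
    non_key = [attr for attr in projected_attributes if attr not in always_projected]

    if non_key:
        projection = {"ProjectionType": "INCLUDE", "NonKeyAttributes": non_key}
    else:
        projection = {"ProjectionType": "KEYS_ONLY"}
    return {"IndexName": name, "KeySchema": key_schema, "Projection": projection}


# The second index from the schema above; the table's primary key is "id".
external_index = build_gsi(
    name="userByExternalPrimaryKey",
    key_fields=["externalPrimaryKey"],
    table_key_fields=["id"],
    projected_attributes=["id", "externalPrimaryKey", "largeBlob2"],
)
print(external_index["Projection"])
```

With this mapping, largeBlob1 would never be copied into the userByExternalPrimaryKey index.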

Does that use case make sense?

Edit: fix typo.

jkeys-ecg-nmsu on 15 Oct 2019

👍6

Maybe this AWS resource can explain what I am asking for better than I can.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.Projections

"If you need to access just a few attributes with the lowest possible latency, consider projecting only those attributes into a global secondary index. The smaller the index, the less that it costs to store it, and the less your write costs are."

[...]

"If you need to access most of the non-key attributes on a frequent basis, you can project these attributes—or even the entire base table— into a global secondary index. This gives you maximum flexibility. However, your storage cost would increase, or even double."

So projecting the entire table into an index can be very costly. I hope that helps explain the ask.
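A rough back-of-the-envelope sketch of the difference, using the documented DynamoDB unit sizes (one write unit per 1 KB written, half a read unit per 4 KB for an eventually consistent read, both rounded up). The item sizes below are made-up assumptions for illustration, and real item sizes also include attribute names and some overhead, so treat the numbers as approximate.

```python
import math


def index_write_units(projected_item_bytes: int) -> int:
    """Write units to put one new item into an index: 1 unit per 1 KB, rounded up."""
    return max(1, math.ceil(projected_item_bytes / 1024))


def index_query_read_units(bytes_read: int) -> float:
    """Read units for an eventually consistent query: 0.5 units per 4 KB read, rounded up."""
    return math.ceil(bytes_read / 4096) * 0.5


# Assumed sizes (hypothetical): about 200 bytes of keys plus two 40 KB blobs.
KEYS = 200
BLOB = 40 * 1024

all_projected = KEYS + 2 * BLOB   # index projects every attribute
only_needed = KEYS + BLOB         # index projects keys plus the one blob it needs

for label, size in [("ALL", all_projected), ("INCLUDE", only_needed)]:
    print(label, index_write_units(size), "write units,",
          index_query_read_units(size), "read units per item")
```

Under these assumptions, leaving the unused blob out of the index roughly halves both the write cost of maintaining the index and the read cost of querying it.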

jkeys-ecg-nmsu on 15 Oct 2019

I'm one more user who needs GSIs with projectedAttributes. Right now I have to create the index "on the side" in the AWS console, but it would be great to include it in the Amplify AppSync stack.

oyof8376 on 31 Oct 2019

👍5

@oyof8376 I think one way the Amplify team prioritizes their backlog is by tracking number of +1s on issues, so please +1 this issue.

jkeys-ecg-nmsu on 1 Nov 2019

👍7

We need this feature too. Here is our scenario:
We have a large field, about 40 KB, that is rarely used, and we don't want to keep it in every index. We would like to exclude this field from the indexes. The current workaround is to store this field in S3.

gaochenyue on 27 Dec 2019

We could use this feature as well. Projecting all attributes is fine when the data set is small but at scale this is going to cost way more money for data that we don't need...

kldeb on 6 Jan 2020

+1 for this feature. I have a large data set and would like to not project all attributes into each index.

barryaconway on 2 Jun 2020

+1. Currently this can be achieved through the DynamoDB web console by manually creating global secondary indexes, but it would be nice to achieve it directly in the schema definition.
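For reference, the manual workaround can also be scripted against the DynamoDB UpdateTable API instead of clicking through the console. The sketch below only builds the request parameters; the table name "User-EXAMPLE" and the index name are placeholders, and the helper gsi_create_request is hypothetical. It assumes an on-demand table; a table with provisioned capacity would additionally need a ProvisionedThroughput entry inside "Create".

```python
from typing import Dict, List


def gsi_create_request(table_name: str, index_name: str, hash_key: str,
                       non_key_attributes: List[str], hash_key_type: str = "S") -> Dict:
    """Hypothetical helper: build UpdateTable parameters that add a GSI with an INCLUDE projection."""
    return {
        "TableName": table_name,
        # Every attribute used in the new index's key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": hash_key_type},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
                    "Projection": {
                        "ProjectionType": "INCLUDE",
                        "NonKeyAttributes": list(non_key_attributes),
                    },
                }
            }
        ],
    }


params = gsi_create_request(
    table_name="User-EXAMPLE",                 # placeholder, not a real table name
    index_name="userByExternalPrimaryKeySlim",  # placeholder index name
    hash_key="externalPrimaryKey",
    non_key_attributes=["largeBlob2"],
)

# With boto3 installed and AWS credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("dynamodb").update_table(**params)
```

An index created this way lives outside the Amplify stack, so the generated resolvers still have to be overridden to query it.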

lazy-var on 16 Jun 2020

👍1

+100

mdoesburg on 23 Jul 2020

+1000

mujeex on 13 Aug 2020

"+1. Now this can be achieved through DynamoDB web admin console by manually creating global secondary indexes, but would be fine to achieve it directly with schema definition."

How do I do that in the console please?

mujeex on 13 Aug 2020

This is a duplicate of
https://github.com/aws-amplify/amplify-cli/issues/2135

@attilah
I made a git patch for this request.