Amplify-cli: amplify codegen model adds non-existent Id field and ignores the key definition

Created on 2 Jan 2020 · 18Comments · Source: aws-amplify/amplify-cli

Describe the bug
While generating the model files (Java/Android in my case) the codegen tool on its own adds Id field to the model, while it is NOT in the schema. Of course that doesn't work down the line because this field does not exist. The problem is here: https://github.com/aws-amplify/amplify-cli/blob/9951402777b047277e6e05c6cb84fea92c238e82/packages/amplify-codegen-appsync-model-plugin/src/visitors/appsync-visitor.ts#L199 I used @key directive without the key name to specify HK, which is also not supported. This is not as much as a bug but just really strange implementation that on purpose generates invalid models. Also how is that supposed to work with models that have HK + SK? Code sample down below.

type UserProfile @model @key(fields: ["userId"]) {
  userId: ID!
}

Result:

@ModelConfig(pluralName = "UserProfiles")
@Index(name = "undefined", fields = {"userId"})
public final class UserProfile implements Model {
  public static final QueryField ID = field("id");
  public static final QueryField USER_ID = field("userId");

Expectation: the codegen model tool picks up that the HK is called userId and uses it accordingly, instead of generating invalid models.

Amplify CLI Version
4.6.0

To Reproduce
Create a model with HK other then Id and run amplify codegen model

Expected behavior
The specified column is used as HK.

DataStore bug code-gen

Source

asyschikov

👍3

Most helpful comment

@lazy-var @asyschikov For a synchronization protocol to have correctness and convergence objects need to have unique identifiers. Since DataStore is local-first before mutations are reconciled and committed to DynamoDB, we create these identifiers locally using the id field with type ID. There is complexity in the use cases such as relational patterns that not only leverage @key with DynamoDB implications but also in the local storage engine that might leverage SQLite or IndexedDB depending on the platform. For our initial launch we wanted to ensure that the data integrity could be trustworthy and reliable which is why we require id however we do hear your feedback and are looking to enhance this functionality in the future such as providing another directive for customers like yourself to disambiguate the fields that are unique and will be used as alternative identifiers. I don't have a specific timeline on this as there are a couple other higher priority items in the customer backlog based on interest and feedback, however we are looking at a design on this in the coming months.

undefobj on 28 Feb 2020

👍4 🚀2 ❤2 🎉1

All 18 comments

@asyschikov could you please try with the latest version of the CLI if the problem persists?

attilah on 2 Jan 2020

@asyschikov Is the generated code used with DataStore or with plain API?

yuth on 2 Jan 2020

@attilah I updated to the latest version, but the result is expectedly same because I just sent you the link to the master branch of the amplify-cli where the id attribute is forcedly added to the model. I would not call it a bug, just a strange design decision. I guess it is some sort of convention with amplify-android (iOS too?) library because they expect Id to be there, I opened a sister issue there https://github.com/aws-amplify/amplify-android/issues/210

@yuth it is generated by amplify codegen model, also by modelgen gradle task, not sure whether it means DataStore or plain API. The steps are taken from https://aws-amplify.github.io/docs/android/api .

asyschikov on 2 Jan 2020

@yuth it is generated by amplify codegen model, also by modelgen gradle task, not sure whether it means DataStore or plain API. The steps are taken from https://aws-amplify.github.io/docs/android/api .

OK, the generated model is used with API category and not with DataStore.

DataStore needs an id field to be present and it is not allowed to change the id field with @key directive. We need to have a way to know if the generated model is used for API category or DataStore. @TrekSoft, is it possible for generation of separate files for API and DataStore?

yuth on 3 Jan 2020

@yuth - yeah let's setup a time to discuss as a team

TrekSoft on 6 Jan 2020

DataStore needs an id field to be present and it is not allowed to change the id field with @key directive.

@yuth I don't get that part. I don't think it is documented anywhere. If this is really mandatory for whatever reason the codegen should not validate the schema instead of appending the "id" field that just doesn't exist, because that's just not gonna work. It should state it with big bold letters.

The larger question is why this restriction even exists? It makes developers' life harder instead of easier. If you are adding a new android application to an existing amplify application you are basically screwed if your models don't have id fields, you can't just rename HK on a DynamoDB table. Not to mention tables with hash key and sort key, they are often used to create one-to-many relationships and the HK in the child table refers to the HK of the parent table (with a sort key to enable many children). This case is basically unimplementable right now. Including half of the examples for directives @connection and @key from here https://aws-amplify.github.io/docs/cli-toolchain/graphql .

asyschikov on 7 Jan 2020

DataStore needs an id field to be present and it is not allowed to change the id field with @key directive.

Hi @yuth , @TrekSoft , forcing id field to be present is fair for local to work, but does this means not custom @key is allowed for cloud side?
I see DataStore as a 'client' tool/component. And I expect it to work with custom indexes, custom AppSync resolvers, etc.

By providing flexibility with DataStore would allow projects to grow, going custom once the business model has been validated and 'opinionated' modelgen/GraphQL transform solution is not enough.

lazy-var on 12 Jan 2020

undefobj on 28 Feb 2020

👍4 🚀2 ❤2 🎉1

@undefobj sorry I missed your reply. Thanks for the clarification why ID is needed. My issue with the current implementations are two fold:

codegen is not fully consistent with GraphQL Transforms and the incompatibilities are not explicitly stated. @key works with AWS AppSync, it is properly mapped to the PK in a Amazon DynamoDB table and all the generated GraphQL queries. Changing PK in Amazon DynamoDB is very painful because it requires to drop/create the table. The codegen tool can't just make assumption that the id field is there even if it is not in the schema. My suggestion would be very explicitly list the requirements for the schema and list of directives that are not supported. I would also suggest that codegen explicitly fails with error if the schema does not validate.
Since @key is not supported it means that Sorted Keys are not supported as well if I got it right, which reduces Amazon DynamoDB to a simple Key Value store which it is not. I have no idea how to implement One To Many relationships without Sort Key. For example how would Post-Comments relationship work from the example schema? I understand that for a beta release it is a needed measure, it would help greatly if it was highlighted which functionality is not supported, because we already developed a schema, Web App client and were entirely blocked by the DataStore.

asyschikov on 21 Apr 2020

@asyschikov @key is supported as well as many to many as noted in the docs on relational models. However there are a few things that we need to put into place for some finer grained controls on ID fields. We understand this feedback and are looking at providing the optimization in the future.

Codegen is designed to fail if the schema isn't valid. Sometimes the known cases are missed though which is what happened here.

We're looking at additional workflows for table recreation while in dev/test stage and iterating on schemas which you can comment on here: https://github.com/aws-amplify/amplify-cli/issues/2384

DataStore on JavaScript is not beta. It's fully released. On iOS/Android it's in Preview. Again, sometimes things aren't covered in docs or in code checks because they are unforeseen.

undefobj on 21 Apr 2020

@lazy-var @asyschikov For a synchronization protocol to have correctness and convergence objects need to have unique identifiers. Since DataStore is local-first before mutations are reconciled and committed to DynamoDB, we create these identifiers locally using the id field with type ID. There is complexity in the use cases such as relational patterns that not only leverage @key with DynamoDB implications but also in the local storage engine that might leverage SQLite or IndexedDB depending on the platform. For our initial launch we wanted to ensure that the data integrity could be trustworthy and reliable which is why we require id however we do hear your feedback and are looking to enhance this functionality in the future such as providing another directive for customers like yourself to disambiguate the fields that are unique and will be used as alternative identifiers. I don't have a specific timeline on this as there are a couple other higher priority items in the customer backlog based on interest and feedback, however we are looking at a design on this in the coming months.

I understand that DataStore creates autogenerated id on local model when creating new document because it will be unique and it prevents situation when user created local data and when sync will be performed reveals that data already exists.

But it will be great to have ability create own id, with in example some property like UNSAFE_key or method dangerouslySetKey. Programmer will be consciously use this ability.

Why it needed?

Because there are may be situations where you doesn't know autogenerated ID, but you know some attribute which is ID. Or ID can be composed from some attributes.
When key consist from partition and sort key. For example user subscriptions where pk - subscriber ID and sk - subscription ID. For this case also there is need to add support of object keys: DataStore.query(Subscription, {id: SOME_ID, target: SOME_TARGET}) to get single Model.

Also with above suggestions needed to add mecanism to restore deleted data, because key may stay same and now there is no chance to save model second time with DataStore.save

lenarmazitov on 3 Jun 2020

👍2

This is soooo necessary!

But it will be great to have ability create own id, with in example some property like UNSAFE_key or method dangerouslySetKey. Programmer will be consciously use this ability.

This would be needed in all scenarios where you're creating something with a custom primary key. I'm trying to do this right now with usernames. I need to set username as an "unsafe key" as in, it'd be saved locally but if there's an error (already exists) it'd have to throw an error from DataStore.save. There's no way to do this right now!

GavinSawyer on 10 Jul 2020

@attilah

Hey just wondering if there is any update for this.

jayfry1077 on 11 Aug 2020

Hello,

I'm wondering about this also. Personally I have used the schema from Nader Dabit's Chatt app here..

https://dev.to/dabit3/building-chatt---a-real-time-multi-user-graphql-chat-app-3jik

CLI: 4.27.1

I'm using using his schema to create a messaging feature in an app using DataStore.

Big thanks to Nader by the way, I would be lost without all his resources online!

type Message {
  id: ID!
  author: User
  authorId: String
  content: String!
  conversation: Conversation!
  messageConversationId: ID!
  createdAt: String
  updatedAt: String
}

But I notice the following query to get all messages in a conversation doesn't work as I get the invalid field error..

const messages = await DataStore.query(Message, (m) =>
        m.messageConversationId('eq', '601afe05-f39a-4a08-b391-47aed452c361')
      )

Error: error fetching data... Error: Invalid field for model. field: messageConversationId, model: Message

There's even an explicit messageConversationId field in the schema which I use in the message mutation, and I see it in my DataStore message items, but its not seen in my query. For the moment I may just create another field in the messages schema called something like 'msgConvId' and query on that, if thats bad practice please let me know as I'm new to Amplify, thank you!

myfairshare on 12 Aug 2020

Its critical that AWS improve their documentation on the ID field. The rules around the ID field to not jump out at you.

Equally, we should be able to set the ID field... just insane that amplify overwrites a perfectly good UUID. To match future updates back to our internal model we have to maintain a parallel UUID. Silly.

renenw on 16 Aug 2020

I would also use this functionality as I just had to change my model's primary key to not be a composite key as I had designed, but instead switch that to a secondary index in order to also be able to use DataStore on the frontend.

It's not ideal, especially now that the GraphQL query generated for the secondary key has all the pagination constructs because it's not guaranteed to return a unique record for the composite key since it's no longer the primary key.

Perhaps this is just a limitation of the combination of DataStore's requirement to understandably ensure a unique id locally and DynamoDB's inability to specify unique constraints on secondary indexes, but if there is a solution here I would very much appreciate it.

danrivett on 26 Sep 2020

I believe this functionality would be helpful for my use case as well. I have a set of items which are not modeled in DataStore with globally unique identifiers (“Service Tickets”) which users can collaborate on. There is a separate system of record which manages the Service Tickets. What I’d like to use DataStore for is to manage how those tickets change over time. For example: Multiple users can “Update” certain fields on those tickets. It is only the “Update” information that resides in DataStore. If two users both Update the same Ticket simultaneously, I would like for both of those changes to be "merged" to maintain the invariant that only 0 or 1 "Update" exists for a given Service_Ticket_Id. Since DataStore uses a separate uniqueness identifier (id), there doesn't seem to be a way to accomplish this.

@danrivett I don't have much experience with secondary indexes, but as I understand them they are only eventually consistent with the base table. How are you managing scenarios where propagation is delayed (or fails)?

hennessey on 27 Oct 2020

Hi!
please, do we have any news about this one?

@lazy-var @asyschikov For a synchronization protocol to have correctness and convergence objects need to have unique identifiers. Since DataStore is local-first before mutations are reconciled and committed to DynamoDB, we create these identifiers locally using the id field with type ID. There is complexity in the use cases such as relational patterns that not only leverage @key with DynamoDB implications but also in the local storage engine that might leverage SQLite or IndexedDB depending on the platform. For our initial launch we wanted to ensure that the data integrity could be trustworthy and reliable which is why we require id however we do hear your feedback and are looking to enhance this functionality in the future such as providing another directive for customers like yourself to disambiguate the fields that are unique and will be used as alternative identifiers. I don't have a specific timeline on this as there are a couple other higher priority items in the customer backlog based on interest and feedback, however we are looking at a design on this in the coming months.