This issue primarily details the design for adding an existing relational database as a data source for an AppSync API via the amplify-cli. In the future, the functionality can be extended beyond Aurora Serverless to other data sources.
We propose an 'add-datasource' subcommand under the amplify 'api' command. The subcommand will execute a question walkthrough that presents the user with the following workflow:
fun@aws:~$ amplify api add-datasource
What kind of data source do you wish to support?
> Aurora Serverless (Relational Database)
Do you wish to import a database? (Y/n)
> Yes
Subsequent prompts will be used to collect the Aurora Cluster Identifier, Secrets Manager ARN (used to access the cluster), Database name, JDBC inputs, and other information regarding the data source.
Further information on the walkthrough questions:
With the above information, we will have enough to perform the Database introspection and generate the CFN template with the AppSync Resources ready to utilize the Aurora Serverless Database.
Imagine you have an existing database with two tables, 'Post' and 'Author', and wish to use it as a data source for AppSync.
Upon going through the workflow mentioned above, amplify will perform the introspection and automatically generate the GraphQL schema for the data source as well as the VTL resolvers. In addition to the basic resolvers based on the types (e.g., CreatePost, GetPost, ListPosts), it will generate additional query resolvers that leverage SQL's inherently queryable nature.
Examples of the additional query resolvers would be: GetPostsWithUpvotes, GetPostsWithTitleLike, and GetAuthorsWithNameLike.
The selection of fields will be based on non-primary-key Int and String columns (the basic resolvers already cover the primary key).
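To make this concrete, a hypothetical generated schema for a 'Post' table might look like the following. All type, field, and column names here are illustrative assumptions, not the actual generator output:

```graphql
# Assumed table: Post(id INT PRIMARY KEY, title VARCHAR, upvotes INT)
type Post {
  id: Int!
  title: String!
  upvotes: Int
}

type Query {
  # Basic resolvers covering the primary key
  getPost(id: Int!): Post
  listPosts: [Post]
  # Additional query resolvers generated from the non-primary-key
  # Int and String columns
  getPostsWithUpvotes(upvotes: Int!): [Post]
  getPostsWithTitleLike(title: String!): [Post]
}

type Mutation {
  createPost(title: String!, upvotes: Int): Post
  updatePost(id: Int!, title: String, upvotes: Int): Post
  deletePost(id: Int!): Post
}
```

Each generated query would be backed by a VTL resolver that issues the corresponding SQL statement (e.g., getPostsWithTitleLike mapping to a SELECT with a LIKE predicate).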
As a result of performing the introspection, the command will help users get started with AWS AppSync by letting them simply bring in their existing data source, without needing to create the GraphQL schema or write resolvers in Velocity (VTL) themselves. The generated CFN template will be stored in the '/backend/api/[apiname]/stacks' directory, as mentioned in #574. Therefore, the next amplify push (after a successful add-datasource command) will provision the AppSync resources in the cloud via CFN, and the user will be ready to use AppSync with their existing data source.
So, first comment: ❤️ Yes, please.
Second comment: how would the above work with Amplify's multienv capability? Namely, if another engineer checks out a separate environment, would they:
(a) Have to run amplify api add-datasource again and connect it to a new, pre-configured resource? If so, how would schema mismatches between the two datasources (which are "supposed" to be a single datasource, just in different branches) be handled?
(b) Be able to auto-generate a compliant underlying resource via CFN when they run amplify push. This would involve actually pushing schema DDL changes into the Aurora Serverless resource once provisioned.
(c) Other?
This is exactly what we need.
Instead of generating a lot of query methods, I think it's worth looking at how search works with Elasticsearch data sources, as it's a much more flexible approach. Instead of having GetPostsWithUpvotes, GetPostsWithTitleLike, and GetAuthorsWithNameLike as proposed, you would have a single SearchPosts query that allows searching on any combination of fields.
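A sketch of what that could look like, loosely following the filter-input shape used by the Elasticsearch @searchable resolvers (all input and field names below are illustrative assumptions, not a finalized design):

```graphql
input SearchablePostFilterInput {
  title: SearchableStringFilterInput
  upvotes: SearchableIntFilterInput
  and: [SearchablePostFilterInput]
  or: [SearchablePostFilterInput]
}

input SearchableStringFilterInput {
  eq: String
  ne: String
  wildcard: String
}

input SearchableIntFilterInput {
  eq: Int
  gt: Int
  lt: Int
}

type SearchablePostConnection {
  items: [Post]
  nextToken: String
}

type Query {
  # Each filter field would map onto a predicate in a SQL WHERE clause
  searchPosts(filter: SearchablePostFilterInput, limit: Int, nextToken: String): SearchablePostConnection
}
```

For a relational backend, a query such as searchPosts(filter: { title: { wildcard: "*graphql*" }, upvotes: { gt: 10 } }) could be translated by the resolver into a single parameterized WHERE clause, replacing the combinatorial explosion of GetXByY resolvers.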
Wouldn't it be more orthogonal to use:
$ amplify api add
? Please select from one of the below mentioned services (Use arrow keys)
> GraphQL
REST
What kind of data source do you wish to support?
> Aurora Serverless (Relational Database)
DynamoDB (Non Relational Database)
Agree with @doom777 here, consistency around setting up data sources is essential IMO. I also think this RFC should include an equivalent for the @model GraphQL directive using Aurora Serverless.
May I ask why Aurora Serverless will be supported before provisioned Aurora?
I believe it's because Amplify is meant to be serverless (DynamoDB, user pools). See Elasticsearch for an exception.
@jkeys-ecg-nmsu Aurora Serverless is being supported first in the CLI as AppSync has released support for Aurora Serverless and we would like to maintain the consistency between the two.
@doom777 I do understand where you're coming from in suggesting that datasource addition be included in the API creation flow, but we decided to create a separate 'add-datasource' subcommand so that users who have already created an API can simply add a datasource to it. It is also a more intuitive way to 'tack on' more datasources: as AppSync continues to support more datasource types, users with existing APIs can easily add them. This of course assumes they want multiple data sources in their APIs, but this approach provides flexibility that isn't there if we include it only as part of the API creation flow.
@buggy The point you made on having a SearchPost resolver (that takes in combinations that will be transformed to WHERE clauses) as opposed to just generating a multitude of getXbyY query resolvers is a great and valid one! I will update the RFC to suggest both options to see if we can get more feedback between the two. Great suggestion!
@ashwindevendran, I think this question should be dealt with as part of a broader question of how to handle changes to existing APIs. Currently, there is more support for creating new APIs than for editing existing ones. Creation is the primary use case, so it should be supported first. Editing existing APIs by adding datasources is also useful, and I see where you're coming from in saying that it is a common use case since people will want to migrate, but creation still comes first.
By the way, I can't picture how people will migrate their existing APIs. Migration is a heavy task: you need to recreate the data schema (which is very different between relational and non-relational databases), ETL the data, etc. Recreating the APIs in the Amplify console is the least of the problems.
@ashwindevendran you've given me all the material I need to go to my bosses and convince them to migrate all our Aurora clusters to Aurora Serverless :P Thank you!
This may be outside of the scope of this RFC, but it would be useful to plan out how the directives that currently work with dynamoDB data sources could be ported over to AS data sources (@model, @auth, @searchable, @connection, @versioned).
With an AS data source, I don't expect the @model directive to create the tables, as these are managed outside of CloudFormation. But I would see more value in being able to write specific @model directives rather than using the introspection feature. It would let me declare a model without first creating the database schema, and would let me filter down which tables to expose via AppSync endpoints. To support this, maybe the @model directive could take arguments specifying the data source it should connect to and the table:
type Post @model(datasource: "datasource name", table: "database table name") {
id: ID!
title: String!
}
and prompt the user to specify a name for the datasource when running the add-datasource subcommand.
Support for the other directives would be useful regardless of the data source a model is connected to, and having directive parity between the two would make it easier to switch from a dynamoDB backed model to an AS one.
The @auth and @connection directives should also work together, as arrays of strings are not typically found in MySQL schemas:
type Post
@model(datasource: "datasource name", table: "posts")
@auth(rules: [
{ allow: owner },
{ allow: owner, ownerField: "editors", mutations: [update] }
]) {
id: ID!
title: String!
# expect the field `owner` to be the ID of a user, typically a foreign key constraint to the `users` table
owner: User @connection
# expect no `editors` field to exist in the `posts` table, but rather use the `post_editors` table
editors: [PostEditor] @connection(name: "PostEditors")
}
type PostEditor @model(datasource: "datasource name", table: "post_editors", queries: null) {
id: ID!
post: Post! @connection(name: "PostEditors")
editor: User! @connection(name: "UserEditors")
}
type User @model(datasource: "datasource name", table: "users") {
id: ID!
username: String!
# expect no `posts` field to exist in the `users` table, but rather use the `post_editors` table
posts: [PostEditor] @connection(name: "UserEditors")
}
Support for RDS is of limited use without integration with the other directives. Are there any plans to implement integration with @auth and @connection?