So far, database migrations have been almost entirely implicit (except for the temporary directives @rename and @migrationValue). Especially when working in multi-stage/multi-branch environments, a deterministic (and imperative) migration solution is necessary.
The basic idea is that whenever the data model (types.graphql) is changed, the CLI automatically creates a migration file with the missing migration steps when running prisma deploy.
Migration files are YAML based and contain a list of all migration steps and an optional before/after script. The filename starts with a UTC timestamp (format: YYYYMMDDHHMMSS) and optionally contains a short "description" (e.g. YYYYMMDDHHMMSS_create_products.yml).
The missing migration steps can always be calculated locally (without any "knowledge" of a deployed service): they are the delta between the current types.graphql file and all existing migration steps combined. If the data model described by types.graphql and the data model described by all migrations line up, the migrations are "in sync".
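To make that check concrete, here is a minimal sketch (not the actual CLI implementation) of what "in sync" could mean in code. It assumes two hypothetical helpers, parseDataModel and applyStep, and uses the js-yaml package to read migration files; the structural comparison is deliberately naive.

const fs = require('fs');
const path = require('path');
const yaml = require('js-yaml'); // assumed YAML loader dependency

function modelFromMigrations(migrationsDir, { applyStep }) {
  // Replay every migration file (sorted by its timestamp prefix) on an empty model.
  const files = fs.readdirSync(migrationsDir).filter(f => f.endsWith('.yml')).sort();
  let model = { types: {} };
  for (const file of files) {
    const { steps = [] } = yaml.load(fs.readFileSync(path.join(migrationsDir, file), 'utf8')) || {};
    for (const step of steps) model = applyStep(model, step); // applyStep: hypothetical helper
  }
  return model;
}

function isInSync(typesPath, migrationsDir, helpers) {
  // "In sync" means: types.graphql and the combined migrations describe the same data model.
  const fromFile = helpers.parseDataModel(fs.readFileSync(typesPath, 'utf8')); // parseDataModel: hypothetical helper
  const fromMigrations = modelFromMigrations(migrationsDir, helpers);
  // Naive comparison; a real implementation would normalize key order first.
  return JSON.stringify(fromFile) === JSON.stringify(fromMigrations);
}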
Assuming you're starting out with an empty service definition (0 models/types) and add a new User model with an id and a name field, this is what your project should look like:
# types.graphql
type User @model {
  id: ID! @unique
  name: String!
}

# migrations/20080906120000.yml
steps:
  - operation: type.create
    definition: 'type User @model'
  - operation: field.create
    type: User
    definition: 'id: ID! @unique'
  - operation: field.create
    type: User
    definition: 'name: String!'
steps:
  # Types
  - operation: type.create
    definition: 'type User @model'
  - operation: type.update
    id: User
    definition: 'type NewUser @model implements Human'
  - operation: type.delete
    id: NewUser
  - operation: type.create
    definition: 'type User @model'
  # Fields
  - operation: field.create
    type: User
    definition: 'firstName: String! @defaultValue(value: "John")'
  - operation: field.update
    id: User.firstName
    definition: 'name: String! @defaultValue(value: "John Doe")'
  - operation: field.delete
    id: User.name
  # Relations
  - operation: relation.create
    definition:
      - 'posts: [Post!]! @relation(name: "UserPosts")'
      - 'user: User! @relation(name: "UserPosts")'
  - operation: relation.update
    id: UserPosts
    definition:
      - 'posts: [Post!]! @relation(name: "UsersOnPosts")'
      - 'users: [User!]! @relation(name: "UsersOnPosts")'
  - operation: relation.delete
    id: UsersOnPosts
  # Multi-field unique constraints
  - operation: field.create
    type: User
    definition: 'firstName: String! @unique(group: "UserName")'
  - operation: field.create
    type: User
    definition: 'lastName: String! @unique(group: "UserName")'
  # Interfaces
  - operation: interface.create
    definition: |
      interface Human {
        name: String!
      }
  - operation: interface.update
    id: Human
    definition: |
      interface DoubleHuman {
        firstName: String!
        lastName: String!
      }
  - operation: interface.delete
    id: DoubleHuman
  # Enums
  - operation: enum.create
    definition: |
      enum RGBColor {
        Blue
        Green
        Red
      }
  - operation: enum.update
    id: RGBColor
    definition: |
      enum Color {
        Blue
        Green
        Red
        Yellow
      }
  - operation: enum.delete
    id: Color
  - operation: api.lock
  - operation: api.unlock
You can run a script before and/or after a migration. This allows you to migrate not only the structure of your data but your data itself (e.g. iterate over all User nodes and calculate a value for a newly created field).
A migration script can be written in any language and is simply invoked through a bash command. When a migration script fails (i.e. returns non-zero exit code) the migration will be rolled back. Migration scripts are executed locally (or in your CI environment) - not on the Prisma server.
# migrations/20080906120000.yml
scripts:
  before: node before-migration.js
  # after: node after-migration.js
steps:
  # ...
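The referenced before-migration.js could be any script. Below is a hypothetical sketch of such a data migration script, assuming the service exposes its regular GraphQL API at SERVICE_URL with a users query and an updateUser mutation, and that slug is the newly created field being backfilled; it requires Node 18+ for the global fetch.

// before-migration.js (illustrative sketch only)
const SERVICE_URL = process.env.SERVICE_URL;

async function gql(query, variables) {
  const res = await fetch(SERVICE_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  });
  const { data, errors } = await res.json();
  if (errors) throw new Error(JSON.stringify(errors));
  return data;
}

async function main() {
  // Iterate over all User nodes and calculate a value for the new field.
  const { users } = await gql(`{ users { id name } }`);
  for (const user of users) {
    await gql(
      `mutation($id: ID!, $slug: String!) {
         updateUser(where: { id: $id }, data: { slug: $slug }) { id }
       }`,
      { id: user.id, slug: user.name.toLowerCase().replace(/\s+/g, '-') }
    );
  }
}

// A non-zero exit code signals the CLI to roll the migration back.
main().catch(err => { console.error(err); process.exit(1); });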
Some data model changes result in ambiguous migration steps where the developer needs to choose the desired behaviour from an interactive CLI prompt.
In the following cases it is ambiguous whether an element should be renamed or deleted and recreated:
Here is an example of changing a type and its fields:
type Person {
  name: String
  description: String
}

# ...migrate to...

type User {
  firstName: String
  lastName: String
  comment: String
  age: Float
}

type Member {
  name: String
  title: String
  other: String
}
_WIP_
Implementation should incorporate changes described in Improved infrastructure for service deployment.
This is what the deployment API could look like:
schema {
  mutation: Mutation
}

input MigrationStep {
  operation: String!
  definition: String
  id: String
  type: String
}

input Migration {
  id: String!    # UTC timestamp
  hash: String!  # based on id + file content
  steps: [MigrationStep!]!
}

type MigratePayload {
  success: Boolean!
  messages: [String!]!
}

type Mutation {
  migrate(migrations: [Migration!]!): MigratePayload!
}
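For illustration, this is roughly how a CLI could call the proposed migrate mutation. Everything here is hypothetical: the DEPLOY_URL endpoint, the parseSteps helper, and the hashing scheme are inferred from the draft schema above, not from a shipped API; Node 18+ is assumed for the global fetch.

const crypto = require('crypto');
const fs = require('fs');

function toMigrationInput(file) {
  const content = fs.readFileSync(file, 'utf8');
  const id = file.match(/(\d{14})/)[1]; // UTC timestamp prefix (YYYYMMDDHHMMSS)
  const hash = crypto.createHash('sha256').update(id + content).digest('hex'); // "based on id + file content"
  const steps = parseSteps(content); // parseSteps: assumed YAML-to-steps parser
  return { id, hash, steps };
}

async function pushMigrations(files) {
  const res = await fetch(process.env.DEPLOY_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      query: `mutation($migrations: [Migration!]!) {
        migrate(migrations: $migrations) { success messages }
      }`,
      variables: { migrations: files.map(toMigrationInput) },
    }),
  });
  return (await res.json()).data.migrate;
}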
We should consider dropping the @migrationValue directive/functionality, given that it wasn't really a good solution for most cases and rather caused confusion for developers.
gc deploy command? Update: this is now accounted for in the proposal 🎉
The missing migration steps can always be calculated locally (without any "knowledge" of a deployed service) which is the delta of the current types.graphql file and all existing migration steps combined.

Deltas are sometimes ambiguous - how can the developer distinguish between different 'migration paths' in such a case? Example:

Base version

type User @model {
  id: ID! @isUnique
  title: String!
}

Latest version

type User @model {
  id: ID! @isUnique
  name: String!
}

Possible migrations

Solely based on the two types.graphql versions, at least two migration paths are matching.

One is to rename title to name:

- operation: field.update
  id: User.title
  definition: 'name: String!'

Another is to remove title and add name:

- operation: field.delete
  id: User.title
- operation: field.create
  type: User
  definition: 'name: String!'

Note that applying the former migration when the latter migration was intended leads to undesired data loss.
Thanks, some thoughts from my side:
Migration Scripts:
Who executes these scripts - do I send them along with the deploy / migrate command and the server executes them, or do I do that locally? The former is a good idea, the latter problematic, to say the least. In my opinion, as Graphcool sits above the database, it is perfectly reasonable to only allow migration scripts to run on the server itself, which acts as a controlling entity to serialize migration attempts and prevents multiple developers from doing something unintentional and wreaking havoc in their database. This still allows for a simple approach locally and a more controlled one on live systems.

Migration .yml files:
This is more like thinking out loud how things must work and tries to fill the gaps of the spec (as you didn't talk about the client-server interaction yet).
Okay, so files are apparently ordered by timestamp. One goal for migrations is "Resistant to merge conflicts". I assume the merge conflicts in question are git merge conflicts. I also assume that migration files are checked into version control, as the other case would be too complicated for multiple devs to handle with wiring up all the correct script files and whatnot. There will simply be next to no merge conflicts if two developers happily develop on different branches and then merge. Why? Git will just take all those separate files and throw them into the migration folder, as there are no (or rarely) name conflicts thanks to the timestamp naming. The only potential for merge conflicts here is of course the types.graphql file, which is certainly an indicator for developers that they need to take a hard look at their migrations as well, but even this is not a given.
Now imagine a scenario where two branches from two devs diverged from master. One branch is short-lived and one is a longer, larger feature. The short-lived branch gets merged and the migrations are run. The larger branch is merged and now master has an interleaved mix of already-run and not-yet-run migrations:
I assume state keeping is done on the server, which knows which migrations have been executed already (e.g. a traditional migrations table), or else this scenario can't work, as there is no way to detect an issue with C. It also can't be computed locally (i.e. as there is no state available), so the CLI is out here and has to rely on errors from the server during migrate or deploy, which is a good thing. Will the server just error out immediately as soon as it finds a migration that is not run and sits before an already applied one, or will it try to resolve the conflict (more complex, more sophisticated, higher critical bug potential)? What is the flow if there is such an error? Does the developer create a new migration and delete the old one, or redate the conflicting migrations to a newer date? The tooling and the (error) messages certainly have to be on point to reduce friction as much as possible. Most of these things I mentioned are more or less solved by Rails migrations already and should give us a good starting point of what developers expect.
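To illustrate the state-keeping idea, here is a rough sketch of how a client could detect pending and interleaved migrations against a server-side record of applied migrations. The fetchAppliedIds callback stands in for whatever API would expose that record - nothing like it exists today, it is purely an assumption for this sketch.

const fs = require('fs');

function localMigrationIds(dir) {
  // Migration ids are the 14-digit UTC timestamp prefixes of the file names,
  // so sorting them sorts migrations chronologically.
  return fs.readdirSync(dir).filter(f => f.endsWith('.yml')).map(f => f.slice(0, 14)).sort();
}

async function pendingMigrations(dir, fetchAppliedIds) {
  const applied = new Set(await fetchAppliedIds()); // hypothetical server call
  const local = localMigrationIds(dir);
  const pending = local.filter(id => !applied.has(id));

  // The "interleaving" case from the branch scenario above: a pending migration
  // that is older than one the server has already applied.
  const newestApplied = [...applied].sort().pop();
  const interleaved = pending.filter(id => newestApplied && id < newestApplied);

  return { pending, interleaved };
}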
One random thought: in Rails it was common to delete or squash all migrations at some point into one base migration if the migration folder grew too large. The schema.rb file (that was more or less used as the base migration) was _sort of_ our types.graphql. We should definitely consider this scenario, as serious projects will amass migrations fairly quickly.
One last thing here: @marktani has a good point with the migration ambiguity. I wonder how this is complicated by the flow I described above with parallel branches / parallel changes to the types, I have to think more about these edge cases.
"Apologies for being picky" points:
Each migration step should be applicable within <100ms
design goal is all good until you hit data column migrations that can occupy the database for multiple minutes and more. I don't see why this is relevant or on the list in the first place if there is at least one obvious case where operation time is unbounded to such an extent that making guarantees is futile.definition
) - it should adhere to the same type, even if it is just YAML.@dom
Some comments :-D
In Rails each migration script is unrestricted ruby code - it could send an email if it wanted to. I assume this is what you mean by input?
What are the primary use cases for this - I assume loading or manipulating existing data?
The current idea is that the cli will invoke the scripts.
During development, a single developer is assumed to work exclusively on a branch. We recommend spinning up a database for each feature branch.
On shared databases - typically staging and production, the cli should be executed from a CI environment that can control exclusive access to running migrations. I see a future where Graphcool provides the CI environment for Graphcool database migrations, but for now people are encouraged to use their existing CI setup.
Additionally I think that Graphcool should prevent multiple migrations from running at the same time. When this involves arbitrary scripts before and after the actual migration we need a way for the cli to inform the server about a migration beginning or ending. We can either make this implicit (cli sends messages to server) or explicit (the scripts will have to run mutations to mark beginning and end).
I agree that this has to be tied to input from the cli. Maybe we should have a concept of a special connection for migrations
I agree that only one deploy should be allowed at a time.
Migrations from two branches with interleaving timestamps: Rails simply runs all migrations in order, skipping any that have already been executed. We should investigate if this has any severe downsides in our context. Otherwise, that would probably be the solution that is easiest to understand.
Merge conflicts: Rails includes the timestamp of the latest migration in db/schema.rb in order to ensure that the developer is made aware that another branch has introduced changes and should verify that there are no issues.
Relationship between migrations and db/schema.rb: In Rails, individual migrations are created manually (either by writing code or giving input to the cli). After running a migration, Rails automatically inspects the resulting database and recreates the db/schema.rb file. This file is auto-generated, and manual changes will not be preserved. So even though migrations are the source of the current state of the database, db/schema.rb is the authoritative source for the database schema. This is a bit weird, but comes out of a practical realisation:
There is no need (and it is error prone) to deploy a new instance of an app by replaying the entire migration history. It is much simpler and faster to just load into the database a description of the current schema.
In Graphcool this works differently. Our equivalent of db/schema.rb is types.graphql, which developers are expected to modify directly. This file is never automatically generated. Instead, migrations are automatically generated by diffing against the current state.
Also worth noting that db/schema.rb is database independent (works with MySQL, Postgres etc.), but if engine-specific features are used, Rails supports switching to a SQL-based format instead.
Transactional guarantees: I don't think we can rely on database transactions for migrations. For one, this isn't supported by MySQL, and secondly there is no good way to ensure that we use a single connection for both the migration steps as well as possible migration scripts. Instead we should add primitives to temporarily disable writes and potentially reads.
A list of actions a developer will perform when working with Graphcool. Please point out any actions not on this list yet so we can make it exhaustive.

This happens when you use Graphcool the first time or when you start working on a new project:
- gc init: create types.graphql and the first migration file to init the empty database
- gc deploy: run the first migration (or whatever we do to create the empty db)

This happens on a daily basis as you make changes to your db structure:
- change types.graphql
- gc deploy: generate the current database schema; diff against the new database schema from types.graphql; iff they are different, create a migration file
- gc deploy: run all pending migrations against the server

When you realise that your latest migration wasn't quite right, before you actually perform the migration (don't do this if the migration has been executed against a shared database (i.e. staging or prod) or the branch has been merged):
- change types.graphql
- gc deploy --recreate: generate the current database schema, excluding the last migration; diff against the new database schema generated from types.graphql; iff they are different, overwrite the last migration file
- gc deploy: run all pending migrations against the server

When you realise that your latest migration wasn't quite right and you have already performed the migration against your local database (don't do this if the migration has been executed against a shared database (i.e. staging or prod) or the branch has been merged):
- change types.graphql
- gc deploy --recreate: generate the current database schema excluding the last migration; diff against the new database schema generated from types.graphql; iff they are different, overwrite the last migration file
- gc deploy --redo: roll back the latest migration and then run all pending migrations against the server

When you integrate changes from the master branch that include new migrations, or when running migrations against a shared database (typically staging and prod); migrations have already been created and tested by developers in the feature branch where they were introduced:
- gc deploy: run all pending migrations against the server

This happens whenever you need to spin up a new test database, when you start working on a new feature branch, as well as when a new developer joins the project:
- gc deploy: should we replay all old migrations or simply create a new giant migration that goes directly to the finished schema? Rails does the latter for performance and robustness and I think we should do the same.

If you have lost access to your migration files and types.graphql, you need a way to get it back from the server:
- gc pull-types: the server generates types from the existing database structure; the cli stores them in a local file

In all workflows above, gc deploy always performs the following check to decide if it should a) generate a new migration file or b) perform migrations on the database: generate the current database schema by iterating over all migration files; diff against the new database schema generated from types.graphql; are there any changes?
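A compact sketch of that decision logic, under the same assumptions as the earlier snippets: buildSchemaFromMigrations, buildSchemaFromTypes and schemasEqual are hypothetical helpers, not real CLI internals.

async function decideDeployAction({ buildSchemaFromMigrations, buildSchemaFromTypes, schemasEqual }) {
  const fromMigrations = await buildSchemaFromMigrations('migrations/'); // replay all migration files
  const fromTypes = await buildSchemaFromTypes('types.graphql');         // parse the declared data model

  if (!schemasEqual(fromMigrations, fromTypes)) {
    // types.graphql is ahead of the migration history: generate a new migration file.
    return 'generate-migration';
  }
  // Nothing to diff: run any migrations the server has not applied yet.
  return 'run-pending-migrations';
}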
Awesome. If it can help a bit, I have a script that manages the migration scripts. Like Rails, there are different migration files prefixed with the date, and then all migrations are stored in the database as a new type (that can be included in the system API); from that, it just applies every migration that is not yet in the database. If this can help: https://gist.github.com/antho1404/576642a12f07f09f853a42ecf67a45f3
Also, in Rails I sometimes had to do a migration of the structure, then run a script, and then migrate the structure again (when adding a unique not-null field, for example). Of course it's still possible to generate 3 different migrations (create the field, migrate the data, mark it as unique and not null), but I think it could be nice to have this in a single file and so allow adding scripts between operations and not only before or after the migration.
I also think that the types.graphql should only be generated. If a project has migrations it will be really messy to have both migrations and a types.graphql file - which one will be applied if both are modified with different data? Maybe have a script that can generate all the initial migrations based on the file and then ignore it.
Thanks @antho1404! This is very helpful.
One thing I want to clear up right away: The types.graphql file is never generated. It is and will continue to be the primary way you change your Graphcool database schema. If, for example you want to add a field to an existing model, the workflow looks like this:
- add the field in your types.graphql file
- run gc deploy to generate and apply the migration

The generated migrations should not be changed manually, as this will result in your types.graphql being behind, causing the cli to create a new correcting migration that will revert your manual change to the migration file.
Does that make sense?
We should consider if migrations is the right way for us to do schema change deployments.
Redgate and Microsoft SQL Server Data Tools use a state-based diffing approach compared to Rails-style migrations. This is described in Critiquing two different approaches to delivering databases: Migrations vs state
This might be a more natural fit to Graphcool
Hum my bad I didn't read carefully
whenever the data model (types.graphql) is changed, the CLI automatically creates a migration file
I feel it's kind of weird to do it that way. Maybe because I worked too much with Rails, but if I want to update something in my database (types, data, relations, whatever...), I don't want to have to go into the big file with everything and lose time figuring out where to put my modification (maybe it will be at the beginning and the end of the file). I just want to write one file with only the updates I want to apply, exactly like the migration files you describe, and then just run a command to dump the new migration into the types.graphql file and then deploy the modification.
I guess maybe the architecture doesn't suit this idea and maybe I'm just really biased with rails migrations. Maybe I just have to try to think differently :)
It's really interesting to learn from developers with different experiences. To me, the fact that I can have one coherent view of my data model in the types.graphql file is the magic behind the Graphcool migration system.
Having to manually write migration files would be a big step backwards for me. I'd be very curious to hear from other people how they feel about this :-)
I think the thing that "bothers" me a bit is to call it migrations - maybe calling it "diff" or something like that would be less confusing for me. Another idea: if the goal is really to have the state of the database and just update the state, maybe it would be an idea to keep all the full different states (so different versions of types.graphql). The confusing part is to have some migrations and some states; I feel it's less confusing to have only one. Again, these are just some thoughts.
This sounds really good. I especially like that it's all static and doesn't introspect the DB.
Some thoughts:
Presumably it would be possible to edit the generated migrations provided the end result of running all the migrations matches types.graphql? For example to change one of the "Ambiguous change" decisions made in the CLI.

@tamlyn

Presumably it would be possible to edit the generated migrations provided the end result of running all the migrations matches types.graphql? For example to change one of the "Ambiguous change" decisions made in the CLI.
Yes
Would the before/after scripts be stored in the migration file or referenced or auto loaded based on a naming convention?
Migration scripts have to be explicitly referenced from within the migration file. Migration scripts will be invoked by the CLI.
In development you could end up with a whole load of small migrations while you tinker with the schema. In this case could you just delete all the uncommitted migrations and have the CLI regenerate a single migration?
This is a great point. As long as no migration scripts are involved, this will be possible
Sorry for chiming in so late. I thought a bit about the CLI-centered approach for migrations and I don't think it's the right approach. I agree that migrations should be stored in the Git repo of a Graphcool project. However, I don't think that this means that migrations need to be generated by the CLI.
The server stores migrations in an immutable log. The server is responsible for generating migration proposals which can then be edited by the developer. Once the editing is done the migrations can be pushed to the server.
The process of deploying would look like this: on deploy, the CLI sends the desired types.graphql to the server. The server then diffs it with the current state and sends back a list of migrations that would result in the desired state change (those migrations are a proposal).

When deploying to a new env:
Not directly related to this proposal but more on the implementation side, @cjoudrey shared a couple of thoughts with me. He was suggesting to look into online schema changes similar to these projects to continuously allow read/write access to the database while performing the migrations:
The proposed steps are fairly coarse. Given the following schema:
Initial Schema
type User {
  id: ID! @unique
  name: String
}
If a developer wants to make the name field unique, they would generate the following migration script:
Migration A:
- operation: field.update
  id: User.name
  definition: 'name: String @unique'
This includes the intended change (User.name should be unique) as well as the current state of the field (called name, of type String, not a list, and optional)
If a second branch created from the same initial schema introduces a change to make the field required, that migration will look like this:
Migration B:
- operation: field.update
  id: User.name
  definition: 'name: String!'
Again, the desired change as well as the current state of the field is contained in the migration.
In this situation two branches introduce changes that could in theory be compatible. Before we dive into the possible conflict resolution strategies I want to clarify how Graphcool and Git branches play together:
In a team with multiple developers it is common to have the following environments:
local development can use the local docker based version of the GraphQL Database or connect to a hosted development cluster. Crucially, changes to the database schema during development of a feature should be applied to an environment available only to the current developer.
It is suggested that a separate branch is maintained throughout the development of the feature. When the feature is done, the branch is merged to a dedicated staging branch and continuous integration will deploy the database changes to a dedicated staging database.
When tested in staging, migrations can be performed on production.
In the scenario above, two developers created a feature branch from the same starting point, and now merge their branches to staging at the same time.
As the migrations are stored in files with unique names, there will be no git-level conflict. So we will have to deal with the conflict at deploy time.
We have 3 possible conflict resolution strategies:
1. Reject the second deploy
To do this we would need to store extra information about the origin allowing us to keep track of the origin of a deploy. This would be very similar to the tree structure git maintains for branches. Maybe we could find a trick that allows us to implement this in a more simplistic way.
2. Last write wins
This would be very easy to implement. If we go for this I believe we need some other way to make the developer aware that there is a potential issue. Rails deals with this by always forcing a source control conflict when branches simultaneously change the database schema.
This is easy to implement, but probably not the developer experience we want to deliver.
3. Perform more granular migrations
In this particular case the conflict could be avoided if we perform only the intended change instead of overriding all settings for the field. This can be implemented in two ways:
- operation: field.update
  id: User.name
  required: true
We should consider adding a migration status command: https://orator-orm.com/docs/0.9/migrations.html#getting-migrations-status
With this comment i would like to work towards a common understanding of conflicts that can occur with regards to migrations. When we discuss actual solutions we can check how a given conflict would be handled within that solution.
It is important to note that migrations consist of 2 parts based on our current understanding:
A conflict might affect only one or both parts of a migration. The affected part is mentioned in each of the following scenarios.
The following is the standard scenario for the potential conflicts. Developer A and B start a branch from master from the same commit. Both developers perform their changes on their respective branches. Developer A gets to merge first. Developer B merges afterwards and therefore might be exposed to a conflict.
When does the conflict occur?
This conflict can occur whenever developer A renames something (model, field, enum..) and developer B does not know that change. If a migration of developer B refers to the old name it will fail when it is applied after the changes of developer A.
Example
Developer A changes the name of the type _User_ to _Customer_. Developer B adds a new field to the _User_ type on their branch. Developer A merges first to master and the change gets applied to the staging environment. Developer B merges afterwards and when their changes get applied the system does not know the type _User_ anymore and therefore fails.
Affected migration part
schema & script
Consequences
When does the conflict occur?
This conflict can occur whenever developer A deletes something (model, field, enum..) and developer B does not know that change. If a migration of developer B refers to the deleted thing it fails.
Examples
Developer A deletes a field and Developer B refers to it (updating or deleting it).
Affected migration part
schema & script
Consequences
When does the conflict occur?
This conflict can occur whenever developer A and B use the same name for something new (model, field, enum).
Examples
Developer A and B create a field with the same name.
Affected migration part
schema
Consequences
When does the conflict occur?
This conflict can occur whenever developer A and B update the same thing (model, field, enum).
Examples
Developer A changes the type of a field and Developer B makes that field required.
Affected migration part
schema & script
Consequences
The fundamental problem is that developer B performs their changes based on the knowledge about a certain starting point. That starting point is the commit this developer used for branching.
When developer B merges their pull request however this starting point might have changed drastically. Therefore B's migrations might fail or won't have the intended effect.
What needs to happen here is that the changes of B need to be adapted to the new starting point. In Git speak developer B needs to rebase their migrations. In a fully manual approach that would mean to go through all the migrations they have created and reevaluate them against the new starting point.
With this comment I want to make us aware that there are 2 fundamental ways to store migrations. The goal of a migration is that we want to apply it to a service in a given state and thereby transition it to the next state in a reproducible fashion.
We can express this with the following formula: P(n) + M(n, n+1) = P(n+1). This can be read as:
Given a project in state n: if we apply the migration steps that capture the transition of the project from state n to state n+1, we end up in state n+1.
The initial project state P(n) is always given. In a migration file we can either store the migration steps M(n, n+1) or the next state of the project P(n+1) in order to capture what needs to be done to migrate a project into the next state. Therefore we have 2 options to store a migration:
- Store the migration steps M(n, n+1): we apply them to the current project state P(n) and calculate the next project state P(n+1).
- Store the next project state P(n+1): in this case the server would calculate the required migration steps M(n, n+1) and apply them to the project.

There are 2 ways to store a migration in a file. We either store the concrete steps or the next project state.
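A minimal sketch contrasting the two options, with made-up helpers: applyStep(state, step) executes one imperative step, while diff(current, target) derives steps for the declarative variant.

function migrateWithSteps(currentState, steps, applyStep) {
  // Option 1: the file stores M(n, n+1); the server just replays the steps.
  return steps.reduce((state, step) => applyStep(state, step), currentState);
}

function migrateWithTargetState(currentState, targetState, { diff, applyStep }) {
  // Option 2: the file stores P(n+1); the server derives M(n, n+1) itself.
  const steps = diff(currentState, targetState);
  return steps.reduce((state, step) => applyStep(state, step), currentState);
}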
Open Question: Would it be beneficial to store the migration steps and the next project state?
Pro and Con of the migration approaches
Pro:
Con:
Pro:
Con:
Is there any news on this?
Coming from a dotnet/EF background I highly value this, especially the ability to roll forward and backwards in both data and schema.
Here's an example which I think summarizes a lot of the concerns:
This is a very common process. There are a few things that would currently break (as I understand it).
To summarize: While a static schema is necessary for maintaining simplicity, and the "magic" of Prisma (of which I'm a huge fan), some history of these changes needs to be recorded in the (version controllable) code base. Then the server needs to be able to accurately interpret that history, and identify where its state fits in to that history, so that it can apply the migration steps necessary to automatically update its state to be current with the code base. The server will also need to be able to roll back in time (manually or automatically) to a previous state, in the case of a failed update.
So far, the discussion seems to be surrounding how the history "log" gets generated, and what generates it, and what format it should take.
Is this a fair summary? Has the conversation progressed at all?
For the record, I'm sitting on the edge of my seat. Some of the stuff you guys are doing is blowing my mind, and I can't wait until it can blow my clients' minds as well.
Hi,
I want to migrate a DB2 schema to a MySQL database. Is there any open source tool or script which can help to migrate the schema as well as the data?
Please help!
Regards,
Ravi
Hey there, here's my two cents:
(This opinion does not give any solutions regarding conflicts (e.g. a migration tries to access a deleted field), but rather a general view of what a migration file could look like, and a way to give developers flexibility.)
As @dpetrick said, we really need to give developers flexibility regarding migrations, and break the magic of Prisma a little for the sake of freedom.
As soon as you run into production, you will inevitably fall into some situations where you have to customize migrations to maintain backward compatibility with old stuff, or to provide default values on targeted nodes based on some very specific needs.
Let's say you're turning a non-required field that already has nodes into a required one.
You will have to provide default values to make sure your schema stays in sync with your database.
For now, the advised way to do so is to run GraphQL mutations in the Playground, such as:
updateUser(
  where: { firstName: null }
  data: { firstName: "DEFAULT_FIRST_NAME" }
)
I think leveraging the power of GraphQL Bindings and Javascript to offer flexible scripted migrations could be super powerful.
In PHP using Doctrine, a migration looks like this:
/**
 * Auto-generated Migration: Please modify to your needs!
 */
class Version20180507073532 extends AbstractMigration
{
    /**
     * @param Schema $schema
     */
    public function up(Schema $schema)
    {
        // this up() migration is auto-generated, please modify it to your needs
        $this->abortIf('mysql' !== $this->connection->getDatabasePlatform()->getName(), 'Migration can only be executed safely on \'mysql\'.');
        $this->addSql('ALTER TABLE company_metadata ADD return_policy LONGTEXT DEFAULT NULL, ADD warranty_policy LONGTEXT DEFAULT NULL');
    }

    /**
     * @param Schema $schema
     */
    public function down(Schema $schema)
    {
        // this down() migration is auto-generated, please modify it to your needs
        $this->abortIf('mysql' !== $this->connection->getDatabasePlatform()->getName(), 'Migration can only be executed safely on \'mysql\'.');
        $this->addSql('ALTER TABLE company_metadata DROP return_policy, DROP warranty_policy');
    }
}
As you might have noticed, it's clearly written: Please modify to your needs!.
That means you're allowed to script a migration to your needs while leveraging the """"power""" (or not 😛) of PHP.
Let's say we introduced a new required field called allowedToAccessWebsite.
Using Javascript, that would allow us to script a migration like this:
class Version20180507073532 extends AbstractMigration
{
    async up(db: Prisma)
    {
        const minorUsers = await db.query.Users({ where: { age_lt: 18 } });
        const majorUsers = await db.query.Users({ where: { age_gte: 18 } });
        const minorUsersIds = minorUsers.map(minorUser => minorUser.id);
        const majorUsersIds = majorUsers.map(majorUser => majorUser.id);
        await db.mutation.updateUser(
            { where: { ids_in: minorUsersIds } },
            { data: { allowedToAccessWebsite: false } }
        );
        await db.mutation.updateUser(
            { where: { ids_in: majorUsersIds } },
            { data: { allowedToAccessWebsite: true } }
        );
    }

    async down(db: Prisma)
    {
        ...
    }
}
Those migrations should indeed have records of the schema updates as well.
The end format could look like this:
class Version20180507073532 extends AbstractMigration
{
    /* Please, do not change this */
    schemaUpdate() {
        return {
            schema: {},
            steps: {},
            ...
        }
    }

    async up(db: Prisma)
    {
        ...
    }

    async down(db: Prisma)
    {
        ...
    }
}
Or even have them separated in two files, timestamped at the same date, with one file containing the schema changes, and another containing the scripts.
Coming from C# and Entity Framework to Node, I've really found that this is the biggest thing I miss. If Prisma could make something similar to Entity Framework (it looks a lot like what @Weakky just laid out), it would make Prisma a real asset to the Node community.
Thanks a lot everyone for chiming in. Migrations is a super interesting and tough problem, and we would love to learn more about your use cases to arrive at a solution that covers them.
We set up a working group to further discuss this; if you're interested, I would love to have you in the #migrations channel in our Slack community.
I've written in the Slack channel and just now found this issue. I agree with @mavilein, the CLI-centered approach is a bad idea. A CLI workflow is good for prototypes and quick starts, but for heavy continuous/team development with proper workflows, CI/CD pipelines, etc., it is unsuitable.
This should be server-centric
@mavilein
There are 2 ways to store a migration in a file. We either store the concrete steps or the next project state.
It's kinda imperative vs declarative. Storing the next project state (i.e. the declarative way) will lack transition steps (how to go from one state to another). And this is a huge problem. One should not rely on diffing algorithms; this gives very high chances of data loss or corruption. IMO migrations should always be imperative, i.e. they should instruct the server exactly how to transition to the new state, rather than what the state should be.
As the Slack channel is very inactive, I'll also write here. I see two ways to work with versioned migrations:
- The first relies on a migrations table (in fact replicating the local migration history). Maybe not a very robust solution, but it should work. One can locally build a migration process using temps and other tricks to alter/move/add structure and data, then "freeze" the process in the migration history, and the target server will replicate this process.
- The second variant I see as more solid and controllable, but it fundamentally changes the approach to data modeling and requires the development of a special datamodel-builder syntax and/or a special cluster API for this.

An example of a GraphQL API for migrations could be something like this:
mutation {
  model(
    name: "Article",
    addField: [{ name: "slug", default: "", index: INDEX_UNIQUE }],
    changeField: [{ name: "old_name", rename: "new_name", nullable: true }],
    removeField: [{ name: "old_name" }]
  ) {
    status
  }
}

So the migration would be a file with mutations like this. This approach also simplifies adding basic data: not seeds with test data, but system-critical data, like an initial admin account or initial settings, etc.
Without advanced migrations for migrating up AND down (incl. hooks to invoke data migration scripts), Prisma can NOT be considered production ready for any team using CI/CD. Basically this means any larger customer, especially enterprise, will find the lack of this a show stopper.
I don't really care about the format, but it has to be on disk and editable, incl. being able to make any changes to the data we need.
What is the current status and how can we help (the Slack channel seems to be dead)?
Is this still active? Is there something community could help?
Can we get an update on this? We're currently looking to move away from Prisma due to the lack of migrations support. Thanks.
I second @Gdewilde's sentiment. The lack of proper scripted migrations is a huge problem for any production application built on prisma right now, and is one of many issues that is causing my team to reconsider using prisma going forward.
@marktani Do you guys have any update on your thoughts here?
@schickling @sorenbs - friendly ping 😄
It would be great with confirmation that this is still planned for Q4 and being worked on. Not having migrations in Prisma is the single largest problem IMO for production applications, effectively rendering CI/CD impossible.
Agreed. This is why I'm not using Prisma on any client projects yet.
Thank you for all the input on this topic!
We are making progress towards a more flexible migration system, and I'd like to describe our approach.
From talking to many users of Prisma we realised that the declarative migration system is tremendously valuable to a large group of users. Thank you @terion-name for describing the difference between declarative and imperative migrations above.
There are two major problems with the current declarative migration system:
We are currently taking steps to address both issues without abandoning the idea of declarative migrations.
First, we introduced the passive mode earlier this year allowing you to use Prisma with an existing database. We are addressing some limitations and bringing this capability to MySQL in addition to Postgres. If you need full control over database migrations or want to run scripts in ci/cd, it is a viable option to simply not use the declarative migration system.
Second, we want to make it seamless to use the declarative migration system most of the time, but opt out for specific migrations that require complex logic. This way you will be able to do most migrations like you do today, and write a sql migration when you have to. We are streamlining the datamodel in order to support this. When this work is complete, the declarative migration system will also become more flexible and for example support relations without a relation table.
Work on this is underway but we are not able to give a concrete timeline yet.
In addition to these larger changes, we are also working on small items to make the declarative migration system more useful: https://github.com/prisma/prisma/issues/2323
Over the past year we have seen Prisma be adopted for more different use cases than we could have imagined. It has become clear that the current monolithic design is not serving us well. In the future we will split Prisma into individual projects each with a very clear focus. Declarative migrations is likely to be the first feature that is promoted to its own project. When this has happened we will be in a better position to deliver on the vision laid out in the original feature request.
I would love to hear your thoughts on this approach as well as specific feedback on the new datamodel described in this spec.
Great news! Thanks for the update!
It's great to hear that this is actively being worked on! However, I'm a bit disappointed that you're leaning towards the declarative direction, with raw SQL as the 'escape hatch'/fallback. I'd much rather prefer the imperative and versioned migration model that @terion-name described, having specific migration code run on the DB for each up/downgrade of the database.
It's not usable in CI/CD if it just works 90% of the time - then it just defeats the purpose and will not meet the requirements for larger projects. I think it should be fully controllable, meaning a version for the DB and coded migration scripts up/down for each version - and preferably not in SQL.
Or maybe I misunderstood?
@jhalborg - I like that vision too. When we have extracted the migration system to a separate project we will be able to better iterate on advanced use cases like this.
I definitely think it is possible that the Prisma migration system will evolve to support imperative, non-sql migrations as suggested by @terion-name and @schickling in the initial feature request.
@sorenbs separating the migration system into a separate project that interacts with the Prisma server via a clear API is a good idea IMO, because it will open up possibilities to _choose_ what migration system suits the project/setup needs.
And maybe it could even work in a pipeline manner, stacking them (e.g. overlaying some programmatic migration scripts over those generated by the declarative builder/differ).
As this is a very large amount of work and obviously will not be implemented soon, it is very important to push #2323 forward (please!) - this really will make life easier for many cases.
And the biggest problem for now is migrating data when the structure changes. This is really a huge problem. But there are so many pitfalls with this that I can't design any approach that would be easily tweaked into the current system. The obvious part is to run some queries in the process (after adding raw SQL to the query engine this became more useful), but how to hook into the process with these queries?
One thing that keeps going around in my mind is to make callbacks on warnings. Now, when the migrator faces data loss, it spawns warnings with certain info about what and where it sees problems. Theoretically this info can be used to form a pattern-based string that matches a "fixing" query, a list of which would be passed along with the schema.
For example, we change the User schema by removing FullName and changing it to FirstName and LastName. The migrator creates the new columns and encounters data loss due to dropping FullName. So it searches the provided payload for a "fixing" or "migrating" query named something like User_FullName_drop. If it is found, it runs, and there we can write a query that takes the data from FullName and moves it into the FirstName and LastName columns.
It's just a concept that obviously will not fix everything with declarative migration, but again, it can resolve a lot of situations IMO
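To make the idea a bit more tangible, here is a rough sketch of such a callback-on-warnings flow. The warning shape, the key format (User_FullName_drop), the fixes payload and runRawQuery are all made up for illustration, and the SQL is Postgres-flavoured.

// Fixing queries supplied alongside the schema, keyed by a pattern derived from the warning.
const fixes = {
  // Move the data before the FullName column is dropped.
  User_FullName_drop: `
    UPDATE "User"
    SET "FirstName" = split_part("FullName", ' ', 1),
        "LastName"  = split_part("FullName", ' ', 2)
  `,
};

async function handleDataLossWarning(warning, runRawQuery) {
  const key = `${warning.type}_${warning.field}_${warning.action}`; // e.g. User_FullName_drop
  if (fixes[key]) {
    await runRawQuery(fixes[key]); // run the user-supplied fixing query
    return 'fixed';
  }
  return 'unresolved';             // fall back to the normal warning/abort flow
}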
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.
The specification for the new migration system can be found here.
Declarative migrations is likely to be the first feature that is promoted to its own project
Does this mean that we will be able to use this declarative migrations library independently of Prisma?
That would be super neat. I've been looking for / considering implementing a do-one-thing-do-it-well database-agnostic migration library.
I wonder if there is an update on new migration system and if there is a release date of it.
Right now, I'm planning to create a node script that creates a migration file under a migrations folder, and fill it manually. Such as:
const { prisma } = require('./generated/prisma-client')

module.exports = async () => {
  await prisma.$graphql('mutation { executeRaw(query: "UPDATE `default@default`.User SET foo=\'bar\'") }')
}
However, this process is error-prone and not maintainable in the long term.
I want to know if any of you have any other solutions.
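For what it's worth, a minimal sketch of the kind of hand-rolled runner described in the previous comment might look like this. It assumes each file in migrations/ exports an async function like the snippet above; note there is no bookkeeping of what has already run, which is exactly the part that makes this hard to maintain.

const fs = require('fs');
const path = require('path');

async function runMigrations(dir = path.join(__dirname, 'migrations')) {
  // Run every migration file in timestamp order.
  const files = fs.readdirSync(dir).filter(f => f.endsWith('.js')).sort();
  for (const file of files) {
    const migrate = require(path.join(dir, file));
    console.log(`running ${file}`);
    await migrate();
  }
}

runMigrations().catch(err => { console.error(err); process.exit(1); });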
I believe this is the latest update -- Lift in Prisma 2 preview:
https://www.prisma.io/blog/announcing-prisma-2-zq1s745db8i5#prisma-2-next-generation-database-tooling
https://lift.prisma.io/