When it comes time to apply a migration to a big data table that's in a production system with hundreds of GB or many TB of data things get complicated. At this time it's close to impossible to achieve these big table alterations with Phinx because we need a lot of control on what Phinx is doing, most importantly we need the ability to alter tables live using pt-online-schema.
To date we've managed to use Phinx in a big data-high volume environment by using a MySQL master-slave setup. It's not ideal because you have to pause replication and apply the migration to replication so your changes don't replicate through (otherwise you lock the tables on master and cause a production outage).
My current process for altering production tables so far has been:
A solution to this would be updating Phinx to support pt-online-schema for Percona MySQL. This is a tool built by Percona to alter existing tables that are in production and live without bringing down the database or locking tables. We've successfully altered a 7 million row table in production receiving live traffic with hundreds of thousands of selects and inserts a minute.
I'm still thinking about the overall design for this change to ensure it's as flexible as possible. For example development environments won't use Percona Online Schema so the up() and down() methods will be used to execute raw SQL against the database without being piped through pt-online-schema for development environments. However, against stage and production systems where tables are giant and traffic is busy it would be ideal to have a Phinx CLI option that routes the up() and down() method calls to specific pt-online-schema methods.
For example --prefer-pt-online-schema which will instruct Phinx to run pt-online-schema methods if available, otherwise fall back to their up() and down() counterparts.
Thoughts and ideas? Possible problems?
:+1:
@reloaded are you still using phinx? if yes, how you solved those issues?
We never implemented pt-online schema support into Phinx for big data tables. Instead we ran PT online schema directly from the command line.
I'm currently doing full stack .NET / SQL Server work now so I haven't been using MySQL/PHP much.
I could probably take a look to get an idea of how much effort this would require to implement this. If someone wants to pair up with me on this kind of feature that would help me out a lot with my limited time.
While searching on this topic I found that: https://github.com/masom/lhm_php
Which is exactly what I wanted.
I would love to see this in phinx. Right now my team will manually run the pt-online tool for large tables in multiple environments. It would be great to work it in to the phinx workflow so the migrations are run / tracked sequentially and easy to run/deploy.
@jblotus https://github.com/masom/lhm_php is pretty much pt-online-schema. The original project is https://github.com/soundcloud/lhm
For those looking for solutions around online-schema changes. MySQL 5.6+ adds online DDL changes for a number of common changes obviating the need for pt-online-schema. It doesn't capture every kind of migration but it at least reduces the need.
Thanks @markstory - was not aware of this! Seems worth investigating for many use cases.
Anything concrete to be done here now? Or shall we close?
@dereuromark I haven't been in the PHP and MySQL realm for some time now and no longer with the big data company where we needed this.
There is still a TON of value for the Phinx product to support pt-online-schema for handling migrations. This gives Phinx the ability to have support for alterations against big data tables (using Percona Tools) which provides more options to the end users.
I'd suggest not closing this issue and finding a way to get it added to your road map.
Seems like the interest is not too high yet.
PR welcome.
Most helpful comment
@dereuromark I haven't been in the PHP and MySQL realm for some time now and no longer with the big data company where we needed this.
There is still a TON of value for the Phinx product to support pt-online-schema for handling migrations. This gives Phinx the ability to have support for alterations against big data tables (using Percona Tools) which provides more options to the end users.
I'd suggest not closing this issue and finding a way to get it added to your road map.