Rails: MySQL UTF8MB4 breaks ActiveRecord schema setup

Created on 21 Mar 2013  路  49Comments  路  Source: rails/rails

Mysql's UTF8 support (charset name 'utf8') has been limited to 3-byte characters. This was changed in mysql 5.5 by adding a new charset 'utf8mb4' and creating an alias for 'utf8' of 'utf8mb3' (for multibyte 3 chars). It is my understanding that 'utf8mb4' is the "real" utf8 support, and it will likely be a popular option amongst developers. I discovered this problem by storing the text of Twitter messages where activerecord would throw an exception because mysql was rejecting the INSERT statements due to 'invalid' 4 byte UTF8 characters.

The current mysql2 gem does not support charset utf8mb4 but the github head version of the gem does. After setting up the new character set in database.yml and the mysql server, it is no longer possible to create a new database through the usual rake db creation tasks.

The mysql docs talk about a new limitation on database indexes for string columns when the utf8mb4 charset is in use. The limit is 191 characters for string columns using utf8mb4. (http://dev.mysql.com/doc/refman/5.6/en/charset-unicode-upgrading.html). When the schema_migrations table is created with string length 255, mysql throws an error.

$ rake db:setup

Mysql::Error: Specified key was too long; max key length is 767 bytes:
CREATE UNIQUE INDEX unique_schema_migrations ON schema_migrations (version)

A slightly longer writeup is in http://donpark.org/blog/2013/02/16/rails-3-2-12-not-ready-for-mysql-5-5-utf8mb4

Most helpful comment

this will set default mysql string column length to 191 instead of 255 which is the new index limit on utf8mb4 (aka real utf8).

# config/initializers/mysqlpls.rb
require 'active_record/connection_adapters/abstract_mysql_adapter'

module ActiveRecord
  module ConnectionAdapters
    class AbstractMysqlAdapter
      NATIVE_DATABASE_TYPES[:string] = { :name => "varchar", :limit => 191 }
    end
  end
end

Update: please note that either updating MySQL config and patching table creation to create dynamic row as mentioned here or using newer MySQL is a better long term solution.

Or just use PostgreSQL.

All 49 comments

Fixed by 8744632fb5649cf26cdcd1518a3554ece95a401b :grin:

Thank You!

I still see the Mysql2::Error: Specified key was too long; max key length is 767 bytes: CREATE UNIQUE INDEX when performing rake db:create && rake db:migrate with Rails 4.

Having the same issue as @heaven, Rails4 as well, while trying to run default Devise migration.

I'm still seeing this error in Rails4

this will set default mysql string column length to 191 instead of 255 which is the new index limit on utf8mb4 (aka real utf8).

# config/initializers/mysqlpls.rb
require 'active_record/connection_adapters/abstract_mysql_adapter'

module ActiveRecord
  module ConnectionAdapters
    class AbstractMysqlAdapter
      NATIVE_DATABASE_TYPES[:string] = { :name => "varchar", :limit => 191 }
    end
  end
end

Update: please note that either updating MySQL config and patching table creation to create dynamic row as mentioned here or using newer MySQL is a better long term solution.

Or just use PostgreSQL.

:+1: Thanks @edogawaconan's monkeypatch worked! :sparkles:

Are there plans to fix Rails 3 with this patch?

@yyyc514 unfortunately not, Rails 3 is not being supported for bug fixes anymore.

@jeremy Can you confirm the necessity of the patch by @edogawaconan for Rails 4? Readers of this issue will currently be unclear as to whether they need to include that patch and although it is trivial to find out, it would be good if an authoritative source could weigh in, so 'finding out' doesn't need to be repeated by every reader. Especially if the patch _is_ required, in which this case this issue should be reopened?

My Rails 4.0.2 schema exploded after I switched to utf8mb4 (needed it for Emoji support):

-    t.text     "parameters"
+    t.text     "parameters",     limit: 16777215
-    t.string   "state",      default: "current"
+    t.string   "state"

What's worse - some defaults are missing. Please advise should I merge new limits with old defaults?

@Confusion Tricky one. Changing the default string length is a major breaking change. Some options to explore:

  • Explicitly limit your string length in the migration.
  • Use prefix indexes to limit index bloat (rarely need an index on all 191 chars).
  • Make default varchar limits configurable and dump it along with schema.rb to preserve compatibility.
  • Switch to MySQL DYNAMIC table format and enable innodb_large_prefix to increase the max key length from 767 bytes to 3072 bytes: http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix

@firedev Your schema is dumped from your db, so if a default is missing, it was inadvertently removed from your db. Do a migration to re-add your default and the next schema dump will have it.

Okay, thanks, but now I have the same

Mysql2::Error: Specified key was too long; max key length is 767 bytes: CREATE UNIQUE INDEX `unique_schema_migrations`  ON `schema_migrations` (`version`)

Guess it was patched for >= 4.1.0.rc1, was it?

I have the same issue on Rails 4.1.1:

Mysql2::Error: Index column size too large. The maximum column size is 767 bytes.: ALTER TABLE ...

I have used gem demoji and converted database back to UTF8 to get rid of these issues.

I'm also experiencing this issue in a Rails app recently upgraded from 3.2.18 to 4.1.1 in that it cannot create an index on a VARCHAR(255) column.

The only workaround I've found so far is updating the column to VARCHAR(191).

with sql:

ALTER TABLE table_name MODIFY column_name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

in a migration:

execute "ALTER TABLE table_name MODIFY column_name VARCHAR(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"

So it appears to me that either @edogawaconan's monkeypatch is still required in Rails 4.1.1 or you must manually keep track of any string values that you want to index on, to keep those specific columns limited to VARCHAR(191).

I'd love to be wrong about this, though.

fwiw I hit this error on rails 4.1.4, monkey patch fixed. Sorry haven't investigated more.

What I've done and what I've learned:

  • switched the database.yml encoding to utf8mb4
  • left all my tables utf8
  • left the database utf8 (ensures new tables are utf8 by default)
  • converted individual columns as necessary to utf8mb4
  • switched to schema.sql dumps and imports so the column encodings are preserved properly

This seems to work great without any weird Rails issues - if you can give up schema.rb, which I've always found more annoying than helpful in a large production app anyways.

I'm getting this same problem with 4.2.0.beta1 (about key lengths being too long). The monkeypatch fixes this, but obviously is not ideal. Is this on the roadmap somewhere?

Solved.
mysql 5.6
rails 4.1

in your my.cnf:

innodb_large_prefix             = 1
innodb_file_format              = barracuda
innodb_file_per_table           = 1

and you must create the table with the option ROW_FORMAT=DYNAMIC (by default it's COMPACT)

here I found a monkey patch http://3.1415.jp/mgeu6lf5

config/initializers/ar_innodb_row_format.rb
ActiveSupport.on_load :active_record do
  module ActiveRecord::ConnectionAdapters   
    class AbstractMysqlAdapter 
      def create_table_with_innodb_row_format(table_name, options = {}) 
        table_options = options.reverse_merge(:options => 'ENGINE=InnoDB ROW_FORMAT=DYNAMIC')

        create_table_without_innodb_row_format(table_name, table_options) do |td|
         yield td if block_given? 
        end
      end
      alias_method_chain :create_table, :innodb_row_format 
    end
  end
end

better solution, instead of use the monkey path just add the option.

create_table :table_name, options: 'ROW_FORMAT=DYNAMIC' do |t|
  ...

I think the point of the monkey patch is because it's easy to forget to always add those options... or new team members replace old ones and no one knows that special options are always required, etc...

Sure you can do it manually, but if you want to do it once and forget it (though document the monkey patch somewhere) the patch makes sense. My solution above has also proven to work well in practice.

@yyyc514 you can use any solution, which one depends on your project and your team. If the option to add ROW_FORMAT is there I will use it. I prefer use the options that rails provide instead of monkey patch something.

Monkeypatch is the last resort. You say it will be easy for new team members... I don't agree, you can document this issue or even read the previous migrations. Also monkeypatch is not upgrade safe.

The required options will be visible for them even if they don't read the documentation or the migrations because they will receive the error discussed here. the obvious questions after face this error will be, How the previous migrations are working? and the first step will be look at them for sure.

I encourage the use of the options parameter in create_table method.

For stuff like this I usually also include a Ruby or Rails version check. If someone bumps the Ruby or Rails version then an error gets raised and they re-evaluate the use of the monkey patch and it's utility - making the monkey patching about as upgrade safe as it's going to be. Also putting all monkey patches in a single documented location also helps. Though technically in this case the version of the DB might also matter in the "check" process.

Again, monkey patching isn't how I solved this particular issue, but it can be made safe enough. create_table is great for those who are going to remember it every time. But if you know you aren't, there are other choices.

This is just an icky problem all around. :)

they will receive the error discussed here

Not if the indexes that trigger the issue are going to being added at a later date. This can sometimes be quite a subtle issue that doesn't bite you until later. :-)

@yyyc514 Do you create indexes without testing it? before pushing the indexes to production you should test them in development, run your tests (env test), deploy to staging and if all is working you can deploy to production. If you don't follow a similar process you are doing it wrong...

I can't imagine a case where I create a migration with indexes that will trigger the error when deploying to production, even a hotfix should be tested first.

@NaN1488 No, I test migrations locally first but the problem is if it's a large table that you created previously in the wrong format (by forgetting the create_table options)... the problem isn't that you'll get an error adding the index (which isn't a biggy in and of itself) the problem is the downtime you'll have re-writing the entire table on disk before you can THEN try add the actual index. Better to make sure the table is created properly even if someone is forgetful.

Everyone is welcome to their opinion though. :)

I noticed that people from emojli mentioned that utf8mb4 still has problem dealing with emoji.

Reference: https://www.youtube.com/watch?v=GsyhGHUEt-k

Take in account that the row format option is not translated in the schema file.

2 years later. Where are we at with this? I just had the same error.

I think it's read the full thread and pick a solution that works for you. If using MySQL 5.6 or newer is an option and you can tweak a few settings and pick your own table storage format there are some options that don't have the weird 767 byte limitation at all - so all of this would "just work". But otherwise you have to jump thru some hooks and hack around the index sizing issues.

We need to be able to have the NATIVE_DATABASE_TYPES to a length of 191 when full UTF8 (utf8mb4) is used.

after switching encodeing in config/database.yml no issues was founded. working good

I have always this error, when it was not this error:

Mysql2::Error: Specified key was too long; max key length is 767 bytes:

it was this other:

Mysql2::Error: Specified key was too long; max key length is 3072 bytes

so I as suggested before I create a new database with this settings using docker and the official mysql

docker run -p 3306:3306 --name mysql-name -e MYSQL_ROOT_PASSWORD=name -d mysql:5.6 --innodb-large-prefix=1 --innodb-file-format=barracuda --innodb-file-per-table=1

and finally the database.yml as follows change utf8 latin1 the sswedish is because in the server I saw that:

development:
  host: 192.168.99.100
  database: name
  username: root
  password: root
  adapter: mysql2
  charset: latin1
  encoding: latin1
  collation: latin1_swedish_ci


test:
  host: 192.168.99.100
  database: name_test
  username: root
  password: root
  adapter: mysql2
  charset: latin1
  encoding: latin1
  collation: latin1_swedish_ci

production:
  host: 192.168.99.100
  database: name
  username: root
  password: root
  adapter: mysql2
  charset: latin1
  encoding: latin1
  collation: latin1_swedish_ci

after that works

bundle exec rake db:create
bundle exec rake db:migrate

I did it with the monkey patch and without the monkey patch suggested here and it works

config/initializers/ar_innodb_row_format.rb

ActiveSupport.on_load :active_record do
  module ActiveRecord::ConnectionAdapters   
    class AbstractMysqlAdapter 
      def create_table_with_innodb_row_format(table_name, options = {}) 
        table_options = options.reverse_merge(:options => 'ENGINE=InnoDB ROW_FORMAT=DYNAMIC')

        create_table_without_innodb_row_format(table_name, table_options) do |td|
         yield td if block_given? 
        end
      end
      alias_method_chain :create_table, :innodb_row_format 
    end
  end
end

For those of you thinking "wasn't this fixed in 8744632, why isn't it working?". The fix was removed in #23168, for some reason...

Until this is re-applied, @NaN1488's solution: https://github.com/rails/rails/issues/9855#issuecomment-57666389 is helping to allow utf8mb4 in rails 5.0.2 and 5.1.0.rc2 for my applications.

This issue is still happening in Rails 5.1.1 馃槙

The monkey patch doesnt work with Rails 5.1.2/Ruby 2.4.1. Running Rails errors with with:

NoMethodError: undefined method `alias_method_chain' for ActiveRecord::ConnectionAdapters::AbstractMysqlAdapter:Class
Did you mean?  alias_method

I am migrating a project from Rails 4.2.8/Ruby 2.4.1. It works great with these versions (I have been using the monkey patch for a while). If anyone has a suggestion for a new monkey patch, i would appreciate it. I have tried the other suggestions in this thread, but they dont last long-term, something always breaks. The monkey patch worked flawlessly for me every time I dropped, created, and migrated the DB. Without it, i get the _767 bytes_ error.

@scott-knight which one did you try? I've used @nanaya's and that works every time 馃槃

@deanpcmad I have been using the one supplied by @NaN1488 and @anquegi. It's been great up until I tried to migrate as mentioned. I will try @nanaya's solution. Thanks for the input.

I have worked around this by
ALTER DATABASE nnnnn CHARACTER SET utf8mb3
But I would like an out-of-box solution so I don't need to change the characterset in the database.

@matthewdarwin Thanks for your solution, Matthew! Did you tried to set charset to utf8mb3 in database.yml and then run rails db:create?

No, I did not try that.

Guys, can we reopen this issue? Still happens with Rails 5.1.6. Even if @yumitsu 's workaround works, Rails need more sane defaults for encoding: utf8, even if it means we have to drop "real" UTF-8 and use something that more reliable.

Shouldn't be a problem anymore on MySQL 5.7.7 or later with its DYNAMIC default row type and large prefix default enabled.

@nanaya still working thank you

better solution, instead of use the monkey path just add the option.

create_table :table_name, options: 'ROW_FORMAT=DYNAMIC' do |t|
  ...

Where should this option be added?
None of the above fixes are working with MySQL 5.6.33 Ruby 2.5 and the latest stable 4.0 release 4.0.0 (2018-12-09)

thanks!

I am still running in to this problem with Redmine 4.1.1 and Ruby 2.5.7

Was this page helpful?
0 / 5 - 0 ratings