Sorry that I have to open this issue again, since it hasn't actually been resolved. Following #9256, what @GrahamCampbell said is not the case. I've quoted it here for convenience:
I've just tried out v5.1 and got the infamous ERR1071 while running `artisan migrate` right after `laravel new`. I can't find a solution to it yet. Because the table charset is utf8mb4 and the email column is (by default) varchar(255), creating a unique index on it will definitely cause an error. I can't find a way to specify the key length while calling `$table->string('email')->unique();` from the `2014_10_12_000000_create_users_table.php` file. After a little digging, it seems we'd have to modify several index-related functions in the `Illuminate\Database\Schema\Blueprint` class, and eventually (maybe not) `Illuminate\Support\Fluent`. Any hint?
@GrahamCampbell answered:

> Yeh, it's possible. Pass a second param to string with the length.
> `$table->string('email', 128)->unique();`
Continuing the conversation:

_Passing the length to `string()` changes the length of the column, but not the KEY length. The email column does need to be long enough to hold the data; truncating it to 128 may break things._
From what I've dug up, `Illuminate\Database\Schema\Blueprint::string` is defined as:

```php
public function string($column, $length = 255)
{
    return $this->addColumn('string', $column, compact('length'));
}
```
which just wraps a call to `Illuminate\Database\Schema\Blueprint::addColumn`:

```php
protected function addColumn($type, $name, array $parameters = [])
{
    $attributes = array_merge(compact('type', 'name'), $parameters);

    $this->columns[] = $column = new Fluent($attributes);

    return $column;
}
```
which in turn creates a Fluent instance.
I didn't dig deeper into it, other than checking the generated SQL. I can't find any key-length-related processing anywhere in the code.
To clarify a bit more: the key length is MySQL-specific. It makes a partial index using only the start of the column, up to the given prefix length (measured in characters for string columns and in bytes for binary ones).
If 128 is too short for you, use something else!
@GrahamCampbell, I think you are misunderstanding @eidng8. As he points out, he does not want to change the length of the column. He wants to change the length of the index prefix, see https://dev.mysql.com/doc/refman/5.0/en/column-indexes.html.
@GrahamCampbell I also think you are misunderstanding @eidng8 and I have a similar requirement for this.
MySQL (and I'm sure other storage engines) supports limiting the _index_, not the _column_, to a length. For example, for performance purposes you may choose to index only the first 7 characters of a varchar column; you may still want a 256-character limit on the column, but the INDEX needs to be limited.
The MySQL SQL for this could be:

```sql
..... .... COLUMN message VARCHAR(300), ADD UNIQUE INDEX index4 (message(100) ASC);
```
@GrahamCampbell Please take a look at this again; I think you misunderstood the ticket and closed a perfectly valid feature request. If you re-open the ticket, I will implement this myself.
@taylorotwell we can't get an answer from Graham; please take a look at it.
Indeed, I just received a "Specified key was too long; max key length is 767 bytes" error during a migration. I suspect this is because we recently standardized on utf8mb4 going forward. I'm going to talk to DevOps about increasing the max, but it would be nice to be able to specify a key length.
+1 on this. It's a problem.
+1
+1
Especially critical for BLOB/TEXT columns.
+1, Ran into this problem trying to create an index on a TEXT column.
I pinged @taylorotwell on Twitter to see if this could be re-opened - this would be very helpful for more complex data sets where performance is critical.
I recently faced this issue. A good use case: when you want to store an IPv4/IPv6 address in the database, you can use `inet_pton` to convert it to binary and `inet_ntop` to convert it back to a readable format.
For the column type you should use binary, and if you want to index that column, the index length is required.
PS: @taylorotwell it would be nice to add this common case as a native `$casts` type for IP addresses on the `Model`, and to change the implementation of `$table->ipAddress` in migrations to use binary for Laravel 5.5.
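For anyone unfamiliar with those functions, here is a minimal sketch (not from the thread) of the `inet_pton`/`inet_ntop` round trip described above; the sample addresses are illustrative:

```php
<?php
// inet_pton packs a printable IP into its binary form; inet_ntop reverses it.
// IPv4 packs to 4 bytes, IPv6 to 16 bytes, which is why a binary(16) column
// (plus an index length) fits both.
$ips = ['192.168.1.10', '2001:db8::1'];

foreach ($ips as $ip) {
    $packed = inet_pton($ip);        // binary form: 4 bytes (IPv4) or 16 bytes (IPv6)
    $readable = inet_ntop($packed);  // back to the printable form
    printf("%s -> %d bytes -> %s\n", $ip, strlen($packed), $readable);
}
// 192.168.1.10 -> 4 bytes -> 192.168.1.10
// 2001:db8::1 -> 16 bytes -> 2001:db8::1
```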
+1
+1
As a temporary workaround, it is possible to use the following raw syntax for defining indexes: `$table->index([DB::raw('column(100)')])`
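For context, a full migration using this raw-expression workaround might look like the following sketch. The table and column names (`articles`, `body`) are hypothetical, and this relies on MySQL accepting a prefix length inside the raw index expression:

```php
<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;

// Hypothetical migration: 'articles' and 'body' are placeholder names.
class CreateArticlesTable extends Migration
{
    public function up()
    {
        Schema::create('articles', function (Blueprint $table) {
            $table->increments('id');
            $table->text('body'); // no ->index() here; a TEXT index needs a prefix length
        });

        // Index only the first 100 characters of `body` via a raw expression.
        Schema::table('articles', function (Blueprint $table) {
            $table->index([DB::raw('body(100)')], 'articles_body_index');
        });
    }

    public function down()
    {
        Schema::dropIfExists('articles');
    }
}
```

Because the expression is raw, the generated index name is worth specifying explicitly (as above) so `down()` migrations or later drops can reference it.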
That whole migration hassle is so bad, I can't describe how bad :((
This is an issue for one of my projects as well. I have data that is 255 characters long, and changing the default length to 191 as the Laravel manual suggests isn't enough. Setting the column length to 255 is fine, but then I cannot index it without being able to specify an index length of less than 255. I will try @myquote-dev's workaround:
`$table->index([DB::raw('value(100), name')], 'parameters_value_name_index');`
I think this would also be helpful here. It has been causing issues for me, with collations not matching up when migrating on systems with slightly different MySQL configurations.
@myquote-dev Thanks for the suggestion. It works, and it can be used on TEXT/BLOB fields: `$table->index([DB::raw('column(100)')])`
We're reconsidering https://github.com/laravel/framework/pull/25200 as a fix for this in Laravel 7. If anyone's up to it you may send in a PR to the master branch. Support for all DB engines is probably wanted.
Just noticed that this issue is still alive. In case this bothers you and you haven't stumbled upon it yet, there has been an option since early MySQL 5.6:
https://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix
Enabling it enlarges the index prefix limit to 3072 bytes, enough to fit most needs. So if you can change your MySQL server settings, you can go for it.
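For reference, a sketch of the server-side settings involved (this assumes MySQL 5.6/5.7; `innodb_large_prefix` also requires the Barracuda file format and a DYNAMIC or COMPRESSED row format, and the variable was removed in MySQL 8.0, where the 3072-byte limit is the default):

```sql
-- MySQL 5.6/5.7: raise the InnoDB index prefix limit from 767 to 3072 bytes.
SET GLOBAL innodb_large_prefix = ON;
SET GLOBAL innodb_file_format = 'Barracuda';

-- The table must also use a DYNAMIC or COMPRESSED row format:
CREATE TABLE users (
    email VARCHAR(255) CHARACTER SET utf8mb4,
    UNIQUE KEY users_email_unique (email)
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;
```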
@eidng8 the problem is that it can't be set through the schema builder
Yes, of course. That's why I opened this issue. I just want to leave a note in case it could help.
I would like to pick this up, but I'm taking a long trip to Moscow next week, so it will be a bit later before I can start on the task.
Before diving into code, I need some input from you. There are differences among the supported databases:
**MySQL:** You've mostly got the idea. It's the prefix length of the index.
**PostgreSQL:** There is no such thing as a prefix or index length. It uses expression indexes instead. So if we implement this, an index like the following (say, with a prefix length of 10):

```sql
-- MySQL
CREATE INDEX a_index ON a_table (a_column(10));
```

would have to be translated to something like:

```sql
-- PostgreSQL
CREATE INDEX ON a_table ((substring(a_column from 1 for 10)));
```

However, people with a proper PostgreSQL background may specifically write this themselves instead of letting the framework decide. So I'm not sure if this is the right way for us.
**SQLite:** SQLite doesn't seem to have a limit specifically on indexes. There isn't even a clause or expression for this kind of thing, so the parameter should be ignored by `SQLiteGrammar`.
**SQL Server:** This is the worst. It has a 900- or 1700-byte limit on key length, depending on the SQL Server version and whether the index is clustered. However, there is nothing in T-SQL to work around this limitation. When an index on a wide column is required, many people add an extra column holding a hash (e.g. SHA) of the wide column and index that instead. It doesn't seem proper for a framework to make that kind of decision, so I tend to prefer ignoring this parameter in `SqlServerGrammar` as well.
Important: I'm not a database professional. All of the above is just what I know, and it may not all be true. But I do need your decisions on how to handle the differences among these databases.
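As an aside, on PostgreSQL the expression-index approach discussed above can already be created today with a raw statement; a hedged sketch (the table, column, and index names are illustrative):

```php
<?php

use Illuminate\Support\Facades\DB;

// Hypothetical: create a PostgreSQL expression index over the first
// 10 characters of a_column, roughly equivalent to MySQL's a_column(10) prefix.
DB::statement(
    'CREATE INDEX a_table_a_column_prefix_index'
    . ' ON a_table ((substring(a_column from 1 for 10)))'
);
```

Note that PostgreSQL will only use such an index for queries whose WHERE clause repeats the same `substring(...)` expression, which is part of why the framework translating a prefix length silently could surprise users.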
@staudenmeir is it true that index length is only supported on MySQL? In that case I guess your PR is good to go? Maybe we could override the methods on the other grammars to throw a descriptive exception when people attempt to set the length on those engines?
@driesvints Only the MySQL implementation can easily be used in queries.
PostgreSQL and SQLite also support prefix indexes, but they can only be used in a way that requires raw expressions in Laravel:
https://stackoverflow.com/a/21824039/4848587
https://sqlite.org/expridx.html
Of course, we could support the index creation regardless of how complex it is to use them in queries.
@staudenmeir thanks! Let's get this only implemented for MySQL then and document it clearly that it's only available there.
Should I do anything on the PR then, or is it good enough?
@eidng8 it looks good
Since this is a feature request, I'm closing this. Anyone is free to attempt @staudenmeir's PR again.
I wanted to add an index on my TEXT column and ran into this issue.
I was able to work around it by removing `->index()` from `$table->text('some_column')` and then adding `$table->index([DB::raw('some_column(750)')])`.
I experimented with the number inside the parentheses; my understanding is that you can only index the first n characters of a TEXT field. If you go much higher than 750 you run into this error:
`SQLSTATE[42000]: Syntax error or access violation: 1071 Specified key was too long; max key length is 3072 bytes`
The most likely reason for this is that `some_column` uses the utf8mb4 character set, which is 4 bytes per character. Therefore the maximum number of characters you can use for your index is 3072 bytes / 4 bytes = 768 characters; utf8mb4 index prefixes over 768 characters exceed the 3072-byte limit.
@myquote-dev Hrm, interesting to know; thanks for explaining.
For a few years now I have been defaulting to utf8mb4 everywhere, even though in 90% of my projects I only deal with English. Do you think I should rethink that strategy and use regular utf8 instead? In this case that would allow me to have a larger indexable text/varchar field.
750 should be ample for an index. What you're after is a low cardinality in the indexes that also doesn't bloat them (my understanding is the optimiser built into MySQL will first look to your indexes, but if those indexes are just as big as your data it won't help quite so much).
If 750 characters isn't enough length to identify a single record in your table or provide ordering (I expect it is, but perhaps not!) then there are other issues with the architecture.
I think you may be stressing over micro-optimisations that will probably make little difference to performance - it's easy enough to test, but I'd expect a length of 750 or 1500 to perform identically. (And for that matter, 250 or less would probably be plenty.)
@williamjulianvicary Thanks for your input. Well, perhaps you can give me quick advice on my specific problem. I am dealing with a database table containing 7 million chess games. The column I want to index, in order to speed up queries, contains the chess line. Here are a few examples of those values:

```
d4d5c4c6Nc3Nf6e3e6Nf3Nbd7Bd3dxc4Bxc4b5Be2Bb7O-OBe7e4
d4e6c4Nf6Nf3Bb4+Nbd2b6a3Bxd2+Bxd2Bb7Bg5d6e3Nbd7Be2O-OO-Oh6Bh4Qe7b4c5Qb3Rab8Rfd1g5Bg3Ne4Nd2Nxg3hxg3d5cxd5Bxd5Bc4Bxc4Qxc4b5Qc3c4Ne4Rfd8a4f5Nc5Nxc5dxc5Rxd1+Rxd1Rd8Rd4e5Rd2a6g4fxg4axb5axb5Rxd8+Qxd8Qxe5g3fxg3
d4Nf6Nf3e6c4b6a3Bb7Nc3d5Bg5Be7Qa4+c6Bxf6Bxf6cxd5exd5g3O-OBg2c5Rd1c4O-Oa6Ne5b5Qc2Qd6e4Bxe5dxe5Qxe5f4Qd6Nxd5Nd7Qf2Rad8Rd2Bxd5Rxd5Qc7Rfd1Nb6R5d4Na4e5Rxd4Qxd4c3bxc3Nxc3Re1Rd8Qe3Nd1Qe2Qc5+Kf1Nc3Qe3Qc4+Kg1a5f5b4e6fxe6fxe6Re8e7Qc8axb4axb4Qd4b3Qb4Kh8Rf1h6Be4Ne2+Kg2g5Qd6
```
I perform queries such as "show me all lines that begin with `d4d5c4c6Nc3Nf6`", e.g.

```sql
SELECT * FROM games WHERE line LIKE "d4d5c4c6Nc3Nf6%"
```
I know I could throw this at an Elasticsearch instance somewhere, but right now I'm working on my local dev machine, so CPU/RAM is limited, and I just wanted to see if adding an index would speed things up. It looks like it did a little, but it's still quite slow.
You could try a shorter index on that field (say 255 characters, so you sit within the index length limits of utf8, or shorter) - that may help ensure your index can actually be used.
If you're requesting hundreds of thousands of records with that query, though, it's not going to be quick regardless.
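A sketch of that suggestion, using the table and column names from the question (the 255-character prefix is illustrative):

```sql
-- Index only the first 255 characters of `line`. MySQL can still use a
-- prefix index for left-anchored LIKE queries such as
--   WHERE line LIKE 'd4d5c4c6Nc3Nf6%'
-- because the pattern's fixed prefix is much shorter than 255 characters.
ALTER TABLE games ADD INDEX games_line_prefix_index (line(255));
```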