Diesel deduces a column's type as BigInt if it's declared as BIGINT in the database. This is generally fine in SQLite because INTEGER and BIGINT are synonyms, but INTEGER PRIMARY KEY is treated differently and is always 64-bit according to https://sqlite.org/autoinc.html. Moreover, BIGINT PRIMARY KEY AUTOINCREMENT is not accepted.
It might be a good idea to detect these columns and use BigInt for them. Note that the SQLite documentation says that INTEGER PRIMARY KEY is special only for tables that aren't declared with WITHOUT ROWID, but BIGINT PRIMARY KEY is still not allowed even when using that.
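For reference, the restriction is easy to reproduce. A minimal sketch using Python's stdlib sqlite3 purely to exercise the SQLite engine (table and column names here are arbitrary):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# INTEGER PRIMARY KEY AUTOINCREMENT is accepted by SQLite...
con.execute("CREATE TABLE ok (id INTEGER PRIMARY KEY AUTOINCREMENT)")

# ...but the otherwise-synonymous BIGINT spelling is rejected outright.
try:
    con.execute("CREATE TABLE bad (id BIGINT PRIMARY KEY AUTOINCREMENT)")
except sqlite3.OperationalError as err:
    print(err)  # AUTOINCREMENT is only allowed on an INTEGER PRIMARY KEY
```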
Is this really an issue? I just tried and I was able to insert 2^256 into an integer field. SQLite does not really care about types. It would be confusing for users to deserialize what they think is an integer into an i64, though, as I'm sure nobody knows about this bug. If you need to retrieve a BigInt from SQLite, you can use the table! macro and say it's a BigInt.
It's not an issue for SQLite, but diesel deduces the type as i32 IIRC.
If you need to retrieve a BigInt from SQLite, you can use the table! macro and say it's a BigInt.
Yeah, that works.
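To illustrate the two points above — SQLite barely enforces declared types, and an INTEGER PRIMARY KEY really does hold 64-bit values — a small sketch using Python's stdlib sqlite3 (the specific values are arbitrary):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, n INTEGER)")

# The INTEGER PRIMARY KEY happily stores a value far beyond i32::MAX...
con.execute("INSERT INTO t (id, n) VALUES (?, ?)", (2**40, 1))

# ...and SQLite's type affinity means even TEXT lands in an INTEGER column.
con.execute("INSERT INTO t (id, n) VALUES (?, ?)", (2, "not a number"))

print(con.execute("SELECT id, n FROM t ORDER BY id").fetchall())
# [(2, 'not a number'), (1099511627776, 1)]
```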
I'm a bit torn on what to do here. It's really unfortunate that SQLite requires the type name to be exactly INTEGER in this case. As @Eijebong mentioned, our inference on SQLite is really more just trying to deduce intent, as types are just suggestions more than anything in SQLite. Ultimately I'd prefer not to special case this, as the use cases for SQLite trend towards not needing 64 bit PKs, and you can override manually in the cases where it matters. Ultimately though I think keeping our inference rules simple and consistent here takes priority.
Is there a reason i32 was picked instead of i64? I can store an i32 in an i64 but not the other way around, so using i64 seems like the obviously better choice.
Are there any docs I can read on how to work around this? I'm currently stuck with a number that I need to store but can't.
I think you can use diesel print-schema and edit the resulting file to use your own types.
@kpcyrd To quote Sean from the reply directly above yours:
Ultimately I'd prefer not to special case this, as the use cases for SQLite trend towards not needing 64 bit PKs, and you can override manually in the cases where it matters. Ultimately though I think keeping our inference rules simple and consistent here takes priority.
My column is not a primary key, should I raise a separate issue for this?
For anybody with the same issue, I'm currently editing my src/schema.rs manually like this:
- balance -> Nullable<Integer>,
+ balance -> Nullable<BigInt>,
Since this file is auto-generated, it is overwritten every time diesel migration run or diesel migration redo is used.
Since this file is auto-generated, it is overwritten every time diesel migration run or diesel migration redo is used.
I didn't realize that. How does diesel migration run know which file is the schema file? Are you sure you don't have diesel print-schema configured somewhere?
% cat diesel.toml
# For documentation on how to configure this file,
# see diesel.rs/guides/configuring-diesel-cli
[print_schema]
file = "src/schema.rs"
I think it's correct that the file is rewritten to pick up possible schema changes, the problem is that I need to edit this file afterwards. Allowing me to overwrite the type for certain fields with a config file might work as a solution.
Ah, okay. My solution was to not run print-schema and maintain it manually.
@kpcyrd If your column is not a primary key, you can just use BigInt as the column type on the SQL side; then diesel print-schema will pick it up automatically.
About changing the auto-generated schema: This guide covers options to modify the schema after generation.
@weiznich I have a primary key and want it to be i64. Your proposed guide mentions the patch_file field. This works, but only until I change the database schema. Then the patch file does not match with the newly generated schema.rs anymore and the following error occurs:
Failed to apply schema patch. stdout: patching file src/schema.rs
So after changing the database schema I would still have to manually change the schema.rs and additionally rebuild the schema.patch file.
Ok, now I understand. First, deactivate the auto-generation of the schema.rs in diesel-toml:
[print_schema]
# file = "src/schema.rs"
Then use the diesel print-schema command and edit the schema.rs manually.
I can live with that, but I still do not understand how patch_file should generally work after a schema change.
@tobx patch_file just applies the given file via patch to the generated schema file, so there is no magic involved there. This means applying the patch is only possible as long as your schema didn't change too much: the context around the patched lines must not have changed. Whether that holds depends entirely on your changes to the database schema, so it may or may not work.
@weiznich patch_file is not working for me in Git shell / msys:
$ diesel print-schema
Failed to apply schema patch. stdout: patching file 'C:\Users\me\AppData\Local\Temp\.tmp0VJQNn'
Hunk #1 FAILED at 43.
Hunk #2 FAILED at 67.
2 out of 2 hunks FAILED -- saving rejects to file 'C:\Users\me\AppData\Local\Temp\.tmp0VJQNn.rej'
stderr:
How can I make it work on Windows?
I followed the guide and did git diff -U6 > src/schema.patch and I have
[print_schema]
file = "src/schema.rs"
patch_file = "src/schema.patch"
I want to propose reopening this. While it is true that most applications using sqlite won't ever have 2 billion rows in a table at a time, sqlite's autoincrement algorithm can and will generate integers above the i32 limit. This means that if you end up with around 2^31 (about 2.1 billion) inserts to a table, you'll go over the i32 limit and something will happen. It would be nice to at least clarify what that something is. This isn't rows in the table, just inserts.
You can hit this in a month with a mere 1000 inserts per second, which renders schema inference unsuitable for any application at that volume. That seems like a pretty low limit: you could hit it just by using sqlite to store metrics data or logs with a retention policy, in a pretty reasonable time frame. For more modest sqlite volumes, say 100 or 200 inserts a second, you can hit it in a year or so. Your database needn't even grow past a few megabytes for that to happen.
I understand why you wouldn't want to special-case the schema inference, but it's already bad enough to maintain two places on every schema change, and for production applications you appear to have no choice but to do so if you want to avoid the above problem. If there's also an underlying ticking time bomb that's going to fail silently, I strongly suggest that fixing it would be worth the special case. This needs to at least fail loudly; otherwise you get spurious failures that you'd have to spend an afternoon tracking down.
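The failure mode is easy to demonstrate: AUTOINCREMENT always continues from the largest rowid that has ever existed in the table, so once a table crosses i32::MAX, every new id is out of i32 range. A sketch with Python's stdlib sqlite3, seeding the rowid to stand in for ~2^31 prior inserts:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, msg TEXT)")

# Simulate a table that has already seen ~2^31 inserts by seeding the rowid;
# AUTOINCREMENT then continues from the largest id that has ever existed.
con.execute(
    "INSERT INTO events (id, msg) VALUES (?, ?)",
    (2**31 - 1, "last id that fits in an i32"),
)
con.execute("INSERT INTO events (msg) VALUES (?)", ("first id that does not",))

ids = [row[0] for row in con.execute("SELECT id FROM events ORDER BY id")]
print(ids)  # [2147483647, 2147483648]
```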
@camlorn As it is simple to change the corresponding field type in the autogenerated schema later, I do not see any reason to change the behavior here. Additionally, the CLI tool provides multiple ways to automate such changes for future schema changes, so I really do not see why we should break a lot of already-existing code to fix a "special" case.
Nb. I'm the one who filed this issue.
I've come to think that this is actually a bug in diesel: if the SQLite docs say that INTEGER PRIMARY KEY is a 64-bit integer, then BigInt is what diesel print-schema should output, and Integer (i32) is incorrect. It may seem like a special case, but it's a special case in the SQLite specification, not in diesel.
As for the use cases for SQLite and whether they need 64-bit surrogate keys or not, I think that's up to the application developers to decide, and not the ORM's choice. Consider that SQLite is actually used in embedded systems.
On the other hand, manually updating the generated schema on changes will usually take less time than installing the diesel CLI from scratch. If the CLI doesn't (or can't) produce the right types, it's worth being explicit. You still have to update the database and the model, so one more change for the schema isn't much of a hassle. So don't blindly trust the ORM-generated code; use it as a starting point and then maintain it like you maintain your models.
I really do not see why we should break a lot of already existing code to fix a "special" case.
It could be done for a 2.0 release, whenever that happens.
I really do not see why we should break a lot of already existing code to fix a "special" case.
As @camlorn said, a silent failure in a production system would be a good reason to fix this case. I think not everyone should be forced to understand such a detail of the ORM just to be sure their production system will not randomly break after a month or a year.
To avoid breaking code, maybe for now the system could just print a big warning for this case?
First of all, the type mapping for sqlite has been whatever we as diesel maintainers have defined ourselves as reasonable, ever since diesel gained sqlite support. That's because in the end types do not really matter for sqlite. From a theoretical point of view we as developers cannot assume that a column defined as Integer in sqlite will actually contain numbers, as sqlite happily accepts any value there. So following the reasoning behind this issue that the generated schema.rs should be correct, the only fully "correct" solution here would be to not generate any schema at all, as we cannot do so without assuming a lot of things.
So let's assume we only want to "fix" this specific issue here:
We currently generate column types based on a mapping of the type names used by postgres/mysql. See here for the exact mapping code. If you look a bit more closely at the surrounding code, you will see that we get the column type names via PRAGMA TABLE_INFO('table_name') from sqlite.
So let's just do a small experiment: Let's create the following two tables and have a look at the output of PRAGMA TABLE_INFO:
create table test(id integer primary key autoincrement);
create table test2(id integer primary key);
Now let's have a look at the corresponding output:
sqlite> pragma table_info('test');
cid|name|type|notnull|dflt_value|pk
0|id|integer|0||1
sqlite> pragma table_info('test2');
cid|name|type|notnull|dflt_value|pk
0|id|integer|0||1
As you can see, both return the same table info. That means we have no chance to even infer whether a primary key is autoincrement or not. The only place where this information is available is the corresponding CREATE TABLE statement contained in the sqlite_master table. Getting the information from there would require parsing those statements. At least I do not plan to implement a SQL parser any time soon, as there are quite a lot of things that are more important to me.
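The experiment above can be reproduced programmatically as well (here via Python's stdlib sqlite3 rather than the sqlite3 shell):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table test(id integer primary key autoincrement)")
con.execute("create table test2(id integer primary key)")

info_test = con.execute("pragma table_info('test')").fetchall()
info_test2 = con.execute("pragma table_info('test2')").fetchall()

# Both tables report identical (cid, name, type, notnull, dflt_value, pk)
# rows, so table_info alone cannot reveal AUTOINCREMENT.
print(info_test == info_test2)  # True
```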
That's a fair reason. Pre-emptively (since others will probably point them out anyway), you can use some heuristics:
- look in sqlite_sequence for a sequence with the same name as the table
- look for primary key autoincrement in the table's sqlite_master entry

Neither of them is perfect (e.g. the first one breaks when you rename the table or if it's never had any data), but they would probably cover most users. And it only reinforces my point about "declare your column types instead of letting the ORM infer them".
But I agree that both of the solutions above are icky and you're not unreasonable if you don't want to implement something like that.
Just for the sake of completeness:
- look in sqlite_sequence for a sequence with the same name as the table
As already noted, that works only for tables that already have entries. I would say this assumption does not hold for a significant number of tables that are created via diesel migration run, with schema.rs generated directly afterwards. Therefore this approach is not acceptable, as it would break a relatively common use case.
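The gap is easy to see: the sqlite_sequence row for a table only appears after the first insert. A sketch with Python's stdlib sqlite3 (table name t is arbitrary):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Creating any AUTOINCREMENT table also creates the sqlite_sequence table...
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT)")

def has_sequence_row(table):
    # ...but a row for a given table only shows up after its first insert.
    count = con.execute(
        "SELECT count(*) FROM sqlite_sequence WHERE name = ?", (table,)
    ).fetchone()[0]
    return count > 0

print(has_sequence_row("t"))  # False: freshly migrated, no data yet
con.execute("INSERT INTO t DEFAULT VALUES")
print(has_sequence_row("t"))  # True
```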
- look for primary key autoincrement in the table's sqlite_master entry
As written above: that requires parsing SQL, because otherwise this will fail as soon as someone has a table like the following:
create table test3(text default 'primary key autoincrement');
As already stated: at the very least, I do not want to write or maintain a SQL parser for sqlite's create table statements.
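For completeness, the false positive is trivial to trigger. A sketch with Python's stdlib sqlite3, using a naive substring check over sqlite_master (the column name t is just for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table test(id integer primary key autoincrement)")
con.execute("create table test3(t text default 'primary key autoincrement')")

def naive_is_autoincrement(table):
    # A substring search over the stored CREATE TABLE statement --
    # exactly the shortcut that breaks without a real SQL parser.
    sql = con.execute(
        "SELECT sql FROM sqlite_master WHERE name = ?", (table,)
    ).fetchone()[0]
    return "primary key autoincrement" in sql.lower()

print(naive_is_autoincrement("test"))   # True, correct
print(naive_is_autoincrement("test3"))  # True, false positive from the default value
```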
And it only reinforces my point about "declare your column types instead of letting the ORM infer them".
That's definitely a valid strategy. To reiterate: Diesel explicitly supports this use case; it's possible at any point to write your own table! macro calls or edit those generated by diesel-cli.
For anyone else who finds this later and who wants to use schema inference, the solution appears to be as follows:
Create a sqlite_mapping.rs containing the following:
use diesel::sql_types::*;
type Integer = BigInt;
Then, in your print-schema, add --import-types mycrate::sqlite_mapping::*. I'm typing this off memory, sorry if there's typos.
If it's obvious, I missed where it's documented. There's a couple offhand mentions of --import-types in various places but that's all I could find.
I now understand that it breaks code to change this (and why), but I'm not too enthused about the silent breakage. I almost missed this point myself. Your autogenerated schema will work fine until it doesn't, and "until it doesn't" is far enough out that it'd be in prod by that point, in all likelihood. At the moment, the only warning you get is that i64 doesn't work. Maybe this should be called out somewhere in the documentation?
Though beside the point for this issue, "maintain everything yourself" doesn't make Diesel competitive with options in other languages. Checking is a good idea, but going down that road gives manually written SQL migrations, manually written table! macros, and a set of structs whose fields have to be kept in sync with both of the preceding. I understand why these should be decoupled components, but it feels like taking the stance of doing all of this yourself makes it easier only on the developers of Diesel.
For anyone else who finds this later and who wants to use schema inference, the solution appears to be as follows:
Create a sqlite_mapping.rs containing the following:
To call that out explicitly at least once: this will change the mapping for all Integer columns in your database to i64, not only the PRIMARY KEY AUTOINCREMENT ones. Depending on your use case this may or may not be fine.
Then, in your print-schema, add --import-types mycrate::sqlite_mapping::*. I'm typing this off memory, sorry if there's typos.
If it's obvious, I missed where it's documented. There's a couple offhand mentions of --import-types in various places but that's all I could find.
As for where options like this to manipulate the generated schema are documented: have a look at this guide on the official web page. Otherwise I'm fine with getting a PR extending that documentation to make this clearer.
Though beside the point for this issue, "maintain everything yourself" doesn't make Diesel competitive with options in other languages. Checking is a good idea, but going down that road gives manually written SQL migrations, manually written table! macros, and a set of structs whose fields have to be kept in sync with both of the preceding. I understand why these should be decoupled components, but it feels like taking the stance of doing all of this yourself makes it easier only on the developers of Diesel.
I mean, nobody forces you to use diesel. If you don't like it, just use something different or write your own ORM 🤷. I do not consider it friendly to call out design decisions because they "make it easier only on the developers of Diesel". We provide this crate in our free time, so maybe recheck your wording.
That said, the guide I've mentioned above lists two possibilities for how to incorporate that with automatic schema generation.
I'm not saying that I have a problem with the diesel developers, only that the justification here seems weak and the usability much less than ideal. Not having the resources and developing something for free are indeed good reasons to not do something, but the justification here seems to be that manually maintaining these is fine, and if it's a resource issue then in general I'd have expected someone to say "that's a resource issue". Perhaps they did and I missed it.
Not having one authoritative location for configuration in your code is a good way to introduce bugs. But if there's wider discussion to be had, that's probably out of scope for this issue in any case, and regardless, I found (and documented) a workaround which at least temporarily reduces the pain points for people who need to use SQLite. Maybe I will revisit this topic in general somewhere else in a month or two, after we've finished adopting the Rust stack, when I may be able to propose concrete action.