prisma 🚀 - Generate TEXT fields from String

MacOS version: 10.13.6
Prisma Version: [email protected], binary version: c1f6f69f4539f84181254759805d89eb6893d76d
MySQL version: 5.7.14

nlarusstone on 3 Sep 2019

This should be a pretty trivial case. @matthewmueller can you help out here?

schickling on 4 Sep 2019

Just FYI on this one, all of the docs (Data Modelling, MySQL Data Modelling) say that String should be mapped to TEXT.

I guess maybe VARCHAR allows the column to be used as an index?

This one is a big issue for us.

robmurtagh on 4 Sep 2019

@mavilein I think this is your area of expertise ☺️

Looks like we need to adjust this line: https://github.com/prisma/prisma-engine/blob/bbe7af6959179dfc5feb98354d2d3d1096ccd9e2/migration-engine/connectors/sql-migration-connector/src/sql_renderer/mysql_renderer.rs#L43

matthewmueller on 4 Sep 2019

The comment in the code explains why we are using this type right now:

// we use varchar right now as mediumtext doesn't allow default values
// a bigger length would not allow to use such a column as primary key

@matthewmueller : Let's discuss in person how we should handle this next week.

mavilein on 6 Sep 2019

Talked with @matthewmueller. He will write an update how we could tackle this.

mavilein on 9 Sep 2019

Thanks @mavilein. Quick brain dump in why this is not as easy as changing one line of code. Given the following example:

model User {
  id String @id
  email String @unique
  bio String
}

If we do TEXT, then id and email won't work because you cannot have a unique index on TEXT in MySQL.

If we do VARCHAR(191), we can't have a bio that's longer than 191 characters. If you're curious why 191, this blog post is great.

The naive approach would be that whenever we see @unique or @id, we change the type. This may lead to migrations issues when adding these attributes, leading to underlying column type unexpectedly changing.

We're going to keep thinking about this but if you have any ideas, let us know!

matthewmueller on 9 Sep 2019

Thanks for the responses @mavilein and @matthewmueller!

Maybe a better solution doesn't involve handling this implicitly, but rather requires the user to explicitly specify that they want to use a TEXT type. The syntax would therefore be something like:

model User {
  id   Int    @id
  name String @ms.text
}

You could prevent any annotations of @unique or @id from being applied to fields with that annotation.

Seems related to https://github.com/prisma/prisma2/issues/335 as well

For us, this is a very basic/important usecase, so any help you can provide would be greatly appreciated!

nlarusstone on 10 Sep 2019

👍6

@matthewmueller an alternative could be that we acknowledge this as a very common use case and introduce both a String and Text type as Prisma scalar types?

schickling on 11 Sep 2019

@schickling that's how I'm leaning as well, since the alternatives seem sketchy. But I am worried about a datatype explosion, making it harder for generators to do their job. We might just need to acknowledge this and deal with it.

Actually, perhaps generators could have a different mapping (more restrictive) than lift.

matthewmueller on 11 Sep 2019

👍2

My proposal here is that given the following SQL:

create extension if not exists citext;

create table if not exists blogs (
  id serial primary key not null,
  website text not null unique,
  created_at timestamp not null default now()
);

create table if not exists users (
  id serial primary key not null,
  email citext not null unique,
  location point not null,
  first_name text not null,
  last_name text not null,
  created_at timestamp not null default now()
);

create table if not exists posts (
  id serial primary key not null,
  blog_id int references blogs (id) on delete cascade on update cascade,
  author_id int references users (id) on delete cascade on update cascade,
  title text not null,
  created_at timestamp not null default now()
);

create table if not exists comments (
  id serial primary key not null,
  post_id int references posts (id) on delete cascade on update cascade,
  comment text not null,
  created_at timestamp not null default now()
);

We generate a schema that looks like this:

generator photon {
  provider = "photonjs"
}

datasource pg {
  provider = "postgresql"
  url      = "postgresql://m@localhost:5432/prisma-blog?schema=public"
}

model blogs {
  id         pg.serial      @id
  created_at pg.timestamp
  posts      posts[]
  website    pg.text   @unique
}

model comments {
  id         pg.serial      @id
  comment    pg.text
  created_at pg.timestamp
  post_id    posts?
}

model migrate {
  version pg.serial @id
}

model posts {
  id         pg.serial        @id
  author_id  users?
  blog_id    blogs?
  comments   comments[]
  created_at pg.timestamp
  title      pg.text
}

model users {
  id         pg.serial      @id
  created_at pg.timestamp
  email      pg.citext
  location   pg.point
  first_name pg.text
  last_name  pg.text
  posts      posts[]
}

Connectors will still know how to map to pg.text to String during photon generation.

cc/ @schickling @sorenbs

matthewmueller on 19 Sep 2019

@matthewmueller : I don't think this is the way to go here. I see 2 major problems with this approach:

It becomes very hard to port a Prisma schema from one datasource to the other. Imagine finding a tutorial online for Postgres and you want to use MySQL. Now you can have to remap all those fields.
This approach has the implicit assumption that one field of a model always interacts with exactly one datasource. This is a very limiting in a scenario with multiple databases:
- Imagine a @cache(mongo) annotation on a model that instructs photon to maintain a read only cache in a MongoDB. Now the MongoDB connector needs to understand all custom types.
- We plan to implement a Change Data Capture System for the prisma framework. This system would suddenly need to be able to understand all custom types as well.

mavilein on 20 Sep 2019

For 1. in our capability map, we'd have in the Postgres connector pg.text => String and in the Mongo connector mongo.string => String. It should be possible to automatically remap with about as good of accuracy as switching the underlying datasource and hoping that String matches String.

For 2. Yes, multiple datasources in a single model would be possible in this world:

model users {
  id         pg.serial      @id
  created_at pg.timestamp
  email      pg.citext
  location   pg.point
  first_name mongo.string
  last_name  mongo.string
  posts      posts[]
}

I think the @@cache(mongo) would be too, we know all the type mappings.

matthewmueller on 20 Sep 2019

We discussed a solution for this. On MySQL we currently map the String core data type to varchar(191). We do this because there's a restriction on the maximum size of a column so it can be part of a (unique) index. Details on this limitation can be found here.
There's a different solution though. We will start to map the String type to Text as specified in the Prisma Schema Language spec. To work around the limitation we will instead configure the index to just consider the first 191 characters of a text column. This could lead to false duplicates when inserting two records with the same first 191 characters but we think this is an acceptable trade off. You are exposed to this limitation also when using MySQL directly.

_Hint_: The number 191 stems from the fact that index key prefix length in MySQL is 767 bytes. We are using utf8_mb4 encoding on MySQL which uses 4 bytes per character. 191 * 4 = 764

mavilein on 26 Sep 2019

👍1

I just started working on this. I immediately hit the default values issue. Note: text columns can have a default value starting with MySQL 8.0 - I haven't looked very deep into this yet but it looks like there is no straightforward way to do this with older versions of MySQL, and prisma 1 handles this by doing the default value generation itself instead of leaving it to the database (?).

Reference: https://dev.mysql.com/doc/refman/8.0/en/data-type-defaults.html

tomhoule on 27 Sep 2019

Hi @mavilein,

Thanks for the update -- the reasoning makes sense and that solution would work for us!

nlarusstone on 29 Sep 2019

Dear @mavilein ,

wouldn't it be a good approach to add MORE predefined datatypes? i.e., not only have String but also Text, SmallText, MediumText, LargeText and even JSON and so on? (see this issue here: https://github.com/prisma/prisma2/issues/446 )

I would suggest to have different Grammars (as in Laravels ORM) that map the Prisma Datatypes to their appropriate Database Datatypes. For example, the SQLite Grammar may map the JSON datatype to a Text, whereas the Postgres Grammar map the JSON datatype to JSON.

I added some examples (including code-references) to the linked issue. Maybe this can work out here as well!?

In addition, we could rely on the datasource.provider (i.e., postgresql) to load respective grammar. What do you think?

All the best

johannesschobel on 30 Sep 2019

👍2

I think that sounds like a very good idea @johannesschobel, as there will be similar issues for the mapping from almost any Lift data type to its Persistent (DB) data type.

Like you say, JSON support e.g. for MySQL would be amazing (we're really missing it), and seems to fall into a very similar category of feature.

robmurtagh on 30 Sep 2019

@johannesschobel @robmurtagh : We are currently actively working on specing out Prismas core datatypes and their mappings to the underlying database. Your suggestions are among the ones we have on the table right now 👍 .

mavilein on 30 Sep 2019

🎉1

That is good to hear! If you have a dummy grammar available to peek, I can surely work on the PostgreSQL grammar

johannesschobel on 30 Sep 2019

One thing I'm curious/worried about is the performance implications of this change. I've had bad experiences with slow TEXT fields in the past, so I think it's unlikely that I would want all of my String fields to be TEXT (I'd like to use VARCHAR where possible).

This seems like it would be addressed by @johannesschobel 's suggestion, however.

nlarusstone on 30 Sep 2019

Dear @nlarusstone ,
yeah, that would be certainly "solved" as you (as the developer) could choose the most appropriate datatype (i.e., Text or String).

I am certainly aware, that the Grammar approach proposed by me is probably the most complex and "costly" one to implement - however, i think, it is also the most versatile and easiest to go on with / built up onto.

johannesschobel on 1 Oct 2019

@nlarusstone : Can you elaborate on the some cases where TEXT fields were slow for you? 🙏

mavilein on 1 Oct 2019

Dear @mavilein , as far as i know, VARCHAR are typically stored "in-row", whereas TEXT (or other formats like JSON, BLOB, ...) are stored in an "external storage" (table, file on system, ...). This means, that a "lookup" for such TEXT values are more costly...

I don't know, if this is still true, however, "older" database engines used to handle it this way..

johannesschobel on 1 Oct 2019

@johannesschobel I'm also in favor of your proposal. Can you provide link(s) to where we can learn more and see examples of how Laraval does this? ☺️

~I'm just seeing: https://github.com/laravel/framework/blob/4.2/src/Illuminate/Database/Grammar.php~

Update: Nevermind, I found your links in the other issue. Thanks!

matthewmueller on 2 Oct 2019

sorry.. i only linked it in the other issue.. my bad :cry:
But glad you found it!

johannesschobel on 2 Oct 2019

There is a hard limitation regarding mapping all String fields to mediumtext: foreign keys can't reference text fields. Whether we want to map one prisma core type to two SQL types depending on external factors (whether the column is part of the primary key, in this case) is a more high level question. (see https://github.com/prisma/specs/issues/177).

Relevant MySQL docs page\
Index prefixes on foreign key columns are not supported. One consequence of this is that BLOB and TEXT columns cannot be included in a foreign key because indexes on those columns must always include a prefix length.

tomhoule on 2 Oct 2019

🎉1

I really think, that a String = Varchar type is a crucial requirement. Often, i have the need for "short" text fields (i.e., to store a title of a person or the firstname itself is also rarely longer than 190 chars).

Further, i would map Text to Text in the database.

johannesschobel on 4 Oct 2019

There is spec work being done about Prisma's core datatypes that will influence what solution we implement to this problem. Until https://github.com/prisma/specs/issues/177 is resolved, work on this issue is blocked.

tomhoule on 7 Oct 2019

Internal note: dropping from next sprint, blocked on product.

divyenduz on 10 Oct 2019

😕3

Just adding a few thoughts to this thread in case they are helpful in deciding on this issue or the spec definition.

We are hopeful that whatever is decided will help us handle the following situation on our MySQL db:

We've used prisma2 schema.prisma to define our schema
But we absolutely need some columns to be TEXT/LONGTEXT
(We also would really really like some of our columns to be JSON, but we have modelled them as TEXT for now just to avoid that complexity)
So after prisma2 lift up migration, we're manually running ALTER TABLE ... MODIFY column statements

I guess the above scenario is pretty typical for end users right now.

The problem is at the moment, because of our need for TEXT, once we've run the manual ALTER TABLE ... migration, I guess we're no longer able to take advantage of the Lift migration framework, as it's declarative, and will try and change things back (probably by dropping columns and data) if we were to run it again?

Hope that this kind of usecase information might inform your choice on datatype spec...

As always, thanks!

robmurtagh on 15 Oct 2019

Dear @robmurtagh ,
yeah, if you manually run another sql statement to alter your database (i.e., change the column type), Lift is completely useless for you at this point in time. You will not be able to properly work with it again - which is a shame.

I really hope that there will be support for additional datatypes, as this may be a real show-stopper for many scenarios.

johannesschobel on 15 Oct 2019

👍1

@robmurtagh @johannesschobel : The support of custom datatypes is very high on the list of things we are figuring out at the moment. In addition to that we will also support hooks which will allow you to run custom SQL scripts that we will respect and not revert their changes.

mavilein on 16 Oct 2019

👍1

Dear @mavilein ,
thank you very much for your support with this very complex and demanding issue / feature request! I really like, that you (and the prisma team) put so much effort into this feature request.

that will allow you to run custom SQL scripts that we will respect and not revert their changes.

What do you mean with this? Is this something like @robmurtagh proposed with his _custom SQL Script_ that runs ALTER TABLE ... MODIFY COLUMN ... in order to change the type of a respective column?

Thank you very much in advance!
All the best

johannesschobel on 16 Oct 2019

👍2

What do you mean with this? Is this something like @robmurtagh proposed with his custom SQL Script that runs ALTER TABLE ... MODIFY COLUMN ... in order to change the type of a respective column?

Yes this is what i meant.

mavilein on 16 Oct 2019

🎉2

That would be a very nice feature, I guess :)

johannesschobel on 16 Oct 2019

@mavilein Just wondering do you have any ideas when this might be available in preview? Is it something that you're currently working on?

Thanks so much

robmurtagh on 21 Oct 2019

🎉2

@robmurtagh : This is currently the focus of our spec work. When the spec is done we will start with the implementation right away. So very active but still a bit out.

mavilein on 28 Oct 2019

Hello i'm pulling some data from an API and one of the fields is a JSON which is quite big and goes over the limit of the String type. Any idea when we will get a BLOB or LONGTEXT type?

crimson-med on 29 Oct 2019

What is the current safe workaround if we want a TEXT field for some column? Can we do something on the steps.json file of the generated migration?

soullivaneuh on 1 Nov 2019

👀2

@soullivaneuh : There's no workaround right now. But we are actively working on bringing support for any type a given datasource supports.

mavilein on 8 Nov 2019

I know you guys are working on a solution, but I just want to address a few concerns I have about the current situation and the proposed solutions above:

So if I understand correctly, at the moment there is no way of storing anything longer than 191 characters in prisma? To be honest, for my personal use this is a huge dealbreaker. I just bumped into the same issue for a project I'm working on.

Having Text, LongText, Blob types in the schema is a potential fix but in that case prisma would be changing its schema definition types towards one specific datasource (mysql types). Which doesn't seem like a clean thing to do.

Wouldn't it be a better solution to keep everything under String and determining the type with a @size() attribute in the schema, or something similar.

From my personal experience, I would always avoid creating indexes for every string field automatically. It's usually best practice to create indexes on demand (like prisma already does with @unique). Why doesn't prisma have something like @index, to specifically create indexes for the fields that need one (fields the user wants to query on)? Creating indexes for all String fields could create lots of data overhead on large datasets, while a user might not even need indexes for some of those fields.

By creating indexes only for fields that have @unique, @index and @id (primary) attributes, you avoid this data overhead and give the user full control of indexing fields. If a user still decides to query on fields that don't have an index, you could even let photon log a warning about quering using an unindexed field in WHERE.

GuusDeGraeve on 21 Nov 2019

👍1

I know you guys are working on a solution, but I just want to address a few concerns I have about the current situation and the proposed solutions above:

So if I understand correctly, at the moment there is no way of storing anything longer than 191 characters in prisma? To be honest, for my personal use this is a huge dealbreaker. I just bumped into the same issue for a project I'm working on.

Having Text, LongText, Blob types in the schema is a potential fix but in that case prisma would be changing its schema definition types towards one specific datasource (mysql types). Which doesn't seem like a clean thing to do.

Wouldn't it be a better solution to keep everything under String and determining the type with a @size() attribute in the schema, or something similar.

From my personal experience, I would always avoid creating indexes for every string field automatically. It's usually best practice to create indexes on demand (like prisma already does with @unique). Why doesn't prisma have something like @index, to specifically create indexes for the fields that need one (fields the user wants to query on)? Creating indexes for all String fields could create lots of data overhead on large datasets, while a user might not even need indexes for some of those fields.

By creating indexes only for fields that have @unique, @index and @id (primary) attributes, you avoid this data overhead and give the user full control of indexing fields. If a user still decides to query on fields that don't have an index, you could even let photon log a warning about quering using an unindexed field in WHERE.

I really have to agree with you! I love prisma2 and we are using it on our current project. But we are arriving at a critical point where we are really in need of a BLOB at this point. We ar building huge graph maps and need to keep some relationship references and the character limitation lets us store to ids in one field when we need a list of 20.

We are arriving at the point were we are hesitating to just drop prisma2 but that would mean having to refactor the whole code.

@tomhoule would you have any kind of estimation on when that might happen?

crimson-med on 27 Nov 2019

After reading this discussion am I correct in that prisma2 will only support data with a length of less than 191 characters.

garyro on 15 Dec 2019

After reading this discussion am I correct in that prisma2 will only support data with a length of less than 191 characters.

@garyro for mysql only. If you use postgresql you should be fine.

crimson-med on 16 Dec 2019

👍1

@crimson-med : Correct 👍 . 🙏
@garyro : This is a current limitation. We will lift this limitation once we release lift. See more details on our roadmap on our blog.

mavilein on 16 Dec 2019

👍1

I know you guys are working on a solution, but I just want to address a few concerns I have about the current situation and the proposed solutions above:

So if I understand correctly, at the moment there is no way of storing anything longer than 191 characters in prisma? To be honest, for my personal use this is a huge dealbreaker. I just bumped into the same issue for a project I'm working on.

Having Text, LongText, Blob types in the schema is a potential fix but in that case prisma would be changing its schema definition types towards one specific datasource (mysql types). Which doesn't seem like a clean thing to do.

Wouldn't it be a better solution to keep everything under String and determining the type with a @size() attribute in the schema, or something similar.

From my personal experience, I would always avoid creating indexes for every string field automatically. It's usually best practice to create indexes on demand (like prisma already does with @unique). Why doesn't prisma have something like @index, to specifically create indexes for the fields that need one (fields the user wants to query on)? Creating indexes for all String fields could create lots of data overhead on large datasets, while a user might not even need indexes for some of those fields.

By creating indexes only for fields that have @unique, @index and @id (primary) attributes, you avoid this data overhead and give the user full control of indexing fields. If a user still decides to query on fields that don't have an index, you could even let photon log a warning about quering using an unindexed field in WHERE.

Is there any news about the type mapping spec? More specifically regarding the concerns I have raised in this post? I checked the specs repo but couldn't really find anything aside from the already mentioned pull request (https://github.com/prisma/specs/pull/291) which hasn't gotten an update since november.

GuusDeGraeve on 15 Jan 2020

👍3

Internal note: dropping from current sprint

divyenduz on 31 Jan 2020

Any updates on this? Seeing it was dropped multiple times from the current sprin by divyenduz

The possibility to add TEXT fields on schema is necessarily for all bigger projects.

Please push this topic. Thank you!

PS: Need help? Like to work on this topic to get this integration done.

ChristophDietrich on 28 Mar 2020

👍13

Bumping this because it's been open for a while, and it's pretty important. Json data type is now supported which creates JSON => LONGTEXT in MySQL, so the next logical step would be to allow longer fields.

It's troublesome because LONGTEXT in SQL maps to String when introspecting, but there's no way for the opposite, so our databases using the same Prisma schema turn out inconsistent.

AnandChowdhary on 13 May 2020

👍3

Bumping! Also will this take into account if you wanted to set a custom character limit? Is it maybe a new @ directive?

firstName  String  @length(64)
bio        String  @length(text)

Maybe?

cannikin on 29 Jun 2020

👍15 👀1 🎉1

Checking in from the RedwoodJS community (along with Rob above). We're starting to see more conversation about this amongst the community -- it's moving up in priority as a request + need.

If there's anything we can do to help, please don't hesitate to ask. Thanks again, Team Prisma!

thedavidprice on 11 Jul 2020

Just a quick sidenote: It's a bit weird that Prisma is on v2.2.0 and this issue has been open since the betas. Maybe it'd be a good idea to make a milestone for features that are extremely important for any production use case, like a custom length of field or defaults in JSON (prisma/prisma#2556), that should've been there in the stable v2.0.0 but are missing. Perhaps still being in beta would've been better?

AnandChowdhary on 11 Jul 2020

Definitely a priority for my team as well! There are a lot of other issues pointing to this same topic and my understanding is that it should be resolved as part of https://github.com/prisma/prisma/issues/446

It would be awesome if this could be pushed as a priority for the next release.

maoosi on 13 Jul 2020

It looks like it will never come and taking not very seriously.

Other less important topics are more important for them, really disappointed for myself.

Fixing bugs / problems, should be the highest priority on working.

This topic blocks all my important projects now.

My clients getting very uncomfortable with this new technology and thinking about to spent a lot of time and money to rewrite everything into other db adapter ...

Please @janpio can you push this internally?!?

ChristophDietrich on 13 Jul 2020

Hi everyone,
We understand the frustration. We're looking at solving this through native type support and started conversations around how we could go about this. This is key for us to solve this right. This also comes with quite a few implications on the entire product (introspection, schema, client and migrations) to ensure we not only solve this specific issue, but make it possible to have sustainable types support once we open up to other database vendors.
We'll come back once the epic created to track that here: https://github.com/prisma/prisma/issues/446
Thanks for your patience!

thebiglabasky on 13 Jul 2020

👍4 ❤1

Hey @ChristophDietrich, @maoosi and others! Thanks a lot for sharing your thoughts and your activity here, this definitely helps us prioritize this feature better! As Hervé pointed out that this change comes with a number of implications, so it's not as trivial and quick to solve.

However, I just want to point out that you can totally use any native DB types with Prisma (just not with Prisma Migrate, yet)! Taking the example from this issue of changing VARCHAR(191) to TEXT.

Assume you have the following Prisma model:

model User {
  id   Int    @id @default(autoincrement())
  name String
}

Prisma Migrate creates the following SQL table for you when running save and up:

CREATE TABLE `User` (
  `id` INT(11) NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`id`)
);

You can now manually change the the type of the name column from VARCHAR(191) to TEXT as follows:

ALTER TABLE `User` MODIFY `name` TEXT NOT NULL;

Once you've done this, you can just keep using Prisma Client as before!

nikolasburk on 13 Jul 2020

👎9 👍5 😕3

@nikolasburk
Thanks for sharing, manually is not very good.

Sure i can change everything manually ... but when working with dev, stage and production and when you don't have permission to one of the latest system, you are not having a good deployment.

No one likes to have a separate sql file to take care of and handling the update process by themselves.

What about to introduce a possibility to add column type changes like you show here with ALTER Table in the "prisma migration" process unless Prisma team is not working on this topic in "full" detail?

ChristophDietrich on 3 Aug 2020

@ChristophDietrich : We are working currently very actively on Prisma Migrate. This can also be seen on our public roadmap. So this restriction will get lifted soonish.

mavilein on 4 Aug 2020

👍2

Tracking issue for Text in the prisma schema: https://github.com/prisma/prisma-engines/issues/72

tomhoule on 5 Nov 2020

Prisma: Generate TEXT fields from String

Most helpful comment

All 60 comments

Related issues