Prisma-client-js: Support case insensitivity when querying

Created on 14 May 2020  路  41Comments  路  Source: prisma/prisma-client-js

Problem

When filtering and sorting by string attributes, the case sensitivity will default to the underlying database behavior. Different DB engines have different defaults when it comes to case sensitivity:

Depending on the use case, there might be a need to support case insensitive filtering and sorting. Existing workarounds (see below) can be quite complex and often involve manipulating the database schema.

By allowing to programmatically specify case sensitivity when filtering/sorting, we would reduce significantly reduce friction and make developers more productive.

Suggested solution

Alternatives

There are many approaches and workarounds to supporting optional case insensitivity:

  • duplicating columns with a lower case copy of the string and using that for filtering and sorting
  • using the LOWER function in where clauses and (optionally) additional case insensitive indexes (requires special work)
  • using ILIKE in PostgreSQL (does not use an index!)
  • use CITEXT (case-insensitive extension) extension for PostgreSQL (Per the docs, using CITEXT is slightly more efficient than calling LOWER in your queries explicitly to perform case-insensitive matching in the database)

Some frameworks and ORMs do not support specifying the case sensitivity at runtime, at all, leaving it up to the developer to use one of the available workarounds.

Additional context

enginequery engine kinfeature roadmawip tecengines tectypescript

Most helpful comment

Merged on the Engine side by @dpetrick, more work on Client side required by @Jolg42.

All 41 comments

Is it possible to at least force prisma to create a citext field instead of a normal text field in postgres?

What is the best way to go about this currently?
Would it be to have a searchText column and dump all your terms you are wanting into there? (with everything forced to lowercase)
eg:

model Something {
  item1: String;
  item2: String;
  item3: String;
  item4: String;
  search: String;
}

//Creating a record (similar pattern for updating)
prisma.something.create({
data: {
  item1: item1,
  item2: item2,
  item3: item3,
  item4: item4,
  search: `{item1.toLowercase()}{item2.toLowercase()}{item3.toLowercase()}{item4.toLowercase()}`
});

//Searching:
prisma.something.findMany({where: {search: {contains: "search string".toLowercase()}});

What is the best way to go about this currently?

@richardwardza Your example is currently the only possiblity I see and it's exactly how I do it. Not very pretty, but it works.

would also be cool to have kind of "computed" properties that will be saved as well on certain mutations, a denormalization

@albertoperdomo @thebiglabasky as discussed in Slack, my Journaly teammate @Lanny and I have started work on this, with https://github.com/prisma/quaint/pull/153 which has been merged and we are planning on continuing work with the remaining pieces this weekend and will try our best to resolve the issue. I'll keep you updated 馃槉

That's awesome @robin-macpherson ! If you have the chance to share the design you and @Lanny have in mind that'd be great so we can have a look and ensure we're all in agreement there.
Thanks again!

That's awesome @robin-macpherson ! If you have the chance to share the design you and @Lanny have in mind that'd be great so we can have a look and ensure we're all in agreement there.
Thanks again!

Thanks @thebiglabasky we'll update you with that as soon as we can!

Hi @thebiglabasky ,

Our thought here was to add a new operator similar to the current contains, perhaps icontains (following the pg like/ilike thing) that eventually boils down to SQL along the lines of WHERE LOWER(some_col) LIKE '%literal%'. That should be portable across the supported SQL DBs. Less sure about indexing, don't know what the support the prisma schema syntax has for specifying index types, and it gets kind of tricky since the index you need to support LIKE queries is significantly different across DBs and PG in particular requires an extension that only exists after some version. In our particular case we'd be OK with just having prisma support for the query and manually creating the index, although it would definitely be nice to have schema support for it too.

I'd let @timsuchanek chime in here. It seems the proposal would only cover the contains use case but the question could be raised for equals (iequals?) and in (iin? 馃槬) etc. right?

I'd let @timsuchanek chime in here. It seems the proposal would only cover the contains use case but the question could be raised for equals (iequals?) and in (iin? 馃槬) etc. right?

Yes that's a great call @thebiglabasky ! If adding support for those other use cases is significantly more challenging, would it be potentially a viable option to start by adding coverage for contains so that we at least enable case-insensitive search while working on that more robust API design? Or would we need to support all of them right away?

cc: @timsuchanek @Lanny

@thebiglabasky @timsuchanek just wanted to follow up on my above comment/suggestion. No worries if implementing this partial support would not be sufficient.

_(Bringing some internal discussion external)_

We're in the process of designing the Prisma JS Client API for this feature.

If you have any additional solutions or pros/cons with going with one of the solutions below, please let us know. We're very interested to hear what you think!

Solution A: Prefix all insensitive queries on the String filter with insensitive

export type StringFilter = {
  equals?: string
  not?: string | StringFilter
  in?: Enumerable<string>
  notIn?: Enumerable<string>
  lt?: string
  lte?: string
  gt?: string
  gte?: string
  contains?: string
  startsWith?: string
  endsWith?: string
  /* case insensitive */
  insensitiveEquals?: string
  insensitiveNot?: string
  insensitiveIn?: string
  insensitiveNotIn?: string
  insensitiveLt?: string
  insensitiveLte?: string
  insensitiveGt?: string
  insensitiveGte?: string
  insensitiveContains?: string
  insensitiveStartsWith?: string
  insensitiveEndsWith?: string
}

Usage

await prisma.users.findMany({
  where: {
    name: {
      insensitiveEquals: 'Alice',
    },
  },
})
  • Pros

    • Just type insensitive and you have autocomplete for all possible actions
    • Only possible on the String filter
    • Explicit (insensitive is spelled out)
    • Does not conflict with field names
  • Cons

    • Explicitness not consistent (e.g. insensitiveGt)
    • Not very flexible for other types of functions like upper or substring. Prefixing may become our general practice for function composition

Solution B: Prefix all insensitive queries on the String filter with i

export type StringFilter = {
  equals?: string
  not?: string | StringFilter
  in?: Enumerable<string>
  notIn?: Enumerable<string>
  lt?: string
  lte?: string
  gt?: string
  gte?: string
  contains?: string
  startsWith?: string
  endsWith?: string
  /* case insensitive */
  iequals?: string
  inot?: string
  iin?: string
  inotIn?: string
  ilt?: string
  ilte?: string
  igt?: string
  igte?: string
  icontains?: string
  istartsWith?: string
  iendsWith?: string
}

Usage

await prisma.users.findMany({
  where: {
    name: {
      iequals: 'Alice',
    },
  },
})
  • Pros

    • Short and concise
    • Only possible on the String filter
    • Does not conflict with field names
  • Cons

    • Implicit, very SQL-centric (e.g. ILIKE)
    • Typing i will give you additional results like iin
    • Not very flexible for other types of functions like upper or substring. Prefixing may become our general practice for function composition

Solution C: { insensitive: true } on the where filter

export type WhereFilter = {
  name?: string | StringFilter
  email?: string | StringFilter
  insensitive?: boolean
}

Usage

await prisma.users.findMany({
  where: {
    name: 'Alice'
    insensitive: true
  }
})
  • Pros

    • Doesn't require us to introduce a prefix on every filter
    • Autocomplete works nicely
  • Cons

    • Would apply to every type, not just strings
    • Could conflict with field names
    • Not composable. If I have additional functions, no clear order of operations

Solution D: Insensitive object wrapper

export type WhereFilter = {
  name?: string | StringFilter | InsensitiveStringFilter
  email?: string | StringFilter | InsensitiveStringFilter
  insensitive?: boolean
}

export type InsensitiveStringFilter = {
  name?: string | StringFilter
  email?: string | StringFilter
}

Usage

await prisma.users.findMany({
  where: {
    name: {
      insensitive: {
        equals: 'Alice',
      },
    },
  },
})
  • Pros

    • Doesn't require us to introduce a prefix on every filter
    • Supports composition: name: { insensitive: { substring: { first: 10, { equals: 'Alice' } } }
    • Does not conflict with field names
    • Only possible on the String filter
    • Autocomplete works nicely
  • Cons

    • More verbose than all the alternatives
    • Composition may lead to weird edge cases.

I have a preference for Solution D as I think is the most flexible, and it doesn't 'pollute' the options like the prefixes do.

When typing it out it seems to feel most natural, as each 'step' gets autocompleted.

where: { name: { insensitive: { equals: 

Also, it seems more in line with other methods, because we have:

data: { project: { connect: { id: 

Instead of:

data: { project: { connectId: 

would also probably vote for D or A

Solution D looks good

  • Allows insensitive filter to apply to all fields, or selected fields
  • Won't break existing queries, as it is additive

Limitation in all solutions:

  • not able to expose insensitivity as a variable to be configured by a GraphQL client

Solution E? - (Unknown if possible)

  • Does GraphQL allow for optional + default values like in javascript?

    • (example definition pseudocode function query(id: string, insensitive?: boolean = null) { ... })

    • (example standard usage: where: { name: 'Alice' })

    • (example insensitive usage: where: { name { equals: 'Alice', insensitive: $variable) } } or similar)


Edit:

  • Replace GraphQL above with Prisma Client =]

@james-ff Prisma is just a type safe database client and is not especially related to GraphQL. So you always have to define this in your queries on your own, when using GraphQL.
And if you're using it with nexus-prisma, it's on the side of Nexus to implement this.

I also would go for Solution D as it's the most flexible solutions and doesn't pollute the options. All the Pros are outweighing it's own Cons and the Pros of the other solutions for me.

Edit: @james-ff just to clarify a little bit more: GraphQL doesn't support default values, but nullable ones. So you can use a default value in JavaScript if there's no value set in the GraphQL variables.

Solution E (below) makes the most sense to me.

Solution E? - (Unknown if possible)

  • Does GraphQL allow for optional + default values like in javascript?

    • (example definition pseudocode function query(id: string, insensitive?: boolean = null) { ... })
    • (example standard usage: where: { name: 'Alice' })
    • (example insensitive usage: where: { name { equals: 'Alice', insensitive: $variable) } } or similar)

Edit:

  • Replace GraphQL above with Prisma Client =]

So Solution E is similar to Solution C but positioned deep inside the query.

await prisma.users.findMany({
  where:  {
    name: { equals: 'Alice', insensitive: true },
    name: { contains: 'Alice', insensitive: true }
  }
})

this is also not a bad idea. It also feels quite natural and doesn't pollute the options.
So my favorites are solutions D or E.

Probably a bit late, but I have another proposal. Maybe Solution F.

It's a little adjustment to Solution E:

await prisma.users.findMany({
  where:  {
    name: { equals: {value: 'Alice', insensitive: true } },
    name: { contains: {value:  'Alice', insensitive: true }}
  }
})

This should not break the current default behaviour:

await prisma.users.findMany({
  where:  {
    name: { contains: 'Alice' },                                 // default is still case sensitive
    name: { contains: {value:  'Alice', insensitive: false}}     // same in new object notation
  }
})

So this would be a very generic format. Each filter could take a direct scalar value for default behaviour (case sensitive) or an object containing the scalar value (as value attribute) and additional options like insensitive. This approach could be interesting for future use, if there may be other options for some filters.
And like solution D or E it feels natural and self descriptive.

Thanks for all the input so far! We're currently looking at how the implementation constraints will allow us to decide on the best possible solution. We'll post an update once done.

Hi everyone, after some more investigations, we realized that the current API proposal was inferring that using insensitive: false could lead to the false expectation that Prisma could guarantee filtering in a case sensitive fashion.
This is not the case since it case sensitivity itself depends on various underlying factors defined at the database level (collation, encoding, native column type). In other words, we can use techniques to perform case insensitive filtering, but we can't do the opposite.

We're going to adjust the API proposal to see if we can get to something which better reflects what the API will do and post back here.

Hi everyone, after some more investigations, we realized that the current API proposal was inferring that using insensitive: false could lead to the false expectation that Prisma could guarantee filtering in a case sensitive fashion.
This is not the case since it case sensitivity itself depends on various underlying factors defined at the database level (collation, encoding, native column type). In other words, we can use techniques to perform case insensitive filtering, but we can't do the opposite.

We're going to adjust the API proposal to see if we can get to something which better reflects what the API will do and post back here.

good catch!

so proposals A or B ?

Thanks for the update @thebiglabasky ! Will be looking forward to the new proposal/adjustments 馃檪

@thebiglabasky This is indeed unclear.

TypeScript would make it possible to explicitly type something to true. For example for Solution D it's possible to use this typing:

export type WhereFilter = {
  name?: string | StringFilter | InsensitiveStringFilter
  email?: string | StringFilter | InsensitiveStringFilter
  insensitive?: true
}

The drawback I see in this solution is that it's not type safe in JavaScript. And it's not directly visible to the the user that only true can be passed when he's not using an IDE which shows an error if false is passed.

Another possible solution would be to to use something like forceInsensitive?: true which sounds more explicit to me.

+1

Thank you everyone for helping us shape this solution. Since guaranteeing insensitive: false at the database-level is a hard problem, we're trying to to simplify the solution and would love to hear your thoughts.

If you have any additional ideas or thoughts on the solutions below, please let us know!

Solution A

Support insensitive: true | undefined type. Exclude insensitive: false. Test for it at runtime for Javascript.

_Pros_

  • Addresses the problem while greatly reducing the complexity.

_Cons_

  • A bit surprising that you can't do insensitive: false. For scripting languages and less advanced type systems (e.g. JS & Go), you'd discover this at runtime.
  • Would need to be checked at runtime in Rust, which we'd like to avoid.

Solution B

Support lower?: boolean instead of insensitive.

_Pros_

  • Addresses the problem while greatly reducing the complexity.
  • lower: false can be the same as lower: undefined

_Cons_

  • SQL-specific.

Solution C

Support forceInsensitive?: boolean

_Pros_

  • Addresses the problem while greatly reducing the complexity.
  • forceInsensitive: false can be the same as forceInsensitive: undefined

_Cons_

  • Forcing insensitivity feels a bit reckless though it's a harmless operation.

Solution D

Support mode?: 'default' | 'insensitive'

_Pros_

  • Addresses the problem while greatly reducing the complexity.
  • Enums provide more clear options
  • Allows us more modes in the future

_Cons_

  • If we need multiple modes at once, it may get complicated.

I would prefer solution A. lower does not express what it does.

@macrozone thanks for your quick feedback! I added a few more solutions based on some internal discussion. Please have another look.

Slight alternative to B): lowercase. lower is very DB specific (as it is a common function name), but lowercase expresses what happens with all the data. (does this only apply to sorting though or also the query results?)

Slight alternative to B): lowercase. lower is very DB specific (as it is a common function name), but lowercase expresses what happens with all the data. (does this only apply to sorting though or also the query results?)

lower and lowercase are both a bit missleading. It suggest that whatever you pass, it would lowercase it and compare it. But in fact, it lowercase both to achieve case insensitive search. Its kind of a workaround.

I would prefer A or D. Its probably bike-shading between the two

Would need to be checked at runtime in Rust, which we'd like to avoid.

maybe do it only on dev mode? this would be similar to prop checks in react.

Also, newer IDEs might do type checking in such cases even if its JS and not TS. And also to be honest: who dares to use prisma without typescript? ;-)

I prefer solution A, because this feels natural to me from a developers perspective. Solution D might get ugly in the future, if there will be more options to filter operations (especially multi modes).

Solution C has the advantage over A that you can use false the same way as undefined for default behavior. If I see it right, otherwise the only real difference between A and C is the prefix "force". Maybee change the prefix to "use"? "useInsensitive" looks more harmless and describes what it does.

Thanks for your feedback!

If I see it right, otherwise the only real difference between A and C is the prefix "force". Maybee change the prefix to "use"? "useInsensitive" looks more harmless and describes what it does.

Just to clarify, Solution C is the most similar to Solution B 鈥撀爄t's just a field name change. Solution A disallows for false, B and C allow for false.

Quick update here. After much discussion, we're planning on going with Solution D.

mode?: 'default' | 'insensitive'

We think this one is the most obvious and also doesn't suffer some of the shortcomings of the other boolean options. Multiple modes is a real concern that we'll be careful about in the future.

Thanks so much everyone for helping us narrow down the possibilities!

Thanks a lot for the update, @matthewmueller ! Do we have a rough timeline for the implementation?

Happening right now, so expected pretty soon.

Merged on the Engine side by @dpetrick, more work on Client side required by @Jolg42.

Is this feature available for 2.5 version and above? (2.4.x??)

Yes it is now available in 2.5.0 as a preview feature with a flag: release notes are here: https://github.com/prisma/prisma/releases/tag/2.5.0

Yes it is now available in 2.5.0 as a preview feature with a flag: release notes are here: https://github.com/prisma/prisma/releases/tag/2.5.0

Alright! Thanks for the reply @Jolg42 馃憤

If the preview feature works well or if you have anything to share, your feedback will be greatly appreciated!

You can post comments and feedback in this issue:
Preview feature feedback: insensitiveFilters #3340

If you encounter a bug, please open a bug report

Sure, I'll report if something weird appears 馃槈

Was this page helpful?
0 / 5 - 0 ratings

Related issues

emroot picture emroot  路  4Comments

divyenduz picture divyenduz  路  3Comments

FluorescentHallucinogen picture FluorescentHallucinogen  路  3Comments

esistgut picture esistgut  路  4Comments

macrozone picture macrozone  路  4Comments