Prisma1: Prisma client is producing too many DB queries

Created on 22 Nov 2018 · 9Comments · Source: prisma/prisma1

Describe the bug
If we have this datamodel:

type User {
  id: ID! @unique
  name: String!
  courses: [Course!]!
}

type Course {
  id: ID! @unique
  title: String!
  owner: User!
}

And we want to make this query:

query getAllUsers {
  getAllUsers{
    id name courses {
        title
    }
  }
}

then I found out that, Prisma Client is producing these queries:

Query:
{
  users {
    id
    name
  }
}

and for each user in DB:

Query:
query ($where: UserWhereUniqueInput!) {
  user(where: $where) {
    courses {
      id
      title
    }
  }
}

Variables:
{"where":{"id":"cjmqbxjg8000i0a34qsluv5x9"}}

Prisma Client's queries causes that underlaying database (in my case Postgres) executes:

One query to get all users
For each user a separate query to get user's id
For each user a separate query for getting data from RelationTable(s).

If we sum all queries then we get the final number of DB queries executed equivalent to:
(numOfUsers * 2) + 1

To Reproduce
Steps to reproduce see this repo:
Repo

Expected behavior
I was expecting that Prisma Client is producing more optimized queries (less DB round trips), something like we can achieve with Facebook dataloader.

Versions (please complete the following information):

OS: OS X Mojave
prisma CLI: 1.20.7
Prisma Server: 1.20.7

kinquestion

Source

tadejstanic

👍8

Most helpful comment

@tadejstanic Thanks for the link!
So this is expected behavior and the way the client works for now.
As first all users are received and then for each user the courses being fetched, it's performing a round trip. This is mostly related to how graphql-js works, but also how it works together with the client.

Let me dissect this situation. We have 2 nonoptimal situations going on here:

A roundtrip happening after the .users() call has been done.
For each individual user, executing .user({ id: root.id }).courses(), resulting in many queries being sent.

For 1. you could eager load the courses, too. We'll introduce an API called $deep() in the Prisma Client to make this happen. While using the client, you could already use $fragment() or $graphql() methods to load more than just the scalar fields. The limitation here would be, that if you don't want to query the courses, you're loading them for nothing.

To actually solve both problems, you can use schema delegation. Schema delegation can be leveraged through prisma-binding. You can find an example for schema delegation in graphql resolvers here
Schema delegation means here, that you're delegating the resolving of fields and selection of fields to the resulting schema. It's achieved with passing through the info object of the resolver, which contains the selection set of the incoming query. In other words, if a query comes in, requesting the fields id and name, only these 2 fields are being requested by Prisma. This works perfectly when the application schema and Prisma schema match. However, as soon as they don't match anymore, Schema delegation gets pretty complicated. Because of that, we're recommending to start out with the Prisma Client - with the cost of being less performant.

You can build a graphql server from scratch with schema delegation, but it's a more advanced topic, so we introduce most of our users to the Client for now.

Coming back to the many queries being sent with the client right now: We're already leveraging batching in the client under the hood already. The current batching looks like this:
Instead of sending a payload of this form:

{
  "query": "{ myQuery }"
}

We send queries in a batched manner like this:

[
  {
    "query": "{ myQuery }"
  },
  {
    "query": "{ myQuery2 }"
  }
]

This can be processed in parallel by the Prisma server, making the setup less inefficient.

TL DR;
To have the minimal amount of queries being sent, Schema Delegation is the way to go.

If you always want the courses in your example anyways, you can eager load them with $graphql() or $fragment() for now and soon using the $deep() method.

Does this answer your question? I'll only close this issue when you have a satisfiable answer because many other people will also think about this problem.

timsuchanek on 26 Nov 2018

👍6

All 9 comments

Thanks for reporting @tadejstanic how is the call to the Prisma Client api exactly looking like, that is producing this query?

timsuchanek on 22 Nov 2018

@timsuchanek here is a direct link to the code where Prisma Client is being called.

https://github.com/tadejstanic/prisma-test-js/blob/master/index.js

tadejstanic on 22 Nov 2018

Let me dissect this situation. We have 2 nonoptimal situations going on here:

A roundtrip happening after the .users() call has been done.
For each individual user, executing .user({ id: root.id }).courses(), resulting in many queries being sent.

You can build a graphql server from scratch with schema delegation, but it's a more advanced topic, so we introduce most of our users to the Client for now.

{
  "query": "{ myQuery }"
}

We send queries in a batched manner like this:

[
  {
    "query": "{ myQuery }"
  },
  {
    "query": "{ myQuery2 }"
  }
]

This can be processed in parallel by the Prisma server, making the setup less inefficient.

TL DR;
To have the minimal amount of queries being sent, Schema Delegation is the way to go.

If you always want the courses in your example anyways, you can eager load them with $graphql() or $fragment() for now and soon using the $deep() method.

Does this answer your question? I'll only close this issue when you have a satisfiable answer because many other people will also think about this problem.

timsuchanek on 26 Nov 2018

👍6

@timsuchanek thank you for this awesome explanation which shows how strong Prisma and its OSS packages are.

Here are my findings so far:

I made a test with prisma-binding as you suggested and managed to get only two DB queries executed with the same graphQL query. The only concern with this approach is the situation when application schema and Prisma schema don't match entirely. I haven't had any experience with that situation so far but I can imagine that it could introduce some unnecessary code complexity.
Then I tried with $fragment() API. With it I've also managed to reduce DB queries at the same number as with prisma-binding. Maybe writing fragment feels as an extra step comparing to schema delegation, but on the other hand you're working with Prisma Client which is more flexible solution.

I'm thinking, we could create fragment definition dynamically from infoobject? Something like this pseudo code shows:

async getAllUsers(root, args, ctx, info) {
    const fragment = generateFragmentFromInfo(info)
    return await prisma.users().$fragment(fragment);
}

We could also generate the fragment via decorator, so the resolver function body would be quite similar to the one in prisma-binding. However, I'm not sure whether this solution would work in all cases. It would be great to hear your opinion.

Also looking forward to use $deep() method which will give us another option to make graphQL server more performant.

tadejstanic on 28 Nov 2018

@timsuchanek thanks for the detailed explanation. Spent some time on this yesterday and finally feel like I have a good grasp on it.

Perhaps it would be useful to mention this performance limitation in the docs here, with some suggested design patterns.

I'm not sure I have suggestions on a recommended pattern. There are a few solutions I can see from the original example:

query getAllUsers {
  getAllUsers{
    id name courses {
        title
    }
  }
}

The options include:

Schema Delegation lets the client specify exactly what it wants, which is the cleanest solution here. But it only works to the extent that your client and server GraphQL APIs are the same (Edit: passing the 'info' object along is possible in prisma-binding but not prisma client)
Always retrieve the courses relation when you retrieve a user, which could be a performance bottleneck (but may be the sweet spot here for some use cases?)
Need to duplicate the client logic on the server, to fetch the 'courses' in a performant way only for the 'getAllUsers' query (and again for any other queries which need something similar). The server method would need to be updated if the client needed a different nested relation

@tadejstanic your idea of generating the fragment from the info object is interesting, but I'm unclear as to how it is better than schema delegation with the info object? It is clearly dynamic, and thus provides no type-safety, just like schema delegation. In other words it also requires your application and server graphQL APIs to match

vjsingh on 28 Nov 2018

@vjsingh Generating the fragment from info object is not better than schema delegation. In fact, it uses the same approach behined the scenes.

The main difference is that you don't need to bring prisma-binding into the project. However, after reading this post , I think my idea is rather a workaround since schema delegation in any form is not something that Prisma Client would support. There was also a$delegate method which has been removed from the Prisma client API.

I'd agree with you that Prisma community is looking for a recommended pattern for handling such situations. Maybe the upcoming $deep method will pave the way to the pattern. However, I'm afraid we won't be able to avoid overfetching the data at the end, unless some additional logic is implemented as you suggested in the 3rd option.

tadejstanic on 29 Nov 2018

Any updates on this? Querying nested data is a bread-and-butter operation, and I'm not sure what the recommended pattern is here. The two main issues being code simplicity / maintainability, and performance implications re: generating too many queries against the network / database

Edit: I found https://github.com/prisma/prisma/issues/3668. I suppose that's the proposal to wait for? @schickling

vjsingh on 2 Jan 2019

👍1

@vjsingh "make it work, make it right, make it fast"

I believe the most simple and correct approach is to delegate nested fetching to type resolvers.

For example: From web query return web and resolve web creator in Web type resolver itself.

export const Web: WebResolvers.Type = {
  ...WebResolvers.defaultResolvers,
  creator: (parent, _, ctx) => {
    return ctx.db.web({ id: parent.id }).creator();
  },
};

If slow, use fragment or some custom dataloader or raw SQL query.

steida on 12 Jan 2019

👍1

Hey there, I wanted to get some clarity around the way Prisma is described in the intro

comes with a built-in dataloader to solve the N+1 problem.

It seems to me that this is actually not true given the issue described by @timsuchanek , which ends up being bad with deep levels of nesting -- or is this document talking about the $fragment API? (which seems a bit misleading). It batches the query to the Prisma server, but then the Prisma server itself ends up doing a quadratic number of queries against the underlying DB.