Describe the bug
If we have this datamodel:
type User {
id: ID! @unique
name: String!
courses: [Course!]!
}
type Course {
id: ID! @unique
title: String!
owner: User!
}
And we want to make this query:
query getAllUsers {
getAllUsers{
id name courses {
title
}
}
}
then I found out that, Prisma Client is producing these queries:
Query:
{
users {
id
name
}
}
and for each user in DB:
Query:
query ($where: UserWhereUniqueInput!) {
user(where: $where) {
courses {
id
title
}
}
}
Variables:
{"where":{"id":"cjmqbxjg8000i0a34qsluv5x9"}}
Prisma Client's queries causes that underlaying database (in my case Postgres) executes:
If we sum all queries then we get the final number of DB queries executed equivalent to:
(numOfUsers * 2) + 1
To Reproduce
Steps to reproduce see this repo:
Repo
Expected behavior
I was expecting that Prisma Client is producing more optimized queries (less DB round trips), something like we can achieve with Facebook dataloader.
Versions (please complete the following information):
prisma CLI: 1.20.7Thanks for reporting @tadejstanic how is the call to the Prisma Client api exactly looking like, that is producing this query?
@timsuchanek here is a direct link to the code where Prisma Client is being called.
https://github.com/tadejstanic/prisma-test-js/blob/master/index.js
@tadejstanic Thanks for the link!
So this is expected behavior and the way the client works for now.
As first all users are received and then for each user the courses being fetched, it's performing a round trip. This is mostly related to how graphql-js works, but also how it works together with the client.
Let me dissect this situation. We have 2 nonoptimal situations going on here:
.users() call has been done..user({ id: root.id }).courses(), resulting in many queries being sent.For 1. you could eager load the courses, too. We'll introduce an API called $deep() in the Prisma Client to make this happen. While using the client, you could already use $fragment() or $graphql() methods to load more than just the scalar fields. The limitation here would be, that if you don't want to query the courses, you're loading them for nothing.
To actually solve both problems, you can use schema delegation. Schema delegation can be leveraged through prisma-binding. You can find an example for schema delegation in graphql resolvers here
Schema delegation means here, that you're delegating the resolving of fields and selection of fields to the resulting schema. It's achieved with passing through the info object of the resolver, which contains the selection set of the incoming query. In other words, if a query comes in, requesting the fields id and name, only these 2 fields are being requested by Prisma. This works perfectly when the application schema and Prisma schema match. However, as soon as they don't match anymore, Schema delegation gets pretty complicated. Because of that, we're recommending to start out with the Prisma Client - with the cost of being less performant.
You can build a graphql server from scratch with schema delegation, but it's a more advanced topic, so we introduce most of our users to the Client for now.
Coming back to the many queries being sent with the client right now: We're already leveraging batching in the client under the hood already. The current batching looks like this:
Instead of sending a payload of this form:
{
"query": "{ myQuery }"
}
We send queries in a batched manner like this:
[
{
"query": "{ myQuery }"
},
{
"query": "{ myQuery2 }"
}
]
This can be processed in parallel by the Prisma server, making the setup less inefficient.
TL DR;
To have the minimal amount of queries being sent, Schema Delegation is the way to go.
If you always want the courses in your example anyways, you can eager load them with $graphql() or $fragment() for now and soon using the $deep() method.
Does this answer your question? I'll only close this issue when you have a satisfiable answer because many other people will also think about this problem.
@timsuchanek thank you for this awesome explanation which shows how strong Prisma and its OSS packages are.
Here are my findings so far:
I made a test with prisma-binding as you suggested and managed to get only two DB queries executed with the same graphQL query. The only concern with this approach is the situation when application schema and Prisma schema don't match entirely. I haven't had any experience with that situation so far but I can imagine that it could introduce some unnecessary code complexity.
Then I tried with $fragment() API. With it I've also managed to reduce DB queries at the same number as with prisma-binding. Maybe writing fragment feels as an extra step comparing to schema delegation, but on the other hand you're working with Prisma Client which is more flexible solution.
I'm thinking, we could create fragment definition dynamically from infoobject? Something like this pseudo code shows:
async getAllUsers(root, args, ctx, info) {
const fragment = generateFragmentFromInfo(info)
return await prisma.users().$fragment(fragment);
}
We could also generate the fragment via decorator, so the resolver function body would be quite similar to the one in prisma-binding. However, I'm not sure whether this solution would work in all cases. It would be great to hear your opinion.
Also looking forward to use $deep() method which will give us another option to make graphQL server more performant.
@timsuchanek thanks for the detailed explanation. Spent some time on this yesterday and finally feel like I have a good grasp on it.
Perhaps it would be useful to mention this performance limitation in the docs here, with some suggested design patterns.
I'm not sure I have suggestions on a recommended pattern. There are a few solutions I can see from the original example:
query getAllUsers {
getAllUsers{
id name courses {
title
}
}
}
The options include:
@tadejstanic your idea of generating the fragment from the info object is interesting, but I'm unclear as to how it is better than schema delegation with the info object? It is clearly dynamic, and thus provides no type-safety, just like schema delegation. In other words it also requires your application and server graphQL APIs to match
@vjsingh Generating the fragment from info object is not better than schema delegation. In fact, it uses the same approach behined the scenes.
The main difference is that you don't need to bring prisma-binding into the project. However, after reading this post , I think my idea is rather a workaround since schema delegation in any form is not something that Prisma Client would support. There was also a$delegate method which has been removed from the Prisma client API.
I'd agree with you that Prisma community is looking for a recommended pattern for handling such situations. Maybe the upcoming $deep method will pave the way to the pattern. However, I'm afraid we won't be able to avoid overfetching the data at the end, unless some additional logic is implemented as you suggested in the 3rd option.
Any updates on this? Querying nested data is a bread-and-butter operation, and I'm not sure what the recommended pattern is here. The two main issues being code simplicity / maintainability, and performance implications re: generating too many queries against the network / database
Edit: I found https://github.com/prisma/prisma/issues/3668. I suppose that's the proposal to wait for? @schickling
@vjsingh "make it work, make it right, make it fast"
I believe the most simple and correct approach is to delegate nested fetching to type resolvers.
For example: From web query return web and resolve web creator in Web type resolver itself.
export const Web: WebResolvers.Type = {
...WebResolvers.defaultResolvers,
creator: (parent, _, ctx) => {
return ctx.db.web({ id: parent.id }).creator();
},
};
If slow, use fragment or some custom dataloader or raw SQL query.
Hey there, I wanted to get some clarity around the way Prisma is described in the intro
comes with a built-in dataloader to solve the N+1 problem.
It seems to me that this is actually not true given the issue described by @timsuchanek , which ends up being bad with deep levels of nesting -- or is this document talking about the $fragment API? (which seems a bit misleading). It batches the query to the Prisma server, but then the Prisma server itself ends up doing a quadratic number of queries against the underlying DB.
Most helpful comment
@tadejstanic Thanks for the link!
So this is expected behavior and the way the client works for now.
As first all users are received and then for each user the
coursesbeing fetched, it's performing a round trip. This is mostly related to how graphql-js works, but also how it works together with the client.Let me dissect this situation. We have 2 nonoptimal situations going on here:
.users()call has been done..user({ id: root.id }).courses(), resulting in many queries being sent.For 1. you could eager load the courses, too. We'll introduce an API called
$deep()in the Prisma Client to make this happen. While using the client, you could already use$fragment()or$graphql()methods to load more than just the scalar fields. The limitation here would be, that if you don't want to query thecourses, you're loading them for nothing.To actually solve both problems, you can use schema delegation. Schema delegation can be leveraged through
prisma-binding. You can find an example for schema delegation in graphql resolvers hereSchema delegation means here, that you're delegating the resolving of fields and selection of fields to the resulting schema. It's achieved with passing through the
infoobject of the resolver, which contains the selection set of the incoming query. In other words, if a query comes in, requesting the fieldsidandname, only these 2 fields are being requested by Prisma. This works perfectly when the application schema and Prisma schema match. However, as soon as they don't match anymore, Schema delegation gets pretty complicated. Because of that, we're recommending to start out with the Prisma Client - with the cost of being less performant.You can build a graphql server from scratch with schema delegation, but it's a more advanced topic, so we introduce most of our users to the Client for now.
Coming back to the many queries being sent with the client right now: We're already leveraging batching in the client under the hood already. The current batching looks like this:
Instead of sending a payload of this form:
We send queries in a batched manner like this:
This can be processed in parallel by the Prisma server, making the setup less inefficient.
TL DR;
To have the minimal amount of queries being sent, Schema Delegation is the way to go.
If you always want the courses in your example anyways, you can eager load them with
$graphql()or$fragment()for now and soon using the$deep()method.Does this answer your question? I'll only close this issue when you have a satisfiable answer because many other people will also think about this problem.