Prisma-client-js: Built-In DataLoader

Created on 25 Jul 2019 · 12 Comments · Source: prisma/prisma-client-js

It would be nice if Photon had DataLoader built in. In other words, multiple requests to the Photon API would be batched and cached to improve performance.

Here is a very small example and use case:

For calls to Photon's findOne like the one below, Photon would batch the calls together into one request to the database. It would also cache the returned authors so that subsequent calls to retrieve the same authors within the same request simply return the cached version.

const Post = {
  author: ({ id }, args, context) => {
    return context.photon.posts
      .findOne({
        where: {
          id,
        },
      })
      .author()
  },
}

There would have to be some way to make Photon aware of different requests, as caching should not be shared across requests. I do not know if this is already possible or not. The built-in DataLoader should work for batching on findOne and findMany (which includes where clauses, orderBy and pagination).
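
For illustration, here is a minimal hand-written sketch of the behavior described above: loads issued together are batched into one bulk call, results are cached, and the cache is scoped to a single request. None of this is a Photon API. `fetchAuthorsByPostIds`, `createLoader` and `createRequestContext` are hypothetical names, and the "database" is an in-memory stand-in.

```javascript
// Hypothetical stand-in for a bulk database query (NOT a Photon API).
// It records every call so the number of round trips is visible.
const bulkCalls = [];
async function fetchAuthorsByPostIds(postIds) {
  bulkCalls.push([...postIds]);
  return postIds.map((postId) => ({ postId, name: `author-of-${postId}` }));
}

// Minimal DataLoader-style helper (assumption: batchFn returns values
// in the same order as the keys it receives).
function createLoader(batchFn) {
  const cache = new Map(); // key -> Promise of the loaded value
  let queue = []; // pending { key, resolve, reject } entries

  async function flush() {
    const batch = queue;
    queue = [];
    try {
      const values = await batchFn(batch.map((item) => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }

  return {
    load(key) {
      // Cache hit: reuse the promise, no new query is queued.
      if (cache.has(key)) return cache.get(key);
      const promise = new Promise((resolve, reject) => {
        // The first key of a batch schedules a single flush.
        if (queue.length === 0) queueMicrotask(flush);
        queue.push({ key, resolve, reject });
      });
      cache.set(key, promise);
      return promise;
    },
  };
}

// One loader per incoming API request, so cached rows are never
// shared across requests.
function createRequestContext() {
  return { authorLoader: createLoader(fetchAuthorsByPostIds) };
}

// Resolver shaped like the one in the issue, but going through the loader.
const Post = {
  author: ({ id }, args, context) => context.authorLoader.load(id),
};
```

Calling `Post.author` for several posts within the same request context would result in a single call to `fetchAuthorsByPostIds`, and a new context from `createRequestContext()` starts with an empty cache.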

Issue #151 is related, but at a lower level I believe, and not exactly what I am talking about.

I do not know if this is possible, but it would be extremely helpful, as DataLoader is essential to GraphQL APIs.

kind/feature

ryan-cat

👍12

Most helpful comment

Thanks for starting this discussion!
This will definitely be solved before GA!

There are two major optimizations that have to happen before GA:

1. Parallelization in the Rust query engine: This one is of far higher importance, because as of now the query engine can only execute one query at a time.
2. Dataloader pattern, which is "batching and caching queries per api request": Once the underlying execution model in the query engine is able to process queries in parallel, we can look into this model.

Important to understand: Dataloader = query batching + per request caching

Per request caching
Compared to the effect that batching has on performance, the effect of per-request query caching is not very high. It only helps when the response payload includes a lot of duplicates, as they wouldn't be fetched again.
It's fairly easy to implement, but the cache then has to be managed in the Node application: you need to tell Photon when to "clear the cache", optimally on a per-API-request basis.
We'll explore how to expose this functionality.
When we introduced it in Prisma 1, it mostly caused problems while not having many benefits. If you forget to reset the query cache, for example, you'll just get stale data.
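
To make the stale-data pitfall concrete, here is a small sketch of a query cache that has to be cleared by hand. It does not reflect how Prisma 1 or Photon implemented caching. `db`, `findPostById` and `createQueryCache` are hypothetical names used only for this illustration.

```javascript
// Hypothetical in-memory "database" standing in for real storage.
const db = new Map([[1, { id: 1, title: "Draft" }]]);

// Hypothetical query function (NOT a Prisma or Photon API).
async function findPostById(id) {
  const row = db.get(id);
  return row ? { ...row } : null;
}

// A query cache that keeps results until someone explicitly clears it.
function createQueryCache(queryFn) {
  const cache = new Map(); // id -> Promise of the query result
  return {
    get(id) {
      if (!cache.has(id)) cache.set(id, queryFn(id));
      return cache.get(id);
    },
    clear() {
      cache.clear();
    },
  };
}

async function demo() {
  const posts = createQueryCache(findPostById);
  const first = await posts.get(1); // hits the "database"
  db.set(1, { id: 1, title: "Published" }); // the data changes afterwards
  const stale = await posts.get(1); // cache was not cleared: old row comes back
  posts.clear(); // e.g. done at the start of the next API request
  const fresh = await posts.get(1); // hits the "database" again
  return { first: first.title, stale: stale.title, fresh: fresh.title };
}
```

In this sketch the second read still returns the old title because nothing cleared the cache in between, which is the kind of problem that forgetting to reset a query cache leads to.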

Query Batching
When talking about DataLoader, we're mostly talking about batching queries together, so that we can query the database in bulk instead of issuing many single queries. This is something Photon will just do under the hood without you having to manage it. As soon as we have the parallel execution in the query engine fixed, we'll look into this optimization.

TL;DR
The effect on performance of these optimizations is ordered like this, highest first:

1. Execution parallelization
2. Query Batching
3. Per request caching (deduplication)

1 & 2 don't need any refactoring in the Node app, while 3 doesn't bring a lot of bang for the buck and needs careful handling/restructuring of the application.

timsuchanek on 31 Jul 2019

❤25

All 12 comments

I think I had the same issue when there are more than 1,000 users. The following query then returns "408 Request Timeout" and makes the prisma2 server totally useless.

    t.list.field("posts", {
      type: "Post",
      resolve: (parent, {}, ctx) => {
        console.log("ctx.photon.posts.findMany");
        return ctx.photon.posts.findMany({
          include: { user: true }
        });
      }
    });

dex4er on 30 Jul 2019

😕2

@pantharshit00 - Just to clarify, does the 'priority/mid' label mean that it's planned to be implemented before general release, or after? Our primary pain point of Prisma 1 was performance (mainly due to the default many-to-many relational tables on the 'old' datamodel), so seeing an issue for DataLoader surprised me. I would assume this to be part of Prisma 2, as it would otherwise, as @dex4er noted, be useless for even moderately sized apps. 😄

jhalborg on 31 Jul 2019

First of all, thanks for all the work you have been putting in!

Wondering if we could get an update on the ETA for Execution Parallelism? This is the number one issue preventing us from moving forward with Prisma2.

Thanks again!

nlarusstone on 24 Sep 2019

This was mentioned here, but is only related to this issue - so you might have better luck creating a new issue @nlarusstone.

janpio on 24 Sep 2019

Thanks @janpio -- created #236

nlarusstone on 24 Sep 2019

This will be unblocked once https://github.com/prisma/specs/issues/242 is specced and implemented.

schickling on 10 Oct 2019

Thanks a lot for reporting 🙏
This issue is fixed in the latest version of prisma2.
You can try it out with npm i -g prisma2.

In case it’s not fixed for you - please let us know and we’ll reopen this issue!

timsuchanek on 5 Mar 2020

🎉6 ❤1

Thanks a lot for reporting 🙏
This issue is fixed in the latest version of prisma2.
You can try it out with npm i -g prisma2.

In case it’s not fixed for you - please let us know and we’ll reopen this issue!

Could you confirm, is there automated batching of queries? What about caching? Could this be disabled? I can't see any documentation about this. I want to use prisma as a drop in replacement for raw sql. I don't want it coupled to any other parts of the app as I might rip it back out again or mix and match. I would rather that I could just use data-loader manually.

MattGson on 8 Apr 2020

The batching is very limited and only applies to findOne with the same params. There is no caching. I am also tracking all Prisma issues to make sure it doesn't become an ORM monster!

Sytten on 8 Apr 2020

👍2

Could this be disabled? I can't see any documentation about this. I want to use prisma as a drop in replacement for raw sql. I don't want it coupled to any other parts of the app as I might rip it back out again or mix and match. I would rather that I could just use data-loader manually.

@MattGson It might be worth opening a new issue and asking for this as a feature request to make sure this is considered.

janpio on 9 Apr 2020

Could this be disabled? I can't see any documentation about this. I want to use prisma as a drop in replacement for raw sql. I don't want it coupled to any other parts of the app as I might rip it back out again or mix and match. I would rather that I could just use data-loader manually.

@MattGson It might be worth opening a new issue and asking for this as a feature request to make sure this is considered.

Done: https://github.com/prisma/prisma-client-js/issues/645#issue-597632265

MattGson on 10 Apr 2020

👍3