Prisma-client-js: Built-In DataLoader

Created on 25 Jul 2019 · 12 Comments · Source: prisma/prisma-client-js

It would be nice if Photon had DataLoader built in. In other words, multiple requests to the Photon API would be batched and cached to improve performance.

Here is a very small example and use case:

For the calls to Photon findOne shown below, Photon would batch the calls together into one request to the database. It would also cache the returned authors so that subsequent calls to retrieve authors in the same request would simply return the cached version.

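// GraphQL resolver map for the Post type
const Post = {
  // Resolves a post's author; runs once per post in the result set
  author: ({ id }, args, context) => {
    return context.photon.posts
      .findOne({
        where: {
          id,
        },
      })
      .author()
  },
}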

There would have to be some way to make Photon aware of different requests, as caching should not be shared across requests. I do not know if this is already possible. The built-in DataLoader should work for batching on findOne and findMany (which includes where clauses, orderBy and pagination).

Issue #151 is related, but I believe it is at a lower level and not exactly what I am talking about.

I do not know if this is possible, but it would be extremely helpful, as DataLoader is essential to GraphQL APIs.

kind/feature

Most helpful comment

Thanks for starting this discussion!
This will definitely be solved before GA!

There are two major optimizations that have to happen before GA:

  1. Parallelization in the Rust query engine: This one is of far higher importance, because as of now the query engine can only execute one query at a time.
  2. Dataloader pattern, which is "batching and caching queries per API request":
    Once the underlying execution model in the query engine is able to process queries in parallel, we can look into this pattern.

Important to understand: Dataloader = query batching + per request caching

Per request caching
Compared to the effect that batching has on performance, the effect of per-request query caching is not very high. It only helps when the response payload includes a lot of duplicates, as they wouldn't be fetched again.
It's fairly easy to implement, but the cache then has to be managed in the Node application: you need to tell Photon when to "clear the cache", ideally on a per-API-request basis.
We'll explore how to expose this functionality.
When we introduced it in Prisma 1, it mostly caused problems without bringing many benefits. If you forget to reset the query cache, for example, you'll just get stale data.
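
A minimal sketch of the per-request caching described above, written in TypeScript. None of these names come from Photon: `RequestCache`, `createRequestContext`, and `fetchAuthorName` are hypothetical, and the in-memory map stands in for a real database. The sketch assumes the cache is either created fresh for every API request or cleared explicitly between requests.

```typescript
// Hypothetical per-request cache; this is NOT part of Photon's API.
class RequestCache<K, V> {
  private readonly entries = new Map<K, Promise<V>>();

  // Returns the cached promise for `key`, or runs `fetch` once and stores its promise.
  getOrLoad(key: K, fetch: (key: K) => Promise<V>): Promise<V> {
    const cached = this.entries.get(key);
    if (cached !== undefined) return cached;
    const pending = fetch(key);
    this.entries.set(key, pending);
    return pending;
  }

  // Must be called between API requests if the instance is reused;
  // otherwise later requests read stale values.
  clear(): void {
    this.entries.clear();
  }
}

// Stand-in data source (assumption for this sketch) whose values can change over time.
const authorNames = new Map<number, string>([[1, "Ada"]]);

// Counts how many lookups actually reach the data source.
let fetchCount = 0;

async function fetchAuthorName(id: number): Promise<string | undefined> {
  fetchCount += 1;
  return authorNames.get(id);
}

// Simplest safe pattern: build a fresh cache for each incoming API request.
function createRequestContext() {
  return { authorCache: new RequestCache<number, string | undefined>() };
}
```

Within one request, repeated `getOrLoad(1, fetchAuthorName)` calls on the same context share a single fetch. If the same cache instance were kept across requests without calling `clear()`, a later request would keep seeing the old value, which is the stale-data problem mentioned above.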

Query Batching
When talking about DataLoader, we're mostly talking about batching queries together, so that we can query the database in bulk instead of issuing many single queries. This is something Photon will just do under the hood without you having to manage it. As soon as we have the parallel execution in the query engine fixed, we'll look into this optimization.
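
To illustrate the batching idea, here is a rough TypeScript sketch that collects all lookups issued in the same tick and sends them as one bulk query. It is not how Photon or the query engine implements batching; `createBatchLoader`, `fetchAuthorsByIds`, and the in-memory `authorTable` are hypothetical names made up for this example.

```typescript
// Hypothetical record type and in-memory "database" used only for this sketch.
type Author = { id: number; name: string };

const authorTable: Author[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
  { id: 3, name: "Linus" },
];

// Counts how many bulk queries reach the "database".
let bulkQueryCount = 0;

// Stand-in for one bulk query such as `WHERE id IN (...)`.
// Returns one result per requested id, in the same order (null if not found).
async function fetchAuthorsByIds(ids: number[]): Promise<Array<Author | null>> {
  bulkQueryCount += 1;
  return ids.map((id) => authorTable.find((a) => a.id === id) ?? null);
}

type Pending<V> = {
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
};

// Collects every key requested in the same tick and resolves them with one batch call.
function createBatchLoader<K, V>(batchFn: (keys: K[]) => Promise<V[]>) {
  let queue = new Map<K, Pending<V>[]>();
  let scheduled = false;

  async function flush(): Promise<void> {
    // Swap the queue so loads issued during the batch call start a new batch.
    const current = queue;
    queue = new Map();
    scheduled = false;
    const keys = Array.from(current.keys());
    try {
      const values = await batchFn(keys);
      keys.forEach((key, i) => {
        current.get(key)!.forEach((waiter) => waiter.resolve(values[i]));
      });
    } catch (error) {
      current.forEach((waiters) => waiters.forEach((waiter) => waiter.reject(error)));
    }
  }

  return function load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      // Duplicate keys share one slot, so each key is queried only once per batch.
      const waiters = queue.get(key) ?? [];
      waiters.push({ resolve, reject });
      queue.set(key, waiters);
      if (!scheduled) {
        scheduled = true;
        // Defer to a microtask so all loads issued synchronously join one batch.
        void Promise.resolve().then(flush);
      }
    });
  };
}
```

With `const load = createBatchLoader(fetchAuthorsByIds)`, calling `load(1)`, `load(2)` and `load(1)` from three resolvers in the same tick results in a single call to `fetchAuthorsByIds` with the keys `[1, 2]`, instead of three separate queries.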

TL;DR
The effect on performance of these optimizations is ordered like this, highest first:

  1. Execution parallelization
  2. Query Batching
  3. Per request caching (deduplication)

1 & 2 don't need any refactoring in the Node app, while 3 doesn't bring a lot of bang for the buck and needs careful handling/restructuring of the application.

All 12 comments

I think I had the same issue when there are more than 1,000 users. The following query then returns "408 Request Timeout" and makes the prisma2 server totally useless.

    t.list.field("posts", {
      type: "Post",
      resolve: (parent, {}, ctx) => {
        console.log("ctx.photon.posts.findMany");
        return ctx.photon.posts.findMany({
          include: { user: true }
        });
      }
    });

@pantharshit00 - Just to clarify, does the 'priority/mid' label mean that it's planned to be implemented before general release, or after? Our primary pain point of Prisma 1 was performance (mainly due to the default many-to-many relational tables on the 'old' datamodel), so seeing an issue for DataLoader surprised me. I would assume this to be part of Prisma 2, as it would otherwise, as @dex4er noted, be useless for even moderately sized apps. 😄

First of all, thanks for all the work you have been putting in!

Wondering if we could get an update on the ETA for Execution Parallelism? This is the number one issue preventing us from moving forward with Prisma2.

Thanks again!

This was mentioned here, but is only related to this issue - so you might have better luck creating a new issue @nlarusstone.

Thanks @janpio -- created #236

This will be unblocked once https://github.com/prisma/specs/issues/242 is specced and implemented.

Thanks a lot for reporting 🙏
This issue is fixed in the latest version of prisma2.
You can try it out with npm i -g prisma2.

In case it’s not fixed for you - please let us know and we’ll reopen this issue!

Could you confirm, is there automated batching of queries? What about caching? Could this be disabled? I can't see any documentation about this. I want to use Prisma as a drop-in replacement for raw SQL. I don't want it coupled to any other parts of the app as I might rip it back out again or mix and match. I would rather that I could just use data-loader manually.

The batching is very limited and only works on findOne with the same params. There is no caching. I am also tracking all Prisma issues to make sure it doesn't become an ORM monster!

Could this be disabled? I can't see any documentation about this. I want to use prisma as a drop in replacement for raw sql. I don't want it coupled to any other parts of the app as I might rip it back out again or mix and match. I would rather that I could just use data-loader manually.

@MattGson It might be worth opening a new issue and asking for this as a feature request to make sure this is considered.

Done: https://github.com/prisma/prisma-client-js/issues/645#issue-597632265
