Hey guys, I'm having some difficulties with caching and would love to hear some thoughts from the community.
I feel this is one point where REST is potentially far superior, and something that could probably be handled much better in GraphQL.
It's not relevant for Facebook since you have such fine-grained privacy controls, but most of us don't have those privacy requirements or complexity.
Let's say I have the following shape:

```graphql
thread: {
  ...
  posts: {
    ...
    user: {
      ...
      viewerDoesFollow
    }
  }
}
```
This query could return 50 posts, with 50 different users, and even more subqueries.
You can use DataLoader to cut down on queries, but it's useless in this situation unless I prime the loaders ahead of time, e.g.:
```js
// posts resolver
const posts = await models.Post.where({ threadId });
const followedUserIds = (await models.Follows.where({ followerId: viewer.id }))
  .map(follow => follow.followeeId);
const postUserIds = posts.map(post => post.userId);
const users = (await models.User.whereIn('id', postUserIds))
  .map(user => ({
    ...user,
    viewerDoesFollow: followedUserIds.includes(user.id),
  }));
// DataLoader's prime() takes a key and a value, so prime each user by id.
users.forEach(user => loaders.users.prime(user.id, user));
return posts;
```
And this is really, really bad: it assumes a specific tree shape, and that isn't how GraphQL is supposed to work. Resolvers aren't supposed to know anything but who their parent is. I can safely assume I'll need the user and post data in the above example, but for other queries this all falls apart.
The perfect solution would allow you to define query shapes which are cacheable based on args and context data, with cacheability also defined at the resolver level.
The entire output from the thread down could be cached in redis, then later loaded at the query level, allowing the server to skip each cacheable resolver, while still resolving uncacheable child resolvers like viewerDoesFollow.
Again I understand why this wouldn't be relevant for Facebook's use case, but when the majority of your data is the same for all users, all those extra queries and cache hits are so ridiculously wasteful!
Unless I'm missing an obvious solution, it seems like this is going to make scaling our GraphQL server more difficult than it should be.
I'd love to see some discussion on this. What are your thoughts?
Thanks!
Closing since there's no issue to track here - but let's continue discussion on the closed issue.
> I feel this is one point where REST is potentially far superior, and something that could probably be handled much better in GraphQL.
This is unequivocally true. If your HTTP responses have a predetermined shape and data that is unaffected by who is loading it, then you can cache "at the edge" using something like Varnish. If you're careful about invalidating your cache, that can lead to extremely good performance. Since most services which describe themselves as REST often meet these two criteria, they can benefit from edge caching. GraphQL gives up edge caching (at least in that form) in exchange for flexible queries from clients. Typically GraphQL is used on APIs which are also viewer-aware (such as viewerDoesFollow), and thus wouldn't be able to use edge caching anyhow.
Instead, GraphQL services are highly encouraged to use data caching instead of edge caching. In this arrangement, rather than the cache layer sitting between your user and your HTTP GraphQL service, your cache layer sits between your HTTP GraphQL service and your database or other data-access APIs. This still means cache hits when accessing the API; however, the GraphQL execution always runs completely. As an example, Facebook's servers have operated this way for over a decade, long before a Facebook API or GraphQL existed, and GraphQL benefits from this architecture.
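To make the arrangement concrete, here is a minimal read-through data cache sketch. The `Map` stands in for Redis or Memcached, and `fetchUserFromDb` is a hypothetical database call; both names are assumptions for illustration, not a real API.

```javascript
// Minimal read-through cache: the Map stands in for Redis/Memcached,
// and fetchUserFromDb is a hypothetical expensive database query.
const userCache = new Map();

async function fetchUserFromDb(id) {
  // Pretend this hits the database.
  return { id, name: `user-${id}` };
}

async function loadUser(id) {
  const key = `user:${id}`;
  if (userCache.has(key)) {
    return userCache.get(key); // cache hit: GraphQL still executes, the DB is skipped
  }
  const user = await fetchUserFromDb(id); // cache miss: read through and store
  userCache.set(key, user);
  return user;
}
```

The resolvers never change: they always call `loadUser`, and whether the answer came from cache or the database is invisible to GraphQL execution.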
Designing a data caching layer isn't always easy, but it is important for building a scalable service. Memcached and Redis are examples of useful tools for building such a thing. They almost always provide a key-value store interface. As a result, efforts to have GraphQL map to custom SQL queries are often premature optimizations which make building a caching layer significantly more difficult and thus result in worse performance overall.
DataLoader is not a data caching layer; rather, it is a tool that coalesces data requests into batch fetches and memoizes them so repeated requests for the same resource don't produce multiple fetches. Batch loading and data caching are both useful tools for building a scalable and performant service; each can be implemented and benefited from independently, but they are even more useful together. Memcached, Redis, and many other data cache layers provide batch get APIs (https://redis.io/commands/mget) which result in even better overall performance. In your example, despite accessing 50 posts and 50 users, using DataLoader to batch would result in 3 or 4 requests to a backing service, and if those requests are to a cache layer like Redis then they're likely to return very quickly with minimal resources used.
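The coalescing behavior can be sketched in a few lines. This is a simplified stand-in for the dataloader package (not its real implementation): `load()` calls made during the same tick are gathered into one call to the batch function, which is where a real service would issue a single Redis MGET or a `WHERE id IN (...)` query. `makeLoader` and `batchCalls` are illustrative names.

```javascript
// Simplified DataLoader-style coalescing: all load() calls in the same
// tick are collected and resolved by one call to batchFn.
function makeLoader(batchFn) {
  let queue = [];
  return {
    load(key) {
      return new Promise((resolve) => {
        queue.push({ key, resolve });
        if (queue.length === 1) {
          // Flush after the current tick, once every load() has queued up.
          process.nextTick(async () => {
            const batch = queue;
            queue = [];
            const values = await batchFn(batch.map((item) => item.key));
            batch.forEach((item, i) => item.resolve(values[i]));
          });
        }
      });
    },
  };
}

let batchCalls = 0;
const userLoader = makeLoader(async (ids) => {
  batchCalls += 1; // one backend round trip per batch, however many load() calls
  return ids.map((id) => ({ id }));
});
```

Loading 50 users through `Promise.all(ids.map(id => userLoader.load(id)))` still produces a single batched fetch.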
> You can use DataLoader to cut down on queries, but it's useless in this situation unless I prime the loaders ahead of time
I think you might be misunderstanding the purpose of DataLoader in this example. Priming should be a pretty rare use case. It's usually used when maintaining a second index for a data set, like loading users by id or by username, and even then you probably only want to use priming after you detect a real opportunity to improve performance. Using your DataLoaders in this way attempts to mirror the behavior of the GraphQL executor in a way that's unlikely to result in an equivalent number of batches or the same performance, and as you point out, code like that is unmaintainable and isn't how GraphQL is supposed to work.
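The "second index" case can be sketched as follows, with plain Maps standing in for per-request DataLoader caches (the real API call would be `byIdLoader.prime(user.id, user)`). After fetching a user by username, the by-id cache is primed so a later lookup by id is a memoized hit rather than a second fetch. All names here are illustrative assumptions.

```javascript
// Second-index priming: Maps stand in for per-request DataLoader caches.
const usersById = new Map();
const usersByUsername = new Map();
let dbFetches = 0;

async function fetchUserByUsername(username) {
  dbFetches += 1; // stand-in for a real database or cache query
  return { id: 42, username };
}

async function loadUserByUsername(username) {
  if (usersByUsername.has(username)) return usersByUsername.get(username);
  const user = await fetchUserByUsername(username);
  usersByUsername.set(username, user);
  usersById.set(user.id, user); // prime the second index
  return user;
}

async function loadUserById(id) {
  if (usersById.has(id)) return usersById.get(id); // primed: no extra fetch
  throw new Error("fetch-by-id path omitted from this sketch");
}
```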
To follow up on your point about privacy access:
Needing to account for privacy or any other access control is certainly not something all APIs need to deal with, but either way it shouldn't have a significant impact on how you design your data storage and caching layers. At Facebook we actually evaluate all privacy rules in our API gateway layer (where we do all "business logic") and not in our data storage layer. We made that choice intentionally to ensure we could use caching as much as possible. If access control logic were handled in our data storage services, we wouldn't have been able to cache that data, since each user requesting it might get something different.
This architecture can also help model viewer-dependent data like viewerDoesFollow. Of course no User will have a field of that name in data storage; that information would be derived in business logic, but it is likely derived from very cacheable sources of data.
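A sketch of that derivation: the set of ids a given user follows is viewer-independent (the same no matter who asks for it), so it caches well, and the viewer-dependent field is computed per request from that cached set. The cache, helper, and resolver names below are illustrative assumptions.

```javascript
// Derive viewerDoesFollow in business logic from viewer-independent,
// highly cacheable data. The Map stands in for a shared cache like Redis.
const followeeCache = new Map(); // userId -> Set of followed user ids

async function loadFolloweeIds(userId) {
  if (!followeeCache.has(userId)) {
    // On a miss, a real service would query the database here.
    followeeCache.set(userId, new Set([2, 3]));
  }
  return followeeCache.get(userId);
}

// Resolver for User.viewerDoesFollow: derived per request, never stored.
async function viewerDoesFollow(user, viewer) {
  const followees = await loadFolloweeIds(viewer.id);
  return followees.has(user.id);
}
```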
Thank you for the great response Lee.
I must apologize, I'm somewhat new to the back-end and seem to have heavily misunderstood a few important points, and perhaps hadn't read the docs as carefully as I should have.
While I understood the misuse of DataLoader in my example, I somehow completely missed that it batches separately called load functions into one! It's the primary feature and I've essentially been using it for memoization during requests for the past few months.
Redis is a lot faster than I realized and there's much, much less overhead from hitting the cache more often than I had assumed.
Both of these points completely eliminate my concerns. Thank you again, and keep up the great work! (I just realized you wrote DataLoader itself!)