Currently, when Apollo doesn't find a complete cache match for a query, it fetches the entire query via the given ApolloLink. I'd like to set up a system whereby I can retrieve a partial match from the cache and only fetch the missing data.
I think we could facilitate a plugin for this in one of two ways:
1. Some sort of middleware for hooking into the query handling (i.e. deciding whether it should be fetched from the cache or from a link, etc.)
2. Giving the link access to the client / cache object
It seems there might be. If I create the cache first, then make a function `linkWithCache<T>(cache: ApolloCache<T>): ApolloLink`, I may be able to access the cache within the link via a closure.
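For illustration, a minimal sketch of that closure-based approach might look like the following (this uses the current `@apollo/client` exports; `linkWithCache` is just a hypothetical helper name, and `cache.diff` is used only to show that the link can consult the cache before forwarding):

```typescript
import { ApolloLink, Observable } from "@apollo/client";
import type { ApolloCache, FetchResult } from "@apollo/client";

// Hypothetical helper: the returned link closes over the cache instance,
// so it can inspect cached data before deciding whether to hit the network.
function linkWithCache<TSerialized>(
  cache: ApolloCache<TSerialized>
): ApolloLink {
  return new ApolloLink((operation, forward) => {
    // Ask the cache how much of this operation it can already satisfy.
    const diff = cache.diff<Record<string, any>>({
      query: operation.query,
      variables: operation.variables,
      returnPartialData: true,
      optimistic: false,
    });

    if (diff.complete) {
      // Fully cached: answer from the cache and skip the network entirely.
      return new Observable<FetchResult>((observer) => {
        observer.next({ data: diff.result });
        observer.complete();
      });
    }

    // Partial or empty cache hit: fall through to the next link (e.g. an
    // HttpLink). A smarter version would rewrite the query here so that
    // only the missing fields are requested.
    return forward(operation);
  });
}
```

It could then be wired up with something like `linkWithCache(cache).concat(httpLink)` when constructing the client.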
Thoughts from the Apollo team would be much appreciated here, re: what the best approach would be.
This is related to #2872.
This is how the store used to work a while ago: we had query-diffing logic that only fetched the portion of the query that was missing from the cache. However, this made reasoning about missing data and reactivity significantly more complicated, so the functionality was dropped.
In general, the primary job of Apollo Client's cache isn't to reduce roundtrips: it is to provide reactivity. So giving up reactivity, or making it more complicated to reason about, for the sake of fewer roundtrips or a higher cache hit rate probably isn't the right decision.
IIRC, one of the design goals of GraphQL itself is to minimise roundtrips (in contrast to REST, where the design lends itself to many roundtrips). Considering that, I think Apollo should at least allow for this option even if it would not be the default.
I can see how cache-diffing could cause problems for some projects, but in others I think it would be very helpful. So I am a bit surprised that Apollo does not support this.
@Poincare perhaps you could elaborate on "this makes reasoning about missing data and reactivity significantly more complicated"?
To give a concrete example where I think cache-diffing would be useful. We have a lot of product data that is published in versions. Each published version is immutable and this is what the client application reads. So the client application can be sure that the server-side product data never changes. However, performance is critical so we want to read just the amount of data we need from the server and also minimise roundtrips.
So let's say I have this query:

```graphql
query GetProductData($selectedProducts: [ID!]) {
  allProducts: products {
    id
    name
  }
  selectedProducts: productsByIds(ids: $selectedProducts) {
    id
    name
    foo {
      bar1
      bar2 {
        zoo1
        # ... a lot of other fields ...
      }
    }
  }
}
```
This query is used for a view where there is a list of products fed from the `allProducts` field, and also a detailed product view where the user can add products from the list. The detailed view requires the data in the `selectedProducts` field.
Now when I run this the first time I get all the data at once, which is good. Then when the user picks a product, the `$selectedProducts` variable changes and the query is re-fetched. Unfortunately the `allProducts` field is also refetched now. Ideally it would come from the cache and instead only the `selectedProducts` field would be fetched. Actually, this is the way I thought Apollo would work.
The only other option I can see is to break the query into multiple queries. In reality things are not as simple as the example above, so I would end up with a lot of small queries, which would mean a lot of round-trips, which would kill performance. Sure, you could batch some queries, and if you are lucky with the timing you will decrease the round-trips, but that is more of a work-around than a solution by design.
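For concreteness, the split-query workaround described above could look roughly like this (hypothetical hook and operation names, using `@apollo/client`'s React hooks): the list query stays on a cache-friendly fetch policy, and only the detail query goes back to the network when the selection changes.

```typescript
import { gql, useQuery } from "@apollo/client";

// The single query from the example above, split in two so that the
// list part can keep being served from the cache.
const ALL_PRODUCTS = gql`
  query AllProducts {
    allProducts: products {
      id
      name
    }
  }
`;

const SELECTED_PRODUCTS = gql`
  query SelectedProducts($selectedProducts: [ID!]) {
    selectedProducts: productsByIds(ids: $selectedProducts) {
      id
      name
      foo {
        bar1
      }
    }
  }
`;

// Hypothetical hook combining both queries. After the first fetch the
// list comes from the cache; only SELECTED_PRODUCTS is re-fetched when
// selectedIds changes.
function useProductData(selectedIds: string[]) {
  const list = useQuery(ALL_PRODUCTS, { fetchPolicy: "cache-first" });
  const details = useQuery(SELECTED_PRODUCTS, {
    variables: { selectedProducts: selectedIds },
  });
  return { list, details };
}
```

As the comment notes, this works for a small example, but in a realistic schema the number of separate queries (and therefore round-trips) grows quickly.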
So my question here would be what disadvantages cache-diffing would have for this kind of application?
This is also related to #2425, a proposal which sounds like a very reasonable approach to the problem.
In my opinion the most important aspect of returning a cache hit with partial data is to render the UI with partial data immediately while the gaps are being filled in with a network request. If the request just queries all the specified fields again, then the cache diffing complexity might be avoidable?
This wouldn't solve the need to reduce server load for @jonaskello, but it could be a first step in that direction.
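For what it's worth, this "render partial data now, re-send the full query over the network" behaviour resembles what newer `@apollo/client` versions expose via `returnPartialData` on watched queries; a minimal sketch (assuming a `/graphql` endpoint) might look like:

```typescript
import { ApolloClient, HttpLink, InMemoryCache, gql } from "@apollo/client";

const client = new ApolloClient({
  cache: new InMemoryCache(),
  link: new HttpLink({ uri: "/graphql" }), // assumed endpoint
});

const PRODUCT_LIST = gql`
  query ProductList {
    products {
      id
      name
    }
  }
`;

// Emit whatever (possibly incomplete) data is already in the cache right
// away, then emit again once the full network response has filled the gaps.
client
  .watchQuery({
    query: PRODUCT_LIST,
    fetchPolicy: "cache-and-network",
    returnPartialData: true,
  })
  .subscribe(({ data, loading }) => {
    console.log(loading ? "partial / loading:" : "complete:", data);
  });
```

Note that the network request here still asks for all the fields in the query, which is why the cache-diffing complexity discussed above can be sidestepped.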
To help provide a more clear separation between feature requests / discussions and bugs, and to help clean up the feature request / discussion backlog, Apollo Client feature requests / discussions are now being managed under the https://github.com/apollographql/apollo-feature-requests repository.
This feature request / discussion will be closed here, but anyone interested in migrating this issue to the new repository (to make sure it stays active), can click here to start the migration process. This manual migration process is intended to help identify which of the older feature requests are still considered to be of value to the community. Thanks!