Firebase-js-sdk: Firestore query reads all the documents again (from database) whenever a document is modified even with persistence enabled

Created on 29 Jan 2018 · 18 Comments · Source: firebase/firebase-js-sdk

  • Operating System: Windows
  • Firebase SDK version: v4.9.0
  • Firebase Product: Firestore

Problem:
Assume I am running a query on my collection with n documents, and later I modify a field of one document via the console. What I see is n reads initially and n more reads whenever one document is modified, regardless of whether offline persistence is enabled, resulting in 2n reads, all with "fromCache: false" metadata.

Is there any way to avoid reading all the documents from the database again, or at least to read them from cache, when only one document is modified?

```
const someCollectionRef = firebase.firestore().collection("collection/document/subcollection");
someCollectionRef.onSnapshot(s => {
  // Collect the data of every document in the snapshot.
  const d = [];
  s.forEach(doc => {
    d.push(doc.data());

    // fromCache is always false, even when only one document is
    // modified; it seems that all documents are read from the db again.
    console.log("metadata is", doc.metadata);
  });
  // d seems to have all my documents even when only 1 is modified
});
```



All 18 comments

This behavior is correct. If you want to iterate over just the documents that changed, take a look at the docs:
https://firebase.google.com/docs/firestore/query-data/listen#view_changes_between_snapshots
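
As a rough sketch of what that documentation page describes, a listener can read the snapshot's change list instead of walking every document. The helper name `changedDocs` is made up for illustration. `docChanges` is an array property in the v4.x SDK used in this issue and a method from v5.0 onward, so the sketch accepts both.

```javascript
// Hypothetical helper: returns only the documents that changed since the
// previous snapshot, instead of every document in the result set.
function changedDocs(querySnapshot) {
  // v4.x exposes `docChanges` as an array property; v5+ exposes a method.
  const changes = typeof querySnapshot.docChanges === "function"
    ? querySnapshot.docChanges()
    : querySnapshot.docChanges;

  // Each change carries a `type` ("added" | "modified" | "removed") and a `doc`.
  return changes.map(change => ({
    type: change.type,
    id: change.doc.id,
    data: change.doc.data(),
  }));
}

// Usage with a real listener (assumes `someCollectionRef` from the question):
// someCollectionRef.onSnapshot(s => console.log(changedDocs(s)));
```

The first snapshot a listener receives reports every existing document as "added"; later snapshots list only what actually changed.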

I am wondering about the number of document reads. The "fromCache: false" metadata on the query results suggests that all n documents are read from the database each time one document is modified, even with persistence enabled. If a single modification causes every document to be read from the database again, that is a rather expensive operation. It would be much more efficient to read only the modified document from the database and the rest from cache, i.e. n+1 reads instead of 2n reads.
Iterating over the data is fine.
Cheers

fromCache means that the snapshot is taken from local storage without ensuring it's the latest version (offline, or before data arrives from the server). I assure you the collection is not downloaded every time a document changes; only the changes are (with some minor overhead in the form of session keys etc.).

Alright, so if I understand you correctly: although all the documents appear in the snapshot after a document is modified, only the changes to that document are read again from the database (plus some minor overhead)?
It would be really nice to address the confusion here, since neither the documentation nor the metadata seems to reflect this. Thanks for the clarification and all the good work.

I'm on iOS, but I'm seeing a related strange behavior. When I add a document to a collection and then check pendingWrites, the document shows up under documentChanges as ".added" while the write is still local. But once pendingWrites comes back as false, it shows up under documentChanges as ".modified". This seems to be a bug, as I'm only adding new documents and never modifying them.

@reneeOlson please make a separate issue for this, it’s not related

@Thkasis I believe this could be documented in the “enabling offline data” section of the documentation. I was actually pretty sure it was already somewhere there, but couldn’t find it. It mentions that data is synced when the device is back online, but does not explain how that is done. Anyway, do you mind closing this issue and specifically requesting a documentation update on the page that explains how it works? (If you rate the page below 5 stars you get an option to send feedback.)

Cheers! Done and done.

I have the same issue when reading data while the network is enabled. Every time I launch the app, all the data is fetched from the database with metadata fromCache: false, but once I disable the network it returns from cache as it should.

It should always return from cache unless something changed in the collection the listener is attached to.

@SourceCipher
That is the correct behaviour.
fromCache: false tells you that you have the exact document that is on the server.
fromCache: true means this is a local copy and the server may have a newer version of this doc.
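
To make that reading of the flag concrete, here is a minimal sketch that turns a snapshot's metadata into a label. The helper name `describeSnapshot` and the label strings are invented for illustration; `fromCache` and `hasPendingWrites` are the two fields the SDK exposes on `snapshot.metadata`. Note that none of the labels says anything about how many documents were downloaded.

```javascript
// Hypothetical helper that turns snapshot metadata into a readable label.
function describeSnapshot(snapshot) {
  const { fromCache, hasPendingWrites } = snapshot.metadata;

  // A local write has been applied but the server has not acknowledged it yet.
  if (hasPendingWrites) return "local write not yet acknowledged by the server";

  // The data came from local storage and may lag behind the server.
  if (fromCache) return "served from the local cache; may be stale";

  // The SDK has confirmed this data matches the server. This does not imply
  // that every document was downloaded again.
  return "confirmed up to date with the server";
}

// Usage with a real listener (assumes `someCollectionRef` from the question):
// someCollectionRef.onSnapshot(s => console.log(describeSnapshot(s)));
```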

@merlinnot How is this correct? If I disable the network, all the data is fetched from the cache and fromCache is true, indicating that the data was loaded from the cache. While online, it loads from the server every time, as indicated by fromCache being false.

So the problem is that even with the data available locally, Firestore fetches from the database no matter what, wasting a lot of time.

@SourceCipher, see the answer from @merlinnot above on Jan 28. Only data that's been changed will be downloaded. The flag is a bit confusing, that's all.

That's strange, as it always takes 3 times longer to load the same data online than in offline mode.

I am guessing that when it's offline there's no need to check for data changes, hence the faster load; when it's online, data is checked against the server to ensure you see the changes. It's only my guess though.

@Thkasis I don't think that's the case, as it could load the data from cache anyway so the user sees it instantly, and then check the db for changes. That is the way Messenger works, for example: once you open the app it shows all the messages, and after it connects and syncs, it shows any new messages and changes from the db.

@SourceCipher This probably isn't the best place for this conversation (you could post on https://groups.google.com/forum/#!forum/google-cloud-firestore-discuss in the future) but I want to clarify that:

  1. The get() API by default prefers to give you fresh data from the server and so if you're online, it'll wait for that data before returning. If it can't reach the backend (e.g. because you've called disableNetwork()) then it'll return from cache. I assume this is the behavior that you're talking about.
  2. You can control this behavior by passing options, e.g. get({source:'cache'}) if you just want to get from cache.
  3. You could also use the onSnapshot() API to attach a listener which will always fire from cache first if we have cached data, and then fire again once we get up-to-date server data.
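
Points 1 and 2 can be combined into a cache-first read, sketched below. The helper name `getCacheFirst` is hypothetical, and the sketch assumes an SDK version whose `get()` accepts the `source` option mentioned above. It relies on the documented behaviour that a cache-only `get()` rejects for an uncached document and resolves to an empty snapshot for an uncached query.

```javascript
// Hypothetical helper: try the local cache first, fall back to the server.
// `ref` is a DocumentReference or Query from an SDK version whose get()
// accepts a { source } option.
async function getCacheFirst(ref) {
  try {
    const cached = await ref.get({ source: "cache" });
    // A query with nothing cached resolves to an empty snapshot, so treat
    // "empty" as a cache miss. Document snapshots have no `empty` property
    // and are returned as-is.
    if (!cached.empty) return cached;
  } catch (err) {
    // A document that is not in the cache rejects; fall through to the server.
  }
  return ref.get({ source: "server" });
}

// Usage (assumes `someCollectionRef` from the question):
// getCacheFirst(someCollectionRef).then(s => console.log(s.metadata.fromCache));
```

One trade-off of this design: a query whose cached result is legitimately empty still goes to the server, so it costs one extra round trip in that case.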

@mikelehen Well, it seems like a related problem to me.

Great info you provided here, but that's exactly what I am doing. I always fetch the data via onSnapshot, but it fetches from the database every single time and does not load the cached data.

I would expect the data to always load from the cache first, as that speeds up the process, and then load in the background and apply any changes, replacing or amending the cached data already shown to the user. Right now this never happens, as it always loads from the server (or checks the server data first and then loads the cache, which is basically the same thing as far as timing goes).

It seems like it's an issue with Firebase for React Native, as I had no problem when I developed with Android Studio in Java. There it would always load the data from the cache and then apply the changes if there were any.

@SourceCipher If you're using the vanilla Firebase JS SDK with React Native, then persistence won't work because React Native doesn't include IndexedDB (see https://github.com/firebase/firebase-js-sdk/issues/436), so there will not be anything in the cache. :-(

You might try using https://rnfirebase.io/ instead, which wraps the native (iOS / Android) Firebase SDKs.
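
For anyone staying on the web JS SDK, one way to find out whether persistence actually turned on is to inspect the promise returned by `enablePersistence()`. The sketch below is only an illustration: the helper name `tryEnablePersistence` and its return strings are made up, while the two error codes are the ones the SDK documents for this call. Whether a given React Native setup reports `unimplemented` is an assumption, not something confirmed in this thread.

```javascript
// Hypothetical helper: enable offline persistence and report the outcome.
// `firestore` is expected to be the object returned by firebase.firestore(),
// and this should run before any other Firestore call.
async function tryEnablePersistence(firestore) {
  try {
    await firestore.enablePersistence();
    return "enabled";
  } catch (err) {
    // Persistence could not be acquired, e.g. another tab already holds it.
    if (err.code === "failed-precondition") return "blocked";
    // The environment lacks the storage features persistence needs.
    if (err.code === "unimplemented") return "unsupported";
    throw err; // Anything else is unexpected; surface it to the caller.
  }
}

// Usage:
// tryEnablePersistence(firebase.firestore()).then(status => console.log(status));
```

A result other than "enabled" would mean the cache stays empty between sessions, which matches the symptom of every launch reading from the server.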

@mikelehen That's exactly what I am using, and I set persistence both in the Android Java files and in the JS via settings. It works if my device is offline, but the problem is with persistence while online, where the cached data is always ignored.
