Firebase-ios-sdk: Firestore: Deleting a couple of documents returned by a limited subcollection query triggers all documents to disappear locally

Created on 17 Jul 2018  Â·  12Comments  Â·  Source: firebase/firebase-ios-sdk

[REQUIRED] Step 2: Describe your environment

  • Xcode version: 9.4 (9F1027a)
  • Firebase SDK version: 5.3.0
  • Firebase Component: Firestore
  • Component version: 5.3.0

[REQUIRED] Step 3: Describe the problem

Steps to reproduce:

  • Set up a query on any subcollection with a limit smaller than the number of elements in the subcollection already, for example, 5.

  • Set up another query on the same subcollection with a limit set to 1. (Possibly unneeded step.)

  • Start deleting documents returned by the first query randomly one by one.

  • Observe that at some points the documents will begin disappearing from the query snapshot on their own. They will remain gone even if the app restarts but will be still visible via Firestore Console.

Relevant Code:

See example app along with more detailed description and video here.

Also, the following code in Firestore/Source/Remote/FSTRemoteEvent.mm in remoteEventAtSnapshotVersion: method seems to have something to do with this:

      if (targetState.current && [queryData.query isDocumentQuery]) {
        // Document queries for document that don't exist can produce an empty result set. To update
        // our local cache, we synthesize a document delete if we have not previously received the
        // document. This resolves the limbo state of the document, removing it from
        // limboDocumentRefs.
        FSTDocumentKey *key = [FSTDocumentKey keyWithPath:queryData.query.path];
        if (_pendingDocumentUpdates.find(key) == _pendingDocumentUpdates.end() &&
            ![self containsDocument:key inTarget:targetID]) {
          [self removeDocument:[FSTDeletedDocument documentWithKey:key version:snapshotVersion]
                       withKey:key
                    fromTarget:targetID];
        }
      }
firestore

Most helpful comment

Thanks for filing this. It turns out to be the cause of this discussion. We'll treat this as the canonical issue for this bug. The problem exists on all Firestore mobile/web SDKs.

For context, when a client is listening to a limit query (and in certain other cases) and a document moves out of the query results the server is allowed to tell the client merely that a document is removed without fully describing why that was. This is an intentional ambiguity that gives us some wiggle room in the server implementation.

When this happens the client considers the document to be in _limbo_ and attempts to figure out what happened to that document by requesting it out of band. Without this _limbo resolution_ process, the next time the client ran the query you would see a _flicker_: the client would serve stale data from the cache which would disappear when the server gave up-to-date results again. This is something we try to avoid.

So with that background, the bug is in this limbo resolution process. There's a certain order of events from the server that can cause the client to mistakenly determine that a limbo document doesn't exist and this is causing the client to populate negative entries in its cache. The fix for this is in flight here: https://github.com/firebase/firebase-js-sdk/pull/1014.

This alone would not be a problem because the next update from the server would fix the client's cache, but the client is also recording the commit time of the delete as the current time of the limbo resolution read, which is typically later than the last server-side modification. There are corner cases around updates to documents that don't match a query where the server is not obligated to send the absolute latest version of a document so the client will prefer its version of a document if it's later than the server issued one. This compounds the issue and causes it to be persistent. A fix that will allow the client to self-heal its cache is progress here: https://github.com/firebase/firebase-js-sdk/pull/1015.

(Please don't be alarmed: we're fixing the js-sdk first because we have a cross-platform test suite written in typescript. It's easier to fix there and then port.)

All 12 comments

Thank you for the repro app and video! I'll take a look and get back to you.

Thanks for filing this. It turns out to be the cause of this discussion. We'll treat this as the canonical issue for this bug. The problem exists on all Firestore mobile/web SDKs.

For context, when a client is listening to a limit query (and in certain other cases) and a document moves out of the query results the server is allowed to tell the client merely that a document is removed without fully describing why that was. This is an intentional ambiguity that gives us some wiggle room in the server implementation.

When this happens the client considers the document to be in _limbo_ and attempts to figure out what happened to that document by requesting it out of band. Without this _limbo resolution_ process, the next time the client ran the query you would see a _flicker_: the client would serve stale data from the cache which would disappear when the server gave up-to-date results again. This is something we try to avoid.

So with that background, the bug is in this limbo resolution process. There's a certain order of events from the server that can cause the client to mistakenly determine that a limbo document doesn't exist and this is causing the client to populate negative entries in its cache. The fix for this is in flight here: https://github.com/firebase/firebase-js-sdk/pull/1014.

This alone would not be a problem because the next update from the server would fix the client's cache, but the client is also recording the commit time of the delete as the current time of the limbo resolution read, which is typically later than the last server-side modification. There are corner cases around updates to documents that don't match a query where the server is not obligated to send the absolute latest version of a document so the client will prefer its version of a document if it's later than the server issued one. This compounds the issue and causes it to be persistent. A fix that will allow the client to self-heal its cache is progress here: https://github.com/firebase/firebase-js-sdk/pull/1015.

(Please don't be alarmed: we're fixing the js-sdk first because we have a cross-platform test suite written in typescript. It's easier to fix there and then port.)

@mikelehen @wilhuff Thanks for the detailed update!

I see the js and ios fixes have been merged. Into master. Any ETA on when we can expect new version releases with the fix?

@scottmas There's still one more part to the iOS fix (https://github.com/firebase/firebase-ios-sdk/pull/1558) and then we'll work on getting a new release out with the fixes. We'll update this thread as that happens.

@scottmas It's now possible to try this out. Add these lines to your Podfile (replacing any existing entries for FirebaseCore and FirebaseFirestore.

pod 'FirebaseCore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'wilhuff/pre-5.4.1'
pod 'FirebaseFirestore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'wilhuff/pre-5.4.1'

Great! What about Android?

That's coming, but not ready yet. I'm mostly interested in verifying that with this in place your iOS simulator with the broken calendar entries is fixed.

Hey Gil, those pod entries do not work when I try to pod install them. You
were missing a slash "Firebase/Core", but even after I added that it still
doesn't work.

On Thu, Jul 19, 2018 at 2:15 PM Gil notifications@github.com wrote:

That's coming, but not ready yet. I'm mostly interested in verifying that
with this in place your iOS simulator with the broken calendar entries is
fixed.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/firebase/firebase-ios-sdk/issues/1548#issuecomment-406400479,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ACJxubImdTGPlJkwdsaFJpqRsnM1P1h4ks5uIOjvgaJpZM4VSiT2
.

Could you elaborate on how it didn't work?

Note Firebase/Core is the Core subspec of the Firebase pod. It's distinct from FirebaseCore which is the source pod actually containing the code. Similarly Firebase/Firestore is distinct from FirebaseFirestore.

After pruning unrelated things out I have the following in my Podfile for a test app and I'm able to build successfully:

target 'mcg-playground' do
  use_frameworks!

  pod 'FirebaseCore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'wilhuff/pre-5.4.1'
  pod 'FirebaseFirestore', :git => 'https://github.com/firebase/firebase-ios-sdk.git', :branch => 'wilhuff/pre-5.4.1'

  pod 'Firebase/Core'
  pod 'Firebase/Firestore'
  pod 'Firebase/Auth'
end

This has been released in Firebase JavaScript SDK 5.3.0 and Firebase iOS SDK 5.4.1. Android will follow separately.

Any ETA on Android?

On Fri, Jul 20, 2018 at 11:26 AM Gil notifications@github.com wrote:

This has been released in Firebase JavaScript SDK 5.3.0 and Firebase iOS
SDK 5.4.1. Android will follow separately.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/firebase/firebase-ios-sdk/issues/1548#issuecomment-406669999,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ACJxuejR6kyPkEtBwyLZYp9SiH3g7w3Yks5uIhKygaJpZM4VSiT2
.

Was this page helpful?
0 / 5 - 0 ratings