Google-cloud-python: Cloud Firestore: Can't iterate through all query results

Created on 19 Sep 2018 · 3 Comments · Source: googleapis/google-cloud-python

google-cloud-firestore
Version: 0.29.0
Linux, Python 2.7.13

When iterating over a query result, the generator times out after a while without raising an error. As a result, iteration over queries that match many documents silently misses results.

from google.cloud import firestore

db = firestore.Client()
coll_ref = db.collection(u'users')

count = 0
mycoll = coll_ref.get()  # generator that streams document snapshots

for e in mycoll:
    # There are 1000 docs in this collection. Only about 283 get updated, with no error message.
    coll_ref.document(e.id).update({
        u'myfield': False
    })
    count += 1
    # time.sleep(.5)  # the longer this wait is, the fewer docs get updated

print(count)

Issue brought up in this SO post.

Labels: question, firestore

All 3 comments

Your example mutates the collection while iterating over it, which isn't likely to be safe even for an in-process data structure like a list or dict. This example fetches the collection snapshots first, converts them to document references, and then iterates over the references, performing the updates:

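>>> import time
>>> mycoll = db.collection(u'users')  # the collection reference, not the .get() generator
>>> # Drain the query results into a list before doing any writes.
>>> documents = [snapshot.reference for snapshot in mycoll.get()]
>>> count = 0
>>> for document in documents:
...     document.update({u'myfield': False})
...     count += 1
...     time.sleep(0.01)
>>> count
1000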
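
For larger collections, the same fetch-first pattern could be combined with batched writes. What follows is only a sketch: the helper name `update_all` and its `batch_size` parameter are made up for illustration, and it assumes that `client.batch()`, `WriteBatch.update()` and `WriteBatch.commit()` behave as in the google-cloud-firestore client and that a batch accepts at most 500 writes.

```python
def update_all(client, collection_ref, field_updates, batch_size=500):
    """Update every document in a collection; return the number updated.

    Hypothetical helper, not part of google-cloud-firestore.
    """
    # Drain the result stream immediately, so that no slow work happens
    # while the stream is still open.
    references = [snapshot.reference for snapshot in collection_ref.get()]

    # Send the updates in chunks of at most `batch_size` writes
    # (assumed limit: 500 writes per batch).
    for start in range(0, len(references), batch_size):
        batch = client.batch()
        for reference in references[start:start + batch_size]:
            batch.update(reference, field_updates)
        batch.commit()

    return len(references)
```

With the names from the original report, the call would look like `update_all(db, db.collection(u'users'), {u'myfield': False})`.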

Thank you for the prompt reply!

@jlara310 #6043 is a request to document that Query.get / Collection.get have a time limit on iteration. That limit explains why longer time.sleep calls reduced the number of documents updated in your example.
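
To make the effect of such a time limit concrete, here is a small simulation that does not touch Firestore at all. Everything in it is hypothetical: `limited_stream`, `FakeClock` and `count_processed` are illustrative names, the 60-second limit is an arbitrary stand-in rather than the real value, and the silent end of the stream simply mimics the behaviour described in the report.

```python
def limited_stream(items, clock, time_limit):
    """Yield items until `time_limit` seconds have elapsed on `clock`.

    Stand-in for a result stream with a deadline. It ends silently
    instead of raising, which mimics the reported behaviour.
    """
    deadline = clock() + time_limit
    for item in items:
        if clock() >= deadline:
            return  # stream closed: the remaining items are never delivered
        yield item


class FakeClock(object):
    """Manually advanced clock, so the example is deterministic."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def count_processed(per_item_delay, fetch_first, total=1000, time_limit=60.0):
    """Count how many of `total` items are processed before the deadline."""
    clock = FakeClock()
    stream = limited_stream(range(total), clock, time_limit)
    if fetch_first:
        stream = list(stream)  # drain the stream before doing slow work
    count = 0
    for _ in stream:
        clock.sleep(per_item_delay)  # stands in for update() plus time.sleep()
        count += 1
    return count


print(count_processed(0.25, fetch_first=False))  # 240
print(count_processed(0.5, fetch_first=False))   # 120
print(count_processed(0.5, fetch_first=True))    # 1000
```

In this model, doubling the per-item delay halves the number of items seen, while draining the stream first makes the delay irrelevant.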
