google-cloud-firestore
Version: 0.29.0
Linux, Python 2.7.13
When you iterate over a query result, the generator times out after a while without raising an error. As a result, the query result generator silently misses results in queries with many documents.
from google.cloud import firestore
db = firestore.Client()
coll_ref = db.collection(u'users')
count = 0
mycoll = db.collection(u'users').get()
for e in mycoll:
    # There are 1000 docs in this collection. Only about 283 get updated. No error message
    db.collection(u'users').document(e.id).update({
        u'myfield': False
    })
    count += 1
    # time.sleep(.5) the longer this wait is, the fewer docs get updated
print(count)
Issue brought up in this SO post.
Your example mutates the collection while iterating over it, which isn't likely to be safe even for an in-process data structure like list or dict. This example fetches the collection snapshots first, converts them to document references, and then iterates over those references, performing the updates:
>>> import time
>>> documents = [snapshot.reference for snapshot in mycoll.get()]
>>> count = 0
>>> for document in documents:
...     document.update({u'myfield': False})
...     count += 1
...     time.sleep(0.01)
>>> count
1000
Thank you for the prompt reply!
@jlara310 #6043 is a request to document that Query.get / Collection.get have a time limit on iteration, which explains why making longer calls to time.sleep reduced the number of rows in your example.
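To make the effect of that time limit concrete, here is a toy simulation that runs without a Firestore backend. It is only a sketch: `FakeClock`, `deadline_stream`, `slow_loop`, `materialize_first`, and the 60-second limit are all hypothetical names and values chosen for illustration, not the library's implementation or its actual limit. It shows that doing slow work while the stream is still open loses items silently, while draining the stream into a list first does not.

```python
class FakeClock(object):
    """Deterministic stand-in for time.sleep (hypothetical helper)."""

    def __init__(self):
        self.now = 0.0

    def sleep(self, seconds):
        self.now += seconds


def deadline_stream(items, clock, limit):
    """Yield items until `limit` seconds have elapsed, then stop silently.

    This only mimics the reported symptom; the 'limit' is an assumption.
    """
    start = clock.now
    for item in items:
        if clock.now - start > limit:
            return  # iteration just ends, no exception is raised
        yield item


def slow_loop(n, limit, work):
    """Do `work` seconds of processing per item while the stream is open."""
    clock = FakeClock()
    count = 0
    for _ in deadline_stream(range(n), clock, limit):
        clock.sleep(work)
        count += 1
    return count


def materialize_first(n, limit, work):
    """Drain the stream into a list first, then do the slow processing."""
    clock = FakeClock()
    items = list(deadline_stream(range(n), clock, limit))
    count = 0
    for _ in items:
        clock.sleep(work)
        count += 1
    return count


print(slow_loop(1000, 60.0, 0.5))          # 121
print(slow_loop(1000, 60.0, 1.0))          # 61
print(materialize_first(1000, 60.0, 0.5))  # 1000
```

In this simulation, longer per-item work means fewer items processed before the stream ends, which matches the pattern seen with the time.sleep call in the original report.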