Mongoengine: MultipleObjectsReturned: 1 items returned, instead of 1

Created on 23 Apr 2014 · 9 Comments · Source: MongoEngine/mongoengine

I occasionally get this error. I use uWSGI (4 threads) with Django.

The case in which the error occurs is very simple:

obj = SomeDocument.objects.filter(user_id=xxx).get()
if not obj:
    obj = SomeDocument(...)
    obj.save()

When I look up the document (I have logged the id), there is no duplicate entry, and I cannot reproduce the exception. Normally, if more than one result matched user_id=xxx, we would get

  MultipleObjectsReturned: 2 items returned, instead of 1

but

 MultipleObjectsReturned: 1 items returned, instead of 1

is quite weird, since there is only one match. I am aware that before raising the exception in QuerySet.get, the code calls .rewind() and .count() to build the exception message, so the message could be inconsistent if a duplicate entry is removed in between. But I have no code that removes SomeDocument instances.

Has anybody else run into this error? How could this happen?

Bug

Most helpful comment

@socrateslee as you mentioned, you work in a multi-threaded environment. It's possible that a dupe doc responsible for the MultipleObjectsReturned error won't be there anymore by the time this error is raised.

BaseQuerySet.get does a few non-atomic things that can lead to this confusing result. First it sets up a cursor and tries to iterate over two of its docs. It raises if there are none (DoesNotExist) or if the 2nd iteration succeeds. In the latter case, it rewinds the cursor and calls count on it. The thing is, between the iteration and the count, another thread might delete the 2nd document, so the count's result will be 1, leading to the confusing MultipleObjectsReturned: 1 items returned, instead of 1.
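The window described above can be reproduced without a database. The following is a minimal sketch, not MongoEngine's code: `FakeCollection`, its `find`/`count` methods, and the `between_iteration_and_count` hook are hypothetical stand-ins that only mimic the order of operations in `get` (iterate up to two docs, then count in a separate step).

```python
class DoesNotExist(Exception):
    pass


class MultipleObjectsReturned(Exception):
    pass


class FakeCollection:
    """Hypothetical in-memory stand-in for a collection (illustration only)."""

    def __init__(self, docs):
        self.docs = list(docs)

    def find(self, limit):
        # Snapshot of up to `limit` docs, like a cursor that already
        # fetched its batch from the server.
        return iter(self.docs[:limit])

    def count(self):
        # A separate round-trip: reflects the collection's state *now*.
        return len(self.docs)


def get(collection, between_iteration_and_count=None):
    """Mimics the iterate-then-count sequence of the get method."""
    cursor = collection.find(limit=2)
    try:
        result = next(cursor)
    except StopIteration:
        raise DoesNotExist("matching query does not exist.")
    try:
        next(cursor)
    except StopIteration:
        return result

    # Two docs were seen. Anything that happens here (for example a
    # delete issued by another thread) changes what count() reports.
    if between_iteration_and_count is not None:
        between_iteration_and_count()
    message = "%d items returned, instead of 1" % collection.count()
    raise MultipleObjectsReturned(message)


coll = FakeCollection([{"user_id": 1}, {"user_id": 1}])
try:
    # Simulate a concurrent delete of the duplicate inside the window.
    get(coll, between_iteration_and_count=lambda: coll.docs.pop())
except MultipleObjectsReturned as exc:
    message = str(exc)

print(message)  # 1 items returned, instead of 1
```

The hook makes the interleaving deterministic; in a real deployment the same ordering would depend on thread timing, which is consistent with the error being hard to reproduce.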

I'm not even sure if that count call is useful there. If there's a ton of duplicate docs, it can even lead to very poor performance. What do you guys think about removing the count and, instead of saying e.g. MultipleObjectsReturned: 125 items returned, instead of 1, we say MultipleObjectsReturned: 2 or more items returned, instead of 1? It will avoid a round-trip to the server and will also better convey what happened during iteration.
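The proposal above can be sketched as follows. This is an illustration of the idea only, not a patch to MongoEngine: `get_exactly_one` is a hypothetical helper that works on any iterable, and the exception classes are local stand-ins. It looks at no more than two results and never counts, so the message cannot contradict what the iteration observed.

```python
from itertools import islice


class DoesNotExist(Exception):
    pass


class MultipleObjectsReturned(Exception):
    pass


def get_exactly_one(results):
    """Return the single item in `results`, or raise.

    Hypothetical helper: inspects at most two items and does not
    perform a separate count.
    """
    first_two = islice(results, 2)
    try:
        result = next(first_two)
    except StopIteration:
        raise DoesNotExist("matching query does not exist.")
    try:
        next(first_two)
    except StopIteration:
        return result
    # A second item exists; report exactly what was observed.
    raise MultipleObjectsReturned("2 or more items returned, instead of 1")
```

For example, `get_exactly_one([7])` returns `7`, while `get_exactly_one([1, 2, 3])` raises `MultipleObjectsReturned` with the fixed "2 or more" message regardless of how many duplicates exist.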

All 9 comments

  1. Misspelled obj.save().
  2. According to http://mongoengine-odm.readthedocs.org/apireference.html#mongoengine.queryset.QuerySet.get get() will raise a DoesNotExist exception if nothing is found. Consider using first():

obj = SomeDocument.objects.filter(user_id=xxx).first()
if not obj:
    obj = SomeDocument(...)
    obj.save()

Moreover, it looks like an upsert would be helpful in your case. See, for example, http://stackoverflow.com/a/16100666/396862

Thank you for pointing out the misspelling, but my question is how

MultipleObjectsReturned: 1 items returned, instead of 1

happened. I have updated my question above.

Ok. Will review the code and try to find data to reproduce the problem.

I also encountered this when executing a batch import into a very active cluster with a write concern that forces fsync and waits for all three replica set members to confirm. Each entry in the import uses get_or_create on a related object.

@etxeba, you shouldn't use get_or_create: it is not atomic and is deprecated for that reason.
Use upsert(s) instead. Does it still happen to you with upsert(s)?

@socrateslee, is your collection sharded?

@DavidBord no, the collection is not sharded.

Same here.
The exception occurs in the get method. It looks like this now:

def get(self, *q_objs, **query):
    """Retrieve the the matching object raising
    :class:`~mongoengine.queryset.MultipleObjectsReturned` or
    `DocumentName.MultipleObjectsReturned` exception if multiple results
    and :class:`~mongoengine.queryset.DoesNotExist` or
    `DocumentName.DoesNotExist` if no results are found.

    .. versionadded:: 0.3
    """
    queryset = self.clone()
    queryset = queryset.order_by().limit(2)
    queryset = queryset.filter(*q_objs, **query)

    try:
        result = queryset.next()
    except StopIteration:
        msg = ("%s matching query does not exist."
               % queryset._document._class_name)
        raise queryset._document.DoesNotExist(msg)
    try:
        queryset.next()
    except StopIteration:
        return result

    queryset.rewind()
    message = u'%d items returned, instead of 1' % queryset.count()
    raise queryset._document.MultipleObjectsReturned(message)

The collection is replicated and sharded, and the code throwing the exception looks like this:

class MyDocument(db.Document):
    meta = {
        'indexes': [{
            'fields': ['uuid'],
            'unique': True,
            'sparse': True
        }],
        'index_background': True,
    }
    uuid = db.StringField()
    lang = db.StringField(required=True)
    country = db.StringField(required=True)

MyDocument.objects(uuid=uuid).update(
    lang=lang,
    country=country,
    upsert=True
)

MyDocument.objects.get(uuid=uuid)

(The uuid is not None here)
