master branch of Django REST framework.Create these 2 models with the serializer and the view:
class Collection(Model):
pass
class Document(Model):
collection = models.ForeignKey(Collection, on_delete=models.CASCADE)
uid = models.TextField()
blob = postgres_fields.JSONField(default=dict)
class Meta():
unique_together = (('collection', 'uid'), )
class DocumentSerializer(serializers.ModelSerializer):
class Meta:
model = Document
fields = '__all__'
validators = [
UniqueTogetherValidator(
queryset=Document.objects.all(),
fields=('collection', 'uid')
)
]
class DocumentView(viewsets.ModelViewSet):
serializer_class = DocumentSerializer
Create two Documents (can be in the same collection) and send update requests for both documents in parallel:
# drf-race-condition.py
import sys
from datetime import datetime, timedelta
import requests
document_ids = [1, 2] # TODO: adjust
base_url = 'http://example.org/' # TODO: adjust
def send_request(index):
document_update_response = requests.patch(
url=f'{base_url}documents/{document_ids[index]}/',
headers=headers,
json={'blob': ''},
)
if document_update_response.status_code == 200:
print(index, document_update_response.status_code)
if __name__ == '__main__':
index = int(sys.argv[1])
end = datetime.now() + timedelta(seconds=5)
while end > datetime.now():
send_request(index=index)
#!/bin/bash
python3 drf-race-condition.py 0 &
python3 drf-race-condition.py 1 &
The documents are updated without any side effects.
We run into 500 status codes. (ie. IntegrityError: duplicate key value violates unique constraint "document_document_collection_id_1234_uniq"; DETAIL: Key (collection_id, uid)=(1, 'foo') already exists.)
It's even worse, when the documents are in different collections. Then one of them gets the other one's uid assigned.
Serializers are instances of rest_framework.fields.Field and the following code (inside of run_validators) is responsible for running validators:
https://github.com/encode/django-rest-framework/blob/78367ba1024ec43349aba362758e826fdeafbb03/rest_framework/fields.py#L533-L538
validator.set_context() sets validator.instance:
https://github.com/encode/django-rest-framework/blob/78367ba1024ec43349aba362758e826fdeafbb03/rest_framework/validators.py#L106-L112
Then validator() uses this instance. validator is the same instance for every instance of the serializer: DocumentSerializer(instance=d).validators[0] is DocumentSerializer(instance=d2).validators[0]. If two serializers call validator.set_context(self) before any one of them calls validator(value), both use the same instance.
It's even worse, when the documents are in different collections. Then one of them gets the other one's uid assigned.
validator.filter_queryset() adds uid (not sure why collection did not behave as I expected) to attr, which reference the serializers validated_data. This way uid of the wrong instance ends up in serializer and serializer.save() uses it.
https://github.com/encode/django-rest-framework/blob/78367ba1024ec43349aba362758e826fdeafbb03/rest_framework/validators.py#L130-L139
uid happened reliably on every execution of the script against our staging plattform.Pass serializer/serializer_field to __call__ and drop set_context.
Django's validators do not expect a second argument. According to drf's documentation one “can use any of Django's existing validators“.
This can be solved by using inspect (which has differents APIs for Python 2 and 3) or by catching the TypeError (which could come from within the call). => Both solutions have their downsides.
This would be a breaking change!
Clone the validator before calling set_context.
Nice catch and very nice work.
Most of the time, this will be handled by the fact that field's are deepcopied, except that indeed, root's serializers don't got through it.
Sorry for the noise, but recently I thought that problem of race condition is that validation was working on while it didn't really make a query. So it looks like this, correct me pls if I will wrong.
Step 1 - retrieve the value with the unique constraint for a checking.
Step 2 - insert record - this we are thinking that we checked it.
But between step 1 and step 2 another record has been inserted.
@dennypenta your use case is different from the OP's.
Yours may happen and is hard to prevent although not impossible.
@dennypenta Here's the issue related to your case: https://github.com/encode/django-rest-framework/issues/3876
The issue was closed with this note:
If anyone is actually hitting this in production it'd be useful to know, but either way around I don't see it as likely that we'll do anything other than allowing the 5xx response by default - if you need something else, write some view code, and catch the exceptional case.
If this happens only for one of your endpoints, you can catch the IntegrityError in the serializer or view and return 4xx instead of 5xx.
If you want to solve it in a generic way, you can write a custom exception handler. Here's an example:
from django.db.utils import IntegrityError
from rest_framework.status import HTTP_400_BAD_REQUEST
from rest_framework.views import exception_handler
def my_exception_handler(exc, context):
if isinstance(exc, IntegrityError) and \
'duplicate key value violates unique constraint' in str(exc):
return Response(
'Unique constraint violated',
status=HTTP_400_BAD_REQUEST,
)
return exception_handler(exc, context)
Is there anything I can do to move my PR #5762 forward?
@michael-k: Not yet. It's milestoned for 3.8, which means we'll resolve it one way or the other before then. Most likely it's fine and we'll take it as-is — it looks great at first glance — but I just need to sit down with it to review it properly.
Good job catching this issue! I guess we need some more eyes and opinions to look over the changes in PR #5762 ?
Presumably ATOMIC_REQUESTS = True would resolve this issue, right?
I've tried ATOMIC_REQUESTS = True with rest-framework 3.9.0 and Django 2.1, but it did not solve the issue.
@michael-k You are running DRF in multiple threads in the same process, right? And the bug happens when Python decides to release the GIL just between the set_context and the call?
This is a really bad bug, potentially causing silent data corruption. @michael-k's PR #6172 fixes it, and makes the code cleaner and safer in the process by removing the unneeded set_context mutability. I hope it gets merged for the next release.
Removing from the milestone, as we're tracking through the PR.
Most helpful comment
@dennypenta Here's the issue related to your case: https://github.com/encode/django-rest-framework/issues/3876
The issue was closed with this note:
If this happens only for one of your endpoints, you can catch the IntegrityError in the serializer or view and return 4xx instead of 5xx.
If you want to solve it in a generic way, you can write a custom exception handler. Here's an example: