It appears that accessing serializer.data has side effects. Merely accessing the property inside perform_create() will cause the response data to change. I created a simple Django project with a test case that illustrates the bug: django-rest-bug.
The expected behavior is for response.data to contain the serialized version of the created object:
class AlbumViewSet(mixins.CreateModelMixin, viewsets.GenericViewSet):
# ... snip ...
def perform_create(self, serializer):
serializer.save()
# >>> post_data = {'title': 'The Wall', 'artist': 'Pink Floyd'}
# >>> response = self.client.post(reverse('album-list'), post_data)
# >>> print response.data
# {'id': 1, 'artist': u'Pink Floyd', 'title': u'The Wall'}
But when serializer.data is present it will cause response.data to be equal to the initially posted data (ex: the id is missing from response):
class AlbumViewSet(mixins.CreateModelMixin, viewsets.GenericViewSet):
# ... snip ...
def perform_create(self, serializer):
serializer.data # This line being present changes the value of `response.data`
serializer.save()
# >>> post_data = {'title': 'The Wall', 'artist': 'Pink Floyd'}
# >>> response = self.client.post(reverse('album-list'), post_data)
# >>> print response.data
# {'artist': u'Pink Floyd', 'title': u'The Wall'}
Even if serializer.data is not "safe", I think it's incorrect for property access to result in side effects. I _think_ the culprit is somewhere in BaseSerializer.data but am not certain.
I'd be interested in hearing why you want to access the serializer.data within the view.
The flow you've demonstrated there isn't intended usage, and it's possible that its something we should guard against. (Ie raise an exception if .save() is called _after_ accessing .data)
It's valid to access either .initial_data or .validated_data prior to .save(), but it shouldn't be valid to access .data and then subsequently call .save().
What's happening internally is that serializer.data is serializing the _unsaved_ instance, because you've called it prior to save.
When we properly restrict the flow then effectively our properties don't have side effects, but this is a case we've obviously left unguarded.
Retitled this issue to reflect the change we should make.
@xordoquy I was attempting to use it for some validation/permission logic before creating the object. Let's say a User is only allowed to create Messages attached to a Project they have access to. In that case MessageViewSet. perform_create would look like this:
class MessageViewSet(mixins.CreateModelMixin, viewsets.GenericViewSet):
def perform_create(self, serializer):
project = serializer.data['project']
if project in self.request.user.projects.all():
serializer.save(project=project)
else:
raise exceptions.PermissionDenied
It feels wrong to put this into the view, but I wasn't sure where else to put it. I guess this might be a case for creating a validation method in my serializer which uses .context to access request.user and validate there?
@tomchristie Good to know, thanks. I actually found, through experimentation, that .validated_data worked properly in this scenario. Glad to know this is the correct way to access it :) Though as I mentioned above I think now I understand the correct way to implement what I'm trying to do (.context). I suppose it's not _as critical_ to prevent side effects when this flow/scenario isn't intended usage :sweat_smile:
Not critical no, but preferable, so let's keep this open for now.
I guess this might be a case for creating a validation method in my serializer which uses .context to access request.user and validate there?
Yup. Validation should really be kept inside of the serializer. Or at least it should happen before perform_create is called.
I usually recommend people limit the [serializer_field].queryset property during the initialization of the serializer so the user gets an error if they try to set the relation to something invalid.
@kevin-brown Thanks for pointing me in that direction — this is what I wound up with in my serializer:
class MessageSerializer(serializers.ModelSerializer):
class Meta:
model = Message
fields = ('id', 'project', 'body')
def __init__(self, *args, **kwargs):
super(MessageSerializer, self).__init__(*args, **kwargs)
self.user = self.context['request'].user
self.fields['project'].queryset = self.user.projects.all()
def validate_project(self, project):
if project not in self.user.projects.all():
raise exceptions.PermissionDenied
return project
Because I'm already limiting the queryset in __init__, maybe I can remove validate_project() entirely...
If the side effect of calling .data is just the caching of the representation on _data, wouldn't it make sense that .save() just invalidates the cache, instead of refusing to perform the save?
@dyeray one shouldn't access serializer.data before saving. That's a sign of a possible bigger issue although one may not be aware of it already. It is just not meant for that.
I don't understand why in tutorial http://www.django-rest-framework.org/tutorial/3-class-based-views/ the example shows exactly the opposite.
```
serializer = SnippetSerializer(data=request.data)
if serializer.is_valid():
serializer.save()
```If I try this example, I get the exception "You cannot call '.save()' after accessing serializer.data. So what's wrong ?
@Denelbe what are you referring to ? serializer.data are used after calling save(), not before.
No, data is present from the instance creation when initializing with (data=request.data)
Ok, let's avoid too much noise here and move to the discussion group.