Django-rest-framework: Specifying different serializers for input and output

Created on 2 May 2014 · 23 Comments · Source: encode/django-rest-framework

I find that, particularly when it comes to POST methods, I often need a different serializer for input than for output. E.g., for a particular model I may need only two or three input values, but the server will calculate/retrieve/whatever some additional values for fields on the model, and all those values need to get back to the client.

So far, my method has been to override get_serializer_class() to specify a separate input serializer for the request, and then override create() to use a different serializer for my output. That pattern works, but it took me some time to figure it out because the docs don't really suggest that option; the assumption that the generic APIViews are built around is that you specify one serializer and that is used for both input and output. I agree this generally works in the common case, but using my method breaks a little bit of the CBV magic. In particular, it can be difficult to troubleshoot if you make a mistake specifying a custom output serializer.

I propose two solutions:

  1. Build a consistent way to optionally specify different input and output serializers, or
  2. Add some additional documentation explaining the presumption that one serializer is intended for input _and_ output, and best practices for overriding that in the case that the default behavior doesn't suit your use case.

All 23 comments

You need to use the read_only/write_only kwargs if you want to return different data but allow writing to other attributes.
If you need different formatting entirely, or even different field classes for the same attribute, then yes, you need to use two different serializers.

Hi @foresmac,

You've hit on one of the... erm... _learning points_ of DRF. This sort of thing comes up on StackOverflow a number of times.

Overriding get_serializer_class() works well enough, so I'd favour your Option 2.

If you fancy drafting up some changes in a pull request — or doing a blog post or something else like that — that would be cool.

In the meantime I'll close this particular issue.

@carltongibson:

Overriding get_serializer_class() only works if you're using a different serializer for different HTTP methods, right? Is there a way to override it so that it returns a different serializer for input on a request vs output on a Response?

@foresmac, I see, a slightly different case. I think the short answer is "Not automagically, not currently". I imagine the simplest thing (if this is really necessary) is setting the response data (with your _output_ serialiser) by hand. (But you've found your own way via create, it seems.)

"if this is really necessary"

You really can get a long way with read only fields and so on — I can certainly believe there are cases where this isn't enough but I'm not sure at all that such cases would fall in the 80:20 that needs to be served (in the core) by DRF.

If you think we're missing something, I recommend you explain it in depth, show where the code would change, show what use-cases would be resolved by it — if it sounds good, then open a pull request to that effect so that it can be reviewed.

If you fancy taking a pop at Option 2 — that'll always be well received.

Yeah, I'm fine either way, honestly. I just found it difficult to figure out how to do what I wanted from the docs, but that may be as much my misunderstanding of how DRF is designed to work.

Here's a sample so you can see what I am talking about; feel free to note if I'm doing something monumentally stupid and that there is/should be a better way in DRF itself that I'm just missing.

from rest_framework import generics, status
from rest_framework.response import Response

from rack.models import RackItem
from rack.serializers import RackItemSerializer, NewRackItemSerializer


class ListCreateRackItem(generics.ListCreateAPIView):
    model = RackItem

    def get_serializer_class(self):
        if self.request.method == 'POST':
            return NewRackItemSerializer
        return RackItemSerializer

    def get_queryset(self):
        return RackItem.objects.filter(shopper=self.request.user)

    def create(self, request, *args, **kwargs):
        serializer = self.get_serializer(data=request.DATA)

        if not serializer.is_valid():
            return Response(
                serializer.errors, status=status.HTTP_400_BAD_REQUEST)

        item = RackItem.objects.create(
            shopper=request.user,
            item_url=serializer.data['item_url'],
            item_image_url=serializer.data['item_image_url'])

        result = RackItemSerializer(item)
        return Response(result.data, status=status.HTTP_201_CREATED)


class GetUpdateDeleteRackItem(generics.RetrieveUpdateDestroyAPIView):
    model = RackItem
    serializer_class = RackItemSerializer

    def get_queryset(self):
        return RackItem.objects.filter(shopper=self.request.user)

and the serializers themselves:

from rest_framework import serializers

from rack.models import RackItem


class RackItemSerializer(serializers.ModelSerializer):
    class Meta:
        model = RackItem


class NewRackItemSerializer(serializers.Serializer):
    item_url = serializers.URLField()
    item_image_url = serializers.URLField()

The gist here is that I'm only getting some small bit of information to create a rack item; the server itself generates all the other fields on the model and stores them in the DB. But, I want my endpoint to spit out all that info when a new item is created.

The pattern isn't too difficult, but all the docs seem to make a lot of assumptions that everyone is doing the 80% common case, and what's there makes it hard to see what to override or where to achieve other ends. If doing what I'm doing isn't common enough to address in the code, I'm more than happy to provide some sample code and an explanation of how it works and why.

@foresmac — it looks to me like you're switching the serialisers in the most sensible way.

_However_, I'd guess you could get the same result by marking the server-provided fields — shopper in your example — as read-only in RackItemSerializer and then just use that. — I guess it's just a question of what you prefer in the end.

I think we should state that we prefer read only fields in order to DRY up the code.
Creating two serializers seems redundant.
@foresmac What do you think?

There will be more fields that are editable later—mostly some boolean fields that record some user actions with the model—so I'm not sure that solves the problem in the long term. Agree that I probably should be making better use of read_only, though. Maybe setting required explicitly in some cases may also help? I'm not sure, TBH.

I'm used to basically creating a Django form to use for input validation, and basically just building a dict of key_name, value pairs and basically just doing json.dumps() for output. The whole concept of a serializer that works both ways was completely foreign to me before using DRF.

The answer I came here to find turned out to be:

If read_only doesn't provide the control you need, and you want to customise the input or output validation logic, you should override to_internal_value() or to_representation(), respectively.

In this case, you'd use to_internal_value() to customise the generation of validated_data from whatever the client provides. If the subsequent built-in call to YourModel.objects.create(**validated_data) doesn't work, you can then override create().

I'm trying to tackle this issue right now.
I'm going to give this package a try - https://github.com/vintasoftware/drf-rw-serializers.

Your approach seems logical. Why not just make a mixin out of it and reuse it? Something along the lines of:

from rest_framework import status
from rest_framework.response import Response


class DifferentOutInViewsetSerializers:
    """
    Mixin for allowing the use of different serializers for responses and
    requests for update/create
    """
    request_serializer_update = None
    response_serializer_update = None

    request_serializer_create = None
    response_serializer_create = None

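    def get_serializer_class(self):
        # PATCH requests arrive as 'partial_update' but are routed through
        # update(), so treat both actions the same way.
        if self.action in ('update', 'partial_update'):
            return self.request_serializer_update
        elif self.action == 'create':
            return self.request_serializer_create
        return super().get_serializer_class()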

    def create(self, request, *args, **kwargs):
        serializer = self.get_serializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        self.perform_create(serializer)

        response_serializer = self.response_serializer_create(
            instance=serializer.instance)

        headers = self.get_success_headers(response_serializer.data)
        return Response(
            response_serializer.data,
            status=status.HTTP_201_CREATED, headers=headers)

    def update(self, request, *args, **kwargs):
        partial = kwargs.pop('partial', False)
        instance = self.get_object()
        serializer = self.get_serializer(instance, data=request.data, partial=partial)
        serializer.is_valid(raise_exception=True)
        self.perform_update(serializer)

        if getattr(instance, '_prefetched_objects_cache', None):
            # If 'prefetch_related' has been applied to a queryset, we need to
            # forcibly invalidate the prefetch cache on the instance.
            instance._prefetched_objects_cache = {}

        response_serializer = self.response_serializer_update(
            instance=serializer.instance)
        return Response(response_serializer.data)

I find myself in need of this as well, but in my case it's not for POST requests, but a GET request that returns a bunch of non-model objects, and I need a bunch of filter parameters that are not directly related to the fields on the objects.

I'm not sure if I'm doing something weird or missing something entirely, as I can't be the first to have an endpoint that is not model CRUD and needs to generate the documentation from the endpoint?

I can just retrieve the arguments from the request object, but I'm trying to have my API be self-documenting by using the API documentation generator in drf, or the drf-yasg package, hence my reason for wanting to use the serializers for specifying the parameters.
Is there another way to describe the endpoint parameters besides serializers?

Sorry if this is not within the scope of this issue.

Previously I said I was going to use https://github.com/vintasoftware/drf-rw-serializers.
After 6 months using it, it has been really helpful. For most problems stated here, this library will help.

@Moulde Take a look at the lib I'm using. If it does not suit you, try implementing the method get_serializer_class - http://www.django-rest-framework.org/api-guide/generic-views/#get_serializer_classself.

@frenetic This seems to solve a problem that `get_serializer_class` can already solve most of the time. What @moulde (who presents a slightly different use-case than I previously mentioned) is saying is that there are times when you want/need to use different serializers for the Request vs the Response. And being able to take advantage of automatic documentation is an important consideration for this in my mind, outside of the desire to avoid boilerplate code in this case.

Yes, basically I think DRF has less than ideal support for non-model views, where you want to use a serializer to describe the interface, so that the automatic documentation can be generated.
An example could be an endpoint for filtering/searching (external) non-model data.

"there are times when you want/need to use different serializers for the Request vs the Response"

This was a big reason to use https://github.com/limdauto/drf_openapi (when it was still maintained). It would be awesome if it was possible to differentiate request and response schemas in django-rest-framework. There are a lot of times we're extending another api which has these characteristics.

I do think that having APIView support for different serializers for input and output would be a killer feature. It might even be easy to implement; for example a parameter could be supplied to get_serializer so that get_serializer_class knows the context (input vs output). This seems like it'd be fairly unobtrusive, and would allow for apps needing this functionality to follow the usual advice of "override get_serializer_class".

In the absence of that, I wanted to highlight a couple current django-rest-framework features I didn't see in this thread that might help someone else coming through here later on. If it's workable for you to include all your desired input and output fields into one Model, you can use a ModelSerializer to specify read-only and write-only fields such that you effectively get different input and output schemas.
(Versions: djangorestframework 3.10.3, Django 2.2.5, Python 3.7.4)

So a Model like this:

from django.db import models

class ThingModel(models.Model):
    input_field = models.CharField()
    output_date = models.DateTimeField()
    output_id = models.IntegerField()
    output_string = models.CharField()

And a Serializer like this:

from rest_framework import serializers
from .models import ThingModel

class ThingSerializer(serializers.ModelSerializer):
    class Meta:
        model = ThingModel
        fields = [
            "input_field",
            "output_date",
            "output_id",
            "output_string",
        ]
        read_only_fields = [
            "output_date",
            "output_id",
            "output_string",
        ]
        extra_kwargs = {"input_field": {"write_only": True}}

Will produce the following OpenAPI docs when generateschema is used:

...
  /urlpath/:
    post:
      operationId: CreateThing
      parameters: []
      requestBody:
        content:
          application/json:
            schema:
              properties:
                input_field:
                  type: string
                  writeOnly: true
              required:
              - input_field
      responses:
        '200':
          content:
            application/json:
              schema:
                properties:
                  output_date:
                    type: string
                    format: date-time
                    readOnly: true
                  output_id:
                    type: integer
                    readOnly: true
                  output_string:
                    type: string
                    readOnly: true
                required: []
          description: ''

This helps when you want one endpoint, within the context of a single method (eg POST), to have different input and output schemas, if the whole object can be expressed as a Model. It'd be great to have similar functionality without having to use a ModelSerializer.

What is the "state of the art" solution for this problem? Was there any progression since 2014?

@fronbasal We've been accomplishing this with Serializers by using write_only=True and read_only=True on each serializer Field as appropriate. I'm not sure what the generated Swagger docs look like for that, but it does accomplish different schemas for input and output to a sufficient extent for our purposes.

I'm not sure whether the maintainers have a different/better solution in mind.

@matthewwithanm All right, thank you very much!

@fronbasal I have come up with a simple solution: just pass the saved model instance to another serializer.

Here is the example:

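from rest_framework import status
from rest_framework.exceptions import ValidationError
from rest_framework.response import Response


def create(self, request):
    # Validate the incoming payload with the input serializer.
    serializer = InputSerializer(data=request.data)
    if not serializer.is_valid():
        return Response(
            {'errors': serializer.errors},
            status=status.HTTP_400_BAD_REQUEST)
    try:
        # Save, then hand the saved instance to the output serializer.
        output_serializer = OutputSerializer(serializer.save())
    except Exception as e:
        # Exceptions are not JSON-serializable, so pass the message as text.
        raise ValidationError({"detail": str(e)})
    return Response(output_serializer.data, status=status.HTTP_201_CREATED)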

Cheers

@michaelhenry that's a really clever solution! Love it.
