I have a serializer that looks the the following:
class ReportSerializer(serializers.ModelSerializer):
person = serializers.SlugRelatedField(slug_field='uuid')
which is causing an N +1 select issue because select_related is not being used to pre-query the Person object. This means that in order to serialize the person field when serializing a Report, each Person must be queried for its uuid.
We likely want to use select_related on all SlugRelatedField and likely all nested serializers somewhere like here: https://github.com/tomchristie/django-rest-framework/blob/e437520217e20d500d641b95482d49484b1f24a7/rest_framework/serializers.py#L570
It's actually been a conscious design decision to leave the select_related and prefetch_related down the end users.
Attempting to automatically add them would be helpful in many cases, but could also end up failing or giving worse performance in other cases. The other bit that's awkward is that we'd only want to add this to ModelSerializer (Serializer shouldn't have anything specific to Django models in it), which would end up introducing a further divide between what those two classes do, which we might or might not want.
Of course we'd also have to do the clevernesses of figuring out when to use prefetch_related and when to use select_related, so we'd be introducing a bunch of complexity.
Having said all that, might not be averse to building something like this in, in the future. Any thoughts from anyone else?
A follow up question, would be "given that DRF does not currently do the
needed magic, how would I as a user specify to use select related?" I would
think that I would want to subclass and change the line of code I linked to
(ie subclass data property) or explicitly specify the queryset to the
serializer?
I feel at the very least more documentation is deserved. I was quite
surprised to see the large number of SQL queries occurring due to the slug
related usage.
On Oct 17, 2014 10:35 AM, "Tom Christie" [email protected] wrote:
It's actually been a conscious design decision to leave the select_related
and prefetch_related down the end users.Attempting to automatically add them would be helpful in many cases, but
could also end up failing or giving worse performance in other cases. The
other bit that's awkward is that we'd only want to add this to
ModelSerializer (Serializer shouldn't have anything specific to Django
models in it), which would end up introducing a further divide between what
those two classes do, which we might or might not want.Of course we'd also have to do the clevernesses of figuring out when to
use prefetch_related and when to use select_related, so we'd be
introducing a bunch of complexity.Having said all that, might not be averse to building something like this
in, in the future. Any thoughts from anyone else?—
Reply to this email directly or view it on GitHub
https://github.com/tomchristie/django-rest-framework/issues/1964#issuecomment-59521704
.
Assuming you're using generic views then specify it on the queryset attribute of view or viewset, eg...
queryset = Report.objects.all().select_related('person')
And yes, we should make to include some notes on that in the generic view section, plus also have a a section on general usage tips etc. Including:
Okay. That makes sense.
I am curious to hear what others think, but I would love to see some magic
here with automatically adding select related based on slug related usage.
On Oct 17, 2014 10:47 AM, "Tom Christie" [email protected] wrote:
Assuming you're using generic views then specify it on the queryset
attribute of view or viewset, eg...queryset = Report.objects.all().select_related('person')
And yes, we should make to include some notes on that in the generic view
section, plus also have a a section on general usage tips etc. Including:
- Good API style.
- querysets and select_related, prefetch_related.
- DB indexes, particularly wrt filters, relationships, view lookups.
- Model encapsulation.
—
Reply to this email directly or view it on GitHub
https://github.com/tomchristie/django-rest-framework/issues/1964#issuecomment-59523476
.
"slug related usage" That's just one of many possibilities of course - nested serializers and other relationship types also benefit from this.
I believe that this approach if implemented correctly can enhance the performance of DRF greatly.
If @tomchristie doesn't want this to be the default it can be provided as a mixin.
The algorithm for generating the values of the arguments of select_related is:
foo.bar then the path of the relationship should be foo)The algorithm for generating the values of the arguments of prefetch_related is:
foo.bar then the path of the relationship should be foo)The only problem I see here is what happens when the relationship source involves a nested relationship since I don't know how Django performs the queries.
E.g. The following serializer represents a model that has a m2m to a foo model that has an FK to the bar model:
class MySerializer(ModelSerializer):
foo = CharField(source='foo.bar.name')
class Meta:
model = MyModel
Should generate which query exactly?
I _think_ that the query should be:
MyModel.objects.prefetch_related('foo').select_related('foo__bar').only('foo__bar__name')
But I'm not sure what django will do in that case. In any case, I'd use only to ensure only the fields I require are fetched.
I believe that extending this issue to support automatically detecting exactly which fields should be fetched using only()/defer() is a good idea since it relates to the exact same problem and might offer us some solutions in paths like the above.
Detecting when to use only() or defer() is much easier. You just iterate on each field and translate the source attribute into a relationship path if required.
As a bonus Django will use update_fields when updating an existing record since the fields which are not used by the serializer will be deferred. See https://github.com/django/django/blob/master/django/db/models/base.py#L614 for the implementation.
We can compromise and specify that we don't support detecting *_related on nested relationships or defer the support for those relationships to later version (which I'd prefer if any of you thinks it's too complex to implement).
Note that prefetch_related should not be applied when pagination is off or if no filtering is performed.
Also after thinking about the nested relationships problem I believe it can be solved only in Django 1.7 and above using the Prefetch object.
At this point in time I'm closing this off in favor of the documentation ticket #1977.
Would I be happy to consider this at some point in the future? Sure, but it needs to start off life as a third party project. There's nothing to stop someone else from taking this on outside of core.
Hi,
i'm trying to query with generic relation what should i use, either 'Select_related' or 'Prefetch_related'.
i'm having to two table user and contact.
i'm using contact table as generic(created generic relation from django view) .
so i'm making query like:
result = User.objects.filter().select_related('id')
in above queryset i'm passing "ID" to select_related . (ID is pk of "user" table and acting as Entity_object_id in "contact" table and Entity_content_type_id)
How can i get each user details with their contact details
for generic relation used link :https://docs.djangoproject.com/en/1.7/ref/contrib/contenttypes/#generic-relations.
@Deepakdubey90 please do use the mailing list, IRC or stack overflow for usage question.
ok
I'm working on this over at django-auto-prefetching
I'm not quite ready to publish to PyPi yet, but so far I have an algorithm that passes tests for all the relationships(fk, m2m, 1to1)
It recursively scans the fields of the serializer, fetching nested relations if needed.
If anyone wants to try it out before I'm done publishing (maybe find out where it doesn't work and submit an issue to the repo?) I'm sharing the first version of the code here:
https://gist.github.com/GeeWee/5f82fd11d5e9c95bfe61993c07624ccc
First usable version of django-auto-prefetching is now live if anyone still has this problem
Hello everyone,
We faced the same issues in some projects and I have created a simple package to use in our projects. Then I have decided to publish it in github and pypi for anyone who is interested. Btw it is still being tested and there might be bugs. You can check it out on https://github.com/thetarby/django-auto-related
Simply what it does:
What django-rest gives us is the ability to know what will be accessed from the object before it is accessed by inspecting the source attribute of the fields defined in a serializer.
Then it traces those sources back on django fields instances and translates them to select_related, prefetch_related and only methods of django. Note that it does not use django-rest fields to decide what to prefetch but django fields because what django-rest says a related_field may not be a related field for the django or vice versa.
For instance, serializers.CharField(source=related_model.title) is not a related field for djangorest but in reality related_model should be prefetched to avoid n+1 problems.
If you have any feedback I would be glad to hear from you on github.
First usable version of django-auto-prefetching is now live if anyone still has this problem
I'm using django-auto-prefeching in my company's project. It helps a lot. Thank you.
You're very welcome!
Most helpful comment
First usable version of django-auto-prefetching is now live if anyone still has this problem