Django-rest-framework: OPTIONS fetches and shows all possible foreign keys in choices field

Created on 16 Dec 2015  路  24Comments  路  Source: encode/django-rest-framework

I have an object with foreign keys and when I load a OPTIONS request for that object type, it shows me a truly insane JSON object with every possible value for the foreign key (there are millions!)

So, for example, I have this object:

class OpinionCluster(models.Model):
    """A class representing a cluster of court opinions."""
    docket = models.ForeignKey(
        Docket,
        help_text="The docket that the opinion cluster is a part of",
        related_name="clusters",
    )

Which is serialized with:

class OpinionClusterSerializer(serializers.HyperlinkedModelSerializer):
    docket = serializers.HyperlinkedRelatedField(
        many=False,
        view_name='docket-detail',
        queryset=Docket.objects.all(),
        style={'base_template': 'input.html'},
    )

When I load this with OPTIONS, I get back something that contains:

    "docket": {
        "type": "field",
         "required": true,
         "read_only": false,
         "label": "Docket",
         "choices": [
            {
                "display_name": "4: United States v. Goodwin",
                "value": "http://127.0.0.1:8000/api/rest/v3/dockets/4/"
            },
            {
                "display_name": "5: Quality Cleaning Products v. SCA Tissue of North America",
                "value": "http://127.0.0.1:8000/api/rest/v3/dockets/5/"
            },

....millions more....

I know that there's a way to disable listing these items in the form of the HTML view, but in the OPTIONS request we need better default functionality than displaying millions of records.

Bug

Most helpful comment

Now that this has been fixed, is there a way to get the choices if I want this behaviour(listing all the choices)? Or is there perhaps a way to indicate from OPTIONS where the URI listing the choices might be found?

For my API OPTIONS returns :

    "actions": {
        "POST": {
            "date_created": {
                "type": "datetime",
                "required": false,
                "read_only": true,
                "label": "Date created"
            },
            "owner": {
                "type": "field",
                "required": false,
                "read_only": true,
                "label": "Owner"
            },
            "language": {
                "type": "field",
                "required": true,
                "read_only": false,
                "label": "Language"
            },
        }
    }
}

And I would like a way for the frontend API to know where to look for the URIs with which it can populate language

All 24 comments

I made an attempt at resolving this by overwriting the metadata class as mentioned above, but that doesn't work because all you can do at that stage is remove data that's already been pulled from the DB. Meaning: If you do this, you still fetch millions of objects from the DB, you just don't pass those values to the front end.

The fix for this, unless I'm mistaken, is one of:

  • Remove the OPTIONS request from the API (Sad, but would work)
  • Make the endpoints read_only (Sad, breaks the API)
  • Some sort of fix in DRF

Thanks for any help or other ideas.

Milestoning to raise the priority.

Agreed. This is odd behavior, treating FK fields as a choices field. Makes OPTIONS a bear trap.

I think the fix is to alter the conditional at https://github.com/tomchristie/django-rest-framework/blob/master/rest_framework/metadata.py#L140 to not do the choices thing on FK fields, and then possibly to do something else for them.

+1

(This may better be submitted as a separate issue. Let me know if I should.)

Also note that if one of those choice fields is a settings.AUTH_USER_MODEL, e.g. a owned_by field, it puts actual account data in there. A censored snipped of the returned JSON:

"owned_by":{
  "type":"field",
  "required":true,
  "read_only":false,
  "label":"Owned by",
  "choices":[
        {"display_name":"engineering+admin@<us>.com","value":"18"}, 
        {"display_name":"engineering+notanadmin@<us>.com","value":"19"},...

IMHO, the default shouldn't do this considering the security risk.

Is there any known work around while this issue is being sorted out? I'm having the security issue @fleur described with a ForeignKey to AUTH_USER_MODEL.

I've made these fields read_only. Works for our implementation, but sucks otherwise. Now there seems to be two reasons to fix this:

  1. Information leakage.
  2. Performance.

In order to work around this limitation you'll need to implement your own SimpleMetadata subclass:

from rest_framework.metadata import SimpleMetadata
from rest_framework.relations import RelatedField

class NoRelatedFieldChoicesMetadata(SimpleMetadata):
    def get_field_info(self, field):
        related_field = isinstance(field, RelatedField)
        # Remove the related fields choices since they can end up with many values.
        if related_field:
            field.queryset = field.queryset.none()
        field_info = super(NoRelatedFieldChoicesMetaData, self).get_field_info(field)
        # Remove the empty `choices` list.
        if related_field:
            field_info.pop('choices')
        return field_info

You can set it per serializer using the metadata_class attribute or per-project using the REST_FRAMEWORK['DEFAULT_METADATA_CLASS'] setting.

@charettes beware, however, that this doesn't fix the performance issue. From my earlier comment:

I made an attempt at resolving this by overwriting the metadata class as mentioned above, but that doesn't work because all you can do at that stage is remove data that's already been pulled from the DB. Meaning: If you do this, you still fetch millions of objects from the DB, you just don't pass those values to the front end.

There is definitely a performance issue here when a field has a large number of possible options. I honestly have no idea what the best way to fix it would be without breaking backwards compatibility.

...information leakage...

This is something I keep hearing whenever this comes up: that DRF is leaking information that it shouldn't be. And I guess it's true, you aren't intending for that information to be public. I mean, why should anyone be able to get a list of the users that are possible values for the field?

But that's usually a sign that you are not restricting your querysets down to only the objects that should be allowed. All objects that DRF lists in the OPTIONS request are valid inputs to the field, so if there are objects showing up in the list then you probably aren't filtering them out for the field.

@mlissner note that I use queryset.none() to avoid any queryset from being performed.

I mean, why should anyone be able to get a list of the users that are possible values for the field?

This is a common request to have all the related values within the option to let front end apps fill choice fields.
Note that it doesn't show read only fields.

@charettes That actually does look promising. You're right, that may be a workaround for the performance aspect of this issue.

Hi,
I'm also affected by this bug.
Meanwhile the fix arrives to PIP, I overrided the SimpleMetadata class with @charettes's fix in a6732e2. It solves the problem for related fields, but is missing to check also for serializers.ManyRelatedField.

+1

Fixed via #4021.

Hi @tomchristie, I checked again #4021 and is still missing the case with ManyRelatedField.
In rest_framework/metadata.py:140
if (not field_info.get('read_only') and not isinstance(field, serializers.RelatedField) and hasattr(field, 'choices')):
should be
if (not field_info.get('read_only') and not isinstance(field, serializers.RelatedField) and not isinstance(field, serializers.ManyRelatedField) and hasattr(field, 'choices')):

@callmewind, you're right, I assumed ManyRelatedField was a subclass of RelatedField. @tomchristie I can submit another PR if you don't have time to deal with it.

Thanks for that! :)

Many to Many now also resolved.

When can we get a release for this? This is quite a severe information security concern.

@unklphil Note that users are only able to see this information if they also have permission to set those fields in the first place.

The 3.4.0 release will be out sometime in the next few weeks.

If you want to disable this behavior prior to that set REST_FRAMEWORK['DEFAULT_METADATA_CLASS'] = None, or use a custom metadata class that implements the behavior as per the related pull requests.

Now that this has been fixed, is there a way to get the choices if I want this behaviour(listing all the choices)? Or is there perhaps a way to indicate from OPTIONS where the URI listing the choices might be found?

For my API OPTIONS returns :

    "actions": {
        "POST": {
            "date_created": {
                "type": "datetime",
                "required": false,
                "read_only": true,
                "label": "Date created"
            },
            "owner": {
                "type": "field",
                "required": false,
                "read_only": true,
                "label": "Owner"
            },
            "language": {
                "type": "field",
                "required": true,
                "read_only": false,
                "label": "Language"
            },
        }
    }
}

And I would like a way for the frontend API to know where to look for the URIs with which it can populate language

This is what I would try:

Derive a class from SimpleMetadata. Undo commit https://github.com/tomchristie/django-rest-framework/commit/05b0c2adff35e136ca2022156534c87f8394f34a in your implementation. It should also give you a hint on how to return an URI. And don't forget to change the metadata class in your settings.

```python
from rest_framework.metadata import SimpleMetadata
from rest_framework.serializers import ManyRelatedField

class CustomMetadata(SimpleMetadata):

def get_field_info(self, field):
    field_info = super().get_field_info(field)

    if (not field_info.get('read_only') and
        isinstance(field, ManyRelatedField) and
             hasattr(field, 'choices')):
        field_info['choices'] = [
            {
                'value': choice_value,
                'display_name': force_text(choice_name, strings_only=True)
            }
            for choice_value, choice_name in field.choices.items()
        ]

    return field_info
Was this page helpful?
0 / 5 - 0 ratings