Graphene: Techniques for enhancing Django query performance

Created on 30 Apr 2017 · 8 Comments · Source: graphql-python/graphene

Hi folks!

I'm curious how one would go about improving query performance when using DjangoObjectTypes that are connected directly to a model. One optimization that I think would be easy is the following:

if 'arg x' is a field that is asked for:
    make sure that "something_set" is included in "prefetch_related"
    (a reverse "_set" relation cannot be followed by "select_related")

Anyone willing to share ideas? Also, how would one go about creating an LRU cache?
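On the LRU cache question, one possible starting point is the standard library's functools.lru_cache. The sketch below is only an illustration: load_label and CALLS are hypothetical names, and the function body stands in for whatever expensive lookup is being cached.

```python
from functools import lru_cache

# Records every real (uncached) call, purely for demonstration.
CALLS = []


@lru_cache(maxsize=2)
def load_label(item_id):
    """Hypothetical stand-in for an expensive lookup keyed by a hashable id."""
    CALLS.append(item_id)
    return 'item-%d' % item_id


load_label(1)  # miss: computed and cached
load_label(2)  # miss: computed and cached
load_label(1)  # hit: served from the cache, 1 becomes the most recently used key
load_label(3)  # miss: the cache is full, so the least recently used key (2) is evicted
load_label(2)  # miss again, because 2 was evicted

print(CALLS)  # [1, 2, 3, 2]
print(load_label.cache_info())
```

Note that this cache lives in a single process and knows nothing about database writes, so cached model data would need explicit invalidation (for example load_label.cache_clear()) whenever the underlying rows change.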

wontfix

All 8 comments

That's exactly what I do to build select_related and prefetch_related in my top level resolvers. I'm out at the moment but will share some code when I get home 🙂

I've extracted some code and simplified it a bit (our implementation is a bit more complex), but hopefully you find it useful. One day I'll find the time to release this as a reusable package (or perhaps directly into graphene-django).

helpers.py

from graphql.utils.ast_to_dict import ast_to_dict


def collect_fields(node, fragments, variables):
    """Recursively build a nested dict of the field names selected under node"""
    field = {}
    selection_set = node.get('selection_set') if node else None
    selections = selection_set.get('selections', None) if selection_set else None

    if selections is not None:
        for leaf in selections:
            leaf_kind = leaf.get('kind')
            leaf_name = leaf.get('name', {}).get('value')
            # 'directives' may be missing or None, so fall back to an empty list
            leaf_directives = leaf.get('directives') or []

            # Check if leaf should be skipped
            # - If name is '__typename'
            # - if @skip directive is used and evaluates to True
            # - if @include directive is used and evaluates to False (not yet implemented!)
            should_skip = False
            for directive in leaf_directives:
                if directive.get('name', {}).get('value') == 'skip':
                    for arg in directive.get('arguments') or []:
                        arg_value = arg.get('value', {})
                        if arg.get('name', {}).get('value') == 'if':
                            if arg_value.get('kind') == 'Variable':
                                # Look up the variable's runtime value, e.g. @skip(if: $hide)
                                var_name = arg_value.get('name', {}).get('value')
                                should_skip = variables.get(var_name, should_skip)
                            elif arg_value.get('kind') == 'BooleanValue':
                                # Literal value, e.g. @skip(if: true)
                                should_skip = arg_value.get('value')

            if leaf_name != '__typename' and not should_skip:
                if leaf_kind == 'Field':
                    field.update({leaf_name: collect_fields(leaf, fragments, variables)})
                elif leaf_kind == 'FragmentSpread':
                    # Merge the fields of the named fragment into the current level
                    field.update(collect_fields(fragments[leaf_name], fragments, variables))
                elif leaf_kind == 'InlineFragment':
                    # An inline fragment carries its own selection set; merge it in place
                    field.update(collect_fields(leaf, fragments, variables))
    return field


def get_fields(info):
    """Return a nested dict of the fields requested by a graphene resolver"""
    # Convert each named fragment to a plain dict so fragment spreads can be expanded
    fragments = {name: ast_to_dict(value) for name, value in info.fragments.items()}
    node = ast_to_dict(info.field_asts[0])

    return collect_fields(node, fragments, info.variable_values)

Example usage

from django.db.models import Prefetch

# Assumed module layout: adjust these imports to wherever the helper and models live
from .helpers import get_fields
from .models import Comment, Post


def resolve_posts(self, args, context, info):
    qs = Post.objects.all()
    fields = get_fields(info)

    # posts { author }
    if 'author' in fields:
        qs = qs.select_related('author')

    # posts { comments }
    if 'comments' in fields:
        all_comments = Comment.objects.all()

        # posts { comments { author } }
        if 'author' in fields['comments']:
            all_comments = all_comments.select_related('author')

        qs = qs.prefetch_related(Prefetch('comments', queryset=all_comments))

    return qs
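For deeper queries, the nested dict returned by get_fields can also be flattened into Django-style double-underscore lookup paths instead of being checked branch by branch. The following is a minimal sketch, not part of the helper above: fields_to_lookups is a hypothetical name, and it assumes that the GraphQL field names match the model relation names (which is not guaranteed, for instance when field names are camel-cased).

```python
def fields_to_lookups(fields, prefix=''):
    """Flatten a nested field dict into lookup paths such as 'comments__author'.

    Only fields with sub-selections are kept, since only those can be relations.
    """
    lookups = []
    for name, children in sorted(fields.items()):
        if children:
            path = prefix + name
            lookups.append(path)
            # Recurse with the current path as the prefix for nested relations
            lookups.extend(fields_to_lookups(children, path + '__'))
    return lookups


# Shape of the dict for: posts { title author { name } comments { body author { name } } }
requested = {
    'title': {},
    'author': {'name': {}},
    'comments': {'body': {}, 'author': {'name': {}}},
}

print(fields_to_lookups(requested))  # ['author', 'comments', 'comments__author']
```

The caller still has to decide which of these paths belong in select_related (forward foreign keys and one-to-one relations) and which in prefetch_related (reverse and many-to-many relations).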

Similar approach but not as fully featured:

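def field_selects_path(field, path):
    """Return True if the selections under field contain the given path."""
    # Leaf fields and fragment spreads have no selection set to descend into
    selection_set = getattr(field, 'selection_set', None)
    if len(path) == 0 or selection_set is None:
        return False
    for subfield in selection_set.selections:
        # Inline fragments have no name, so skip them instead of raising
        name = getattr(subfield, 'name', None)
        if name is not None and name.value == path[0]:
            if len(path) == 1:
                return True
            return field_selects_path(subfield, path[1:])
    return False


def path_is_selected(info, path):
    # The first path element must match the field currently being resolved
    for field in info.field_asts:
        if field.name.value == path[0]:
            return field_selects_path(field, path[1:])
    return False


def select_related(info, qs, path, select):
    """
    @param info  {graphene.Info} i.e. the info at the end of: `def resolve_thumb(agent, args, context, info):`
    @param qs {QuerySet}
    @param path {str} - e.g. `deals.edges.node.client`
    @param select {str} - The table to select, e.g. 'client' or 'client__showingrequest'
    """
    if path_is_selected(info, path.split('.')):
        return qs.select_related(select)
    return qs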

Usage:

qs = select_related(info, qs, 'deals.edges.node.client', 'client')
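The same dotted-path check can also be run against the nested dict that get_fields returns, which already has fragments expanded. This is a rough sketch under that assumption; is_selected is a hypothetical name. Because get_fields returns the selections beneath the field being resolved, the path here leaves out the leading `deals`.

```python
def is_selected(fields, path):
    """Return True if the dotted path exists in a nested field dict."""
    node = fields
    for name in path.split('.'):
        if not isinstance(node, dict) or name not in node:
            return False
        node = node[name]
    return True


# Shape of the dict for: deals { edges { node { id client { name } } } }
fields = {'edges': {'node': {'id': {}, 'client': {'name': {}}}}}

print(is_selected(fields, 'edges.node.client'))  # True
print(is_selected(fields, 'edges.node.agent'))   # False
```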

Here's another approach to solving this: https://gist.github.com/camd/cae123923ba8ce2442ebd6e1be34377e

It's a bit verbose because I included my supporting graphs for usage and context. Most of it is in the helper.py

I definitely borrowed heavily from several different sources (including those in this thread.) :)

Hi @gamesbrainiac. We're currently going through old issues that appear to have gone stale (i.e. not updated in about the last 6 months) to try and clean up the issue tracker. If this is still important to you, please comment and we'll re-open this.

Thanks!

@jkimbo hey, I'm still interested in this, would be great if it could be re-opened.

I wonder if something like this might be a nice approach:

class ItemType(DjangoObjectType):
    class Meta(object):
        model = Item
        only_fields = (
            'id',
            'stock',
            'price_gbp',
            'price_previous_gbp',
            'images',
            'sizes',
        )
        interfaces = (relay.Node,)

        def qs(self, fields):
            selects = {
                'brand_name': ['item_base__brand'],
                'name': ['item_base'],
            }
            prefetches = {
                'sizes': ['sizes'],
            }
            return self.model.objects.select_related(
                *[x for y in fields for x in selects.get(y, [])],
            ).prefetch_related(
                *[x for y in fields for x in prefetches.get(y, [])],
            )

In this example, fields is a list of the field names requested on this type, for which the resolvers will be called.

selects and prefetches could possibly be defined on the Meta class instead, but it would be good to have a hook to do this entirely manually anyway.
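To make the mapping step concrete, here is a small standalone sketch of how the requested field names could be turned into lookup arguments. build_lookups and the requested list are hypothetical; the two mappings are copied from the proposal above.

```python
def build_lookups(fields, mapping):
    """Return the lookups that mapping lists for the requested fields, without duplicates."""
    lookups = []
    for field in fields:
        for lookup in mapping.get(field, []):
            if lookup not in lookups:
                lookups.append(lookup)
    return lookups


selects = {'brand_name': ['item_base__brand'], 'name': ['item_base']}
prefetches = {'sizes': ['sizes']}
requested = ['id', 'name', 'brand_name', 'sizes']  # made-up example request

print(build_lookups(requested, selects))         # ['item_base', 'item_base__brand']
print(build_lookups(requested, prefetches))      # ['sizes']
print(build_lookups(['id', 'stock'], selects))   # []
```

One thing to watch for: Django's select_related() called with no arguments follows every non-null foreign key, so it is probably worth calling it only when the resulting list is non-empty.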

@danpalmer sure I can reopen this issue but I think it would make more sense for this issue to be opened on the graphene-django repo since that is where any improvements would happen.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
