Django-filter: Filtering is very slow on large datasets

Created on 5 Jul 2016  Â·  6Comments  Â·  Source: carltongibson/django-filter

Hello, there seems to be a problem when filtering large ( > 100 000 rows) datasets. Please take a look at the https://github.com/spapas/django_table_filtering repository (which is just an example of integrating django-tables2 with django-filter). Issue spapas/django_table_filtering#1 describes the initial problem.

To research this, I added a fill_data management command to quickly fill the Book table with data (run it with python manage.py fill_data 100000 10000 100 and introduced a bunch of extra CBVs (on books/views.py) beyond the initial BookFilteredSingleTableView:

  • One that displays only the table, without the filter: BookSingleTableView, path = /nofilter
  • One that displays only the filter, without the table: FilteredListView, path = /notable
  • One that uses both the table and the filter in the simplest way possible FilteredTableView, path = /fiilter2

Now, I have added around 300 000 books. When running the BookSingleTableView CBV, the view renders immediately (django-debug-toolbar shows ~ 100 ms) -- so the django-tables2 table is not the bottleneck. On the other hand, when I run either FilteredListView or FilteredTableView the view needs more than 5 s to render (FilteredListView doesn't use a table at all) ! Notice that the render time seems to be increasing serially with the number of books, i.e when I added 300 000 more books the time to render was around 10 s !)

Could somebody explain to me why I am experiencing this behavior? Could you recommend any solution?

Most helpful comment

Hello @carltongibson thanks for this -- it seems that two changes were needed to make everything work as expected:

  • I needed to pass the filter.qs to the django-table2 constructor
  • I needed to use .count instead of |length in the template

All 6 comments

First step would be to run with (say) Django Debug Toolbar enabled and look at the SQL. What are the queries? Do you have indexes in the right places etc.

Django Filter just proxies down to the ORM — so you need to look there. It will depend on the details.

I'm afraid I can't really say more. (This isn't really an _Issue_ with Django Filter)

Hi @carltongibson ,

SQL Query took just 21 ms with empty filters on all columns, but still loading time is high. When i tried search on one column, time is even higher. While without filter with same pagination and data, it is just taking few ms. Please see the attached screenshots.

screenshot from 2016-07-05 15-34-08

screenshot from 2016-07-05 15-33-35

Hello @carltongibson,

continuing from the previous comment, I actually think it is a django-filter problem (or maybe it is a problem in how we use django-filter)...

In the django-table-filtering project there are two very simple views, one with the filter (/filter2) and one without it (/nofilter). The one renders in < 100 ms, the other needs > 10 s. When I debug with django-debug-toolbar I see that they use the same query with the difference that the view with the filter _does not_ include a limit and is duplicated:

So here are the queries when the view contains the filter:
SELECT COUNT(*) AS "__count" FROM "books_book"
SELECT "books_book"."id", "books_book"."title", "books_book"."author", "books_book"."category" FROM "books_book"
Duplicated 2 times.

And here are without it:

SELECT COUNT(*) AS "__count" FROM "books_book"
SELECT "books_book"."id", "books_book"."title", "books_book"."author", "books_book"."category" FROM "books_book" LIMIT 25

Does that make things any more clear? Here's the CBV that uses the filter:

class FilteredTableView(ListView):
    model = books.models.Book

    def get_context_data(self, **kwargs):
        context = super(FilteredTableView, self).get_context_data(**kwargs)
        filter = books.filters.BookFilter(self.request.GET, queryset=self.object_list)
        table = books.tables.BookTable(filter)
        django_tables2.RequestConfig(self.request, paginate={'per_page': 25}).configure(table )

        context['filter'] = filter
        context['table'] = table
        return context

Could you possibly check out the https://github.com/spapas/django_table_filtering project and observe the bad behavior yourself (just clone the repo, do a pip install requirements.txt, migrate and run fill_data to insert books in the database).

Thanks !

The filter exposes the qs attribute. You're welcome to add a limit to that.

Hello @carltongibson thanks for this -- it seems that two changes were needed to make everything work as expected:

  • I needed to pass the filter.qs to the django-table2 constructor
  • I needed to use .count instead of |length in the template

Awesome!

Was this page helpful?
0 / 5 - 0 ratings