Django-tables2: Duplicated table rows after sorting by column header

Created on 3 Aug 2016 · 32Comments · Source: jieter/django-tables2

I'm experiencing an issue where some rows are rendered multiple times in place of other rows. The table in question used to render perfectly well on an earlier versions of django and django-tables2. I upgraded both of them at the same time to the latest version of each (django 1.10, django-tables2 1.2.4) then this issue started happening.

See this image here for a demonstration: http://imgur.com/a/pPRT8

When I load a table without choosing a sort order by the column, I don't see this issue. Once I click on a column header to change the sort order then rows are duplicated.

So far I've identified that the data passed into the BoundRows object is correct, with no duplicated items. However iterating through the bound rows returns duplicated entries. So it looks like something may be changing the data object while the iteration is taking place.

This issue is happening on tables of length ~150, and I haven't (yet) been able to reproduce this issue with a smaller test dataset.

This snippet shows the test code I have used to narrow down the cause of the issue.

--- a/django_tables2/rows.py
+++ b/django_tables2/rows.py
@@ -8,6 +8,7 @@ from django.utils import six
 from .columns.linkcolumn import BaseLinkColumn
 from .utils import A, AttributeDict, call_with_appropriate, computed_values

+import sys

 class BoundRow(object):
     """
@@ -187,9 +188,17 @@ class BoundRows(object):
     def __init__(self, data, table):
         self.data = data
         self.table = table
+        for d in data:
+            print >>sys.stderr, d
+
+        print >>sys.stderr, "end data"
+

     def __iter__(self):
         for record in self.data:
+            print >>sys.stderr, "__iter__", record
             yield BoundRow(record, table=self.table)

bug

Source

wtfrank

Most helpful comment

wohoo, finally fixed!

Yes, I'll release within an hour.

jieter on 27 Feb 2017

🎉4

All 32 comments

Crystallising the queryset via list() stops the issue happening:

+++ b/django_tables2/rows.py
@@ -8,6 +8,7 @@ from django.utils import six
 from .columns.linkcolumn import BaseLinkColumn
 from .utils import A, AttributeDict, call_with_appropriate, computed_values

+import sys

 class BoundRow(object):
     """
@@ -187,9 +193,26 @@ class BoundRows(object):
     def __init__(self, data, table):
         self.data = data
         self.table = table


     def __iter__(self):
+        print >>sys.stderr, type(list(self.data.data))

wtfrank on 3 Aug 2016

Thanks for reporting, casting to list is not acceptable as a solution since that won't work with bigger datasets.

Can you try to downgrade your django-tables2 version to make sure it's not a problem with updating your Django version?

jieter on 17 Aug 2016

Yeah definitely not a solution - but it may be indicative of the cause. The other reason casting to list isn't a suitable solution is because it doesn't always seem to be a queryset holding the data anyway, in which case self.data.data won't exist.

wtfrank on 17 Aug 2016

@wtfrank, I see, but it would really help if you could provide a reproducible test case. Did you try downgrading django-tables2 to the previous version you used?

jieter on 22 Aug 2016

I think i'm running into the same issue: In my table, one row is missing, and another one is a duplicate... For me this started with version 1.2.2.
Downgrading to 1.2.1 "fixxes" the problem for me.... Is there any information i can provide to make the search easier?

fliphess on 5 Sep 2016

A minimal test case would be really nice to have. We can then run git bisect to find the bad commit.

jieter on 6 Sep 2016

I'm looking into a test case at the moment, but strangely enough i cannot reproduce the issue on my test environment ...

Locally (using ./manage runserver using sqlite) all is well, while in production (running uwsgi and mysql) the problem of dupes exists... (both environments run on equal versions)

The queryset does not contain any dupes.... But at the time the table.html is rendered, the queryset is missing one object and another one is duplicate...

Getting back to you soon with a test case hopefully :)

fliphess on 6 Sep 2016

The issue occurs also in 1.2.6 version. I think it is sth with #329 or #330 but so far I can not find way to fix it.

felixxm on 6 Sep 2016

Those are only applicable to 1.2.6, so I don't expect them to be the source of this bug.

jieter on 7 Sep 2016

Sorry, I'm not able to provide a proper reproducible case, but I've just got this issue (or, at least, I believe it's this issue) and can tell that it certainly had happened between 1.2.1 and 1.2.2.

Whenever I downgrade to 1.2.1, everything looks just fine. When I upgrade to 1.2.2 or above (I've started with latest 1.2.6, of course), I consistently lose the first row (out of three I have in DB) and get a copy of another one instead of it. E.g., when on 1.2.1 I have:

| date | hits | ... |
| --- | --- | --- |
| 2016-10-08 | 123 | ... |
| 2016-10-07 | 321 | ... |
| 2016-10-06 | 0 | ... |

On 1.2.2-1.2.6 I consistently get this instead:

| date | hits | ... |
| --- | --- | --- |
| 2016-10-06 | 0 | ... |
| 2016-10-07 | 321 | ... |
| 2016-10-06 | 0 | ... |

I'm using Django 1.9.9 on Python 2.7.12, project was started using pydanny/cookiecutter-django (although heavily modified after that), and my code looks like this:

class DailySummaryTable(tables.Table):
    class Meta:
        model = DailySummary
        attrs = {"class": "paleblue"}
        empty_text = _("No stats yet")
        fields = ("date", "hits", ...long boring list...)
        # order_by = ("-date",)
        # orderable = False

Model's Meta only has verbose_name{,_plural} stuff, and view is a simple DetailView that looks like this:

class SourceDetailView(LoginRequiredMixin, DetailView):
    model = Source
    ...
    def get_context_data(self, **kwargs):
        ctx = super(...)
        ctx["stats"] = DailySummaryTable(DailySummary.objects.filter(source=self.object))
        return ctx

And there are three rows in DB (PostgreSQL 9.6), all having the same source_id (I have only one), so they all match the query. Strangely, If I replace .filter(source=...) with .all() the issue seems to disappear.

That's all I was able to figure out. Hope it helps. I guess, I'll just stick with 1.2.1 for now :)

drdaeman on 8 Oct 2016

👍1

Thanks for showing your wanderings, I'll look into this issue later.

jieter on 9 Oct 2016

I just ran into the same issue, crystallising the queryset via list() is not a viable solution on our side neither.

op-alex-reid on 19 Oct 2016

@op-alex-reid can mlyou try to turn your use case into a minimal reproducible test case?

jieter on 19 Oct 2016

I'm having the same problem (Django==1.9, django-tables==1.2.3) with the following code:

class UserPlayerTable(tables.Table):
    actions = tables.LinkColumn('player:my_players_detail', args=[A('slug')],
                                text=_('View / Edit'),
                                verbose_name=_('View / Edit'), empty_values=())
    embed = tables.LinkColumn('player:my_players_embed', args=[A('slug')],
                                text=_('View / Embed now'),
                                verbose_name=_('View / Embed now'), empty_values=())

    class Meta:
        template = 'tables/table.html'
        model = Player
        fields = ('created', 'slug', 'actions', 'embed')
        attrs = {"class": "changeset"}
        order_by = ['-created']
        orderable = False

and

UserPlayerTable(Player.objects.filter(user=context['impersonate_user']))

Interestingly, when I set orderable = True, the problem does not occur.

In my case, crystallising the queryset via list() _is_ an option (and the fix for now), since there are only at most 5 rows in the table.

lukasklein on 27 Oct 2016

Thanks for the example!

jieter on 27 Oct 2016

I tried reproducing this using this code:

class Player(models.Model):
    person = models.ForeignKey(Person)
    created = models.DateTimeField(auto_now_add=True)
    score = models.PositiveIntegerField()

def test_issue_361(per_page=5):

    bob = Person.objects.create(first_name='Bob', last_name='Builder')
    eve = Person.objects.create(first_name='Eve', last_name='Dropper')

    for i in range(10):
        Player.objects.create(person=bob, score=randint(0, 200))
        Player.objects.create(person=eve, score=randint(200, 400))
        Player.objects.create(person=bob, score=5)
        Player.objects.create(person=eve, score=5)

    class UserPlayerTable(tables.Table):
        class Meta:
            model = Player
            fields = ('person.name', 'score',)
            order_by = ['-score']
            orderable = False

    queryset = Player.objects.filter(score=5)

    table = UserPlayerTable(queryset)
    RequestConfig(request, paginate={'per_page': per_page}).configure(table)
    html = table.as_html(request)

    # count the number of rows, subtract one for the header
    assert (html.count('<tr') - 1) == per_page

but this doesn't yield duplicated rows...

jieter on 10 Dec 2016

I encountered the same issue today. I'm using django-tables2==1.2.9 in a Django==1.10 project, where I also utilize django-filters==1.0.1.

Interestingly enough the doubling of rows in my table only occurs if

there are many entries to display
(so if I enable pagination with 25 records per page everything is fine, but if I display the full dataset I get multiple identical entries)
the table is sorted by a column that contains non-strings
(everything is fine if I sort by a column that contains e.g. names, but sorting by columns with dates, ints or bools results in trouble)

Downgrading to 1.2.1 solves the problem, but that is not really an option.
Furthermore the dupes seem to be shown instead of and not in addition to real entries, because the count {{filter.qs.count}} is as it should be.

I hope this helps a little bit with getting to the bottom of the problem, because other than that I really enjoy working with django-tables2.

Thanks for all your work!

n0ctua on 20 Jan 2017

@n0ctua thanks!

Can you have a look at the exact queries executed and maybe share them here?

jieter on 20 Jan 2017

I've just had this problem too and understand why it's happening for me so thought I'd post as I suspect others have it for the same reason. _(Although, I haven't tried with other versions of django-tables2, so I might be wrong as I don't understand why that would stop this happening... Although... actually thinking about it I would hypothesise that this is because in older versions non-paginated tables ran a single query against the db instead of a query for each row – more details on this below...)_

It's because _SQL order by on non-unique fields is not deterministic_ so when combined by top n you get the same top n rows again and again. (Also, for some reason when I create a Table with no pagination it generates a query for _every_ row. I.E. top 1).

I'm not sure what the best solution is here? I think the only way to guarantee this will work is to always have the final order by be on a unique field. Ideally the primary key? Also, might be worth adding some notes to the docs warning about this?

As a separate problem but one that will really reduce how often this happens (and speed up queries for non-paginated tables) is if instead of running a top 1 query for every row in the table would be just running a single query that fetches all the results.

intiocean on 26 Feb 2017

👍2

@intiocean this number of queries for non-paginated tables is quite strange, not sure why this happens, and it should not happen. This test case shows the problem:

def test_single_query_for_non_paginated_table(settings):
    '''
    A non-paginated table should not generate a query for each row, but only
    one query to count the rows and one to fetch the rows.
    '''
    from django.db import connection
    settings.DEBUG = True

    for i in range(10):
        Person.objects.create(first_name='Bob %d' % randint(0, 200), last_name='Builder')

    num_queries_before = len(connection.queries)

    class PersonTable(tables.Table):
        class Meta:
            model = Person
            fields = ('first_name', 'last_name')

    table = PersonTable(Person.objects.all())
    request = build_request('/')
    RequestConfig(request, paginate=False).configure(table)
    table.as_html(request)

    # print '\n'.join(q['sql'] for q in connection.queries)
    assert len(connection.queries) - num_queries_before == 2

jieter on 27 Feb 2017

I have tried, unsuccessfully, to create a test for the paginated case that fails. I don't understand why but everything appears to work as normal in the paginated case... I thought I'd get rows that appear as if duplicated on multiple pages if sorting by a non-unique field.

intiocean on 27 Feb 2017

@intiocean I think I fixed it with 10f5e968bc10552be0ad02ee566513b5d274d5c5

jieter on 27 Feb 2017

@jieter Great! I can see why that would resolve the issue with non-paginated tables but still think that ordering by a non-unique column when the data is spread across different pages could result in the same row appearing on multiple pages due to the top n ... limit x offset y

What do you think?

intiocean on 27 Feb 2017

@wtfrank, @intiocean @n0ctua @op-alex-reid @fliphess I just pushed some commits to master which might fix this. Can you verify?

jieter on 27 Feb 2017

@intiocean If that is the source of this issue, I think we can only fix this issue partially by documenting this behaviour.

jieter on 27 Feb 2017

👍1

I can confirm that this fixes non-paginated tables @jieter. This also improves performance for non-paginated tables as there is now only one query instead of one for each row.

Check:
With 962f502 and sorting by the 'Notes' column -> duplication

With f853078 and sorting by the 'Notes' column -> no duplication 🎉

intiocean on 27 Feb 2017

Any chance you could do a release @jieter?

intiocean on 27 Feb 2017

wohoo, finally fixed!

Yes, I'll release within an hour.

jieter on 27 Feb 2017

🎉4

released 1.4.0

jieter on 27 Feb 2017

Awesome, thanks @jieter!

intiocean on 27 Feb 2017

@jieter and @intiocean thank you both so much for fixing this! Works perfectly now.

n0ctua on 1 Mar 2017

Hi, I have just experienced this issue on 1.21.2 version and Django==2.0.6. Thanks to intiocean 's comment, fixed it with setting additional ordering by 'pk'.