Django-tables2: Duplicated table rows after sorting by column header

Created on 3 Aug 2016  Â·  32Comments  Â·  Source: jieter/django-tables2

I'm experiencing an issue where some rows are rendered multiple times in place of other rows. The table in question used to render perfectly well on an earlier versions of django and django-tables2. I upgraded both of them at the same time to the latest version of each (django 1.10, django-tables2 1.2.4) then this issue started happening.

See this image here for a demonstration: http://imgur.com/a/pPRT8

When I load a table without choosing a sort order by the column, I don't see this issue. Once I click on a column header to change the sort order then rows are duplicated.

So far I've identified that the data passed into the BoundRows object is correct, with no duplicated items. However iterating through the bound rows returns duplicated entries. So it looks like something may be changing the data object while the iteration is taking place.

This issue is happening on tables of length ~150, and I haven't (yet) been able to reproduce this issue with a smaller test dataset.

This snippet shows the test code I have used to narrow down the cause of the issue.

--- a/django_tables2/rows.py
+++ b/django_tables2/rows.py
@@ -8,6 +8,7 @@ from django.utils import six
 from .columns.linkcolumn import BaseLinkColumn
 from .utils import A, AttributeDict, call_with_appropriate, computed_values

+import sys

 class BoundRow(object):
     """
@@ -187,9 +188,17 @@ class BoundRows(object):
     def __init__(self, data, table):
         self.data = data
         self.table = table
+        for d in data:
+            print >>sys.stderr, d
+
+        print >>sys.stderr, "end data"
+

     def __iter__(self):
         for record in self.data:
+            print >>sys.stderr, "__iter__", record
             yield BoundRow(record, table=self.table)
bug

Most helpful comment

wohoo, finally fixed!

Yes, I'll release within an hour.

All 32 comments

Crystallising the queryset via list() stops the issue happening:

+++ b/django_tables2/rows.py
@@ -8,6 +8,7 @@ from django.utils import six
 from .columns.linkcolumn import BaseLinkColumn
 from .utils import A, AttributeDict, call_with_appropriate, computed_values

+import sys

 class BoundRow(object):
     """
@@ -187,9 +193,26 @@ class BoundRows(object):
     def __init__(self, data, table):
         self.data = data
         self.table = table


     def __iter__(self):
+        print >>sys.stderr, type(list(self.data.data))

Thanks for reporting, casting to list is not acceptable as a solution since that won't work with bigger datasets.

Can you try to downgrade your django-tables2 version to make sure it's not a problem with updating your Django version?

Yeah definitely not a solution - but it may be indicative of the cause. The other reason casting to list isn't a suitable solution is because it doesn't always seem to be a queryset holding the data anyway, in which case self.data.data won't exist.

@wtfrank, I see, but it would really help if you could provide a reproducible test case. Did you try downgrading django-tables2 to the previous version you used?

I think i'm running into the same issue: In my table, one row is missing, and another one is a duplicate... For me this started with version 1.2.2.
Downgrading to 1.2.1 "fixxes" the problem for me.... Is there any information i can provide to make the search easier?

A minimal test case would be really nice to have. We can then run git bisect to find the bad commit.

I'm looking into a test case at the moment, but strangely enough i cannot reproduce the issue on my test environment ...

Locally (using ./manage runserver using sqlite) all is well, while in production (running uwsgi and mysql) the problem of dupes exists... (both environments run on equal versions)

The queryset does not contain any dupes.... But at the time the table.html is rendered, the queryset is missing one object and another one is duplicate...

Getting back to you soon with a test case hopefully :)

The issue occurs also in 1.2.6 version. I think it is sth with #329 or #330 but so far I can not find way to fix it.

Those are only applicable to 1.2.6, so I don't expect them to be the source of this bug.

Sorry, I'm not able to provide a proper reproducible case, but I've just got this issue (or, at least, I believe it's this issue) and can tell that it certainly had happened between 1.2.1 and 1.2.2.

Whenever I downgrade to 1.2.1, everything looks just fine. When I upgrade to 1.2.2 or above (I've started with latest 1.2.6, of course), I consistently lose the first row (out of three I have in DB) and get a copy of another one instead of it. E.g., when on 1.2.1 I have:

| date | hits | ... |
| --- | --- | --- |
| 2016-10-08 | 123 | ... |
| 2016-10-07 | 321 | ... |
| 2016-10-06 | 0 | ... |

On 1.2.2-1.2.6 I consistently get this instead:

| date | hits | ... |
| --- | --- | --- |
| 2016-10-06 | 0 | ... |
| 2016-10-07 | 321 | ... |
| 2016-10-06 | 0 | ... |

I'm using Django 1.9.9 on Python 2.7.12, project was started using pydanny/cookiecutter-django (although heavily modified after that), and my code looks like this:

class DailySummaryTable(tables.Table):
    class Meta:
        model = DailySummary
        attrs = {"class": "paleblue"}
        empty_text = _("No stats yet")
        fields = ("date", "hits", ...long boring list...)
        # order_by = ("-date",)
        # orderable = False

Model's Meta only has verbose_name{,_plural} stuff, and view is a simple DetailView that looks like this:

class SourceDetailView(LoginRequiredMixin, DetailView):
    model = Source
    ...
    def get_context_data(self, **kwargs):
        ctx = super(...)
        ctx["stats"] = DailySummaryTable(DailySummary.objects.filter(source=self.object))
        return ctx

And there are three rows in DB (PostgreSQL 9.6), all having the same source_id (I have only one), so they all match the query. Strangely, If I replace .filter(source=...) with .all() the issue seems to disappear.

That's all I was able to figure out. Hope it helps. I guess, I'll just stick with 1.2.1 for now :)

Thanks for showing your wanderings, I'll look into this issue later.

I just ran into the same issue, crystallising the queryset via list() is not a viable solution on our side neither.

@op-alex-reid can mlyou try to turn your use case into a minimal reproducible test case?

I'm having the same problem (Django==1.9, django-tables==1.2.3) with the following code:

class UserPlayerTable(tables.Table):
    actions = tables.LinkColumn('player:my_players_detail', args=[A('slug')],
                                text=_('View / Edit'),
                                verbose_name=_('View / Edit'), empty_values=())
    embed = tables.LinkColumn('player:my_players_embed', args=[A('slug')],
                                text=_('View / Embed now'),
                                verbose_name=_('View / Embed now'), empty_values=())

    class Meta:
        template = 'tables/table.html'
        model = Player
        fields = ('created', 'slug', 'actions', 'embed')
        attrs = {"class": "changeset"}
        order_by = ['-created']
        orderable = False

and

UserPlayerTable(Player.objects.filter(user=context['impersonate_user']))

Interestingly, when I set orderable = True, the problem does not occur.

In my case, crystallising the queryset via list() _is_ an option (and the fix for now), since there are only at most 5 rows in the table.

Thanks for the example!

I tried reproducing this using this code:

class Player(models.Model):
    person = models.ForeignKey(Person)
    created = models.DateTimeField(auto_now_add=True)
    score = models.PositiveIntegerField()

def test_issue_361(per_page=5):

    bob = Person.objects.create(first_name='Bob', last_name='Builder')
    eve = Person.objects.create(first_name='Eve', last_name='Dropper')

    for i in range(10):
        Player.objects.create(person=bob, score=randint(0, 200))
        Player.objects.create(person=eve, score=randint(200, 400))
        Player.objects.create(person=bob, score=5)
        Player.objects.create(person=eve, score=5)

    class UserPlayerTable(tables.Table):
        class Meta:
            model = Player
            fields = ('person.name', 'score',)
            order_by = ['-score']
            orderable = False

    queryset = Player.objects.filter(score=5)

    table = UserPlayerTable(queryset)
    RequestConfig(request, paginate={'per_page': per_page}).configure(table)
    html = table.as_html(request)

    # count the number of rows, subtract one for the header
    assert (html.count('<tr') - 1) == per_page

but this doesn't yield duplicated rows...

I encountered the same issue today. I'm using django-tables2==1.2.9 in a Django==1.10 project, where I also utilize django-filters==1.0.1.

Interestingly enough the doubling of rows in my table only occurs if

  1. there are many entries to display
    (so if I enable pagination with 25 records per page everything is fine, but if I display the full dataset I get multiple identical entries)

  2. the table is sorted by a column that contains non-strings
    (everything is fine if I sort by a column that contains e.g. names, but sorting by columns with dates, ints or bools results in trouble)

Downgrading to 1.2.1 solves the problem, but that is not really an option.
Furthermore the dupes seem to be shown instead of and not in addition to real entries, because the count {{filter.qs.count}} is as it should be.

I hope this helps a little bit with getting to the bottom of the problem, because other than that I really enjoy working with django-tables2.

Thanks for all your work!

@n0ctua thanks!

Can you have a look at the exact queries executed and maybe share them here?

I've just had this problem too and understand why it's happening for me so thought I'd post as I suspect others have it for the same reason. _(Although, I haven't tried with other versions of django-tables2, so I might be wrong as I don't understand why that would stop this happening... Although... actually thinking about it I would hypothesise that this is because in older versions non-paginated tables ran a single query against the db instead of a query for each row – more details on this below...)_

It's because _SQL order by on non-unique fields is not deterministic_ so when combined by top n you get the same top n rows again and again. (Also, for some reason when I create a Table with no pagination it generates a query for _every_ row. I.E. top 1).

I'm not sure what the best solution is here? I think the only way to guarantee this will work is to always have the final order by be on a unique field. Ideally the primary key? Also, might be worth adding some notes to the docs warning about this?

As a separate problem but one that will really reduce how often this happens (and speed up queries for non-paginated tables) is if instead of running a top 1 query for every row in the table would be just running a single query that fetches all the results.

@intiocean this number of queries for non-paginated tables is quite strange, not sure why this happens, and it should not happen. This test case shows the problem:

def test_single_query_for_non_paginated_table(settings):
    '''
    A non-paginated table should not generate a query for each row, but only
    one query to count the rows and one to fetch the rows.
    '''
    from django.db import connection
    settings.DEBUG = True

    for i in range(10):
        Person.objects.create(first_name='Bob %d' % randint(0, 200), last_name='Builder')

    num_queries_before = len(connection.queries)

    class PersonTable(tables.Table):
        class Meta:
            model = Person
            fields = ('first_name', 'last_name')

    table = PersonTable(Person.objects.all())
    request = build_request('/')
    RequestConfig(request, paginate=False).configure(table)
    table.as_html(request)

    # print '\n'.join(q['sql'] for q in connection.queries)
    assert len(connection.queries) - num_queries_before == 2

I have tried, unsuccessfully, to create a test for the paginated case that fails. I don't understand why but everything appears to work as normal in the paginated case... I thought I'd get rows that appear as if duplicated on multiple pages if sorting by a non-unique field.

@intiocean I think I fixed it with 10f5e968bc10552be0ad02ee566513b5d274d5c5

@jieter Great! I can see why that would resolve the issue with non-paginated tables but still think that ordering by a non-unique column when the data is spread across different pages could result in the same row appearing on multiple pages due to the top n ... limit x offset y

What do you think?

Similar to: http://stackoverflow.com/questions/31736700/duplicate-elements-in-django-paginate-after-order-by-call

@wtfrank, @intiocean @n0ctua @op-alex-reid @fliphess I just pushed some commits to master which might fix this. Can you verify?

@intiocean If that is the source of this issue, I think we can only fix this issue partially by documenting this behaviour.

I can confirm that this fixes non-paginated tables @jieter. This also improves performance for non-paginated tables as there is now only one query instead of one for each row.

Check:
With 962f502 and sorting by the 'Notes' column -> duplication
image

With f853078 and sorting by the 'Notes' column -> no duplication 🎉
image

Any chance you could do a release @jieter?

wohoo, finally fixed!

Yes, I'll release within an hour.

released 1.4.0

Awesome, thanks @jieter!

@jieter and @intiocean thank you both so much for fixing this! Works perfectly now.

Hi, I have just experienced this issue on 1.21.2 version and Django==2.0.6. Thanks to intiocean 's comment, fixed it with setting additional ordering by 'pk'.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

foldedpaper picture foldedpaper  Â·  6Comments

mpasternak picture mpasternak  Â·  9Comments

brianmay picture brianmay  Â·  6Comments

blite picture blite  Â·  3Comments

applegrew picture applegrew  Â·  17Comments