Netbox: Device_count is not updated on device roles if a device is removed/add to the role

Created on 5 Nov 2019 · 15Comments · Source: netbox-community/netbox

Environment

Python version: v3.6.8
NetBox version: v2.6.5

Steps to Reproduce

Create a new device role
Add an existing device to it
Check the api response on device_count
Remove the device from it
Check the api response on device_count

Expected Behavior

I expect the device_count to be correctly updated

Observed Behavior

When adding an existing device to the role, the role's device_count stays at null.
When a new device is created and added to the role directly, the device_count is still null.

accepted bug

Source

soer7022

👍1

Most helpful comment

FYI I am looking into what it will take to solve this upstream in cacheops by supporting subqueries natively.

lampwins on 12 Feb 2020

👍3

All 15 comments

I'm not able to reproduce this using the development server alone. It smells like a caching problem.

jeremystretch on 6 Nov 2019

This likely stems from the use of subqueries to count the related devices and VMs. @lampwins you've done some work around this; can you take a look?

jeremystretch on 6 Nov 2019

Ran into this issue with /dcim/regions. Is disabling caching on that count query an option or would you rather invalidate the count on updates to the children?

hSaria on 11 Dec 2019

@jeremystretch this is related to caching and the use of subqueries. Basically cacheops just doesn't support them (see 7 here)

There are a couple of ways we can work around this but want to know your thoughts?

lampwins on 29 Jan 2020

FYI I am looking into what it will take to solve this upstream in cacheops by supporting subqueries natively.

lampwins on 12 Feb 2020

👍3

After looking into it, implementing support for subqueries upstream appears pretty involved and over my head, so I am now investigating doing this with annotations.

lampwins on 15 May 2020

Actually, the comment on subqueries "one can just break queries into simpler ones, which cache better" made me think. Usually subqueries are used to make things faster by having fewer separate queries, but with cacheops having several separately cacheable queries may make more sense.

Something to test and play with…

steffann on 23 Jun 2020

I tried removing the subqueries

class DeviceRoleViewSet(ModelViewSet):
    queryset = DeviceRole.objects.all()
    ...

And adding the counts as simple properties:

class DeviceRole(ChangeLoggedModel):
    ...

    @property
    def device_count(self):
        return self.devices.count()

    @property
    def virtualmachine_count(self):
        return self.virtual_machines.count()

And that actually works well. The count queries are cached, and refreshed when necessary. The upside is that they are even available outside the API code.

@lampwins Shall I implement this for all subqueries? I'd be happy to take this issue.

steffann on 23 Jun 2020

The alternative is to make one query that returns a {role_id: count} mapping, and use that in the property. That would be one slightly bigger query that caches for all the devices roles, instead of cacheing each count by itself.

steffann on 23 Jun 2020

@steffann this is actually the way we used to do it a long time ago. I am not apposed to it, in light of a unified internal API and clear operability with caching. A downside though is it really clutters the models.

@jeremystretch's call though.

lampwins on 23 Jun 2020

…and the best way to get a {role_id: count} mapping is to use annotations, which brings us back to @lampwins' solution :)

Tested it, works!

class DeviceRoleViewSet(ModelViewSet):
    queryset = DeviceRole.objects.annotate(
        device_count=Count('devices'),
        virtualmachine_count=Count('virtual_machines')
    )

Which makes me wonder why we didn't use a simple Count() in the first place. @jeremystretch do you remember?

For a more unified internal API we might even move those annotations to the Model's Manager so they are available everywhere. It may also help with the cacheing to use the same query everywhere.

steffann on 23 Jun 2020

Another possible implementation:

    @property
    def device_count(self):
        device_count_map = dict(Device.objects
                                .order_by('device_role')
                                .values_list('device_role')
                                .annotate(Count('pk')))
        return device_count_map.get(self.pk)

    @property
    def virtualmachine_count(self):
        from virtualization.models import VirtualMachine
        virtualmachine_count_map = dict(VirtualMachine.objects
                                        .order_by('role')
                                        .values_list('role')
                                        .annotate(Count('pk')))
        return virtualmachine_count_map.get(self.pk)

It looks very inefficient (and would be without cacheops) but because the query is constant it is cached efficiently.

I think it will be up to @jeremystretch to choose :)

steffann on 23 Jun 2020

    @property
    def device_count(self):
        return self.devices.count()

This approach doesn't scale. Each call to device_count() on an object in the list results in a new SQL query. So if you have a list of 1000 objects, that's 1000 queries. If you have two such fields per object, that's 2000 queries.

class DeviceRoleViewSet(ModelViewSet):
    queryset = DeviceRole.objects.annotate(
        device_count=Count('devices'),
        virtualmachine_count=Count('virtual_machines')
    )

Using Count() is more efficient, however due to its use of GROUP BY it does not work when we need to count multiple relations: The reported count for both relations is the product of both. For example, if you have a DeviceRole assigned to ten devices and ten virtual machines, the count for each will be reported as 100.

We can work around this by setting distinct=True, but that absolutely destroys performance to the point where the first proposal above may actually be preferable.

That leaves us with subqueries. Subqueries work great, but as this issue points out, django-cacheops does not support invalidation for them. We actually use subqueries in many places today, so the scope of this bug is much greater than just device roles.

jeremystretch on 21 Jul 2020

Blocked by Suor/django-cacheops#365

jeremystretch on 4 Aug 2020

django-cacheops 5.1 has been released

lampwins on 25 Oct 2020

Was this page helpful?

0 / 5 - 0 ratings