We need to know which queries this request generates and what impact they have on the DB in terms of response time. Then we can check how they can be optimized.
This stems from https://github.com/openfoodfoundation/ofn-install/issues/477.
This should cut the negative performance impact of this page, which seems to be causing serious downtime.
I am not sure we need to do this work. When a super admin loads the products update page, there are dozens (hundreds?) of requests to the server for each page... I think we just need to do #4076
Copying a relevant request log from https://github.com/openfoodfoundation/ofn-install/issues/477#issuecomment-514121571
Started GET "/api/products/bulk_products?page=29;per_page=20;" for 82.132.212.49 at 2019-07-22 15:56:04 +0000
(...)
Completed 200 OK in 64479.5ms (Views: 34634.5ms | ActiveRecord: 29843.7ms)
Notice the ActiveRecord: 29843.7ms. Regardless of concurrency, we can't handle something like this, and it's a symptom that something is really wrong. That is why I added the spike, so that at least we take a look; I'm not suggesting we refactor things straight away.
IMO we are paying the price of being so reactive on these operations issues.
I'm doing the pagination now (it's nearly ready for review). I think the queries are still pretty bad even when they're smaller, so I think it's worth seeing if there are any quick wins (like simple N+1s we can strip out) to improve it.
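To make it concrete, here's a rough sketch of what I mean (not the actual PR; the method name is made up and I'm using plain offset/limit instead of whatever pagination helper we settle on):

def paginated_product_scope
  page     = params.fetch(:page, 1).to_i
  per_page = params.fetch(:per_page, 20).to_i

  Spree::Product
    .where(deleted_at: nil)            # skip soft-deleted products
    .order(:name)
    .offset((page - 1) * per_page)     # only load one page of products
    .limit(per_page)
end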
ok, agreed!
def spike
if quick_wins_available?
fix_it_now!
else
redirect_to_dev_ready_column
end
end
:smile:
Edit: CodeClimate will complain I haven't used a guard clause here...
I created https://github.com/openfoodfoundation/openfoodnetwork/pull/4092 while investigating this.
Here's the outcome of the investigation. Basically, wherever I look I see N+1s, which makes it harder to prioritize... but these are the ones that look relevant to me:
lib/open_food_network/permissions.rb does very nasty things and should be reconsidered, but among these, when passing manage_products it triggers
SELECT enterprises.id FROM "enterprises"
Needless to say, this is a time bomb.
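Just to illustrate the direction I think we need (a sketch only; user.enterprises is an assumption about the association name, not the real permissions API):

# Hypothetical: scope the lookup to the enterprises the user actually manages
# instead of SELECTing every enterprise id in the system.
def managed_enterprise_ids(user)
  user.enterprises.pluck(:id)
end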
I haven't found what triggers the query, but the Rails log lists
SELECT "spree_products"."id" FROM "spree_products" WHERE ("spree_products".deleted_at IS NULL)
which might be manageable now given our numbers, but it could easily get out of hand, especially when the DB server is busy. This is the UK server's EXPLAIN ANALYZE:
openfoodnetwork=> explain analyze SELECT "spree_products"."id" FROM "spree_products" WHERE ("spree_products".deleted_at IS NULL);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Seq Scan on spree_products (cost=0.00..1034.07 rows=10596 width=4) (actual time=0.035..6.249 rows=10601 loops=1)
Filter: (deleted_at IS NULL)
Rows Removed by Filter: 3513
Planning time: 0.673 ms
Execution time: 6.841 ms
I don't like that number of rows and the sequential scan :see_no_evil:
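One possible mitigation, if we keep issuing this query, would be a partial index so Postgres doesn't have to scan the whole table for non-deleted products. This is only a sketch (the migration name is made up, and whether the planner actually picks the index depends on the data):

class AddPartialIndexOnSpreeProductsDeletedAt < ActiveRecord::Migration
  def change
    # Index only the live products, matching the "deleted_at IS NULL" filter.
    add_index :spree_products, :id,
              where: "deleted_at IS NULL",
              name: "index_spree_products_on_id_not_deleted"
  end
end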
None of the many attributes and associations of the Api::Admin::ProductSerializer, including variants and the attributes and associations of their Api::Admin::VariantSerializer, are eager loaded. Nested N+1? M*(N+1) :boom:?
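As a sketch of the kind of fix (the association names here are illustrative, not the serializers' real list), eager loading before serialization would collapse those per-record queries:

# Hypothetical: preload whatever the serializers read so serializing N products
# doesn't fire N (or N*M) extra queries for variants and their associations.
products = Spree::Product
  .where(deleted_at: nil)
  .includes(:supplier, variants: [:images])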
Copied straight from the log
Rendered spree/admin/products/index/_header.html.haml (63.5ms)
Enterprise Load (2.0ms) SELECT "enterprises".* FROM "enterprises" WHERE "enterprises"."is_primary_producer" = 't' ORDER BY name
Might it be triggered by index/_header.html.haml?
What stands out from the query plan below is that sorting by name takes 1071kB of memory (UK server):
openfoodnetwork=> explain analyze SELECT "enterprises".* FROM "enterprises" WHERE "enterprises"."is_primary_producer" = 't' ORDER BY name;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Sort (cost=270.13..273.36 rows=1291 width=1639) (actual time=5.708..5.961 rows=1291 loops=1)
Sort Key: name
Sort Method: quicksort Memory: 1071kB
-> Seq Scan on enterprises (cost=0.00..203.42 rows=1291 width=1639) (actual time=0.023..1.945 rows=1291 loops=1)
Filter: is_primary_producer
Rows Removed by Filter: 151
Planning time: 17.614 ms
Execution time: 6.206 ms
In any case, this broad query could become even more dangerous as the number of enterprises grows. We'd better fix it soon.
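A quick, low-risk improvement might be to stop dragging every column through the sort and only select what the header actually needs (a sketch; I'm assuming the partial only needs id and name):

# Hypothetical: ~1.6kB-wide rows are being sorted; selecting two columns keeps
# the sort (and the rendered payload) small even as enterprises grow.
producers = Enterprise
  .where(is_primary_producer: true)
  .order(:name)
  .select(:id, :name)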
There are more like these, especially in the /admin/products request, but these seem the worst and require attention.
Removing the N+1s from the serializers might be straightforward (I haven't checked), but the other issues are too scary to postpone. Shall I create issues for all of them but with lower severity? @luisramos0 @Matt-Yorkley @mkllnk @kristinalim ?
Yes, individual issues for each of the things you found.
Issues created. Feel free to question their severity.