Sylius version affected: 1.3.4
Description
I have ~300k products, with 1 variant each. Most of the times I can't even load admin -> products page, because my php-fpm gw timeouts after 60s. I've increased it since then.
From what I see MySQL is not using right index, and i'm not sure we can even force it with ORM.
Steps to reproduce
Here is an example of original query from admin/products/ page:
SELECT s0_.code AS code_0, s0_.created_at AS created_at_1, s0_.updated_at AS updated_at_2, s0_.enabled AS enabled_3, s0_.id AS id_4, s0_.variant_selection_method AS variant_selection_method_5, s0_.average_rating AS average_rating_6, s0_.external_id AS external_id_7, s0_.mfr_part_number AS mfr_part_number_8, s0_.alternate_part_number AS alternate_part_number_9, s0_.barcode AS barcode_10, s0_.product_line AS product_line_11, s0_.rawData AS rawData_12, s0_.rawPiesData AS rawPiesData_13, s1_.name AS name_14, s1_.slug AS slug_15, s1_.description AS description_16, s1_.meta_keywords AS meta_keywords_17, s1_.meta_description AS meta_description_18, s1_.id AS id_19, s1_.short_description AS short_description_20, s1_.locale AS locale_21, s0_.main_taxon_id AS main_taxon_id_22, s1_.translatable_id AS translatable_id_23 FROM sylius_product s0_ INNER JOIN sylius_product_translation s1_ ON s0_.id = s1_.translatable_id AND (s1_.locale = 'en') ORDER BY s0_.code ASC LIMIT 10
Took: 60 seconds
Explain:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE s1_ NULL ref sylius_product_translation_uniq_trans,UNIQ_105A9084180C698989D9B62,IDX_105A9082C2AC5D3 UNIQ_105A9084180C698989D9B62 767 const 149228 100.00 Using temporary; Using filesort
1 SIMPLE s0_ NULL eq_ref PRIMARY PRIMARY 4 tps_test24.s1_.translatable_id 1 100.00 NULL
and if we force UNIQ_677B9B7477153098 index:
SELECT s0_.code AS code_0, s0_.created_at AS created_at_1, s0_.updated_at AS updated_at_2, s0_.enabled AS enabled_3, s0_.id AS id_4, s0_.variant_selection_method AS variant_selection_method_5, s0_.average_rating AS average_rating_6, s0_.external_id AS external_id_7, s0_.mfr_part_number AS mfr_part_number_8, s0_.alternate_part_number AS alternate_part_number_9, s0_.barcode AS barcode_10, s0_.product_line AS product_line_11, s0_.rawData AS rawData_12, s0_.rawPiesData AS rawPiesData_13, s1_.name AS name_14, s1_.slug AS slug_15, s1_.description AS description_16, s1_.meta_keywords AS meta_keywords_17, s1_.meta_description AS meta_description_18, s1_.id AS id_19, s1_.short_description AS short_description_20, s1_.locale AS locale_21, s0_.main_taxon_id AS main_taxon_id_22, s1_.translatable_id AS translatable_id_23 FROM sylius_product s0_ FORCE INDEX (UNIQ_677B9B7477153098) INNER JOIN sylius_product_translation s1_ ON s0_.id = s1_.translatable_id AND (s1_.locale = 'en') ORDER BY s0_.code ASC LIMIT 10
Took: 2 ms
There is also another query on same page that takes 18s without force index and 800ms with.
SELECT count(DISTINCT s0_.id) AS sclr_0 FROM sylius_product s0_ INNER JOIN sylius_product_translation s1_ ON s0_.id = s1_.translatable_id AND (s1_.locale = 'en');
Maybe we can at least document it in some Performance docs section? /cc @CoderMaggie
I would like to know if we can fix it somehow with Doctrine?
My 2nd option was to start thinking about moving to pgsql.
In case Sylius wants to compete with other e-commerce solutions, I think it's a must that big amounts of data should be workable. Might not be easy, but certainly an highly required thing.
This can be indeed solved with doctrine but you might have to override some repository methods. Have a look here: gist.github.com/arnaud-lb/2704404
@mamazu i would really appreciate if you can check if it's doable. couldn't figure it out myself yet.
I'll try to get to it this evening.
Hm, it doesn't look as easy as it seems. I will try to override the QueryBuilder in the `Product Repository and see if that helps.
So, I have added the IndexWalker try to check out this commit 68a97ca81f2d5918fbdb8efd569cd36c2185a655 and see if that helps.
@mamazu yup, works well! Page load time went down from 2 minutes to 2 seconds 🏆
One thing I want to ask/mention - that in the future if we will have multiple JOINs, and we use USE INDEX, it may try to apply it to the wrong place, right?
:( Just found a place where "USE INDEX" makes it worse..
on page _/admin/products/taxon/{id}_
Original query (without _use index_):
SELECT s0_.code AS code_0, s0_.created_at AS created_at_1, s0_.updated_at AS updated_at_2, s0_.enabled AS enabled_3, s0_.id AS id_4, s0_.variant_selection_method AS variant_selection_method_5, s0_.average_rating AS average_rating_6, s0_.external_id AS external_id_7, s0_.mfr_part_number AS mfr_part_number_8, s0_.alternate_part_number AS alternate_part_number_9, s0_.barcode AS barcode_10, s0_.product_line AS product_line_11, s0_.rawData AS rawData_12, s0_.rawPiesData AS rawPiesData_13, s1_.name AS name_14, s1_.slug AS slug_15, s1_.description AS description_16, s1_.meta_keywords AS meta_keywords_17, s1_.meta_description AS meta_description_18, s1_.id AS id_19, s1_.short_description AS short_description_20, s1_.locale AS locale_21, s0_.main_taxon_id AS main_taxon_id_22, s1_.translatable_id AS translatable_id_23 FROM sylius_product s0_ INNER JOIN sylius_product_translation s1_ ON s0_.id = s1_.translatable_id AND (s1_.locale = 'en') INNER JOIN sylius_product_taxon s2_ ON s0_.id = s2_.product_id WHERE s2_.taxon_id = '434' ORDER BY s2_.position ASC LIMIT 10;
85 ms
and with _USE INDEX_
SELECT s0_.code AS code_0, s0_.created_at AS created_at_1, s0_.updated_at AS updated_at_2, s0_.enabled AS enabled_3, s0_.id AS id_4, s0_.variant_selection_method AS variant_selection_method_5, s0_.average_rating AS average_rating_6, s0_.external_id AS external_id_7, s0_.mfr_part_number AS mfr_part_number_8, s0_.alternate_part_number AS alternate_part_number_9, s0_.barcode AS barcode_10, s0_.product_line AS product_line_11, s0_.rawData AS rawData_12, s0_.rawPiesData AS rawPiesData_13, s1_.name AS name_14, s1_.slug AS slug_15, s1_.description AS description_16, s1_.meta_keywords AS meta_keywords_17, s1_.meta_description AS meta_description_18, s1_.id AS id_19, s1_.short_description AS short_description_20, s1_.locale AS locale_21, s0_.main_taxon_id AS main_taxon_id_22, s1_.translatable_id AS translatable_id_23 FROM sylius_product s0_ USE INDEX (UNIQ_677B9B7477153098) INNER JOIN sylius_product_translation s1_ ON s0_.id = s1_.translatable_id AND (s1_.locale = 'en') INNER JOIN sylius_product_taxon s2_ ON s0_.id = s2_.product_id WHERE s2_.taxon_id = '434' ORDER BY s2_.position ASC LIMIT 10;
11.1 sec
So... while making one query thousands times better, it makes another query much worse.
The current implementation of it only allows to use the FORCE INDEX for the FROM clause of the statement but I am sure this could be varied with a much more thought through solution(this is more like a proof of concept).
I noticed the same thing. I added the force index to the createListQueryBuilder of the product. This can include multiple things and sometimes a full table scan is faster. We probably need to fine tune this a little.
@mamazu i'll do couple more tests, but i think if we remove it from running on taxon pages, it's ready for PR, no? Check my comment -
https://github.com/Sylius/Sylius/commit/68a97ca81f2d5918fbdb8efd569cd36c2185a655#r31792661
Are there any opinions on that from @Zales0123?
@mamazu i did more testing and adding _forceIndex_ only on products main page by moving _forceIndex_ to _if_ works really well.
public function createListQueryBuilder(string $locale, $taxonId = null): QueryBuilder
{
$queryBuilder = $this->createQueryBuilder('o')
->addSelect('translation')
->innerJoin('o.translations', 'translation', 'WITH', 'translation.locale = :locale')
->setParameter('locale', $locale)
;
if (null === $taxonId) {
$queryBuilder
->forceIndex('UNIQ_677B9B7477153098')
;
}
if (null !== $taxonId) {
$queryBuilder
->innerJoin('o.productTaxons', 'productTaxon')
->andWhere('productTaxon.taxon = :taxonId')
->setParameter('taxonId', $taxonId)
;
}
return $queryBuilder;
}
Do you wanna create a PR?
You can pull request the final version with your changes as well.
@Zales0123 any feedback on the issue or PR?
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
Most helpful comment
In case Sylius wants to compete with other e-commerce solutions, I think it's a must that big amounts of data should be workable. Might not be easy, but certainly an highly required thing.