Magento2: Catalog Category Indexing takes very long on MariaDB 10.3 with many products

Created on 21 Oct 2019  路  25Comments  路  Source: magento/magento2

Description (*)

I know MariaDB 10.3 is not yet supported, as per dev docs, so I add this as feature requests, while it will manifest as a bug as soon as we support MariaDB 10.3

Magento should remove the quotes in the SELECT entity_id IN statement when creating indexes. The entity_ids are not strings, and this seems to confuse the optimizer.

Steps to reproduce:

  1. Change value for simple products to 50000 for example (setup/performance-toolkit/profiles/ce/small.xml);
  2. Run in console: bin/magento setup:perf:generate-fixtures /var/www/html/magento24/setup/performance-toolkit/profiles/ce/small.xml for example

Expected behavior (*)

Catalog Category Indexing should run quickly, on MariaDB 10.1 and on 10.3.18

Benefits

Shop ownes can use MariaDB 10.3.18

Additional information

See this answer on StackExchange: https://magento.stackexchange.com/a/293345/81

I took the full query which is run several times (for each store view) on the catalog category products indexer and extracted the select part

SELECT 2 AS `category_id`, `cp`.`entity_id` AS `product_id`, IF(ccp.product_id IS NOT NULL, ccp.position, 0) AS `position`, IF(ccp.product_id IS NOT NULL, 1, 0) AS `is_parent`, 2 AS `store_id`, IFNULL(cpvs.value, cpvd.value) AS `visibility` FROM `catalog_product_entity` AS `cp`
 INNER JOIN `catalog_product_website` AS `cpw` ON cpw.product_id = cp.entity_id
 INNER JOIN `catalog_product_entity_int` AS `cpsd` ON cpsd.entity_id = cp.entity_id AND cpsd.store_id = 0 AND cpsd.attribute_id = 97
 LEFT JOIN `catalog_product_entity_int` AS `cpss` ON cpss.entity_id = cp.entity_id AND cpss.attribute_id = cpsd.attribute_id AND cpss.store_id = 2
 INNER JOIN `catalog_product_entity_int` AS `cpvd` ON cpvd.entity_id = cp.entity_id AND cpvd.store_id = 0 AND cpvd.attribute_id = 99
 LEFT JOIN `catalog_product_entity_int` AS `cpvs` ON cpvs.entity_id = cp.entity_id AND cpvs.attribute_id = cpvd.attribute_id  AND cpvs.store_id = 2
 LEFT JOIN `catalog_category_product` AS `ccp` ON ccp.product_id = cp.entity_id WHERE (cpw.website_id = '1') AND (IFNULL(cpss.value, cpsd.value) = 1) AND           (IFNULL(cpvs.value, cpvd.value) IN (2, 3, 4)) AND (cp.entity_id IN ('107280', '107281', '107282', '107283', '107284'
    [... around 70 K entity IDs ...]
    187965', '187966')) GROUP BY `cp`.`entity_id`;

This query alone runs 8 minutes on Maria DB 10.3

When I simply remove all the quotes via s/'//g in vi in the line with the "IN" statement I get

SELECT 2 AS `category_id`, `cp`.`entity_id` AS `product_id`, IF(ccp.product_id IS NOT NULL, ccp.position, 0) AS `position`, IF(ccp.product_id IS NOT NULL, 1, 0) AS `is_parent`, 2 AS `store_id`, IFNULL(cpvs.value, cpvd.value) AS `visibility` FROM `catalog_product_entity` AS `cp`
 INNER JOIN `catalog_product_website` AS `cpw` ON cpw.product_id = cp.entity_id
 INNER JOIN `catalog_product_entity_int` AS `cpsd` ON cpsd.entity_id = cp.entity_id AND cpsd.store_id = 0 AND cpsd.attribute_id = 97
 LEFT JOIN `catalog_product_entity_int` AS `cpss` ON cpss.entity_id = cp.entity_id AND cpss.attribute_id = cpsd.attribute_id AND cpss.store_id = 2
 INNER JOIN `catalog_product_entity_int` AS `cpvd` ON cpvd.entity_id = cp.entity_id AND cpvd.store_id = 0 AND cpvd.attribute_id = 99
 LEFT JOIN `catalog_product_entity_int` AS `cpvs` ON cpvs.entity_id = cp.entity_id AND cpvs.attribute_id = cpvd.attribute_id  AND cpvs.store_id = 2
 LEFT JOIN `catalog_category_product` AS `ccp` ON ccp.product_id = cp.entity_id WHERE (cpw.website_id = '1') AND (IFNULL(cpss.value, cpsd.value) = 1) AND           (IFNULL(cpvs.value, cpvd.value) IN (2, 3, 4)) AND (cp.entity_id IN (107280, 107281, 107282, 107283, 107284
    [... around 70 K entity IDs ...]
    187965, 187966)) GROUP BY `cp`.`entity_id`;

Next steps

I will try to provide a patch.

Not Required Indexer Fixed in 2.4.x Confirmed Format is valid Ready for Work P2 ready for dev Reproduced on 2.4.x S1

Most helpful comment

set in_predicate_conversion_threshold = 4294967295;
set global in_predicate_conversion_threshold = 4294967295;

seems also to help, as a workaround.

See: https://jira.mariadb.org/browse/MDEV-20871?focusedCommentId=138523&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-138523

There might be a fix in MariaDB 10.3.21

All 25 comments

Hi @amenk. Thank you for your report.
To help us process this issue please make sure that you provided the following information:

  • [ ] Summary of the issue
  • [ ] Information on your environment
  • [ ] Steps to reproduce
  • [ ] Expected and actual results

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.3-develop instance - upcoming 2.3.x release

For more details, please, review the Magento Contributor Assistant documentation.

@amenk do you confirm that you were able to reproduce the issue on vanilla Magento instance following steps to reproduce?

  • [ ] yes
  • [ ] no

Not able to reproduce, because it's a feature requests for a recent MariaDB 10.3 which is not yet supported by Magento 2.3

I also created an issue in the MariaDB tracker: https://jira.mariadb.org/browse/MDEV-20871

vendor/magento/module-catalog/Model/Indexer/Category/Product/Action/Full.php:287

    foreach ($batchQueries as $query) {
        $this->connection->delete($this->tableMaintainer->getMainTmpTable((int)$store->getId()));
        $entityIds = $this->connection->fetchCol($query);
        foreach($entityIds as &$val) { // patch
            $val = (integer) $val;     // patch
        }                              // patch
        $resultSelect = clone $basicSelect;
        $resultSelect->where($whereCondition, $entityIds);

Or should we put it into \Magento\Framework\DB\Select::where and convert is_numeric values to numbers ?

Digging more down into the function leads to Zend Framework 1 code :(

I tried to patch
\Magento\Framework\DB\Select::where

public function where($cond, $value = null, $type = null)
{
    // ensure that integers are passed as integer,
    // to avoid negative effects on performance
    if (is_array($value)) {
        foreach($value as &$val) {
            $val = $this->castNumeric($val);
        }
    } else {
        $value = $this->castNumeric($value);
    }

    if ($value === null && $type === null) {
        $value = '';
    } elseif ($type == self::TYPE_CONDITION) {
        $type = null;
    }
    if (is_array($value)) {
        $cond = $this->getConnection()->quoteInto($cond, $value);
        $value = null;
    }
    return parent::where($cond, $value, $type);
}

private function castNumeric($val) {
    if (is_numeric($val)) {
        return $val + 0;
    }
    return $val;
}

But this does not catch the situation when you edit a categories anchor property -> the update of the url_rewrite index still uses quotes in strings

I will inject the patch into \Magento\Framework\DB\Adapter\Pdo\Mysql::quoteInto

We are noticing this problem too with MariaDB 10.3.18 in Magento 2.3.2. We notice that most INSERT queries of the indexer take waaaay longer than expected and especially the following query takes up to 8 hours!

INSERT INTO `catalog_category_product_index_store1_tmp` (`category_id`, `product_id`, `position`, ...

We cannot even do a indexer:reindex because once it arrives at the categories index, the following error will happen after 8 hours:

Product Categories indexer process unknown error:
SQLSTATE[42S02]: Base table or view not found: 1146 Table 'catalog_category_product_index_store1_tmp' doesn't exist, query was: DELETE FROM `catalog_category_product_index_store1_tmp`

So apparently it takes so long, that the temporary table "vanishes" in between (?).

@fritzmg Feel free to try my patch https://github.com/magento/magento2/pull/25212

@fritzmg Which MariaDB version were you on before it became so slow?

Before it was actually MySQL 5.7.27. Since this is happening in a live system (hoster had to switch DB server due to other troubles) we can't unfortunately test your fix (yet) and thus our hoster is in the process of downgrading the MariaDB version again.

What we did was, to locally setup 10.3.18 in docker to reproduce the bug and test the fix.

Yes, I've seen that on Stackoverflow :). When we get around to it, we will try the same.

@amenk can you try it with MariaDB 10.3.17 also, i got a case where it was in 10.3.18 slow and in 10.3.17 fast. My currently suspect is a regession because of https://jira.mariadb.org/browse/MDEV-20109, see the other Reports linked there.

We are noticing similar problems with MariaDB 10.2.27 - just not with the queries of the indexer. Certain selects take up to 60 seconds while they should only last milliseconds.

I'm also running mariadb 10.3.18 and having these issues. Indexers just never stop being triggered now even after tripling our hardware speed.

Attempting to clone a server then upgrade to 10.4 on it and test indexing on it.

10.4 latest made zero difference.
@amenk patch reduced it from 45 mins to 40 seconds for the query he posted.
Removing IN for entity_id altogether reduces query time to 0.0013 seconds

M2 mview and indexer system are once again trash and once again costing my client money and me time.

How did you remove "IN" for entity_id altogether? Did you still select the same rows? Via a JOIN?

Please maintain a constructive tone ;-)

I just manually changed the query for testing, I'm researching now if the entity_id's are do to a batch or handed off from a prior query. If it's just the mview batching, then it's 30,000 faster to not batch than to do it.

It does select all products and then batch them for a root category full reindex.
Here's my result after removing batching entirely for that:
Category Products index has been rebuilt successfully in 00:00:14

Before it was 5 hours.

Will test some more then release a module to override default m2 behavior. Honestly, should probably just write a module to overwrite all indexers at some point they are terrible I have worked on this project nearly 3 years and this is a constant source of suffering. Many other ecommerce systems don't have something like this for good reason, just more examples of overengineering to make stuff worse and more costly to develop, run, and maintain. I can't keep a positive tone because I have nothing positive to say about it, and I doubt you would either if your quality of life and that of your clients was worsened by poorly designed magento2. If we could switch off it we would but sunk cost is a lot at this point and it's as I said a few years work.

https://github.com/SoloDevAtrix/snippets/blob/master/indexer_catalog_category_products.php
runs in :02 seconds on the same database, my frontend functions perfect with it.

My original glance over that it selected all products for root was wrong they actually do quite a bit of specific join conditions on all types. I have my database matching a full run pretty closely. Only difference is a few positions are adjusted in my view, for the better, as well as don't have weird one off where it'll put in entries into the index for things like
A product with cat path 1/2/4/142/160
normally it'd put in the 160 the 2 and 4
but 142 would be skipped by m2.
but in some cases it'd put it in for no reason.
i have the option in my code to put it in for all or none. was the only diff I could find in the results of the code output into the database outside slight position stuff as I said.

Enabling putting in all the non parent non leaf options doubles the size as well as seems to achieve nothing. So left it off.

If you use it power to you, I offer zero warranty, even implied, and you're on your own to get it to work. It's also alpha code at best so if you run into a bug then you ran into a bug cuz it's just finished with it's first test run.

I just think it's really sour how badly m2 code runs this. You're filling ONE TABLE.

Here's the reasons why m2's queries are SLOW SLOW SLOW:

  1. Use of IN to do batching. This changes mysql joins from better join types to a range type which is slower.
  2. Redoing tasks because of batching. Stop batching your shit unless it's necessary and if it is then don't use IN statements.
  3. Using temporary tables, almost always bad for performance. Avoid using them ever.
  4. Tons of self joins, you had way way way too many and some of them served almost no purpose.
  5. Forcing all the work on mysql. Sometimes this is efficient. Sometimes it's better to write a loop or do work in PHP. Some things are so complex the tradeoff for EVERYONE that has to work on it is just better to chunk it up into code sections in php. This seems to use mysql or be damned and it be damned.
  6. way too much data from all those joins. simple is better. Adding one bad join can spike usage from .0001 seconds to some query that can take 3+mins to run. Use EXPLAIN and understand what it says.

Last note, my working example code, doesn't follow m2 ideology. At all. If you guys cared as much about pragmatic efficient solutions as you did your cursed ideology then maybe m2 wouldn't be a garbage heap.

This is a serious problem for my client, it's actually maybe costing him his entire business.

The indexers lock up the database, adding hardware barely helps. Locked up database means slow loads on any page not already cached in varnish. Cache warming helps but can't do every page on the site. Leads to lower conversions. He can't fix or edit products because again, it's so slow. And when he does, he gets locked out because hours of indexers are running. So if my language is harsh, that's why.

I've worked for this client oh, 9 years now. The switch to magento2 is by far the worst decision that's ever been made for the company.

set in_predicate_conversion_threshold = 4294967295;
set global in_predicate_conversion_threshold = 4294967295;

seems also to help, as a workaround.

See: https://jira.mariadb.org/browse/MDEV-20871?focusedCommentId=138523&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-138523

There might be a fix in MariaDB 10.3.21

Migrated to MariaDB 10.4.12 on my M2.1.7 install with 6000 products (centos 8, php7.0)
Indexing seems to be OK, plan to migrate to M2.3 within the next few weeks

magento indexer:reindex
Design Config Grid index has been rebuilt successfully in 00:00:00
Customer Grid index has been rebuilt successfully in 00:00:00
Product Flat Data index has been rebuilt successfully in 00:00:01
Category Flat Data index has been rebuilt successfully in 00:00:00
Category Products index has been rebuilt successfully in 00:00:01
Product Categories index has been rebuilt successfully in 00:00:00
Product Price index has been rebuilt successfully in 00:00:02
Product EAV index has been rebuilt successfully in 00:00:03
Stock index has been rebuilt successfully in 00:00:00
Catalog Rule Product index has been rebuilt successfully in 00:00:00
Catalog Product Rule index has been rebuilt successfully in 00:00:00
Catalog Search index has been rebuilt successfully in 00:01:01

set in_predicate_conversion_threshold = 4294967295;
set global in_predicate_conversion_threshold = 4294967295;

seems also to help, as a workaround.

See: https://jira.mariadb.org/browse/MDEV-20871?focusedCommentId=138523&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-138523

There might be a fix in MariaDB 10.3.21

Thanks to this indexing of category products went from impossible (7+ hours) to 10 seconds! Thanks a lot!

:white_check_mark: Confirmed by @engcom-Alfa
Thank you for verifying the issue. Based on the provided information internal tickets MC-37309 were created

Issue Available: @engcom-Alfa, _You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself._

Was this page helpful?
0 / 5 - 0 ratings