I know MariaDB 10.3 is not yet supported, as per dev docs, so I add this as feature requests, while it will manifest as a bug as soon as we support MariaDB 10.3
Magento should remove the quotes in the SELECT entity_id IN statement when creating indexes. The entity_ids are not strings, and this seems to confuse the optimizer.
Catalog Category Indexing should run quickly, on MariaDB 10.1 and on 10.3.18
Shop ownes can use MariaDB 10.3.18
See this answer on StackExchange: https://magento.stackexchange.com/a/293345/81
I took the full query which is run several times (for each store view) on the catalog category products indexer and extracted the select part
SELECT 2 AS `category_id`, `cp`.`entity_id` AS `product_id`, IF(ccp.product_id IS NOT NULL, ccp.position, 0) AS `position`, IF(ccp.product_id IS NOT NULL, 1, 0) AS `is_parent`, 2 AS `store_id`, IFNULL(cpvs.value, cpvd.value) AS `visibility` FROM `catalog_product_entity` AS `cp`
 INNER JOIN `catalog_product_website` AS `cpw` ON cpw.product_id = cp.entity_id
 INNER JOIN `catalog_product_entity_int` AS `cpsd` ON cpsd.entity_id = cp.entity_id AND cpsd.store_id = 0 AND cpsd.attribute_id = 97
 LEFT JOIN `catalog_product_entity_int` AS `cpss` ON cpss.entity_id = cp.entity_id AND cpss.attribute_id = cpsd.attribute_id AND cpss.store_id = 2
 INNER JOIN `catalog_product_entity_int` AS `cpvd` ON cpvd.entity_id = cp.entity_id AND cpvd.store_id = 0 AND cpvd.attribute_id = 99
 LEFT JOIN `catalog_product_entity_int` AS `cpvs` ON cpvs.entity_id = cp.entity_id AND cpvs.attribute_id = cpvd.attribute_id  AND cpvs.store_id = 2
 LEFT JOIN `catalog_category_product` AS `ccp` ON ccp.product_id = cp.entity_id WHERE (cpw.website_id = '1') AND (IFNULL(cpss.value, cpsd.value) = 1) AND           (IFNULL(cpvs.value, cpvd.value) IN (2, 3, 4)) AND (cp.entity_id IN ('107280', '107281', '107282', '107283', '107284'
    [... around 70 K entity IDs ...]
    187965', '187966')) GROUP BY `cp`.`entity_id`;
This query alone runs 8 minutes on Maria DB 10.3
When I simply remove all the quotes via s/'//g in vi in the line with the "IN" statement I get
SELECT 2 AS `category_id`, `cp`.`entity_id` AS `product_id`, IF(ccp.product_id IS NOT NULL, ccp.position, 0) AS `position`, IF(ccp.product_id IS NOT NULL, 1, 0) AS `is_parent`, 2 AS `store_id`, IFNULL(cpvs.value, cpvd.value) AS `visibility` FROM `catalog_product_entity` AS `cp`
 INNER JOIN `catalog_product_website` AS `cpw` ON cpw.product_id = cp.entity_id
 INNER JOIN `catalog_product_entity_int` AS `cpsd` ON cpsd.entity_id = cp.entity_id AND cpsd.store_id = 0 AND cpsd.attribute_id = 97
 LEFT JOIN `catalog_product_entity_int` AS `cpss` ON cpss.entity_id = cp.entity_id AND cpss.attribute_id = cpsd.attribute_id AND cpss.store_id = 2
 INNER JOIN `catalog_product_entity_int` AS `cpvd` ON cpvd.entity_id = cp.entity_id AND cpvd.store_id = 0 AND cpvd.attribute_id = 99
 LEFT JOIN `catalog_product_entity_int` AS `cpvs` ON cpvs.entity_id = cp.entity_id AND cpvs.attribute_id = cpvd.attribute_id  AND cpvs.store_id = 2
 LEFT JOIN `catalog_category_product` AS `ccp` ON ccp.product_id = cp.entity_id WHERE (cpw.website_id = '1') AND (IFNULL(cpss.value, cpsd.value) = 1) AND           (IFNULL(cpvs.value, cpvd.value) IN (2, 3, 4)) AND (cp.entity_id IN (107280, 107281, 107282, 107283, 107284
    [... around 70 K entity IDs ...]
    187965, 187966)) GROUP BY `cp`.`entity_id`;
I will try to provide a patch.
Hi @amenk. Thank you for your report.
To help us process this issue please make sure that you provided the following information:
Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:
@magento give me 2.3-develop instance - upcoming 2.3.x release
For more details, please, review the Magento Contributor Assistant documentation.
@amenk do you confirm that you were able to reproduce the issue on vanilla Magento instance following steps to reproduce?
Not able to reproduce, because it's a feature requests for a recent MariaDB 10.3 which is not yet supported by Magento 2.3
I also created an issue in the MariaDB tracker: https://jira.mariadb.org/browse/MDEV-20871
vendor/magento/module-catalog/Model/Indexer/Category/Product/Action/Full.php:287
    foreach ($batchQueries as $query) {
        $this->connection->delete($this->tableMaintainer->getMainTmpTable((int)$store->getId()));
        $entityIds = $this->connection->fetchCol($query);
        foreach($entityIds as &$val) { // patch
            $val = (integer) $val;     // patch
        }                              // patch
        $resultSelect = clone $basicSelect;
        $resultSelect->where($whereCondition, $entityIds);
Or should we put it into \Magento\Framework\DB\Select::where and convert is_numeric values to numbers ?
Digging more down into the function leads to Zend Framework 1 code :(
I tried to patch
\Magento\Framework\DB\Select::where
public function where($cond, $value = null, $type = null)
{
    // ensure that integers are passed as integer,
    // to avoid negative effects on performance
    if (is_array($value)) {
        foreach($value as &$val) {
            $val = $this->castNumeric($val);
        }
    } else {
        $value = $this->castNumeric($value);
    }
    if ($value === null && $type === null) {
        $value = '';
    } elseif ($type == self::TYPE_CONDITION) {
        $type = null;
    }
    if (is_array($value)) {
        $cond = $this->getConnection()->quoteInto($cond, $value);
        $value = null;
    }
    return parent::where($cond, $value, $type);
}
private function castNumeric($val) {
    if (is_numeric($val)) {
        return $val + 0;
    }
    return $val;
}
But this does not catch the situation when you edit a categories anchor property -> the update of the url_rewrite index still uses quotes in strings
I will inject the patch into \Magento\Framework\DB\Adapter\Pdo\Mysql::quoteInto
We are noticing this problem too with MariaDB 10.3.18 in Magento 2.3.2. We notice that most INSERT queries of the indexer take waaaay longer than expected and especially the following query takes up to 8 hours!
INSERT INTO `catalog_category_product_index_store1_tmp` (`category_id`, `product_id`, `position`, ...
We cannot even do a indexer:reindex because once it arrives at the categories index, the following error will happen after 8 hours:
Product Categories indexer process unknown error:
SQLSTATE[42S02]: Base table or view not found: 1146 Table 'catalog_category_product_index_store1_tmp' doesn't exist, query was: DELETE FROM `catalog_category_product_index_store1_tmp`
So apparently it takes so long, that the temporary table "vanishes" in between (?).
@fritzmg Feel free to try my patch https://github.com/magento/magento2/pull/25212
@fritzmg Which MariaDB version were you on before it became so slow?
Before it was actually MySQL 5.7.27. Since this is happening in a live system (hoster had to switch DB server due to other troubles) we can't unfortunately test your fix (yet) and thus our hoster is in the process of downgrading the MariaDB version again.
What we did was, to locally setup 10.3.18 in docker to reproduce the bug and test the fix.
Yes, I've seen that on Stackoverflow :). When we get around to it, we will try the same.
@amenk can you try it with MariaDB 10.3.17 also, i got a case where it was in 10.3.18 slow and in 10.3.17 fast. My currently suspect is a regession because of https://jira.mariadb.org/browse/MDEV-20109, see the other Reports linked there.
We are noticing similar problems with MariaDB 10.2.27 - just not with the queries of the indexer. Certain selects take up to 60 seconds while they should only last milliseconds.
I'm also running mariadb 10.3.18 and having these issues. Indexers just never stop being triggered now even after tripling our hardware speed.
Attempting to clone a server then upgrade to 10.4 on it and test indexing on it.
10.4 latest made zero difference.
@amenk patch reduced it from 45 mins to 40 seconds for the query he posted.
Removing IN for entity_id altogether reduces query time to 0.0013 seconds
M2 mview and indexer system are once again trash and once again costing my client money and me time.
How did you remove "IN" for entity_id altogether? Did you still select the same rows? Via a JOIN?
Please maintain a constructive tone ;-)
I just manually changed the query for testing, I'm researching now if the entity_id's are do to a batch or handed off from a prior query. If it's just the mview batching, then it's 30,000 faster to not batch than to do it.
It does select all products and then batch them for a root category full reindex.
Here's my result after removing batching entirely for that:
Category Products index has been rebuilt successfully in 00:00:14
Before it was 5 hours.
Will test some more then release a module to override default m2 behavior. Honestly, should probably just write a module to overwrite all indexers at some point they are terrible I have worked on this project nearly 3 years and this is a constant source of suffering. Many other ecommerce systems don't have something like this for good reason, just more examples of overengineering to make stuff worse and more costly to develop, run, and maintain. I can't keep a positive tone because I have nothing positive to say about it, and I doubt you would either if your quality of life and that of your clients was worsened by poorly designed magento2. If we could switch off it we would but sunk cost is a lot at this point and it's as I said a few years work.
https://github.com/SoloDevAtrix/snippets/blob/master/indexer_catalog_category_products.php
runs in :02 seconds on the same database, my frontend functions perfect with it.
My original glance over that it selected all products for root was wrong they actually do quite a bit of specific join conditions on all types.  I have my database matching a full run pretty closely.  Only difference is a few positions are adjusted in my view, for the better, as well as don't have weird one off where it'll put in entries into the index for things like
A product with cat path 1/2/4/142/160
normally it'd put in the 160 the 2 and 4
but 142 would be skipped by m2.
but in some cases it'd put it in for no reason.
i have the option in my code to put it in for all or none.  was the only diff I could find in the results of the code output into the database outside slight position stuff as I said.
Enabling putting in all the non parent non leaf options doubles the size as well as seems to achieve nothing. So left it off.
If you use it power to you, I offer zero warranty, even implied, and you're on your own to get it to work. It's also alpha code at best so if you run into a bug then you ran into a bug cuz it's just finished with it's first test run.
I just think it's really sour how badly m2 code runs this. You're filling ONE TABLE.
Here's the reasons why m2's queries are SLOW SLOW SLOW:
Last note, my working example code, doesn't follow m2 ideology. At all. If you guys cared as much about pragmatic efficient solutions as you did your cursed ideology then maybe m2 wouldn't be a garbage heap.
This is a serious problem for my client, it's actually maybe costing him his entire business.
The indexers lock up the database, adding hardware barely helps. Locked up database means slow loads on any page not already cached in varnish. Cache warming helps but can't do every page on the site. Leads to lower conversions. He can't fix or edit products because again, it's so slow. And when he does, he gets locked out because hours of indexers are running. So if my language is harsh, that's why.
I've worked for this client oh, 9 years now. The switch to magento2 is by far the worst decision that's ever been made for the company.
set in_predicate_conversion_threshold = 4294967295;
   set global in_predicate_conversion_threshold = 4294967295;
seems also to help, as a workaround.
There might be a fix in MariaDB 10.3.21
Migrated to MariaDB 10.4.12 on my M2.1.7 install with 6000 products (centos 8, php7.0)
Indexing seems to be OK, plan to migrate to M2.3 within the next few weeks
magento indexer:reindex
Design Config Grid index has been rebuilt successfully in 00:00:00
Customer Grid index has been rebuilt successfully in 00:00:00
Product Flat Data index has been rebuilt successfully in 00:00:01
Category Flat Data index has been rebuilt successfully in 00:00:00
Category Products index has been rebuilt successfully in 00:00:01
Product Categories index has been rebuilt successfully in 00:00:00
Product Price index has been rebuilt successfully in 00:00:02
Product EAV index has been rebuilt successfully in 00:00:03
Stock index has been rebuilt successfully in 00:00:00
Catalog Rule Product index has been rebuilt successfully in 00:00:00
Catalog Product Rule index has been rebuilt successfully in 00:00:00
Catalog Search index has been rebuilt successfully in 00:01:01
set in_predicate_conversion_threshold = 4294967295;
set global in_predicate_conversion_threshold = 4294967295;seems also to help, as a workaround.
There might be a fix in MariaDB 10.3.21
Thanks to this indexing of category products went from impossible (7+ hours) to 10 seconds! Thanks a lot!
:white_check_mark: Confirmed by @engcom-Alfa
 Thank you for verifying the issue. Based on the provided information internal tickets MC-37309 were created
Issue Available: @engcom-Alfa, _You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself._
Hi @amenk. Thank you for your report.
The issue has been fixed in magento/magento2#27129 by @kandy in 2.4-develop branch
Related commit(s):
The fix will be available with the upcoming 2.4.2 release.
Most helpful comment
set in_predicate_conversion_threshold = 4294967295;
set global in_predicate_conversion_threshold = 4294967295;
seems also to help, as a workaround.
See: https://jira.mariadb.org/browse/MDEV-20871?focusedCommentId=138523&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-138523
There might be a fix in MariaDB 10.3.21