Magento2: Plugin on Indexer method getSearchableProducts might cause products not to be indexed.

Created on 4 Nov 2020  ·  31 Comments  ·  Source: magento/magento2

We hit a really critical issue on one of our projects that was caused by a custom extension. Investigating it was hard: it was not obvious what caused the issue or how to fix it quickly on production.

To prevent similar situations for anyone else in the future, I'd like to suggest introducing a mechanism that prevents custom extensions from breaking the indexation process.

Preconditions (*)

  1. Any Magento 2 version with Elasticsearch
  2. Configure Elasticsearch 6.x / 7.x
  3. Create a plugin on the getSearchableProducts method, which is defined in \Magento\CatalogSearch\Model\Indexer\Fulltext\Action\DataProvider and called from the \Magento\CatalogSearch\Model\Indexer\Fulltext\Action\Full class

Steps to reproduce (*)

  1. Add 2000 saleable products
  2. Set the indexers to update on schedule
  3. In the plugin, add a condition to skip every second batch (return an empty array) instead of returning the products fetched by SQL
  4. Reset the search index
  5. Reindex products from the shell: catalogsearch_fulltext
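
A minimal sketch of the kind of plugin step 3 describes, assuming nothing beyond the method name given above. The class name is hypothetical, and the Magento types are replaced with a plain object so the snippet runs on its own; in a real module the plugin would also have to be registered in di.xml.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical "after" plugin that throws away every second batch.
 * Written without Magento imports so it can be run standalone.
 */
final class SkipEverySecondBatchPlugin
{
    /** Number of times the intercepted method has returned so far. */
    private int $calls = 0;

    /**
     * @param object $subject intercepted object (unused in this sketch)
     * @param array  $result  rows returned by the original getSearchableProducts()
     */
    public function afterGetSearchableProducts(object $subject, array $result): array
    {
        $this->calls++;

        // Every second call returns an empty array instead of the fetched rows.
        return $this->calls % 2 === 0 ? [] : $result;
    }
}

$plugin = new SkipEverySecondBatchPlugin();
$batch = [['entity_id' => 1], ['entity_id' => 2]];

$first = $plugin->afterGetSearchableProducts(new stdClass(), $batch);
$second = $plugin->afterGetSearchableProducts(new stdClass(), $batch);

echo count($first), ' ', count($second), PHP_EOL; // prints "2 0"
```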

Expected result (*)

  1. All products are added to the Elasticsearch index

Actual result (*)

  1. Only the first product batch is added to the Elasticsearch index
  2. The second product batch is not indexed, and neither are any of the remaining products (while (count($products) > 0) {)

Proposed solution:

  • Fetch the total count of saleable products
  • Use a MySQL offset for pagination and remove variables such as $lastProductId, which are then no longer needed
  • Another solution: create ID ranges, as @IvanChepurnyi suggested

Please provide Severity assessment for the Issue as Reporter. This information will help during Confirmation and Issue triage processes.

  • [ ] Severity: S0 _- Affects critical data or functionality and leaves users without workaround._
  • [ ] Severity: S1 _- Affects critical data or functionality and forces users to employ a workaround._
  • [ ] Severity: S2 _- Affects non-critical data or functionality and forces users to employ a workaround._
  • [x] Severity: S3 _- Affects non-critical data or functionality and does not force users to employ a workaround._
  • [ ] Severity: S4 _- Affects aesthetics, professional look and feel, “quality” or “usability”._

Labels: P3 · done · Reproduced on 2.4.x · S3 · Performance

All 31 comments

Hi @qsolutions-pl. Thank you for your report.
To help us process this issue please make sure that you provided the following information:

  • Summary of the issue
  • Information on your environment
  • Steps to reproduce
  • Expected and actual results

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, please, review the Magento Contributor Assistant documentation.

Please, add a comment to assign the issue: @magento I am working on this



@sidolov can you help create an issue for this?

In class \Magento\CatalogSearch\Model\Indexer\Fulltext\Action\Full there is a hardcoded batch size of 500 products.
If there is a huge gap between product IDs, the while loop gets 0 products when it tries to fetch saleable products, because it cannot fetch them across the gap in entity_id.

Why do you say that batch size is hardcoded? From what I see
https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/CatalogSearch/Model/Indexer/Fulltext/Action/Full.php#L247
batch size is passed via constructor argument, which means that it could be easily overwritten via DI.xml configuration.

Moreover, we have similar configuration antiGapMultiplier in DataProvider as well
https://github.com/magento/magento2/blame/2.4-develop/app/code/Magento/CatalogSearch/Model/Indexer/Fulltext/Action/DataProvider.php#L161
which is used as a multiplication factor for batch size

        $select = $this->getSelectForSearchableProducts($storeId, $staticFields, $productIds, $lastProductId, $batch);
        if ($productIds === null) {
            $select->where(
                'e.entity_id < ?',
                $lastProductId ? $this->antiGapMultiplier * $batch + $lastProductId + 1 : $batch + 1
            );
        }

This parameter is also configurable through DI.

It's also strange that batch configuration is the rescue here: judging by the code, there is a dedicated branch of logic to cover big ID gaps, which simply executes the query with no upper limit - https://github.com/magento/magento2/blame/2.4-develop/app/code/Magento/CatalogSearch/Model/Indexer/Fulltext/Action/DataProvider.php#L212-L218

        $select = $this->getSelectForSearchableProducts($storeId, $staticFields, $productIds, $lastProductId, $batch);
        if ($productIds === null) {
            $select->where(
                'e.entity_id < ?',
                $lastProductId ? $this->antiGapMultiplier * $batch + $lastProductId + 1 : $batch + 1
            );
        }
        $products = $this->connection->fetchAll($select);
        if ($productIds === null && !$products) {
            // try to search without limit entity_id by batch size for cover case with a big gap between entity ids
            $products = $this->connection->fetchAll(
                $this->getSelectForSearchableProducts($storeId, $staticFields, $productIds, $lastProductId, $batch)
            );
        }

Anyway, it's better either to configure the current code via DI so that it fits your business case, or to first investigate why it does not work (from what I see, the original intention was to cover exactly the case you described), rather than refactoring right away.

Can we have a config option to set the batch size via admin?

@mrtuvn this should not be configurable anywhere in admin. Simply put, this is a background task and an admin user should not have an option to change it. If you give such an option to a customer with admin access, there is a 100% chance that they will set the number too high, hit MySQL / memory limits and break indexing even more.

@maghamed what do you call this?
[image]
Maybe you (or the core team) fixed it in 2.4.1. I will know once I have free time to check.
However, if you did change it: why was it not mentioned in the release notes?

This is a critical issue which basically breaks entire store.

Or there is more simple explanation: you (or the core team) just don't care however there is a better term for this, but I don't use curse words in public forums :-)

@maghamed
I've checked 2.4.1 code and guess what:

bug is still there

This should not be solved via di.xml, because you would have to do it per project. With dependency injection you can change the values of batch size and antiGapMultiplier,

so let's look at a few examples:

  • batch size: 100, antiGap: 5 --> if gap between IDs is > 500 indexers will fail
  • batch size: 500, antiGap: 5 --> if gap between IDs is > 2500 indexers will fail
  • batch size: 1000, antiGap: 5 --> if gap between IDs is > 5000 indexers will fail
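
The arithmetic behind these examples can be sketched as follows. The function name is hypothetical; the expression itself mirrors the one quoted earlier in the thread from DataProvider, and it only describes the ID window of the bounded query, not what happens when that query comes back empty.

```php
<?php
declare(strict_types=1);

/**
 * Exclusive upper bound that the bounded query puts on entity_id.
 * Mirrors the expression quoted from DataProvider; the function name is made up.
 */
function firstQueryUpperBound(int $lastProductId, int $batch, int $antiGapMultiplier): int
{
    return $lastProductId
        ? $antiGapMultiplier * $batch + $lastProductId + 1
        : $batch + 1;
}

// With the last indexed ID at 1000 and antiGapMultiplier = 5:
foreach ([100, 500, 1000] as $batch) {
    $bound = firstQueryUpperBound(1000, $batch, 5);
    printf("batch %d: ids 1001..%d are scanned\n", $batch, $bound - 1);
}
```

Note that on the very first call (no last product ID yet) the multiplier is not applied at all, so the window is just the batch size.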

This is one of the biggest bugs in Magento 2 I've encountered. It does not have any testing.

It only works if there are no gaps between product IDs.

So basically you can tell all your Magento Clients this:
"Please do not remove any products from the database, if you remove more than 500, most of the products will disappear from the frontend"

BTW: I love the "batch size and antiGap" being called performance improvements

I'll leave it here - batches extraction without gaps via 2 simple MySQL queries:
https://github.com/EcomDev/sync-magento-2-migration/blob/main/src/TableRangeConditionGeneratorFactory.php#L38-L67
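
To illustrate the general idea of gap-free ranges (this is not the linked implementation, just an in-memory sketch with a hypothetical function name), existing IDs can be cut into [from, to] ranges that each contain a fixed number of rows, no matter how far apart the IDs are:

```php
<?php
declare(strict_types=1);

/**
 * Split existing IDs into [from, to] ranges holding at most $size rows each.
 * Gaps between IDs do not matter because ranges follow the rows, not the ID space.
 *
 * @param int[] $ids
 * @return array<int, array{from: int, to: int}>
 */
function buildIdRanges(array $ids, int $size): array
{
    sort($ids);
    $ranges = [];
    foreach (array_chunk($ids, $size) as $chunk) {
        $ranges[] = ['from' => $chunk[0], 'to' => $chunk[count($chunk) - 1]];
    }

    return $ranges;
}

// IDs with a gap of 501 between neighbours.
$ids = [1, 502, 1003, 1504, 2005];
print_r(buildIdRanges($ids, 2));
```

In a database-backed version each range would translate into one query bounded by from and to.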

@engcom-Alfa @engcom-Bravo @engcom-Charlie could you confirm that this issue exists on 2.4-develop branch?
This is really critical issue.

@IvanChepurnyi that looks like a pretty good solution!

@ihor-sviziev the main problem (which is affected by this bug) is this file
vendor/magento/module-catalog-search/Model/Indexer/Fulltext/Action/Full.php

lines:

        $products = $this->dataProvider
            ->getSearchableProducts($storeId, $staticFields, $productIds, $lastProductId, $this->batchSize);
        while (count($products) > 0) {

The while loop is where the indexer "skips".

Just a small clarification: the bug affects SALEABLE products, so the gap needs to be between the IDs of SALEABLE products.

To anyone still following this:

This is a CRITICAL ISSUE because it may cause most (or maybe even all) of the products to disappear from the frontend catalog during the indexation process.

It happened in 2 projects I have in development.

Let me be very clear here: I did not debug this case myself, but looking at the source code https://github.com/magento/magento2/blame/2.4-develop/app/code/Magento/CatalogSearch/Model/Indexer/Fulltext/Action/DataProvider.php#L206-L218

        $select = $this->getSelectForSearchableProducts($storeId, $staticFields, $productIds, $lastProductId, $batch);
        if ($productIds === null) {
            $select->where(
                'e.entity_id < ?',
                $lastProductId ? $this->antiGapMultiplier * $batch + $lastProductId + 1 : $batch + 1
            );
        }
        $products = $this->connection->fetchAll($select);
        if ($productIds === null && !$products) {
            // try to search without limit entity_id by batch size for cover case with a big gap between entity ids
            $products = $this->connection->fetchAll(
                $this->getSelectForSearchableProducts($storeId, $staticFields, $productIds, $lastProductId, $batch)
            );
        }

all the configuration of batch/antiGapMultiplier impacts only first part (where we apply limit for product entity ids), while we still have this logic

        if ($productIds === null && !$products) {
            // try to search without limit entity_id by batch size for cover case with a big gap between entity ids
            $products = $this->connection->fetchAll(
                $this->getSelectForSearchableProducts($storeId, $staticFields, $productIds, $lastProductId, $batch)
            );
        }

where we don't apply limitation for entity_id at all

            $select->where(
                'e.entity_id < ?',
                $lastProductId ? $this->antiGapMultiplier * $batch + $lastProductId + 1 : $batch + 1
            );

Why are the entities not returned by this query, yet returned when you raise the batch size used in the filter?
I'd prefer to have an answer to this question first rather than jump into fixing batch sizing, which is configurable via DI and should not be a problem.

Just curious: if a store is already live and the owner never deletes products, only updates/adds them and runs the indexers, can this issue still happen?

I'm pretty sure that marking 500 products as out of stock / disabled is a very valid case on production (though probably not a very frequent one), for instance during an automatic sync from an external system, especially when the catalog has more than 10k or even 100k products.

The main concern here is that the issue is absolutely not obvious, and it's not clear what caused it. I don't want to even potentially have this issue on my production.

@maghamed for stores with 100k products and more, it's a regular procedure to disable 500 or more products when they update their catalog for a new season (especially in the fashion industry). Let's say we increase the batch to 5000 via di.xml, and then what? The customer adds another 100k products (200k in total) and the issue will happen again. We increase it to 50000, and then what? What if a customer has 1M products, do we increase it to 100k? What about 10M?

Your suggestion of doing it via di.xml is not a workaround at all. It just hides the problem. From my point of view it looks like you don't work with store owners directly, or haven't for a long time, and you don't understand how serious this issue is.

As @ihor-sviziev mentioned: such a situation should not occur in general.

@maghamed as for the code you've just mentioned, you *DO APPLY A LIMIT, both > and <*. Let me show you (as you clearly don't see it):

First, the > condition:

        private function getSelectForSearchableProducts(
        [...]
            $select->where('e.entity_id > ?', $lastProductId); // here is where the magic happens
            $select->order('e.entity_id');
            $select->limit($batch);

            return $select;
        }

Then the calling function applies the < condition to that same SELECT object:

        $select->where(
            'e.entity_id < ?',
            $lastProductId ? $this->antiGapMultiplier * $batch + $lastProductId + 1 : $batch + 1
        );

So in the end this is your SELECT:

        $select->where('e.entity_id > ?', $lastProductId);
        $select->where(
            'e.entity_id < ?',
            $lastProductId ? $this->antiGapMultiplier * $batch + $lastProductId + 1 : $batch + 1
        );

Can you see it now? This code is bugged, not covered by tests, and can break ANY store...

so the end query is something like this

select ids ,(...) from table where
entity_id > lastProductId and
entity_id < (lastProductId  + batchSize * antiGapMultiplier)

Hi @ihor-sviziev. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

  • [ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
    Details: If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.
  • [ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • [ ] 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • [ ] 4. Verify that the issue is reproducible on 2.4-develop branch

    Details:
    - Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and _stop verification process here_!

  • [ ] 5. Add label Issue: Confirmed once verification is complete.

  • [ ] 6. Make sure that automatic system confirms that report has been added to the backlog.

@qsolutions-pl
Again:
Did I say that we should not fix the issue? No, I didn't. What I said is that we need to investigate the root cause of why it does not work for you, and then decide how best to fix it; in the meantime we can use a workaround.

Regarding your argument, that we are building query which looks like

select ids ,(...) from table where
entity_id > lastProductId and
entity_id < (lastProductId  + batchSize * antiGapMultiplier)

that's correct. But in fact there are two queries in this function, you might see it by the number of fetchAll functions:
https://github.com/magento/magento2/blame/2.4-develop/app/code/Magento/CatalogSearch/Model/Indexer/Fulltext/Action/DataProvider.php#L212
and
https://github.com/magento/magento2/blame/2.4-develop/app/code/Magento/CatalogSearch/Model/Indexer/Fulltext/Action/DataProvider.php#L215

The first one is the one you described in your previous comment, while for the second we don't apply this filter

            $select->where(
                'e.entity_id < ?',
                $lastProductId ? $this->antiGapMultiplier * $batch + $lastProductId + 1 : $batch + 1
            );

so, the resulting SQL looks like:

select ids ,(...) from table where
entity_id > lastProductId 

This is the query I mentioned earlier, and the one that corresponds to the comment in the code: // try to search without limit entity_id by batch size for cover case with a big gap between entity ids
Why does this query not work in your case? That's the question. I could understand it if a lot of products were returned, leading to a memory consumption problem, but that does not look like what happened to you, as products are simply not returned.
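
To make the two-step lookup easier to follow, here is an in-memory sketch of it. The function name and the array standing in for the product table are assumptions; the bound expression and the LIMIT follow the snippets quoted in this thread.

```php
<?php
declare(strict_types=1);

/**
 * In-memory stand-in for the two queries discussed above.
 * $allIds plays the role of the searchable product IDs in the table.
 *
 * @param int[] $allIds
 * @return int[]
 */
function getSearchableIds(array $allIds, int $lastProductId, int $batch, int $antiGapMultiplier): array
{
    sort($allIds);
    $upper = $lastProductId
        ? $antiGapMultiplier * $batch + $lastProductId + 1
        : $batch + 1;

    // First query: entity_id > last AND entity_id < upper, LIMIT $batch.
    $bounded = array_filter(
        $allIds,
        fn (int $id): bool => $id > $lastProductId && $id < $upper
    );
    $bounded = array_slice(array_values($bounded), 0, $batch);
    if ($bounded) {
        return $bounded;
    }

    // Fallback query: no upper bound on entity_id, still LIMIT $batch.
    $unbounded = array_filter($allIds, fn (int $id): bool => $id > $lastProductId);

    return array_slice(array_values($unbounded), 0, $batch);
}

// A gap far larger than antiGapMultiplier * batch between ID 3 and ID 50000.
$ids = [1, 2, 3, 50000, 50001];
var_dump(getSearchableIds($ids, 3, 2, 5));
```

In this model the gap alone does not produce an empty result, because the fallback picks up the next IDs regardless of how far away they are.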

@maghamed I believe this case isn't really obvious, I would like to dig into debug and see what exactly causing such issue.

Hi @qsolutions-pl,
I tried to reproduce the issue, but I couldn't.
My steps were as follows:

  1. imported 1000 enabled products (part 1)
  2. increased the auto-increment ID with ALTER TABLE catalog_product_entity AUTO_INCREMENT = 11001;
  3. imported 1000 disabled products (part 2)
  4. increased the auto-increment ID with ALTER TABLE catalog_product_entity AUTO_INCREMENT = 22001;
  5. imported 1000 enabled products (part 3)
  6. switched the indexers to update by schedule in admin
  7. ran php bin/magento indexer:reset && php bin/magento indexer:reindex
  8. created an empty category and assigned to it only products with the latest product IDs (just 12 products)
  9. ran php bin/magento indexer:reset && php bin/magento indexer:reindex

After that I went to the category, and I can see these products.
[image]

My test files:
part1.zip
part2.zip
part3.zip

Could you tell me what I did wrong, or add the missing info to the steps to reproduce?

I can't share the database image due to an NDA with the client. However, I'm preparing a script which will create products with the following ID schema:

  • product1: ID 1
  • product2: ID 502
  • product3: ID 1003, and so on

so the gap between product IDs is always > 500.

Several people, including myself are able to reproduce the issue.

If you can't replicate it using products just put a plugin on getSearchableProducts that returns 0 products 50% of the time.
You'll see that the indexer just stops halfway. If you're using the PR you'll see the indexer keeps going until the last batch.

Hello @qsolutions-pl.
The steps to reproduce from https://github.com/magento/magento2/issues/30798#issuecomment-734897736 are the same as yours.
I tried to reproduce the issue, but everything works as expected.

Environment:

  • PHP 7.4.13
  • MySQL 8
  • Elasticsearch 7

[Screenshots: gap_between_products_id, cat_with_products_from_3rd_batch, update_by_schedule, reindex_view_products]

You can check it on your side as well.
Could you please provide any additional information that would help reproduce the described issue?
Thank you.

Here is the latest update, after a small debugging session with @ihor-sviziev, who dedicated his time to this. You have my sincere thanks for this, Ihor. :tumbler_glass:

After connecting Xdebug to the project, it turned out that the problem with the indexers crashing / stopping is caused by a plugin from an Amasty module. However, I still believe the native Magento 2 indexer code should be changed to something like what @IvanChepurnyi suggested.

Why does this problem keep happening?
Look at this function (plugin) in the Amasty module (basically the same thing that @PascalBrouwers suggested):

        public function afterGetSearchableProducts(
            MagentoDataProvider $subject,
            array $result,
            string $storeId
        ): array {
            // NOTE: as pasted, this early return bypasses all of the filtering below
            return $result;
            $manageStock = $this->config->getValue('cataloginventory/item_options/manage_stock');

            if ($manageStock) {
                $displayType = $this->config->getValue('cataloginventory/options/show_out_of_stock');
                $stockData = $this->getStockStatusData((int)$storeId, $this->getProductIds($result), !$displayType);
                // Drop every row that has no stock status entry
                foreach ($result as $key => $data) {
                    if (!isset($stockData[$data['entity_id']])) {
                        unset($result[$key]);
                    }
                }
            }

            return $result;
        }

In my case, where there are thousands of unsaleable products, this method removes all items from a batch returned by the main getSearchableProducts method.
If for any reason all items are removed from that method's results, the indexer stops because of this condition

while (count($products) > 0) {

in this class file vendor/magento/module-catalog-search/Model/Indexer/Fulltext/Action/Full.php
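
The effect can be reproduced with a small standalone model. Everything here is a simplification under stated assumptions: runIndexer() is a hypothetical stand-in for the loop in Full.php, and $fetch stands in for getSearchableProducts() together with a plugin that strips unsaleable rows.

```php
<?php
declare(strict_types=1);

/**
 * Simplified model of the indexing loop: fetch a batch, index it, fetch the next.
 *
 * @param callable $fetch fn(int $lastProductId, int $batchSize): array of rows
 * @return int[] IDs that were indexed
 */
function runIndexer(callable $fetch, int $batchSize): array
{
    $indexed = [];
    $lastProductId = 0;
    $products = $fetch($lastProductId, $batchSize);
    while (count($products) > 0) {
        foreach ($products as $row) {
            $indexed[] = $row['entity_id'];
            $lastProductId = $row['entity_id'];
        }
        $products = $fetch($lastProductId, $batchSize);
    }

    return $indexed;
}

$catalog = range(1, 10); // ten hypothetical product IDs
$unsaleable = [5, 6];    // IDs the plugin strips out

$fetch = function (int $lastId, int $batch) use ($catalog, $unsaleable): array {
    // "SQL" step: next $batch IDs after $lastId.
    $next = array_filter($catalog, fn (int $id): bool => $id > $lastId);
    $ids = array_slice(array_values($next), 0, $batch);
    // "Plugin" step: remove unsaleable rows from the batch.
    $ids = array_values(array_diff($ids, $unsaleable));

    return array_map(fn (int $id): array => ['entity_id' => $id], $ids);
};

echo implode(',', runIndexer($fetch, 2)), PHP_EOL; // prints "1,2,3,4"
```

The third batch (IDs 5 and 6) is emptied by the filter, so the loop ends and IDs 7 to 10 are never reached.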

In my personal opinion this is still a minor flaw in the native indexing functionality.
The while loop with antiGapMultiplier and batchSize should be changed to avoid scenarios like this.
I will be working on something like rangeConditions myself and will create a PR in 10-14 days, once I have some time between projects.

Adding rangeConditions in a foreach loop won't ruin performance and will make the core code more robust.
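
A rough sketch of what a range-driven loop could look like, assuming the ranges are computed up front from the raw IDs. The function name and the filter callback are hypothetical, and array_chunk stands in for the range queries.

```php
<?php
declare(strict_types=1);

/**
 * Range-driven loop: batches are fixed before any filtering happens,
 * so an empty (filtered) batch does not end the run.
 *
 * @param int[]    $allIds    raw product IDs
 * @param callable $filter    fn(int[] $ids): int[] standing in for plugin filtering
 * @return int[] IDs that were indexed
 */
function runIndexerByRanges(array $allIds, int $batchSize, callable $filter): array
{
    sort($allIds);
    $indexed = [];
    foreach (array_chunk($allIds, $batchSize) as $chunk) {
        // Each $chunk corresponds to one range query (first ID to last ID).
        foreach ($filter($chunk) as $id) {
            $indexed[] = $id;
        }
    }

    return $indexed;
}

$dropUnsaleable = fn (array $ids): array => array_values(array_diff($ids, [5, 6]));

echo implode(',', runIndexerByRanges(range(1, 10), 2, $dropUnsaleable)), PHP_EOL;
// prints "1,2,3,4,7,8,9,10"
```

Because the iteration is driven by the precomputed ranges rather than by the size of the previous result, a batch that is filtered down to nothing is simply skipped.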

Hi @sidolov,
According to the update in https://github.com/magento/magento2/issues/30798#issuecomment-737886556 and the new description - could we decrease priority and severity, maybe to P3 S3?

Done

@ihor-sviziev done! Thank you guys for the investigation!

Here we go again: "Amasty" behavior causes an issue. This is a good use case for checking how Magento core works with 3rd-party modules.

Hello @qsolutions-pl.
I have been able to reproduce the issue.
Steps to reproduce:

  1. Add 3rd-party code with an after plugin like the one in the attachment
    How_to_break_the_indexer.zip
  2. Follow the steps from https://github.com/magento/magento2/issues/30798#issuecomment-734897736

Because of the plugin on getSearchableProducts (called from the \Magento\CatalogSearch\Model\Indexer\Fulltext\Action\Full class), products from the 3rd batch won't be visible on the storefront.
This is not an issue in Magento itself, but the affected functionality could be adjusted to avoid issues with 3rd-party extensions.
