Elasticsuite: Changing sort order causes shop to become empty -> error fielddata is disabled on text fields

Created on 4 Feb 2020 · 13 Comments · Source: Smile-SA/elasticsuite

Preconditions

Changed the "use in sort" setting of several product attributes; the shop was empty after that.

Magento Version : 2.3.3
ElasticSuite Version : 2.8.3

Steps to reproduce

  1. Change use-in-sort

Expected result

  1. Shop is working

Actual result

  1. All categories are empty

This message is logged:

[2020-02-04 09:42:45] main.ERROR: {"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [option_text_p2m_brand] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"prod_german_catalog_product_prod_20200204_093706","node":"dPLXFARdRy-DPEYCuarbRg","reason":{"type":"illegal_argument_exception","reason":"Fielddata is disabled on text fields by default. Set fielddata=true on [option_text_p2m_brand] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."}}]},"status":400} [] []
bug

All 13 comments

Most probably, this is because the mapping is not updated "live" when an attribute is set as usable for sorting.

A full reindex is probably needed after changing the attribute.

Thanks for the quick response.

Then there should be a corresponding warning.

Same issue will probably occur for a non searchable attribute suddenly becoming filterable.

Can an automatic full reindex be triggered in such cases? It is very annoying, and very bad for a shop's conversion rate, if no products are visible in any category after an attribute is changed.

"A full reindex is probably needed after changing the attribute."

In the current case, one full reindex plus a cache flush did not help; I had to reindex again. Maybe a cache flush followed by a single reindex would also have worked. Maybe the first reindex still used the old mapping.

@romainruaud How can we tackle this?

IMHO, there are several different cases to be handled:

  • when the mapping can be updated after editing an attribute: just compute the proper PUT query to update the mapping.

  • when the mapping cannot be updated: there is not much to do except invalidate the fulltext index, but this could cause a full reindex to be triggered during the day on a live website.

But to be honest, I'm not a huge fan of editing attributes directly through the back-office on a production environment. That's not something we usually do at Smile: we manage all the attribute data through setup files (and do a full reindex during the delivery process if needed). That should explain why we have not faced this issue much.

see https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html#updating-field-mappings
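
For the first case, here is a minimal sketch of what "computing the proper PUT query" could look like. It assumes Elasticsearch 6.x with a `_doc` mapping type, and it assumes the fix is to add a keyword sub-field to the existing text field. The sub-field name `sortable` and the helper `build_put_mapping_request` are hypothetical and are not ElasticSuite's API. A real request would also have to restate the field's existing settings (such as its analyzer), and existing documents do not get the new sub-field until they are reindexed.

```python
import json


def build_put_mapping_request(index, field, subfield="sortable", doc_type="_doc"):
    """Build (method, path, body) for an Elasticsearch 6.x update-mapping call.

    The existing text field is restated and a keyword multi-field is added
    next to it. `subfield` is a hypothetical name chosen for illustration.
    """
    body = {
        "properties": {
            field: {
                # The type of an existing field cannot be changed, so it is restated.
                "type": "text",
                "fields": {
                    # New keyword sub-field that can be used for sorting.
                    subfield: {"type": "keyword"},
                },
            }
        }
    }
    path = "/{}/_mapping/{}".format(index, doc_type)
    return "PUT", path, body


if __name__ == "__main__":
    method, path, body = build_put_mapping_request(
        "prod_german_catalog_product_prod_20200204_093706",
        "option_text_p2m_brand",
    )
    print(method, path)
    print(json.dumps(body, indent=2))
```

The sketch only builds the request; sending it and handling a rejected mapping update (the second case in the list) are left out.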

I understand it's a hard problem.

I understand that you discourage the editing of attributes.

In our case the attributes are managed via the Magento API. Both the API and attribute editing in the Magento backend are standard Magento functionality, and they should be supported in a way that does not end up with an empty shop.

For now we put up a monitoring system which checks for this error.
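
A check of this kind could be sketched as below, assuming the log line format shown in the issue description (a prefix, one JSON object, then `[] []`). The function name `fielddata_error_fields` is hypothetical, and turning the result into an alert is left out.

```python
import json
import re

# Captures the field name from the Elasticsearch "fielddata is disabled" reason.
FIELDDATA_RE = re.compile(r"Set fielddata=true on \[([^\]]+)\]")


def fielddata_error_fields(log_line):
    """Return the sorted field names reported in a fielddata error log line.

    Returns an empty list when the line holds no parsable JSON error payload.
    """
    start = log_line.find("{")
    if start == -1:
        return []
    try:
        # raw_decode parses one JSON value and ignores the trailing "[] []".
        payload, _ = json.JSONDecoder().raw_decode(log_line[start:])
    except ValueError:
        return []
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return []
    fields = set()
    for cause in error.get("root_cause", []):
        if cause.get("type") != "illegal_argument_exception":
            continue
        fields.update(FIELDDATA_RE.findall(cause.get("reason", "")))
    return sorted(fields)


if __name__ == "__main__":
    sample = (
        '[2020-02-04 09:42:45] main.ERROR: {"error":{"root_cause":[{"type":'
        '"illegal_argument_exception","reason":"Fielddata is disabled on text '
        'fields by default. Set fielddata=true on [option_text_p2m_brand] in '
        'order to load fielddata in memory."}]},"status":400} [] []'
    )
    print(fielddata_error_fields(sample))
```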

Also it would be an idea to not allow editing of those attribute settings when the module is in place.

  • In which case would the proper PUT query work?
  • How hard is it to trigger a full reindex? I guess the implication is that it puts a high load on the system? But in the case we have now, the categories are empty anyway.
  • How does Magento core's elastic module handle this?

@romainruaud What do you suggest? It is highly business-relevant when the shop suddenly becomes empty.

Hope to see this before long, the issue is very disruptive for a busy site with a huge catalog under constant curation (700+ attributes).

Updating attributes only via deployment is probably not realistic for most merchants.

I believe I found a workaround. I added a dynamic template definition to force string option_text_* fields to keyword by default, rather than text. As far as I can tell this should be safe, and it resolves at least the superficial error at hand (runtime attributes being created with a type incompatible with layered navigation/sorting).

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/dynamic-templates.html#template-examples

"mappings": {
    "_doc": {
        "dynamic_templates": [
            {
                "option_text_strings_as_keywords": {
                    "mapping": {
                        "type": "keyword"
                    },
                    "match_mapping_type": "string",
                    "match": "option_text_*"
                }
            }
        ],
        ...
    }
}

Added as a plugin on \Smile\ElasticsuiteCore\Index\Mapping::asArray(), but there's probably a better way.
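
The merge step such a plugin performs can be sketched as follows. The real plugin would be PHP; this Python version only illustrates the logic. It assumes the mapping is a dictionary with an optional top-level `dynamic_templates` list (the shape under `_doc` in the snippet above), and `with_option_text_template` is a hypothetical name. The template is prepended because Elasticsearch applies the first dynamic template that matches, and dynamic templates only affect fields that have no explicit mapping.

```python
import copy

# Same template as in the JSON snippet above.
OPTION_TEXT_TEMPLATE = {
    "option_text_strings_as_keywords": {
        "match": "option_text_*",
        "match_mapping_type": "string",
        "mapping": {"type": "keyword"},
    }
}


def with_option_text_template(mapping):
    """Return a copy of `mapping` with the keyword template listed first.

    The input is left untouched, and calling the function twice does not
    add the template a second time.
    """
    result = copy.deepcopy(mapping)
    templates = result.setdefault("dynamic_templates", [])
    name = next(iter(OPTION_TEXT_TEMPLATE))
    if not any(name in template for template in templates):
        templates.insert(0, copy.deepcopy(OPTION_TEXT_TEMPLATE))
    return result


if __name__ == "__main__":
    original = {"properties": {"sku": {"type": "keyword"}}}
    print(with_option_text_template(original))
```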

Between this and disabling index invalidation on certain attribute changes, I think that leaves us in a spot where attributes can be managed on production without triggering errors and reindexes in the process. Results TBD.

@romainruaud I'm very surprised and curious about your statement that at Smile you are not used to managing attributes through the admin console (in production). How would clients manage their catalog? Would they always have to go through development to get additional attributes? We have over 300 different types of products in our catalog; we would have to hire a developer just to manage our catalog sets and attributes.

Not to bash, but I'm curious how one would manage such a situation (and whether, in Smile's experience, Magento is 'unstable' to the point where seemingly harmless actions have to be managed to such an extent). Is there always a 'working' copy that is synchronized every few days, weeks, or months? And what about a situation that includes a PIM, for instance?

edit: typo

Fixed by #2109 and #2134
