Elasticsearch-dsl-py: Fielddata is disabled on text fields by default.

Created on 28 Mar 2017 · 6 Comments · Source: elastic/elasticsearch-dsl-py

I am trying to aggregate on a text field, and I get this error on ES 5.x:

Fielddata is disabled on text fields by default. Set fielddata=true on [latlong] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

How can I set fielddata=true with the Python Elasticsearch DSL?

Regards

fellipeh

Most helpful comment

@kdbusiness90 it looks like you didn't actually create the index with the mappings in Elasticsearch; they were created for you automatically, which is why your fielddata=True isn't in the mappings. You will need to either use the Index object or at least call NotesIndex.init() before indexing any data.

Note also that you don't need fielddata, since you already have the Keyword subfield (that is the default behavior), so you can just use filing_name.keyword when you need a field for sorting/aggregations. It is a much better solution than adding fielddata, which is an in-memory data structure.

HonzaKral on 31 Dec 2017

👍11 ❤2
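
To illustrate the suggestion above, here is a minimal sketch of search request bodies that target the filing_name.keyword subfield instead of the analyzed text field. The helper names (terms_agg_body, sort_body) and the aggregation name by_filing_name are hypothetical and chosen only for this example; the bodies are plain dictionaries in the Elasticsearch query DSL, not elasticsearch-dsl-py objects.

```python
import json


def terms_agg_body(field, agg_name="by_value", size=10):
    """Build a search body that buckets documents by the exact value of `field`."""
    return {
        "size": 0,  # return only aggregation buckets, no individual hits
        "aggs": {agg_name: {"terms": {"field": field, "size": size}}},
    }


def sort_body(field, order="asc"):
    """Build a search body that matches everything and sorts on `field`."""
    return {"query": {"match_all": {}}, "sort": [{field: {"order": order}}]}


# Use the keyword subfield, which does not need fielddata.
agg = terms_agg_body("filing_name.keyword", agg_name="by_filing_name")
srt = sort_body("filing_name.keyword")

print(json.dumps(agg, indent=2))
print(json.dumps(srt, indent=2))
```

Either dictionary is the kind of body that would be sent to the index's _search endpoint; how it is sent (raw client, DSL Search object, and so on) is left open here.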

All 6 comments

Hi! I had the same issue in the past, and I solved it by adding fielddata=True when creating the fields, like:

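from elasticsearch_dsl import Text

# ...
extra_params = {'fielddata': True}
# ...
# Unpack the dict so fielddata is passed as a keyword argument.
name = Text(**extra_params)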

BR,

sanchezg on 30 Mar 2017

You can just specify Text(fielddata=True) instead of just Text().

Note, however, that doing so is not advised. If you want to aggregate on a field, you should use the Keyword field. Aggregating on a Text field using fielddata is very expensive (all values for all documents for that field are loaded into memory!!) and also doesn't typically give good results, since the content will be analyzed. See [0] for details.

0 - https://www.elastic.co/guide/en/elasticsearch/reference/5.4/fielddata.html#before-enabling-fielddata

HonzaKral on 11 May 2017

👍1
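
As a rough illustration of why aggregating on analyzed content tends to give poor results, the sketch below compares bucket counts over whole values (what a keyword field would produce) with bucket counts over individual tokens (what an analyzed text field would produce). The toy_analyze function is a crude, assumed stand-in for an analyzer (lowercase, split on whitespace), not Elasticsearch's actual implementation, and the sample values are made up.

```python
from collections import Counter


def toy_analyze(value):
    """Crude stand-in for an analyzer: lowercase and split on whitespace."""
    return value.lower().split()


# Hypothetical field values, one per document.
values = ["New York", "New Jersey", "New York"]

# Keyword-style buckets: one bucket per exact, unmodified value.
keyword_buckets = Counter(values)

# Text-style buckets: one bucket per token, counting each document once per token.
text_buckets = Counter(tok for v in values for tok in set(toy_analyze(v)))

print(keyword_buckets)
print(text_buckets)
```

With the keyword-style counts the buckets are the original names, while the token-style counts split them into fragments such as "new" that no longer correspond to any single value.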

Hi, HonzaKral, you are right. I'm new to ELK, and this was my problem. Thank you!!!

Ningshiqi on 30 Jun 2017

@HonzaKral Thank you!

Ningshiqi on 30 Jun 2017

@HonzaKral, I am facing the same issue.

I have tried the above solution, Text(fielddata=True), to achieve sorting, but it is not working for me.

Below is the class definition:

from elasticsearch_dsl import DocType, Date, Integer, Keyword, Text

class NotesIndex(DocType):
    """
    NotesIndex.init(using=es_client)
    """
    pk = Integer()
    description = Text(fields={'raw': Keyword()}, fielddata=True)
    modified_at = Date()
    subject = Text(fielddata=True)
    filing_name = Text(fielddata=True)

    class Meta:
        index = 'notes'

Mapping:

        "mappings": {
            "notes_index": {
                "properties": {
                    "created_at": {
                        "type": "date"
                    },
                    "description": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "filing_id": {
                        "type": "long"
                    },
                    "filing_name": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "id": {
                        "type": "long"
                    },
                    "modified_at": {
                        "type": "date"
                    },
                    "public": {
                        "type": "boolean"
                    },
                    "subject": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "title_filing": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "user": {
                        "properties": {
                            "full_name": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "id": {
                                "type": "long"
                            },
                            "image": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            }
                        }
                    }
                }
            }
        },

kdbusiness90 on 31 Dec 2017
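
One way to check whether fielddata actually made it into a mapping like the one above is to walk the properties and list the text fields that have it enabled. This is only a sketch: text_fields_with_fielddata is a hypothetical helper, auto_mapping is a shortened excerpt of the posted mapping, and explicit_mapping is an assumed example of what a property with fielddata enabled would look like.

```python
def text_fields_with_fielddata(properties, prefix=""):
    """Return dotted paths of text fields whose mapping has fielddata set to True."""
    found = []
    for name, spec in properties.items():
        path = prefix + name
        if spec.get("type") == "text" and spec.get("fielddata") is True:
            found.append(path)
        # Object fields carry their own nested "properties"; recurse into them.
        if "properties" in spec:
            found.extend(text_fields_with_fielddata(spec["properties"], path + "."))
    return found


keyword_subfield = {"keyword": {"type": "keyword", "ignore_above": 256}}

# Shortened excerpt of the mapping posted above (no fielddata anywhere).
auto_mapping = {
    "filing_name": {"type": "text", "fields": keyword_subfield},
    "subject": {"type": "text", "fields": keyword_subfield},
    "user": {"properties": {"full_name": {"type": "text", "fields": keyword_subfield}}},
}

# Assumed shape of a property where fielddata was explicitly enabled.
explicit_mapping = {"filing_name": {"type": "text", "fielddata": True}}

print(text_fields_with_fielddata(auto_mapping))
print(text_fields_with_fielddata(explicit_mapping))
```

An empty result for the posted mapping would be consistent with the fields having been mapped without the fielddata=True setting from the class definition.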

