Elasticsearch-dsl-py: Fielddata is disabled on text fields by default.

Created on 28 Mar 2017 · 6 Comments · Source: elastic/elasticsearch-dsl-py

I tried to aggregate on a text field and got this error on ES 5.x:

Fielddata is disabled on text fields by default. Set fielddata=true on [latlong] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

How can I set fielddata=true with the Python Elasticsearch DSL?

Regards

All 6 comments

Hi! I had the same issue in the past and solved it by adding fielddata=True when creating the fields, like this:

# ...
extra_params = {'fielddata': True}
# ...
name = Text(**extra_params)  # unpack the dict so fielddata is passed as a keyword argument

BR,

You can just specify Text(fielddata=True) instead of just Text().

Note, however, that this is not advised: if you want to aggregate on a field, you should use the Keyword field. Aggregating on a Text field using fielddata is very expensive (all values for all documents for that field are loaded into memory!) and typically doesn't give good results either, since the content will be analyzed. See [0] for details.

0 - https://www.elastic.co/guide/en/elasticsearch/reference/5.4/fielddata.html#before-enabling-fielddata
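To make the Keyword suggestion concrete, here is a minimal sketch of the raw mapping and search body involved, written as plain Python dicts rather than elasticsearch-dsl objects. It assumes the field from the original error (latlong) is mapped as keyword; the helper name terms_aggregation_body and the bucket name by_value are made up for illustration.

```python
import json


def terms_aggregation_body(field, size=10):
    """Build a search body that buckets documents by the exact values of `field`."""
    return {
        "size": 0,  # return only the aggregation buckets, no search hits
        "aggs": {
            # "by_value" is an arbitrary, hypothetical bucket name
            "by_value": {"terms": {"field": field, "size": size}}
        },
    }


# Assumed mapping: "latlong" is a keyword field, so no fielddata is needed.
mapping = {"properties": {"latlong": {"type": "keyword"}}}

body = terms_aggregation_body("latlong")
print(json.dumps(body, indent=2))
```

A keyword field is not analyzed, so each bucket corresponds to a whole stored value rather than to individual tokens.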

Hi HonzaKral, you are right. I'm new to ELK, and this was my problem. Thank you!

@HonzaKral Thank you!

@HonzaKral, I am facing the same issue.

I tried the above solution, Text(fielddata=True), to get sorting working, but it is not working for me.

Below is the class definition:

from elasticsearch_dsl import DocType, Date, Integer, Keyword, Text

class NotesIndex(DocType):
    """
    NotesIndex.init(using=es_client)
    """
    pk = Integer()
    description = Text(fields={'raw': Keyword()}, fielddata=True)
    modified_at = Date()
    subject = Text(fielddata=True)
    filing_name = Text(fielddata=True)

    class Meta:
        index = 'notes'

Mapping:

        "mappings": {
            "notes_index": {
                "properties": {
                    "created_at": {
                        "type": "date"
                    },
                    "description": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "filing_id": {
                        "type": "long"
                    },
                    "filing_name": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "id": {
                        "type": "long"
                    },
                    "modified_at": {
                        "type": "date"
                    },
                    "public": {
                        "type": "boolean"
                    },
                    "subject": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "title_filing": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "user": {
                        "properties": {
                            "full_name": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            },
                            "id": {
                                "type": "long"
                            },
                            "image": {
                                "type": "text",
                                "fields": {
                                    "keyword": {
                                        "type": "keyword",
                                        "ignore_above": 256
                                    }
                                }
                            }
                        }
                    }
                }
            }
        },

@kdbusiness90 it looks like you didn't actually create the index with these mappings in Elasticsearch; they were created for you automatically instead, which is why your fielddata=True isn't in the mappings. You will need to either use the Index object or at least call NotesIndex.init() before indexing any data.

Note also that you don't need fielddata, since you already have the keyword subfield (that is the default behavior), so you can just use filing_name.keyword when you need a field for sorting or aggregations. It is a much better solution than adding fielddata, which is an in-memory data structure.
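As a rough illustration of that advice, the sketch below inspects an excerpt of the mapping posted above and picks a field path that can be used for sorting or aggregations. The helper sortable_path and the bucket name by_filing_name are hypothetical, and the helper only handles top-level leaf fields, not object fields such as user.

```python
def sortable_path(properties, field):
    """Return a field path usable for sorting/aggregations, or None if there is none."""
    spec = properties[field]
    if spec.get("type") != "text":
        return field  # assumes a non-text leaf field (long, date, keyword, ...)
    if spec.get("fielddata"):
        return field  # fielddata was explicitly enabled in the mapping
    # Otherwise look for a keyword subfield, such as the auto-created "keyword".
    for sub_name, sub_spec in spec.get("fields", {}).items():
        if sub_spec.get("type") == "keyword":
            return field + "." + sub_name
    return None


# Excerpt of the auto-generated mapping from the comment above.
properties = {
    "filing_name": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "modified_at": {"type": "date"},
}

path = sortable_path(properties, "filing_name")

# Raw search body that sorts and aggregates on the chosen path.
body = {
    "sort": [{path: {"order": "asc"}}],
    "aggs": {"by_filing_name": {"terms": {"field": path}}},
}
print(path)
```

A None result would mean the text field has neither fielddata nor a keyword subfield, which is the situation that produces the "Fielddata is disabled" error.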
