I'm trying to aggregate on a text field, and I get this error on ES 5.x:
Fielddata is disabled on text fields by default. Set fielddata=true on [latlong] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.
How can I set fielddata=true with the Python Elasticsearch DSL?
Regards
Hi! I had the same issue in the past and solved it by adding fielddata=True when creating the fields, like:
# ...
extra_params = {'fielddata': True}
# ...
# Unpack the dict so that fielddata is passed as a keyword argument.
name = Text(**extra_params)
BR,
You can just specify Text(fielddata=True) instead of just Text().
Note, however, that this is not advised: if you want to aggregate on a field, you should use a Keyword field. Aggregating on a Text field using fielddata is very expensive (all values for all documents for that field are loaded into memory!) and typically doesn't give good results either, since the content will be analyzed. See [0] for details.
0 - https://www.elastic.co/guide/en/elasticsearch/reference/5.4/fielddata.html#before-enabling-fielddata
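To make the difference concrete, here is a minimal sketch of the two mapping options and of a terms aggregation body, written as plain Python dicts in the JSON shape Elasticsearch accepts. The field name latlong comes from the error message above; the helper terms_aggregation and the aggregation name by_latlong are made up for illustration and are not part of any library.

```python
import json

# Option 1: a text field with fielddata enabled (works, but memory-hungry).
# This is the mapping that Text(fielddata=True) is meant to produce.
text_with_fielddata = {"type": "text", "fielddata": True}

# Option 2: a keyword field, stored un-analyzed and suited to aggregations.
keyword_field = {"type": "keyword"}


def terms_aggregation(field, name="by_value", size=10):
    """Build a search body that only runs a terms aggregation on `field`.

    Hypothetical helper for illustration; `name` is the label the buckets
    are returned under.
    """
    return {
        "size": 0,  # skip the hits, return only the aggregation buckets
        "aggs": {name: {"terms": {"field": field, "size": size}}},
    }


if __name__ == "__main__":
    # Mapping fragment for the recommended option, then the aggregation body.
    print(json.dumps({"properties": {"latlong": keyword_field}}, indent=2))
    print(json.dumps(terms_aggregation("latlong", name="by_latlong"), indent=2))
```

With the keyword mapping the aggregation buckets are the exact stored values; with the text mapping they would be the individual analyzed tokens.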
Hi HonzaKral, you are right. I'm new to ELK, and this was my problem. Thank you!
@HonzaKral Thank you!
@HonzaKral, I am facing the same issue.
I tried the above solution, Text(fielddata=True), to get sorting to work, but it is not working for me.
Below is the class definition:
class NotesIndex(DocType):
    """
    NoteIndex.init(using=es_client)
    """
    pk = Integer()
    description = Text(fields={'raw': Keyword()}, fielddata=True)
    modified_at = Date()
    subject = Text(fielddata=True)
    filing_name = Text(fielddata=True)

    class Meta:
        index = 'notes'
Mapping:
"mappings": {
  "notes_index": {
    "properties": {
      "created_at": {
        "type": "date"
      },
      "description": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "filing_id": {
        "type": "long"
      },
      "filing_name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "id": {
        "type": "long"
      },
      "modified_at": {
        "type": "date"
      },
      "public": {
        "type": "boolean"
      },
      "subject": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "title_filing": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "user": {
        "properties": {
          "full_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "id": {
            "type": "long"
          },
          "image": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
},
@kdbusiness90 it looks like you didn't actually create the index with the mappings in Elasticsearch; instead, they were created for you automatically, which is why your fielddata=True isn't in the mappings. You will need to either use the Index object or at least call NotesIndex.init() before indexing any data.
Note also that you don't need fielddata, since you already have the keyword subfield (that is the default behavior), so you can just use filing_name.keyword when you need a field for sorting or aggregations. It is a much better solution than adding fielddata, which is an in-memory data structure.
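As a rough sketch of both points, the following inspects a mapping to see whether fielddata actually made it in, and builds a sort clause on the keyword subfield instead. It works on plain dicts only; the properties dict is a trimmed copy of the mapping posted above, and the helper names (fields_with_fielddata, keyword_subfield, sort_body) are hypothetical, not part of elasticsearch-dsl.

```python
def fields_with_fielddata(properties):
    """Return the names of top-level fields whose mapping enables fielddata."""
    return [
        name for name, spec in properties.items()
        if spec.get("fielddata") is True
    ]


def keyword_subfield(properties, field):
    """Return 'field.sub' for the first keyword subfield of `field`, or None."""
    subfields = properties.get(field, {}).get("fields", {})
    for sub_name, sub_spec in subfields.items():
        if sub_spec.get("type") == "keyword":
            return f"{field}.{sub_name}"
    return None


def sort_body(properties, field, order="asc"):
    """Build a search body fragment that sorts on the keyword subfield."""
    sort_field = keyword_subfield(properties, field)
    if sort_field is None:
        raise ValueError(f"{field!r} has no keyword subfield to sort on")
    return {"sort": [{sort_field: {"order": order}}]}


# Trimmed copy of the auto-generated mapping from the comment above.
properties = {
    "filing_name": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "modified_at": {"type": "date"},
}

if __name__ == "__main__":
    print(fields_with_fielddata(properties))    # no field has fielddata set
    print(sort_body(properties, "filing_name"))
```

An empty result from fields_with_fielddata matches the diagnosis: the mapping was generated dynamically, so the fielddata=True declared on the class never reached the index.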