In Elasticsearch, multi-fields can be defined. These fields are present in the index and can be queried against (just like _all) but are not present in the _source.
However, in the Discover interface of Kibana, for example, even though you can query against these fields in the search box at the top, and they show up in the list of fields if you uncheck Hide missing fields, it appears Kibana does not handle these properly.
As a user, since I can query on them, I would expect to see them. See https://github.com/elastic/kibana/issues/1829 for original reference.
When you click on multi-fields in the sidebar, would you expect the dropdown contents ("Quick Count") to be the same as the base field, so that the user can click the magnifying glass to create filters against the raw field?
Any older members around that might be able to provide context for why multi-fields are hidden in the first place? @w33ble maybe?
When you click on multi-fields in the sidebar, would you expect the dropdown contents ("Quick Count") to be the same as the base field, so that the user can click the magnifying glass to create filters against the raw field?
From an end-user perspective, yes. It's something I can query against if I type it in manually.
Any older members around that might be able to provide context for why multi-fields are hidden in the first place?
Before my time. I'm not sure why we hide the multi-fields, possibly to save space and/or cut down on the output?
Kibana in general doesn't handle multi-fields well. As a tangential issue, the requirement to use non-analyzed fields (usually via the .raw field) in 5.0 is also somewhat confusing. Showing the .raw multi-field clutters up the dropdown and it's not an obvious solution the first time you run into the error either.
It's odd that unchecking that box shows multi-fields too - those fields seem to be hidden on purpose, they shouldn't just show up like that.
Multi-fields are such a common use case though - Logstash generates them automatically for all string fields other than the message field _and_ it disables fielddata for all analyzed strings by default - see https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch/elasticsearch-template-es2x.json.
For LS users, if someone wants to sort on application_name in the Discover view, it will just throw an error now saying that application_name (which is an analyzed string field) cannot be sorted (even though there is an application_name.raw available in the index which is not_analyzed and uses doc_values). With this limitation, Logstash users by default cannot sort the Discover table on any string field.
I would really consider this a bug. If we show the multi-fields in the list of selectable fields (even if it's only accessible by un-checking "Hide Missing Fields"), then I would expect it to actually display the value of the field in the table. However, it currently doesn't, and only displays "-". Either we should not show them in the list at all, or we should actually show the value of the field in the table.
I don't think that the multi-fields should be displayed as different fields (AFAIK it's just the same information, but analyzed differently).
Kibana should display the best type for the context. Users get really confused by the mapping produced by Logstash, for example. They don't understand which type is suited to what they want to do.
If the goal is to create a visualization, it makes more sense to select the "keyword" type of a string by default, not the "text" type. Currently, if I click a string field in the left panel of the Discover tab to create a quick viz, the text type is chosen. As Logstash disables fielddata, this produces an error. When creating a visualization from scratch, it would be better to display only the keyword types for a terms aggregation, leaving the possibility for the user to explicitly select the text type with a checkbox, for example. Or display the keywords at the top of the list.
In the Discover tab, the column headers should be bound to the keyword fields, not the text fields, as the default LS mapping prevents sorting on text. But only the short name should be displayed, not "field.keyword", which would be confusing.
In general, it would be good to hide the complexity of multi-fields, text/keyword types, etc. by choosing a default type depending on the context. Of course, Kibana should let the advanced user select another type if needed, by checking a box or something else.
I chatted with @rashidkpc recently and I think we can improve both usability and performance here by querying the field mapping API and checking for the existence of a keyword version of any text field the user wants to sort/visualize on.
... querying the field mapping API and checking for the existence of a keyword version of any text field the user wants to sort/visualize on
@bargs A user on discuss recently asked about sorting on .keyword fields in Discover. Your comment seems relevant to what this user is asking for. Are you suggesting something like checking for the existence of a .keyword version of a text field and then using that for sorting?
For reference, this is the topic I created:
https://discuss.elastic.co/t/kibana-not-using-sub-keyword-field-for-sorting/66983
I have to agree with Bargs on using .keyword if available as it improves performance while keeping things simple.
Yup, ideally we would just use the correct version of the field depending on how you're trying to use it.
+1 I'm also running into this limitation. I assume the resolution would also make it possible to sort on the hidden field. I'd like to be able to sort a column on a multi-field which has both an analyzed text index and a not_analyzed string index. The UI only lets me select/display/sort the analyzed text index which isn't what I want. If there was some way to select the .raw index, I expect it'd work as I want.
+1 same here. I'd like to be able to create a search and choose fields users can sort on.
@gamercubed it's not ideal at the moment, but you can select and sort on a multi-field if you un-hide them in the sidebar. You won't see any values in the raw field's column since it has no value in _source, but the sorting will still work correctly.
@Bargs
As I mention in issue #10996, it would be great if, when you sort on a "visible" field, Kibana used the first sub-field whose type is "keyword". This is clearly better than throwing an error (as it behaves today).
This is particularly relevant because, out of the box, Logstash maps each string field as type "text" with a "keyword" sub-field of type "keyword".
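To make the suggestion concrete, here is a minimal sketch of the lookup being proposed. It is not Kibana's implementation: `pick_sort_field` is a hypothetical helper, `properties` is assumed to be the `properties` section of an index mapping already fetched from Elasticsearch, and only top-level leaf fields are handled.

```python
from typing import Optional


def pick_sort_field(properties: dict, field: str) -> Optional[str]:
    """Return the field name to sort on for `field`, or None if none is sortable.

    Hypothetical helper for illustration only; handles top-level leaf fields.
    """
    mapping = properties.get(field)
    if mapping is None:
        return None
    # Non-text fields (keyword, numbers, dates, ...) can be sorted as they are.
    if mapping.get("type") != "text":
        return field
    # For a text field, prefer the first sub-field whose type is "keyword".
    for sub_name, sub_mapping in mapping.get("fields", {}).items():
        if sub_mapping.get("type") == "keyword":
            return f"{field}.{sub_name}"
    # A text field with fielddata enabled is sortable, at a memory cost.
    if mapping.get("fielddata"):
        return field
    return None


# Example mapping in the shape Logstash produces by default for string fields.
properties = {
    "application_name": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "message": {"type": "text"},
    "status": {"type": "long"},
}

for name in ("application_name", "message", "status"):
    print(name, "->", pick_sort_field(properties, name))
```

With a lookup like this, the column header could keep showing the short name while the sort itself is issued against the keyword sub-field.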
Hi,
I am a novice to Elasticsearch/Kibana version 4.5.1. I am not sure that my question is 100% relevant to this issue; in any event, here it is. In Discover, I would like to alphabetically sort records according to an indexed, analyzed text field. This field is actually author(s) from scientific publications. Each author appears as "Last Name First Name", and for those publications that have more than one author, a comma is used to separate the authors. So essentially I would like to sort a list of publications according to the family name of the first author in alphabetical order. It does not work. I think it has to do with the length of the field, and the way that Elasticsearch handles long strings of text. Sorting using simple fields like PubMed's PMID (8-digit numbers; "DataProviderKey" in attached screen capture), or patent numbers (e.g., US9000000) works exactly as expected. Thanks in advance for helping me.
Sorting is disabled on text fields unless you set fielddata=true in the field's mapping.
Warning: that is very memory-expensive, and therefore risky.
Alternatively, you can add a sub-field of type keyword in the mapping and sort on that.
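As a rough illustration of the second option, the search request simply names the keyword sub-field in its sort clause. The sketch below only builds the request body; the field name `authors`, the sub-field name `keyword`, and the helper `sorted_search_body` are assumptions made for the example, so adjust them to your own mapping.

```python
import json


def sorted_search_body(field: str, sub_field: str = "keyword", order: str = "asc") -> dict:
    """Build a search body that sorts on `<field>.<sub_field>`.

    Hypothetical helper: assumes the mapping defines a keyword sub-field
    with that name under the text field.
    """
    return {
        "query": {"match_all": {}},
        # Sort on the keyword sub-field, not on the analyzed text field.
        "sort": [{f"{field}.{sub_field}": {"order": order}}],
    }


body = sorted_search_body("authors")
print(json.dumps(body, indent=2))
```

The resulting JSON can be sent as the body of a `_search` request against the index.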
@fbaligand Thanks for your quick reply, and for the clear explanation. We will think about whether we really need to be able to sort according to these long text string fields.
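When you define a keyword-typed field, you can set the "ignore_above": 256 option so that strings longer than 256 characters are not indexed in the keyword field. That is more than enough for sorting typical values. Note that longer values are skipped rather than truncated, so documents with very long values sort as if the field were missing.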
Having the ability to sort on 'string' fields that you might also want to filter with partial matches is an extremely important feature, in my opinion.
Question from a customer, "Why can't I sort strings? Wouldn't this be the most basic feature of an analytics tool?"
Seems like this hasn't gotten much attention but it's a huge pain point for us
@strawgate you can sort by string if you add the .keyword version of the field as a column in the doc table. I agree though, it's not obvious and not a good experience.
@Bargs -- I started a post here with another workaround I found: https://discuss.elastic.co/t/index-mapping-type-text-and-keyword-vs-type-keyword-and-text/140805
Default Mapping generated by ES:
"Windows": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}
Custom Mapping:
"Windows": {
  "type": "keyword",
  "ignore_above": 256,
  "fields": {
    "text": {
      "type": "text"
    }
  }
}
The custom mapping causes Kibana to allow sorting on the field without the ".keyword" workaround. I am however having a hard time figuring out what this might break or otherwise change. Do you have any idea? Might this be a viable workaround and we can just apply this custom mapping to text fields instead of the default mapping?
Yep, that's a valid workaround. Ultimately you're just changing the names of the fields. I know of at least one member of our team who arranges their mappings in this way because it makes more sense to him that way.
Personally, this is the first thing I do when I customize the Logstash Elasticsearch template.
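For anyone wanting to apply that swap across a whole template, the transformation can be sketched as below. This is only an illustration under stated assumptions: `keyword_first` is a hypothetical helper, the sub-field name `text` follows the custom mapping shown above, and the function works on a single field's mapping dict.

```python
import copy


def keyword_first(mapping: dict, text_sub_field: str = "text") -> dict:
    """Turn a text field with a keyword sub-field into a keyword field
    with a text sub-field. Other mappings are returned unchanged (as copies).
    """
    if mapping.get("type") != "text":
        return copy.deepcopy(mapping)
    fields = mapping.get("fields", {})
    # Find the first keyword sub-field, if there is one.
    keyword_name = next(
        (name for name, sub in fields.items() if sub.get("type") == "keyword"),
        None,
    )
    if keyword_name is None:
        return copy.deepcopy(mapping)
    # Promote the keyword sub-field (with its options) to the top level.
    promoted = copy.deepcopy(fields[keyword_name])
    # Demote the text settings (type, analyzer, ...) into a sub-field.
    text_part = {k: copy.deepcopy(v) for k, v in mapping.items() if k != "fields"}
    # Keep any remaining sub-fields next to the new text sub-field.
    others = {n: copy.deepcopy(s) for n, s in fields.items() if n != keyword_name}
    promoted["fields"] = {text_sub_field: text_part, **others}
    return promoted


default_mapping = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}
custom_mapping = keyword_first(default_mapping)
print(custom_mapping)
```

Two things to keep in mind: the type of an existing field cannot be changed in place, so this applies to new indices or requires a reindex, and full-text queries then need to target the text sub-field (`Windows.text` in the example above) instead of the base field.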
The latest update on this is that the Elasticsearch team is working on a new way to fetch field information, including multi-fields: https://github.com/elastic/elasticsearch/issues/49028
By using that new API once it's ready, we'll be able to display in Discover the most accurate representation of each document, combining _source and docvalues.
cc @kertal
Great! 👍