Kibana: Improve handling of multi-fields in Discover

Created on 10 Jun 2016 · 26 Comments · Source: elastic/kibana

In Elasticsearch multi-fields can be defined. These fields are present in the index and can be queried against (just like _all) but are not present in the _source.

However, the Discover interface of Kibana does not appear to handle these fields properly, even though you can query against them in the search box at the top and they show up in the list of fields if you uncheck "Hide missing fields".

As a user, since I can query on them, I would expect to see them. See https://github.com/elastic/kibana/issues/1829 for original reference.

Discover KibanaApp enhancement

djschny

👍4

Most helpful comment

I would really consider this a bug. If we show the multi-fields in the list of selectable fields (even if it's only accessible by un-checking "Hide Missing Fields"), then I would expect it to actually display the value of the field in the table. However, it currently doesn't, and only displays "-". Either we should not show them in the list at all, or we should actually show the value of the field in the table.

lukasolson on 30 Sep 2016

👍10

All 26 comments

When you click on multi-fields in the sidebar, would you expect the dropdown contents ("Quick Count") to be the same as the base field, so that the user can click the magnifying glass to create filters against the raw field?

Any older members around that might be able to provide context for why multi-fields are hidden in the first place? @w33ble maybe?

Bargs on 10 Jun 2016

When you click on multi-fields in the sidebar, would you expect the dropdown contents ("Quick Count") to be the same as the base field, so that the user can click the magnifying glass to create filters against the raw field?

From an end-user perspective, yes; it's something I can query against if I type it in manually.

djschny on 10 Jun 2016

Any older members around that might be able to provide context for why multi-fields are hidden in the first place?

Before my time. I'm not sure why we hide the multi-fields, possibly to save space and/or cut down on the output?

Kibana in general doesn't handle multi-fields well. As a tangential issue, the requirement to use non-analyzed fields (usually via the .raw field) in 5.0 is also somewhat confusing. Showing the .raw multi-field clutters up the dropdown, and it's not an obvious solution the first time you run into the error either.

It's odd that unchecking that box shows multi-fields too - those fields seem to be hidden on purpose, they shouldn't just show up like that.

w33ble on 15 Jun 2016

Multi-fields are such a common use case though - Logstash generates them automatically for all string fields other than the message field _and_ it disables fielddata for all analyzed strings by default - see https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch/elasticsearch-template-es2x.json.

For LS users, if someone wants to sort on application_name in the Discover view, it will just throw an error now saying that application_name (which is an analyzed string field) cannot be sorted (even though there is an application_name.raw available in the index which is not_analyzed and uses doc_values). So with this limitation, Logstash users by default cannot sort the Discover table on any string field.

ppf2 on 9 Aug 2016

👍5
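
To make the workaround in the comment above concrete, here is a minimal sketch of a search request body that sorts on the not_analyzed sub-field instead of the analyzed parent field. The helper name `sort_on_subfield` is hypothetical, and the `raw` suffix is an assumption taken from the Logstash 2.x template naming mentioned above (newer templates use `keyword`).

```python
def sort_on_subfield(field, subfield="raw", order="asc"):
    """Build a search body that sorts on a multi-field sub-field.

    Hypothetical helper for illustration only. The default sub-field
    name "raw" is an assumption based on the Logstash 2.x template.
    """
    return {
        "query": {"match_all": {}},
        # Sorting targets "<field>.<subfield>", not the analyzed parent.
        "sort": [{f"{field}.{subfield}": {"order": order}}],
    }


body = sort_on_subfield("application_name")
```

The resulting dict could be sent as the body of a `_search` request; the point is only that the sort key is `application_name.raw` rather than `application_name`.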

I don't think that multi-fields should be displayed as different fields (AFAIK it's just the same information, analyzed differently).
Kibana should display the best type for the context. Users get really confused by the mapping produced by Logstash, for example. They don't understand which type suits what they want to do.
If the goal is to create a visualization, it makes more sense to select the "keyword" type of a string by default, not the "text" type. Currently, if I click a string field in the left panel of the Discover tab to create a quick viz, the text type is chosen. As Logstash disables fielddata, this produces an error. When creating a visualization from scratch, it would be better to display only the keyword types for a terms aggregation, leaving the user the option to explicitly select the text type, with a checkbox for example. Or display the keywords at the top of the list.
In the Discover tab, the column headers should be set to the keywords, not the text fields, as the default LS mapping prevents sorting on them. But only the short name should be displayed, not "field.keyword", which would be confusing.

In general, it would be good to hide the complexity of multi-fields, text/keyword types, etc. by choosing a default type depending on the context. Of course, Kibana should let the advanced user select another type if needed, by checking a box or something else.

dav3860 on 17 Nov 2016

I chatted with @rashidkpc recently and I think we can improve both usability and performance here by querying the field mapping API and checking for the existence of a keyword version of any text field the user wants to sort/visualize on.

Bargs on 17 Nov 2016

❤1 👍1
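
The approach proposed above (look at the mapping and prefer a keyword version of a text field) could be sketched roughly as follows. This is not Kibana's implementation: `resolve_sort_field` is a hypothetical helper, the mapping is assumed to have been fetched already and passed in as a plain dict, and treating every non-text field as sortable is a simplifying assumption.

```python
def resolve_sort_field(properties, field):
    """Return the field name to sort on, or None if nothing suitable exists.

    `properties` is the "properties" section of an index mapping as a dict.
    Hypothetical helper; illustrates the idea from the thread only.
    """
    mapping = properties.get(field)
    if mapping is None:
        return None
    if mapping.get("type") != "text":
        # Assumption: non-text fields can be sorted on directly.
        return field
    # Prefer the first sub-field of type keyword, whatever it is called.
    for name, sub in mapping.get("fields", {}).items():
        if sub.get("type") == "keyword":
            return f"{field}.{name}"
    # A text field is only sortable if fielddata was explicitly enabled.
    return field if mapping.get("fielddata") else None


properties = {
    "application_name": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "message": {"type": "text"},
    "bytes": {"type": "long"},
}

print(resolve_sort_field(properties, "application_name"))
print(resolve_sort_field(properties, "message"))
print(resolve_sort_field(properties, "bytes"))
```

With this kind of lookup the column header can keep the short name while the sort is issued against the keyword sub-field.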

... querying the field mapping API and checking for the existence of a keyword version of any text field the user wants to sort/visualize on

@bargs A user on discuss recently asked about sorting on .keyword fields in Discover. Your comment seems relevant to what this user is asking for. Are you suggesting something like checking for the existence of a .keyword version of a text field and then using that for sorting?

ycombinator on 24 Nov 2016

For reference, this is the topic I created:
https://discuss.elastic.co/t/kibana-not-using-sub-keyword-field-for-sorting/66983

I have to agree with Bargs on using .keyword if available as it improves performance while keeping things simple.

luis-silva on 24 Nov 2016

Yup, ideally we would just use the correct version of the field depending on how you're trying to use it.

Bargs on 28 Nov 2016

👍1

+1 I'm also running into this limitation. I assume the resolution would also make it possible to sort on the hidden field. I'd like to be able to sort a column on a multi-field which has both an analyzed text index and a not_analyzed string index. The UI only lets me select/display/sort the analyzed text index which isn't what I want. If there was some way to select the .raw index, I expect it'd work as I want.

blfrantz on 23 Mar 2017

+1 same here. I'd like to be able to create a search and choose fields users can sort on.

bwalsh on 23 Mar 2017

@gamercubed it's not ideal at the moment, but you can select and sort on a multi-field if you un-hide them in the sidebar. You won't see any values in the raw field's column since it has no value in _source, but the sorting will still work correctly.

Bargs on 27 Mar 2017

@Bargs
As I mentioned in issue #10996, it would be great if, when you sort on a "visible" field, Kibana used the first sub-field whose type is "keyword". This is clearly better than throwing an error (as it does today).

This is particularly relevant because, out of the box, Logstash maps each string field as type "text" with a "keyword" sub-field of type "keyword".

fbaligand on 3 Apr 2017

👍3

Hi,
I am a novice to Elasticsearch/Kibana version 4.5.1. I am not sure that my question is 100% relevant to this issue; in any event, here it is. In Discover, I would like to alphabetically sort records according to an indexed, analyzed text field. This field is actually author(s) from scientific publications. Each author appears as "Last Name First Name", and for those publications that have more than one author, a comma is used to separate the authors. So essentially I would like to sort a list of publications according to the family name of the first author in alphabetical order. It does not work. I think it has to do with the length of the field, and the way that Elasticsearch handles long strings of text. Sorting using simple fields like PubMed's PMID (8-digit numbers; "DataProviderKey" in attached screen capture), or patent numbers (e.g., US9000000) works exactly as expected. Thanks in advance for helping me.

[Screenshots: sorted by individual; sorted by DataProviderKey]

sansbonsang on 11 Jul 2017

Sorting is disabled on text fields unless you set fielddata=true in the field's mapping.
Warning: this is very memory-expensive, and therefore dangerous.
Otherwise, you can create a keyword sub-field in the mapping and use it to sort.

fbaligand on 11 Jul 2017

👍1

@fbaligand Thanks for your quick reply, and for the clear explanation. We will think about whether we really need to be able to sort according to these long text string fields.

sansbonsang on 11 Jul 2017

When you define a keyword field, you can set the "ignore_above": 256 option so that values longer than 256 characters are not indexed in the keyword field (they are skipped, not truncated). That is ample for sorting in most cases.

fbaligand on 11 Jul 2017

👍1
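
As a rough illustration of what `ignore_above` means for sorting, the sketch below simulates in plain Python which values a keyword sub-field would index. It does not call Elasticsearch; `keyword_indexed_value` is a hypothetical name, and the behaviour it models (longer values are skipped rather than truncated) is my understanding of the option, so treat it as an assumption.

```python
def keyword_indexed_value(value, ignore_above=256):
    """Simulate what a keyword sub-field indexes for a string value.

    Returns the value unchanged when it fits within `ignore_above`
    characters, and None when the value would be skipped entirely.
    Hypothetical helper; a model of the setting, not an Elasticsearch call.
    """
    if len(value) > ignore_above:
        return None  # skipped, so the document has no keyword value to sort on
    return value


short_authors = "Smith John, Doe Jane"
long_authors = "A" * 300

print(keyword_indexed_value(short_authors))
print(keyword_indexed_value(long_authors))
```

Documents whose author list exceeds the limit would therefore have no value in the keyword sub-field and would sort as missing, which is worth keeping in mind for very long strings.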

Having the ability to sort on 'string' fields that you might also want to use for partial-match filtering is an extremely important feature, in my opinion.

ArcticSnowman on 5 Jun 2018

👍2

Question from a customer, "Why can't I sort strings? Wouldn't this be the most basic feature of an analytics tool?"

Seems like this hasn't gotten much attention but it's a huge pain point for us

strawgate on 19 Jul 2018

@strawgate you can sort by string if you add the .keyword version of the field as a column in the doc table. I agree though, it's not obvious and not a good experience.

Bargs on 19 Jul 2018

@Bargs -- I started a post here with another workaround I found: https://discuss.elastic.co/t/index-mapping-type-text-and-keyword-vs-type-keyword-and-text/140805

Default Mapping generated by ES:

"Windows": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

Custom Mapping:

"Windows": {
  "type": "keyword",
  "ignore_above": 256,
  "fields": {
    "text": {
      "type": "text"
    }
  }
}

The custom mapping causes Kibana to allow sorting on the field without the ".keyword" workaround. I am however having a hard time figuring out what this might break or otherwise change. Do you have any idea? Might this be a viable workaround and we can just apply this custom mapping to text fields instead of the default mapping?

strawgate on 20 Jul 2018
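
The mapping change described above can be expressed as a small transformation. The sketch below is an illustration only: `keyword_first` is a hypothetical helper, and naming the new sub-field `text` simply follows the custom mapping shown in the comment.

```python
def keyword_first(mapping, text_subfield="text"):
    """Turn a text field with a keyword sub-field into keyword-first form.

    Hypothetical helper. Mappings that are not text, or that have no
    keyword sub-field, are returned unchanged.
    """
    if mapping.get("type") != "text":
        return mapping
    subfields = mapping.get("fields", {})
    keyword = next(
        (sub for sub in subfields.values() if sub.get("type") == "keyword"),
        None,
    )
    if keyword is None:
        return mapping
    # The keyword settings (type, ignore_above) move to the top level.
    converted = dict(keyword)
    # The original text settings (minus its sub-fields) become the sub-field.
    text_settings = {k: v for k, v in mapping.items() if k != "fields"}
    converted["fields"] = {text_subfield: text_settings}
    return converted


default_mapping = {
    "type": "text",
    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}
custom_mapping = keyword_first(default_mapping)
```

One consequence to keep in mind: with the keyword-first layout, a query against `Windows` matches exact values, and analyzed full-text search has to target `Windows.text` instead. As far as I know, applying this to an existing index also requires reindexing, since the type of an existing field cannot be changed in place.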

Yep, that's a valid workaround. Ultimately you're just changing the names of the fields. I know of at least one member of our team who arranges their mappings in this way because it makes more sense to him that way.

Bargs on 20 Jul 2018

Personally, this is the first thing I do when I customize the Logstash Elasticsearch template.

fbaligand on 20 Jul 2018

The latest update on this is that the Elasticsearch team is working on a new way to fetch field information, including multi-fields: https://github.com/elastic/elasticsearch/issues/49028

By using that new API once it's ready, we'll be able to display in Discover the most accurate representation of each document, combining _source and docvalues.

cc @kertal