Kibana: Can't build visualizations on text fields

Created on 4 Apr 2016 · 30 Comments · Source: elastic/kibana

Selecting a text field as the target for an aggregation returns the following error:

Fielddata is disabled on text fields by default. Set fielddata=true on [agent] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

Another thing that's odd is that the http response from ES containing the error has a 200 status code. I'm not sure if that's an intentional change in ES, but it doesn't seem right to me.

blocker bug v5.0.0

Bargs

All 30 comments

Another thing that's odd is that the http response from ES containing the error has a 200 status code. I'm not sure if that's an intentional change in ES, but it doesn't seem right to me.

I presume you're talking about the msearch response? It is correct that msearch will return 200 - the msearch request completed correctly, you have to look at the individual items to see if they executed correctly or not. This is just like the bulk API.

clintongormley on 5 Apr 2016
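
To illustrate the point about inspecting individual items, here is a minimal sketch of a client-side check. It assumes the msearch body carries a `responses` array in which a failed sub-search has an `error` entry; the helper name `failed_msearch_items` and the sample body are hypothetical and hand-written, not captured from a real cluster.

```python
from typing import Any, Dict, List, Tuple


def failed_msearch_items(body: Dict[str, Any]) -> List[Tuple[int, str]]:
    """Return (position, reason) for every sub-search that reported an error.

    Assumption: each entry in body["responses"] is either a normal search
    result or a dict with an "error" key.
    """
    failures = []
    for position, item in enumerate(body.get("responses", [])):
        error = item.get("error")
        if error is None:
            continue
        # The error may be a plain string or an object with a "reason" key.
        if isinstance(error, dict):
            reason = str(error.get("reason", error))
        else:
            reason = str(error)
        failures.append((position, reason))
    return failures


# Hand-written sample: the HTTP call itself succeeded (200),
# but the second sub-search failed.
sample_body = {
    "responses": [
        {"took": 3, "hits": {"total": 1, "hits": []}},
        {
            "error": {
                "type": "illegal_argument_exception",
                "reason": "Fielddata is disabled on text fields by default.",
            }
        },
    ]
}

for position, reason in failed_msearch_items(sample_body):
    print(f"sub-search {position} failed: {reason}")
```

The same pattern applies to bulk responses: the outer status only says the request was processed, so the per-item entries have to be checked.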

Not sure why this would be a P1 blocker. Until we have a way to ask elasticsearch what fields are "aggregatable" we simply have to give the user errors. Ideally the error would say something about using the not-analyzed variant of the chosen field (should it exist), but perhaps this is an enhancement we can bring to this UI issue.

spalger on 6 Apr 2016

@spalger Text fields replaced analyzed string fields. Aggregating on analyzed string fields didn't use to throw an error. It wasn't recommended, but it didn't throw an error.

Bargs on 6 Apr 2016

Sure, but text fields not being aggregatable by _default_ is a new behavior in elasticsearch that bubbles up to Kibana. How is that a bug in Kibana?

If the user wants to continue to aggregate on these value types they should do as the error message suggests and "set fielddata=true".

spalger on 6 Apr 2016

What type of solution do you imagine here?

spalger on 6 Apr 2016

TBH when I read the error message, I didn't know enough about fielddata and how it relates to aggregations to understand that it essentially meant "this field is not aggregatable". I would guess most users would have the same reaction. So if text fields can't be aggregated on by default, we should hide them from the field list in the vis editor by default.

Bargs on 6 Apr 2016

yeah, this is why I said:

Until we have a way to ask elasticsearch what fields are "aggregatable" we simply have to give the user errors.

spalger on 7 Apr 2016

We used to try to guess which fields were aggregatable, but the details about what qualifies or disqualifies a field are quite complex and have changed in the past without actually causing any breaks in Kibana. Users then started filing issues (#3335, #5914) about how elasticsearch had added the ability to aggregate on fields in some new scenario and we had no workaround for them: Kibana was simply going to prevent them from aggregating on that field until the next version was released.

This is why we did https://github.com/elastic/kibana/pull/5806, and why we fall back to the error message that elasticsearch chooses to explain the issue.

spalger on 7 Apr 2016

The historical context helps... I understand what you're saying. But this change in ES defaults, combined with our policy to simply throw an ES error if we get one, is going to lead to a really terrible user experience. By default, half of a user's string fields (all the non-raw fields) are going to throw a really cryptic error in their face. If this happened to me as a brand new Kibana user, I might just assume the app is broken. This is worse than previous versions where the defaults worked, and the user would only get an error if they intentionally messed with advanced mapping options.

I don't know what an acceptable solution would be since I don't know all the details of the previous discussions about removing the bucketable property, but I feel like at the very least we need to give the user some sort of warning or more friendly error message. Longer term, getting this into ES becomes much more important.

Bargs on 7 Apr 2016

Now that https://github.com/elastic/elasticsearch/pull/17980 is merged, we should be able to fix this.

Bargs on 2 May 2016

👍2

Just tried out Kibana 5 alpha and running into this issue as well.

I have to agree with @Bargs that the error is cryptic and I don't know what to do from here.
Since the error message suggests an option to fix the problem, it would be nice to have a way to do so in the UI, but I don't see any obvious one (no option to set fielddata=true in options for the mappings.)

streamnsight on 27 Jun 2016

Just realized there is a new .keyword suffix after the text field name that can be used to build visualizations...

It seems to work, but it raises a question: is this a 'representation' for the UI or an actual new field?
What if I have a nested field ending with .keyword? Is it going to be interpreted as the field that can be aggregated, or am I going to see two fields with the same name?

streamnsight on 27 Jun 2016

@streamnsight in 5.0 strings are mapped as multi fields with text and keyword versions by default: https://www.elastic.co/guide/en/elasticsearch/reference/master/breaking_50_mapping_changes.html#_default_string_mappings. So .keyword isn't a UI only construct, it's coming from elasticsearch.

Bargs on 27 Jun 2016
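
As an illustration of the multi-field default described above, the sketch below writes out the mapping shape that the linked 5.0 breaking-changes page gives for dynamically mapped strings, and builds a terms aggregation that targets the keyword sub-field. The field name `agent` comes from the error message in this issue; the helper names `aggregatable_name` and `terms_agg_body` and the aggregation label `top_values` are hypothetical.

```python
import json
from typing import Any, Dict

# Default dynamic mapping for a string field in 5.0: a text field with a
# keyword sub-field (shape taken from the linked breaking-changes page).
agent_mapping: Dict[str, Any] = {
    "type": "text",
    "fields": {
        "keyword": {"type": "keyword", "ignore_above": 256},
    },
}


def aggregatable_name(field: str, mapping: Dict[str, Any]) -> str:
    """Return the field itself if it is a keyword, else its keyword sub-field."""
    if mapping.get("type") == "keyword":
        return field
    for sub_name, sub_mapping in mapping.get("fields", {}).items():
        if sub_mapping.get("type") == "keyword":
            return f"{field}.{sub_name}"
    raise ValueError(f"no keyword variant found for field {field!r}")


def terms_agg_body(field: str, size: int = 10) -> Dict[str, Any]:
    """Build a search body that only runs a terms aggregation on `field`."""
    return {
        "size": 0,  # skip the hits, only the aggregation is of interest
        "aggs": {"top_values": {"terms": {"field": field, "size": size}}},
    }


target = aggregatable_name("agent", agent_mapping)
print(json.dumps(terms_agg_body(target), indent=2))
```

Because `agent.keyword` is a real sub-field in the mapping, overriding the mapping (for example in an index template) changes or removes it, which matches the later point that the name is a default rather than a reserved word.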

@Bargs thanks for the link...
Can you confirm: does that mean keyword is now a reserved field name, and I can't have a nested key called mytextfield.keyword?

streamnsight on 27 Jun 2016

@streamnsight It's not reserved, it's just a default. You can override that default by creating your own mappings for the field in your index, or index template.

Or if you want to disable the automatic multi-field entirely, you can edit the _default_ mappings for all indices.

Bargs on 27 Jun 2016

👍1

Once Kibana starts using the feature added in https://github.com/elastic/elasticsearch/pull/17980, this problem should go away as the text field won't be shown as aggregatable

clintongormley on 27 Jun 2016

This has an even uglier result in Graph UI. If you use the text field there you get a server 500 error. Elasticsearch and Kibana are showing the same error.
cc @markharwood

Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default. Set fielddata=true on [agent] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory.

but Graph shows it as a 500 error.

LeeDr on 29 Jun 2016

👍1

Still getting this error on fresh install of Elasticsearch 5 alpha 5 and Kibana 5 alpha 5. elastic/elasticsearch#17980 has not fixed this.

irab on 16 Aug 2016

@irab Kibana needs to start using the feature added in https://github.com/elastic/elasticsearch/pull/17980 before you'll see any difference

clintongormley on 16 Aug 2016

Hi @clintongormley. I took a look at that issue. It's tagged "v5.0.0-alpha3" and was committed to master back in April. I'm assuming it's in the version I'm using, v5.0.0-alpha5, released Aug 9th.

irab on 17 Aug 2016

@irab I repeat: Kibana needs to _start using_ the feature, which will mean not showing fields that shouldn't be used in aggregations.

clintongormley on 17 Aug 2016
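
A rough sketch of what "not showing fields that shouldn't be used in aggregations" could look like on the client side. The thread does not show the API or response shape introduced by the linked pull request, so the `field_info` dictionary and its `aggregatable` flag below are purely hypothetical stand-ins for whatever Elasticsearch reports per field.

```python
from typing import Dict, List


def aggregatable_fields(field_info: Dict[str, Dict[str, bool]]) -> List[str]:
    """Return the names of fields flagged as aggregatable, sorted for display.

    Fields without the flag are treated as not aggregatable, so an unknown
    field is hidden rather than offered and then rejected with an error.
    """
    return sorted(
        name for name, info in field_info.items() if info.get("aggregatable", False)
    )


# Hypothetical per-field capabilities, as a field list in the UI might receive them.
field_info = {
    "agent": {"aggregatable": False},         # text field, fielddata disabled
    "agent.keyword": {"aggregatable": True},  # keyword sub-field
    "bytes": {"aggregatable": True},          # numeric field
}

print(aggregatable_fields(field_info))
```

The design choice here is to trust a flag supplied by Elasticsearch instead of re-implementing its rules in Kibana, which is the guessing approach described earlier in the thread as fragile.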

Thanks for the clarification. Hard to tell what is enabled...

irab on 17 Aug 2016

I'm not sure I understand how to proceed. I see published tutorials like this which use the text of tweets to do what I want to do (the "Graphing Tweet Text Contents" section).

What do I need to change to allow this sort of analysis and where do I need to change it? I'm trying to recreate the example using twitter data.

zjost on 3 Feb 2017

@zjost I wasn't able to find the data set this blog post is using, but I suspect entities.hashtags.text is one giant string in the source JSON. It would be better to split that string into an array prior to indexing and then select the keyword version of the field. The other option is to turn on fielddata for the text version of the field to make it aggregatable, which would be fine if you're just playing around with things in a local environment, but it can suck up a lot of memory so you generally don't want to use it in production.

Bargs on 3 Feb 2017
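
The two options above can be sketched as follows. The first helper splits one long hashtag string into a list before indexing; it assumes the hashtags are separated by whitespace, which the thread does not confirm. The second builds the body of a mapping update that turns on fielddata for a text field; in 5.x such a body would typically be sent as a PUT to the index's `_mapping` endpoint, but the index and type names are left out here because they are not given in the thread. Both helper names are hypothetical.

```python
import json
from typing import Any, Dict, List


def split_hashtags(raw: str) -> List[str]:
    """Split a whitespace-separated hashtag string into individual values.

    Assumption: hashtags are separated by whitespace in the source JSON.
    """
    return raw.split()


def enable_fielddata_body(field: str) -> Dict[str, Any]:
    """Build a mapping-update body that enables fielddata on a text field.

    Note: fielddata is loaded into heap memory, so this is better suited to
    local experiments than to production clusters.
    """
    return {"properties": {field: {"type": "text", "fielddata": True}}}


# Option 1: index an array and aggregate on the keyword variant.
print(split_hashtags("#DataScience #elasticsearch  #kibana"))

# Option 2: make the text field itself aggregatable.
print(json.dumps(enable_fielddata_body("agent"), indent=2))
```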

Using the keyword version of the fields doesn't work in the Kibana Graph workspace.

The Graph UI makes a REST call to http://localhost:5601/api/graph/graphExplore, which returns an empty response: {"ok":true,"resp":{"took":0,"timed_out":false,"failures":[],"vertices":[],"connections":[]}}.

ES and Kibana version being used: 5.1.2

pavankumarb on 4 Feb 2017

@Bargs thanks! Is there a way to run the text through the standard analyzer before using the keyword method? I like the keyword functionality, but it only makes sense if you can first standardize the text strings or #DataScience != #datascience

zjost on 5 Feb 2017

Is there a way to run the text through the standard analyzer before using the keyword method?

See normalizers in 5.2

markharwood on 6 Feb 2017
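
For readers unfamiliar with normalizers, here is a hedged sketch of an index-creation body that attaches a lowercasing normalizer to a keyword field, followed by a plain-Python illustration of the effect on hashtag counts. The names `lowercase_normalizer`, `tweet`, and `hashtag` are hypothetical, and the Python counting only mimics what a lowercase filter would do; it does not call Elasticsearch.

```python
import json
from collections import Counter
from typing import Dict, Iterable

# Index body with a custom normalizer (requires 5.2 or later). Unlike an
# analyzer, a normalizer keeps the value as a single token, so the field
# stays a keyword and remains aggregatable without fielddata.
index_body = {
    "settings": {
        "analysis": {
            "normalizer": {
                "lowercase_normalizer": {   # hypothetical name
                    "type": "custom",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "tweet": {                          # hypothetical mapping type
            "properties": {
                "hashtag": {                # hypothetical field
                    "type": "keyword",
                    "normalizer": "lowercase_normalizer",
                }
            }
        }
    },
}


def normalized_counts(values: Iterable[str]) -> Dict[str, int]:
    """Count values after lowercasing, mimicking a lowercase normalizer."""
    return dict(Counter(value.lower() for value in values))


print(json.dumps(index_body, indent=2))
print(normalized_counts(["#DataScience", "#datascience", "#ML"]))
```

With the normalizer in place, #DataScience and #datascience would fall into the same bucket of a terms aggregation, which addresses the case-sensitivity concern raised above.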

@pavankumarb Check out the troubleshooting docs for no results

markharwood on 6 Feb 2017

So there's no way to use an analyzer and then index the tokens? The whole point is to do stemming, etc. and find patterns in documents. It seems that's exactly what many of the old tutorials do, but there are new defaults that make this either difficult or impossible. Is there any way to recreate the result where, e.g., given the text field of a tweet, one can use Graph on the field so that tokens that are significantly related are represented by the graph? Not the full tweet text, but tokens within it. Thanks again for the help.

zjost on 6 Feb 2017

@zjost the only alternative would be to enable fielddata on the text field - just be aware that it is going to use a lot of memory

clintongormley on 7 Feb 2017

👍1
