Kibana: Result streaming on the discover tab

Created on 18 Dec 2017 · 10 Comments · Source: elastic/kibana

In Kibana 4, the _Discover_ tab was filled by issuing multiple _msearch HTTP requests, one per index matching the index pattern.
In Kibana 6, a single _msearch request is made to fill the _Discover_ tab.

This meant that when viewing the _Discover_ tab in Kibana 4 there was one request per index which, while less efficient overall, performed much better on slower Elasticsearch hardware. Queries over billions of documents in a hundred indexes would load progressively - slowly, as 100 small HTTP requests - but they would not time out.

With Kibana 6 and a single _msearch request there is no progressive loading; the single query for everything needs to complete and return before any data is shown - one big HTTP request. On slower Elasticsearch instances this means it can often time out on large collections of data.
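For illustration, roughly what the two shapes look like as Dev Tools requests (index names, sizes and queries here are placeholders, not the exact request Discover sends):

```
# Kibana 4 style: one _msearch per concrete index, results render as each returns
GET logstash-2017.12.17/_msearch
{}
{"size": 500, "query": {"match_all": {}}}

GET logstash-2017.12.18/_msearch
{}
{"size": 500, "query": {"match_all": {}}}

# Kibana 6 style: one _msearch over the whole pattern, nothing renders until it returns
GET logstash-*/_msearch
{}
{"size": 500, "query": {"match_all": {}}}
```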

It would seem very useful to have a toggle to bring back this progressive loading with multiple queries, one per index. This is currently a blocker for us upgrading to Kibana 6.

I discussed this issue previously here:
https://discuss.elastic.co/t/discover-tab-timing-out-with-single-msearch-request/110325

Labels: Discover, KibanaApp, enhancement

Most helpful comment

Turns out this might be solvable in ES.

All 10 comments

Splitting up the requests was always a bit hacky and significantly complicated the codebase. We could explore some options if it affects a large number of users, but this is the first I've heard of this problem.

Out of curiosity, what takes longer for you, the query itself or the aggregation for building the date histogram? You can grab the request Discover is sending from your browser's dev tool and play around with it in Kibana's dev tool app to get some timings.
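For example, a rough way to compare the two in Dev Tools is to run the hits-only part and the histogram agg separately and compare the `took` values (index and field names below are placeholders, not the exact request Discover sends):

```
# Hits only – roughly the document table part of the Discover request
GET logstash-*/_search
{
  "size": 500,
  "query": { "range": { "@timestamp": { "gte": "now-7d", "lte": "now" } } }
}

# Histogram agg only – roughly the chart part of the Discover request
GET logstash-*/_search
{
  "size": 0,
  "query": { "range": { "@timestamp": { "gte": "now-7d", "lte": "now" } } },
  "aggs": {
    "over_time": {
      "date_histogram": { "field": "@timestamp", "interval": "3h" }
    }
  }
}
```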

I've done some profiling and, if I'm interpreting this correctly, the DateHistogramAggregator is what's slow here, specifically the collect phase:
https://pastebin.com/Y4ve1yjS
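(For reference, a per-aggregator breakdown like the one in that pastebin can be produced with the search profile API; a sketch with placeholder index and field names:)

```
GET logstash-*/_search
{
  "profile": true,
  "size": 0,
  "aggs": {
    "over_time": {
      "date_histogram": { "field": "@timestamp", "interval": "3h" }
    }
  }
}
```

The aggregations section of the profile response then shows timings for the DateHistogramAggregator, including the collect phase.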

Thanks, I'll have to think about this some more and do some additional testing. It might take a little while with the holidays coming up. Just a guess, but if the date histogram is the bottleneck, it might speed things up to select a larger date interval instead of relying on whatever the auto setting picks.

Turns out this might be solvable in ES.

If the only issue here were the date histogram, we could close this issue in favor of #18853.

@Bargs please feel free to close this ticket if there is no other request in it besides the time zone problem, which is now described in #18853.

I've done some basic tests and not found any issues to do with speed due to the timezone settings in Kibana. I can do some more thorough tests on this on Tuesday.

My issue is that a query to view the Discover histogram over a billion documents takes about 5 minutes (2-node cluster) and times out. If this query is split up into chunks (one per index, à la Kibana 4) then it doesn't time out. I'm not sure whether the timezone issue is causing the query to take this long, or whether a histogram query over that many documents is simply likely to take that long. As I say, I will test more thoroughly on Tuesday, thanks.

I've done some testing and the timezone is definitely one factor in this, but the query still times out and takes too long to be useful:

Query timing:
With time_zone: "Europe/London": ~170s
With time_zone: "UTC": ~90s

This ticket is a request for the return of the histogram loading behaviour from Kibana 4 - loading the data one index at a time - which meant that we did not get timeouts. With that behaviour we would have started to see results after just a few seconds, and the data would then have filled in as the queries completed.

As it is, this new behaviour makes the Discover tab mostly unusable for us. The ability to disable it in #17065 is helpful, but not really the solution we were looking for.

Thanks for the additional info @jgough. Out of curiosity, how slow is the query if you remove the date histogram agg completely?

I agree some form of progressive loading might be nice for slow queries. Interval-based index patterns and field stats are going away, so we couldn't implement it the same way as in Kibana 4, but we could still use simple date ranges to break up the query.
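A rough sketch of what breaking it up by date range could look like (the time bounds and index names are hypothetical) - each sub-range is a separate request whose results could be rendered as it returns:

```
# First day of a 7-day window
GET logstash-*/_search
{
  "size": 500,
  "query": { "range": { "@timestamp": { "gte": "now-7d/d", "lt": "now-6d/d" } } }
}

# Second day, and so on – later responses get merged into the table/histogram as they arrive
GET logstash-*/_search
{
  "size": 500,
  "query": { "range": { "@timestamp": { "gte": "now-6d/d", "lt": "now-5d/d" } } }
}
```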

Also keep in mind in the short term you can increase the timeout settings in kibana.yml if you don't mind waiting for the slow queries to load.
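For example, in kibana.yml (the value below is illustrative; the setting is in milliseconds):

```
# Allow Elasticsearch responses to take up to 5 minutes before Kibana gives up
# (default is 30000 ms)
elasticsearch.requestTimeout: 300000
```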

Just managed to get a few more quick benchmarks. Note this is a slightly different time range, so the results are not directly comparable to the ones above.

With histogram agg (Europe/London): >~120s (timed out)
With histogram agg (UTC): ~57s
Without histogram agg: ~18s

We have a similar concern with the loading of the Discover tab. We have a 2B+ document (and growing) logging cluster. A search over 7 days of logs often matches over 1B results. While we don't get timeouts, the Discover tab loads pretty slowly, about 30s when caches are warm. I tested the query without aggs and it cuts the time roughly in half, but Kibana doesn't give an option to remove the histogram from the Discover tab. Changing the timezone doesn't make much difference since we are on a later version of ES that solves the timezone problem.

All this loading is blocking, and you get nothing until the entire search is done. If I search an individual day that matches ~250M documents I get a 6s load time (with caches warm). I would think you could get a much better user experience using progressive and/or parallel loading. If I took that same 7-day search and broke it up into 7 one-day searches, the total query time would be higher, but Kibana could start showing results on the screen sooner. There would also be the opportunity to parallelize the requests, which would bring the total time to full results down much lower.

We are running Elasticsearch and Kibana 7.4 on AWS.

