Kibana version: 7.9.3
Describe the bug:
On a cold load of a dashboard with multiple visualisations, each visualisation fetches its index pattern. It is often the case that the index pattern is the same.
Multiple visualisations call https://github.com/elastic/kibana/blob/master/src/plugins/data/common/index_patterns/index_patterns/index_patterns.ts#L360 in parallel for the same index pattern.
There is a cache mechanism, but it isn't populated until an index pattern has finished loading. So when the same index pattern is requested from multiple places before it is loaded, savedObjectsClient.get is called multiple times for the same object.
savedObjectsClient.get has a batching mechanism, but the batch still retrieves multiple identical saved objects.
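The race can be sketched as follows (a simplified illustration, not Kibana's actual implementation; `loadIndexPattern` is a hypothetical stand-in for the savedObjectsClient round trip): caching the in-flight promise, rather than only the resolved object, lets concurrent callers share one fetch.

```typescript
// Sketch: dedupe concurrent get() calls for the same index pattern id by
// caching the pending promise instead of waiting for the resolved object.
type IndexPattern = { id: string; title: string };

let fetchCount = 0; // instrumentation to show the effect
const loadIndexPattern = async (id: string): Promise<IndexPattern> => {
  fetchCount++; // stand-in for the savedObjectsClient.get round trip
  return { id, title: `pattern-${id}` };
};

const cache = new Map<string, Promise<IndexPattern>>();

const get = (id: string): Promise<IndexPattern> => {
  // A cache hit now includes requests that are still in flight, so N
  // concurrent callers share a single fetch.
  const cached = cache.get(id);
  if (cached) return cached;
  const pending = loadIndexPattern(id);
  cache.set(id, pending);
  return pending;
};

// Three "visualizations" requesting the same pattern in parallel:
Promise.all([get("logs-*"), get("logs-*"), get("logs-*")]).then(() => {
  console.log(fetchCount); // 1, not 3
});
```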
Steps to reproduce:
(screenshot: bulk_get request)
Expected behavior:
No redundant index pattern objects flying over the network.
Screenshots (if relevant):

Any additional context:
Pinging @elastic/kibana-app-arch (Team:AppArch)
OR we could improve the batching mechanism in the SO client with dedupe; in that case everyone would benefit.
@elastic/kibana-platform, do you think it makes sense to improve on SO client level?
Thanks for finding this @Dosant - very helpful. I think it would be worthwhile to alter the Index Pattern service cache to address this, independent of the decision to fix this at a saved object client level.
> @elastic/kibana-platform, do you think it makes sense to improve on SO client level?
Well, the short answer would be no. bulkGet does what it is supposed to do: it bulk-gets from ES, converts to saved objects, and returns the result. Also, other consumers may be using the same API and expect the resulting array to have the same length as the requested objects; changing this behavior to remove dupes would be a breaking change for them.
It's really at the consumer's discretion to have the appropriate logic to avoid requesting the same object multiple times when performing a bulkGet call, imho.
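The consumer-side logic described above could look like this (a sketch only; `ObjRef`, `SavedObject`, and `dedupedBulkGet` are simplified hypothetical types, not Kibana's actual interfaces): collapse duplicate type/id pairs before the call, then expand the results so the output still matches the input length and order.

```typescript
// Sketch of consumer-side dedupe around a bulkGet-style call.
interface ObjRef { type: string; id: string }
interface SavedObject extends ObjRef { attributes: unknown }

type BulkGet = (refs: ObjRef[]) => Promise<SavedObject[]>;

const dedupedBulkGet = async (bulkGet: BulkGet, refs: ObjRef[]): Promise<SavedObject[]> => {
  const key = (r: ObjRef) => `${r.type}|${r.id}`;
  // Collapse duplicate {type, id} pairs before hitting the network.
  const distinct = new Map<string, ObjRef>();
  for (const r of refs) {
    if (!distinct.has(key(r))) distinct.set(key(r), r);
  }
  const fetched = await bulkGet([...distinct.values()]);
  // Fan the results back out so the output array matches the input.
  const byKey = new Map(fetched.map((o) => [key(o), o] as [string, SavedObject]));
  return refs.map((r) => byKey.get(key(r))!);
};
```

Callers keep the "same length in, same length out" contract, so nothing downstream has to change.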
@rudolf, wdyt?
> Also, other consumers may be using the same API and expect the resulting array to have the same length as the requested objects; changing this behavior to remove dupes would be a breaking change for them.

What about deduping at the processBatchQueue level?

Hi,
One note here: the dashboard with 50 panels loads very fast in Kibana 6.8.3, and the bulk_get request JSON contains each index pattern id only once, not once per panel on the dashboard.
The behavior in 6.8.3:
The behavior in 7.9.3:
Of course, with 5 panels the loading time is not so different, but with 50 panels, after the bulk_get, the browser spends 20+ seconds parsing the response in 7.9.3.
Regards,
Robert
It is still possible to make an inner optimization without changing the behavior, correct?
Sorry, I misunderstood the problem. I thought you were calling bulkGet with duplicates... You are actually calling get multiple times with the same type/id tuple, and the resulting bulkGet that fires from processBatchQueue is sending the duplicates to the backend in the bulk_get call.
In that case you are right. Even if it's an implementation detail, this should be optimized at the SO client's level. I would call that an enhancement / perf optimization though, not a bug, as it doesn't lead to any error.
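Deduping inside the batch queue could be sketched roughly like this (hypothetical code, not the actual PR: `fetchBulk`, the queue shape, and `processBatchQueue` here are simplified stand-ins). Queued get() calls are collected as usual; the flush sends each distinct id once and resolves every queued caller from the single response, so the public behavior is unchanged.

```typescript
// Sketch: dedupe at the batch-queue level of a batching saved-objects client.
type Resolve = (obj: { id: string }) => void;

const queue: Array<{ id: string; resolve: Resolve }> = [];
let networkIds: string[] = []; // instrumentation: what the last flush sent

// Stand-in for the real bulk_get network call.
const fetchBulk = async (ids: string[]) => {
  networkIds = ids;
  return ids.map((id) => ({ id }));
};

// Each get() just enqueues itself and waits for the next flush.
const get = (id: string): Promise<{ id: string }> =>
  new Promise((resolve) => queue.push({ id, resolve }));

const processBatchQueue = async () => {
  const batch = queue.splice(0);
  // Send each distinct id once...
  const distinct = [...new Set(batch.map((item) => item.id))];
  const results = await fetchBulk(distinct);
  // ...then resolve every queued caller, duplicates included.
  const byId = new Map(results.map((o) => [o.id, o] as [string, { id: string }]));
  for (const item of batch) item.resolve(byId.get(item.id)!);
};
```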
Just opened https://github.com/elastic/kibana/pull/82603 that closes #82581. Should I also indicate that the PR fixes this issue?
> Should I also indicate that the PR fixes this issue?
I think so, yes, thank you!
Could be closed, unless @mattkime also wants to polish this at the index pattern level.
IMO we are good with the fix in core.
@Dosant I don't see a reason to make index pattern related changes considering @pgayvallet's PR.
@mattkime, just thought of a small bit we could still improve on the index pattern side:
With https://github.com/elastic/kibana/pull/82603 we optimized the data sent over the network, but at the index pattern layer we still waste time on scripting while initializing index patterns from the saved object, because we check for a cache hit only at the beginning of the get() method.
This redundant processing can accumulate into significant blocking time for large index patterns when we load a dashboard with a bunch of visualisations:
Here is a 7.9 example:

I think we should check for a cache hit twice:
1. at the beginning of get(), as we do now;
2. again after the saved object has been fetched.
At point 2 it is possible that some other task has already put a final index pattern object into the cache, and we no longer need to spend CPU ticks to initialize the same index pattern again. This micro-optimization should save a significant amount of CPU time on the initial load of large dashboards with large index patterns.
Pseudo code:

Now:

```ts
get = async (id) => {
  if (cache.has(id)) return cache.get(id);
  const so = await fetch(id);
  const ip = await init(so);
  cache.set(id, ip);
  return ip;
};
```
Improvement:

```ts
get = async (id) => {
  if (cache.has(id)) return cache.get(id);
  const so = await fetch(id);
  if (cache.has(id)) return cache.get(id); // <---- check cache again to avoid a possible redundant init() call and save CPU cycles
  const ip = await init(so);
  cache.set(id, ip);
  return ip;
};
```
Unsubscribing now that https://github.com/elastic/kibana/pull/82603 is closed. Feel free to re-ping if you need anything else from our side.
@Dosant That's a good point. I'm also thinking of the case where we load the field list on the fly. We'd be hitting the field caps API pretty hard if we're not caching at the appropriate level. I agree that this is worth doing.