Kibana version: 7.9.3
Describe the bug:
On a cold load of a dashboard with multiple visualisations, each visualisation fetches its index pattern. It is often the case that the index pattern is the same.
Multiple visualisations call https://github.com/elastic/kibana/blob/master/src/plugins/data/common/index_patterns/index_patterns/index_patterns.ts#L360 in parallel for the same index pattern.
There is a cache mechanism, but it isn't populated until an index pattern has finished loading. So when the same index pattern is requested from multiple places before it is loaded, savedObjectsClient.get is called multiple times for the same object.
savedObjectsClient.get has a batching mechanism, but the batch still retrieves multiple identical saved objects.
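The race can be sketched as follows (a simplified illustration, not Kibana's actual implementation; `loadIndexPattern` is a hypothetical stand-in for the savedObjectsClient round trip): caching the in-flight promise, rather than only the resolved object, lets concurrent callers share one fetch.

```typescript
// Sketch: dedupe concurrent get() calls for the same index pattern id by
// caching the pending promise instead of waiting for the resolved object.
type IndexPattern = { id: string; title: string };

let fetchCount = 0; // instrumentation to show the effect
const loadIndexPattern = async (id: string): Promise<IndexPattern> => {
  fetchCount++; // stand-in for the savedObjectsClient.get round trip
  return { id, title: `pattern-${id}` };
};

const cache = new Map<string, Promise<IndexPattern>>();

const get = (id: string): Promise<IndexPattern> => {
  // A cache hit now includes requests that are still in flight, so N
  // concurrent callers share a single fetch.
  const cached = cache.get(id);
  if (cached) return cached;
  const pending = loadIndexPattern(id);
  cache.set(id, pending);
  return pending;
};

// Three "visualizations" requesting the same pattern in parallel:
Promise.all([get("logs-*"), get("logs-*"), get("logs-*")]).then(() => {
  console.log(fetchCount); // 1, not 3
});
```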
Steps to reproduce:
(screenshot: bulk_get request)
Expected behavior:
No redundant index pattern objects flying over the network.
Screenshots (if relevant):

Any additional context:
Pinging @elastic/kibana-app-arch (Team:AppArch)
OR we could improve the batching mechanism in the SO client with dedupe; in that case everyone would benefit.
@elastic/kibana-platform, do you think it makes sense to improve on SO client level?
Thanks for finding this @Dosant - very helpful. I think it would be worthwhile to alter the Index Pattern service cache to address this, independent of the decision to fix this at a saved object client level.
> @elastic/kibana-platform, do you think it makes sense to improve on SO client level?
Well, the short answer would be no. bulkGet does what it is supposed to do: it bulk-gets from ES, converts to saved objects, and returns the result. Also, other consumers may be using the same API and expect the resulting array to have the same length as the requested objects; changing this behavior to remove dupes would be a breaking change for them.
It's really at the consumer's discretion to have the appropriate logic to avoid requesting the same object multiple times when performing a bulkGet call, imho.
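The consumer-side logic described above could look like this (a sketch only; `ObjRef`, `SavedObject`, and `dedupedBulkGet` are simplified hypothetical types, not Kibana's actual interfaces): collapse duplicate type/id pairs before the call, then expand the results so the output still matches the input length and order.

```typescript
// Sketch of consumer-side dedupe around a bulkGet-style call.
interface ObjRef { type: string; id: string }
interface SavedObject extends ObjRef { attributes: unknown }

type BulkGet = (refs: ObjRef[]) => Promise<SavedObject[]>;

const dedupedBulkGet = async (bulkGet: BulkGet, refs: ObjRef[]): Promise<SavedObject[]> => {
  const key = (r: ObjRef) => `${r.type}|${r.id}`;
  // Collapse duplicate {type, id} pairs before hitting the network.
  const distinct = new Map<string, ObjRef>();
  for (const r of refs) {
    if (!distinct.has(key(r))) distinct.set(key(r), r);
  }
  const fetched = await bulkGet([...distinct.values()]);
  // Fan the results back out so the output array matches the input.
  const byKey = new Map(fetched.map((o) => [key(o), o] as [string, SavedObject]));
  return refs.map((r) => byKey.get(key(r))!);
};
```

Callers keep the "same length in, same length out" contract, so nothing downstream has to change.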
@rudolf, wdyt?
> Also, other consumers may be using the same API and expect the resulting array to have the same length as the requested objects; changing this behavior to remove dupes would be a breaking change for them.

What about deduping at the processBatchQueue level?

Hi,
One note here: the dashboard with 50 panels loads very fast in Kibana 6.8.3, and the bulk_get request JSON contains each index pattern id only once, not once per panel on the dashboard.
The behavior in 6.8.3:
The behavior in 7.9.3:
Of course, with 5 panels the loading time is not so different, but with 50 panels, after the bulk_get, the browser spends 20+ seconds parsing the response in 7.9.3.
Regards,
Robert
It is still possible to make an inner optimization without changing the behavior, correct?
Sorry, I misunderstood the problem. I thought you were calling bulkGet with duplicates... You are actually calling get multiple times with the same type/id tuple, and the resulting bulkGet that fires from processBatchQueue is sending the duplicates to the backend in the bulk_get call.
In that case you are right. Even if it's an implementation detail, this should be optimized at the SO client's level. I would call that an enhancement / perf optimization though, not a bug, as it doesn't lead to any error.
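Deduping inside the batch queue could be sketched roughly like this (hypothetical code, not the actual PR: `fetchBulk`, the queue shape, and `processBatchQueue` here are simplified stand-ins). Queued get() calls are collected as usual; the flush sends each distinct id once and resolves every queued caller from the single response, so the public behavior is unchanged.

```typescript
// Sketch: dedupe at the batch-queue level of a batching saved-objects client.
type Resolve = (obj: { id: string }) => void;

const queue: Array<{ id: string; resolve: Resolve }> = [];
let networkIds: string[] = []; // instrumentation: what the last flush sent

// Stand-in for the real bulk_get network call.
const fetchBulk = async (ids: string[]) => {
  networkIds = ids;
  return ids.map((id) => ({ id }));
};

// Each get() just enqueues itself and waits for the next flush.
const get = (id: string): Promise<{ id: string }> =>
  new Promise((resolve) => queue.push({ id, resolve }));

const processBatchQueue = async () => {
  const batch = queue.splice(0);
  // Send each distinct id once...
  const distinct = [...new Set(batch.map((item) => item.id))];
  const results = await fetchBulk(distinct);
  // ...then resolve every queued caller, duplicates included.
  const byId = new Map(results.map((o) => [o.id, o] as [string, { id: string }]));
  for (const item of batch) item.resolve(byId.get(item.id)!);
};
```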
Just opened https://github.com/elastic/kibana/pull/82603 that closes #82581. Should I also indicate that the PR fixes this issue?
> Should I also indicate that the PR fixes this issue?
I think so, yes, thank you!
Could be closed, unless @mattkime also wants to polish this at the index pattern level.
IMO we are good with the fix in core.
@Dosant I don't see a reason to make index pattern related changes considering @pgayvallet's PR.
@mattkime, just thought of a small bit we could still improve on the index pattern side:
With https://github.com/elastic/kibana/pull/82603 we optimized the data sent over the network, but at the index pattern layer we still waste time on scripting while initializing index patterns from the saved object, because we check for a cache hit only at the beginning of the get() method.
This redundant processing can accumulate into significant blocking time for large index patterns when we load a dashboard with a bunch of visualisations:
Here is a 7.9 example:

I think we should check for a cache hit twice:
1. at the beginning of get(), as we do now;
2. again after the saved object has been fetched.
At point 2 it is possible that some other task has already put a final index pattern object into the cache, and we no longer need to spend CPU ticks to initialize the same index pattern again. This micro-optimization should save a significant amount of CPU time on the initial load of large dashboards with large index patterns.
Pseudo code:

Now:

```ts
get = async (id) => {
  if (cache.has(id)) return cache.get(id);
  const so = await fetch(id);
  const ip = await init(so);
  cache.set(id, ip);
  return ip;
};
```
Improvement:

```ts
get = async (id) => {
  if (cache.has(id)) return cache.get(id);
  const so = await fetch(id);
  if (cache.has(id)) return cache.get(id); // <---- check cache again to avoid a possible redundant init() call and save CPU cycles
  const ip = await init(so);
  cache.set(id, ip);
  return ip;
};
```
Unsubscribing now that https://github.com/elastic/kibana/pull/82603 is closed. Feel free to re-ping if you need anything else from our side.
@Dosant That's a good point. I'm also thinking of the case where we load the field list on the fly. We'd be hitting the field caps API pretty hard if we're not caching at the appropriate level. I agree that this is worth doing.