I was looking at a slow query where removing the timezone option made the query 4x faster: 8s on average without the time_zone parameter and 32s on average with a time_zone.
This query filters one week of data in February with Europe/Berlin as a timezone (so all documents are on the same side of the daylight saving time boundary) and there are more than 1B matches.
Can we speed this up?
For the record, this is less of an issue for time zones that do not implement daylight saving time, so users might want to consider switching to Etc/GMT-1 instead of Europe/Berlin if that works for them.
cc @elastic/es-search-aggs
Joda does really quite a lot of work in trying to find the previous time zone transition, which I'd guess to be the expensive bit (and it appears on the stack trace in the investigation you were working on):
java.time looks to be a lot smarter - it sorts out the transitions by year and involves a cache:
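The linked snippets aren't reproduced here, but as a rough, illustrative comparison of the two lookups (the zone and instant are arbitrary, and this is not the code from the stack trace):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRules;

import org.joda.time.DateTimeZone;

public class TransitionLookupDemo {
    public static void main(String[] args) {
        long instantMillis = Instant.parse("2018-02-15T12:00:00Z").toEpochMilli();

        // Joda-Time: finding the previous transition searches the zone's transition data on each call.
        DateTimeZone jodaZone = DateTimeZone.forID("Europe/Berlin");
        long jodaPrevious = jodaZone.previousTransition(instantMillis);

        // java.time: ZoneRules resolves rule-based transitions per year and caches them internally.
        ZoneRules rules = ZoneId.of("Europe/Berlin").getRules();
        ZoneOffsetTransition previous = rules.previousTransition(Instant.ofEpochMilli(instantMillis));

        System.out.println(jodaPrevious + " vs " + previous.getInstant().toEpochMilli());
    }
}
```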
Although I'm sure it'd be possible to put a cache around Joda, given that we're working on #27330, I think it would be a good idea to postpone any in-depth work on this until we can see what the effects of that would be.
This is good to know, thanks for sharing. I suspect that the Rounding class would be one of the easier ones to migrate so we might not have to wait long to know how much java.time helps.
I suspect that the Rounding class would be one of the easier ones to migrate...
Sounds like you're volunteering!
Did something break or change between 5.X and 6.X? I run three clusters on 5.1, 5.6.8 and 6.1, and DateHistogram aggregations are basically unusable for me with the 6.X cluster. When I put the same data in the 5.X clusters, I don't have any performance issues.
I don't think this is the right place to comment about this. If you can make a bash script that reproduces the issue against a clean cluster, I'd file it as a separate issue. If you can't, I'd take it to http://discuss.elastic.co/.
Since this pops up rather often, I wanted to add another possible performance improvement suggestion.
Before running the date histogram aggregation, we could check whether the query has an overall date range filter applied, and then look at the start and end dates of that range. If both dates fall within the same daylight saving time period, we could use the offset the time zone had during that period as a fixed time zone (e.g. if I am doing a date histogram with the time zone Europe/Berlin over an overall date range of April 1st, 2018 to July 1st, 2018, I could safely rewrite the aggregation to use the UTC+2/Etc/GMT-2 time zone instead).
This would of course not solve the performance issue when the date histogram covers a period that contains a DST switch, but it would already be an improvement for a lot of users, who are usually looking at smaller date ranges.
See also https://github.com/elastic/kibana/issues/18853 for a detailed meta issue on the Kibana side.
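Not actual Elasticsearch code, just a minimal sketch of that rewrite using java.time (the helper name and having the range bounds available as instants up front are assumptions):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRules;

public class FixedOffsetRewrite {

    /**
     * If the zone has no offset transition between start and end, return the
     * fixed offset that applies throughout that range; otherwise keep the zone.
     */
    static ZoneId rewriteIfNoTransition(ZoneId zone, Instant start, Instant end) {
        ZoneRules rules = zone.getRules();
        if (rules.isFixedOffset()) {
            return zone;
        }
        // First transition after start, or null if there is none.
        ZoneOffsetTransition next = rules.nextTransition(start);
        if (next == null || next.getInstant().isAfter(end)) {
            return rules.getOffset(start); // e.g. +02:00 for Europe/Berlin in summer
        }
        return zone; // the range spans a DST switch, keep the original zone
    }

    public static void main(String[] args) {
        ZoneId berlin = ZoneId.of("Europe/Berlin");
        // April 1st to July 1st 2018 contains no transition, so this prints +02:00.
        System.out.println(rewriteIfNoTransition(berlin,
                Instant.parse("2018-04-01T00:00:00Z"), Instant.parse("2018-07-01T00:00:00Z")));
        // January to July 2018 spans the March switch, so this keeps Europe/Berlin.
        System.out.println(rewriteIfNoTransition(berlin,
                Instant.parse("2018-01-01T00:00:00Z"), Instant.parse("2018-07-01T00:00:00Z")));
    }
}
```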
I agree this is a good idea. Unfortunately, queries and aggregations are today kept completely unaware of each other, so this would be hard to implement without adding unwanted dependencies.
Something less efficient than your proposal, but that should already cover a number of cases, would be to look at the min/max values that exist within the current shard and apply the optimization you describe if all times within that interval have the same offset. With e.g. daily indices, this optimization would still apply in most cases.
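For illustration only, a shard-level variant along those lines could derive the bounds from the date field's indexed point values and reuse the helper from the sketch above (the field name, the Lucene 7-era PointValues static helpers, and the millisecond LongPoint encoding of the date field are all assumptions, and this is not how the actual change was implemented):

```java
import java.io.IOException;
import java.time.Instant;
import java.time.ZoneId;

import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.PointValues;

public class ShardLevelRewrite {

    /** Returns the zone to use for this shard, falling back to the original zone. */
    static ZoneId zoneForShard(IndexReader reader, String dateField, ZoneId zone) throws IOException {
        byte[] minPacked = PointValues.getMinPackedValue(reader, dateField);
        byte[] maxPacked = PointValues.getMaxPackedValue(reader, dateField);
        if (minPacked == null || maxPacked == null) {
            return zone; // no indexed point values for this field on this shard
        }
        Instant shardMin = Instant.ofEpochMilli(LongPoint.decodeDimension(minPacked, 0));
        Instant shardMax = Instant.ofEpochMilli(LongPoint.decodeDimension(maxPacked, 0));
        // Reuse the range-based helper from the previous sketch.
        return FixedOffsetRewrite.rewriteIfNoTransition(zone, shardMin, shardMax);
    }
}
```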
Working my way through agg issues. @jpountz, is this closeable now that #30534 merged, or was that just a partial solution to the slowdown?
It is partial, but I think it's good enough to close this issue. Thanks for the ping.
FYI we have a few issues related to this in Kibana. It's been on my todo list to look into possible solutions for a while. I'm stoked to see the root problem may be solvable in ES.