This test class can occupy around 40% of :server:test time.
> Task :server:test
Slow Tests Summary:
45.63s | org.elasticsearch.search.aggregations.bucket.histogram.AutoDateHistogramAggregatorTests
Pinging @elastic/es-search-aggs
Can I work on this?
@ekalgolas I had been planning to do it myself, but you're welcome to take it if you're still interested.
Sure. I will investigate the slowness in the test code and post my findings here. Do you already know of a specific cause that you wanted to address?
I tried some refactoring to avoid recomputing dates and made some common objects global. This yields only a 10-15% performance improvement in the best case. The actual bottleneck is the testAll* tests, which test intervals with a large dataset (~600) and assert on the search and reduce cases with 5-6 variations. Each of those tests takes about 3-4 seconds on my machine, and there are about 5 of them, for a total execution time of 20-23 seconds.
An easy fix would be to reduce either the dataset size or the number of variations, but for that I need to know how important these variations and the large dataset size are. If they are absolutely essential (which does not seem to be the case for every point in the dataset), we will have to figure out how to run these tests faster while keeping the parameters the same. Otherwise, we can either rewrite the tests to cover the edge data points (plus a few regular cases) or reduce the variations so that the functionality is still tested with less data and fewer assertions.
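To make the "edge data points plus a few regular cases" option concrete, here is a rough sketch. It assumes the dataset is parameterized by a size from 1 up to roughly 600; `EdgeCaseSampler` and `sampleSizes` are hypothetical names for illustration, not anything in the existing test class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.TreeSet;

// Hypothetical helper, not part of AutoDateHistogramAggregatorTests.
public class EdgeCaseSampler {

    /**
     * Returns the boundary sizes (1 and max) plus up to `interior`
     * distinct random sizes strictly between them, in ascending order.
     */
    static List<Integer> sampleSizes(int max, int interior, Random random) {
        if (max < 1) {
            throw new IllegalArgumentException("max must be >= 1");
        }
        TreeSet<Integer> sizes = new TreeSet<>();
        sizes.add(1);      // smallest edge case
        sizes.add(max);    // largest edge case
        // Only max - 2 interior values exist, so cap the request.
        int wanted = Math.max(0, Math.min(interior, max - 2));
        int target = sizes.size() + wanted;
        while (sizes.size() < target) {
            sizes.add(2 + random.nextInt(max - 2)); // value in [2, max - 1]
        }
        return new ArrayList<>(sizes);
    }

    public static void main(String[] args) {
        long seed = System.nanoTime();
        System.out.println("seed=" + seed); // log the seed so a failure can be replayed
        for (int size : sampleSizes(600, 3, new Random(seed))) {
            System.out.println("would run the interval assertions with " + size + " data points");
        }
    }
}
```

With 3 interior samples this runs 5 dataset sizes per test instead of every size up to ~600, while still hitting both boundaries on every run.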
Thoughts @pcsanwald ?
Hi @ekalgolas - I apologize for the delayed response on this. If you're still interested in working on this, I think the right approach is to:
1) move the unit tests to use randomized parameters (perhaps a randomized range would be good) as opposed to the exhaustive set of variations.
2) move the benchmarking part of this to use rally
If you're still interested (I again apologize for the delay in responding), and want to take a crack at the first thing, I can handle the benchmarking on the rally side.
CC @colings86 for an expert opinion on whether the above seems right :).
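As a rough illustration of point 1, here is a minimal sketch of swapping an exhaustive loop over variations for a seeded random pick. It uses plain `java.util.Random` so that it stands alone; the real test would presumably rely on the project's own randomized-testing helpers. `RandomizedVariations`, `pick`, and the list of variations are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not the actual test code.
public class RandomizedVariations {

    /** Returns up to `count` distinct elements of `all`, chosen at random. */
    static <T> List<T> pick(List<T> all, int count, Random random) {
        List<T> copy = new ArrayList<>(all);
        Collections.shuffle(copy, random);
        return new ArrayList<>(copy.subList(0, Math.min(count, copy.size())));
    }

    /** Returns a random int in [min, max], both inclusive. */
    static int randomIntBetween(int min, int max, Random random) {
        return min + random.nextInt(max - min + 1);
    }

    public static void main(String[] args) {
        long seed = System.nanoTime();
        Random random = new Random(seed);
        System.out.println("seed=" + seed); // needed to reproduce a failing run

        // Assumed stand-ins for the 5-6 variations each testAll* test loops over.
        List<Integer> variations = List.of(1, 2, 5, 10, 50, 100);
        int dataPoints = randomIntBetween(1, 600, random);

        // Exercise 2 randomly chosen variations per run instead of all of them.
        for (int variation : pick(variations, 2, random)) {
            System.out.println("would assert with " + dataPoints
                + " data points and variation " + variation);
        }
    }
}
```

Each run covers less, but over many CI runs the random picks still cover the full set, and logging the seed keeps any failure reproducible.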
Sure. I'll get started and submit a pull request. Thanks for the response.
@pcsanwald Created a pull request. I was able to bring down the run time by 60%.