Elasticsearch: Possibility to add new ordering scheme for Aggregation results.

Created on 11 Sep 2017 · 9 Comments · Source: elastic/elasticsearch

When using the "terms" bucket aggregation, the buckets can be ordered by term only in ascending or descending order. There is no other option available, for example ordering based on day of week.
This feature would also solve problems like #10543.
Is this something that can be implemented via a plugin? If so, how can I attach my own ordering logic?

:AnalyticAggregations >feature Analytics help wanted

All 9 comments

> Like :- Ordering based on day of week.

Where does this day of week come from? Can you please provide sample documents, the aggregation, and the output you'd like? That would help us understand your request. Thanks.

Let's say we have a bunch of alerts:

PUT /dailyalerts/server/_bulk
{ "index" : { "_id" : 1 } }
{ "alertLevel" : "1", "dayOfWeek" : "Monday" }
{ "index" : { "_id" : 2 } }
{ "alertLevel" : "7", "dayOfWeek" : "Friday" }
{ "index" : { "_id" : 3 } }
{ "alertLevel" : "2", "dayOfWeek" : "Wednesday" }
{ "index" : { "_id" : 4 } }
{ "alertLevel" : "3", "dayOfWeek" : "Tuesday" }
{ "index" : { "_id" : 5 } }
{ "alertLevel" : "6", "dayOfWeek" : "Monday" }

Now I want to count the alerts, bucketed by the day of week on which they arrive, and it would be pretty useful if I could order the buckets by day of week: Monday, Tuesday, Wednesday, etc. Similarly, many people would like to have their own ordering logic, so an option to add such ordering via plugins might help.

GET /dailyalerts/server/_search
{
  "size": 0,
  "aggs": {
    "Alerts by Weekday": {
      "terms": {
        "field": "dayOfWeek.keyword",
        "order": {
          "_term": "customDefinedOrder_DOW"
        }
      }
    }
  }
}

This becomes more prominent when you want your Kibana visualizations to show up in a specific order, and it leads to issues like this one:
https://github.com/elastic/kibana/issues/10543
That issue happened because of this commit: https://github.com/elastic/elasticsearch/commit/b01e3f0d3b35f15032ec90f2838d51873b961566#diff-dfda7f92593222e55a117eb83d3ea89eR61

On top of that, since Elasticsearch is a search engine, it would be very helpful if search results could be ordered via custom logic as well, because the built-in options certainly can't cater to everyone. So why not make it possible to define custom ordering logic via a plugin?

> Now i want to count the alerts bucket them on the day of week they arrive and would be pretty useful if I could order them based on the day of week.

For now I think that the easiest solution would be to index the day of week as a number with Monday=0, Tuesday=1 etc.
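The numeric workaround suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field name `dayOfWeekNum`, the aggregation name, and the helper functions are hypothetical and do not appear in the thread. The request body is built as a plain Python dict and is not sent anywhere. The `_key` order key applies to Elasticsearch 6.0 and later; older versions use `_term`.

```python
# Monday=0, Tuesday=1, ... as suggested in the comment above.
DAY_TO_NUM = {
    "Monday": 0, "Tuesday": 1, "Wednesday": 2, "Thursday": 3,
    "Friday": 4, "Saturday": 5, "Sunday": 6,
}
NUM_TO_DAY = {num: day for day, num in DAY_TO_NUM.items()}


def with_day_number(doc):
    """Return a copy of the document with a numeric day-of-week field.

    "dayOfWeekNum" is a hypothetical field name chosen for this sketch.
    """
    enriched = dict(doc)
    enriched["dayOfWeekNum"] = DAY_TO_NUM[doc["dayOfWeek"]]
    return enriched


def weekday_terms_agg():
    """Build a terms aggregation on the numeric field, ordered by key."""
    return {
        "size": 0,
        "aggs": {
            "alerts_by_weekday": {
                "terms": {
                    "field": "dayOfWeekNum",
                    # "_key" in Elasticsearch 6.0+; "_term" in older versions.
                    "order": {"_key": "asc"},
                }
            }
        },
    }


def label_buckets(buckets):
    """Map numeric bucket keys back to day names for display."""
    return [(NUM_TO_DAY[b["key"]], b["doc_count"]) for b in buckets]
```

The documents would be passed through `with_day_number` before indexing, and `label_buckets` would be applied to the `buckets` array of the aggregation response to recover the day names.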

> On top of it- Elasticsearch being a search engine, it would be very helpful for search results to be ordered via custom logic as well because you certainly can't cater everyone. So why not give possibility to define custom ordering logic via a plugin?

For search results one can use Script Based Sorting.
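As a rough illustration of script-based sorting for search hits, the sketch below builds a search body that sorts documents by a weekday rank passed in as script parameters. It assumes the `dayOfWeek.keyword` field from the example above; the function name and the `rank` parameter name are hypothetical. The `source` key is used by newer Elasticsearch versions, while older 5.x releases use `inline` instead, so treat the exact spelling as version dependent.

```python
def weekday_script_sort(day_rank):
    """Build a search body that sorts hits by a custom weekday rank.

    day_rank maps day names to integers, e.g. {"Monday": 0, ...}.
    """
    return {
        "query": {"match_all": {}},
        "sort": {
            "_script": {
                "type": "number",
                "order": "asc",
                "script": {
                    "lang": "painless",
                    # Look up the document's day name in the rank map.
                    "source": "params.rank[doc['dayOfWeek.keyword'].value]",
                    "params": {"rank": day_rank},
                },
            }
        },
    }
```

This only affects the order of search hits; it does not change how aggregation buckets are ordered, which is the gap this issue is about.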

For aggregation results I think we can open a discussion on this.

cc @elastic/es-search-aggs

Chatted about this a bit in Fixit Thursday. This is something we'd like to support...somehow. Not entirely sure where it belongs at the moment.

  • ~We could add scripting as a order option on the Terms aggregation, which would allow the user to order things however they want. That would be conceptually similar to how we allow scripted scoring for search hits.~

    ~The main disadvantage I see here is that the user could design some kind of sort algo that has really bad or unbounded errors. We're trying to remove other sort orders (https://github.com/elastic/elasticsearch/issues/17614, https://github.com/elastic/elasticsearch/issues/17588) because they can have unbounded error, so it'd be a shame to reintroduce a foot-gun mechanism. I suspect it would be difficult for users to reason about easily.~

  • We could extend the bucket_sort pipeline aggregation to allow scripting for similar effect. The advantage of this approach is that bucket_sort can be applied to many different types of aggregations, and scripting would add a lot of flexibility. Being a pipeline agg, we don't have any of the unbounded error problems associated with sorting a terms agg.

    The disadvantage is that it's just sorting buckets, so a result has to make it into the top n to be sortable to begin with.

  • Something else that I'm not thinking of :)

Regardless of where it ends up, it feels like a gap in our features that you can't sort buckets with custom logic.

I'm personally leaning towards extending the bucket_sort to support scripts as the best option.
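For context, here is a sketch of what bucket_sort can already do without scripting: sort the buckets of a parent aggregation by the value of a sibling metric. It is not the scripted variant proposed above, which this thread describes as not yet implemented. The numeric field `dayOfWeekNum`, the aggregation names, and the function name are hypothetical. The `terms_size` argument reflects the caveat mentioned above: a bucket has to make it into the top n returned by the terms aggregation before bucket_sort can reorder it.

```python
def weekday_bucket_sort_agg(terms_size=7):
    """Build a terms aggregation whose buckets are reordered by bucket_sort.

    Each bucket gets a "day_rank" metric (min of a hypothetical numeric
    field), and bucket_sort orders the buckets by that metric.
    """
    return {
        "size": 0,
        "aggs": {
            "alerts_by_weekday": {
                "terms": {
                    "field": "dayOfWeek.keyword",
                    # Must be large enough for every weekday to be in the top n.
                    "size": terms_size,
                },
                "aggs": {
                    "day_rank": {"min": {"field": "dayOfWeekNum"}},
                    "weekday_sort": {
                        "bucket_sort": {
                            "sort": [{"day_rank": {"order": "asc"}}]
                        }
                    },
                },
            }
        },
    }
```

This still requires a precomputed numeric field in the documents, which is what a scripted bucket_sort would make unnecessary.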

+1 for doing this in the bucket_sort pipeline aggregation

Any update on this feature?

No news @jainraj. If/when there is movement someone will update this ticket, or reference it from a PR. We would like to implement it by adding scripting to the bucket_sort pipeline agg (instead of adding sorting to the order parameter of terms agg).

The issue is marked help wanted so it's something that is up for grabs if someone wants to work on it. :)

I'd like to work on this.
