Elasticsearch: Cannot order by Group By aggregation results in SQL

Created on 30 Apr 2018  路  5Comments  路  Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version): Version: 7.0.0-alpha1-SNAPSHOT, Build: default/zip/707ba28/2018-04-27T09:12:51.745354Z, JVM: 1.8.0_144

Plugins installed: none

JVM version (java -version): java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

OS version (uname -a if on a Unix-like system): Darwin MustafaAkin.local 17.4.0 Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64 x86_64

Description of the problem including expected versus actual behavior:

Steps to reproduce:

  1. Setup metricbeat (or any data that you can take average + group by)
  2. SELECT avg(x) as val, name FROM index GROUP BY val
  3. See results not ordered

For reference, the query is as follows:

GET _xpack/sql?format=txt
{
    "query": "SELECT avg(system.process.memory.rss.bytes) AS total, system.process.name FROM metricbeat* GROUP BY system.process.name ORDER BY total LIMIT 100"
}

       total        |system.process.name
--------------------+-------------------
3.3009664E7         |CalendarAgent      
3.3805892963265306E8|Google Chrome      
2.754828747755102E8 |Google Chrome H    
2.34139648E8        |Slack Helper       
1.3305953523809524E8|Spotify            
8.14535317979798E8  |Spotify Helper     
1.3287424E7         |XprotectService    
3.394526625185185E8 |goland             
1.9761845169230768E8|iTerm2             
1.0084054522828283E9|idea               
1.5776596712727273E9|java               
2.3113728E7         |metricbeat         
2.38579712E8        |node               
1.644544E7          |quicklookd         
6209536.0           |zsh               

But the results is not ordered and it can be seen from translate API as well.

{
  "size": 0,
  "_source": false,
  "stored_fields": "_none_",
  "aggregations": {
    "groupby": {
      "composite": {
        "size": 100,
        "sources": [
          {
            "123671": {
              "terms": {
                "field": "system.process.name",
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "123838": {
          "avg": {
            "field": "system.process.memory.rss.bytes"
          }
        }
      }
    }
  }
}
:Query LanguageSQL >bug

Most helpful comment

This has been fixed through #30585 (ordering on a non-grouping column is not allowed anymore).

All 5 comments

Pinging @elastic/es-search-aggs

@costin could you take a look at this please? I think the order on the terms agg should be _term (which is implicitly ascending) rather than asc?

Relates to https://github.com/elastic/elasticsearch/issues/29900
First off, thanks for trying out SQL.

For GROUP BY, composite aggs is used (not terms) which allows ordering _only_ on the grouping keys (see the docs).

Ordering on anything else, means the data has to be reshuffled before being set to the client, which like the terms aggs doesn't scale...

The bug here is that we're not validating (yet) the query and silently discarding the order.

For GROUP BY, composite aggs is used (not terms) which allows ordering only on the grouping keys (see the docs).

My bad, I didn't see the composite agg was being used above and just saw terms. Thanks for clarifying the issue @costin

This has been fixed through #30585 (ordering on a non-grouping column is not allowed anymore).

Was this page helpful?
0 / 5 - 0 ratings