Presto: Count distinct on multiple columns

Created on 15 May 2016  路  3Comments  路  Source: prestodb/presto

Hi there,

In our system, queries like:
SELECT COUNT(DISTINCT _col0, _col1, _col2) .....
are commonly used.

Surely we can rewrite these to CONCAT etc., before submitting to presto.
Is there any plan to support these kind of queries, or any other suggestions?
Thanks.

Most helpful comment

The next version (0.148) will have this using the row constructor:

SELECT count(DISTINCT row(col1, col2, col3)) FROM ...

All 3 comments

The next version (0.148) will have this using the row constructor:

SELECT count(DISTINCT row(col1, col2, col3)) FROM ...

This is currently in the master branch. You can try it out now:

presto> select count(distinct orderstatus || ',' || clerk) from tpch.tiny.orders;
 _col0 
-------
  2310 
(1 row)

presto> select count(distinct row(orderstatus, clerk)) from tpch.tiny.orders;
 _col0 
-------
  2310 
(1 row)

@electrum
this can't work, when col1 / col2 / col3s value contains Null.

SELECT count(DISTINCT row(col1, col2, col3)) FROM ...

Was this page helpful?
0 / 5 - 0 ratings