Ksql: GROUP BY should work with Scalar functions

Created on 12 Dec 2017  路  6Comments  路  Source: confluentinc/ksql

The following query should work:

SELECT
  TIMESTAMPTOSTRING(ROWTIME, 'yyyy'),
  sum(1)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY TIMESTAMPTOSTRING(ROWTIME, 'yyyy');

Currently, it fails with:

Query terminated
enhancement

Most helpful comment

Fixed. You can now group by scalar functions. You can also, optionally, use the _exact_ same clause in the select statement. So these are now supported:

-- with the group by in the select statement:
SELECT
  TIMESTAMPTOSTRING(ROWTIME, 'yyyy'),
  sum(1)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY TIMESTAMPTOSTRING(ROWTIME, 'yyyy');

-- without the group by in the select statement:
SELECT
  COUNT(*)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY TIMESTAMPTOSTRING(ROWTIME, 'yyyy');

All 6 comments

@tomlinsonr currently, GROUP BY only accepts column names.

To clarify, we would like group by to work with scalar functions as well, essentially we are asking this question in the context of a BI tool permitting business users to perform ad-hoc analysis on kafka streams and tables.

Any update on this one guys? This is a very common use case in BI. User will create a derived field (e.g. foo = TIMESTAMPTOSTRING(ROWTIME, 'yyyy') on the fly. They will then use this is an analytic SQL statement that will GROUP BY foo. We want to enable this without materializing the function in a stream / table first.

@miguno - any way to get this moving?

@apurvam @hjafarpour

Working on a a fix... duplicate of #1849?

Fixed. You can now group by scalar functions. You can also, optionally, use the _exact_ same clause in the select statement. So these are now supported:

-- with the group by in the select statement:
SELECT
  TIMESTAMPTOSTRING(ROWTIME, 'yyyy'),
  sum(1)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY TIMESTAMPTOSTRING(ROWTIME, 'yyyy');

-- without the group by in the select statement:
SELECT
  COUNT(*)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY TIMESTAMPTOSTRING(ROWTIME, 'yyyy');
Was this page helpful?
0 / 5 - 0 ratings