Ksql: Select queries with windows in them should return window metadata.

Created on 12 Dec 2017  路  6Comments  路  Source: confluentinc/ksql

SELECT queries with windows in them should return windowing data.

This query ("1"):

SELECT userid, sum(1)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY userid;

should return exactly what is returned from these queries ("2"):

CREATE TABLE t1 AS
  SELECT userid, sum(1)
  FROM clickstream
  WINDOW TUMBLING (SIZE 5 SECONDS)
  GROUP BY userid;
SELECT * FROM t1;

Currently, (1) returns:

9 | 1
5 | 4
36 | 2
24 | 5
27 | 6

and (2) returns:

1511910830000 | 28 : Window{start=1511910830000 end=-} | 28 | 3
1511910825000 | 7 : Window{start=1511910825000 end=-} | 7 | 10
1511910825000 | 8 : Window{start=1511910825000 end=-} | 8 | 9
1511910825000 | 21 : Window{start=1511910825000 end=-} | 21 | 8
1511910830000 | 26 : Window{start=1511910830000 end=-} | 26 | 4
enhancement

Most helpful comment

I agree, this would be a useful feature. I guess the syntax would be something like

SELECT ROWTIME, ROWKEY, userid, sum(1)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY userid;

All 6 comments

Any update on this guys?

@hjafarpour @apurvam

@tomlinsonr just to be clear - are you talking about the columns that are returned (ROWTIME, ROWKEY), or the different data values that you're seeing (1 vs 3, 4 vs 10, etc).

@rmoff. Correct. We would like to see the ROWTIME and ROWKEY returned.

I agree, this would be a useful feature. I guess the syntax would be something like

SELECT ROWTIME, ROWKEY, userid, sum(1)
FROM clickstream
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY userid;

@tomlinsonr : We just merged the related PR https://github.com/confluentinc/ksql/pull/1993, where we provide direct access to window start time and window end time via built-in functions.

Given an existing stream such as:

    CREATE STREAM TEST (ID bigint)
        WITH (kafka_topic='test_topic', value_format='DELIMITED', key='ID');

You can now write something like:

    CREATE STREAM s1 AS
        SELECT id, WindowStart(), WindowEnd()
        FROM test
        WINDOW TUMBLING (SIZE 30 SECONDS)
        GROUP BY id;

Could you take a look and let us know whether this would also help you here?

Was this page helpful?
0 / 5 - 0 ratings