Edited by @miguno on Nov 23, 2018:
From @apurvam at https://github.com/confluentinc/ksql/issues/506#issuecomment-367848878
I take this feature to mean that we should implement the DISTINCT keyword such that the following query would count the number of distinct values for field in the window:
SELECT COUNT(DISTINCT field) FROM stream WINDOW TUMBLING (size 1 hour);
Original ticket was empty.
Hello, how does KSQL support the DISTINCT syntax?
I take this feature to mean that we should implement the DISTINCT keyword such that the following query would count the number of distinct values for field in the window:
SELECT COUNT(DISTINCT field) FROM stream window tumbling (size 3600 seconds);
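To make the requested semantics concrete, here is a minimal Python sketch of what counting distinct values per tumbling window would compute. It is not KSQL code and not how KSQL implements anything; the function name `count_distinct_per_window`, the `(timestamp_seconds, value)` event shape, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def count_distinct_per_window(events, window_size_s=3600):
    """Count distinct values per tumbling window.

    events: iterable of (timestamp_seconds, value) pairs (hypothetical shape).
    Returns a dict mapping each window start time to its distinct-value count.
    """
    seen = defaultdict(set)
    for ts, value in events:
        # Tumbling windows do not overlap: each event falls in exactly one window.
        window_start = ts - (ts % window_size_s)
        seen[window_start].add(value)
    return {start: len(values) for start, values in sorted(seen.items())}


# Illustrative data only: two one-hour windows.
events = [(0, "a"), (10, "b"), (20, "a"), (3600, "a"), (3700, "c"), (7199, "c")]
result = count_distinct_per_window(events)
print(result)
```

Duplicates inside one window are counted once, while the same value appearing in a later window is counted again for that window.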
+1 for this
+1 for this
+1 for this would solve my issue
Can someone kindly suggest a workaround for now?
Can we just do the following?
SELECT COUNT(TOPKDISTINCT(field, maxint)) FROM stream window tumbling (size 3600 seconds);
@Jianyi-Ren
I cannot use this SQL:
select col1,count(TOPKDISTINCT(field,10)) from stream window_tumbing(size 100 seconds);
Error:
Caused by:Can't find any function with the name 'TOPKDISTINCT'
I also tried select count(sum(cols)) from stream;
This statement cannot be used in KSQL either.
Error: can't find any function with the name 'COUNT'
The issue here is piping the output of one function (TOPKDISTINCT) into another function (COUNT). If you simply had TOPKDISTINCT by itself, you would not see that error.
Assuming that the above piping worked, the query would give correct results only if we are certain that the number of distinct elements in the stream is less than K.
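The caveat about K can be illustrated with a small Python sketch. This is a stand-in for the idea only, not KSQL's TOPKDISTINCT implementation: the helpers `top_k_distinct` and `count_via_top_k` and the sample values are hypothetical. It shows that counting the output of a top-K-distinct step is capped at K, so the result is only trustworthy when K is at least the number of distinct values.

```python
def top_k_distinct(values, k):
    """Return the k largest distinct values in descending order (hypothetical helper)."""
    return sorted(set(values), reverse=True)[:k]


def count_via_top_k(values, k):
    """Count distinct values by counting the top-k-distinct output; capped at k."""
    return len(top_k_distinct(values, k))


# Illustrative data only: five distinct values (1, 3, 5, 7, 9).
values = [5, 3, 5, 9, 1, 3, 7]
true_count = len(set(values))

large_k = count_via_top_k(values, 10)  # k exceeds the number of distinct values
small_k = count_via_top_k(values, 3)   # k is too small, so the count is truncated
print(true_count, large_k, small_k)
```

With an unbounded stream there is generally no way to know in advance how many distinct values will arrive, which is why this workaround cannot replace a real distinct count.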
So, how do I use COUNT DISTINCT in KSQL?
Hi, I wanted to check whether we can create a UDF, that is, a scalar function that returns only the distinct values of a stream. It would be like the TOPKDISTINCT function, except that TOPKDISTINCT works only with GROUP BY. Can we create a DISTINCT function in KSQL, e.g. select distinct EMPLOYEE_ID from EMPLOYEE? If yes, please guide me and I will build the UDF.
Thanks for your help
Adding to the UDF epic: #3556
Staged for next release: https://github.com/confluentinc/ksql/pull/4150
Do we have an ETA for when #4150 will appear in the main builds and the ksql docker images?
@vitalikaz This shipped in 0.7.0 -- my fault for not closing this issue. :) See the release notes.
@MichaelDrogalis Nice! One small thing, which might be off topic though. I can't find any docs on how the confluentinc/ksqldb-server docker images relate to the confluentinc/cp-ksql-server images, or on how often they sync with each other in terms of new features.
@vitalikaz Good question. This is something we're working on documenting precisely right now. To keep this issue on topic, swing by #ksqldb in the Confluent Community Slack room and I can give the quick rundown.