Clickhouse: Filling empty elements column items values not-empty strings by the condition

Created on 11 Aug 2019  Â·  4Comments  Â·  Source: ClickHouse/ClickHouse

I have table with column T (time) and Tag. How can you easily fill the empty values in the Tag column with the values of the previous not-empty and the next not-empty value of the Tag column if the difference between them is no more than 2 minutes? For example, in the screenshot, the selected area should be filled with 290350 values
Subquery are still difficult for me and I don't really want to use them. Thanks))
image

question

All 4 comments

There's ASOF JOIN that could help you. The idea is

  1. make SELECT * FROM your_table AS t1 ASOF JOIN your_table AS t2 ON t1.T >= t2.T WHERE t2.Tag != ''
  2. then make resulting column depending on comparison of t1.T and t2.T: if(t1T - t2.T > _some_, t1.Tag, t2.Tag) AS resultTag.

But there're restrictions:

  • ASOF JOIN need one more column with equality in ON sercion.
  • It works only in one direction: t2.T is less or equalt to t1.T

So it's possible to implement 'fill with the previous not-empty value' straight forward. There's also an issue to allow t1.T <= t2.T condition fo ASOF JOIN. If it's done it also would be possible to implement 'fill with the next not-empty value'. But it's difficult to mix the cases, cause it's not clear where it should be previous and where it should be the next.

Another idea you may try is to round your T column to 2 minutes intervals and JOIN table to itself with ANY LEFT JOIN on rounded T values. It would be faster then ASOF JOIN variant, but would contain some rounding error.

Thanks for the advice.
Tried to implement first part. It turns out the following:
image

  1. You need newer version for ASOF JOIN with ON syntax. Older ones support USING syntax only.
  2. ASOF needs at least one equality for now. You need t1.some = t2.some AND t1.Activity_Start >= t2.Activity_Start or using(some, Activiti_Start)
Was this page helpful?
0 / 5 - 0 ratings