Clickhouse: how do we check when Clickhouse finishes all merge operations?

Created on 18 Dec 2018  路  2Comments  路  Source: ClickHouse/ClickHouse

:) select * from system.merges

SELECT *
FROM system.merges
Ok.
0 rows in set. Elapsed: 0.002 sec.

Based on my understanding, the Clickhouse will start to merge the data after it has been loaded into the DB. When querying directly on unmerge data, the query performance is not optimized.

Question> How do we know the Clickhouse finishes all pending merge operations?
Is it true that we can check this through the system.merges? If the table is empty, then we can assume that the merging operations are done?

Thank you

Most helpful comment

Based on my understanding, ...

it's not quite right. There is a merge scheduler which plans merges for a table to reduce number of parts without causing I/O starvation. All merges will be finished eventually = never.

You should not bother about unmerged data. 99.99% poor performance cases related to other issues such as wrong partitioning / wrong where clauses / sort-key misdesign.

Is it true that we can check this through the system.merges?

No. system.merges reflects running merges. Merge could be scheduled to run in some future.

select count() parts_count, database,table,partition from system.parts
where table like '%your_table%' and active group by database,table,partition
order by parts_count desc

you can check parts_count , if it's >1 then it's still can be merged.
And it's also is not quite right because parts of the table can reach max size and will not be merged anymore.

All 2 comments

Based on my understanding, ...

it's not quite right. There is a merge scheduler which plans merges for a table to reduce number of parts without causing I/O starvation. All merges will be finished eventually = never.

You should not bother about unmerged data. 99.99% poor performance cases related to other issues such as wrong partitioning / wrong where clauses / sort-key misdesign.

Is it true that we can check this through the system.merges?

No. system.merges reflects running merges. Merge could be scheduled to run in some future.

select count() parts_count, database,table,partition from system.parts
where table like '%your_table%' and active group by database,table,partition
order by parts_count desc

you can check parts_count , if it's >1 then it's still can be merged.
And it's also is not quite right because parts of the table can reach max size and will not be merged anymore.

Thank you!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

opavader picture opavader  路  3Comments

jangorecki picture jangorecki  路  3Comments

bseng picture bseng  路  3Comments

jangorecki picture jangorecki  路  3Comments

vixa2012 picture vixa2012  路  3Comments