:) select * from system.merges
SELECT *
FROM system.merges
Ok.
0 rows in set. Elapsed: 0.002 sec.
Based on my understanding, the Clickhouse will start to merge the data after it has been loaded into the DB. When querying directly on unmerge data, the query performance is not optimized.
Question> How do we know the Clickhouse finishes all pending merge operations?
Is it true that we can check this through the system.merges? If the table is empty, then we can assume that the merging operations are done?
Thank you
Based on my understanding, ...
it's not quite right. There is a merge scheduler which plans merges for a table to reduce number of parts without causing I/O starvation. All merges will be finished eventually = never.
You should not bother about unmerged data. 99.99% poor performance cases related to other issues such as wrong partitioning / wrong where clauses / sort-key misdesign.
Is it true that we can check this through the system.merges?
No. system.merges reflects running merges. Merge could be scheduled to run in some future.
select count() parts_count, database,table,partition from system.parts
where table like '%your_table%' and active group by database,table,partition
order by parts_count desc
you can check parts_count , if it's >1 then it's still can be merged.
And it's also is not quite right because parts of the table can reach max size and will not be merged anymore.
Thank you!
Most helpful comment
it's not quite right. There is a merge scheduler which plans merges for a table to reduce number of parts without causing I/O starvation. All merges will be finished eventually = never.
You should not bother about unmerged data. 99.99% poor performance cases related to other issues such as wrong partitioning / wrong where clauses / sort-key misdesign.
No. system.merges reflects running merges. Merge could be scheduled to run in some future.
select count() parts_count, database,table,partition from system.parts
where table like '%your_table%' and active group by database,table,partition
order by parts_count desc
you can check parts_count , if it's >1 then it's still can be merged.
And it's also is not quite right because parts of the table can reach max size and will not be merged anymore.