ClickHouse: Smoosh column .bin/.mrk files in a part into a single larger file to reduce inode count

Created on 1 Aug 2019 · 7 Comments · Source: ClickHouse/ClickHouse

Use case
Too many small files in the part directories eventually exhaust inodes, and may also cause 'too many open files' errors.
When there are 300 tables, the partition expression is of the yyyyMMdd type, each table has a 1-year TTL, and each table has 100 columns, we end up with 300 * 365 * 100 * 2 = 21,900,000 files.
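The arithmetic above can be sketched as a quick query (the figures are the illustrative ones from this scenario, not measurements):

```sql
-- Illustrative arithmetic: 300 tables * 365 daily partitions * 100 columns
-- * 2 files (.bin + .mrk) per column, versus one smooshed data file per part.
SELECT
    300 * 365 * 100 * 2 AS files_before_smooshing,  -- 21,900,000
    300 * 365           AS files_after_smooshing    -- 21,900
```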

Changing the partition expression to yyyyMM may help, but yyyyMM does not allow replacing one day of data atomically (re-importing a Hive day partition into ClickHouse when the Hive partition data is updated).

I found that issues #4617 and #5166 are related. PR #5171 adds a max_parts_in_total limit, but I don't think that is the best approach.

Describe the solution you'd like
Smoosh the column .bin and .mrk files into a single larger file to reduce the file count in the part directory when the part is unlikely to change (the data was generated many days ago, and inserts/updates/deletes are unlikely to happen). The Druid approach (https://druid.apache.org/docs/latest/design/segments.html) can serve as an example: small files are smooshed into meta.smoosh and 00000.smoosh (or an additional 00001.smoosh if 00000.smoosh grows larger than 2 GB).
This way, the file count is reduced from 21,900,000 to 21,900.

If it is difficult to decide automatically when the best time is to do the smooshing, I suggest that the OPTIMIZE ... FINAL query (https://clickhouse.yandex/docs/en/query_language/misc/#misc_operations-optimize) could do it.

Labels: feature, st-fixed

All 7 comments

This issue will be addressed by the development of "polymorphic parts", which is currently in progress by @CurtizJ.

@alexey-milovidov @CurtizJ

Is there any more detail about polymorphic parts? Thanks.

@hustnn https://github.com/ClickHouse/ClickHouse/blob/master/docs/ru/extended_roadmap.md#16-полиморфные-куски-данных, it doesn't seem to be available in English yet.

@nvartolomei Thanks. Let me use Google Translate first to take a look at the basic idea. I am also facing this issue now.

It is proposed to allow data parts of MergeTree-family tables to store data in different formats, namely: in RAM; on disk with all columns in one file; on disk with each column in a separate file.

Does one table have only one format, or can one table have different formats (some parts in memory, some on disk)?

One table can have parts in multiple different formats, e.g. small parts in a compact (write-optimized) format and larger parts in a wide (read-optimized) format.

This feature is implemented and available in version 20.3. Synopsis:

CREATE TABLE test.hits_compact AS test.hits
    ENGINE = MergeTree ORDER BY ...
    SETTINGS min_bytes_for_wide_part = '10M'

Next step: enable it with reasonable threshold by default.
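A sketch for checking which format each part actually uses, assuming ClickHouse 20.3 or later, where `system.parts` exposes a `part_type` column (values such as `Compact` and `Wide`); the database and table names come from the synopsis above:

```sql
-- Sketch: list the active parts of test.hits_compact and their storage format.
-- part_type is 'Compact' (all columns in one data file) or 'Wide'
-- (a .bin/.mrk pair per column).
SELECT name, part_type, rows, bytes_on_disk
FROM system.parts
WHERE database = 'test' AND table = 'hits_compact' AND active
ORDER BY name
```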
