Clickhouse: Multiple Exceptions while doing for loop load into the table

Created on 8 May 2019 · 4 Comments · Source: ClickHouse/ClickHouse

Hi,

I was trying to load 500 million rows into a table from 200 chunk files, with 5 million rows in each file.

A few files loaded without any error, but some did not and failed with the exceptions below.

Error 1:
Received exception from server (version 18.14.15):
Code: 76. DB::Exception: Received from 1.2..1:9. DB::Exception: Cannot open file /ti*/clickhouse/data/ins*/staging_contact**/tmp_insert_6b55e561a54f424a3441250045dc7123_51051_51051_0/company_name.mrk, errno: 24, strerror: Too many open files.

Error 2:

Code: 210. DB::NetException: Connection reset by peer while writing to socket (1.2.3.1:9*)

Error 3:

Code: 252. DB::Exception: Received from 1.2.3.1:9*. DB::Exception: Too many parts (300). Merges are processing significantly slower than inserts..

Kindly help us with this as soon as possible.

Thanks.

question


All 4 comments

  1. Kafka? https://github.com/yandex/ClickHouse/issues/3198
    macOS? https://clickhouse.yandex/docs/en/operations/server_settings/settings/#max-open-files
    Wrong limits? Check with cat /proc/{CH-Server-PID}/limits

  2., 3.
    CPU/IO exhaustion. Consider raising the MergeTree part thresholds:

<merge_tree>
    <parts_to_delay_insert>150</parts_to_delay_insert>
    <parts_to_throw_insert>900</parts_to_throw_insert>
    <max_delay_to_insert>5</max_delay_to_insert>
</merge_tree>

For 1, 2 and 3: possibly wrong partitioning. Please post the output of show create table TABLE.
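The limits check in point 1 can be scripted. A minimal sketch, assuming Linux (the `/proc/<pid>/limits` file) and a `max_open_files` helper invented here for illustration; substitute the real clickhouse-server PID in practice:

```python
# Sketch (Linux-only assumption): parse the "Max open files" soft limit
# for a process out of /proc/<pid>/limits, as suggested above.
import os

def max_open_files(pid: int) -> int:
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith("Max open files"):
                # columns: limit name, soft limit, hard limit, units
                return int(line.split()[3])
    raise RuntimeError("'Max open files' line not found")

# Example: inspect the current process as a stand-in for the
# clickhouse-server PID.
print(max_open_files(os.getpid()))
```

If the printed soft limit is low (e.g. 1024), errno 24 ("Too many open files") under heavy insert load is expected.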

1 & 2 can be related (sockets are also files). Check @den-crane's answer.

For "Too many parts" with a high rate of inserts, see https://github.com/yandex/ClickHouse/issues/3174#issuecomment-423435071

@den-crane I haven't changed anything since the last load, and it was working fine.

Below is the table schema:

CREATE TABLE in.staging_contact (column 1, column 2,.......) ENGINE = MergeTree PARTITION BY category ORDER BY (topic, domain) SETTINGS index_granularity = 8192

The above is the show create table TABLE output.

Let me know if anything needs to be changed in the create table statement.

PARTITION BY category

CH was designed around monthly partitioning (PARTITION BY toYYYYMM(some date/time column)) to manage data retention (dropping old data). The design assumes a small number of partitions overall, and a single-digit number of partitions touched per batch.
If you insert 5 million rows per INSERT and the batch contains hundreds of different categories, CH splits the batch into hundreds of parts, which leads to heavy random I/O (to merge them) and, as a result, to the "Too many parts" error.
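The part-count explosion above can be sketched numerically. This is a rough illustration with a hypothetical batch (real part counts also depend on insert block size): each INSERT creates at least one part per distinct partition key it touches.

```python
# Rough sketch (hypothetical batch): an INSERT creates at least one part
# per distinct partition key present in the batch. Numbers are illustrative.
from datetime import date

n_rows = 100_000  # scaled-down stand-in for the 5M-row inserts in the issue
rows = [(f"cat_{i % 300}", date(2019, 5, 8)) for i in range(n_rows)]

# PARTITION BY category: one part per distinct category in the batch.
parts_by_category = len({cat for cat, _ in rows})

# PARTITION BY toYYYYMM(dt): one part per distinct (year, month) in the batch.
parts_by_month = len({(d.year, d.month) for _, d in rows})

print(parts_by_category, parts_by_month)  # 300 vs 1
```

A batch spanning 300 categories produces on the order of 300 parts per insert, immediately tripping the "Too many parts (300)" threshold, while month-based partitioning of same-month data produces a single part.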

