I am trying to GROUP BY on a distributed table (~100 million records) and two interesting issues are happening:
1) It is throwing an "Attempt to read after eof" error.
2) The ClickHouse service on the distributed server is getting shut down.
I have really tried hard but still could not find a solution for this error.
Second:
What exactly do the max_bytes_before_external_group_by and max_bytes_before_external_order_by settings signify, and what impact does changing them have? Right now their value is 0, and the query that is causing the problem does have a GROUP BY clause.
How do I change the max_bytes_before_external_group_by and max_bytes_before_external_order_by values?
1) The OOM killer killed the ClickHouse process (check dmesg in the shell).
If those parameters are zero (the default), ClickHouse tries to do all GROUP BY & ORDER BY in memory (it will allocate up to max_memory_usage bytes; you probably have less than 10 GB of RAM).
The alternative option is to do huge GROUP BY / ORDER BY operations on disk. It is slower, but it allows those operations to complete when the data doesn't fit in memory.
How do I do this?
Set it in the session (set max_bytes_before_external_group_by = ....)
or in the users' profiles:
cat /etc/clickhouse-server/conf.d/memory_usage.xml
<?xml version="1.0" ?>
<yandex>
    <profiles>
        <default>
            <max_memory_usage_for_all_queries>101310968832</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>50655484416</max_bytes_before_external_group_by>
            <max_memory_usage>101310968832</max_memory_usage>
        </default>
    </profiles>
</yandex>
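For the session-level route mentioned above, a minimal sketch (the ~10 GB threshold is only an illustrative value, not one suggested in this thread):

set max_bytes_before_external_group_by = 10000000000;  -- illustrative ~10 GB threshold; GROUP BY state spills to disk above it
-- confirm the setting took effect for this session
select name, value from system.settings where name = 'max_bytes_before_external_group_by';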
Hi @den-crane, I just checked and memory_usage.xml is not present. Do we need to create this XML? If yes, can you please help me find where I can download the config file with default settings?
Even if I try to execute the statement
SET max_bytes_before_external_group_by = 100000000
it is throwing an error.
@I-akshay, yes, that file should be created.
@den-crane provided the full content of that file. I don't understand why you can't create it on your own and need to download it from somewhere, but here it is: memory_usage.zip
You need to:
1) unpack it into the /etc/clickhouse-server/conf.d/ directory on your ClickHouse server
2) edit it according to the available RAM on your server - max_memory_usage should be lower than the available amount of RAM (in bytes), for example try 80%; max_memory_usage_for_all_queries should be lower than max_memory_usage, try 70% (a worked example follows below).
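A hypothetical worked example of that arithmetic, assuming a server with 128 GiB of RAM (the 80% / 70% figures come from the comment above; the RAM size is made up):

select
    toUInt64(128 * 1024 * 1024 * 1024) as total_ram_bytes,                        -- assumed 128 GiB of RAM
    toUInt64(total_ram_bytes * 0.8)    as max_memory_usage_value,                 -- ~80% of RAM
    toUInt64(total_ram_bytes * 0.7)    as max_memory_usage_for_all_queries_value  -- ~70%, kept below max_memory_usage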
Hi @den-crane,
Basically, I have created a distributed table on a cluster with 3 servers. I have now done what you suggested, but the issue is still not resolved: the huge GROUP BY / ORDER BY operations are not spilling to disk. Can you please suggest why this is happening?
@filimonov
Can we add the distributed_aggregation_memory_efficient property to memory_usage.xml?
Yes, you can.
Check that the settings are applied to your session:
select * from system.settings where name = 'max_bytes_before_external_group_by'
Hi @den-crane, I was asking about distributed_aggregation_memory_efficient. I want to permanently set distributed_aggregation_memory_efficient to 1;
I don't want to set the property every time before the session.
cat /etc/clickhouse-server/conf.d/memory_usage.xml
<?xml version="1.0" ?>
<yandex>
    <profiles>
        <default>
            <max_memory_usage_for_all_queries>101310968832</max_memory_usage_for_all_queries>
            <max_bytes_before_external_group_by>50655484416</max_bytes_before_external_group_by>
            <max_memory_usage>101310968832</max_memory_usage>
            <distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
        </default>
    </profiles>
</yandex>
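The settings dump below looks like the output of a query along these lines (a reconstruction; the exact statement isn't shown in the thread):

select name, value
from system.settings
where name in ('distributed_aggregation_memory_efficient',
               'max_bytes_before_external_group_by',
               'max_bytes_before_external_sort',
               'max_memory_usage',
               'max_memory_usage_for_all_queries')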
┌─name─────────────────────────────────────┬─value───────┐
│ distributed_aggregation_memory_efficient │           1 │
│ max_bytes_before_external_group_by       │  8000000000 │
│ max_bytes_before_external_sort           │  8000000000 │
│ max_memory_usage                          │ 16000000000 │
│ max_memory_usage_for_all_queries          │ 16000000000 │
└───────────────────────────────────────────┴─────────────┘
Progress: 2.00 billion rows, 356.47 GB (62.25 million rows/s., 11.08 GB/s.) 98%
Received exception from server (version 19.4.2):
Code: 241. DB::Exception: Received from 127.0.0.1:9000. DB::Exception: Memory limit (for query) exceeded: would use 14.91 GiB (attempt to allocate chunk of 16777216 bytes), maximum: 14.90 GiB.
Hi, with the above parameters set, shouldn't it use the disk? Why is it still reporting memory limits?
shouldn't it use the disk? Why is it still reporting memory limits?
Because the final aggregation result should fit into memory.
So for GROUP BY SOME_HIGH_CARDINALITY_SET_OF_COLUMNS, ClickHouse will do intermediate partial aggregations on disk, but in the end it will gather the result into one part in memory.
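To make that concrete, a hypothetical sketch (the table, columns, and thresholds are made up for illustration):

-- Intermediate aggregation states spill to disk once they exceed the threshold below,
-- but the merged final result for all distinct keys must still fit under max_memory_usage.
set max_bytes_before_external_group_by = 4000000000;  -- example: start spilling past ~4 GB
set max_memory_usage = 8000000000;                    -- example: hard per-query cap of ~8 GB
select user_id, url, count() as hits                  -- (user_id, url) stands in for a high-cardinality key set
from visits
group by user_id, url;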
Yes, you are right. I can try it again. Thanks.
@den-crane Could you please put that into the documentation? It will be easier to refer to than a single comment in a not very transparent issue report.
Is it normal that data (45 GB in CSV - 3x string, 5x int, 1x float - not sure how big in a MergeTree engine), "aggregated" by a high-cardinality set of columns and producing the same number of rows, runs out of memory on a 120 GB machine?
Memory limit (for query) exceeded: would use 120.03 GiB (attempt to allocate chunk of 134217728 bytes), maximum: 120.00 GiB.
Because the final aggregation result should fit into memory.
So for GROUP BY SOME_HIGH_CARDINALITY_SET_OF_COLUMNS, ClickHouse will do intermediate partial aggregations on disk, but in the end it will gather the result into one part in memory.
But that's just untrue.
ClickHouse can send the aggregation result to the client without allocating memory for the whole result.
It will do it in buckets of 1/256 of the total result size.
@den-crane
---- How do I fix this?
My SQL: insert into apps_count select appid, count(distinct userid) from table group by appid.
appid has about 1,600,000 distinct values.
It fails every time at around 98%.
I have set max_memory_usage / max_bytes_before_external_group_by, but it doesn't help.
@kainever7 You should set max_bytes_before_external_group_by to a value slightly less than the total amount of memory. For example, if you have 16 GB of memory, it's reasonable to set max_memory_usage to 15000000000 (because some memory is needed for the OS and for other programs in the system) and max_bytes_before_external_group_by to 12000000000 (because ClickHouse will need some extra memory to spill data to disk).
If you do distributed group by, also set distributed_aggregation_memory_efficient to 1.
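As a session-level sketch of those numbers (a server with 16 GB of RAM, per the comment above):

set max_memory_usage = 15000000000;                    -- slightly below total RAM, leaving room for the OS and other programs
set max_bytes_before_external_group_by = 12000000000;  -- leaves ClickHouse headroom for spilling to disk
set distributed_aggregation_memory_efficient = 1;      -- only relevant for distributed GROUP BY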
@I-akshay, do you have any further questions?
@blinkov is the information from this issue available somewhere in the documentation?
@jangorecki only partially, I think; there's always a lot of room for improvement.
@alexey-milovidov since the issue was closed, is there any chance to get the commit id that resolved it, or a link to the documentation directly? I would like to apply the information I am waiting for in this thread once it is documented.
We have to copy the relevant information from comment https://github.com/ClickHouse/ClickHouse/issues/5618#issuecomment-575831276 into docs.
@alexey-milovidov is there an issue tracker entry for that where I can follow the status?
I still see the same issue. I don't see ClickHouse writing to disk :(