ClickHouse: ClickHouse and mysterious activity

Created on 1 Mar 2017 · 8 Comments · Source: ClickHouse/ClickHouse

Hello,

I've inserted some data into ClickHouse but recently stopped the stream. I also grepped for clickhouse client processes, and the only one running is the one I'm using to check the database.

However when I run top I see:

1687 clickho+ 20 0 10.099g 748716 16548 S 240.2 2.3 365:20.17 clickhouse-serv

clickhouse-server is EATING 1-4 of my CPUs. Is this normal activity? My queries are much slower while this is going on, and I can't find the cause.

Edit:

I should mention that iostat shows the database writing to disk, and I can see the size of a table file changing (both increasing and decreasing by ~5%). The table is a MergeTree, and this has been going on for about 40 minutes.

George3d6

Most helpful comment

You could adjust the size of the thread pool for background operations.
It is set in users.xml, as background_pool_size under /profiles/default/.
The default value is 16. Lowering the size of this pool lets you limit the maximum CPU and disk usage.
But don't set it too low.

You could also adjust some fine-grained settings for MergeTree.
They are located in config.xml, in the <merge_tree> section.
The list of available options, along with documentation and default values, is in the MergeTreeSettings.h file. These settings are intended only for fine tuning.

alexey-milovidov on 3 Mar 2017

👍2

All 8 comments

Almost certainly this is the merge process. A MergeTree table consists of a number of parts, each sorted by the primary key. Each INSERT statement creates at least one new part. The merge process periodically selects several parts and merges them into one bigger part that is also sorted by the primary key. Yes, this is normal activity.
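
To illustrate the idea of merging sorted parts, here is a minimal Python sketch. It is not ClickHouse's implementation: the function name merge_parts, the tuple row layout, and the use of the first tuple element as the "primary key" are all assumptions made for this example.

```python
import heapq


def merge_parts(parts):
    """Merge several parts, each already sorted by key, into one sorted part.

    Hypothetical helper for illustration only; rows are (key, value) tuples.
    """
    # heapq.merge does a streaming k-way merge of already-sorted inputs,
    # so no part has to be re-sorted from scratch.
    return list(heapq.merge(*parts, key=lambda row: row[0]))


# Pretend each INSERT produced one small part, sorted by its key.
parts = [
    [(1, "a"), (5, "e")],
    [(2, "b"), (3, "c")],
    [(4, "d")],
]

merged = merge_parts(parts)
print(merged)  # [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```

The work is proportional to the total number of rows in the selected parts, which is why merging a freshly loaded large table can keep CPU and disk busy for a long time.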

ztlpn on 1 Mar 2017

You can run SELECT * FROM system.merges to show the currently running background merges (also known as "compactions").

alexey-milovidov on 1 Mar 2017

Isn't there any way to stop these merges while queries are executing? They appear to be quite greedy with both processor time and I/O. Can't I set a "limit" on how many resources merges can take? Or, even better, set the db up in such a way that merges stop when I run SELECT queries?

George3d6 on 2 Mar 2017

OK, since there was no update, I'm going to phrase the question differently:

If I were to modify the code so that I could set a flag from the client to stop all background operations while a query is running, do you think that would break anything important? Or would it just slow down the merges?

George3d6 on 3 Mar 2017

alexey-milovidov on 3 Mar 2017

👍2

Hi, I am a coworker of George3d6.
Thank you very much for the response, and for this AMAZING piece of software!
I would like to add some more info:
We left ClickHouse to "digest" those 1B rows / 165 columns, about 170GB of data, for one night, but the activity is still going on.

We are running some preliminary tests, so the hardware spec is "modest": an AMD FX6300 (6 cores) with 32GB RAM. We have both SSD and HDD, but the problem of the slowdown is probably connected with some internal mutex.

(Psst, hey! On which hardware do you run ClickHouse? What do you suggest for networking? I am planning to wire it up with 40 Gbit InfiniBand.)

So I dug into the source code, and I fear that even setting very few threads is not the right option...

Simply because, inside BackgroundProcessingPool::threadFunction(), the Tasks container (a std::multimap) will fill up.

Certain tasks are almost instant, but others really take time (and once finished, they restart after a very short while), so the benefit will be VERY small.

I will try to collect a list of the types of background jobs. The only question is:

Is it possible / OK for me to insert a condition to "sleep" those background threads?
Or are there certain tasks that CANNOT be delayed?

If certain tasks cannot be delayed, that is not a big problem: I will only "avoid" executing the background merge job for "a bit". Of course, if a thread is still running, I will let it finish.
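
A rough Python sketch of the idea described above (hold back new background tasks while a flag is set, but let a task that has already started run to completion). This only illustrates the control flow; it is not ClickHouse code, and PausableWorker and all of its methods are hypothetical names.

```python
import queue
import threading


class PausableWorker:
    """Hypothetical background worker that waits between tasks while paused."""

    def __init__(self):
        self.tasks = queue.Queue()
        self.allowed = threading.Event()
        self.allowed.set()  # background work is allowed by default
        self.done = []
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            task = self.tasks.get()
            if task is None:  # sentinel: shut the worker down
                break
            # Block here while paused. The check happens only before a task
            # starts, so a task that is already running is never interrupted.
            self.allowed.wait()
            self.done.append(task())

    def pause(self):
        self.allowed.clear()

    def resume(self):
        self.allowed.set()

    def submit(self, task):
        self.tasks.put(task)

    def stop(self):
        self.tasks.put(None)
        self.thread.join()


worker = PausableWorker()
worker.pause()                                  # e.g. a SELECT is running
worker.submit(lambda: "merge part_1 + part_2")  # queued, but held back
paused_snapshot = list(worker.done)             # nothing has run yet
worker.resume()                                 # the query finished
worker.stop()                                   # drain the queue and join
print(paused_snapshot, worker.done)
```

Because the flag is checked only between tasks, a long merge that started just before the pause still competes with the query until it finishes.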

Thank you

RoyBellingan on 3 Mar 2017

👍1

There are two types of background actions:

merging of data parts;
fetching of data part from replica.

When you use non-replicated MergeTree, there are only tasks of the first type.
And when you use ReplicatedMergeTree, there is in fact a single type of task, which looks at the replication queue and then does either a merge or a fetch.

Background tasks can be delayed a bit. It doesn't matter whether a task is executed right now or a few seconds later.

If you delay merges for a long time, you risk having too many data parts in a table. This increases SELECT latency, and when the number of data parts (within a single partition) reaches a few hundred, INSERTs start to be throttled.

If you delay fetching from a replica, you will get replication lag.
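
To make the throttling effect concrete, here is a toy Python model. The function name, both thresholds, and the linear ramp are assumptions chosen for illustration; they are not taken from the ClickHouse source, where the real limits are MergeTree settings.

```python
def insert_delay_seconds(active_parts, threshold=150, hard_limit=300, max_delay=1.0):
    """Toy model: delay an INSERT in proportion to how many parts have piled up.

    All parameters are illustrative assumptions, not ClickHouse defaults.
    """
    if active_parts < threshold:
        return 0.0  # merges are keeping up, no throttling
    # Clamp at the hard limit, then scale the delay linearly with the excess.
    excess = min(active_parts, hard_limit) - threshold
    return max_delay * excess / (hard_limit - threshold)


for parts_in_partition in (10, 150, 225, 300):
    print(parts_in_partition, insert_delay_seconds(parts_in_partition))
```

The point of the model is only that postponing merges shifts the cost to INSERTs and SELECTs once parts accumulate.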

alexey-milovidov on 4 Mar 2017

the problem of the slowdown is probably connected with some internal mutex

You can check this easily.

First, check SELECT * FROM system.merges. Do you see a high number of running merges?

Run top. You will see either high CPU or high disk usage.
If you see high CPU usage, run sudo perf top (this requires installing linux-tools-common).
Then you will see the top functions being run by the CPU.

To check disk usage, run iotop, iostat -dmx 1 and dstat.

alexey-milovidov on 4 Mar 2017

👍1
