ClickHouse, NUMA and memory allocation

Created on 16 Jan 2017 · 2 Comments · Source: ClickHouse/ClickHouse

Hi. We are going to deploy an application that uses ClickHouse in our prestable environment, so we are now interested in CH behavior on NUMA design when the working set for a query does not fit into the memory segment of a CPU.

Dataset is about 200GiB, working set ~30GiB
Servers: SUN, 16x CPU, 256GiB RAM

The question is: what is the preferable server configuration for such cases? Will CH swap memory if it doesn't fit the segment? Should we apply some specific configuration to the operating system?

Most helpful comment

> CH behavior on NUMA design when the working set for a query does not fit into the memory segment of a CPU

ClickHouse allocates most of its memory during query execution. A query is executed by multiple threads, and each thread allocates an almost equal amount of memory. This means that memory is spread almost uniformly across all available NUMA nodes. Memory allocations are (mostly) query-local.
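
As a rough illustration with the numbers from the question, the sketch below works out what uniform spreading would mean per node. It rests on assumptions that the thread does not confirm: that each of the 16 CPUs is its own NUMA node with an equal share of the 256 GiB, and that the ~30 GiB figure is the memory one query touches. The function name `per_node_usage_gib` is hypothetical.

```python
def per_node_usage_gib(total_ram_gib, numa_nodes, query_mem_gib):
    """Return (node capacity, per-node share of query memory), both in GiB.

    Assumes equal-sized NUMA nodes and perfectly uniform spreading,
    which is an idealisation of "almost uniformly".
    """
    node_capacity = total_ram_gib / numa_nodes
    per_node_share = query_mem_gib / numa_nodes
    return node_capacity, per_node_share


# Numbers taken from the question: 256 GiB RAM, 16 CPUs, ~30 GiB working set.
capacity, share = per_node_usage_gib(256, 16, 30)
print(f"node capacity: {capacity} GiB, query share per node: {share} GiB")
```

Under these assumptions a single node could not hold 30 GiB on its own, but a uniform spread would ask each node for under 2 GiB.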

There are no big, long-lived caches inside ClickHouse. ClickHouse mostly relies on the OS page cache.
There are also some internal caches: mark cache, uncompressed cache, and index. They are usually small (less than tens of GB even on large servers).

> Servers: SUN, 16x CPU, 256GiB RAM

ClickHouse runs on x86_64. It has not been ported to SPARC. What exact server architecture do you use?

> The question is: what is the preferable server configuration for such cases? Will CH swap memory if it doesn't fit the segment? Should we apply some specific configuration to the operating system?

It is better to disable swap.

If you run into a situation where a large amount of memory is free but you still get OOMs, set the NUMA policy for ClickHouse to interleave using numactl (you may edit the init.d script for that purpose).
Given the memory allocation pattern of ClickHouse, this issue is very unlikely to happen.
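
To make the symptom described above easier to spot, here is a minimal sketch that reads the per-node memory counters Linux exposes under `/sys/devices/system/node/node*/meminfo` and reports how much of each node is free. The helper names and the sample data are hypothetical, and the config path in the `numactl` command line is an assumed conventional location, not something stated in this thread.

```python
import glob
import re

# Matches lines such as "Node 0 MemFree:   1234567 kB" from
# /sys/devices/system/node/node*/meminfo on Linux.
_LINE = re.compile(r"Node\s+(\d+)\s+(\w+):\s+(\d+)\s+kB")


def parse_node_meminfo(text):
    """Parse per-node meminfo text into {node_id: {field: KiB}}."""
    nodes = {}
    for match in _LINE.finditer(text):
        node, field, kib = int(match.group(1)), match.group(2), int(match.group(3))
        nodes.setdefault(node, {})[field] = kib
    return nodes


def free_fraction_per_node(nodes):
    """Return {node_id: MemFree / MemTotal} for every parsed node."""
    return {n: f["MemFree"] / f["MemTotal"] for n, f in nodes.items()}


def read_all_nodes(pattern="/sys/devices/system/node/node*/meminfo"):
    """Read every node's meminfo file (Linux only) and merge the results."""
    merged = {}
    for path in glob.glob(pattern):
        with open(path) as fh:
            merged.update(parse_node_meminfo(fh.read()))
    return merged


# Hypothetical sample: node 0 is nearly full while node 1 is mostly free.
SAMPLE = """\
Node 0 MemTotal:       16777216 kB
Node 0 MemFree:          167772 kB
Node 1 MemTotal:       16777216 kB
Node 1 MemFree:        12582912 kB
"""

fractions = free_fraction_per_node(parse_node_meminfo(SAMPLE))
for node, frac in sorted(fractions.items()):
    print(f"node {node}: {frac:.0%} free")

# Command line that would start the server with interleaved allocation.
# The config path below is an assumption; adjust it to the actual install.
INTERLEAVED_CMD = [
    "numactl", "--interleave=all",
    "clickhouse-server", "--config-file=/etc/clickhouse-server/config.xml",
]
print(" ".join(INTERLEAVED_CMD))
```

On a real host, `free_fraction_per_node(read_all_nodes())` would replace the sample. One nearly exhausted node alongside mostly free ones is the pattern that interleaving is meant to avoid.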

Thank you for the explanation, it makes sense to us :)
