I've enabled a bloom filter index on an Array(String) column, and because of this I'm seeing extremely high memory usage, which is causing ClickHouse (version 19.17.4.11) to OOM.

As a result, it looks like a whole bunch of merges are getting backed up and not making progress. The server has 32 GB of RAM. Using perf, one of the hottest functions shown is DB::BloomFilter::addHashWithSeed, which is why I think this is caused by the bloom filter index.
metrics_qa :) select database, table, elapsed, progress, num_parts, formatReadableSize(memory_usage) from merges;
SELECT
database,
table,
elapsed,
progress,
num_parts,
formatReadableSize(memory_usage)
FROM merges
┌─database─┬─table──────┬───────elapsed─┬─────────────progress─┬─num_parts─┬─formatReadableSize(memory_usage)─┐
│ default  │ histograms │ 138.927456092 │ 0.009250563257249815 │        12 │ 2.05 GiB                         │
│ default  │ histograms │ 138.861809743 │    0.055173005336291 │        11 │ 2.23 GiB                         │
│ default  │ histograms │ 138.829303978 │  0.05388030194829263 │        13 │ 2.24 GiB                         │
│ default  │ histograms │ 138.759085752 │ 0.011949443797662702 │        16 │ 2.00 GiB                         │
│ default  │ histograms │  138.74322428 │  0.01942913601668529 │        12 │ 2.06 GiB                         │
│ default  │ histograms │ 138.717364362 │ 0.022370385001223802 │        12 │ 2.19 GiB                         │
│ default  │ histograms │ 138.685355371 │  0.03908257269084415 │         8 │ 2.23 GiB                         │
│ default  │ histograms │ 138.649866072 │   0.0982202348644684 │         4 │ 2.38 GiB                         │
│ default  │ histograms │ 138.368147204 │  0.07558041003556983 │         4 │ 1.99 GiB                         │
│ default  │ histograms │ 138.286842186 │  0.04993548174615153 │         5 │ 2.05 GiB                         │
│ default  │ histograms │ 138.251910277 │  0.06261820639570424 │         5 │ 2.20 GiB                         │
│ default  │ histograms │ 138.243255183 │ 0.058333314776566905 │         5 │ 2.15 GiB                         │
│ default  │ histograms │ 138.219877609 │  0.27229364749861407 │        11 │ 2.05 GiB                         │
└──────────┴────────────┴───────────────┴──────────────────────┴───────────┴──────────────────────────────────┘
13 rows in set. Elapsed: 0.046 sec.
metrics_qa :) select database, table, elapsed, progress, num_parts, formatReadableSize(memory_usage) from merges;
SELECT
database,
table,
elapsed,
progress,
num_parts,
formatReadableSize(memory_usage)
FROM merges
┌─database─┬─table──────┬──────elapsed─┬──────────────progress─┬─num_parts─┬─formatReadableSize(memory_usage)─┐
│ default  │ histograms │ 14.460548537 │ 0.0014619944948044384 │        12 │ 599.14 MiB                       │
│ default  │ histograms │ 14.380980342 │  0.010725400336827147 │        11 │ 612.94 MiB                       │
│ default  │ histograms │ 14.373428775 │  0.008424919941005757 │        13 │ 553.63 MiB                       │
│ default  │ histograms │ 14.337396365 │  0.002216634563745533 │        16 │ 614.03 MiB                       │
│ default  │ histograms │ 14.331145173 │ 0.0037643951032327754 │        12 │ 608.00 MiB                       │
│ default  │ histograms │ 14.218492431 │  0.003320279777547132 │        12 │ 535.20 MiB                       │
│ default  │ histograms │ 14.163099015 │   0.00516794349630997 │         8 │ 485.54 MiB                       │
│ default  │ histograms │ 14.133526059 │   0.01824524486646472 │         4 │ 506.90 MiB                       │
│ default  │ histograms │ 14.103848452 │  0.013445953096215605 │         4 │ 457.70 MiB                       │
│ default  │ histograms │ 14.071331435 │   0.01034111655563959 │         5 │ 570.38 MiB                       │
│ default  │ histograms │ 13.857877555 │  0.012459197155920572 │         5 │ 543.93 MiB                       │
│ default  │ histograms │  13.84478482 │  0.008966428243041204 │         5 │ 428.58 MiB                       │
│ default  │ histograms │ 13.820244621 │    0.0316512274015483 │        11 │ 429.94 MiB                       │
└──────────┴────────────┴──────────────┴───────────────────────┴───────────┴──────────────────────────────────┘
13 rows in set. Elapsed: 0.043 sec.
metrics_qa :) select database, table, elapsed, progress, num_parts, formatReadableSize(memory_usage) from merges;
SELECT
database,
table,
elapsed,
progress,
num_parts,
formatReadableSize(memory_usage)
FROM merges
┌─database─┬─table──────┬──────elapsed─┬─────────────progress─┬─num_parts─┬─formatReadableSize(memory_usage)─┐
│ default  │ histograms │ 79.471550264 │ 0.005503541730779245 │        12 │ 1.33 GiB                         │
│ default  │ histograms │ 79.391981317 │  0.03652433628216813 │        11 │ 1.50 GiB                         │
│ default  │ histograms │ 79.384428457 │ 0.034385429061546756 │        13 │ 1.49 GiB                         │
│ default  │ histograms │ 79.348397932 │ 0.008764624022166247 │        16 │ 1.56 GiB                         │
│ default  │ histograms │ 79.342144671 │ 0.013883736778589699 │        12 │ 1.49 GiB                         │
│ default  │ histograms │ 79.229500032 │ 0.016518391893296982 │        12 │ 1.69 GiB                         │
│ default  │ histograms │ 79.174108339 │  0.02235135562154062 │         8 │ 1.35 GiB                         │
│ default  │ histograms │ 79.144535614 │ 0.059905220644892496 │         4 │ 1.46 GiB                         │
│ default  │ histograms │ 79.114858427 │  0.05633146665572432 │         4 │ 1.52 GiB                         │
│ default  │ histograms │ 79.082340708 │ 0.032979777123391125 │         5 │ 1.38 GiB                         │
│ default  │ histograms │ 78.868886937 │  0.04081461137284325 │         5 │ 1.42 GiB                         │
│ default  │ histograms │ 78.855796039 │ 0.035865712972164815 │         5 │ 1.30 GiB                         │
│ default  │ histograms │ 78.831255279 │  0.17175445457604885 │        11 │ 1.45 GiB                         │
└──────────┴────────────┴──────────────┴──────────────────────┴───────────┴──────────────────────────────────┘
Removing the bloom filter index (alter table histograms on cluster metrics_qa drop index tvl_bf;) drops the memory usage massively:
metrics_qa :) select database, table, elapsed, progress*100, num_parts, formatReadableSize(memory_usage) from system.merges;
SELECT
database,
table,
elapsed,
progress * 100,
num_parts,
formatReadableSize(memory_usage)
FROM system.merges
┌─database─┬─table──────┬───────elapsed─┬─multiply(progress, 100)─┬─num_parts─┬─formatReadableSize(memory_usage)─┐
│ default  │ histograms │ 166.999396023 │      12.882290374909966 │        13 │ 266.11 MiB                       │
│ default  │ histograms │ 166.804492109 │      3.7402523443430375 │        16 │ 352.81 MiB                       │
│ default  │ histograms │ 166.726134664 │       80.19551790831188 │         7 │ 161.79 MiB                       │
│ default  │ histograms │ 166.725280153 │      1.2951098841319313 │       282 │ 6.47 GiB                         │
│ default  │ histograms │ 166.716085174 │       66.28287656513143 │         9 │ 199.50 MiB                       │
│ default  │ histograms │ 166.711055506 │      1.3800261116048564 │       243 │ 5.96 GiB                         │
│ default  │ histograms │ 166.686738206 │       65.46335768719369 │         9 │ 190.51 MiB                       │
│ default  │ histograms │ 166.662355568 │       68.70942321198352 │         8 │ 184.57 MiB                       │
│ default  │ histograms │ 166.602394295 │       77.64397661812055 │         7 │ 160.97 MiB                       │
│ default  │ histograms │  166.58240519 │       72.00820582232703 │         6 │ 149.59 MiB                       │
│ default  │ histograms │ 166.482173293 │       66.83028057574408 │         8 │ 176.86 MiB                       │
│ default  │ histograms │ 166.430629511 │       68.56970075887493 │         7 │ 171.74 MiB                       │
│ default  │ histograms │   166.3774191 │       76.39837731533673 │         7 │ 166.11 MiB                       │
└──────────┴────────────┴───────────────┴─────────────────────────┴───────────┴──────────────────────────────────┘
13 rows in set. Elapsed: 0.012 sec.
metrics_qa :)
Show your table definition (CREATE QUERY), please.
CREATE TABLE default.histograms
(
    `Timestamp` DateTime CODEC(DoubleDelta, LZ4),
    `Client` LowCardinality(String),
    `Path` LowCardinality(String),
    `Tags.Key` Array(LowCardinality(String)),
    `Tags.Value` Array(String) CODEC(LZ4HC(0)),
    `Range` Array(Float32),
    `Steps` Float32,
    `IndexNotation` Int32,
    `Histogram` Array(Float32)
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/histograms/{shard_metrics_qa}', '{replica_metrics_qa}')
PARTITION BY toStartOfDay(Timestamp)
ORDER BY (Path, Timestamp)
TTL Timestamp + toIntervalDay(21)
SETTINGS index_granularity = 8192
That is the table schema. This is the index that was added:
alter table slomo.histograms on cluster metrics_qa add index tkv_bf Tags.Value TYPE bloom_filter(0.01) GRANULARITY 8192;
The GRANULARITY is too big. Can it be adjusted to 3 or 5? For detailed information, see: https://github.com/ClickHouse/ClickHouse/issues/7517#issuecomment-549224927
For GRANULARITY = 5, the bloom filter builds one index entry for each block of 5 adjacent granules (one granule's size is specified when creating the MergeTree table; the default is 8192 rows). GRANULARITY = 8192 is too big, which means that one index block has to compute the index information for 8192 * 8192 rows.
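A sketch of the remediation, reusing the ALTER statements from this thread (GRANULARITY 4 is just an illustrative low value, and whether `ALTER ... MATERIALIZE INDEX` is available depends on the server version):

```sql
-- Drop the overly coarse index created earlier.
ALTER TABLE slomo.histograms ON CLUSTER metrics_qa DROP INDEX tkv_bf;

-- Re-create it with a small GRANULARITY. Each bloom filter then covers
-- 4 granules * 8192 rows/granule = 32768 rows, instead of
-- 8192 granules * 8192 rows/granule = 67108864 rows per index block.
ALTER TABLE slomo.histograms ON CLUSTER metrics_qa
    ADD INDEX tkv_bf `Tags.Value` TYPE bloom_filter(0.01) GRANULARITY 4;

-- Optionally rebuild the index for existing parts (runs as a background mutation).
ALTER TABLE slomo.histograms ON CLUSTER metrics_qa MATERIALIZE INDEX tkv_bf;
```

New inserts and merges will build the index at the lower granularity either way; the MATERIALIZE step only matters for parts written before the index was re-created.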
@zhang2014 Thanks for the information.
I've tried to recreate the OOM situation in my QA cluster and cannot replicate it. It might have just been a perfect storm of pending merges when the overly granular index was applied.
The bloom filter feature of ClickHouse is awesome and very fast to filter rows.
But I think there are two issues in the bloom filter:
1. The GRANULARITY of the index is not applied.
2. Hashes are made for (the number of elements in an array) * (the number of arrays) * (the number of merges).
If you don't mind, I would like to try to fix these issues.
Thank you.
@volfco
I am fixing the perfect storm of pending merges at https://github.com/ClickHouse/ClickHouse/pull/8242
Thank you.
@volfco
Some bugs are fixed in https://github.com/ClickHouse/ClickHouse/pull/8242.
If you are still interested in bloom_filter indexes, please retest the perfect storm of pending merges.
If the result of the test is fine, then please close this issue.
Thank you.
@achimbab I don't have a solid idea of the conditions that existed for this to happen. So let's assume #8242 fixes it, and I'll re-open this or link it to a new issue if something like this happens again.