Simple SELECT query leads to "Memory limit (total) exceeded" due to wrong reporting by MemoryTracker (no real usage visible in top)
I execute a simple query (select a from b where c = 1) sequentially in one thread via HTTP:
for i in `seq 1 500`; do curl "http://ch:8123/?query=SELECT%20a%20FROM%20%60db%60.b%20WHERE%20c%3D1%20format%20JSON"; done;
It leads to this error within about 15 seconds. The database holds ~200 MB of compressed data.
It consumes ~4 MB per request, accumulating up to the 3.6 GB limit. MemoryTracking grows continuously and is never reset unless the requests stop (or I switch to a different query). But there is no real growth in top.
The result contains 0 rows, if that matters.
Version 20.9.2.20 stable; 20.6 and 20.8 behaved the same.
Changing max_server_memory_usage_to_ram_ratio in the server config is only a dirty workaround.
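For reference, the dirty workaround mentioned above amounts to raising the server-wide cap in config.xml, something like the fragment below (the 0.9 ratio is illustrative; this only delays the spurious errors caused by the accounting drift, it does not fix them):

```xml
<!-- config.xml: cap server-wide memory at a fraction of physical RAM;
     raising it only postpones the bogus "Memory limit (total) exceeded" -->
<max_server_memory_usage_to_ram_ratio>0.9</max_server_memory_usage_to_ram_ratio>
```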
Reproduced on 20.11:
for i in `seq 1 5000`; do curl -s -o /dev/null "http://localhost:8123/?query=select+*+from+system.query_log+limit+100+format+JSON" ; echo -n 'ps rss:' `ps -C clickhouse-serv -o rss=`' | '; echo "select metric,':', formatReadableSize(value) from system.metrics where metric like 'MemoryTracking'"|clickhouse-client; done;
ps rss: 454268 | MemoryTracking : 1.33 GiB
ps rss: 457828 | MemoryTracking : 1.33 GiB
ps rss: 470712 | MemoryTracking : 1.33 GiB
ps rss: 474896 | MemoryTracking : 1.34 GiB
ps rss: 476052 | MemoryTracking : 1.34 GiB
ps rss: 476776 | MemoryTracking : 1.34 GiB
....
ps rss: 576948 | MemoryTracking : 3.11 GiB
ps rss: 576948 | MemoryTracking : 3.11 GiB
ps rss: 576948 | MemoryTracking : 3.12 GiB
ps rss: 576948 | MemoryTracking : 564.27 MiB
ps rss: 576948 | MemoryTracking : 569.13 MiB
ps rss: 576948 | MemoryTracking : 573.99 MiB
ps rss: 576948 | MemoryTracking : 578.85 MiB
ps rss: 576948 | MemoryTracking : 583.71 MiB
ps rss: 576948 | MemoryTracking : 588.58 MiB
ps rss: 578004 | MemoryTracking : 593.44 MiB
....
ps rss: 509984 | MemoryTracking : 11.26 GiB
ps rss: 511040 | MemoryTracking : 11.26 GiB
ps rss: 511040 | MemoryTracking : 11.27 GiB
ps rss: 511040 | MemoryTracking : 11.27 GiB
ps rss: 511040 | MemoryTracking : 495.03 MiB
ps rss: 511304 | MemoryTracking : 499.58 MiB
ps rss: 511304 | MemoryTracking : 504.13 MiB
ps rss: 511568 | MemoryTracking : 508.68 MiB
It does not matter whether the format is JSON, TSV, CSV, or anything else.
The same happens with clickhouse-client instead of curl:
for i in `seq 1 5000`; do clickhouse-client --query 'select * from system.query_log limit 100 format TSV' >/dev/null ; echo -n 'ps rss:' `ps -C clickhouse-serv -o rss=`' | '; echo "select metric,':', formatReadableSize(value) from system.metrics where metric like 'MemoryTracking'"|clickhouse-client; done;
ps rss: 503380 | MemoryTracking : 597.90 MiB
ps rss: 503908 | MemoryTracking : 597.90 MiB
ps rss: 513920 | MemoryTracking : 597.90 MiB
ps rss: 526284 | MemoryTracking : 597.90 MiB
ps rss: 519276 | MemoryTracking : 597.90 MiB
ps rss: 531672 | MemoryTracking : 597.90 MiB
ps rss: 530152 | MemoryTracking : 597.90 MiB
ps rss: 533580 | MemoryTracking : 597.90 MiB
ps rss: 534636 | MemoryTracking : 597.90 MiB
ps rss: 535692 | MemoryTracking : 597.90 MiB
ps rss: 545948 | MemoryTracking : 597.90 MiB
ps rss: 556692 | MemoryTracking : 597.90 MiB
ps rss: 550620 | MemoryTracking : 597.90 MiB
ps rss: 554052 | MemoryTracking : 597.90 MiB
ps rss: 554844 | MemoryTracking : 597.90 MiB
ps rss: 557748 | MemoryTracking : 597.90 MiB
ps rss: 551328 | MemoryTracking : 597.90 MiB
ps rss: 552912 | MemoryTracking : 597.90 MiB
....
ps rss: 636528 | MemoryTracking : 627.13 MiB
ps rss: 637056 | MemoryTracking : 627.13 MiB
ps rss: 636988 | MemoryTracking : 627.13 MiB
ps rss: 636988 | MemoryTracking : 627.13 MiB
ps rss: 636644 | MemoryTracking : 627.14 MiB
ps rss: 637700 | MemoryTracking : 627.14 MiB
...
ps rss: 645712 | MemoryTracking : 594.29 MiB
ps rss: 643912 | MemoryTracking : 594.29 MiB
ps rss: 646816 | MemoryTracking : 594.29 MiB
ps rss: 657112 | MemoryTracking : 594.29 MiB
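One plausible reading of the pattern in the logs above (MemoryTracking climbing while ps rss stays flat) is that per-thread accounting batches small allocation deltas and some frees never reach the global counter. Below is a toy shell sketch of that mechanism, not ClickHouse's actual code: the 4 MiB threshold matches the reported default of max_untracked_memory, but the 6 MiB / 2 MiB split per query and the "late free bypasses the tracker" step are assumptions made for illustration.

```shell
#!/bin/sh
# Toy model of batched memory accounting (illustrative only).
MAX_UNTRACKED=4194304   # 4 MiB: threshold before a batch is made globally visible
global=0                # what system.metrics would report as MemoryTracking
untracked=0             # thread-local batch, invisible to the global counter

# Batch a signed delta; flush to the global counter once the batch
# exceeds the threshold in absolute value.
track() {
  untracked=$((untracked + $1))
  abs=${untracked#-}
  if [ "$abs" -gt "$MAX_UNTRACKED" ]; then
    global=$((global + untracked))
    untracked=0
  fi
}

# One simulated query: a 6 MiB buffer is allocated (and tracked), but only
# 2 MiB of it is freed through the query's tracker; the remaining 4 MiB is
# (hypothetically) freed later by another thread, bypassing the tracker.
for i in $(seq 1 100); do
  track 6291456                               # allocation crosses threshold -> tracked
  track -2097152                              # partial free inside the query thread
  global=$((global + untracked)); untracked=0 # flush on thread exit
  # the remaining 4 MiB free is never reported -> global drifts by 4 MiB/query
done
printf 'drift after 100 queries: %s MiB\n' $((global / 1024 / 1024))
```

Under these assumptions each query leaks exactly 4 MiB of *accounting* (no real memory), which is in the same ballpark as the ~4 MB per request observed in the issue.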
Indeed, I can confirm this.
I won't clutter up the comment with details, but if you are curious you can take a look at the PRs; they contain extensive descriptions (you can also look into the commit messages).
16122 - almost fixes the MemoryTracking accounting, but not 100%: there is still 0.01MiB leaked (P.S. I used another query that does not require any data, so you can go ahead and check your query):
$ for i in {1..5000}; do curl -s -o /dev/null "http://localhost:8123/" -d@- <<<"select groupArray(repeat('a', 1000)) from numbers(10000) group by number%10 FORMAT JSON" ; echo -n 'ps rss:' `ps -C clickhouse -o rss=`' | '; echo "select metric,':', formatReadableSize(value) from system.metrics where metric like 'MemoryTracking'"| ./clickhouse client; done
ps rss: 188684 | MemoryTracking : 157.84 MiB
ps rss: 188684 | MemoryTracking : 157.85 MiB
ps rss: 188684 | MemoryTracking : 157.85 MiB
ps rss: 188684 | MemoryTracking : 157.86 MiB
ps rss: 188684 | MemoryTracking : 157.87 MiB
ps rss: 188684 | MemoryTracking : 157.87 MiB
ps rss: 188684 | MemoryTracking : 157.88 MiB
ps rss: 188684 | MemoryTracking : 157.89 MiB
ps rss: 188684 | MemoryTracking : 157.89 MiB
ps rss: 188684 | MemoryTracking : 157.90 MiB
ps rss: 188684 | MemoryTracking : 157.90 MiB
ps rss: 188684 | MemoryTracking : 157.91 MiB
ps rss: 188684 | MemoryTracking : 157.92 MiB
ps rss: 188684 | MemoryTracking : 157.92 MiB
ps rss: 190072 | MemoryTracking : 157.93 MiB
ps rss: 190072 | MemoryTracking : 157.94 MiB
ps rss: 190072 | MemoryTracking : 157.94 MiB
ps rss: 190072 | MemoryTracking : 157.95 MiB
ps rss: 190072 | MemoryTracking : 157.95 MiB
ps rss: 190072 | MemoryTracking : 157.96 MiB
ps rss: 190072 | MemoryTracking : 157.97 MiB
ps rss: 190072 | MemoryTracking : 157.97 MiB
ps rss: 190072 | MemoryTracking : 157.98 MiB
16121 (the most doubtful PR, so don't consider it seriously for now) - this makes tracking of all allocations (except for the main thread, but it does not do anything once the server has started accepting connections; before, you needed to create a memory tracker explicitly). This patch is still work-in-progress and doubtful (for me too, so don't expect it in upstream soon). It fixes the memory drift for the same query as above, but via clickhouse-client (actually for any query over a newly created TCP connection), with 0.06MiB of drift (while before it was 4MiB); still, it should be fixed in a different way:
$ for i in {1..5000}; do ./clickhouse client --max_threads=1 -q "select groupArray(repeat('a', 1000)) from numbers(10000) group by number%10 FORMAT Null" ; echo "select metric,':', formatReadableSize(value) from system.metrics where metric like 'MemoryTracking'"| ./clickhouse client; done
MemoryTracking : 190.73 MiB
MemoryTracking : 190.80 MiB
MemoryTracking : 190.86 MiB
MemoryTracking : 190.92 MiB
MemoryTracking : 190.98 MiB
MemoryTracking : 191.05 MiB
MemoryTracking : 191.11 MiB
MemoryTracking : 191.17 MiB
MemoryTracking : 191.23 MiB
MemoryTracking : 191.30 MiB
MemoryTracking : 191.36 MiB
MemoryTracking : 191.42 MiB
MemoryTracking : 191.48 MiB
MemoryTracking : 191.55 MiB
MemoryTracking : 191.61 MiB
MemoryTracking : 191.67 MiB
MemoryTracking : 191.73 MiB
MemoryTracking : 191.80 MiB
MemoryTracking : 191.86 MiB
MemoryTracking : 191.92 MiB
bonus #16101 - fixes the max_untracked_memory setting (and some others) for queries via HTTP
P.S. If you want to test it yourself, you can get binaries from the ClickHouse build check of the relevant PR.
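A side note on the setting the bonus PR touches: max_untracked_memory controls how large a thread-local batch of allocation deltas may grow before it is pushed to the shared memory trackers. If batching gets in the way of your own measurements, it can be lowered per profile; a sketch for users.xml, assuming the default profile (the value 0 forces immediate accounting and adds some tracking overhead, so it is for debugging only):

```xml
<!-- users.xml: push every allocation delta to the memory trackers immediately -->
<profiles>
    <default>
        <max_untracked_memory>0</max_untracked_memory>
    </default>
</profiles>
```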