Using version a2ed8267add1cfc28fa2ca17f7227a816921e7d7: if I run a query like "SELECT 1" in a loop with wait_end_of_query=1, after about 7000 iterations all queries start to fail with:
DynamicQueryHandler: Code: 241, e.displayText() = DB::Exception: Memory limit (total) exceeded: would use 14.40 GiB (attempt to allocate chunk of 4198133 bytes), maximum: 14.40 GiB, Stack trace (when copying this message, always include the lines below):
If wait_end_of_query is not specified the problem doesn't occur.
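For context, a minimal sketch of such a loop over the HTTP interface, assuming a local server on the default port 8123 and the requests library (not the exact client the reporter used):
# Hypothetical repro sketch: repeat "SELECT 1" with wait_end_of_query=1
# against a local server on the default HTTP port 8123.
import requests

for i in range(10000):
    r = requests.post(
        "http://127.0.0.1:8123/",
        params={"wait_end_of_query": 1},
        data="SELECT 1",
    )
    if r.status_code != 200:
        # Reportedly, after a few thousand iterations the server starts
        # answering with Code 241 (Memory limit (total) exceeded).
        print(i, r.text)
        break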
Cannot reproduce (even on that particular commit): 10k requests, ~1k in parallel, and everything is fine (RSS/VIRT does not grow):
$ time yes '127.1:8123/?wait_end_of_query=1' | head -n10000 | xargs -P10000 curl -s -d 'select 1' | grep -x -c 1
10000
real 0m3.427s
Hello @azat,
I can replicate the problem with your query. Please look into the logs (they are spammed with DynamicQueryHandler: Code: 241, e.displayText() = DB::Exception: Memory limit (total) exceeded:). If you look at the memory, it is not increasing. There is no memory leak; the problem is in the MemoryTracker.
Silviu
Reproduced (forgot to set memory limit for server)
Using version a2ed826
Hm, and this didn't happen before?
@azat I have no idea :) Now I wanted to use wait_end_of_query, but everything seems broken once the problem occurs. Even config auto reload no longer works and throws the same error.
@azat I found out that if I do 10K small inserts one after the other, without any wait_end_of_query, I get the same issue. Is there any way I can disable this memory limit check?
I found out that if I do 10K small inserts one after the other, without any wait_end_of_query, I get the same issue.
Yes, this is kind of expected behaviour.
Is there any way I can disable this memory limit check?
You can add the following into config.xml:
<max_server_memory_usage_to_ram_ratio>100</max_server_memory_usage_to_ram_ratio>
To verify, take a look into the logs:
Setting max_server_memory_usage ...
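If it helps, here is a minimal sketch of that verification, assuming the default log path /var/log/clickhouse-server/clickhouse-server.log:
# Sketch: confirm the effective limit after a restart by scanning the server
# log for the "Setting max_server_memory_usage" startup message
# (assumes the default log path).
with open("/var/log/clickhouse-server/clickhouse-server.log") as log:
    for line in log:
        if "Setting max_server_memory_usage" in line:
            print(line.rstrip())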
Hm, and this didn't happen before?
And yes, it pops up only now, since max_server_memory_usage was added only in 20.4 (#10421, #10362).
P.S. this memory tracking accounting is reset every 90 seconds (AFAIR), but because these queries are too fast (since they are no-ops) this does not work here.
@azat I kept the server open for a few hours without sending any query, and it seems the counter didn't reset. All queries are still failing.
I kept the server open for a few hours without sending any query, and it seems the counter didn't reset. All queries are still failing.
Have you set the server setting max_server_memory_usage_to_ram_ratio from the comment above?
And what is the RSS of the clickhouse-server process?
I added max_server_memory_usage_to_ram_ratio and this fixed all problems. But without it, once the problem appears the server becomes unusable, even though the clickhouse-server process doesn't use more than 300 MB of memory on a 16 GB server... Even if you wait 1 hour it will not become usable until you restart.
clickhouse-server process doesn't have more than 300 MB of memory
of RSS?
Can you post output of the following?
select metric, formatReadableSize(value) from system.metrics where metric = 'MemoryTracking'
this query should work since it uses less than max_untracked_memory, 4 MB by default
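If it is more convenient, the same check can be run over the HTTP interface; a minimal sketch, assuming the default port 8123 and the requests library:
# Sketch: run the MemoryTracking check above over HTTP
# (assumes a local server on the default port 8123 and the `requests` library).
import requests

sql = ("select metric, formatReadableSize(value) "
       "from system.metrics where metric = 'MemoryTracking'")
print(requests.post("http://127.0.0.1:8123/", data=sql).text)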
Also, maybe you have some background queries that you forgot to mention?
same problem with 20.4.3.16 and 20.4.2.9
Hi,
I got the same problem and setting max_server_memory_usage_to_ram_ratio does not fix the problem. Anyone have a solution yet?
this is my error
ServerException: Code: 241.
DB::Exception: Memory limit (for query) exceeded: would use 18.63 GiB (attempt to allocate chunk of 4210950 bytes), maximum: 18.63 GiB: While executing MergeTreeThread. Stack trace:
0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x103fa390 in /usr/bin/clickhouse
1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x900b8ad in /usr/bin/clickhouse
2. ? @ 0x900af73 in /usr/bin/clickhouse
3. MemoryTracker::alloc(long) @ 0x90095e7 in /usr/bin/clickhouse
4. void DB::filterArraysImpl<char8_t>(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 15ul, 16ul>&, DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, long) @ 0xd60c57c in /usr/bin/clickhouse
5. DB::ColumnString::filter(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, long) const @ 0xd5a9385 in /usr/bin/clickhouse
6. ? @ 0xd9b2735 in /usr/bin/clickhouse
7. DB::MergeTreeRangeReader::read(unsigned long, std::__1::deque<DB::MarkRange, std::__1::allocator<DB::MarkRange> >&) @ 0xd9b64fa in /usr/bin/clickhouse
8. DB::MergeTreeBaseSelectProcessor::readFromPartImpl() @ 0xd9af83d in /usr/bin/clickhouse
9. DB::MergeTreeBaseSelectProcessor::generate() @ 0xd9b0417 in /usr/bin/clickhouse
10. DB::ISource::work() @ 0xdad2fab in /usr/bin/clickhouse
11. DB::SourceWithProgress::work() @ 0xdd18887 in /usr/bin/clickhouse
12. ? @ 0xdafe041 in /usr/bin/clickhouse
13. DB::PipelineExecutor::executeSingleThread(unsigned long, unsigned long) @ 0xdb01ffd in /usr/bin/clickhouse
14. ? @ 0xdb029c4 in /usr/bin/clickhouse
15. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x9031267 in /usr/bin/clickhouse
16. ? @ 0x902f753 in /usr/bin/clickhouse
17. start_thread @ 0x7e65 in /usr/lib64/libpthread-2.17.so
18. clone @ 0xfe88d in /usr/lib64/libc-2.17.so
I got the same problem and setting max_server_memory_usage_to_ram_ratio does not fix the problem. Anyone have a solution yet?
@HooliBox which version do you have? And please provide startup logs (at least the Setting max_server_memory_usage ... messages).
I am on version 20.4.5.36 (official build).
I ran grep max_server_memory_usage /var/log/clickhouse-server/clickhouse-server.log:
2020.06.21 11:20:40.539962 [ 506 ] {} <Information> Application: Setting max_server_memory_usage was set to 2.86 TiB
2020.06.21 11:21:59.810640 [ 679 ] {} <Information> Application: Setting max_server_memory_usage was set to 2.57 TiB
2020.06.21 11:26:18.303317 [ 846 ] {} <Information> Application: Setting max_server_memory_usage was set to 2.86 TiB
2020.06.21 12:22:44.987683 [ 314 ] {} <Information> Application: Setting max_server_memory_usage was set to 2.86 TiB
@HooliBox but in your case the per-query limit was exceeded, not the total (as in the issue subject):
DB::Exception: Memory limit (for query) exceeded: would use 18.63 GiB (attempt to allocate chunk of 4210950 bytes), maximum: 18.63 GiB: While executing MergeTreeThread. Stack trace:
So take a look at max_memory_usage* settings
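For instance, the per-query limit can be raised for a single HTTP request by passing the setting as a URL parameter; a minimal sketch with an illustrative value, assuming the default port 8123 and the requests library:
# Sketch: raise the per-query memory limit for one HTTP request
# (illustrative value; assumes the default port 8123 and the `requests` library).
import requests

r = requests.post(
    "http://127.0.0.1:8123/",
    params={"max_memory_usage": 20_000_000_000},  # ~20 GB, example only
    data="SELECT 1",
)
print(r.status_code, r.text)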
Here is what I did: I ran the same query 10 times. The first couple of times, the query worked fine. Then it stopped working and started complaining about memory.
from sqlalchemy import create_engine, Column, MetaData, literal
from clickhouse_sqlalchemy import Table, make_session, get_declarative_base, types, engines
import pandas as pd
import numpy as np
uri = 'clickhouse://[email protected]/'
engine = create_engine(uri)
metadata = MetaData()
sql = """
SELECT toYYYYMMDD(trade_datetime) AS trade_dt, s_info_windcode, SUM(volume) AS volume FROM minute_bar
WHERE toInt64(toYYYYMMDDhhmmss(trade_datetime)) - toInt64(toYYYYMMDD(trade_datetime)) * 1000000 > 94500
AND toInt64(toYYYYMMDDhhmmss(trade_datetime)) - toInt64(toYYYYMMDD(trade_datetime)) * 1000000 < 141500
GROUP BY toYYYYMMDD(trade_datetime), s_info_windcode"""
for i in range(10):
    res = engine.execute(sql)
    data = res.fetchall()
    print(i, data[0])
and here is the output:
0 ('20191025', '603915.SH', '2157700')
1 ('20130208', '002059.SZ', '1159964')
2 ('20191025', '603915.SH', '2157700')
3 ('20130208', '002059.SZ', '1159964')
4 ('20130208', '002059.SZ', '1159964')
5 ('20191025', '603915.SH', '2157700')
---------------------------------------------------------------------------
DatabaseException Traceback (most recent call last)
<ipython-input-2-d1cd821b006b> in <module>
7
8 for i in range(10):
----> 9 res = engine.execute(sql)
10 data = res.fetchall()
11 print(i, data[0])
/opt/conda/lib/python3.7/site-packages/sqlalchemy/engine/base.py in execute(self, statement, *multiparams, **params)
2242
2243 connection = self._contextual_connect(close_with_result=True)
-> 2244 return connection.execute(statement, *multiparams, **params)
2245
2246 def scalar(self, statement, *multiparams, **params):
/opt/conda/lib/python3.7/site-packages/sqlalchemy/engine/base.py in execute(self, object_, *multiparams, **params)
1010 """
1011 if isinstance(object_, util.string_types[0]):
-> 1012 return self._execute_text(object_, multiparams, params)
1013 try:
1014 meth = object_._execute_on_connection
/opt/conda/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_text(self, statement, multiparams, params)
1185 parameters,
1186 statement,
-> 1187 parameters,
1188 )
1189 if self._has_events or self.engine._has_events:
/opt/conda/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
1322 except BaseException as e:
1323 self._handle_dbapi_exception(
-> 1324 e, statement, parameters, cursor, context
1325 )
1326
/opt/conda/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _handle_dbapi_exception(self, e, statement, parameters, cursor, context)
1519 )
1520 else:
-> 1521 util.raise_(exc_info[1], with_traceback=exc_info[2])
1522
1523 finally:
/opt/conda/lib/python3.7/site-packages/sqlalchemy/util/compat.py in raise_(***failed resolving arguments***)
176
177 try:
--> 178 raise exception
179 finally:
180 # credit to
/opt/conda/lib/python3.7/site-packages/sqlalchemy/engine/base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
1282 if not evt_handled:
1283 self.dialect.do_execute(
-> 1284 cursor, statement, parameters, context
1285 )
1286
/opt/conda/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/base.py in do_execute(self, cursor, statement, parameters, context)
874
875 def do_execute(self, cursor, statement, parameters, context=None):
--> 876 cursor.execute(statement, parameters, context=context)
877
878 def _check_unicode_returns(self, connection, additional_tests=None):
/opt/conda/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/http/connector.py in execute(self, operation, parameters, context)
115 response_gen = transport.execute(raw_sql, params=params)
116
--> 117 self._process_response(response_gen)
118 self._end_query()
119
/opt/conda/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/http/connector.py in _process_response(self, response)
214 response = iter(response)
215
--> 216 self._columns = next(response, None)
217 self._types = next(response, None)
218 self._response = response
/opt/conda/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/http/transport.py in execute(self, query, params)
89 there is nothing to return.
90 """
---> 91 r = self._send(query, params=params, stream=True)
92 lines = r.iter_lines()
93 try:
/opt/conda/lib/python3.7/site-packages/clickhouse_sqlalchemy/drivers/http/transport.py in _send(self, data, params, stream)
135 orig = HTTPException(r.text)
136 orig.code = r.status_code
--> 137 raise DatabaseException(orig)
138 return r
DatabaseException: Orig exception: Code: 241, e.displayText() = DB::Exception: Memory limit (total) exceeded: would use 52.74 GiB (attempt to allocate chunk of 4718592 bytes), maximum: 52.73 GiB: While executing AggregatingTransform (version 20.4.5.36 (official build))
This seems like the same issue as above, where the memory is not cleared.
Ok, here are some questions for you:
And also some notes regarding 20.4:
P.S. and it looks like you have a different issue
I pasted the log file here.
I tried both TCP and HTTP; both failed with the same error. The only difference is that TCP takes a couple more iterations to fail; I guess it uses less memory? Not sure.
From your logs:
2020.06.21 14:00:53.762580 [ 859 ] {} MemoryTracker: Peak memory usage (for query): 6.15 GiB.
And the error when the query fails:
DB::Exception: Memory limit (for query) exceeded: would use 18.63 GiB (attempt to allocate chunk of 4210950 bytes), maximum: 18.63 GiB: While executing MergeTreeThread. Stack trace:
Possible issues:
Anyway, this looks like a separate issue to me (and actually this issue already contains 3 separate issues), so could you report it separately? (Including all the details you provided; you can also add the full server log from startup until the query starts failing, since that does not seem to be a problem for you.)
I used the same query and no new data was inserted.
I could create a new issue.