Using ClickHouse 19.8.3.8 on Debian 9 on WSL, installed from the Yandex repository. The problem also appeared on version 19.7.5.27.
Queries were executed in the ClickHouse CLI.
Test table is created with this query:
CREATE TABLE kafka_test2 (
timestamp UInt64,
assetId String,
dimension1 Nullable(String),
dimension2 Nullable(String),
dimension3 Nullable(String),
dimension4 Nullable(String),
dimension5 Nullable(String),
metric1 Nullable(Float64)
) ENGINE = Kafka('localhost:9092', 'test_metrics', 'test-ch-group-100', 'JSONEachRow');
The table name kafka_test2 is unique and was never used before, and the same applies to the consumer group test-ch-group-100.
Trace log:
2019.06.14 13:20:39.192913 [ 49 ] {3f3f4d3b-105c-4a3b-886b-b9c4a0cb4c53} <Debug> executeQuery: (from 127.0.0.1:34618) CREATE TABLE kafka_test2 ( timestamp UInt64, assetId String, dimension1 Nullable(String),
dimension2 Nullable(String), dimension3 Nullable(String), dimension4 Nullable(String), dimension5 Nullable(String), metric1 Nullable(Float64) ) ENGINE = Kafka('localhost:9092', 'test_metrics', 'test-ch-group-100', 'JSONEachRow');
2019.06.14 13:20:39.203243 [ 49 ] {3f3f4d3b-105c-4a3b-886b-b9c4a0cb4c53} <Trace> StorageKafka (kafka_test2): Setting brokers: localhost:9092
2019.06.14 13:20:39.203310 [ 49 ] {3f3f4d3b-105c-4a3b-886b-b9c4a0cb4c53} <Trace> StorageKafka (kafka_test2): Setting Group ID: test-ch-group-100 Client ID: clickhouse
2019.06.14 13:20:39.204507 [ 49 ] {3f3f4d3b-105c-4a3b-886b-b9c4a0cb4c53} <Debug> MemoryTracker: Peak memory usage (for query): 397.00 B.
2019.06.14 13:20:39.204674 [ 49 ] {3f3f4d3b-105c-4a3b-886b-b9c4a0cb4c53} <Information> TCPHandler: Processed in 0.012 sec.
Then I run DROP TABLE. CLI output:
Darkrai.localdomain :) DROP TABLE kafka_test2;
DROP TABLE kafka_test2
Timeout exceeded while receiving data from server. Waited for 300 seconds, timeout is 300 seconds.
Cancelling query.
It hangs on cancelling.
The server log contains only one message:
2019.06.14 13:20:59.621685 [ 49 ] {ff0392c1-0ac8-4dff-94fe-f179e8853d6f} <Debug> executeQuery: (from 127.0.0.1:34618) DROP TABLE kafka_test2;
When stopping the server from the Linux console, it hangs at:
v1rusw0rm@Darkrai:~$ sudo service clickhouse-server stop
Stop clickhouse-server service:
And the server log shows:
2019.06.14 13:43:29.699305 [ 50 ] {} <Information> Application: Received termination signal (Terminated)
2019.06.14 13:43:29.699892 [ 1 ] {} <Debug> Application: Received termination signal.
2019.06.14 13:43:29.700008 [ 1 ] {} <Debug> Application: Waiting for current connections to close.
2019.06.14 13:43:30.936872 [ 1 ] {} <Debug> Application: Closed all listening sockets. Waiting for 2 outstanding connections.
2019.06.14 13:43:35.983936 [ 1 ] {} <Debug> Application: Closed connections. But 2 remains. Tip: To increase wait time add to config: <shutdown_wait_unfinished>60</shutdown_wait_unfinished>
2019.06.14 13:43:55.987254 [ 1 ] {} <Information> Application: Shutting down storages.
The server's error log is clean.
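For reference, the shutdown_wait_unfinished tip printed in the log above refers to the server config. A minimal sketch, assuming the default config file at /etc/clickhouse-server/config.xml (this only lengthens the shutdown wait, it does not fix the hang itself):
<yandex>
    <!-- Give outstanding connections more time to finish before the server shuts down. -->
    <shutdown_wait_unfinished>60</shutdown_wait_unfinished>
</yandex>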
See also #2796.
The bug only happens if nothing was selected/fetched from the table. Log after a SELECT:
2019.06.14 17:04:04.375179 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Debug> executeQuery: (from 127.0.0.1:31154) SELECT * FROM kafka_raw6 LIMIT 1;
2019.06.14 17:04:04.375773 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Debug> StorageKafka (kafka_raw6): Starting reading 1 streams
2019.06.14 17:04:04.375810 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2019.06.14 17:04:04.375894 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Debug> executeQuery: Query pipeline:
Limit
Expression
Expression
Kafka
2019.06.14 17:04:04.883465 [ 45 ] {} <Trace> StorageKafka (kafka_raw6): Polled batch of 14655 messages
2019.06.14 17:04:04.884045 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Trace> StorageKafka (kafka_raw6): Committing message with offset 3
2019.06.14 17:04:04.884260 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Debug> MemoryTracker: Peak memory usage (for query): 1.10 MiB.
2019.06.14 17:04:04.884371 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Debug> MemoryTracker: Peak memory usage (total): 1.10 MiB.
2019.06.14 17:04:04.884435 [ 41 ] {9e9eb6a5-94e7-4917-a0fb-5108050e2b4c} <Information> TCPHandler: Processed in 0.509 sec.
And then after a successful DROP TABLE:
2019.06.14 17:04:16.578412 [ 41 ] {a944cfd4-b8c8-4eca-87a9-6507c72eabb0} <Debug> executeQuery: (from 127.0.0.1:31154) DROP TABLE kafka_raw6;
2019.06.14 17:04:16.588628 [ 41 ] {a944cfd4-b8c8-4eca-87a9-6507c72eabb0} <Trace> StorageKafka (kafka_raw6): Waiting for cleanup
2019.06.14 17:04:19.593449 [ 41 ] {a944cfd4-b8c8-4eca-87a9-6507c72eabb0} <Debug> MemoryTracker: Peak memory usage (for query): 217.00 B.
2019.06.14 17:04:19.593640 [ 41 ] {a944cfd4-b8c8-4eca-87a9-6507c72eabb0} <Debug> MemoryTracker: Peak memory usage (total): 217.00 B.
2019.06.14 17:04:19.593688 [ 41 ] {a944cfd4-b8c8-4eca-87a9-6507c72eabb0} <Information> TCPHandler: Processed in 3.015 sec.
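Based on this observation, a possible workaround is to consume at least one message from the table before dropping it. A minimal sketch, assuming the topic behind the table has at least one message available for the consumer group:
SELECT * FROM kafka_test2 LIMIT 1;  -- forces the consumer to poll and commit an offset (as in the log above)
DROP TABLE kafka_test2;             -- in my tests the drop completes once the consumer has been used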
Probably caused by https://github.com/edenhill/librdkafka/issues/2077 - will try to implement a workaround.
The situation has improved as of v19.9+.
Same issue here.
0. cppkafka::HandleException::HandleException(cppkafka::Error) @ 0x1016825a in /usr/bin/clickhouse
1. ? @ 0x1016ff12 in /usr/bin/clickhouse
2. cppkafka::Consumer::get_offsets_committed(std::__1::vector<cppkafka::TopicPartition, std::__1::allocator<cppkafka::TopicPartition> > const&) const @ 0x1017499d in /usr/bin/clickhouse
3. DB::ReadBufferFromKafkaConsumer::commit() @ 0xd6b5147 in /usr/bin/clickhouse
4. DB::KafkaBlockInputStream::readSuffixImpl() @ 0xd6abf2e in /usr/bin/clickhouse
5. DB::IBlockInputStream::readSuffix() @ 0xccd3b05 in /usr/bin/clickhouse
6. DB::SourceFromInputStream::generate() @ 0xdd22441 in /usr/bin/clickhouse
7. DB::ISource::work() @ 0xd9dc847 in /usr/bin/clickhouse
8. DB::SourceFromInputStream::work() @ 0xdd22145 in /usr/bin/clickhouse
9. ? @ 0xda143ad in /usr/bin/clickhouse
10. DB::PipelineExecutor::executeSingleThread(unsigned long, unsigned long) @ 0xda1851d in /usr/bin/clickhouse
11. ? @ 0xda18fa2 in /usr/bin/clickhouse
12. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x8f50b4b in /usr/bin/clickhouse
13. ? @ 0x8f4f00f in /usr/bin/clickhouse
14. start_thread @ 0x76db in /lib/x86_64-linux-gnu/libpthread-2.27.so
15. __clone @ 0x12188f in /lib/x86_64-linux-gnu/libc-2.27.so
ClickHouse server version 20.3.4.10 (official build).
It hangs on DROP TABLE and DROP DATABASE.
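Not specific to a fix, but a quick way to confirm that the DROP is stuck server-side is to look at system.processes from a second clickhouse-client session. A minimal sketch, assuming another client can still connect while the drop is hanging:
-- Run from a second clickhouse-client session; elapsed is in seconds.
SELECT query_id, elapsed, query
FROM system.processes
WHERE query LIKE 'DROP%';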
Same issue on ClickHouse server version 20.4.2.9 (official build).