Clickhouse: Caught exception while loading metadata,Unable to launch clickhouse

Created on 20 Sep 2019  ·  7Comments  ·  Source: ClickHouse/ClickHouse

OS
[root@ck3 ~]# uname -ra
Linux ck3 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@ck3 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)

clickhouse version
ClickHouse server version 19.13.1.11 (official build).

In case the OS is abnormally closed (power off), after 14 hours we tried to launch the clickouse. Unable to launch, The information in the log is “Incorrect user”.

clickhouse-server.err.log

`2019.09.19 22:24:29.471300 [ 1 ] {} Application: Caught exception while loading metadata: Code: 62, e.displayText() = DB::Exception: Incorrect user[:password]@host:port#default_database format detached, Stack trace:

  1. clickhouse-server(StackTrace::StackTrace()+0x30) [0x7f58f30]
  2. clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string, std::allocator > const&, int)+0x25) [0x3c427b5]
  3. clickhouse-server() [0x395b326]
  4. clickhouse-server() [0x700020c]
  5. clickhouse-server(DB::StorageDistributedDirectoryMonitor::createPool(std::__cxx11::basic_string, std::allocator > const&, DB::StorageDistributed const&)+0x37) [0x7000a07]
  6. clickhouse-server(DB::StorageDistributed::requireDirectoryMonitor(std::__cxx11::basic_string, std::allocator > const&)+0xaf) [0x6f60e7f]
  7. clickhouse-server(DB::StorageDistributed::createDirectoryMonitors()+0x183) [0x6f63903]
  8. clickhouse-server(DB::StorageDistributed::startup()+0x11) [0x6f63951]
  9. clickhouse-server() [0x73794e2]
  10. clickhouse-server(ThreadPoolImpl::worker(std::_List_iterator)+0x1a7) [0x3c74537]
  11. clickhouse-server(ThreadFromGlobalPool::ThreadFromGlobalPool::scheduleImpl(std::function)::{lambda()#3}>(ThreadPoolImpl::scheduleImpl(std::function)::{lambda()#3}&&)::{lambda()#1}::operator()() const+0x3e) [0x3c74b1e]
  12. clickhouse-server(ThreadPoolImpl::worker(std::_List_iterator)+0x1a6) [0x3c72046]
  13. clickhouse-server() [0xba2baa0]
  14. /lib64/libpthread.so.0(+0x7dd5) [0x7f1c77810dd5]
  15. /lib64/libc.so.6(clone+0x6d) [0x7f1c7703402d]
    (version 19.13.1.11 (official build))
    2019.09.19 22:24:30.364148 [ 1 ] {} Application: DB::Exception: Incorrect user[:password]@host:port#default_database format detached`

clickhouse-server.log

`2019.09.19 22:24:29.360436 [ 1 ] {} : Starting ClickHouse 19.13.1.11 with revision 54425
2019.09.19 22:24:29.360489 [ 1 ] {} Application: starting up
2019.09.19 22:24:29.362666 [ 1 ] {} Application: rlimit on number of file descriptors is 262144
2019.09.19 22:24:29.362678 [ 1 ] {} Application: Initializing DateLUT.
2019.09.19 22:24:29.362683 [ 1 ] {} Application: Initialized DateLUT with time zone 'America/New_York'.
2019.09.19 22:24:29.362908 [ 1 ] {} Application: Configuration parameter 'interserver_http_host' doesn't exist or exists and empty. Will use 'ck1' as replica host.
2019.09.19 22:24:29.363720 [ 1 ] {} ConfigReloader: Loading config '/etc/clickhouse-server/users.xml'
2019.09.19 22:24:29.364632 [ 1 ] {} Application: Uncompressed cache size was lowered to 495.59 MiB because the system has low amount of memory
2019.09.19 22:24:29.364742 [ 1 ] {} Application: Mark cache size was lowered to 495.59 MiB because the system has low amount of memory
2019.09.19 22:24:29.364759 [ 1 ] {} Application: Loading metadata from /var/lib/clickhouse/
2019.09.19 22:24:29.365639 [ 1 ] {} DatabaseOrdinary (db): Total 6 tables.
2019.09.19 22:24:29.366322 [ 5 ] {} BackgroundProcessingPool: Create BackgroundProcessingPool with 16 threads
2019.09.19 22:24:29.366587 [ 5 ] {} BackgroundSchedulePool: Create BackgroundSchedulePool with 16 threads
2019.09.19 22:24:29.369476 [ 5 ] {} ZooKeeper: initialized, hosts: ck1:2181
2019.09.19 22:24:29.369919 [ 5 ] {} db.chtable_c1: Loading data parts
2019.09.19 22:24:29.370292 [ 5 ] {} db.chtable_c1: Loaded data parts (1 items)
2019.09.19 22:24:29.383715 [ 5 ] {} db.job_end_log: Loading data parts
2019.09.19 22:24:29.383759 [ 5 ] {} db.job_end_log: Loaded data parts (0 items)
2019.09.19 22:24:29.395537 [ 5 ] {} db.job_start_log: Loading data parts
2019.09.19 22:24:29.396001 [ 5 ] {} db.job_start_log: Loaded data parts (2 items)
2019.09.19 22:24:29.409038 [ 1 ] {} DatabaseOrdinary (db): Starting up tables.
2019.09.19 22:24:29.409093 [ 13 ] {} db.chtable_c1 (ReplicatedMergeTreeRestartingThread): Activating replica.
2019.09.19 22:24:29.410616 [ 13 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Loading queue from /clickhouse/tables/01-01/chtable_c1/replicas/ck1/queue
2019.09.19 22:24:29.410826 [ 13 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Having 0 queue entries to load, 0 entries already loaded.
2019.09.19 22:24:29.410915 [ 13 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Loaded queue
2019.09.19 22:24:29.411191 [ 13 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Loading 14 mutation entries: 0000000000 - 0000000013
2019.09.19 22:24:29.412128 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000000 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412149 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000001 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412153 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000002 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412155 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000003 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412158 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000004 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412161 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000005 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412163 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000006 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412166 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000007 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412174 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000008 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412177 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000009 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412180 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000010 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412182 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000011 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412185 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000012 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.412188 [ 14 ] {} db.chtable_c1 (ReplicatedMergeTreeQueue): Marking mutation 0000000013 done because it is <= mutation_pointer (0000000013)
2019.09.19 22:24:29.413900 [ 24 ] {} db.job_end_log (ReplicatedMergeTreeRestartingThread): Activating replica.
2019.09.19 22:24:29.414229 [ 15 ] {} db.chtable_c1: Became leader
2019.09.19 22:24:29.420634 [ 24 ] {} db.job_end_log (ReplicatedMergeTreeQueue): Loading queue from /clickhouse/tables/01-01/job_end_log/replicas/ck1/queue
2019.09.19 22:24:29.421036 [ 24 ] {} db.job_end_log (ReplicatedMergeTreeQueue): Having 0 queue entries to load, 0 entries already loaded.
2019.09.19 22:24:29.421165 [ 24 ] {} db.job_end_log (ReplicatedMergeTreeQueue): Loaded queue
2019.09.19 22:24:29.434266 [ 26 ] {} db.job_end_log: Became leader
2019.09.19 22:24:29.471300 [ 1 ] {} Application: Caught exception while loading metadata: Code: 62, e.displayText() = DB::Exception: Incorrect user[:password]@host:port#default_database format detached, Stack trace:

  1. clickhouse-server(StackTrace::StackTrace()+0x30) [0x7f58f30]
  2. clickhouse-server(DB::Exception::Exception(std::__cxx11::basic_string, std::allocator > const&, int)+0x25) [0x3c427b5]
  3. clickhouse-server() [0x395b326]
  4. clickhouse-server() [0x700020c]
  5. clickhouse-server(DB::StorageDistributedDirectoryMonitor::createPool(std::__cxx11::basic_string, std::allocator > const&, DB::StorageDistributed const&)+0x37) [0x7000a07]
  6. clickhouse-server(DB::StorageDistributed::requireDirectoryMonitor(std::__cxx11::basic_string, std::allocator > const&)+0xaf) [0x6f60e7f]
  7. clickhouse-server(DB::StorageDistributed::createDirectoryMonitors()+0x183) [0x6f63903]
  8. clickhouse-server(DB::StorageDistributed::startup()+0x11) [0x6f63951]
  9. clickhouse-server() [0x73794e2]
  10. clickhouse-server(ThreadPoolImpl::worker(std::_List_iterator)+0x1a7) [0x3c74537]
  11. clickhouse-server(ThreadFromGlobalPool::ThreadFromGlobalPool::scheduleImpl(std::function)::{lambda()#3}>(ThreadPoolImpl::scheduleImpl(std::function)::{lambda()#3}&&)::{lambda()#1}::operator()() const+0x3e) [0x3c74b1e]
  12. clickhouse-server(ThreadPoolImpl::worker(std::_List_iterator)+0x1a6) [0x3c72046]
  13. clickhouse-server() [0xba2baa0]
  14. /lib64/libpthread.so.0(+0x7dd5) [0x7f1c77810dd5]
  15. /lib64/libc.so.6(clone+0x6d) [0x7f1c7703402d]
    (version 19.13.1.11 (official build))
    2019.09.19 22:24:29.471398 [ 1 ] {} Application: Shutting down storages.
    2019.09.19 22:24:29.471470 [ 1 ] {} db.chtable_c1 (ReplicatedMergeTreeRestartingThread): Restarting thread finished
    2019.09.19 22:24:29.473761 [ 1 ] {} db.chtable_c1 (ReplicatedMergeTreeRestartingThread): Waiting for threads to finish
    2019.09.19 22:24:29.473789 [ 1 ] {} db.chtable_c1: Stopped being leader
    2019.09.19 22:24:29.475174 [ 1 ] {} db.chtable_c1 (ReplicatedMergeTreeRestartingThread): Threads finished
    2019.09.19 22:24:29.475206 [ 1 ] {} db.job_end_log (ReplicatedMergeTreeRestartingThread): Restarting thread finished
    2019.09.19 22:24:29.476232 [ 1 ] {} db.job_end_log (ReplicatedMergeTreeRestartingThread): Waiting for threads to finish
    2019.09.19 22:24:29.476325 [ 1 ] {} db.job_end_log: Stopped being leader
    2019.09.19 22:24:29.477363 [ 1 ] {} db.job_end_log (ReplicatedMergeTreeRestartingThread): Threads finished
    2019.09.19 22:24:29.477391 [ 1 ] {} db.job_start_log (ReplicatedMergeTreeRestartingThread): Restarting thread finished
    2019.09.19 22:24:29.477396 [ 1 ] {} db.job_start_log (ReplicatedMergeTreeRestartingThread): Waiting for threads to finish
    2019.09.19 22:24:29.477399 [ 1 ] {} db.job_start_log (ReplicatedMergeTreeRestartingThread): Threads finished
    2019.09.19 22:24:29.477425 [ 1 ] {} db.job_start_log (ReplicatedMergeTreeRestartingThread): Restarting thread finished
    2019.09.19 22:24:29.477428 [ 1 ] {} db.job_start_log (ReplicatedMergeTreeRestartingThread): Waiting for threads to finish
    2019.09.19 22:24:29.477431 [ 1 ] {} db.job_start_log (ReplicatedMergeTreeRestartingThread): Threads finished
    2019.09.19 22:24:29.477480 [ 1 ] {} db.job_end_log (ReplicatedMergeTreeRestartingThread): Restarting thread finished
    2019.09.19 22:24:29.477484 [ 1 ] {} db.job_end_log (ReplicatedMergeTreeRestartingThread): Waiting for threads to finish
    2019.09.19 22:24:29.477487 [ 1 ] {} db.job_end_log (ReplicatedMergeTreeRestartingThread): Threads finished
    2019.09.19 22:24:29.477504 [ 1 ] {} db.chtable_c1 (ReplicatedMergeTreeRestartingThread): Restarting thread finished
    2019.09.19 22:24:29.477507 [ 1 ] {} db.chtable_c1 (ReplicatedMergeTreeRestartingThread): Waiting for threads to finish
    2019.09.19 22:24:29.477510 [ 1 ] {} db.chtable_c1 (ReplicatedMergeTreeRestartingThread): Threads finished
    2019.09.19 22:24:30.361742 [ 1 ] {} BackgroundSchedulePool: Waiting for threads to finish.
    2019.09.19 22:24:30.361781 [ 1 ] {} Application: Shutted down storages.
    2019.09.19 22:24:30.363860 [ 1 ] {} Application: Destroyed global context.
    2019.09.19 22:24:30.364148 [ 1 ] {} Application: DB::Exception: Incorrect user[:password]@host:port#default_database format detached
    2019.09.19 22:24:30.364165 [ 1 ] {} Application: shutting down
    2019.09.19 22:24:30.364169 [ 1 ] {} Application: Uninitializing subsystem: Logging Subsystem
    2019.09.19 22:24:30.364405 [ 41 ] {} BaseDaemon: Stop SignalListener thread
    `
bug st-accepted

Most helpful comment

Yes, in the end I figured out there was a detached and format_version.txt in the distributed data folder, even though for a distributed table these shouldn't be there. Once I removed them everything booted fine, but it took a long time to identify the table at fault, I have to move the metadata, start the server, put the metadata back, and then attach a table one by one to find the offending table. It would be good if the offending table name was printed in the log!

All 7 comments

I'm also seeing this in 19.15.2.2, and am having a hard time working out what the issue is.

The exception happens when CH tries to attach a distributed table, and there is a directory that does not comply with the naming convention for distributed table's directories.

CREATE TABLE system.dt AS system.settings
ENGINE = Distributed(test_shard_localhost, system, settings)

Ok.

DETACH TABLE system.dt

Ok.

mkdir /var/lib/clickhouse/data/system/dt/a_dir

ATTACH TABLE system.dt

Received exception from server (version 19.15.3):
Code: 62. DB::Exception: Received from localhost:9000. DB::Exception: Incorrect user[:password]@host:port#default_database format a_dir.

The problem is that there is no reference to the table, and during a restart it is not possible to understand what exactly is wrong.

Yes, in the end I figured out there was a detached and format_version.txt in the distributed data folder, even though for a distributed table these shouldn't be there. Once I removed them everything booted fine, but it took a long time to identify the table at fault, I have to move the metadata, start the server, put the metadata back, and then attach a table one by one to find the offending table. It would be good if the offending table name was printed in the log!

Thanks for the @SaltTan and @jpiper.
Strongly agree with @jpiper's point of view.

Another test case

:) create table test (a Int8) Engine=MergeTree() order by tuple()

CREATE TABLE test
(
    `a` Int8
)
ENGINE = MergeTree()
ORDER BY tuple()

Ok.

0 rows in set. Elapsed: 0.117 sec.

:) detach table test

DETACH TABLE test

Ok.

0 rows in set. Elapsed: 0.001 sec.

:) create table test AS system.settings ENGINE = Distributed(test_shard_localhost, system, settings)

CREATE TABLE test AS system.settings
ENGINE = Distributed(test_shard_localhost, system, settings)

Received exception from server (version 19.15.3):
Code: 62. DB::Exception: Received from localhost:9000. DB::Exception: Incorrect user[:password]@host:port#default_database format detached.

0 rows in set. Elapsed: 0.127 sec.

:) select count() from test

SELECT count()
FROM test

┌─count()─┐
│     240 │
└─────────┘

1 rows in set. Elapsed: 0.046 sec.

This issue will be solved in fundamental fashion in https://github.com/ClickHouse/ClickHouse/issues/6787
But simpler solution is possible. We can check that directory already exists on CREATE TABLE and throw an exception.

Fixed in #7512

Was this page helpful?
0 / 5 - 0 ratings

Related issues

zhicwu picture zhicwu  ·  3Comments

derekperkins picture derekperkins  ·  3Comments

jimmykuo picture jimmykuo  ·  3Comments

igor-sh8 picture igor-sh8  ·  3Comments

fizerkhan picture fizerkhan  ·  3Comments