Clickhouse: How to recreate the Clickhouse metadata in Zookeeper if all Zookeeper data is lost?

Created on 29 Nov 2019  路  26Comments  路  Source: ClickHouse/ClickHouse

Problem

Clickhouse replica nodes data are still in the disk but all Zookeeper data in disk is gone (accidentally).

This has caused to prevent writing to the replicated tables. Reading from the replicated tables have no problem.

Question

Is there still a way to recover this? I tried creating the /var/lib/clickhouse/flags/force_restore_data and then restarted the Clickhouse (container), and the zookeeper metadata is still not created.

question

Most helpful comment

Yeah, you can rename them later.
And you can backup existing data using freeze https://clickhouse.yandex/docs/en/query_language/alter/#alter_freeze-partition

And use parts from shadow folder.

Steps:

  1. create a new replicated table TEMP
  2. ALTER TABLE table_name FREEZE
  3. mv /var/lib/clickhouse/shadow/...table_name.../* /var/lib/clickhouse/db/TEMP/detached
  4. attach all parts / partitions
cd /var/lib/clickhouse/db/TEMP/detached
let i=1;for f in `ls -1` ; do echo $i $f;((i++)); echo "alter table TEMP attach part '$f';"|clickhouse-client ; done
  1. rename table_name to table_name_old
  2. rename TEMP to table_name
  3. rename table_name to table_name_old at replica
  4. create a new replicated table table_name at replica (it will download the data)

All 26 comments

https://clickhouse.yandex/docs/en/operations/table_engines/replication/#recovery-when-metadata-in-the-zookeeper-cluster-is-lost-or-damaged

You have to create new replicated tables and attach partitions.
To avoid data download try ATTACH PARTITION ... FROM https://github.com/ClickHouse/ClickHouse/issues/5312#issuecomment-547146300

@den-crane, I have 2 nodes currently, by new do you mean new replicated with a different name?

Yeah, you can rename them later.
And you can backup existing data using freeze https://clickhouse.yandex/docs/en/query_language/alter/#alter_freeze-partition

And use parts from shadow folder.

Steps:

  1. create a new replicated table TEMP
  2. ALTER TABLE table_name FREEZE
  3. mv /var/lib/clickhouse/shadow/...table_name.../* /var/lib/clickhouse/db/TEMP/detached
  4. attach all parts / partitions
cd /var/lib/clickhouse/db/TEMP/detached
let i=1;for f in `ls -1` ; do echo $i $f;((i++)); echo "alter table TEMP attach part '$f';"|clickhouse-client ; done
  1. rename table_name to table_name_old
  2. rename TEMP to table_name
  3. rename table_name to table_name_old at replica
  4. create a new replicated table table_name at replica (it will download the data)

@den-crane thanks for the info. I will try it out.

@den-crane I only see this increment.txt under the shadow folder.

root@clickhouse-01-0:/# cat /var/lib/clickhouse/shadow/increment.txt
1

Can you elaborate step 3 further, please? Thanks.

What a response you got with ALTER TABLE table_name FREEZE ?

@den-crane

What a response you got with ALTER TABLE table_name FREEZE ?

clickhouse-01-0.clickhouses.clickshield.svc.cluster.local :) ALTER TABLE clicks_sharded FREEZE

ALTER TABLE clicks_sharded
    FREEZE


Ok.

@den-crane nevermind there is a directory called 1. I did not notice it. 馃槅

better to use another ZK path for TEMP.

because in case of drop old replicated RO table it will try to remove data from ZK
or you can just detach table table_name_old and remove /data/../table_name_old /metadata/.../table_name_old.sql

@den-crane I'm about to do the rename and I got this error from ZK.

Code: 999. DB::Exception: Received from localhost:9000. Coordination::Exception. Coordination::Exception: No node, path: /clickhouse/tables/01/default/clicks_sharded/replicas/clickhouse-01-0/host.

I don't have clicks_sharded there, I only have clicks_sharded_tmp which was created on step 1.

[zk: localhost:2181(CONNECTED) 7] ls /clickhouse/tables/01/default
[clicks_sharded_tmp]

detach table table_name
mv /metadata/../ table_name.sql /var/tmp/
mv /data/../table_name /var/tmp/

The detach does not work. I think it was because of the previous command did not work properly.

clickhouse-01-0.clickhouses.clickshield.svc.cluster.local :) detach table clicks_sharded;

DETACH TABLE clicks_sharded

Received exception from server (version 19.14.3):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table default.clicks_sharded_old doesn't exist..

0 rows in set. Elapsed: 0.002 sec.

In /var/lib/clickhouse/data/default/ directory, I only see clicks_sharded_old. clicks_sharded does not exists.

What if I recreate step 1, and use the original clicks_sharded in the ZK path instead of clicks_sharded_tmp?

just stop CH and save folder with old table in the same filesystem

mkdir /var/tmp/
mv /data/../ clicks_sharded /var/tmp/ or mv /data/../clicks_sharded_old /var/tmp/

save metadata mv /metadata/../clicks_sharded_old.sql /var/tmp/

@den-crane restart is required? I did that.

Restart for what? What is the current state?

Do you have data in the new table? Do you see the old table in CH ?

@den-crane you mentioned just stop CH. I have data in the temp table (new). I restarted clickhouse and it won't start, it's getting an error DB::Exception: Cannot create table from metadata file /var/lib/clickhouse/metadata/default//clicks_sharded_tmp.sql, error: DB::Exception: Existing table metadata in ZooKeeper differs in index granularity byte. I had to delete the ZK path for it to be back up again. 馃槃

Current state:

  • I have the backups of clicks_sharded and the metadata sql file in a temporary directory.
  • clicks_sharded and clicks_sharded_old table does not exist now.
  • I have no ZK path in zookeeper because of the above.
  • I have clicks_sharded_tmp table with the data from attach partition from FREEZING.

What CH version do you use?
Please show the end of .sql files from old and new tables with Engine and ... order by partition by

@den-crane

I recreated steps 1 to 6 but in step 1, as I said, I recreated the replicated TEMP table using the ZK path of the ORIGINAL table name. So instead of clicks_sharded_tmp, I used clicks_sharded.

CREATE TABLE clicks_sharded_tmp 
....
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/default/clicks_sharded', '{replica}')
....

Now I'm done with STEP 6.

Now, my STEP7 is I just recreated the distributed table that looks into clicks_sharded and I think everything is fine now. 馃

Yeah, you can rename them later.
And you can backup existing data using freeze https://clickhouse.yandex/docs/en/query_language/alter/#alter_freeze-partition

And use parts from shadow folder.

Steps:

  1. create a new replicated table TEMP
  2. ALTER TABLE table_name FREEZE
  3. mv /var/lib/clickhouse/shadow/...table_name.../* /var/lib/clickhouse/db/TEMP/detached
  4. attach all parts / partitions
cd /var/lib/clickhouse/db/TEMP/detached
let i=1;for f in `ls -1` ; do echo $i $f;((i++)); echo "alter table TEMP attach part '$f';"|clickhouse-client ; done
  1. rename table_name to table_name_old
  2. rename TEMP to table_name
  3. rename table_name to table_name_old at replica
  4. create a new replicated table table_name at replica (it will download the data)

I will give more updates later on. 馃槃

After restarting the node I'm getting this error..

DB::Exception: Cannot create table from metadata file /var/lib/clickhouse/metadata/default//clicks_sharded.sql, error: DB::Exception: Table contains parts with adaptive and non adaptive marks, but `setting enable_mixed_granularity_parts` is disabled, stack trace:

Need to set enable_mixed_granularity_parts=1 in config.xml section

@den-crane like this?

<yandex>
    <merge_tree>
        <enable_mixed_granularity_parts>1</enable_mixed_granularity_parts>
    </merge_tree>
</yandex>

yes. and restart

@den-crane I managed to get the first replica for each of my shards up and running!

When running the second replica of the shard I get this error.

<Error> Application: DB::Exception: Cannot create object 'clicks_sharded' from query ....
....
....
error: Coordination::Exception: No node, path: /clickhouse/tables/01/default/clicks_sharded/replicas/clickhouse-01-1/parts

create a new replicated table table_name at replica (it will download the data)

you need to execute create table at replicas with the same zk path as zk path at the first replicas

@den-crane hmm what I did is the following for the server 2:

Fixing the startup problem.

  • stop the server.
  • move the clicks_sharded table and clicks_sharded metadata to a temporary folder.
  • start-up the server again and now it's running fine again.
  • recreate the tables. zookeeper data paths are now created.
  • also, at this point server 1 has replicated the data to server 2.

Then, I finally merged the temporarily moved data of my server 2.

  • move the clicks_sharded data to the detached /var/lib/clickhouse/data/default/clicks_sharded/detached/ directory.
  • run the alter table clicks_sharded attach par query.

Thanks for all the help @den-crane! I really appreciate it. Closing this issue now! 馃嵒

Was this page helpful?
0 / 5 - 0 ratings

Related issues

goranc picture goranc  路  3Comments

igor-sh8 picture igor-sh8  路  3Comments

innerr picture innerr  路  3Comments

opavader picture opavader  路  3Comments

hatarist picture hatarist  路  3Comments