Problem
Clickhouse replica nodes data are still in the disk but all Zookeeper data in disk is gone (accidentally).
This has caused to prevent writing to the replicated tables. Reading from the replicated tables have no problem.
Question
Is there still a way to recover this? I tried creating the /var/lib/clickhouse/flags/force_restore_data and then restarted the Clickhouse (container), and the zookeeper metadata is still not created.
https://clickhouse.yandex/docs/en/operations/table_engines/replication/#recovery-when-metadata-in-the-zookeeper-cluster-is-lost-or-damaged
You have to create new replicated tables and attach partitions.
To avoid data download try ATTACH PARTITION ... FROM https://github.com/ClickHouse/ClickHouse/issues/5312#issuecomment-547146300
@den-crane, I have 2 nodes currently, by new do you mean new replicated with a different name?
Yeah, you can rename them later.
And you can backup existing data using freeze https://clickhouse.yandex/docs/en/query_language/alter/#alter_freeze-partition
And use parts from shadow folder.
Steps:
cd /var/lib/clickhouse/db/TEMP/detached
let i=1;for f in `ls -1` ; do echo $i $f;((i++)); echo "alter table TEMP attach part '$f';"|clickhouse-client ; done
@den-crane thanks for the info. I will try it out.
@den-crane I only see this increment.txt under the shadow folder.
root@clickhouse-01-0:/# cat /var/lib/clickhouse/shadow/increment.txt
1
Can you elaborate step 3 further, please? Thanks.
What a response you got with ALTER TABLE table_name FREEZE ?
@den-crane
What a response you got with ALTER TABLE table_name FREEZE ?
clickhouse-01-0.clickhouses.clickshield.svc.cluster.local :) ALTER TABLE clicks_sharded FREEZE
ALTER TABLE clicks_sharded
FREEZE
Ok.
@den-crane nevermind there is a directory called 1. I did not notice it. 馃槅
better to use another ZK path for TEMP.
because in case of drop old replicated RO table it will try to remove data from ZK
or you can just detach table table_name_old and remove /data/../table_name_old /metadata/.../table_name_old.sql
@den-crane I'm about to do the rename and I got this error from ZK.
Code: 999. DB::Exception: Received from localhost:9000. Coordination::Exception. Coordination::Exception: No node, path: /clickhouse/tables/01/default/clicks_sharded/replicas/clickhouse-01-0/host.
I don't have clicks_sharded there, I only have clicks_sharded_tmp which was created on step 1.
[zk: localhost:2181(CONNECTED) 7] ls /clickhouse/tables/01/default
[clicks_sharded_tmp]
detach table table_name
mv /metadata/../ table_name.sql /var/tmp/
mv /data/../table_name /var/tmp/
The detach does not work. I think it was because of the previous command did not work properly.
clickhouse-01-0.clickhouses.clickshield.svc.cluster.local :) detach table clicks_sharded;
DETACH TABLE clicks_sharded
Received exception from server (version 19.14.3):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table default.clicks_sharded_old doesn't exist..
0 rows in set. Elapsed: 0.002 sec.
In /var/lib/clickhouse/data/default/ directory, I only see clicks_sharded_old. clicks_sharded does not exists.
What if I recreate step 1, and use the original clicks_sharded in the ZK path instead of clicks_sharded_tmp?
just stop CH and save folder with old table in the same filesystem
mkdir /var/tmp/
mv /data/../ clicks_sharded /var/tmp/ or mv /data/../clicks_sharded_old /var/tmp/
save metadata mv /metadata/../clicks_sharded_old.sql /var/tmp/
@den-crane restart is required? I did that.
Restart for what? What is the current state?
Do you have data in the new table? Do you see the old table in CH ?
@den-crane you mentioned just stop CH. I have data in the temp table (new). I restarted clickhouse and it won't start, it's getting an error DB::Exception: Cannot create table from metadata file /var/lib/clickhouse/metadata/default//clicks_sharded_tmp.sql, error: DB::Exception: Existing table metadata in ZooKeeper differs in index granularity byte. I had to delete the ZK path for it to be back up again. 馃槃
Current state:
What CH version do you use?
Please show the end of .sql files from old and new tables with Engine and ... order by partition by
@den-crane
I recreated steps 1 to 6 but in step 1, as I said, I recreated the replicated TEMP table using the ZK path of the ORIGINAL table name. So instead of clicks_sharded_tmp, I used clicks_sharded.
CREATE TABLE clicks_sharded_tmp
....
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/default/clicks_sharded', '{replica}')
....
Now I'm done with STEP 6.
Now, my STEP7 is I just recreated the distributed table that looks into clicks_sharded and I think everything is fine now. 馃
Yeah, you can rename them later.
And you can backup existing data using freeze https://clickhouse.yandex/docs/en/query_language/alter/#alter_freeze-partitionAnd use parts from shadow folder.
Steps:
- create a new replicated table TEMP
- ALTER TABLE table_name FREEZE
- mv /var/lib/clickhouse/shadow/...table_name.../* /var/lib/clickhouse/db/TEMP/detached
- attach all parts / partitions
cd /var/lib/clickhouse/db/TEMP/detached let i=1;for f in `ls -1` ; do echo $i $f;((i++)); echo "alter table TEMP attach part '$f';"|clickhouse-client ; done
- rename table_name to table_name_old
- rename TEMP to table_name
- rename table_name to table_name_old at replica
- create a new replicated table table_name at replica (it will download the data)
I will give more updates later on. 馃槃
After restarting the node I'm getting this error..
DB::Exception: Cannot create table from metadata file /var/lib/clickhouse/metadata/default//clicks_sharded.sql, error: DB::Exception: Table contains parts with adaptive and non adaptive marks, but `setting enable_mixed_granularity_parts` is disabled, stack trace:
Need to set enable_mixed_granularity_parts=1 in config.xml
@den-crane like this?
<yandex>
<merge_tree>
<enable_mixed_granularity_parts>1</enable_mixed_granularity_parts>
</merge_tree>
</yandex>
yes. and restart
@den-crane I managed to get the first replica for each of my shards up and running!
When running the second replica of the shard I get this error.
<Error> Application: DB::Exception: Cannot create object 'clicks_sharded' from query ....
....
....
error: Coordination::Exception: No node, path: /clickhouse/tables/01/default/clicks_sharded/replicas/clickhouse-01-1/parts
create a new replicated table table_name at replica (it will download the data)
you need to execute create table at replicas with the same zk path as zk path at the first replicas
@den-crane hmm what I did is the following for the server 2:
Fixing the startup problem.
Then, I finally merged the temporarily moved data of my server 2.
/var/lib/clickhouse/data/default/clicks_sharded/detached/ directory.alter table clicks_sharded attach par query.Thanks for all the help @den-crane! I really appreciate it. Closing this issue now! 馃嵒
Most helpful comment
Yeah, you can rename them later.
And you can backup existing data using freeze https://clickhouse.yandex/docs/en/query_language/alter/#alter_freeze-partition
And use parts from shadow folder.
Steps: