Queries across the whole cluster (ON CLUSTER) don't work when a host has replicas of multiple shards:
DROP TABLE test ON CLUSTER 'cluster1'
click1 9000 371 An error occured before execution: Code: 371, e.displayText() = DB::Exception: There are two exactly the same ClickHouse instances click1:9000 in cluster cluster1, e.what() = DB::Exception 32 0
Indeed, the host has three replicas of different shards:
SELECT *
FROM system.clusters
WHERE host_name = 'click1'
┌─cluster──┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┐
│ cluster1 │        28 │            1 │           1 │ click1    │ 10.0.10.100  │ 9000 │        0 │ default │ r0               │
│ cluster1 │        29 │            1 │           2 │ click1    │ 10.0.10.100  │ 9000 │        0 │ default │ r1               │
│ cluster1 │        30 │            1 │           3 │ click1    │ 10.0.10.100  │ 9000 │        0 │ default │ r2               │
└──────────┴───────────┴──────────────┴─────────────┴───────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┘
There's really no reason for this check to exist; if you want to make sure it's not a configuration error, perhaps the uniqueness check should use host_name:port/default_database instead?
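For illustration, such a uniqueness check could be expressed as a query against system.clusters (this is just a sketch of the idea, not the actual server-side check):

-- hypothetical duplicate detection; only a real configuration error would return rows
SELECT host_name, port, default_database, count() AS n
FROM system.clusters
WHERE cluster = 'cluster1'
GROUP BY host_name, port, default_database
HAVING n > 1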
cc @ludv1x
Hi!
Indeed, for the simplicity of the first distributed DDL version, I intentionally skipped the implementation of the case when a server has several shards in different databases. I thought it was not a widespread case.
It is not complicated to add support for this case.
Ah I see, supporting this use case would be wonderful!
So my understanding is that only ALTER queries for replicated tables are problematic, since some require you to execute the query only on the leader replica, and some require you to execute it on any single replica.
For CREATE | ATTACH | DROP | DETACH, the query should be executed once on each distinct server instance, regardless of how many shards it hosts. For example, in my case I have 3 shards in databases r{0,1,2}.
If I write:
DROP TABLE r0.test ON CLUSTER 'cluster1'
I expect r0.test to be dropped on all hosts, and I don't expect other replicas or shards to be affected.
If the database isn't specified, I'd expect it to be the current database, instead of expanding to default_database on each host.
If this is your planned behavior, then I'd be happy to submit a PR. Or did you plan to execute queries without an explicitly stated database against each host_name:port/default_database instead (just like a distributed view does)?
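In other words, the DDL should fan out once to each distinct server instance. Just to illustrate the intent (this is not how the DDL worker actually enumerates hosts), the target set for the DROP above would be something like:

-- illustrative only: the distinct server instances the DDL should reach
SELECT DISTINCT host_name, port
FROM system.clusters
WHERE cluster = 'cluster1'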
We also have multiple shards and host a replica of each shard on every node, which basically makes ON CLUSTER requests unusable...
Some related discussion from the Telegram channel:
Alessandro Rizzo:
Ok. During ReplicatedMergeTree table creation, I need to set the ZK path and replica name.
But with 3 sharded nodes and 2 replicas for each one, I will have a structure like this:
node1: shard1 replica1, shard3 replica2
node2: shard2 replica1, shard1 replica2
node3: shard3 replica1, shard2 replica2
Does the replica name in the macros "break" with that?
Because I configured the macros on the nodes like:
shard 01, replica 01
shard 02, replica 03
shard 03, replica 03
Does it make sense?
Ivan Blinkov:
There was some workaround to do it like this, but unfortunately I can't reproduce it off the top of my head (probably someone else in the chat can).
It's easier to just use 6 nodes to avoid this colocation.
Alex Zatelepin:
That's called "cross replication". We have a test with sample configuration for this case: https://github.com/yandex/ClickHouse/blob/master/dbms/tests/integration/test_cross_replication/test.py
Alexander Zaitsev:
You may look at this example as well: https://www.altinity.com/blog/2018/5/10/circular-replication-cluster-topology-in-clickhouse
Mikhail Filimonov:
Currently it's a bit too tricky; maybe it would be better to make it a bit more straightforward before making that recommendation 'official'? For example, it would be much clearer if some macro values could be extracted from the database name and hostname, i.e. database = test_myshard1 and macros like {shard_name} = {DATABASE:5:8} and {replica} = {HOSTNAME}; afterwards the macros would be used in the table definition, and ON CLUSTER should work. It should also be quite easy to implement, and afterwards it would be easy to configure ClickHouse for usage like this.
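To make the macro discussion above concrete, a table in such a cross-replicated layout is typically defined roughly as follows (database, table, and column names are only illustrative). The difficulty is that macros are configured per server, so a single {shard} value cannot cover the r0/r1/r2 databases living on the same host, which is why deriving macro values from the database name is suggested:

-- illustrative sketch; {shard} and {replica} come from per-server macros
CREATE TABLE r0.test
(
    d Date,
    x UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test', '{replica}')
PARTITION BY toYYYYMM(d)
ORDER BY x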
Can we expect this feature in the near future?
We are also facing this issue. Does anyone know of any plans for a fix?
Same issue. We have to truncate some tables periodically, and it would be much more convenient if our apps didn't have to know anything about cluster shards and their tables, only the cluster name.
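For example, something along these lines is what we'd like to be able to run as-is, with the server fanning it out correctly even on a cross-replicated cluster (reusing the database and table names from the earlier example):

-- desired usage; names reused from the example above
TRUNCATE TABLE r0.test ON CLUSTER 'cluster1'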
I have finished a patch to support distributed DDL (currently, only delete/update/drop partition operations have been tested) on cross-replication clusters: distributed-alter-query. For distributed DDL on cross-replication clusters, the syntax should be similar to the following "drop" statement:
"alter table db1.tab1_local on cluster cluster1 drop partition '2018-01-01';"
On other cluster layouts, queries remain the same as before.
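The other tested operations would presumably follow the same pattern, for example (these exact statements are illustrative, with made-up column names, and are not taken from the patch):

-- illustrative; column names d and x are hypothetical
ALTER TABLE db1.tab1_local ON CLUSTER cluster1 DELETE WHERE d = '2018-01-01';
ALTER TABLE db1.tab1_local ON CLUSTER cluster1 UPDATE x = 0 WHERE d = '2018-01-01';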
Fixed in #11508 & #11703