I've read the documentation about layers.
doc: https://clickhouse.yandex/docs/en/operations/table_engines/distributed/
A large number of servers is used (hundreds or more) with a large number of small queries (queries of individual clients - websites, advertisers, or partners). In order for the small queries to not affect the entire cluster, it makes sense to locate data for a single client on a single shard. Alternatively, as we've done in Yandex.Metrica, you can set up bi-level sharding: divide the entire cluster into "layers", where a layer may consist of multiple shards. Data for a single client is located on a single layer, but shards can be added to a layer as necessary, and data is randomly distributed within them. Distributed tables are created for each layer, and a single shared distributed table is created for global queries.
If I want to configure multiple layers, then Do I fix
How to configure multiple layers for a single cluster?
What are the arguments for the Distributed table engines for the specific layer?
So you have 400 servers in your cluster; divide them into 20 sub-clusters of 20 servers each.
Servers 1..20 will be members of sub_cluster_l1 and, at the same time, members of the super-cluster (with all 400 servers).
Put the data of clients with names A.. into sub_cluster_l1 and the data of clients with names Z.. into sub_cluster_l20.
Create 21 Distributed tables: one for each of the 20 sub-clusters and one for the super-cluster.
If you need to select data for client A.., point your query to the Distributed table over sub_cluster_l1.
If you need to select data across all clients, point your query to the Distributed table over the super-cluster.
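The layout described above can be sketched as a small Python script that generates the `remote_servers` section of the server config. This is only an illustration under stated assumptions: the host names (`ch-001` .. `ch-400`), the cluster name `super_cluster`, port 9000, and the choice of one replica per shard are all hypothetical and should be adapted to the real topology.

```python
import xml.etree.ElementTree as ET

NUM_SERVERS = 400
NUM_LAYERS = 20
SERVERS_PER_LAYER = NUM_SERVERS // NUM_LAYERS  # 20 servers per sub-cluster


def host_name(index):
    # Hypothetical naming scheme: ch-001 .. ch-400.
    return f"ch-{index:03d}"


def add_cluster(parent, name, hosts):
    # One shard per host, one replica per shard (an assumption of this sketch).
    cluster = ET.SubElement(parent, name)
    for h in hosts:
        shard = ET.SubElement(cluster, "shard")
        replica = ET.SubElement(shard, "replica")
        ET.SubElement(replica, "host").text = h
        ET.SubElement(replica, "port").text = "9000"


def build_remote_servers():
    root = ET.Element("remote_servers")
    all_hosts = [host_name(i) for i in range(1, NUM_SERVERS + 1)]
    # The super-cluster contains every server.
    add_cluster(root, "super_cluster", all_hosts)
    # Each sub-cluster (layer) contains a contiguous slice of 20 servers,
    # so every server belongs to exactly one layer and to the super-cluster.
    for layer in range(1, NUM_LAYERS + 1):
        start = (layer - 1) * SERVERS_PER_LAYER
        hosts = all_hosts[start:start + SERVERS_PER_LAYER]
        add_cluster(root, f"sub_cluster_l{layer}", hosts)
    return root


if __name__ == "__main__":
    root = build_remote_servers()
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    print(ET.tostring(root, encoding="unicode"))
```

The printed element is meant to be placed inside the server configuration on every node, so that each server knows both its own sub-cluster and the super-cluster.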
What are the arguments for the Distributed table engines for the specific layer?
Just create several Distributed tables, each with a different cluster specified.
A server (ch-server) can be a member of several clusters.
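To make the engine arguments concrete, here is a rough Python sketch that generates the 21 `CREATE TABLE` statements. The Distributed engine takes the cluster name, the database, the local table, and optionally a sharding key. Everything else is an assumption for illustration: the database `default`, the local table `hits_local`, the table names `hits_l1` .. `hits_l20` and `hits_all`, and the cluster name `super_cluster`. The `layer_table_for` helper uses a CRC32 hash of the client name instead of the alphabetical ranges mentioned in the answer, purely to keep the example short.

```python
import zlib

NUM_LAYERS = 20
DATABASE = "default"        # hypothetical database name
LOCAL_TABLE = "hits_local"  # hypothetical local table present on every server


def distributed_ddl(table, cluster, sharding_key=None):
    # Distributed(cluster, database, local_table[, sharding_key])
    args = [cluster, DATABASE, LOCAL_TABLE]
    if sharding_key is not None:
        args.append(sharding_key)
    return (
        f"CREATE TABLE {DATABASE}.{table} AS {DATABASE}.{LOCAL_TABLE} "
        f"ENGINE = Distributed({', '.join(args)})"
    )


def all_ddl():
    # One Distributed table per layer; rand() spreads inserts randomly
    # across the shards of that layer.
    stmts = [
        distributed_ddl(f"hits_l{n}", f"sub_cluster_l{n}", "rand()")
        for n in range(1, NUM_LAYERS + 1)
    ]
    # The global table has no sharding key here: in this sketch it is used
    # for reads only, so that inserts cannot scatter a client across layers.
    stmts.append(distributed_ddl("hits_all", "super_cluster"))
    return stmts


def layer_table_for(client):
    # Stand-in routing rule (an assumption): stable hash of the client name.
    layer = zlib.crc32(client.encode("utf-8")) % NUM_LAYERS + 1
    return f"hits_l{layer}"


if __name__ == "__main__":
    for stmt in all_ddl():
        print(stmt + ";")
```

Writes and per-client reads would go to the table returned by `layer_table_for`, while queries over all clients would go to `hits_all`.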