ClickHouse: What is the best architecture for a ClickHouse cluster in production?

Created on 21 Sep 2017 · 2 Comments · Source: ClickHouse/ClickHouse

  • To improve query performance, I use a 3-node cluster with nodes A, B, and C. Each node has a local table that uses the MergeTree engine, plus a Distributed engine table on top of it. As planned, queries are very fast because the 3 nodes process them in parallel.

  • But my problem is: how do I make the whole cluster highly available?

  • After checking the docs, I use the architecture below:

  • The config file is as follows:
<remote_servers>
    <logs>
        <shard>
            <weight>1</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <host>A</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>A'</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <weight>2</weight>
            <internal_replication>false</internal_replication>
            <replica>
                <host>B</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>B'</host>
                <port>9000</port>
            </replica>
        </shard>
    </logs>
</remote_servers>
  • I use A', B', and C' as replica nodes, so that every shard has at least one backup copy.

  • But the new problem is: how should failover be handled?

    • If A', B', or C' goes down, how do I rebuild a new replica with the historical data?

All 2 comments

I found that the replicated table engines (such as ReplicatedMergeTree) can solve the data loss problem.

But the failover solution is still not clear.
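For readers following along, here is a minimal sketch of how the server config might change when the local tables move to a replicated engine. It is an illustration under assumptions, not the poster's actual setup: the ZooKeeper host `zk1`, the macro values, and the use of `B` as the second shard's first replica are placeholders, and the `<yandex>` root element reflects configs of that era (newer releases use `<clickhouse>`).

```xml
<!-- Sketch only: host names, the ZooKeeper address, and macro values are placeholders. -->
<yandex>
    <remote_servers>
        <logs>
            <shard>
                <weight>1</weight>
                <!-- true: the Distributed table writes to one replica only,
                     and the replicated table engine copies the data to the other. -->
                <internal_replication>true</internal_replication>
                <replica>
                    <host>A</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>A'</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <weight>2</weight>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>B</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>B'</host>
                    <port>9000</port>
                </replica>
            </shard>
        </logs>
    </remote_servers>

    <!-- Replicated tables coordinate through ZooKeeper. -->
    <zookeeper>
        <node>
            <host>zk1</host>
            <port>2181</port>
        </node>
    </zookeeper>

    <!-- Per-server values; each server gets its own shard/replica pair. -->
    <macros>
        <shard>01</shard>
        <replica>A</replica>
    </macros>
</yandex>
```

With a config along these lines, the local tables would be created with ReplicatedMergeTree using a ZooKeeper path built from the `{shard}` and `{replica}` macros. A replacement server that creates the same table under a new replica name should then fetch the existing data from its peer, and the Distributed table should route reads to whichever replica of a shard is reachable.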

Since ClickHouse uses multi-master replication, the best architecture should be as below:
