ClickHouse: Inserting the same data into a ReplicatedMergeTree table twice returns only one record on SELECT

Created on 28 Jun 2020 · 1 Comment · Source: ClickHouse/ClickHouse

Describe the bug

Inserting the same row into a ReplicatedMergeTree table twice leaves only one record in the table; a subsequent SELECT returns a single row instead of two.

How to reproduce

  • Which ClickHouse server version to use: 20.3.7.46
  • Create table

    CREATE TABLE olap.replicated_test
    (
        date Date,
        a String,
        b String
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/replicated_test', '{replica}')
    PARTITION BY toYYYYMM(date)
    ORDER BY (date, a, b)
    SETTINGS index_granularity = 8192;
    
  • Insert data and select

    insert into olap.replicated_test values('2020-06-01', 'a', 'b');
    select * from olap.replicated_test;
    
    ┌───────date─┬─a─┬─b─┐
    │ 2020-06-01 │ a │ b │
    └────────────┴───┴───┘
    
  • Insert the same data again and select

    insert into olap.replicated_test values('2020-06-01', 'a', 'b');
    select * from olap.replicated_test;
    
    ┌───────date─┬─a─┬─b─┐
    │ 2020-06-01 │ a │ b │
    └────────────┴───┴───┘
    
question question-answered

Most helpful comment

That's how insert deduplication works.

https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/replication/

_Data blocks are deduplicated. For multiple writes of the same data block (data blocks of the same size containing the same rows in the same order), the block is only written once. The reason for this is in case of network failures when the client application doesn't know if the data was written to the DB, so the INSERT query can simply be repeated. It doesn't matter which replica INSERTs were sent to with identical data. INSERTs are idempotent. Deduplication parameters are controlled by merge_tree server settings._

https://clickhouse.tech/docs/en/operations/settings/settings/#settings-insert-deduplicate
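
If the duplicate row is actually wanted, the `insert_deduplicate` setting linked above can be switched off for the session before repeating the INSERT. A minimal sketch, assuming it is run in a single clickhouse-client session against the `olap.replicated_test` table from the reproduction steps (not verified on 20.3.7.46 specifically):

```sql
-- Assumption: same session, table olap.replicated_test already exists
-- and already contains the row inserted in the steps above.
SET insert_deduplicate = 0;  -- disable block deduplication for this session

-- Repeat the identical INSERT; with deduplication off, the block is no
-- longer skipped as a repeat of an earlier block.
INSERT INTO olap.replicated_test VALUES ('2020-06-01', 'a', 'b');

-- The count should now include the repeated row.
SELECT count() FROM olap.replicated_test;

SET insert_deduplicate = 1;  -- restore the default
```

Note that turning deduplication off also gives up the retry safety described in the quoted documentation: a client that resends an INSERT after a network failure may then write the same rows twice.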

