ClickHouse: Replicating Replacing Merge - Duplicates/Final

Created on 17 Jan 2018 · 16 comments · Source: ClickHouse/ClickHouse

Hello,

I am seeing some strange behaviour with a ReplicatedReplacingMergeTree. I need to store time-series data and cannot have duplicates. I've tried a simple case of loading the exact same dataset of ~1.5 million rows twice to test how well it rejects duplicates.

Result: SELECT count() FROM table returns ~2.3 million rows, so lots of duplicates. SELECT count() FROM table FINAL returns exactly the right number.
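A minimal sketch of the kind of setup being described, with hypothetical table and column names, shown here as a plain (non-replicated) ReplacingMergeTree for brevity:

-- Hypothetical schema; names are illustrative only.
CREATE TABLE ts_local
(
    ts     DateTime,
    metric String,
    value  Float64
)
ENGINE = ReplacingMergeTree
ORDER BY (metric, ts);

-- Load the same dataset twice (source_data is a placeholder).
INSERT INTO ts_local SELECT * FROM source_data;
INSERT INTO ts_local SELECT * FROM source_data;

SELECT count() FROM ts_local;        -- shows duplicates until parts are merged
SELECT count() FROM ts_local FINAL;  -- collapses duplicates at query time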

So duplicates are being correctly flagged, but how do I optimize the database so that they no longer exist, without using FINAL? I have tried OPTIMIZE TABLE and it immediately returns OK but with no change to the count; I have also tried OPTIMIZE TABLE ... DEDUPLICATE with the same results.

I understand that OPTIMIZE is meant to be an intensive re-merge that sorts this out. I'm happy with that, as it could be run occasionally, since duplicates won't occur in normal service.

How do I force optimize?

Thanks for such an impressive system!

question

Most helpful comment

After a successful run of

set optimize_throw_if_noop = 1;
OPTIMIZE TABLE tbl FINAL;

executed after replication has completed, I would expect it to be guaranteed that no duplicates remain.

All 16 comments

Further testing: using ReplacingMergeTree on a single non-replicated machine, I found that OPTIMIZE TABLE works, but only if run a number of times in succession. Rapidly calling it ~10 times from Tabix will see it succeed.

The logs don't suggest any reason for this. Each time I run it I see it do some tidying, so I'm not sure what is going on. Running it once and leaving it alone doesn't seem to work; 2 or 3 times will see the count(*) reducing but still not fully deduplicated; 10+ times seems to deduplicate this dataset completely.
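One possible explanation, sketched against the hypothetical table above: a plain OPTIMIZE initiates a single merge of some of the existing parts, so several runs may be needed before all parts holding duplicate keys have been merged together, whereas OPTIMIZE ... FINAL merges everything in each partition into one part in a single, heavier pass:

OPTIMIZE TABLE ts_local;        -- merges some parts; may need repeating
OPTIMIZE TABLE ts_local FINAL;  -- merges all parts of each partition into one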

Further: trying again on a ReplicatedReplacingMergeTree across two servers. OPTIMIZE returns OK on one machine but doesn't do anything, and on the second machine it complains that the user password is not correct (both use the same users.xml, all DDL ON CLUSTER statements work fine, and the data has replicated, so I'm not sure why).

Either way, I can't optimize a ReplicatedReplacingMergeTree, and I can't get rid of the duplicates.

Temporary solution:
SELECT
    partition,
    partCount
FROM
(
    SELECT
        partition,
        count(*) AS partCount
    FROM system.parts
    WHERE active AND (table = 'TABLE')
    GROUP BY partition
)
WHERE partCount > 1

And run OPTIMIZE TABLE <table> PARTITION <partition> FINAL for each of those partitions. This runs rapidly and seems to force a consolidation of each in turn. It runs in under 2 seconds on 3 million duplicates, and it works on a ReplicatedReplacingMergeTree on two servers.
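For example, using the hypothetical table from above (the partition value is illustrative and depends on the table's PARTITION BY expression):

-- Use the partition values returned by the system.parts query above.
OPTIMIZE TABLE ts_local PARTITION 201801 FINAL;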

ReplacingMergeTree is mostly intended to save disk storage in the long term; you still need to write queries assuming there will be duplicates in the raw data.
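For instance, a query-time deduplication over the hypothetical table above could look like this (exact duplicates collapse under GROUP BY; FINAL is the simpler but heavier alternative):

-- Collapse exact duplicates at query time.
SELECT metric, ts, any(value) AS value
FROM ts_local
GROUP BY metric, ts;

-- Or let ClickHouse apply the ReplacingMergeTree logic at read time.
SELECT count() FROM ts_local FINAL;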

@ben-motiv
Have you tried OPTIMIZE TABLE ... FINAL?

https://github.com/yandex/ClickHouse/issues/2086

Further: trying again on a ReplicatedReplacingMergeTree across two servers. OPTIMIZE returns OK on one machine but doesn't do anything, and on the second machine it complains that the user password is not correct (both use the same users.xml, all DDL ON CLUSTER statements work fine, and the data has replicated, so I'm not sure why).

Either way, I can't optimize a ReplicatedReplacingMergeTree, and I can't get rid of the duplicates.

Does this PR fix the OPTIMIZE error on the second box? https://github.com/yandex/ClickHouse/pull/3742

@savelov

3742 will work for ReplicatedMergeTree, but I'm not sure it will also work for ReplicatedReplacingMergeTree.

I found the same issue here, though in my case I am finding duplicates in a ReplicatedReplacingMergeTree even when running OPTIMIZE TABLE tbl FINAL.
Optimization returns OK but no changes are made to the duplicates; the count is still the same. Is this because it is a Replicated* table?
Connected to ClickHouse server version 19.13.3 revision 54425.

@xmariachi try

in clickhouse-client:

set optimize_throw_if_noop = 1;
OPTIMIZE TABLE tbl FINAL;

In this case you'll get a message with the reason why the merge could not start.
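It can also help to look at how many active parts each partition still has, for example (the table name is a placeholder):

SELECT partition, name, rows
FROM system.parts
WHERE table = 'tbl' AND active
ORDER BY partition, name;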

Thanks.
I've found this issue in the logs - could it be the cause? I replied at this permalink, where you can see the problem:
https://github.com/yandex/ClickHouse/issues/6472#issuecomment-527967770

I upgraded the servers to match the same version. I no longer see those errors.
However, the behaviour is still the same: the data doesn't get compacted.
The query finishes fast:

OPTIMIZE TABLE tbl FINAL
Ok.

0 rows in set. Elapsed: 2.502 sec. 

However, the table does not compact the data, or at least not the particular set of data where the duplicates are. With the optimize_throw_if_noop flag set, nothing gets printed in the logs.

Table is a ReplicatedReplacingMergeTree.

@xmariachi are you sure that the primary / ORDER BY keys are set correctly and that you are checking uniqueness against them?

Hi @NanoBjorn, yes, I think so. It works fine with duplicates on smaller table sizes.
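For context, ReplacingMergeTree collapses rows only when their entire ORDER BY key matches, so a sorting key that is wider than the intended uniqueness key will leave apparent duplicates behind. A hypothetical illustration:

CREATE TABLE events
(
    id      UInt64,
    ts      DateTime,
    payload String
)
ENGINE = ReplacingMergeTree
ORDER BY (id, ts);
-- Two rows with the same id but different ts are NOT duplicates here;
-- deduplicating by id alone would require ORDER BY id.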

@ben-motiv, do you have any further questions?

We also have this problem on about 200 million rows: OPTIMIZE in any shape or form does nothing and old rows are still duplicated. SELECT ... FINAL does not help either.
ClickHouse client version 19.17.2.4 (official build).

@blinkov
Am I right to assume that, in the end, there is no strong guarantee about ReplicatedReplacingMergeTree's deduplication properties and that we should always write queries in a way that expects duplicated samples? Because the current documentation does not give that impression.

After a successful run of

set optimize_throw_if_noop = 1;
OPTIMIZE TABLE tbl FINAL;

executed after replication has completed, I would expect it to be guaranteed that no duplicates remain.
