Clickhouse: Materialized View misses some data

Created on 10 Jan 2019  Β·  12Comments  Β·  Source: ClickHouse/ClickHouse

Hello. First of all thx for a great product. Very fast and flexible.

Recently we started using MaterializedView and noticed one issue (or not an issue but we did something wrong)

We created MVs based on local Metrics tables (they use ReplicatedMergeTree() engine and we constantly writes to those local tables (each second)) like:

CREATE MATERIALIZED VIEW default.MetricStats ON CLUSTER metrics 
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/metrics/{shard}/MetricStats', '{replica}')
PARTITION BY floor(time / 86400)
ORDER BY (time, appId) POPULATE AS
SELECT 
    uniqExactState(host) AS `os.host`, 
    uniqExactState(container) AS `container.id`, 
    uniqExactState(containerHostname) AS `container.host.hostname`, 
    appId, 
    floor(timestamp / 60) * 60 AS time
FROM default.Metrics 
WHERE timestamp > now()-300
GROUP BY 
    time, 
    appId

Then we created distributed table based on MVs

CREATE TABLE IF NOT EXISTS  default.MetricStatsDistributed ON CLUSTER metrics AS default.MetricStats
ENGINE = Distributed(metrics, default, MetricStats, rand());

After all we select data from Distributed table (which based on MVs)

SELECT 
    uniqExactMerge(`os.host`) AS `os.host`, 
    uniqExactMerge(`container.id`) AS `container.id`, 
    uniqExactMerge(`container.host.hostname`) AS `container.host.hostname`, 
    appId, 
    time
FROM MetricStatsDistributed 
WHERE appId = XXX
GROUP BY 
    appId, 
    time
ORDER BY time ASC

And got following

β”Œβ”€os.host─┬─container.id─┬─container.host.hostname─┬─appId─┬──────────time─┐
β”‚      35 β”‚          486 β”‚                       1 β”‚ XXX   β”‚ 1546932480000 β”‚
β”‚      35 β”‚          486 β”‚                       1 β”‚ XXX   β”‚ 1546932540000 β”‚
β”‚      35 β”‚          486 β”‚                       1 β”‚ XXX   β”‚ 1546932600000 β”‚
β”‚      35 β”‚          486 β”‚                       1 β”‚ XXX   β”‚ 1546932660000 β”‚
β”‚      35 β”‚          486 β”‚                       1 β”‚ XXX   β”‚ 1546932720000 β”‚
β”‚      35 β”‚          480 β”‚                       1 β”‚ XXX   β”‚ 1546932780000 β”‚ >>> at that time we created MVs
β”‚      29 β”‚          287 β”‚                       1 β”‚ XXX   β”‚ 1546932840000 β”‚
β”‚      27 β”‚          376 β”‚                       1 β”‚ XXX   β”‚ 1546932900000 β”‚
β”‚      27 β”‚          330 β”‚                       1 β”‚ XXX   β”‚ 1546932960000 β”‚
β”‚      21 β”‚          309 β”‚                       1 β”‚ XXX   β”‚ 1546933020000 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

35 are right values. 29, 27, 21 values are wrong and it means that we missed some unique values.
As you can see we see wrong values only after MVs were created, so during POPULATE we agg data right, But then during realtime insertion/aggregating we started to see issue.

We have a feeling that maybe not everything was added to MVs and aggregated with uniqExactState, or maybe aggregation into MVs triggers only once for (time, appId) pair (order key) and it something was added to "main" table later will be missed?

Thx for future answer&investigation.

question

All 12 comments

WHERE timestamp > now()-300

Your MV missing everything with timestamp older than 5 min. during insert.

Yes, we know, this is not a case here. We insert only last minute data usually
I even tried

WHERE timestamp > now()-1800

No effect, so cause is somewhere else

No effect, so cause is somewhere else

Just to be sure - can you try to remove that condition? Anyway you "insert only last minute data".

Why: maybe now() is beeing calculated once - at the moment of MV initialization?

Try to calculate MV countState or sum(1) to locate the issue (missing rows or uniqExact)

No effect, so cause is somewhere else

Just to be sure - can you try to remove that condition? Anyway you "insert only last minute data".

Why: maybe now() is beeing calculated once - at the moment of MV initialization?

Hello, thx for suggestion, I've tried create MV without any conditions, didn't help neither

Try to calculate MV countState or sum(1) to locate the issue (missing rows or uniqExact)

Thx for suggestion, I've tried countState and got this

β”‚ 13629 β”‚         78100 β”‚              78100 β”‚                         78100 β”‚ 1547548020000 β”‚
β”‚ 13629 β”‚         78014 β”‚              78014 β”‚                         78014 β”‚ 1547548080000 β”‚
β”‚ 13629 β”‚         77851 β”‚              77851 β”‚                         77851 β”‚ 1547548140000 β”‚
β”‚ 13629 β”‚         78054 β”‚              78054 β”‚                         78054 β”‚ 1547548200000 β”‚
β”‚ 13629 β”‚         50058 β”‚              50058 β”‚                         50058 β”‚ 1547548260000 β”‚  >>>> HERE MV WAS CREATED
β”‚ 13629 β”‚         39404 β”‚              39404 β”‚                         39404 β”‚ 1547548320000 β”‚
β”‚ 13629 β”‚         34681 β”‚              34681 β”‚                         34681 β”‚ 1547548380000 β”‚
β”‚ 13629 β”‚         26377 β”‚              26377 β”‚                         26377 β”‚ 1547548440000 β”‚
β”‚ 13629 β”‚         34462 β”‚              34462 β”‚                         34462 β”‚ 1547548500000 β”‚
β”‚ 13629 β”‚         33806 β”‚              33806 β”‚                         33806 β”‚ 1547548560000 β”‚
β”‚ 13629 β”‚         11035 β”‚              11035 β”‚                         11035 β”‚ 1547548620000 β”‚
β”‚ 13629 β”‚         28857 β”‚              28857 β”‚                         28857 β”‚ 1547548680000 β”‚

As you can see count also showed incorrect values. So problem is in missing rows it seems.

@wawanawna Did you find the reason?

No, we didn't find a reason @filimonov
We stopped using MV, found workaround.

Hi @wawanawna,
What is the workaround you found ?

@nupursinghbaghel it could be https://github.com/yandex/ClickHouse/issues/5274
Then workaround is to add some column to MV to break de-duplication
as select , now() as workaround -- or generateUUIDv4()
and collapse it using any

Hi @wawanawna,
What is the workaround you found ?

We just stopped using MVs @nupursinghbaghel

We using MV that selecs data from replicated table:

CREATE TABLE db.`replicated_table` (…) ENGINE = ReplicatedMergeTree(…, '{replica}') ORDER BY (…) PARTITION BY … SAMPLE BY CAST(…);

Table is replicated across 3 machines, only 1 machine handles INSERT queries.

We've created MV on "slave" replica, and a lot of documents are missing on it.

CREATE MATERIALIZED VIEW db2.`mv` ENGINE = MergeTree() ORDER BY (…) PARTITION BY … POPULATE AS SELECT … FROM db.`replicated_table`;

or same behaviour with ENGINE=ReplicatedMergeTree

CREATE MATERIALIZED VIEW db2.`mv` ENGINE = ReplicatedMergeTree(…, '{replica}') ORDER BY (…) PARTITION BY … POPULATE AS SELECT … FROM db.`replicated_table`;

Fixed by creating MV with ENGINE=ReplicatedMergeTree on each insertable replica. Now all works fine.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

zhicwu picture zhicwu  Β·  3Comments

jangorecki picture jangorecki  Β·  3Comments

jimmykuo picture jimmykuo  Β·  3Comments

opavader picture opavader  Β·  3Comments

goranc picture goranc  Β·  3Comments