Hello. First of all thx for a great product. Very fast and flexible.
Recently we started using MaterializedView and noticed one issue (or not an issue but we did something wrong)
We created MVs based on local Metrics tables (they use ReplicatedMergeTree() engine and we constantly writes to those local tables (each second)) like:
CREATE MATERIALIZED VIEW default.MetricStats ON CLUSTER metrics
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/metrics/{shard}/MetricStats', '{replica}')
PARTITION BY floor(time / 86400)
ORDER BY (time, appId) POPULATE AS
SELECT
uniqExactState(host) AS `os.host`,
uniqExactState(container) AS `container.id`,
uniqExactState(containerHostname) AS `container.host.hostname`,
appId,
floor(timestamp / 60) * 60 AS time
FROM default.Metrics
WHERE timestamp > now()-300
GROUP BY
time,
appId
Then we created distributed table based on MVs
CREATE TABLE IF NOT EXISTS default.MetricStatsDistributed ON CLUSTER metrics AS default.MetricStats
ENGINE = Distributed(metrics, default, MetricStats, rand());
After all we select data from Distributed table (which based on MVs)
SELECT
uniqExactMerge(`os.host`) AS `os.host`,
uniqExactMerge(`container.id`) AS `container.id`,
uniqExactMerge(`container.host.hostname`) AS `container.host.hostname`,
appId,
time
FROM MetricStatsDistributed
WHERE appId = XXX
GROUP BY
appId,
time
ORDER BY time ASC
And got following
ββos.hostββ¬βcontainer.idββ¬βcontainer.host.hostnameββ¬βappIdββ¬ββββββββββtimeββ
β 35 β 486 β 1 β XXX β 1546932480000 β
β 35 β 486 β 1 β XXX β 1546932540000 β
β 35 β 486 β 1 β XXX β 1546932600000 β
β 35 β 486 β 1 β XXX β 1546932660000 β
β 35 β 486 β 1 β XXX β 1546932720000 β
β 35 β 480 β 1 β XXX β 1546932780000 β >>> at that time we created MVs
β 29 β 287 β 1 β XXX β 1546932840000 β
β 27 β 376 β 1 β XXX β 1546932900000 β
β 27 β 330 β 1 β XXX β 1546932960000 β
β 21 β 309 β 1 β XXX β 1546933020000 β
βββββββββββ΄βββββββββββββββ΄ββββββββββββββββββββββββββ΄ββββββββ΄ββββββββββββββββ
35 are right values. 29, 27, 21 values are wrong and it means that we missed some unique values.
As you can see we see wrong values only after MVs were created, so during POPULATE we agg data right, But then during realtime insertion/aggregating we started to see issue.
We have a feeling that maybe not everything was added to MVs and aggregated with uniqExactState, or maybe aggregation into MVs triggers only once for (time, appId) pair (order key) and it something was added to "main" table later will be missed?
Thx for future answer&investigation.
WHERE timestamp > now()-300
Your MV missing everything with timestamp older than 5 min. during insert.
Yes, we know, this is not a case here. We insert only last minute data usually
I even tried
WHERE timestamp > now()-1800
No effect, so cause is somewhere else
No effect, so cause is somewhere else
Just to be sure - can you try to remove that condition? Anyway you "insert only last minute data".
Why: maybe now() is beeing calculated once - at the moment of MV initialization?
Try to calculate MV countState or sum(1) to locate the issue (missing rows or uniqExact)
No effect, so cause is somewhere else
Just to be sure - can you try to remove that condition? Anyway you "insert only last minute data".
Why: maybe now() is beeing calculated once - at the moment of MV initialization?
Hello, thx for suggestion, I've tried create MV without any conditions, didn't help neither
Try to calculate MV countState or sum(1) to locate the issue (missing rows or uniqExact)
Thx for suggestion, I've tried countState and got this
β 13629 β 78100 β 78100 β 78100 β 1547548020000 β
β 13629 β 78014 β 78014 β 78014 β 1547548080000 β
β 13629 β 77851 β 77851 β 77851 β 1547548140000 β
β 13629 β 78054 β 78054 β 78054 β 1547548200000 β
β 13629 β 50058 β 50058 β 50058 β 1547548260000 β >>>> HERE MV WAS CREATED
β 13629 β 39404 β 39404 β 39404 β 1547548320000 β
β 13629 β 34681 β 34681 β 34681 β 1547548380000 β
β 13629 β 26377 β 26377 β 26377 β 1547548440000 β
β 13629 β 34462 β 34462 β 34462 β 1547548500000 β
β 13629 β 33806 β 33806 β 33806 β 1547548560000 β
β 13629 β 11035 β 11035 β 11035 β 1547548620000 β
β 13629 β 28857 β 28857 β 28857 β 1547548680000 β
As you can see count also showed incorrect values. So problem is in missing rows it seems.
@wawanawna Did you find the reason?
No, we didn't find a reason @filimonov
We stopped using MV, found workaround.
Hi @wawanawna,
What is the workaround you found ?
@nupursinghbaghel it could be https://github.com/yandex/ClickHouse/issues/5274
Then workaround is to add some column to MV to break de-duplication
as select , now() as workaround -- or generateUUIDv4()
and collapse it using any
Hi @wawanawna,
What is the workaround you found ?
We just stopped using MVs @nupursinghbaghel
We using MV that selecs data from replicated table:
CREATE TABLE db.`replicated_table` (β¦) ENGINE = ReplicatedMergeTree(β¦, '{replica}') ORDER BY (β¦) PARTITION BY β¦ SAMPLE BY CAST(β¦);
Table is replicated across 3 machines, only 1 machine handles INSERT queries.
We've created MV on "slave" replica, and a lot of documents are missing on it.
CREATE MATERIALIZED VIEW db2.`mv` ENGINE = MergeTree() ORDER BY (β¦) PARTITION BY β¦ POPULATE AS SELECT β¦ FROM db.`replicated_table`;
or same behaviour with ENGINE=ReplicatedMergeTree
CREATE MATERIALIZED VIEW db2.`mv` ENGINE = ReplicatedMergeTree(β¦, '{replica}') ORDER BY (β¦) PARTITION BY β¦ POPULATE AS SELECT β¦ FROM db.`replicated_table`;
Fixed by creating MV with ENGINE=ReplicatedMergeTree on each insertable replica. Now all works fine.