How to reproduce
Expected behavior
"select count()" should gives 3000000 rows in table with MergeTree, but it gives less even after stream_flush_interval_ms passes (2999889, for example)
Additional context
@alexm93 try experimenting with kafka_max_block_size = 1. We ran into the same issue and temporarily resolved it by doing so. Also commented on https://github.com/yandex/ClickHouse/issues/4736
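For reference, a sketch of where that setting goes; the table definition is hypothetical and mirrors the one above, only the last setting is the workaround:

```sql
-- Kafka engine table with the workaround applied (names and schema are assumed)
CREATE TABLE kafka_events
(
    id UInt64,
    value String
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'localhost:9092',
    kafka_topic_list = 'events',
    kafka_group_name = 'clickhouse_events',
    kafka_format = 'JSONEachRow',
    kafka_max_block_size = 1;  -- flush after every message; much slower, but reduced the losses for us
```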
I observe similar issues.
Pushing a 1M-event burst is lossless.
Pushing a 10M-event burst results in partial loss, with rows missing from the materialized view.
kafka_max_block_size = 1 significantly slows down the population rate but still loses tens of thousands of rows.
I am detecting missing rows by comparing the row count on the Kafka producer against the materialized view, as in the sketch below. E.g. 11000000 events go in, 10982699 rows come out in the MV. At smaller bursts the tally is exact.
I have tried MergeTree and Memory engines for the MV.
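A sketch of that comparison; it assumes the events carry a monotonically increasing id, which is an assumption and not stated above:

```sql
-- Compare what landed in the target table against what was produced.
-- The produced count (e.g. 11000000) is known on the producer side.
SELECT
    count() AS rows_in_mv,
    max(id) - min(id) + 1 AS expected_if_contiguous,
    (max(id) - min(id) + 1) - count() AS missing_rows
FROM events;
-- e.g. rows_in_mv = 10982699 for 11000000 produced events
```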
Confirmed the problem. Trying to fix.
@abyss7 thanks! While you're dealing with that, do you mind explaining or defining CH behaviour if it can't process kafka events as quickly as they are being published?
With a large rate mismatch (continuous 10M/s publish, CH processing only about 1M/s), I find that roughly 90% of the data seems to get black-holed, never to be seen again either in CH or on the queue. CH should not consume the queue faster than it can actually process it, and that should be part of the guarantee that every single message is eventually processed.
In theory it's a plausible scenario: there may be a situation where a single Kafka message contains 1M rows, and once CH has read it and started inserting rows via the MV, CH has already marked it as committed. So if CH crashes halfway through, the rest of the rows will never be read again. If that's not the case, I suggest discussing it in another issue, or in the Telegram chat if convenient.
Looks like a duplicate of #4736