Describe the bug
Given the following tables:
CREATE TABLE IF NOT EXISTS source
(
`timestamp` DateTime,
`number` UInt32,
`date1` Date,
`date2` Date
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY (timestamp);
CREATE TABLE IF NOT EXISTS number_date (
date Date,
number UInt32,
dates_in_between Date,
count UInt32
) ENGINE = SummingMergeTree(count)
PARTITION BY toYYYYMMDD(date)
ORDER BY (date, number, dates_in_between);
CREATE MATERIALIZED VIEW IF NOT EXISTS number_date_view TO number_date AS
SELECT date, number, toDate(dates) as dates_in_between, CAST(count(*), 'UInt32') AS count
FROM
(
WITH 86400 AS dayseconds
SELECT
toDate(timestamp) AS date,
number,
timeSlots(toDateTime(date1, 'Zulu'), toUInt32((dateDiff('day', date1, date2) - 1) * dayseconds), toUInt32(dayseconds)) AS dates
FROM source
) ARRAY JOIN dates
GROUP BY date, number, dates_in_between;
When I insert data in source table with
INSERT INTO source (timestamp, number, date1, date2) VALUES('2019-12-01 12:00:00', 77, '2020-10-01', '2020-10-03');
ClickHouse complains about missing columns:
Received exception from server (version 19.11.2):
Code: 47. DB::Exception: Received from localhost:9001. DB::Exception: Missing columns: 'date' while processing query: 'SELECT date, number, toDate(dates) AS dates_in_between, CAST(count(), 'UInt32') AS count FROM (WITH 86400 AS dayseconds SELECT toDate(timestamp) AS date, number, timeSlots(toDateTime(date1, 'Zulu'), toUInt32((dateDiff('day', date1, date2) - 1) * dayseconds), toUInt32(dayseconds)) AS dates FROM default.source) ARRAY JOIN dates GROUP BY date, number, dates_in_between', required columns: 'number' 'date', source columns: 'date1' 'timestamp' 'date2' 'number', arrayJoin columns: 'dates': while pushing to view default.number_date_view.
Now. It is quite weird, since, the INSERT works perfectly if I don't create the views and it also works fine with an INSERT FROM SELECT
INSERT INTO number_date SELECT
date,
number,
toDate(dates) AS dates_in_between,
CAST(count(*), 'UInt32') AS count
FROM
(
WITH 86400 AS dayseconds
SELECT
toDate(timestamp) AS date,
number,
timeSlots(toDateTime(date1, 'Zulu'), toUInt32((dateDiff('day', date1, date2) - 1) * dayseconds), toUInt32(dayseconds)) AS dates
FROM source
)
ARRAY JOIN dates
GROUP BY
date,
number,
dates_in_between
I am running:
SELECT version()
ββversion()ββ
β 19.11.2.7 β
βββββββββββββ
In docker.
Am I don't something wrongly?
Let me know if I you need extra data.
Thanks in advance
CH does not have full support of subselect in MV yet. https://github.com/yandex/ClickHouse/issues/2878
Thanks for the quick reply.
It is quite unfortunate but I take comfort in the word "yet" :)
In the meanwhile, I have found a workaround moving the sub query logic to a materialized column in the source table so the materialized view does not require it.
Just in case it might be useful to someone else.
Needless to say, you guys have created an amazing piece of software.
Well done and thanks again!
@pachico What is the workaround that you implemented for this use case?
@h-abinaya28 I used a materialized column to preprocess part of the data so I didn't have to use a subquery.
In this case, in the table source I created a materialized column storing timeSlots(toDateTime(date1, 'Zulu'), toUInt32((dateDiff('day', date1, date2) - 1) * dayseconds), toUInt32(dayseconds)) and I arrayJoined them.
BTW, Original issue already fixed in master. Works in 19.15.1.1285.
Awesome news, @den-crane! Great work!
Tested in 20.3.4.10, it works.
Most helpful comment
BTW, Original issue already fixed in master. Works in 19.15.1.1285.