I have created a materialized view using this query
CREATE MATERIALIZED VIEW db.top_ids_mv (
`date_time` DateTime,
`id` String,
`total` UInt64
) ENGINE = SummingMergeTree
ORDER BY
(date_time, id) SETTINGS index_granularity = 8192 POPULATE AS
SELECT
toDateTime((intDiv(toUInt32(date_time), 60 * 60) * 60) * 60) AS date_time,
id AS id,
count(*) AS count
FROM
db.table
WHERE
type = 'user'
GROUP BY
date_time,id
My table contains almost 18 billion records. I have inserted my old data using POPULATE
. But newly inserted data is not getting inserted into this materialized view. I have created many other views and they are working fine but this is creating issue.
This is what I am receiving in logs
2021.09.23 19:54:54.424457 [ 137949 ] {5b0f3c32-2900-4ce4-996d-b9696bd38568} <Trace> PushingToViewsBlockOutputStream: Pushing (sequentially) from db.table (15229c91-c202-4809-9522-9c91c2028809) to db.top_ids_mv (0cedb783-bf17-42eb-8ced-b783bf1742eb) took 0 ms.
One thing I noticed is that it is taking 0ms. I think that is wrong because query must take some time.
Thanks. Any help would be appreciated
CodePudding user response:
- SummingMergeTree does not store rows with metrics == 0.
total UInt64 <----> count(*) AS count
-- names does not match. Your Mat.View inserts 0 intototal
,count
goes nowhere.
Both are expected and specifically implemented for the reasons.
https://den-crane.github.io/Everything_you_should_know_about_materialized_views_commented.pdf
...
SELECT
toDateTime((intDiv(toUInt32(date_time), 60 * 60) * 60) * 60) AS date_time,
id AS id,
count(*) AS total --<<<<<------
FROM
db.table
...
For query performance and better data compression I would do
ENGINE = SummingMergeTree
ORDER BY ( id, date_time ) --- order id , time
Also try codecs
`date_time` DateTime CODEC(Delta, LZ4),
`id` LowCardinality(String),
`total` UInt64 CODEC(T64, LZ4)