常规的视图(View)可以看作一个预先保存好的查询,而Clickhouse的物化视图(materialized view,MV)功能类似于数据的插入触发器,会在来源数据插入的时候按照定义更新MV对应的数据
场景:统计平台不同时间段(分钟、小时、6小时、天、周、月、年)的下载量。原始数据:(when,platform,count)
| 12
 3
 4
 5
 6
 7
 8
 9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 
 | CREATE TABLE download
 (
 when   DateTime,
 platform UInt32,
 cnt  UInt32
 ) ENGINE = MergeTree
 PARTITION BY toYYYYMM(when)
 ORDER BY (platform, when);
 
 
 INSERT INTO download
 SELECT
 now() + number * 60 as when,
 rand() % 10,
 rand() % 100000000
 FROM system.numbers
 LIMIT 50000
 
 | 
查看数据
| 12
 3
 4
 5
 6
 7
 8
 9
 10
 11
 12
 13
 14
 15
 16
 
 | SELECT
 toStartOfDay(when) AS day,
 platform,
 sum(cnt) AS cnt
 FROM download
 GROUP BY platform, day
 order by day asc,platform
 
 SELECT
 toStartOfWeek(when) AS week,
 platform,
 sum(cnt) AS cnt
 FROM download
 GROUP BY platform, week
 order by week asc,platform
 
 | 
按天的查询可以建立为MV,其中的POPULATE表示从已有的数据中生成,MV创建完成后原始数据表中的新增数据将自动按照GROUP BY条件更新到MV中。
需要注意MV在创建到数据导入完成会存在一部分时间,实际使用中一般不指定POPULATE,而在MV创建完成后再分段的注入数据,保证数据一致性。
| 12
 3
 4
 5
 6
 7
 8
 9
 10
 11
 12
 
 | CREATE MATERIALIZED VIEW downloads_by_dayENGINE = SummingMergeTree
 PARTITION BY toYYYYMM(day)
 ORDER BY (platform, day) POPULATE AS
 SELECT
 toStartOfDay(when) AS day,
 platform,
 sum(cnt) AS cnt
 FROM download
 GROUP BY
 platform,
 day;
 
 | 
同样的,我们可以按照分析需求建立更多层级的MV,也可以在MV中选择不同的聚合函数。结合输入端给到的时间范围以及展示上最多需要的点数,调整需要查询MV。
| 12
 3
 4
 5
 6
 7
 8
 9
 10
 11
 12
 13
 14
 
 | public enum TimeAlign{
 By1Min = 1,
 By5Min = 2,
 By10Min = 3,
 By15Min = 4,
 By30Min = 5,
 By1Hour = 6,
 By6Hour = 7,
 By12Hour = 8,
 By1Day = 9,
 By1Week = 10,
 By1Month = 11
 }
 
 |