du -h 0B ./detached 7.7M ./20230809_2_2_0 7.7M ./20230808_1_1_0 15M .
可以看到两个分区目录均是 7.7M 尝试执行删除操作后,可以在日志中看到下面的查询信息
<Debug> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Key condition: unknown, (column 0 in [1000, +Inf)), (column 0 in (-Inf, 10000]), and, and <Debug> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Key condition: unknown, (column 0 in [1000, +Inf)), (column 0 in (-Inf, 10000]), and, and <Debug> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): MinMax index condition: (toYYYYMMDD(column 0) in [20230808, 20230808]), unknown, unknown, and, and <Debug> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): MinMax index condition: (toYYYYMMDD(column 0) in [20230808, 20230808]), unknown, unknown, and, and <Trace> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Running binary search on index range for part 20230808_1_1_0 (124 marks) <Trace> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Found (LEFT) boundary mark: 0 <Debug> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Selected 0/1 parts by partition key, 0 parts by primary key, 0/0 marks by primary key, 0 marks to read from 0 ranges <Trace> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Found (RIGHT) boundary mark: 2 <Trace> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Found continuous range in 13 steps <Debug> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Selected 1/1 parts by partition key, 1 parts by primary key, 2/123 marks by primary key, 2 marks to read from 1 ranges <Trace> delete_operate.mutations_operate (7f120927-b71e-4f85-a06c-21a94b7f89e3) (SelectExecutor): Spreading mark ranges among streams (default reading) <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part 20230808_1_1_0, approx. 16384 rows starting from 0 <Trace> Aggregator: Aggregation method: without_key <Trace> AggregatingTransform: Aggregated. 0 to 1 rows (from 0.00 B) in 0.000570041 sec. (0.000 rows/sec., 0.00 B/sec.) <Trace> Aggregator: Merging aggregated data <Trace> MutateTask: Part 20230809_2_2_0 doesn't change up to mutation version 3
首先,clickhouse 会使用我们执行的删除语句中附带的 where 条件在每个分区中执行 count 查询,为了判断哪些分区有需要被删除的数据,从日志可以看出Reading 1 ranges in order from part 20230808_1_1_0, approx. 16384 rows starting from 0以及Part 20230809_2_2_0 doesn't change up to mutation version 3。注意日志中所说 20230808 的范围是 0~16384 并不是实际删除的范围,而是索引的范围。我们知道 mergeTree 引擎默认的跳数索引的间隔是 8192 而我们删除的数据范围是 1000-10000,显然作为一个整周期自然是 0-16384(2x8192) 当我们再次查看磁盘目录
<Debug> delete_operate.mutations_operate (4351d317-2cd6-4328-85fe-49d5beeff5c3): Clone part /opt/homebrew/var/lib/clickhouse/store/435/4351d317-2cd6-4328-85fe-49d5beeff5c3/20230809_2_2_0/ to /opt/homebrew/var/lib/clickhouse/store/435/4351d317-2cd6-4328-85fe-49d5beeff5c3/tmp_clone_20230809_2_2_0_3 <Trace> delete_operate.mutations_operate (4351d317-2cd6-4328-85fe-49d5beeff5c3): Renaming temporary part tmp_clone_20230809_2_2_0_3 to 20230809_2_2_0_3 with tid (1, 1, 00000000-0000-0000-0000-000000000000). <Trace> MergedBlockOutputStream: filled checksums 20230808_1_1_0_3 (state Temporary) <Trace> delete_operate.mutations_operate (4351d317-2cd6-4328-85fe-49d5beeff5c3): Renaming temporary part tmp_mut_20230808_1_1_0_3 to 20230808_1_1_0_3 with tid (1, 1, 00000000-0000-0000-0000-000000000000)
从磁盘目录也可以佐证这一点,首先上面的 20230809_2_2_0_3 占用空间为 0B,当然这是 mac 独有的现实方式,在其它 linux 系统不一定是这么显示,进入各个分区查看一下
wjun :: data/delete_operate/mutations_operate ‹stable› » ll 20230808_1_1_0_3 total 15632 -rw-r-----@ 1 wjun admin 17863 Aug 9 19:36 CreateTime.bin -rw-r-----@ 1 wjun admin 369 Aug 9 19:36 CreateTime.cmrk2 -rw-r-----@ 1 wjun admin 3968891 Aug 9 19:36 Score.bin -rw-r-----@ 1 wjun admin 409 Aug 9 19:36 Score.cmrk2 -rw-r-----@ 1 wjun admin 3969011 Aug 9 19:36 UserId.bin -rw-r-----@ 1 wjun admin 409 Aug 9 19:36 UserId.cmrk2 -rw-r-----@ 1 wjun admin 490 Aug 9 19:36 checksums.txt -rw-r-----@ 1 wjun admin 90 Aug 9 19:36 columns.txt -rw-r-----@ 1 wjun admin 6 Aug 9 19:36 count.txt -rw-r-----@ 1 wjun admin 10 Aug 9 19:36 default_compression_codec.txt -rw-r-----@ 1 wjun admin 1 Aug 9 19:36 metadata_version.txt -rw-r-----@ 1 wjun admin 8 Aug 9 19:36 minmax_CreateTime.idx -rw-r-----@ 1 wjun admin 4 Aug 9 19:36 partition.dat -rw-r-----@ 1 wjun admin 188 Aug 9 19:36 primary.cidx wjun :: data/delete_operate/mutations_operate ‹stable› » ll 20230809_2_2_0_3 total 15768 -rw-r-----@ 2 wjun admin 18042 Aug 9 19:35 CreateTime.bin -rw-r-----@ 2 wjun admin 375 Aug 9 19:35 CreateTime.cmrk2 -rw-r-----@ 2 wjun admin 4004938 Aug 9 19:35 Score.bin -rw-r-----@ 2 wjun admin 415 Aug 9 19:35 Score.cmrk2 -rw-r-----@ 2 wjun admin 4004915 Aug 9 19:35 UserId.bin -rw-r-----@ 2 wjun admin 415 Aug 9 19:35 UserId.cmrk2 -rw-r-----@ 2 wjun admin 490 Aug 9 19:35 checksums.txt -rw-r-----@ 2 wjun admin 90 Aug 9 19:35 columns.txt -rw-r-----@ 2 wjun admin 7 Aug 9 19:35 count.txt -rw-r-----@ 2 wjun admin 10 Aug 9 19:35 default_compression_codec.txt -rw-r-----@ 2 wjun admin 8 Aug 9 19:35 minmax_CreateTime.idx -rw-r-----@ 2 wjun admin 4 Aug 9 19:35 partition.dat -rw-r-----@ 2 wjun admin 173 Aug 9 19:35 primary.cidx