Clickhouse: How to generate secondary indexes?

Created on 20 Aug 2019  ยท  5Comments  ยท  Source: ClickHouse/ClickHouse

I've got a table with a value column of type string.Having populated the table, I add a secondary "data skipping" index with

ALTER TABLE xxx ADD INDEX value value TYPE ngrambf_v1(2, 512, 4, 7) GRANULARITY 64;

That command finishes in 0.011 sec. According to the docs,

These commands are lightweight in a sense that they only change metadata or remove files.

I looked at top - indeed, nothing is happening.

Then I execute a query which compares value:

SELECT count(*) from xxx WHERE value='whatever';

I get

[xxx-clickhouse] 2019.08.20 01:21:27.148979 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> executeQuery: (from [::ffff:127.0.0.1]:48270) SELECT count(*) FROM xxx WHERE value = 'whatever'
โ†’ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.) [xxx-clickhouse] 2019.08.20 01:21:27.150348 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): Key condition: unknown
[xxx-clickhouse] 2019.08.20 01:21:27.150671 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150704 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150718 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150736 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150754 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150770 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150792 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150823 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150887 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150927 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.150970 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151011 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151045 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151100 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151140 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151178 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151227 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151261 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151300 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151342 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151390 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): File for index `value` does not exist. Skipping it.
[xxx-clickhouse] 2019.08.20 01:21:27.151427 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> default.xxx (SelectExecutor): Selected 21 parts by date, 21 parts by key, 1958896 marks to read from 21 ranges
[xxx-clickhouse] 2019.08.20 01:21:27.152593 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> executeQuery: Query pipeline:
Expression
 Expression
  ParallelAggregating
   Expression ร— 32
    Filter
     MergeTreeThread

โ”Œโ”€count()โ”€โ”
โ”‚  257298 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
[xxx-clickhouse] 2019.08.20 01:21:45.728037 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Information> executeQuery: Read 16046897896 rows, 286.56 GiB in 18.579 sec., 863724271 rows/sec., 15.42 GiB/sec.
[xxx-clickhouse] 2019.08.20 01:21:45.728106 {d6867bcc-dc33-4fcc-b55a-49be17ac11b4} [ 144 ] <Debug> MemoryTracker: Peak memory usage (for query): 401.37 MiB.

1 rows in set. Elapsed: 18.582 sec. Processed 16.05 billion rows, 307.69 GB (863.59 million rows/s., 16.56 GB/s.)

This was kind of expected - the index was not really created so it was not used. The performance is identical to the one before ADD INDEX.

My version:

ClickHouse client version 19.13.1.11 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 19.13.1 revision 54425.

Question

How to actually generate my secondary index?

comp-skipidx question

Most helpful comment

Thanks! This magic command is worth mentioning somewhere near ADD INDEX docs :smile:

All 5 comments

Run OPTIMIZE TABLE xxx FINAL; after adding index to build it for old data.

In upcoming releases (probably in 19.14 and higher) special syntax for that will appear:
ALTER TABLE ... MATERIALIZE INDEX ... IN PARTITION ...

See: https://github.com/yandex/ClickHouse/pull/5053/files?file-filters%5B%5D=.md

Thanks! This magic command is worth mentioning somewhere near ADD INDEX docs :smile:

@BayoNet do you have a ticket for this?

@BayoNet do you have a ticket for this?

Actually @nikvas0 already have added some documentation for that. :) (See link above).

Was this page helpful?
0 / 5 - 0 ratings