ClickHouse: feature request: EXPLAIN output

Created on 17 Aug 2017 · 19 Comments · Source: ClickHouse/ClickHouse

I'm fairly new to ClickHouse, and as I develop queries I'd like to be able to optimize them. With a traditional RDBMS I can examine the query plan to get a sense of what my query will do without running it. I'd like ClickHouse to give me some kind of diagnostic output that would guide my optimization efforts. For example, see SQLite and Postgres.

I found a previous discussion, but I don't have direct access to the server to read that log file.

In short, I'm asking for:

  • some type of EXPLAIN syntax that can be prefixed to a query when it is sent to ClickHouse
  • a query response detailing how ClickHouse intends to execute the query
  • documentation describing what that EXPLAIN query response means and pointers on how to optimize queries

Thanks!

feature


All 19 comments

This is part of a larger task, and we have some related tasks planned:

First, we are planning to implement rich profiling info for a query: how much time is spent on each server node, in each thread, for each subquery, on CPU, disk I/O, network, and synchronization.

Second, we are going to implement a query that will show how many rows will be read for each subquery on each server, according to index info.

Neither is exactly an EXPLAIN query.

Hi @alexey-milovidov, is there any plan to implement this feature?

Any new information about this, please?

@alexey-milovidov It would also be useful to include some recommendation about index granularity for queries. E.g. when the most common and frequent query touches only a few rows but has to scan many more to find them, we could suggest: "If you want this query to be faster, you can change your table's index granularity to N".

is there any plan to implement this feature?

We started working on this plan: https://github.com/yandex/ClickHouse/issues/1118#issuecomment-323190806

@alexey-milovidov It would also be useful to include some recommendation about index granularity for queries. E.g. when the most common and frequent query touches only a few rows but has to scan many more to find them, we could suggest: "If you want this query to be faster, you can change your table's index granularity to N".

At first we want to make index granularity adaptive based on row size in bytes. For example, marks should be placed more frequently if you have fat rows. This is a separate feature.
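The idea of adaptive granularity can be sketched as follows. This is an illustrative model, not ClickHouse's actual implementation; the limits (8192 rows, 10 MiB) and the function name are assumptions for the example:

```python
# Illustrative model of adaptive index granularity (not ClickHouse source):
# close a granule when it reaches EITHER a row limit or a byte limit, so
# tables with fat rows get marks placed more frequently.

def place_marks(row_sizes, max_rows=8192, max_bytes=10 * 1024 * 1024):
    """Return the row offsets where marks (granule boundaries) are placed."""
    marks = [0]
    rows_in_granule = 0
    bytes_in_granule = 0
    for i, size in enumerate(row_sizes):
        rows_in_granule += 1
        bytes_in_granule += size
        if rows_in_granule >= max_rows or bytes_in_granule >= max_bytes:
            marks.append(i + 1)
            rows_in_granule = 0
            bytes_in_granule = 0
    return marks

# Thin 100-byte rows: the row limit dominates, so a mark every 8192 rows.
print(place_marks([100] * 20000))        # → [0, 8192, 16384]
# Fat 1 MiB rows: the byte limit dominates, so a mark every 10 rows.
print(place_marks([1_048_576] * 30))     # → [0, 10, 20, 30]
```

With fixed row-count granularity, a granule of fat rows could cover an arbitrarily large number of bytes; bounding the granule in bytes as well keeps the amount of data read per mark predictable.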

Is the task dead?
Any news about the explain query plan?

@akonyaev90 not dead, just in the backlog; not currently in progress.

Could you tell us when the explain query plan will be implemented?

Looking forward to it.

@BlackEric001 nope, not scheduled yet. Though you can already use something like this to get some idea of how a query is executed:

% clickhouse-client --send_logs_level=debug <<< 'select max(number) from numbers(100) where number % 2 union all select 123 as number;' > /dev/null
[....yandex.net] 2018.11.20 16:18:29.703878 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: (from [::1]:53819) select max(number) from numbers(100) where number % 2 union all select 123 as number
[....yandex.net] 2018.11.20 16:18:29.704895 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: Query pipeline:
Union
 Converting
  Expression
   Expression
    Aggregating
     Concat
      ConvertColumnWithDictionaryToFull
       Expression
        Filter
         Limit
          Numbers
 Converting
  Expression
   Expression
    One

[....yandex.net] 2018.11.20 16:18:29.705772 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Information> executeQuery: Read 101 rows, 801.00 B in 0.002 sec., 54998 rows/sec., 425.95 KiB/sec.

Is this still on the backlog?

@MeteHanC yes

First, we are planning to implement rich profiling info for a query: how much time is spent on each server node, in each thread, for each subquery, on CPU, disk I/O, network, and synchronization.

This task is done.

At first we want to make index granularity adaptive based on row size in bytes. For example, marks should be placed more frequently if you have fat rows. This is a separate feature.

This task is also done.

@BlackEric001 nope, not scheduled yet. Though you can already use something like this to get some idea of how a query is executed

Now you can use SET send_logs_level = 'trace' in clickhouse-client to get this info.

Now the task depends on the "Processors" feature, which is currently being implemented by @KochetovNicolai.
The current estimate is not earlier than Jan 2020.

First, we are planning to implement rich profiling info for a query: how much time is spent on each server node, in each thread, for each subquery, on CPU, disk I/O, network, and synchronization.

@alexey-milovidov Do you know which PRs cover this task?

Hello @blinkov @alexey-milovidov!
Could you please share any news about plans for this feature?

We are trying to predict each query's cost on the fly (based on its possible impact on the ClickHouse cluster: read_rows, read_bytes, parts touched, etc.) in order to determine whether it is possible to run the query or not. With the explain feature, the entire process would be much simpler.
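The kind of admission check described here could look like the following sketch. The weights, the budget, and the function names are hypothetical assumptions for illustration, not a ClickHouse API:

```python
# Hypothetical admission check: estimate a query's cost from predicted
# resource usage and refuse to run it when the estimate exceeds a budget.
# All weights, names, and the budget are illustrative assumptions.

def estimated_cost(read_rows, read_bytes, parts,
                   row_weight=1.0, byte_weight=0.001, part_weight=1000.0):
    """Combine per-resource estimates into a single scalar cost."""
    return (read_rows * row_weight
            + read_bytes * byte_weight
            + parts * part_weight)

def admit(read_rows, read_bytes, parts, budget=10_000_000):
    """Return True if the query is predicted to be cheap enough to run."""
    return estimated_cost(read_rows, read_bytes, parts) <= budget

# A small point lookup passes; a scan of billions of rows is rejected.
print(admit(10_000, 1_000_000, 3))                    # → True
print(admit(5_000_000_000, 2_000_000_000_000, 500))   # → False
```

In practice the inputs would come from whatever estimate the server can provide before execution (e.g. rows selected by the primary index), which is exactly the information an EXPLAIN-style query would expose.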

@BlackEric001 nope, not scheduled yet. Though you can already use something like this to get some idea of how a query is executed:

% clickhouse-client --send_logs_level=debug <<< 'select max(number) from numbers(100) where number % 2 union all select 123 as number;' > /dev/null
[....yandex.net] 2018.11.20 16:18:29.703878 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: (from [::1]:53819) select max(number) from numbers(100) where number % 2 union all select 123 as number
[....yandex.net] 2018.11.20 16:18:29.704895 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: Query pipeline:
Union
 Converting
  Expression
   Expression
    Aggregating
     Concat
      ConvertColumnWithDictionaryToFull
       Expression
        Filter
         Limit
          Numbers
 Converting
  Expression
   Expression
    One

[....yandex.net] 2018.11.20 16:18:29.705772 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Information> executeQuery: Read 101 rows, 801.00 B in 0.002 sec., 54998 rows/sec., 425.95 KiB/sec.

Hi, why doesn't my output include a "Query pipeline" section?

#clickhouse-client -u anyuser --ask-password --port 9000 --send_logs_level=trace <<< 'select count(*) from bp_bst.action_bst limit 10' 
Password for user (anyuser): 
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.938593 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Debug> executeQuery: (from 127.0.0.1:49731, user: anyuser) SELECT count(*) FROM bp_bst.action_bst LIMIT 10
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.938941 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): List of all grants: GRANT SHOW, EXISTS, SELECT, INSERT, ALTER, CREATE, CREATE TEMPORARY TABLE, DROP, TRUNCATE, OPTIMIZE, KILL, CREATE USER, ALTER USER, DROP USER, CREATE ROLE, DROP ROLE, CREATE POLICY, ALTER POLICY, DROP POLICY, CREATE QUOTA, ALTER QUOTA, DROP QUOTA, ROLE ADMIN, SYSTEM, dictGet(), TABLE FUNCTIONS ON *.*
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.938978 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.939122 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.939241 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst_local
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.939919 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> InterpreterSelectQuery: WithMergeableState -> WithMergeableState
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.940014 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> InterpreterSelectQuery: WithMergeableState -> Complete
1035815
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.940616 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Debug> executeQuery: (from 172.16.24.179:18502, initial_query_id: c9848987-a84f-47ba-a976-c81aa321e118) SELECT count() FROM bp_bst.action_bst_local LIMIT 10
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.940853 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Trace> AccessRightsContext (default): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst_local
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.941489 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Trace> InterpreterSelectQuery: WithMergeableState -> WithMergeableState
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.941829 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Information> executeQuery: Read 1 rows, 4.01 KiB in 0.001 sec., 867 rows/sec., 3.39 MiB/sec.
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.941856 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.942701 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Debug> executeQuery: (from 172.16.24.179:4291, initial_query_id: c9848987-a84f-47ba-a976-c81aa321e118) SELECT count() FROM bp_bst.action_bst_local LIMIT 10
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.942921 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Trace> AccessRightsContext (default): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst_local
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.943468 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Trace> InterpreterSelectQuery: WithMergeableState -> WithMergeableState
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.943760 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Information> executeQuery: Read 1 rows, 4.01 KiB in 0.001 sec., 996 rows/sec., 3.90 MiB/sec.
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.943787 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.943968 [ 28378 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> Aggregator: Merging partially aggregated blocks (bucket = -1).
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.944035 [ 28378 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> Aggregator: Merged partially aggregated blocks. 1 rows, 0.000 MiB. in 0.000 sec. (74233.539 rows/sec., 0.566 MiB/sec.)
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.944297 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Information> executeQuery: Read 3 rows, 12.02 KiB in 0.006 sec., 535 rows/sec., 2.10 MiB/sec.
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.944339 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.


Hello, I'd like to understand the explain output for queries on a cluster: what are ParallelAggregating, PartialSorting, and Concat, and how is the query executed on the cluster?

I'd like to understand the explain output for queries on a cluster

Explain for queries on a cluster is not implemented yet. This feature will be added later.

what are ParallelAggregating, PartialSorting, and Concat

These are different parts of the query pipeline. The corresponding classes can be found in the source code and are useful mainly for developers. It would be difficult to maintain documentation for them because the code is constantly changing. I might possibly add another setting for a detailed text description, but I'm not 100% sure it's reasonable.

For those you asked about:
ParallelAggregating is the first step of GROUP BY, when data is added into hash tables in parallel (one hash table per thread). It also calculates intermediate aggregation states; e.g. if you calculate count(), the hash table will store a countState for each key.
PartialSorting sorts blocks of data and cuts off LIMIT rows if needed. These sorted blocks are usually processed by MergeSorting, which returns a globally sorted data stream.
Concat just concatenates several data streams into one, reading all the data from the streams sequentially.
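The behavior of these steps can be modeled in a few lines. This is an illustrative sketch of the semantics described above (streams modeled as lists of blocks), not ClickHouse code:

```python
# Illustrative model (not ClickHouse source) of three pipeline steps:
# PartialSorting, MergeSorting, and Concat. A "stream" is a list of blocks.
import heapq
from itertools import chain

def partial_sorting(blocks, key=None, limit=None):
    """Sort each block independently; with a LIMIT, keep only that many
    rows per block (the global cut happens after merging)."""
    out = []
    for block in blocks:
        block = sorted(block, key=key)
        out.append(block[:limit] if limit is not None else block)
    return out

def merge_sorting(blocks, key=None, limit=None):
    """Merge already-sorted blocks into one globally sorted stream."""
    merged = list(heapq.merge(*blocks, key=key))
    return merged[:limit] if limit is not None else merged

def concat(*streams):
    """Read all the data from several streams, one after another."""
    return list(chain.from_iterable(streams))

blocks = [[3, 1, 2], [9, 7], [5, 4, 8]]
print(merge_sorting(partial_sorting(blocks, limit=3), limit=3))  # → [1, 2, 3]
```

Note how PartialSorting can already drop rows beyond the LIMIT inside each block, which is why ORDER BY ... LIMIT queries are much cheaper than a full global sort.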
