ClickHouse: feature request: EXPLAIN output

Created on 17 Aug 2017 · 19 Comments · Source: ClickHouse/ClickHouse

I'm fairly new to ClickHouse, and as I develop queries I'd like to be able to optimize them. With a traditional RDBMS I can examine the query plan to get a sense of what my query will do without running it. I'd like ClickHouse to give me some kind of diagnostic output that would guide my optimization efforts. For example, see SQLite and Postgres.

I found a previous discussion, but I don't have direct access to the server to read that log file.

In short, I'm asking for:

  • some type of EXPLAIN syntax that can be prefixed to a query when it is sent to ClickHouse
  • a query response detailing how ClickHouse intends to execute the query
  • documentation describing what that EXPLAIN query response means and pointers on how to optimize queries

Thanks!

feature


All 19 comments

This is part of a larger task, and we have some related tasks planned:

First, we are planning to implement rich profiling info for a query: how much time is spent on each server node, in each thread, for each subquery, on CPU, disk I/O, network, and synchronization.

Second, we are going to implement a query that will show how many rows will be read for each subquery on each server, according to index info.

Neither is exactly an EXPLAIN query.

Hi @alexey-milovidov, is there any plan to implement this feature?

Any new information about this, please?

@alexey-milovidov It would also be useful to include some recommendation about index granularity for queries. E.g. when the most common and frequent query touches only a few rows but has to scan many more to find them, we could suggest: "If you want this query to be faster, you can change your table's index granularity to N".

is there any plan to implement this feature?

We started working on this plan: https://github.com/yandex/ClickHouse/issues/1118#issuecomment-323190806

@alexey-milovidov It would also be useful to include some recommendation about index granularity for queries. E.g. when the most common and frequent query touches only a few rows but has to scan many more to find them, we could suggest: "If you want this query to be faster, you can change your table's index granularity to N".

At first we want to make index granularity adaptive based on row size in bytes. For example, marks should be placed more frequently if you have fat rows. This is a separate feature.
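The idea of adaptive granularity can be sketched as follows. This is an illustrative model, not ClickHouse's actual implementation; the limits (8192 rows, 10 MiB) and the function name are assumptions for the example:

```python
# Illustrative model of adaptive index granularity (not ClickHouse source):
# close a granule when it reaches EITHER a row limit or a byte limit, so
# tables with fat rows get marks placed more frequently.

def place_marks(row_sizes, max_rows=8192, max_bytes=10 * 1024 * 1024):
    """Return the row offsets where marks (granule boundaries) are placed."""
    marks = [0]
    rows_in_granule = 0
    bytes_in_granule = 0
    for i, size in enumerate(row_sizes):
        rows_in_granule += 1
        bytes_in_granule += size
        if rows_in_granule >= max_rows or bytes_in_granule >= max_bytes:
            marks.append(i + 1)
            rows_in_granule = 0
            bytes_in_granule = 0
    return marks

# Thin 100-byte rows: the row limit dominates, so a mark every 8192 rows.
print(place_marks([100] * 20000))        # → [0, 8192, 16384]
# Fat 1 MiB rows: the byte limit dominates, so a mark every 10 rows.
print(place_marks([1_048_576] * 30))     # → [0, 10, 20, 30]
```

With fixed row-count granularity, a granule of fat rows could cover an arbitrarily large number of bytes; bounding the granule in bytes as well keeps the amount of data read per mark predictable.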

Is the task dead?
Any news about the explain query plan?

@akonyaev90 not dead, just in the backlog; not currently in progress.

Could you tell us when the explain query plan will be implemented?

Looking forward to it.

@BlackEric001 nope, not scheduled yet. Though you can already use something like this to get some idea of how a query is executed:

% clickhouse-client --send_logs_level=debug <<< 'select max(number) from numbers(100) where number % 2 union all select 123 as number;' > /dev/null
[....yandex.net] 2018.11.20 16:18:29.703878 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: (from [::1]:53819) select max(number) from numbers(100) where number % 2 union all select 123 as number
[....yandex.net] 2018.11.20 16:18:29.704895 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: Query pipeline:
Union
 Converting
  Expression
   Expression
    Aggregating
     Concat
      ConvertColumnWithDictionaryToFull
       Expression
        Filter
         Limit
          Numbers
 Converting
  Expression
   Expression
    One

[....yandex.net] 2018.11.20 16:18:29.705772 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Information> executeQuery: Read 101 rows, 801.00 B in 0.002 sec., 54998 rows/sec., 425.95 KiB/sec.

Is this still on the backlog?

@MeteHanC yes

First, we are planning to implement rich profiling info for a query: how much time is spent on each server node, in each thread, for each subquery, on CPU, disk I/O, network, and synchronization.

This task is done.

At first we want to make index granularity adaptive based on row size in bytes. For example, marks should be placed more frequently if you have fat rows. This is a separate feature.

This task is also done.

@BlackEric001 nope, not scheduled yet. Though you can already use something like this to get some idea of how a query is executed

Now you can use SET send_logs_level = 'trace' in clickhouse-client to get this info.

Now the task depends on the "Processors" feature, which is currently being implemented by @KochetovNicolai.
The current estimate is not earlier than Jan 2020.

First, we are planning to implement rich profiling info for a query: how much time is spent on each server node, in each thread, for each subquery, on CPU, disk I/O, network, and synchronization.

@alexey-milovidov Do you know which PRs cover this task?

Hello @blinkov @alexey-milovidov!
Could you please share any news about plans for this feature?

We are trying to predict each query's cost on the fly (based on its possible impact on the ClickHouse cluster: read_rows, read_bytes, parts touched, etc.) in order to determine whether it is possible to run the query or not. With the explain feature, the entire process would be much simpler.
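The kind of admission check described here could look like the following sketch. The weights, the budget, and the function names are hypothetical assumptions for illustration, not a ClickHouse API:

```python
# Hypothetical admission check: estimate a query's cost from predicted
# resource usage and refuse to run it when the estimate exceeds a budget.
# All weights, names, and the budget are illustrative assumptions.

def estimated_cost(read_rows, read_bytes, parts,
                   row_weight=1.0, byte_weight=0.001, part_weight=1000.0):
    """Combine per-resource estimates into a single scalar cost."""
    return (read_rows * row_weight
            + read_bytes * byte_weight
            + parts * part_weight)

def admit(read_rows, read_bytes, parts, budget=10_000_000):
    """Return True if the query is predicted to be cheap enough to run."""
    return estimated_cost(read_rows, read_bytes, parts) <= budget

# A small point lookup passes; a scan of billions of rows is rejected.
print(admit(10_000, 1_000_000, 3))                    # → True
print(admit(5_000_000_000, 2_000_000_000_000, 500))   # → False
```

In practice the inputs would come from whatever estimate the server can provide before execution (e.g. rows selected by the primary index), which is exactly the information an EXPLAIN-style query would expose.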

@BlackEric001 nope, not scheduled yet. Though you can already use something like this to get some idea of how a query is executed:

% clickhouse-client --send_logs_level=debug <<< 'select max(number) from numbers(100) where number % 2 union all select 123 as number;' > /dev/null
[....yandex.net] 2018.11.20 16:18:29.703878 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: (from [::1]:53819) select max(number) from numbers(100) where number % 2 union all select 123 as number
[....yandex.net] 2018.11.20 16:18:29.704895 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Debug> executeQuery: Query pipeline:
Union
 Converting
  Expression
   Expression
    Aggregating
     Concat
      ConvertColumnWithDictionaryToFull
       Expression
        Filter
         Limit
          Numbers
 Converting
  Expression
   Expression
    One

[....yandex.net] 2018.11.20 16:18:29.705772 {9665eedb-4615-4701-85e4-050681dadc57} [ 25 ] <Information> executeQuery: Read 101 rows, 801.00 B in 0.002 sec., 54998 rows/sec., 425.95 KiB/sec.

Hi, why doesn't my output include a "Query pipeline" section?

#clickhouse-client -u anyuser --ask-password --port 9000 --send_logs_level=trace <<< 'select count(*) from bp_bst.action_bst limit 10' 
Password for user (anyuser): 
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.938593 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Debug> executeQuery: (from 127.0.0.1:49731, user: anyuser) SELECT count(*) FROM bp_bst.action_bst LIMIT 10
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.938941 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): List of all grants: GRANT SHOW, EXISTS, SELECT, INSERT, ALTER, CREATE, CREATE TEMPORARY TABLE, DROP, TRUNCATE, OPTIMIZE, KILL, CREATE USER, ALTER USER, DROP USER, CREATE ROLE, DROP ROLE, CREATE POLICY, ALTER POLICY, DROP POLICY, CREATE QUOTA, ALTER QUOTA, DROP QUOTA, ROLE ADMIN, SYSTEM, dictGet(), TABLE FUNCTIONS ON *.*
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.938978 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.939122 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.939241 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> AccessRightsContext (anyuser): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst_local
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.939919 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> InterpreterSelectQuery: WithMergeableState -> WithMergeableState
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.940014 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> InterpreterSelectQuery: WithMergeableState -> Complete
1035815
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.940616 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Debug> executeQuery: (from 172.16.24.179:18502, initial_query_id: c9848987-a84f-47ba-a976-c81aa321e118) SELECT count() FROM bp_bst.action_bst_local LIMIT 10
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.940853 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Trace> AccessRightsContext (default): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst_local
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.941489 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Trace> InterpreterSelectQuery: WithMergeableState -> WithMergeableState
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.941829 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Information> executeQuery: Read 1 rows, 4.01 KiB in 0.001 sec., 867 rows/sec., 3.39 MiB/sec.
[bj2-clickhouse-all-prod-02] 2020.05.23 09:11:01.941856 [ 5470 ] {e01596ff-25c3-4e66-9cc8-499918b427f9} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.942701 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Debug> executeQuery: (from 172.16.24.179:4291, initial_query_id: c9848987-a84f-47ba-a976-c81aa321e118) SELECT count() FROM bp_bst.action_bst_local LIMIT 10
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.942921 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Trace> AccessRightsContext (default): Access granted: SELECT(D_CREATED_AT) ON bp_bst.action_bst_local
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.943468 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Trace> InterpreterSelectQuery: WithMergeableState -> WithMergeableState
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.943760 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Information> executeQuery: Read 1 rows, 4.01 KiB in 0.001 sec., 996 rows/sec., 3.90 MiB/sec.
[bj2-clickhouse-all-prod-06] 2020.05.23 09:11:01.943787 [ 21214 ] {67b0250c-c8ac-45f5-9291-3c6862ff0ed5} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.943968 [ 28378 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> Aggregator: Merging partially aggregated blocks (bucket = -1).
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.944035 [ 28378 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Trace> Aggregator: Merged partially aggregated blocks. 1 rows, 0.000 MiB. in 0.000 sec. (74233.539 rows/sec., 0.566 MiB/sec.)
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.944297 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Information> executeQuery: Read 3 rows, 12.02 KiB in 0.006 sec., 535 rows/sec., 2.10 MiB/sec.
[bj2-clickhouse-all-prod-01] 2020.05.23 09:11:01.944339 [ 23342 ] {c9848987-a84f-47ba-a976-c81aa321e118} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.


Hello, I'd like to understand the explain output for queries on a cluster: what are ParallelAggregating, PartialSorting, and Concat, and how is the query executed on the cluster?

I'd like to understand the explain output for queries on a cluster

Explain for queries on a cluster is not implemented yet. This feature will be added later.

what are ParallelAggregating, PartialSorting, and Concat

These are different parts of the query pipeline. The corresponding classes can be found in the source code and are useful mainly for developers. It would be difficult to maintain documentation for them because the code is constantly changing. I might possibly add another setting for a detailed text description, but I'm not 100% sure it's reasonable.

For those you asked about:
ParallelAggregating is the first step of GROUP BY, when data is added into hash tables in parallel (one hash table per thread). It also calculates intermediate aggregation states; e.g. if you calculate count(), the hash table will store a countState for each key.
PartialSorting sorts blocks of data and cuts off LIMIT rows if needed. These sorted blocks are usually processed by MergeSorting, which returns a globally sorted data stream.
Concat just concatenates several data streams into one, reading all the data from the streams sequentially.
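The behavior of these steps can be modeled in a few lines. This is an illustrative sketch of the semantics described above (streams modeled as lists of blocks), not ClickHouse code:

```python
# Illustrative model (not ClickHouse source) of three pipeline steps:
# PartialSorting, MergeSorting, and Concat. A "stream" is a list of blocks.
import heapq
from itertools import chain

def partial_sorting(blocks, key=None, limit=None):
    """Sort each block independently; with a LIMIT, keep only that many
    rows per block (the global cut happens after merging)."""
    out = []
    for block in blocks:
        block = sorted(block, key=key)
        out.append(block[:limit] if limit is not None else block)
    return out

def merge_sorting(blocks, key=None, limit=None):
    """Merge already-sorted blocks into one globally sorted stream."""
    merged = list(heapq.merge(*blocks, key=key))
    return merged[:limit] if limit is not None else merged

def concat(*streams):
    """Read all the data from several streams, one after another."""
    return list(chain.from_iterable(streams))

blocks = [[3, 1, 2], [9, 7], [5, 4, 8]]
print(merge_sorting(partial_sorting(blocks, limit=3), limit=3))  # → [1, 2, 3]
```

Note how PartialSorting can already drop rows beyond the LIMIT inside each block, which is why ORDER BY ... LIMIT queries are much cheaper than a full global sort.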
