ClickHouse: Significantly different results when running clickhouse-benchmark against a local, a local replicated, and a distributed table.

Created on 28 Aug 2019 · 6 comments · Source: ClickHouse/ClickHouse

This is a clarification question to determine whether I am doing something wrong or not using the clickhouse-benchmark tool as intended.

After setting up a table with roughly 3 million rows and running the benchmark with a trivial SELECT query, I saw ~900 QPS against my locally defined table. Performing the same query on a local table that is replicated (2 replicas, one remote), this number dropped to around ~600 QPS. Lastly, using a distributed table, it dropped to ~300 QPS.

Is it expected to lose throughput when querying a replicated and/or distributed table? I'll leave the table definitions here, and understand there are quite a few variables at play.

I will say the queries themselves return a small result set, around a few hundred rows.

Table Schema

CREATE TABLE IF NOT EXISTS time_table_schema (
  symbol String,
  date Date,
  data String
) ENGINE=MergeTree() PARTITION BY substring(symbol,1,1) ORDER BY (symbol,date)

Local Table

CREATE TABLE IF NOT EXISTS cluster_shard_1.d_time_table AS default.time_table_schema
ENGINE = MergeTree()
PARTITION BY substring(symbol,1,1)
ORDER BY (symbol,date_time)
SETTINGS index_granularity = 128

Replicated Table (Queries performed against local copy)

CREATE TABLE IF NOT EXISTS cluster_shard_1.d_time_table AS default.time_table_schema
ENGINE = ReplicatedMergeTree(
  '/clickhouse/{cluster}/tables/d_time_table/1', '{replica}')
PARTITION BY substring(symbol,1,1)
ORDER BY (symbol,date)

Distributed Table (Queries against 3 machines)

CREATE TABLE IF NOT EXISTS default.time_table AS default.time_table_schema
ENGINE = Distributed('shard_3_replica_2', '', 'd_time_table', rand());

All 6 comments

Local Table
SETTINGS index_granularity = 128 !!!!

clickhouse-benchmark runs queries continuously using the specified number of connections (one by default, controlled by the -c argument). If you run it with a single connection (the default), it will perform a single query at any moment in time and measure latency, not throughput. Latency will include network delays, but if you run it on localhost, there should be no difference between replicated and non-replicated tables.
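For example, a sketch of comparing the two modes against a local server (the table name and symbol value are taken from this thread and are only illustrative):

```shell
# Single connection (default): measures per-query latency end to end.
echo "SELECT * FROM time_table WHERE symbol = 'BLAH'" \
  | clickhouse-benchmark

# 16 concurrent connections: closer to a throughput measurement.
echo "SELECT * FROM time_table WHERE symbol = 'BLAH'" \
  | clickhouse-benchmark -c 16
```

clickhouse-benchmark reads queries from stdin and reports QPS and latency percentiles for each run.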

The difference may be associated with a different number of data parts - for example, if you loaded data into one table yesterday and into another table just now, the latter table will have its data spread across a larger number of parts.
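One way to check this is to compare the active part counts via the system.parts table (the table-name filter below is an assumption based on the names used in this thread):

```shell
# More active parts usually means slower point lookups,
# until background merges reduce the count.
clickhouse-client --query "
  SELECT database, table, count() AS active_parts
  FROM system.parts
  WHERE active AND table LIKE '%time_table%'
  GROUP BY database, table"
```

If one table has far more parts, running the benchmark again after merges settle (or after an explicit OPTIMIZE TABLE) gives a fairer comparison.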

Run clickhouse-benchmark with perf top (in another terminal) for comparison.
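A minimal sketch of that setup, assuming the server process is named clickhouse-server and perf is installed:

```shell
# Terminal 1: run the benchmark as above.
# Terminal 2: sample where the server spends CPU time while under load.
sudo perf top -p "$(pidof clickhouse-server)"
```

Comparing the perf profiles of the two runs shows whether the extra time goes to network/distributed coordination or to reading more data parts.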

ok thanks.

Also related to this:

Local Table
SETTINGS index_granularity = 128 !!!!

I was testing a lower index granularity for small result sets/single-record lookups based on these posts:

https://www.altinity.com/blog/clickhouse-in-the-storm-part-1
https://www.altinity.com/blog/clickhouse-in-the-storm-part-2

When all tables were configured with the default index_granularity of 8192, I saw the same relative slowdowns, along with a significantly lower QPS overall.

Still trying to figure out if my config bashing did more harm than good.

BTW,

PARTITION BY substring(symbol,1,1)
ORDER BY (symbol,date_time)

I hope your select looks like select .... from ... where symbol =
You don't need substring(symbol,1,1) in the WHERE clause; using it will reduce performance.

@den-crane yes the query is of the form:

select * from time_table where symbol='BLAH'

Going to close this issue, as I believe I was reading the benchmark results incorrectly. While a local table query showed X QPS and the distributed query showed X/3 per node, the reported total QPS across all nodes was X, which makes sense.

Thanks for your help.
