Timescaledb: Significantly slower than vanilla postgres

Created on 30 Nov 2017  路  12Comments  路  Source: timescale/timescaledb

Hey,

I gave timescale a try due to the possible performance gains you get for time series data. Unfortunately it's way less performant (30-40x) than vanilla postgres for me. I already tried playing with some parameters / indexes but with no luck. I tested it on a 4gb 2 vcpus machine.

I want to store OHCL data in it.

The table looks like this:

CREATE TABLE IF NOT EXISTS "ohcl" (
    "time" TIMESTAMP WITHOUT TIME ZONE NOT NULL,
    "source" VARCHAR(32) NOT NULL,
    "currency" VARCHAR(16) NOT NULL,
    "unit" VARCHAR(16) NOT NULL,
    "high" REAL,
    "low" REAL,
    "open" REAL,
    "volume_from" REAL,
    "volume_to" REAL,
    "close" REAL
)

That's how I created the hypertable

SELECT create_hypertable('ohcl', 'time', chunk_time_interval => 3600000000);

And I got a unique index on

CREATE UNIQUE INDEX ON ohcl (time desc, source, currency, unit);

I also tried adding indexes for the rows I filter on.

CREATE INDEX ON ohcl (source);
CREATE INDEX ON ohcl (currency);
CREATE INDEX ON ohcl (unit);

This query takes ~75ms in postgres and ~3500ms in timescale.

SELECT time FROM ohcl
WHERE ohcl.source = 'q7z'
  AND ohcl.currency = 'EUR'
  AND ohcl.unit = 'USD'
ORDER BY time ASC LIMIT 1;

What am I doing wrong? Or how can I find out? Would like to get that query down to 20ms :D

Most helpful comment

Any update on that? Timescaledb queries are extremely slow than vanilla...

All 12 comments

Could you provide an EXPLAIN ANALYZE output for the query:

EXPLAIN ANALYZE SELECT time FROM ohcl
WHERE ohcl.source = 'q7z'
  AND ohcl.currency = 'EUR'
  AND ohcl.unit = 'USD'
ORDER BY time ASC LIMIT 1;

Just guessing, but ordering by time and the filter predicates are not compatible with your indices. You need to create compound indices including time, as described in the manual: http://docs.timescale.com/latest/using-timescaledb/schema-management.

@RobAtticus

Vanilla

Importing Rows ~170k: ~5s

Indexes:

CREATE UNIQUE INDEX ohcl_pg_time_source_currency_unit_idx ON ohcl_pg USING btree ("time" DESC, source, currency, unit)

Query:

EXPLAIN ANALYSE SELECT time FROM ohcl_pg
WHERE ohcl_pg.source = 'q7z'
  AND ohcl_pg.currency = 'EUR'
  AND ohcl_pg.unit = 'USD'
ORDER BY time ASC LIMIT 1;

Execution:

Seq Scan on ohcl_pg  (cost=0.00..4744.82 rows=4081 width=8) (actual time=0.021..24.158 rows=15865 loops=1)
  Filter: (((source)::text = 'q7z'::text) AND ((currency)::text = 'EUR'::text) AND ((unit)::text = 'USD'::text))
  Rows Removed by Filter: 158582
Planning time: 0.130 ms
Execution time: 25.298 ms

Timescale

Importing Rows ~170k: ~5min

Indexes:

CREATE INDEX ohcl_ts_time_idx ON ohcl_ts USING btree ("time" DESC)
CREATE UNIQUE INDEX ohcl_ts_time_source_currency_unit_idx ON ohcl_ts USING btree ("time" DESC, source, currency, unit)

Query:

EXPLAIN ANALYSE SELECT time FROM ohcl_pg
WHERE ohcl_pg.source = 'q7z'
  AND ohcl_pg.currency = 'EUR'
  AND ohcl_pg.unit = 'USD'
ORDER BY time ASC LIMIT 1;

Execution:

[53200] ERROR: out of shared memory Hint: You might need to increase max_locks_per_transaction.

This test was run with ~10x more data than my first post. By now I can't even query the table anymore.

@juestr Didn't I do that already by using?

CREATE UNIQUE INDEX ON ohcl (time desc, source, currency, unit);

I missed that line sorry. You are querying somewhat unusually for time ASC, the start of time. Not sure if PG can "reverse" DESC from the index here.

I wouldn't include all filter fields in the index either, this makes the index much larger for likely little selectivity gain, which may induce PG to skip it.

I am no expert on timescale, but as a general rule, have you run VACUUM ANALYSE after your bulk import?

@21stio is it possible for you to back to the smaller dataset size and run the query? At worst TimescaleDB should be a few msec slower for that original query, not 500x slower so we should start there before going to a larger dataset.

@21stio Any follow up on this? Would love to help figure this out

Hey, sorry had a busy time. We decided to stick with vanilla postgres as it satisfies our performance requirements for now. Unfortunately I don't have the time right now to run further experiments. Thx a lot for your support! :)

I also incurred in this issue just today. It is not really resolved.
I have stock data with (time, symbol, close) for 3M rows.
Running

SELECT close FROM stocks WHERE symbol = 'AAPL' 
AND time > date1 AND time < date2 ORDER BY time

was slow, and I was surpised I had to order by time. I added an index (time, symbol) but it gets even slower (?!). After a while the same ERROR: out of shared memory Hint: You might need to increase max_locks_per_transaction. as reported by @ppwfx .

@ptrcarta You really want a (symbol, time) index not a (time, symbol) index. If you want to include close it should be (symbol, time, close) to get an index-only scan. https://blog.timescale.com/use-composite-indexes-to-speed-up-time-series-queries-sql-8ca2df6b3aaa/

this doesn't explain the errors. Indeed I tried both index combinations but to no avail.

Any update on that? Timescaledb queries are extremely slow than vanilla...

Was this page helpful?
0 / 5 - 0 ratings