Timescaledb: Significantly slower than vanilla postgres

Created on 30 Nov 2017 · 12Comments · Source: timescale/timescaledb

Hey,

I gave timescale a try due to the possible performance gains you get for time series data. Unfortunately it's way less performant (30-40x) than vanilla postgres for me. I already tried playing with some parameters / indexes but with no luck. I tested it on a 4gb 2 vcpus machine.

I want to store OHCL data in it.

The table looks like this:

CREATE TABLE IF NOT EXISTS "ohcl" (
    "time" TIMESTAMP WITHOUT TIME ZONE NOT NULL,
    "source" VARCHAR(32) NOT NULL,
    "currency" VARCHAR(16) NOT NULL,
    "unit" VARCHAR(16) NOT NULL,
    "high" REAL,
    "low" REAL,
    "open" REAL,
    "volume_from" REAL,
    "volume_to" REAL,
    "close" REAL
)

That's how I created the hypertable

SELECT create_hypertable('ohcl', 'time', chunk_time_interval => 3600000000);

And I got a unique index on

CREATE UNIQUE INDEX ON ohcl (time desc, source, currency, unit);

I also tried adding indexes for the rows I filter on.

CREATE INDEX ON ohcl (source);
CREATE INDEX ON ohcl (currency);
CREATE INDEX ON ohcl (unit);

This query takes ~75ms in postgres and ~3500ms in timescale.

SELECT time FROM ohcl
WHERE ohcl.source = 'q7z'
  AND ohcl.currency = 'EUR'
  AND ohcl.unit = 'USD'
ORDER BY time ASC LIMIT 1;

What am I doing wrong? Or how can I find out? Would like to get that query down to 20ms :D

Source

ppwfx

Most helpful comment

Any update on that? Timescaledb queries are extremely slow than vanilla...

cparaskeva on 29 Feb 2020

👍4

All 12 comments

Could you provide an EXPLAIN ANALYZE output for the query:

EXPLAIN ANALYZE SELECT time FROM ohcl
WHERE ohcl.source = 'q7z'
  AND ohcl.currency = 'EUR'
  AND ohcl.unit = 'USD'
ORDER BY time ASC LIMIT 1;

RobAtticus on 30 Nov 2017

Just guessing, but ordering by time and the filter predicates are not compatible with your indices. You need to create compound indices including time, as described in the manual: http://docs.timescale.com/latest/using-timescaledb/schema-management.

juestr on 30 Nov 2017

@RobAtticus

Vanilla

Importing Rows ~170k: ~5s

Indexes:

CREATE UNIQUE INDEX ohcl_pg_time_source_currency_unit_idx ON ohcl_pg USING btree ("time" DESC, source, currency, unit)

Query:

EXPLAIN ANALYSE SELECT time FROM ohcl_pg
WHERE ohcl_pg.source = 'q7z'
  AND ohcl_pg.currency = 'EUR'
  AND ohcl_pg.unit = 'USD'
ORDER BY time ASC LIMIT 1;

Execution:

Seq Scan on ohcl_pg  (cost=0.00..4744.82 rows=4081 width=8) (actual time=0.021..24.158 rows=15865 loops=1)
  Filter: (((source)::text = 'q7z'::text) AND ((currency)::text = 'EUR'::text) AND ((unit)::text = 'USD'::text))
  Rows Removed by Filter: 158582
Planning time: 0.130 ms
Execution time: 25.298 ms

Timescale

Importing Rows ~170k: ~5min

Indexes:

CREATE INDEX ohcl_ts_time_idx ON ohcl_ts USING btree ("time" DESC)
CREATE UNIQUE INDEX ohcl_ts_time_source_currency_unit_idx ON ohcl_ts USING btree ("time" DESC, source, currency, unit)

Query:

EXPLAIN ANALYSE SELECT time FROM ohcl_pg
WHERE ohcl_pg.source = 'q7z'
  AND ohcl_pg.currency = 'EUR'
  AND ohcl_pg.unit = 'USD'
ORDER BY time ASC LIMIT 1;

Execution:

[53200] ERROR: out of shared memory Hint: You might need to increase max_locks_per_transaction.

This test was run with ~10x more data than my first post. By now I can't even query the table anymore.

ppwfx on 1 Dec 2017

@juestr Didn't I do that already by using?

CREATE UNIQUE INDEX ON ohcl (time desc, source, currency, unit);

ppwfx on 1 Dec 2017

I missed that line sorry. You are querying somewhat unusually for time ASC, the start of time. Not sure if PG can "reverse" DESC from the index here.

I wouldn't include all filter fields in the index either, this makes the index much larger for likely little selectivity gain, which may induce PG to skip it.

I am no expert on timescale, but as a general rule, have you run VACUUM ANALYSE after your bulk import?

juestr on 1 Dec 2017

@21stio is it possible for you to back to the smaller dataset size and run the query? At worst TimescaleDB should be a few msec slower for that original query, not 500x slower so we should start there before going to a larger dataset.

RobAtticus on 1 Dec 2017

@21stio Any follow up on this? Would love to help figure this out

RobAtticus on 7 Dec 2017

Hey, sorry had a busy time. We decided to stick with vanilla postgres as it satisfies our performance requirements for now. Unfortunately I don't have the time right now to run further experiments. Thx a lot for your support! :)

ppwfx on 16 Dec 2017

I also incurred in this issue just today. It is not really resolved.
I have stock data with (time, symbol, close) for 3M rows.
Running

SELECT close FROM stocks WHERE symbol = 'AAPL' 
AND time > date1 AND time < date2 ORDER BY time

was slow, and I was surpised I had to order by time. I added an index (time, symbol) but it gets even slower (?!). After a while the same ERROR: out of shared memory Hint: You might need to increase max_locks_per_transaction. as reported by @ppwfx .

ptrcarta on 14 Jun 2019

@ptrcarta You really want a (symbol, time) index not a (time, symbol) index. If you want to include close it should be (symbol, time, close) to get an index-only scan. https://blog.timescale.com/use-composite-indexes-to-speed-up-time-series-queries-sql-8ca2df6b3aaa/