Hey,
I gave timescale a try due to the possible performance gains you get for time series data. Unfortunately it's way less performant (30-40x) than vanilla postgres for me. I already tried playing with some parameters / indexes but with no luck. I tested it on a 4gb 2 vcpus machine.
I want to store OHCL data in it.
The table looks like this:
CREATE TABLE IF NOT EXISTS "ohcl" (
"time" TIMESTAMP WITHOUT TIME ZONE NOT NULL,
"source" VARCHAR(32) NOT NULL,
"currency" VARCHAR(16) NOT NULL,
"unit" VARCHAR(16) NOT NULL,
"high" REAL,
"low" REAL,
"open" REAL,
"volume_from" REAL,
"volume_to" REAL,
"close" REAL
)
That's how I created the hypertable
SELECT create_hypertable('ohcl', 'time', chunk_time_interval => 3600000000);
And I got a unique index on
CREATE UNIQUE INDEX ON ohcl (time desc, source, currency, unit);
I also tried adding indexes for the rows I filter on.
CREATE INDEX ON ohcl (source);
CREATE INDEX ON ohcl (currency);
CREATE INDEX ON ohcl (unit);
This query takes ~75ms in postgres and ~3500ms in timescale.
SELECT time FROM ohcl
WHERE ohcl.source = 'q7z'
AND ohcl.currency = 'EUR'
AND ohcl.unit = 'USD'
ORDER BY time ASC LIMIT 1;
What am I doing wrong? Or how can I find out? Would like to get that query down to 20ms :D
Could you provide an EXPLAIN ANALYZE output for the query:
EXPLAIN ANALYZE SELECT time FROM ohcl
WHERE ohcl.source = 'q7z'
AND ohcl.currency = 'EUR'
AND ohcl.unit = 'USD'
ORDER BY time ASC LIMIT 1;
Just guessing, but ordering by time and the filter predicates are not compatible with your indices. You need to create compound indices including time, as described in the manual: http://docs.timescale.com/latest/using-timescaledb/schema-management.
@RobAtticus
Importing Rows ~170k: ~5s
Indexes:
CREATE UNIQUE INDEX ohcl_pg_time_source_currency_unit_idx ON ohcl_pg USING btree ("time" DESC, source, currency, unit)
Query:
EXPLAIN ANALYSE SELECT time FROM ohcl_pg
WHERE ohcl_pg.source = 'q7z'
AND ohcl_pg.currency = 'EUR'
AND ohcl_pg.unit = 'USD'
ORDER BY time ASC LIMIT 1;
Execution:
Seq Scan on ohcl_pg (cost=0.00..4744.82 rows=4081 width=8) (actual time=0.021..24.158 rows=15865 loops=1)
Filter: (((source)::text = 'q7z'::text) AND ((currency)::text = 'EUR'::text) AND ((unit)::text = 'USD'::text))
Rows Removed by Filter: 158582
Planning time: 0.130 ms
Execution time: 25.298 ms
Importing Rows ~170k: ~5min
Indexes:
CREATE INDEX ohcl_ts_time_idx ON ohcl_ts USING btree ("time" DESC)
CREATE UNIQUE INDEX ohcl_ts_time_source_currency_unit_idx ON ohcl_ts USING btree ("time" DESC, source, currency, unit)
Query:
EXPLAIN ANALYSE SELECT time FROM ohcl_pg
WHERE ohcl_pg.source = 'q7z'
AND ohcl_pg.currency = 'EUR'
AND ohcl_pg.unit = 'USD'
ORDER BY time ASC LIMIT 1;
Execution:
[53200] ERROR: out of shared memory Hint: You might need to increase max_locks_per_transaction.
This test was run with ~10x more data than my first post. By now I can't even query the table anymore.
@juestr Didn't I do that already by using?
CREATE UNIQUE INDEX ON ohcl (time desc, source, currency, unit);
I missed that line sorry. You are querying somewhat unusually for time ASC, the start of time. Not sure if PG can "reverse" DESC from the index here.
I wouldn't include all filter fields in the index either, this makes the index much larger for likely little selectivity gain, which may induce PG to skip it.
I am no expert on timescale, but as a general rule, have you run VACUUM ANALYSE after your bulk import?
@21stio is it possible for you to back to the smaller dataset size and run the query? At worst TimescaleDB should be a few msec slower for that original query, not 500x slower so we should start there before going to a larger dataset.
@21stio Any follow up on this? Would love to help figure this out
Hey, sorry had a busy time. We decided to stick with vanilla postgres as it satisfies our performance requirements for now. Unfortunately I don't have the time right now to run further experiments. Thx a lot for your support! :)
I also incurred in this issue just today. It is not really resolved.
I have stock data with (time, symbol, close) for 3M rows.
Running
SELECT close FROM stocks WHERE symbol = 'AAPL'
AND time > date1 AND time < date2 ORDER BY time
was slow, and I was surpised I had to order by time. I added an index (time, symbol) but it gets even slower (?!). After a while the same ERROR: out of shared memory Hint: You might need to increase max_locks_per_transaction. as reported by @ppwfx .
@ptrcarta You really want a (symbol, time) index not a (time, symbol) index. If you want to include close it should be (symbol, time, close) to get an index-only scan. https://blog.timescale.com/use-composite-indexes-to-speed-up-time-series-queries-sql-8ca2df6b3aaa/
this doesn't explain the errors. Indeed I tried both index combinations but to no avail.
Any update on that? Timescaledb queries are extremely slow than vanilla...
Most helpful comment
Any update on that? Timescaledb queries are extremely slow than vanilla...