How can we benchmark Timescale? Is there a standard set of tests?
Can we run the standard PG Bench Testing against both timescale and PostgreSQL to get a comparison?
Or is there some specific benchmarking, as mentioned in issue #12 ?
Hi @cdesch thanks for reaching out. We have a TimescaleDB vs PostgreSQL benchmark repo for benchmarking TimescaleDB and PG.
Hi @stalltron, In the sample small device sample data set there are two tables device_info and device_readings. In the SQL schema with it, there is no indexed relationship associating the device_id as a foriegn_key between the tables.
In the Benchmark Repo you linked to, the sample dataset doesn't have a similar relationship between two tables, although could have one since the CPU hosts have their own identifier.
Will having that kind of foreign key relationship affect the performance of the DB in Timescale? For example, if I'm streaming device_readings in and the device_info doesn't exist, that would cause an issue in the database integrity. How does timescale tackle this use case? Am I missing something that you can point me to?
Is there a way, if we use UUID as the ID for the device, can we continue to stream in the device_readings and in the event a device_info doesn't exist, it is created? This way the data integrity is maintained and future inserts of device_readings will be associated correctly to the proper device.
I think part of the problem is determining where to handle it. Should it be handled in the DB or in the application code. If I have to look up the device_info before every insert of a device_readings in the application, that will really slow us down as that is two queries for each insert. It also slows us down to just attempt an insert, and if it fails because device_info didn't exist to then roll it back, insert device_info and then insert device_reading.
Hey @cdesch , we have a section in our docs on this under Reading Data:
http://docs.timescale.com/v0.8/using-timescaledb/reading-data#last-point
There you'll find a trigger function you can set on your hypertable so that for each row it makes sure that a metadata table is updated. As for performance we have no run those particular benchmarks yet.
Hi @cdesch: And regarding your question about pgbench, the reason we have developed our own benchmarks is that pgbench is designed for stress testing an OLTP server (~TPC-B), while TimescaleDB focuses on time-series workloads (which are very different than OLTP).
@stalltron I implemented (changed to list):
device_info was the parent and device_readings was the child. device_readings referenced device_info via a normal SQL indexed relationship. device_readings) in the relationshipSELECT lookup on the ID of the parent record (device_info) before inserting new child records (device_readings)device_info used a type uuid index for id and was referenced in device_readings as a type uuid as device_info_id.The performance difference between processing CSV files was dramatic! The new configuration vs the old configuration resulted in a 121.26% increase in performance. Each CSV I had was roughly 1598892 (give or take about 5k)
| Config | Hours | Minutes | Rows Processed | Row/Time Process Ratio (rows per millisecond)
| ------------- |:--------:| -----:| ------------:| ------------:|
| Old | 6.41 | 385| 17563938 | 1.3151948 |
| New | 2.9| 174 | 17563938 | 0.59439973 |
New PG Config:
# DB Version: 9.6
# OS Type: linux
# DB Type: web
# Total Memory (RAM): 64 GB
# Number of Connections: 200
max_connections = 200
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 83886kB
maintenance_work_mem = 2GB
min_wal_size = 1GB
max_wal_size = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
I'd like to get the performance testing to be a bit more refined. I may try one run of having just timescaledb without the pg tuner config enhancements, as well as one trial of just the pg tuner config enhancements without timescaledb.
@stalltron Do you have any suggestions on more accurately benchmarking the performance? There may have been more extraneous factors in my result that may normalize with better data/configuration/standards. For example, I streamed my data source from csv.gz files that were on a USB 3.0 external HDD on a computer that only has USB 2.0 support. I also understand that my implementation of Elixir and Ecto probably had an impact in comparison to someone who took the same data set and streamed it in on flat .csv that was on the same SSD or possibly more optimized (striped SSD or PCI-E SSD). I also noted that in processing the files that most of the threads (24) on the system never really peaked above 40% usage and instead hovered mainly between 25% to 30% (observation from watching the System Monitor). The XEON's in this workstation are 2Ghz, although they were not at capacity (maybe a config or a different bottleneck I'm overlooking). Looking at the BEAM observer, the biggest bottleneck ~appeared~ (I'm knot knowledgeable enough in this area (Linux performance) to determine if this is actually what it is) to be IO reads/writes to disk as this had the biggest bursts of activity to 100% in regular 0.5s intervals.
I'm very excited about the test results as there was a dramatic improvement in performance, although I feel there is probably still a bit of more optimization that can be done. Given the right conditions/configuration, I think we could see a 200% increase in performance. Proper testing and comparison will help figure out where the optimizations need to take place. E.g. Is it a hardware bottleneck, Ubuntu bottleneck, Elixir/Erlang bottleneck, PostgreSQL bottleneck, or TimescaleDB bottleneck?
Other stats include:
Ubuntu 16.04
2 x XEON E5-2620 @ 2.00 Ghz x24
64 gigs of ECC Ram
Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500B/AM) - (6GB/s)
Elixir 1.6.1 && Erlang 20.0
PG 9.6
Ecto Pool Size: 15
CSV Decode: Num Worker: 10
PoolBoy Size: 7
PoolBoy Max Overflow: 2
PoolBoy Setup
Edit:
Correction
| Config | Hours | Minutes | Milliseconds | Rows Processed | Row/Time Process Ratio (rows per millisecond)
| ------------- |:--------:| -----:| ------------:| ------------:| ------------:|
| Old | 6.41 | 385| 23100000.00 | 17563938 | 0.76 |
| New | 2.9| 174 | 10440000.00 | 17563938 | 1.68 |
Hey @cdesch,
Nice improvement! I looked over your setup and I'm not sure specifically where we could tune things to improve further. The pgtune stats for memory look roughly like what I'd expect and so does the rest of your setup. Some random thoughts that you may try:
index_size column) and adjust from there.Otherwise you may just have to play around with different settings and see what works out best.
@cdesch
Row/Time Process Ratio (rows per millisecond)
It should be milliseconds/row, right? Otherwise old is faster than new.
@bnil I think those two numbers were swapped by accident. But yes, It should be in milliseconds. I'll correct it.
I don't think there are any actionable items left in this issue, so closing it out.
We also open-sourced our Time Series Benchmarking Suite several months ago:
https://github.com/timescale/tsbs
https://blog.timescale.com/time-series-database-benchmarks-timescaledb-influxdb-cassandra-mongodb-bc702b72927e