Clickhouse: What is the recommended process for streaming data from PostgreSQL to ClickHouse?

Created on 9 Dec 2018  Â·  4Comments  Â·  Source: ClickHouse/ClickHouse

I have been researching multiple ETL PaaS solutions, however I have not found one that would support PostgreSQL and ClickHouse.

Is there a recommended process for automating data feed from PostgreSQL to ClickHouse?

comp-foreign-db question

Most helpful comment

There is no 'officially recommended' way here, different companies use different products, some of those integrations are open-sourced, some not, not all of them are mature enough. ClickHouse developers do not test/certify those solutions.

Please check
https://clickhouse.yandex/docs/en/interfaces/third-party/integrations/
https://clickhouse.yandex/docs/en/interfaces/third-party/client_libraries/

So no silver bullets here (for now) and TMTOWTDI as perl-people say. :)

You can try to ask something like: "how to use clickhouse together with _your favorite ETL solution_".
Or you can create your own exporter using some of the ClickHouse libraries.

In simplest case it can be done even with trivial bash scripts in cron, as easy as

psql -U user -d db_name -c "copy (select col1, col2 from src_table LIMIT 10) to STDOUT with CSV DELIMITER ',';" | clickhouse-client -u ch_user --password pwd -d ch_database --query 'INSERT INTO table(col1,col2) FORMAT CSV'

Also you can select from Postgres directly from Clickhouse (via ODBC driver, see https://www.altinity.com/blog/2018/9/20/using-odbc-with-clickhouse). And do something like INSERT INTO dst_table SELECT FROM src_postgres_table WHERE timestamp > now()-60

All 4 comments

There is no 'officially recommended' way here, different companies use different products, some of those integrations are open-sourced, some not, not all of them are mature enough. ClickHouse developers do not test/certify those solutions.

Please check
https://clickhouse.yandex/docs/en/interfaces/third-party/integrations/
https://clickhouse.yandex/docs/en/interfaces/third-party/client_libraries/

So no silver bullets here (for now) and TMTOWTDI as perl-people say. :)

You can try to ask something like: "how to use clickhouse together with _your favorite ETL solution_".
Or you can create your own exporter using some of the ClickHouse libraries.

In simplest case it can be done even with trivial bash scripts in cron, as easy as

psql -U user -d db_name -c "copy (select col1, col2 from src_table LIMIT 10) to STDOUT with CSV DELIMITER ',';" | clickhouse-client -u ch_user --password pwd -d ch_database --query 'INSERT INTO table(col1,col2) FORMAT CSV'

Also you can select from Postgres directly from Clickhouse (via ODBC driver, see https://www.altinity.com/blog/2018/9/20/using-odbc-with-clickhouse). And do something like INSERT INTO dst_table SELECT FROM src_postgres_table WHERE timestamp > now()-60

Thank you. I will report back what solution we ended up using.

if you're using postgresql >= 10, you might look into pg2ch project

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Was this page helpful?
0 / 5 - 0 ratings