Hey there, Gunnar here from the Debezium team.
We're working on a range of CDC connectors for different databases, Postgres being one of them. Do you have any details to share already on the Yugabyte CDC support? Is it compatible with Postgres Logical Decoding? Curious about any details :) Or perhaps you'd even be interested in re-using the event structure created by Debezium? This could help folks dealing with connectors for different databases. Happy to exchange and discuss, if you're interested.
Hi @gunnarmorling, we are just starting the work on CDC and we'll make a design doc for this available soon. We'd love to learn more about Debezium! Can you please join the YugaByte Slack channel and send me a DM so we can set up some time to chat?
To answer your question on whether it'll be compatible with Postgres Logical Decoding - no, we are planning on building a custom solution that will leverage YugaByte DB's DocDB WAL.
Hi @gunnarmorling, wanted to touch base on this again. We'd like to have a chat and learn more about Debezium. Let me know how you'd like to move forward.
Hey @ndeodhar, perhaps you could drop by the Debezium dev chat and we could talk about it there? There are a couple of things that could be done, e.g. using the same message format as Debezium does, to make life easier for consumers that use multiple CDC connectors.
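To illustrate why a shared message format helps consumers, here is a minimal sketch of a handler written against the Debezium change event envelope, whose payload carries `before`, `after`, `source`, `op` and `ts_ms` fields. The `apply_change` function, the in-memory table, and the `id` primary key column are assumptions made up for this example; nothing here reflects what YugaByte's CDC output actually looks like.

```python
from typing import Any, Dict

Row = Dict[str, Any]


def apply_change(table: Dict[Any, Row], event: Dict[str, Any]) -> None:
    """Apply one Debezium-style change event to an in-memory table.

    Assumes (hypothetically) that every row has an "id" primary key column.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # create, snapshot read, and update all carry the new row state in "after"
        row = event["after"]
        table[row["id"]] = row
    elif op == "d":
        # delete carries the old row state in "before"
        table.pop(event["before"]["id"], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


# The same handler works no matter which connector produced the events,
# as long as they follow the same envelope.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "a"},
     "source": {"connector": "postgresql"}, "ts_ms": 1},
    {"op": "u", "before": {"id": 1, "name": "a"}, "after": {"id": 1, "name": "b"},
     "source": {"connector": "postgresql"}, "ts_ms": 2},
    {"op": "c", "before": None, "after": {"id": 2, "name": "c"},
     "source": {"connector": "postgresql"}, "ts_ms": 3},
    {"op": "d", "before": {"id": 2, "name": "c"}, "after": None,
     "source": {"connector": "postgresql"}, "ts_ms": 4},
]

table: Dict[Any, Row] = {}
for e in events:
    apply_change(table, e)
```

A consumer written this way only branches on `op` and reads `before`/`after`, so a connector for another database that emits the same envelope could be swapped in without touching the handler.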
Was reminded of this by the recent news of the YugaByte open-sourcing. Great move! Is there anything I could read up regarding the CDC API(s) in YugaByte?
Hi @gunnarmorling, we are starting to put together documentation on CDC here:
https://github.com/YugaByte/yugabyte-db/blob/master/architecture/design/docdb-change-data-capture.md (Note that this is still a work in progress, and the CDC project is currently at an early stage.)
There's also some documentation on Data Center replication (which will use CDC under the hood) here: https://github.com/YugaByte/yugabyte-db/blob/master/architecture/design/multi-region-2DC-deployment.md
If you are interested in learning more about the underlying RPC APIs, you can find relevant code here:
https://github.com/YugaByte/yugabyte-db/blob/master/src/yb/cdc/cdc_service.proto
https://github.com/YugaByte/yugabyte-db/tree/master/ent/src/yb/cdc
As I understand it, the connector must be started standalone. How is connector downtime handled? Is any data loss possible? Is there an option for higher availability? For example, instead of starting the Kafka connector manually, there could be some sort of setting, as in CockroachDB:
CREATE CHANGEFEED FOR TABLE table_name, table_name2 INTO 'scheme://host:port';