Protobuf is a binary serialization format similar to Cap'n Proto. It is widely used in the Go ecosystem in particular and is a popular serialization format for Kafka messages.
Example use case: I'm working on an open-source flow record analyzer and I'm planning to use Protobuf. Right now I'm using a separate inserter process that reads messages from Kafka and inserts them into ClickHouse. In the long term, however, it would be desirable to skip that step and use the Kafka engine instead.
While Protobuf usually requires code generation, there's support for reading a schema at runtime using DynamicMessage. It's not ideal from a performance point of view, but it should still be faster than JSON.
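To illustrate what decoding against a schema known only at runtime involves, here is a minimal sketch in plain Python. It is not the DynamicMessage API and not how any particular library does it: it hand-decodes a small subset of the Protobuf wire format (varint and length-delimited fields only), and the `FLOW_SCHEMA` dictionary and its field names are hypothetical stand-ins for a flow record.

```python
# Illustrative sketch only: a hand-rolled, schema-driven decoder for a small
# subset of the Protobuf wire format. A real deployment would use a Protobuf
# library; the schema and field names below are hypothetical.


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def decode(buf, schema):
    """Decode buf using a schema supplied at runtime.

    schema maps field number -> (name, kind), with kind "uint" or "string".
    Fields that are not in the schema are skipped.
    """
    record = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if field_number in schema:
            name, kind = schema[field_number]
            record[name] = value.decode("utf-8") if kind == "string" else value
    return record


# Hypothetical flow-record schema, as it might be loaded from a file at runtime.
FLOW_SCHEMA = {
    1: ("src_addr", "string"),
    2: ("src_port", "uint"),
    3: ("bytes", "uint"),
}

# Hand-encoded message: field 1 = "10.0.0.1", field 2 = 443, field 3 = 1500.
message = bytes([0x0A, 0x08]) + b"10.0.0.1" + bytes([0x10, 0xBB, 0x03, 0x18, 0xDC, 0x0B])
record = decode(message, FLOW_SCHEMA)
print(record)
```

The per-field dictionary lookup is the kind of overhead that generated code avoids, which is why a runtime-schema approach is expected to be slower than code generation while still avoiding text parsing.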
The Cap'n Proto implementation works similarly.
Please check the ClickHouse roadmap: https://clickhouse.yandex/docs/en/roadmap/#q2-2018 ;)
And you can always help with that by contributing. :)
I'm currently working on Protobuf support. The first testable implementation should be available at the end of this week. Stay tuned (:
Hello
Any progress on Protobuf message support? A DWH that natively supports storing/querying Protobuf messages is the most in-demand feature ;-)
This task has been reassigned and is scheduled for January.
> A DWH that natively supports storing/querying Protobuf messages is the most in-demand feature ;-)
It will allow importing, exporting, and storing a subset of Protobuf messages. Nested messages will be limited to Arrays.
Thanks for the quick response.
> Nested messages will be limited to Arrays.
Does that mean only repeated fields are going to be supported? Or will a nested field be exposed as an array of size 1?
The Protobuf output format (export from ClickHouse) has been implemented in master.
Nested messages are not yet supported.
We are going to implement the Protobuf input format first, and only then consider implementing nested messages.
The Protobuf input/output format has been implemented in master. All types of fields (repeated/optional/required) and nested messages are supported now. See the documentation for details.