ClickHouse: Support for the Apache Arrow format?

Created on 8 Jul 2020  ·  2Comments  ·  Source: ClickHouse/ClickHouse

Hi, which version of ClickHouse fully supports the Arrow format? I couldn't find Apache Arrow in the changelog.

From the documentation:

Apache Arrow comes with two built-in columnar storage formats. ClickHouse supports read and write operations for these formats. Arrow is Apache Arrow’s “file mode” format. It is designed for in-memory random access.

When you say "file mode" format, do you mean the Apache Arrow native file (stream) format?
I tried:

SELECT rowno, c_quantity FROM DAT_242_1 FORMAT Arrow

I got back
Exception on client:
Code: 73. DB::Exception: Unknown format Arrow
ClickHouse server version 19.15.3 revision 54426

Could you also add an example to the documentation that shows writing data in that format and reading it back with the ClickHouse client?
That would be great, thanks.

comp-formats question question-answered

Most helpful comment

@healiseu first of all, you are using an almost one-year-old version while looking at documentation intended for master. Unfortunately, the documentation still lags behind the code, so the best place to look for examples is the tests folder.

So the format you are looking for is probably ArrowStream: https://github.com/ClickHouse/ClickHouse/blob/4df6d41457cf61ceecad853c18ac945322117bdc/tests/queries/0_stateless/01273_arrow_stream.sh
And here is the regular Arrow format: https://github.com/ClickHouse/ClickHouse/blob/4df6d41457cf61ceecad853c18ac945322117bdc/tests/queries/0_stateless/01273_arrow.sh
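
For readers who want the round trip asked for in the question, here is a minimal sketch in the spirit of the linked tests. It assumes a server and client recent enough to know the Arrow formats (19.15.3 reports "Unknown format Arrow"). The helper names `ch_export` and `ch_import`, the `CH_CLIENT` variable, and the target table `DAT_242_1_copy` are hypothetical and used only for illustration; the only client option relied on is `--query`.

```shell
# Hypothetical helpers, not part of ClickHouse. CH_CLIENT can be overridden
# to point at a different client binary.
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# Write the result of a SELECT to a file in the given format
# (Arrow for the file format, ArrowStream for the stream format).
ch_export() {
    local query="$1"
    local format="$2"
    local outfile="$3"
    "$CH_CLIENT" --query="${query} FORMAT ${format}" > "$outfile"
}

# Read a file in the given format back into an existing table.
ch_import() {
    local table="$1"
    local format="$2"
    local infile="$3"
    "$CH_CLIENT" --query="INSERT INTO ${table} FORMAT ${format}" < "$infile"
}

# Example calls (commented out because they need a running server and
# a pre-created DAT_242_1_copy table with matching columns):
# ch_export "SELECT rowno, c_quantity FROM DAT_242_1" Arrow dat.arrow
# ch_import DAT_242_1_copy Arrow dat.arrow
```

Passing `ArrowStream` instead of `Arrow` should exercise the stream variant in the same way, provided the server version includes it.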

Great news, thank you @blinkov. By the way, I have just cross-referenced this with a related issue that I opened some time ago, mymarilyn/clickhouse-driver#128. If someone is willing to help @xzkostyan support the ClickHouse Arrow format, I volunteer to test the new feature.

My plan is to support ClickHouse operations at a higher level of abstraction using a Python functional-OOP command language in my new open-source GitHub project, which will hopefully be released soon after the $*@&!/+#$% COVID-19 crisis. You may read more in this post. Stay tuned.
