ClickHouse: LowCardinality wire and read protocol

Created on 4 Jul 2020 · 2 Comments · Source: ClickHouse/ClickHouse

I'm writing a Golang driver for ClickHouse. I'm really confused about reading and writing LowCardinality columns. I tried to read the source code, but...
Can you explain how to read and write LowCardinality columns?

question

Most helpful comment

If you only need basic support for LowCardinality(T), you can set low_cardinality_allow_in_native_format=0 for the client session and work with the data as if it had type T. The server will automatically convert LowCardinality(T) -> T for selects and T -> LowCardinality(T) for inserts where needed.

If you want to fully support LowCardinality(T), this is a good example (Python, but the code is quite readable): https://github.com/mymarilyn/clickhouse-driver/blob/a6af47922826294a9173659c599a434c1e77803b/clickhouse_driver/columns/lowcardinalitycolumn.py

Technically, the format is as follows:

Version UInt64
dictionary_size UInt64
Dictionary
indices_size UInt64
Indices

Version also encodes the type of the indices: version & 0xf is 0, 1, 2 or 3 for UInt8, UInt16, UInt32 or UInt64 respectively.
Dictionary is a serialized column of type T.
If the nested type is Nullable(T), type T is used for the dictionary, and the 0th element is assumed to represent Null.
The next element of the dictionary (1st for Nullable, 0th for a regular type) contains the type's default value (zero or an empty string).
Indices is a numeric column of the type taken from version, containing indices_size elements.
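
Since the question is about a Go driver, here is a minimal sketch of reading that layout for LowCardinality(String). It rests on a few assumptions that are not stated above: fixed-width integers are little-endian, each String in the dictionary is a varint length followed by the raw bytes, and any bits of version outside version & 0xf are ignored. The names readLowCardinalityStrings, byteStream and exampleColumn are made up for illustration, and the Nullable case (index 0 meaning Null) is not handled.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// byteStream is a hypothetical helper interface; *bufio.Reader and
// *bytes.Buffer both satisfy it.
type byteStream interface {
	io.Reader
	io.ByteReader
}

// readLowCardinalityStrings decodes one column laid out as
// Version, dictionary_size, Dictionary, indices_size, Indices
// and returns the values with the dictionary already applied.
func readLowCardinalityStrings(r byteStream) ([]string, error) {
	var version, dictSize, indicesSize uint64
	if err := binary.Read(r, binary.LittleEndian, &version); err != nil {
		return nil, err
	}
	indexType := version & 0xf // 0..3 => UInt8, UInt16, UInt32, UInt64
	if indexType > 3 {
		return nil, fmt.Errorf("unknown index type %d", indexType)
	}
	if err := binary.Read(r, binary.LittleEndian, &dictSize); err != nil {
		return nil, err
	}
	// Dictionary: a serialized String column with dictSize entries
	// (assumed encoding: varint length, then the bytes).
	dict := make([]string, dictSize)
	for i := range dict {
		n, err := binary.ReadUvarint(r)
		if err != nil {
			return nil, err
		}
		raw := make([]byte, n)
		if _, err := io.ReadFull(r, raw); err != nil {
			return nil, err
		}
		dict[i] = string(raw)
	}
	if err := binary.Read(r, binary.LittleEndian, &indicesSize); err != nil {
		return nil, err
	}
	width := 1 << indexType // bytes per index: 1, 2, 4 or 8
	var scratch [8]byte     // upper bytes stay zero for narrow indices
	values := make([]string, indicesSize)
	for i := range values {
		if _, err := io.ReadFull(r, scratch[:width]); err != nil {
			return nil, err
		}
		idx := binary.LittleEndian.Uint64(scratch[:])
		if idx >= dictSize {
			return nil, fmt.Errorf("index %d outside dictionary of size %d", idx, dictSize)
		}
		values[i] = dict[idx]
	}
	return values, nil
}

// exampleColumn hand-builds a tiny column for demonstration only:
// dictionary ["", "red", "blue"] (0th element is the type default)
// and UInt8 indices [1, 2, 1, 0].
func exampleColumn() *bytes.Buffer {
	buf := new(bytes.Buffer)
	put := func(v uint64) { binary.Write(buf, binary.LittleEndian, v) }
	put(0) // version: low bits 0 select UInt8 indices
	put(3) // dictionary_size
	for _, s := range []string{"", "red", "blue"} {
		buf.WriteByte(byte(len(s))) // a varint below 128 fits in one byte
		buf.WriteString(s)
	}
	put(4) // indices_size
	buf.Write([]byte{1, 2, 1, 0})
	return buf
}

func main() {
	values, err := readLowCardinalityStrings(exampleColumn())
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", values) // prints ["red" "blue" "red" ""]
}
```

Writing is the mirror image: collect the distinct values into a dictionary with the type default first, pick the narrowest index type that can address it, and emit the same five parts in order. A real driver should also validate the sizes before allocating, since they come straight off the wire.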

Thanks @KochetovNicolai, you saved my day.
