I'm writing a Golang driver for ClickHouse, and I'm really confused about reading and writing LowCardinality columns. I tried reading the source code, but...
Can you explain how to read and write LowCardinality columns?
If you just need basic support for LowCardinality(T), you can set low_cardinality_allow_in_native_format=0 for the client session and work with the data as if it had type T. The server will automatically convert LowCardinality(T) -> T for selects and T -> LowCardinality(T) for inserts where needed.
If you want to fully support LowCardinality(T), this is a good example (Python, but the code is pretty readable): https://github.com/mymarilyn/clickhouse-driver/blob/a6af47922826294a9173659c599a434c1e77803b/clickhouse_driver/columns/lowcardinalitycolumn.py
Technically, the format is as follows:
Version UInt64
dictionary_size UInt64
Dictionary
indices_size UInt64
Indices
Version also encodes the type of the indices: version & 0xf is 0, 1, 2 or 3 for UInt8, UInt16, UInt32 or UInt64 respectively.
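As a rough Go sketch of that mapping (the names indexWidthBytes and indexCodeFor are hypothetical, not taken from any driver): the first helper decodes the index width from version when reading, and the second picks the smallest index type for a given dictionary size when writing. The meaning of the remaining bits of version is not covered here, so the sketch simply masks them off.

```go
package main

import "fmt"

// indexWidthBytes returns the size in bytes of one index, derived from the
// low 4 bits of version: 0 -> UInt8, 1 -> UInt16, 2 -> UInt32, 3 -> UInt64.
func indexWidthBytes(version uint64) (int, error) {
	code := version & 0xf
	if code > 3 {
		return 0, fmt.Errorf("unknown index type code %d", code)
	}
	return 1 << code, nil
}

// indexCodeFor picks the smallest index type code that can address
// dictSize dictionary entries (indices run from 0 to dictSize-1).
func indexCodeFor(dictSize uint64) uint64 {
	switch {
	case dictSize <= 1<<8:
		return 0 // UInt8
	case dictSize <= 1<<16:
		return 1 // UInt16
	case dictSize <= 1<<32:
		return 2 // UInt32
	default:
		return 3 // UInt64
	}
}
```

Choosing the narrowest type is an assumption about what a writer would want; any type wide enough for the largest index should be equally valid under the layout described above.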
Dictionary is a serialized column of type T.
If the nested type is Nullable(T), the dictionary uses type T, and the 0th element is assumed to represent Null.
The next element of the dictionary (1st for Nullable, 0th for an ordinary type) contains the type's default value (zero or an empty string).
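For the writing side, the two reserved slots could be handled roughly like this in Go. This is only a sketch for LowCardinality(Nullable(String)): buildNullableDictionary is a hypothetical helper, nil pointers stand for Null, and the value stored in slot 0 is assumed to be a placeholder that readers ignore.

```go
package main

// buildNullableDictionary converts nullable string values into a dictionary
// of plain strings plus one index per row.
// Slot 0 is reserved for Null and slot 1 for the type default ("").
func buildNullableDictionary(values []*string) (dict []string, indices []uint64) {
	dict = []string{"", ""}          // [0] Null placeholder, [1] default value
	seen := map[string]uint64{"": 1} // the empty string reuses the default slot
	indices = make([]uint64, 0, len(values))
	for _, v := range values {
		if v == nil {
			indices = append(indices, 0) // Null always points at slot 0
			continue
		}
		idx, ok := seen[*v]
		if !ok {
			idx = uint64(len(dict))
			dict = append(dict, *v)
			seen[*v] = idx
		}
		indices = append(indices, idx)
	}
	return dict, indices
}
```

For a non-Nullable column the same idea would apply with a single reserved slot: the default value at index 0.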
Indices is a numeric column containing indices_size elements, whose type is taken from version.
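Putting the five fields together, reading a LowCardinality(String) column might look like the Go sketch below. It follows only the layout listed above and makes several assumptions: integers are little-endian, a String value is a varint length followed by its bytes, the bits of version above 0xf are ignored, and the whole column is already in memory. The function names are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// readUint64 reads one little-endian UInt64 (assumed byte order).
func readUint64(r io.Reader) (uint64, error) {
	var b [8]byte
	if _, err := io.ReadFull(r, b[:]); err != nil {
		return 0, err
	}
	return binary.LittleEndian.Uint64(b[:]), nil
}

// readString reads a varint length followed by that many bytes
// (assumed wire encoding of a String value).
func readString(r *bufio.Reader) (string, error) {
	n, err := binary.ReadUvarint(r)
	if err != nil {
		return "", err
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

// decodeLowCardinalityStrings decodes version, dictionary_size, dictionary,
// indices_size and indices, and returns one string per row.
func decodeLowCardinalityStrings(data []byte) ([]string, error) {
	r := bufio.NewReader(bytes.NewReader(data))

	version, err := readUint64(r)
	if err != nil {
		return nil, err
	}
	code := version & 0xf // other bits of version are ignored in this sketch
	if code > 3 {
		return nil, fmt.Errorf("unknown index type code %d", code)
	}
	width := 1 << code // 1, 2, 4 or 8 bytes per index

	dictSize, err := readUint64(r)
	if err != nil {
		return nil, err
	}
	dict := make([]string, dictSize)
	for i := range dict {
		if dict[i], err = readString(r); err != nil {
			return nil, err
		}
	}

	indicesSize, err := readUint64(r)
	if err != nil {
		return nil, err
	}
	rows := make([]string, indicesSize)
	for i := range rows {
		var b [8]byte // zero padding lets every width decode as a UInt64
		if _, err := io.ReadFull(r, b[:width]); err != nil {
			return nil, err
		}
		idx := binary.LittleEndian.Uint64(b[:])
		if idx >= dictSize {
			return nil, fmt.Errorf("index %d out of range (dictionary size %d)", idx, dictSize)
		}
		rows[i] = dict[idx]
	}
	return rows, nil
}
```

A real driver would read from the connection stream instead of a byte slice and would validate the sizes before allocating; the linked Python implementation is the better reference for details this sketch leaves out.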
Thanks @KochetovNicolai, you saved my day.