Clickhouse: Import from parquet IOError: Not yet implemented: Unsupported encoding.

Created on 31 Aug 2020 · 7 Comments · Source: ClickHouse/ClickHouse

Describe the bug
I am trying to import Parquet files, and some files are only partially imported because of an encoding error.
The files are provided by an external provider. I don't know which library was used to generate them.

  • Which ClickHouse server version to use

ClickHouse client version 20.6.3.28 (official build).

  • Queries to run that lead to unexpected result
    cat export-parquet-00001.snappy.parquet | clickhouse-client --query="INSERT INTO cur FORMAT Parquet"

Error message and/or stacktrace
Code: 33. DB::Exception: Error while reading Parquet data: IOError: Not yet implemented: Unsupported encoding.

bug comp-3rdparty-libs comp-formats

All 7 comments

Please confirm that your file uses one of these encodings:

        case Encoding::DELTA_BINARY_PACKED:
        case Encoding::DELTA_LENGTH_BYTE_ARRAY:
        case Encoding::DELTA_BYTE_ARRAY:

The files are provided by an external provider. I don't know which library was used to generate them.

You can use parquet-tools ( https://github.com/apache/parquet-mr/tree/master/parquet-tools ) to inspect your parquet file.
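
If parquet-tools is not at hand, the same check can probably be done from Python. The following is a minimal sketch, assuming pyarrow is installed; the names `column_encodings` and `find_delta_columns` are made up for illustration. It reads only the file metadata and lists the columns that report one of the three encodings named above. It has not been tested against the file from this issue.

```python
import sys

# Encodings named in this thread as the likely cause of "Unsupported encoding".
DELTA_ENCODINGS = {
    "DELTA_BINARY_PACKED",
    "DELTA_LENGTH_BYTE_ARRAY",
    "DELTA_BYTE_ARRAY",
}


def column_encodings(parquet_path):
    """Yield (column_path, encodings) for every column chunk in the file."""
    # Assumption: pyarrow is installed. It is imported lazily so that
    # find_delta_columns() below stays usable without it.
    import pyarrow.parquet as pq

    metadata = pq.ParquetFile(parquet_path).metadata
    for rg_index in range(metadata.num_row_groups):
        row_group = metadata.row_group(rg_index)
        for col_index in range(row_group.num_columns):
            chunk = row_group.column(col_index)
            yield chunk.path_in_schema, chunk.encodings


def find_delta_columns(columns):
    """Return (column_path, sorted delta encodings) for affected column chunks."""
    flagged = []
    for path, encodings in columns:
        hits = sorted(set(encodings) & DELTA_ENCODINGS)
        if hits:
            flagged.append((path, hits))
    return flagged


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_encodings.py export-parquet-00001.snappy.parquet
    for path, hits in find_delta_columns(column_encodings(sys.argv[1])):
        print(path, ", ".join(hits))
```

A column appears once per row group, so the same column path may be printed several times. An empty output would suggest that the error comes from something other than these three encodings.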

Hello,

DELTA_BYTE_ARRAY is the encoding responsible for the error.
Do you have any idea how to avoid this error?

Thanks.

There are no workarounds. The library that ClickHouse uses to read Parquet simply doesn't support that encoding. You can open an issue in the upstream project: https://issues.apache.org/jira/projects/ARROW/issues/

Many thanks for the help. I will open an issue in the Apache Arrow project.
