KSQL: Add support for the "bytes" type in Avro

Created on 7 May 2018 · 4 Comments · Source: confluentinc/ksql

Edited by @miguno on Oct 02, 2018: This ticket was closed in favor of the slightly more general ticket: Support a VARBINARY, BINARY, or BYTES data type #1742.


I'm testing Kafka Connect, Couchbase and KSQL in order to stream all document modifications into Kafka and run queries on them, for example to get all the modifications per document. For performance reasons, I configured Kafka Connect to store the messages in Kafka in Avro format.

The problem is that Kafka Connect stores the Couchbase document under the attribute content as base64 (the equivalent Avro type is bytes) and the rest of the document's metadata using simpler types (int, long, string...).
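
To illustrate what the content attribute holds, here is a minimal Python sketch. It assumes the Couchbase document is a UTF-8 JSON document and that it appears base64-encoded when the message is rendered as text. The sample document and the helper name `decode_content` are made up for illustration and are not part of Kafka Connect or KSQL.

```python
import base64
import json

# Hypothetical sample document; real Couchbase documents will differ.
sample_document = {"name": "Example Ale", "type": "beer"}

# Simulate how the document might look once its bytes are base64-encoded.
encoded_content = base64.b64encode(
    json.dumps(sample_document).encode("utf-8")
).decode("ascii")


def decode_content(content_b64: str) -> dict:
    """Decode a base64 string back into the JSON document it carries."""
    raw = base64.b64decode(content_b64)
    return json.loads(raw.decode("utf-8"))


decoded = decode_content(encoded_content)
print(decoded)
```

The metadata attributes (event, key, cas and so on) need no such step because they are already plain Avro primitives.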

When I try to create a stream for the Avro topic that holds all the document modifications from Couchbase, I get the error below:

ksql> CREATE STREAM avro WITH (KAFKA_TOPIC='beer-sample_avro', VALUE_FORMAT='AVRO');
 Could not fetch the AVRO schema from schema registry. Cannot find correct type for avro type: bytes

Is the Avro type bytes supported?

The Avro schema of the key is {"subject":"beer-sample_avro-key","version":1,"id":1,"schema":"\"string\""}.
The Avro schema of the value is:

{
    "subject": "beer-sample_avro-value",
    "version": 1,
    "id": 2,
    "schema": {
        "type": "record",
        "name": "DcpMessage",
        "namespace": "com.couchbase",
        "fields": [{
            "name": "event",
            "type": "string"
        }, {
            "name": "partition",
            "type": {
                "type": "int",
                "connect.type": "int16"
            }
        }, {
            "name": "key",
            "type": "string"
        }, {
            "name": "cas",
            "type": "long"
        }, {
            "name": "bySeqno",
            "type": "long"
        }, {
            "name": "revSeqno",
            "type": "long"
        }, {
            "name": "expiration",
            "type": ["null", "int"],
            "default": null
        }, {
            "name": "flags",
            "type": ["null", "int"],
            "default": null
        }, {
            "name": "lockTime",
            "type": ["null", "int"],
            "default": null
        }, {
            "name": "content",
            "type": ["null", "bytes"],
            "default": null
        }, {
            "name": "bucket",
            "type": ["null", "string"],
            "default": null
        }, {
            "name": "vBucketUuid",
            "type": ["null", "long"],
            "default": null
        }],
        "connect.name": "com.couchbase.DcpMessage"
    }
}
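
To see which fields of a schema like the one above would hit this error, the following Python sketch lists the top-level fields whose type is, or includes in a union, a given Avro type. It uses an abbreviated copy of the schema, and the helper names `avro_type_names` and `fields_using` are hypothetical. It does not descend into nested records, which this schema does not have.

```python
import json

# Abbreviated copy of the value schema above (only a few fields kept).
SCHEMA_JSON = """
{
    "type": "record",
    "name": "DcpMessage",
    "namespace": "com.couchbase",
    "fields": [
        {"name": "event", "type": "string"},
        {"name": "partition", "type": {"type": "int", "connect.type": "int16"}},
        {"name": "content", "type": ["null", "bytes"], "default": null},
        {"name": "bucket", "type": ["null", "string"], "default": null}
    ]
}
"""


def avro_type_names(avro_type):
    """Yield the type names found in a field type (plain name, union, or wrapper object)."""
    if isinstance(avro_type, str):
        yield avro_type
    elif isinstance(avro_type, list):
        # A union such as ["null", "bytes"]: inspect every branch.
        for branch in avro_type:
            yield from avro_type_names(branch)
    elif isinstance(avro_type, dict):
        # An annotated type such as {"type": "int", "connect.type": "int16"}.
        yield from avro_type_names(avro_type.get("type"))


def fields_using(schema, type_name):
    """Return the names of top-level fields whose type includes type_name."""
    return [
        field["name"]
        for field in schema.get("fields", [])
        if type_name in avro_type_names(field["type"])
    ]


schema = json.loads(SCHEMA_JSON)
print(fields_using(schema, "bytes"))
```

In this schema only the content field uses bytes, so it is the one that triggers the "Cannot find correct type for avro type: bytes" message.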
Labels: data-accessibility, duplicate, enhancement

All 4 comments

Currently we don't support the "bytes" type. We'll use this issue to track adding support for it.

Is there any workaround available for the Avro bytes type?

@javierTQ et al: I am closing this ticket in favor of the slightly more general ticket: Support a VARBINARY, BINARY, or BYTES data type #1742.

What is the workaround for this issue?
