Some customers create topics with complex schemas using Avro that include UNION (choice) fields.
It would be useful to be able to query these UNION fields, for example to only show messages whose field is of a particular type.
Not supporting Avro UNION types is very limiting for KSQL. We have a scenario where a message has some fixed fields plus a map of additional fields whose values are constrained by an Avro UNION type. Specifically, it looks like this:
"fields":[
  {"name":"Key","type":"string"},
  {"name":"Timestamp","type":"long"},
  {"name":"Attributes","type":{"type":"map","values":
    ["string","float","double","int",
      {"type":"long",
       "connect.version":1,
       "connect.name":"org.apache.kafka.connect.data.Timestamp",
       "logicalType":"timestamp-millis"
      }
    ]
  }}
]
We can create a stream in KSQL so long as it doesn't contain the Attributes map. This is because the map type can only be specified as MAP
The only alternative we would have is to publish our data in a MAP
Please can you support AVRO UNION types in general but specifically as values in a Map?
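To make the request concrete, here is a rough Python sketch of the kind of filtering we currently have to do outside KSQL. It assumes messages have already been decoded into plain dicts and that the timestamp-millis logical type is left as a raw integer; the sample messages and the helper names (`union_branch`, `attributes_of_branch`, `messages_with_branch`) are hypothetical and only for illustration.

```python
# Hypothetical decoded messages: each Attributes value is one branch of the union.
messages = [
    {"Key": "a", "Timestamp": 1, "Attributes": {"color": "red", "weight": 2.5}},
    {"Key": "b", "Timestamp": 2, "Attributes": {"count": 7}},
]


def union_branch(value):
    """Name the union branch a decoded Python value could have come from."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("boolean is not a branch of this union")
    if isinstance(value, str):
        return "string"
    if isinstance(value, int):
        # Assumption: int and the timestamp-millis long both decode to a
        # Python int, so they cannot be told apart after decoding.
        return "int_or_long"
    if isinstance(value, float):
        # Likewise float and double both decode to a Python float.
        return "float_or_double"
    raise TypeError(f"unexpected value type: {type(value).__name__}")


def attributes_of_branch(message, branch):
    """Return only the Attributes entries whose value belongs to `branch`."""
    return {
        key: value
        for key, value in message["Attributes"].items()
        if union_branch(value) == branch
    }


def messages_with_branch(all_messages, branch):
    """Keep messages that have at least one attribute of the given branch."""
    return [m for m in all_messages if attributes_of_branch(m, branch)]
```

With union support in KSQL, the equivalent of `messages_with_branch(messages, "string")` could presumably be written as a query predicate instead of client-side code.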
cc @MichaelDrogalis @derekjn @apurvam in case we want to prioritize this on our roadmap.
Protobuf and JSON Schema both have an equivalent "oneof" construct.
Unions/oneofs will be more important now that Schema Registry supports references. Using unions with references is to be preferred over using RecordNamingStrategy when storing multiple schema types in the same topic (see https://github.com/confluentinc/ksql/issues/1267).
Should totally support this by just adding the superset of columns from all types in the union.
https://martinfowler.com/eaaCatalog/singleTableInheritance.html
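A minimal Python sketch of the superset-of-columns idea, in case it helps the discussion. This is not how ksqlDB works today; it only illustrates the mapping, assuming every union branch is a record whose fields are given as simple name/type pairs. The `customer` and `product` schemas, the `_type` discriminator column, and the function names are all hypothetical.

```python
# Hypothetical record branches of a union (simplified Avro-style schemas).
customer = {"name": "Customer", "fields": [
    {"name": "id", "type": "string"},
    {"name": "email", "type": "string"},
]}
product = {"name": "Product", "fields": [
    {"name": "id", "type": "string"},
    {"name": "price", "type": "double"},
]}


def superset_columns(union_branches):
    """Merge the fields of every record branch into one column mapping."""
    columns = {}  # column name -> type, in first-seen order
    for branch in union_branches:
        for field in branch["fields"]:
            name, ftype = field["name"], field["type"]
            if name in columns and columns[name] != ftype:
                # Same column name with different types cannot share a column.
                raise ValueError(
                    f"column {name!r} has conflicting types: "
                    f"{columns[name]!r} vs {ftype!r}"
                )
            columns[name] = ftype
    return columns


def to_row(columns, branch_name, record):
    """Project one record onto the superset; absent columns become None."""
    row = {name: record.get(name) for name in columns}
    row["_type"] = branch_name  # discriminator, as in single table inheritance
    return row


columns = superset_columns([customer, product])
row = to_row(columns, "Product", {"id": "p1", "price": 9.5})
```

Every column has to be nullable in this scheme, and branches that reuse a field name with a different type need some other resolution (the sketch simply raises an error).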
With Schema Registry's new support for schema references more and more users will be using Unions to allow topics to receive different event types, so ksqlDB not supporting Unions/OneOfs is going to become a bigger issue.
Here's a blog post describing how to store multiple event types in the same topic using unions/oneofs. Having union support in ksqlDB would allow such topics to be queried.
https://www.confluent.io/blog/multiple-event-types-in-the-same-kafka-topic/