I am trying to parse a JSON field from a nested object. The expression is:
EXTRACTJSONFIELD(Records, ‘$.0.s3.object.userMetadata.X-Amz-Meta-Clientplatform’)
However it returns NULL probably because of hyphens in the X-Amz-Meta-Clientplatform key.
I also tried using the array syntax (don't know whether it is allowed):
EXTRACTJSONFIELD(Records, '$.0.s3.object.userMetadata[''X-Amz-Meta-Clientplatform'']')
however it doesn't work too.
The object in the Records field is:
[
{
“s3”: {
“object”: {
“key”: “TZ”,
“size”: 4,
“eTag”: “ae1d80ba85332ae2d9fbbd491b17c043”,
“userMetadata”: {
“X-Amz-Meta-Appversion”: “1.0.1",
“X-Amz-Meta-Clientplatform”: “ios”
}
}
}
}
]
How can I escape hyphens in KSQL in JSON path?
P.S. Other keys without hyphens are parsed ok. Also if I parse the user metadata itself using EXTRACTJSONFIELD(Records, ‘$.0.s3.object.userMetadata’) a correct array is returned:
{“X-Amz-Meta-Appversion”: “1.0.1","X-Amz-Meta-Clientplatform":"ios"}
Guys, I've investigated io.confluent.ksql.util.json.JsonPathTokenizer especially token constants in the beginning. I've noticed that the constant QUOTE = '\\"'; I don't know why but the following expression works ok for me!
EXTRACTJSONFIELD(Records, '$.0.s3.object.userMetadata[\\\\"X-Amz-Meta-Clientplatform\\\\"]')
Note the double slash and double quotes.
Nice find, @novikovantonio. We should definitely document this, and probably support a more standard way of escaping field references.
Most helpful comment
Guys, I've investigated io.confluent.ksql.util.json.JsonPathTokenizer especially token constants in the beginning. I've noticed that the constant
QUOTE = '\\"';I don't know why but the following expression works ok for me!Note the double slash and double quotes.