Describe the bug
We serialize log messages as JSON, and some messages contain fields that are themselves the serialization of a JSON object. In this case, we are seeing that the escaped quotes of the serialized JSON are unescaped before being output to Elasticsearch, resulting in a string that is no longer valid JSON. We believe the problem is in the Elasticsearch output, because we do not see the issue when testing fluent-bit and outputting locally.
To Reproduce
{"log":"{\"payload\":\"{\\\"foo\\\":1333,\\\"bar\\\":1,\\\"baz\\\":\\\"cdd32offh-scmddddx\\\",\\\"colorrr\\\":\\\"ffffff\\\",\\\"other_color\\\":\\\"c22829\\\"}\"}","stream":"stdout","time":"2018-06-11T14:37:30.681701731Z"}
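For illustration (not part of the original report), the stacked escaping in that line can be reproduced with a short Python sketch: each serialization layer adds another level of backslash escaping.

```python
import json

# Innermost application object (field names match the repro line above).
inner = {"foo": 1333, "bar": 1}

# The application logs the serialized object as a string field...
app_line = json.dumps({"payload": json.dumps(inner)})

# ...and Docker wraps the whole stdout line as the "log" string.
docker_line = json.dumps({"log": app_line, "stream": "stdout"})

# One nesting level escapes quotes once (\"), two levels escape them
# three times (\\\"), exactly as in the captured log line above.
print(docker_line)
```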
Expected behavior
A JSON log object containing a string value that is the serialization of a JSON object should be preserved in that form, as a string that includes escaped quote characters.
Your Environment
Additional context
Also as per @Flydiverny on the slack channel
Log from container:
{"msg":"String with \"quotes\""}
Docker log file:
{"log":"{\"msg\":\"String with \\\"quotes\\\"\"}\n","stream":"stdout","time":"2019-02-22T14:55:38.191687326Z"}
Input tail with Docker parser
Kubernetes filter using json parser
What shows up in kibana:
{\"msg\":\"String with \"quotes\"\"}\n
Which is invalid json:
{ "msg": "String with "quotes"" }
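The corruption can be modelled by one unescape pass too many. A hypothetical Python sketch (stripping backslashes here stands in for the extra unescape; the real code path is in C):

```python
import json

# What the container wrote: valid JSON with an escaped inner quote.
container_log = json.dumps({"msg": 'String with "quotes"'})

# Docker wraps that line as the string value of "log".
docker_line = json.dumps({"log": container_log + "\n", "stream": "stdout"})

# One decode (the docker parser) recovers the container line intact.
log_field = json.loads(docker_line)["log"]
assert json.loads(log_field) == {"msg": 'String with "quotes"'}

# A second, naive unescape pass destroys the inner escapes and yields
# the broken string seen in Kibana, which is no longer valid JSON.
broken = log_field.replace("\\", "")
try:
    json.loads(broken)
    still_valid = True
except json.JSONDecodeError:
    still_valid = False
print(broken.strip())
print(still_valid)  # False
```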
As per @sjentzsch in the slack channel:
{"log":"{\"@timestamp\": 1550769485892, \"level\": \"DEBUG\", \"process\": 1, \"thread\": \"139787420716360\", \"logger\": \"cachecontrol.controller\", \"message\": \"Response header has \\\"no-store\\\"\", \"traceId\": null, \"spanId\": null, \"tenant\": \"skill\"}\n","stream":"stderr","attrs":{"io.kubernetes.container.name":"skill-alexa","io.kubernetes.pod.name":"alexa-6f6997b89d-jfn5c","io.kubernetes.pod.namespace":"skill-edge","io.kubernetes.pod.uid":"b9be16c5-3466-11e9-a14d-000d3a289f71"},"time":"2019-02-21T17:18:05.892467619Z"} <--- fails to be parsed, due to \"message\": \"Response header has \\\"no-store\\\"\"
Indeed I can relate. If our applications log JSON that contains escaped strings or JSON itself (properly escaped!), the parsing of the log field fails.
However, I cannot confirm that this is an issue with the Elasticsearch output. We are using the Azure output; I suspect it is related to the Kubernetes filter, namely the Merge_Log functionality.
Would very much love to see this fixed! As it stands, this is a show-stopper for our team's adoption of fluent-bit.
Thanks @sjentzsch, I was not sure if it was the ES output or the Kubernetes filter. Title adjusted!
Seems the error is in flb_unescape_string(), which is called from merge_log_handler when Merge_Log is on.
So somewhere here essentially?
https://github.com/fluent/fluent-bit/blob/fe09b4c87f3b44b3bb9ed1b0c3d9938d4f675623/plugins/filter_kubernetes/kubernetes.c#L140-L174
It flows down through flb_pack_json() and then to flb_json_tokenise().
Making a 'fix' to flb_unescape, it still ends up with a tokenise issue, since jsmn doesn't understand the nested JSON.
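For contrast, the intended Merge_Log behaviour is a single decode of the log string, which leaves a nested serialization intact. A hypothetical Python model (not the actual C implementation):

```python
import json

inner = json.dumps({"foo": 1333})          # the doubly-nested object
app_line = json.dumps({"payload": inner})  # "payload" is a JSON *string*
docker_line = json.dumps({"log": app_line, "stream": "stdout"})

# Merge_Log should decode the "log" string exactly once and merge
# its keys into the record; nested escapes must survive untouched.
record = json.loads(docker_line)
merged = {**record, **json.loads(record.pop("log"))}

# The payload is still a valid serialized-JSON string.
assert json.loads(merged["payload"]) == {"foo": 1333}
print(merged["stream"], merged["payload"])
```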
Please check the following comment on #1278:
https://github.com/fluent/fluent-bit/issues/1278#issuecomment-499583503
Thanks @edsiper, will switch back to fluent-bit from fluentd soon to test this out.
Issue already fixed, ref: #1278 (comment)