Fluent-bit: Add native support for nested "JSON strings"

Created on 23 May 2017 · 45 Comments · Source: fluent/fluent-bit

How does Fluent Bit handle JSON within JSON, where the inner JSON is the string value of a message key and is not seen as an object? Often the inner JSON is escaped, so the plugin has to do some work to get around this. For Fluentd we needed a plugin. Does Fluent Bit solve this out of the box?
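
To make the question concrete, here is a minimal Python sketch (not Fluent Bit code) of the situation: the outer record parses cleanly, but the `log` value comes back as a string and needs a second parse before its keys can be used. The sample line is illustrative, and the merge policy (nested keys placed next to the original ones) is an assumption.

```python
import json

# Outer record in the style of a Docker JSON log line: the "log" value is a
# *string* that happens to contain JSON, not a nested object.
raw_line = '{"log": "{\\"artist\\": \\"Jason Derulo\\", \\"dirty\\": 205}\\n", "stream": "stdout"}'

record = json.loads(raw_line)
log_is_string = isinstance(record["log"], str)   # still a string after one parse

# A second parse of the string value yields the nested map.
nested = json.loads(record["log"])

# Assumed merge policy: lift the nested keys next to the original ones.
merged = {**record, **nested}
print(log_is_string, merged)
```

The point of the sketch is only that one extra decoding step is needed somewhere in the pipeline; where that step lives (parser, filter, or downstream) is what the rest of this thread is about.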

bug fixed

All 45 comments

There is no generic way at the moment; as you said, a plugin is required.

The only case where this is handled is in the filter_kubernetes plugin, where Docker JSON logs might contain stringified JSON messages.

Hmm, this likely needs to be fixed for the Docker use case without Kubernetes. I am thinking of adding some kind of _merge_json_key_ option to let the parsers handle that automatically.

@edsiper if this does not work, then what does the v0.11.5 kubernetes parameter Merge_JSON_Log On do? I have a Docker input with the docker parser, which contains a log message, and if that log message is JSON, the kubernetes filter should incorporate it.

But when I try it, I get this error:

[2017/05/26 14:32:22] [ warn] [filter_kube] could not pack merged json
kube.var.log.containers.talk-json-to-me_default_talk-json-to-me-92bc5ba48086596ea4ca8f698f09701a59af7593f9989161f13b315ed56c1160.log: [1495809142, {"log":"{\"MaiTime\":\"Fri May 26 14:32:22 UTC 2017\",\"artist\":\"Jason Derulo\",\"lyrics\":\"Talk JSON to me\",\"dirty\":205}\r\n", "stream":"stdout", "time":"2017-05-26T14:32:22.545392924Z", "kubernetes":{"pod_name":"talk-json-to-me", "namespace_name":"default", "container_name":"talk-json-to-me", "docker_id":"92bc5ba48086596ea4ca8f698f09701a59af7593f9989161f13b315ed56c1160", "pod_id":"a2abebaf-421f-11e7-8086-0800270e934a"}}]
[2017/05/26 14:32:23] [ warn] [filter_kube] could not pack merged json
kube.var.log.containers.talk-json-to-me_default_talk-json-to-me-92bc5ba48086596ea4ca8f698f09701a59af7593f9989161f13b315ed56c1160.log: [1495809143, {"log":"{\"MaiTime\":\"Fri May 26 14:32:23 UTC 2017\",\"artist\":\"Jason Derulo\",\"lyrics\":\"Talk JSON to me\",\"dirty\":206}\r\n", "stream":"stdout", "time":"2017-05-26T14:32:23.5535398Z", "kubernetes":{"pod_name":"talk-json-to-me", "namespace_name":"default", "container_name":"talk-json-to-me", "docker_id":"92bc5ba48086596ea4ca8f698f09701a59af7593f9989161f13b315ed56c1160", "pod_id":"a2abebaf-421f-11e7-8086-0800270e934a"}}]}]

My Config-file:

[INPUT]
    Name           tail
    Tag            kube.*
    Path           /var/log/containers/*.log
    Parser         docker
    Mem_Buf_Limit  256MB

[PARSER]
    Name        docker
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep   On

[FILTER]
    Name kubernetes
    Match kube.*
    Merge_JSON_Log On

[OUTPUT]
    Name  file
    Match *
    Path /tmp/fluent-bit.log

I've uploaded my complete setup here for convenience: https://github.com/LarsKumbier/fluent-bit-json-merge-test/

@LarsKumbier thanks for the detailed explanation and test case.

I have been working on this and I found the two root causes for the problem:

__1.__ Fluent Bit: the function that unescapes strings was returning a wrong string length and was also not respecting special characters like \r, \n, etc.

__2.__ Test case: the test case provided generates an invalid JSON string:

{"log": "{\"MaiTime\":\"Tue May 30 02:23:34 UTC 2017\",\"artist\":\"Jason Derulo\",\"lyrics\":\"Talk JSON to me\",\"dirty\":0}\r\n","stream":"stdout","time":"2017-05-30T02:23:34.863074196Z"}

If you look carefully, after the json map there's an extra __\r\n__, so the map becomes invalid. This can be fixed using the __echo -n__ command in the script.

On the Fluent Bit side, the following fix has been pushed:

https://github.com/fluent/fluent-bit/commit/316d46d18abdbc0a7f013ceeec161e6a046dd492

I will release Fluent Bit 0.11.7 shortly with this issue fixed.

Thanks again for your help!

Are you sure about the \r\n at the end?
The json rfc says whitespace is insignificant outside of quoted strings!

@gganssauge the thing is that the \r\n (which are not empty characters) are inside the string:

...,\"dirty\":0}\r\n"

so after the map there are two unexpected bytes which are part of the main string

My use case is extracting from JournalD. We have docker configured to write to journald and then extract the journal via fluent. This way all logs end up going through the journal and we don't have to worry about disparate collection of logs across a host.

@LarsKumbier @gganssauge

My bad, a nested JSON will always have an ending \n because Docker engine is including it. Fixed by https://github.com/fluent/fluent-bit/commit/673d39cd39e26f540dd8b564016733c380e3d474 (it will be included in v0.11.8)
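
The exchange above about the trailing \r\n can be reproduced with a short Python sketch. It is only an illustration using Python's `json` module, not the Fluent Bit unescape code; the log line is a shortened version of the test case.

```python
import json

# Docker log line from the test case above, shortened.
line = '{"log": "{\\"lyrics\\": \\"Talk JSON to me\\", \\"dirty\\": 0}\\r\\n", "stream": "stdout"}'

inner = json.loads(line)["log"]

# The two bytes sit inside the outer string, after the nested map.
ends_with_crlf = inner.endswith("\r\n")

# Once the outer string is unescaped they are plain trailing whitespace,
# so trimming them before parsing the nested map is enough.
nested = json.loads(inner.rstrip("\r\n"))
print(ends_with_crlf, nested)
```

Both readings in the thread are visible here: the bytes are part of the outer string value, yet relative to the nested map they are only trailing whitespace.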

@pfremm Journald support will come soon, please upvote here: https://github.com/fluent/fluent-bit/issues/217

I'm using Lars' test case with 0.11.13 and still get [filter_kube] could not pack merged json.

my config is

[SERVICE]
    Flush 1
    Daemon Off
    Log_Level    debug
    Log_File     /tmp/fluent-bit.log
    Parsers_File parsers.conf

[INPUT]
    Name           tail
    Tag            kube.*
    Path           /var/log/containers/*.log
    Parser         docker
    Mem_Buf_Limit  256MB

[FILTER]
    Name kubernetes
    Match kube.*
    Merge_JSON_Log On

[OUTPUT]
    Name  file
    Match *
    Path /tmp/fluent-bit.log

[OUTPUT]
    Name forward
    Match *
    Host 127.0.0.1
    Port 24224

@gganssauge would you please provide a json log that is failing so I can reproduce ?

error.zip
I attached the last 100 lines of the talk-json-to-me container log as well as the last 1000 lines of the fluent-bit.log generated by the above configuration.

For testing I used minikube-0.20 on ubuntu linux-16.04 with a 1.5.3 kubernetes deployment.

@gganssauge thanks for providing the test case.

I've found that the problem happens because the nested JSON in the log lines ends in \r\n instead of \n. I will improve the filter to handle this situation.

I was able to reproduce locally without minikube with this config:

[SERVICE]
    Flush 1
    Daemon Off
    Log_Level    info
    Parsers_File ../conf/parsers.conf

[INPUT]
    Name           tail
    Tag            kube.*
    Path           ./talk*.log
    Parser         docker
    Mem_Buf_Limit  256MB

[FILTER]
    Name kubernetes
    Match kube.*
    Merge_JSON_Log On
    Dummy_Meta On

[OUTPUT]
    Name stdout
    Match *

I have a similar problem in k8s 1.7.2 & fluentbit 0.12.4 using json-file and this config (installed with helm):

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf

[INPUT]
    Name             tail
    Path             /var/log/containers/*.log
    Parser           docker
    Tag              kube.*
    Refresh_Interval 5
    Mem_Buf_Limit    5MB
    Skip_Long_Lines  On

[FILTER]
    Name   kubernetes
    Match  kube.*
    Merge_JSON_Log On

[OUTPUT]
    Name  es
    Match *
    Host  elasticsearch.ops
    Port  9200
    Logstash_Format On
    Retry_Limit False

I cannot see the kube metadata, and the final log key in ES looks like this:

{\"type\":\"response\",\"@timestamp\":\"2017-10-02T15:40:11Z\",\"tags\":[],\"pid\":1,\"method\":\"get\",\"statusCode\":200,\"req\":{\"url\":\"/ui/favicons/favicon.ico\",\"method\":\"get\",\"headers\":{\"host\":\"monit-elasticsearch-kibana.ops\",\"connection\":\"keep-alive\",\"user-agent\":\"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36\",\"save-data\":\"on\",\"accept\":\"image/webp,image/apng,image/*,*/*;q=0.8\",\"dnt\":\"1\",\"referer\":\"http://monit-elasticsearch-kibana.ops/app/kibana\",\"accept-encoding\":\"gzip, deflate\",\"accept-language\":\"en-GB,en;q=0.8,en-US;q=0.6,es;q=0.4\"},\"remoteAddress\":\"10.244.5.14\",\"userAgent\":\"10.244.5.14\",\"referer\":\"http://monit-elasticsearch-kibana/app/kibana\"},\"res\":{\"statusCode\":200,\"responseTime\":3,\"contentLength\":9},\"message\":\"GET /ui/favicons/favicon.ico 200 3ms - 9.0B\"}\n

@jalberto

the original 'log' field is never touched or altered, would you please paste the full output of that record ?

Aside from internal fields (like the timestamp) there are no other fields, just "log".
Using fluentd instead of fluent-bit works as expected.

@jalberto are you querying on Kibana or directly on Elasticsearch with curl ?

on kibana, but displaying the whole document

Can confirm that the issue still exists on fluent-bit 0.12.6 inside k8s, details in gist (including config and example of a record inside elasticsearch fetched with curl).

@azhi

I looked at your gist and Fluent Bit says:

[2017/10/20 01:15:36] [ warn] [filter_kube] could not pack merged json

that error happens when the nested JSON message is not a valid JSON. In order to continue troubleshooting would you please supply the original Docker log file that is generating the problem ?
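
As a troubleshooting aid for the warning above, the following Python sketch scans Docker JSON log lines and reports the ones whose nested value does not parse as a JSON map. The function name `find_unmergeable` and the sample lines are made up for illustration, and Python's parser may accept or reject inputs differently from Fluent Bit's, so treat the result as a hint only.

```python
import json

def find_unmergeable(lines, key="log"):
    """Return (line_number, reason) for records whose `key` is not a JSON map."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            value = json.loads(line)[key]   # outer Docker record
            nested = json.loads(value)      # nested JSON string
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            problems.append((number, f"{type(exc).__name__}: {exc}"))
            continue
        if not isinstance(nested, dict):
            problems.append((number, "nested value is not a JSON map"))
    return problems

# Hypothetical sample: the first line carries nested JSON, the second does not.
sample = [
    '{"log": "{\\"ok\\": true}\\n", "stream": "stdout"}',
    '{"log": "plain text line\\n", "stream": "stdout"}',
]
report = find_unmergeable(sample)
print(report)
```

Running it over the original container log file (one record per line) narrows down which lines to attach to a bug report.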

Just to be clear, is there any default support for merge_json in parsers or filters without the kubernetes filter? I am currently running Docker without Kubernetes, with a log driver, and my nginx output is JSON. I thought it would be merged, but it was just escaped and turned into a string.

Example:
https://gist.githubusercontent.com/lynxaegon/ad7f503ca7316b5ac0db5a22114e00bd/raw/17567b5fc5ea48da870c8f0b5b4c16c79df28ebc/Proxy-Test%2520Log

@lynxaegon

I will add an "unescape_key" option to filter_parser; that way the parser will work properly.

@lynxaegon and all

I have merged a new feature called __unescape_key__, which will be available in version 0.12.8, to deal with maps nested as JSON strings. More details are in the following commit:

https://github.com/fluent/fluent-bit/commit/ec8c031f404c15461e0c74dca73b3f4dbe47f2f0

All,

I've released 0.12.8, which addresses this problem from two angles:

  1. filter_kubernetes: when the Merge_JSON_Log option is used, the filter no longer keeps the escaped characters.
  2. filter_parser now has a new option called __unescape_key__, which can be used for scenarios such as Docker logs with nested string-JSON.

http://fluentbit.io/announcements/v0.12.8/

@edsiper @pfremm
Guys, hi all, please help. I can't find any docs on how to use unescape_key in my case.
I have nginx access_log that looks like:
{"log":{\"id\":\"5a1d3f49d44ee271231631\",\"cur\":[\"USD\"],\"at\":2,\"imp\":[{\"id\":1,\"banner\":{\"w\":0,\"h\":0},\"bidfloor\":0.9}]}}
I want to get it with fluentd config:

<source>
        @type tail
        format json
        tag reqs.access
        path /var/log/nginx/test.access.log
        pos_file /var/log/nginx/test.access.log.pos
</source>
<filter reqs.access>
        @type parser
        key_name log
        format json
        unescape_key true
</filter>
<match reqs.access>
        @type kinesis_streams
        region us-west-1
        stream_name Kinesis_stream
</match>

How can I do that?
I always get something like this:
pattern not match: "{\"log\":{{\\\"id\\\":\\\"5a1d3f49d44ee271231631\\\",\\\"cur\\\":[\\\"USD\\\"],\\\"at\\\":2,\\\"imp\\\":[{\\\"id\\\":1,\\\"banner\\\":{\\\"w\\\":0,\\\"h\\\":0},\\\"bidfloor\\\":0.9}]}}"
Also I got "parameter 'unescape_key' is not used"
I need to pass that json to kinesis stream.
Please help!

All, please check the 0.13-dev image, which has several improvements in this area:

https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev

Hi edsiper,

I tried it myself using your repo, but without success:

https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev

{\"took\":9654,\"errors\":true,\"items\":[{\"index\":{\"_index\":\"logstash-2018.02.24\",\"_type\":\"flb_type\",\"_id\":\"TuTG2WEBba2pkx_gNGfa\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":2,\"failed\":0},\"_seq_no\":351985,\"_primary_term\":1,\"status\":201}},{\"index\":{\"_index\":\"logstash-2018.02.24\",\"_type\":\"flb_type\",\"_id\":\"T-TG2WEBba2pkx_gNGfa\",\"_version\":1,\"result\":\"created\",\"_shards\":{\"total\":2,\"successful\":2,\"failed\":0},\"_seq_no\":218105,\"_primary_term\":1,\"status\":201}},{\"index\":{\"_index\":\"logstash-2018.02.24\",\"_type\":\"flb_type\",\"_id\":\"UOTG2WEBba2pkx_gNGfa\",\"status\":429,\"error\":{\"type\":\"es_rejected_execution_exception\",\"reason\":\"rejected execution of org.elasticsearch.transport.TransportService$7@34adf8ec on EsThreadPoolExecutor[bulk, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@51ffe9f8[Running, pool size = 2, active threads = 2, queued tasks = 200, completed tasks = 1557354]]\"}}},{\"i\n

Do I need to enable both unescape_key true and Merge_JSON_Log On?

In theory I shouldn't, since you have added these properties to the filter:

Merge_Log On
K8S-Logging.Parser On

Currently using:

    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    info
        Parsers_File parsers.conf

    [INPUT]
        Name             systemd
        Tag              host.*
        Path             /var/log/journal
        Systemd_Filter   _SYSTEMD_UNIT=docker.service
        Read_From_Tail   true

    [FILTER]
        Name   kubernetes
        Match  *
        Kube_URL        https://kubernetes.default
        Kube_CA_File    /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
        tls.verify  off
        use_journal     On
        Merge_log        On
        K8S-Logging.Parser On

    [FILTER]
        Name   parser
        Match  *
        Parser  syslog-rfc5424
        Decode_JSON_Field MESSAGE
        Merge_Log  on
        key_name MESSAGE
        unescape_key true

    [OUTPUT]
        Name          kafka
        Match         *
        Brokers       {{ .Values.backend.kafka.brokers }}
        Topics        {{ .Values.backend.kafka.topic }}
        Timestamp_Key @timestamp

and logs are output to kafka:

...
...
  "CONTAINER_ID": "50b213be931f",
  "CONTAINER_ID_FULL": "50b213be931ffccb3048332ae7612d03fd7f0936ab55db0aeff0bd52af7d9e3d",
  "CONTAINER_NAME": "k8s_my_foo_bar-7478fbb76d-f25bt_scoring_cf670539-1d5b-11e8-ad47-080027133a0c_0",
  "MESSAGE": "{\"timeMillis\":1519914091793,\"thread\":\"main\",\"level\":\"DEBUG\",\"loggerName\":\"org.springframework.beans.factory.annotation.InjectionMetadata\",\"message\":\"Processing injected element of bean 'org.springframework.boot.context.properties.ConfigurationPropertiesBindingPostProcessor': AutowiredMethodElement for public void org.springframework.boot.context.properties.ConfigurationPropertiesBindingPostProcessor.setGenericConverters(java.util.List)\",\"endOfBatch\":true,\"loggerFqcn\":\"org.apache.logging.log4j.jcl.Log4jLog\",\"threadId\":13,\"threadPriority\":5}\r",
  "_SOURCE_REALTIME_TIMESTAMP": "1519914091796374",
  "kubernetes": {
    "pod_name": "foo-bar-7478fbb76d-f25bt",
    "namespace_name": "foo",
    "pod_id": "cf670539-1d5b-11e8-ad47-080027133a0c",
    "labels": {
      "app": "foo-bar",
      "pod-template-hash": "3034966328",
      "raptor.sie.sony.com/logdriver": "journald",
      "release": "foo"
    },
    "annotations": {
      "kubernetes.io/created-by": "{\\\"kind\\\":\\\"SerializedReference\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"reference\\\":{\\\"kind\\\":\\\"ReplicaSet\\\",\\\"namespace\\\":\\\"foo\\\",\\\"name\\\":\\\"foo-bar-7478fbb76d\\\",\\\"uid\\\":\\\"201065c6-1d5b-11e8-8166-080027133a0c\\\",\\\"apiVersion\\\":\\\"extensions\\\",\\\"resourceVersion\\\":\\\"122081\\\"}}\\n"
    },
    "host": "journald-worker01",
    "container_name": "foo-bar",
    "container_hash": ""
  }

Which looks ok to me thus far...

Basically all container logs are going through the docker daemon which uses logdriver=journald, but then types of messages in all the different containers need to be routed correctly.

One question I have: is it possible to parse the MESSAGE field as json (similar to the kubernetes output)? At that point, how does one route based upon specific keys to Topics? (ie log level: debug goes to a kafka-topic called debug). If that is not possible, what methods do folks use to differentiate types of logs?
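
For the routing part of the question, one option is to decide the topic outside Fluent Bit, for example in a small consumer or producer shim. The Python sketch below is not a Fluent Bit feature; the function name `topic_for`, the "one topic per level" scheme, and the `unparsed` fallback are all assumptions made for illustration. The record is trimmed from the Kafka output shown above.

```python
import json

def topic_for(record, key="MESSAGE", level_field="level", default="unparsed"):
    """Pick a topic name from the log level inside a JSON-string field."""
    try:
        nested = json.loads(record[key])
    except (KeyError, TypeError, json.JSONDecodeError):
        return default                      # field missing or not JSON
    if not isinstance(nested, dict):
        return default
    return str(nested.get(level_field, default)).lower()

record = {
    "CONTAINER_ID": "50b213be931f",
    "MESSAGE": '{"timeMillis":1519914091793,"thread":"main","level":"DEBUG"}\r',
}
topic = topic_for(record)
print(topic)
```

Messages that are not JSON fall through to a catch-all topic, which keeps mixed container output from being dropped.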

@andrewgdavis, I had the same issue too. I added "Merge_JSON_Key":

[FILTER]
    Name kubernetes
    Match kube.*
    Kube_URL https://kubernetes.default.svc:443
    Merge_JSON_Log On
    Merge_JSON_Key log

It works fine but Pods are failing in kubernetes. Can anyone tell me what might be issue or do I need to add anything else?

@Sushma569 k8s pods failing shouldn't be related to fluent-bit processing of logs...
kubectl describe pod $your-pod should give some insight into why they are failing. I have run into a few failures, and usually the cause is a liveness probe, a configuration error, a permissions issue, or an OOM failure.

@andrewgdavis, I tried everything on the Kubernetes side. I think I found the issue: it is "Merge_JSON_Key". Other nodes are not able to find the key, which is why the pods fail after some time and come back to the running state later. I am trying to resolve this in Fluentd.

@andrewgdavis did you find the solution to your question? I am having a very similar problem. More specifically the one around parsing the message field.

@TheRealDwright, I didn't find a solution. However, I was not able to try the suggested "Merge_JSON_Key". I was working on a couple of proofs of concept with different logging technologies, and unfortunately the JSON merge and Kafka routing questions were left open.

there are many issues/setups reported, @andrewgdavis which specific problem do you have ?, make sure to provide your current setup.

@edsiper I am already setting Merge_JSON true on the kubernetes filter, but the MESSAGE field, which is JSON, does not get split into separate fields.

@sumeethtewar do you want all the records under a new key ?, please paste your original log message and how you would like to see it after the filter process

@edsiper to give more details:

I am using a Kubernetes cluster.
FluentBit is working as a Daemon set and the version is : 0.13.0
All working fine.

My Config file for fluentBit looks like the below:

[SERVICE]
    Flush 1
    Log_Level info
    Daemon off
    Parsers_File parsers.conf
    HTTP_Server On
    HTTP_Listen 0.0.0.0
    HTTP_Port 2020

[INPUT]
    Name systemd
    Tag docker.container.k8
    Path /run/log/journal
    Parser docker
    DB /var/log/flb_sys_kube.db
    Systemd_Filter _SYSTEMD_UNIT=docker.service
    Systemd_Filter _TRANSPORT=journal
    Read_From_Tail true

[FILTER]
    Name kubernetes
    Match docker.container.k8
    Kube_URL https://kubernetes.default.svc.cluster.local:443
    Merge_Log On
    Merge_JSON_Key log
    K8S-Logging.Parser On
    Use_Journal On

[FILTER]
    Name record_modifier
    Match docker.container.k8
    Whitelist kubernetes
    Whitelist MESSAGE
    Whitelist CONTAINER_ID
    Whitelist CONTAINER_TAG
    Remove_key CONTAINER_ID_FULL
    Remove_key _CAP_EFFECTIVE
    Remove_key _CMDLINE
    Remove_key _COMM
    Remove_key _EXE
    Remove_key _GID
    Remove_key _UID
    Remove_key _PID
    Remove_key _MACHINE_ID
    Remove_key _SELINUX_CONTEXT
    Remove_key _SYSTEMD_CGROUP
    Remove_key _TRANSPORT
    Remove_key _BOOT_ID

[OUTPUT]
    Name es
    Match docker.container.k8
    Host ${FLUENT_ELASTICSEARCH_HOST}
    Port ${FLUENT_ELASTICSEARCH_PORT}
    Buffer_Size False
    Logstash_Format On
    Logstash_Prefix fluent
    Retry_Limit False
    Time_Key @timestamp
    Include_Tag_Key On
    Tag_key _tag_all

[PARSER]
    Name docker
    Format json
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep On
    # Command | Decoder | Field | Optional Action
    # =============|==================|=================
    Decode_Field_As escaped log

I am echoing logs from one of the containers like this:
echo "{ \"logLevel\": \"INFO\",\"LogDate\": \"$(date)\", \"ACTENANT\": \"1\", \"_ACUSER\": \"Sumeeth Tewar\", \"_CallRequestID\": \"2018New1Demo123\", \"_Msg\": \"The Actual Msg For Demo\", \"_TRAIL\": 0 }"

But in the elastic search I can see the MESSAGE being created as:
{
"_index": "fluent-2018.08.23",
"_type": "flb_type",
"_id": "nCpxZWUB8RYl9KxAWrcL",
"_version": 1,
"_score": null,
"_source": {
"@timestamp": "2018-08-23T06:20:51.0Z",
"_tag_all": "docker.container.k8",
"_HOSTNAME": "flexNode1",
"PRIORITY": "6",
"CONTAINER_NAME": "json-log-spewer",
"CONTAINER_TAG": "docker.sumeettewar/json-log-spewer:latest",
"CONTAINER_ID": "2d5a1989fc3c",
"MESSAGE": "{ \"logLevel\": \"INFO\",\"LogDate\": \"Thu Aug 23 06:20:51 UTC 2018\", \"ACTENANT\": \"1\", \"_ACUSER\": \"Sumeeth Tewar\", \"_CallRequestID\": \"2018New1Demo123\", \"_Msg\": \"The Actual Msg For Demo\", \"_TRAIL\": 38 }",
"_SOURCE_REALTIME_TIMESTAMP": "1535005251519215"
},

"fields": {
"@timestamp": [
"2018-08-23T06:20:51.000Z"
]
},
"sort": [
1535005251000
]
}

Expectation is: I edited the below so could be erroneous, but you can get an idea what i am thinking

{
"_index": "fluent-2018.08.22",
"_type": "flb_type",
"_id": "4WWYYWUBVY2Xketpn8_P",
"_version": 1,
"_score": null,
"_source": {
"@timestamp": "2018-08-22T12:25:16.0Z",
"_tag_all": "docker.container.k8",
"PRIORITY": "6",
"_HOSTNAME": "flexNode1",
"CONTAINER_ID": "c21ead338974",
"CONTAINER_NAME": "json-log-spewer",
"CONTAINER_TAG": "docker.sumeettewar/json-log-spewer:latest",
"MESSAGE": "{ \"logLevel\": \"INFO\",\"LogDate\": \"Wed Aug 22 12:25:16 UTC 2018\", \"ACTENANT\": \"1\", \"_ACUSER\": \"Sumeeth Tewar\", \"_CallRequestID\": \"2018New1Demo123\", \"_Msg\": \"The Actual Msg For Demo\", \"_TRAIL\": 2489 }",
"_SOURCE_REALTIME_TIMESTAMP": "1534940716649236"
},
"logLevel":"INFO",
"LogDate":"Wed Aug 22 20:17:50 UTC 2018",
"ACTENANT":"1",
"_ACUSER":"Sumeeth Tewar",
"_CallRequestID":"2018New1Demo123",
"_Msg":"The Actual Msg For Demo",
"_TRAIL":30705,

"fields": {
"@timestamp": [
"2018-08-22T12:25:16.000Z"
]
},
"sort": [
1534940716000
]
}

Please let me know what changes I need to make to achieve the expected result above. I need to evaluate whether I can achieve this in Fluent Bit or have to fall back to Fluentd.

@edsiper Can you please provide if there is update on this?

@edsiper I just ran into this issue. I have JSON logs going to journald, where they end up inside the MESSAGE field. The issue is that the JSON is sent along to the output as one full string, whereas I would like that field parsed as JSON and split into multiple fields, either under the root or under a specified key (like message.field).

Experiencing a similar issue but with a log message containing a JSON object nested in a JSON string. For example:

{"samp.timestamp":"2018-10-17T17:08:31.145Z","samp.correlationId":"123-456-789","samp.service":"edge","samp.level":"debug","samp.message":"{\"method\":\"GET\",\"hostname\":\"100.111.30.113\",\"path\":\"/health-check\",\"query\":{}}"}

Fluentd handles this just fine and expands 'samp.message' into a separate key like so:

"_type": "fluentd",
"_id": "bzACg2YBM-TQgHsV1nbW",
"_version": 1,
"_score": null,
"_source": {
"samp.timestamp": "2018-10-17T17:08:31.145Z",
"samp.correlationId": "123-456-789",
"samp.service": "edge",
"samp.level": "debug",
"samp.message": "{\"method\":\"GET\",\"hostname\":\"100.111.30.113\",\"path\":\"/health-check\",\"query\":{}}",
"log": "{\"samp.timestamp\":\"2018-10-17T17:08:31.145Z\",\"samp.correlationId\":\"123-456-789\",\"samp.service\":\"edge\",\"samp.level\":\"debug\",\"samp.message\":\"{\\"method\\":\\"GET\\",\\"hostname\\":\\"100.111.30.113\\",\\"path\\":\\"/health-check\\",\\"query\\":{}}\"}\n",
"stream": "stdout",

Fluent Bit sadly just ignores the whole log message and fails to merge any JSON.

We'd love to switch to Fluent Bit but this is a show stopper for us unless a workaround/solution can be found.

Can you use filter_parser on the field in question, with the Reserve_Data / Preserve_Key options?
https://docs.fluentbit.io/manual/filter/parser

We can indeed. Someone else suggested the same and we're now using the following which seems to work well:

# https://docs.fluentbit.io/manual/filter/parser
[FILTER]
    Key_Name            log
    Match               kubernetes.*
    Name                parser
    Parser              json
    Reserve_Data        True

# https://docs.fluentbit.io/manual/filter/kubernetes
[FILTER]
    K8S-Logging.Exclude On
    K8S-Logging.Parser  On
    Match               kubernetes.*
    Merge_Log           On
    Name                kubernetes
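
To show what the parser filter above is expected to do to a single record, here is a rough Python model. It reflects my reading of the Reserve_Data and Preserve_Key options and is not Fluent Bit's implementation; the pass-through on parse failure and the "parsed keys win on collision" rule are assumptions, and `parser_filter` is a hypothetical name.

```python
import json

def parser_filter(record, key_name, reserve_data=False, preserve_key=False):
    """Rough model of a JSON parser filter applied to one key of a record."""
    try:
        parsed = json.loads(record[key_name])
    except (KeyError, TypeError, json.JSONDecodeError):
        return dict(record)              # assumption: unparsable records pass through
    if not isinstance(parsed, dict):
        return dict(record)
    result = {}
    if reserve_data:                     # keep the other original fields
        result.update({k: v for k, v in record.items() if k != key_name})
    if preserve_key:                     # keep the raw string as well
        result[key_name] = record[key_name]
    result.update(parsed)                # assumption: parsed keys win on collision
    return result

record = {
    "log": '{"samp.level": "debug", "samp.service": "edge"}\n',
    "stream": "stdout",
}
expanded = parser_filter(record, "log", reserve_data=True)
print(expanded)
```

With `reserve_data=True` the `stream` field survives next to the expanded keys, which matches the intent of `Reserve_Data True` in the configuration above.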

Please check the following comment on #1278 :

https://github.com/fluent/fluent-bit/issues/1278#issuecomment-499583503
