Fluent-bit: Incorrect escaping/unescaping of valid json logs

Created on 7 Oct 2018  路  7Comments  路  Source: fluent/fluent-bit

Bug Report

I believe this may be an issue with improper escaping or unescaping of logs coming from a Docker container in Kubernetes. Kibana/ElasticSearch is unable to display certain log lines as a result. I'm not really sure where the problem lies.

Log Example

This line coming from the same application is correctly interpreted and available in Kibana.

{"module":"Cluster.Logger","application":"libcluster","ts":"2018-10-07 19:16:06.561","node":"[email protected]","msg":"[libcluster:topologies] request to kubernetes failed!: {:failed_connect, [{:to_address, {'kubernetes.default.svc.cluster.local', 443}}, {:inet, [:inet], :nxdomain}]}","level":"error"}

This log line is pulled from the Kubernetes container log and is correct but is being mishandled.

{"module":"application_controller","ts":"2018-10-07 19:15:51.577","node":"[email protected]","msg":"Application mix started at :\"[email protected]\"","level":"info"}

The issue is with any text getting logged that looks like this:

Some text :"[email protected]"`

I believe the JSON being logged out is valid, where the :"something" is properly escaped in the log msg field.

In Kibana, I can see the log entries that are processed correctly and those that are not. The first one below is an example of one done incorrectly. The log key from the Docker logs is not being interpreted correctly and was left as a string.

{
  "_index": "logstash-2018.10.07",
  "_type": "flb_type",
  "_id": "AWZP9alHhugKwe9EnaaN",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-10-07T19:15:52.166Z",
    "log": "{\"module\":\"application_controller\",\"ts\":\"2018-10-07 19:15:52.166\",\"node\":\"[email protected]\",\"msg\":\"Application mix started at :\"[email protected]\"\",\"level\":\"info\"}\n",
    "stream": "stdout",
    "time": "2018-10-07T19:15:52.166370991Z",
    "kubernetes": {
      "pod_name": "simple-aware-deployment-746647f55d-tmtpt",
      "namespace_name": "default",
      "pod_id": "e256ae60-c9d5-11e8-9475-08002777cb9b",
      "labels": {
        "app": "simple-aware",
        "pod-template-hash": "3022039118"
      },
      "annotations": {
        "fluentbit.io/parser": "elixir-json"
      },
      "host": "minikube",
      "container_name": "simple-aware",
      "docker_id": "d3fc6cacc6603097aebfe3d9badca7a662ae5c5168d699bb6dc851a134d46c51"
    }
  },
  "fields": {
    "@timestamp": [
      1538939752166
    ],
    "time": [
      1538939752166
    ]
  },
  "highlight": {
    "kubernetes.labels.app": [
      "@kibana-highlighted-field@simple@/kibana-highlighted-field@-@kibana-highlighted-field@aware@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1538939752166
  ]
}

When I look at how Kibana is interpreting that JSON, the relevant log value looks like this:

{"module":"application_controller","ts":"2018-10-07 19:15:52.166","node":"[email protected]","msg":"Application mix started at :"[email protected]"","level":"info"}

I believe the problem is that the msg text should have been escaped again.

You can see that the :"simple_aware@xxx" was incorrectly unescaped. It should have remain escaped.

As evidence that my parser is setup correctly, a separate log line from the same application is handled correctly.

{
  "_index": "logstash-2018.10.07",
  "_type": "flb_type",
  "_id": "AWZP9alHhugKwe9EnaaS",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-10-07T19:16:30.433Z",
    "log": "{\"module\":\"Cluster.Logger\",\"application\":\"libcluster\",\"ts\":\"2018-10-07 19:16:05.801\",\"node\":\"[email protected]\",\"msg\":\"[libcluster:topologies] request to kubernetes failed!: {:failed_connect, [{:to_address, {'kubernetes.default.svc.cluster.local', 443}}, {:inet, [:inet], :nxdomain}]}\",\"level\":\"error\"}\n",
    "stream": "stdout",
    "time": "2018-10-07T19:16:05.802268084Z",
    "module": "Cluster.Logger",
    "application": "libcluster",
    "ts": "2018-10-07 19:16:05.801",
    "node": "[email protected]",
    "msg": "[libcluster:topologies] request to kubernetes failed!: {:failed_connect, [{:to_address, {'kubernetes.default.svc.cluster.local', 443}}, {:inet, [:inet], :nxdomain}]}",
    "level": "error",
    "kubernetes": {
      "pod_name": "simple-aware-deployment-746647f55d-zw488",
      "namespace_name": "default",
      "pod_id": "e255f056-c9d5-11e8-9475-08002777cb9b",
      "labels": {
        "app": "simple-aware",
        "pod-template-hash": "3022039118"
      },
      "annotations": {
        "fluentbit.io/parser": "elixir-json"
      },
      "host": "minikube",
      "container_name": "simple-aware",
      "docker_id": "09bac935c2a582af1b745a2ff6e51ec3248df12e1086c4ab536cce6d1e9d240f"
    }
  },
  "fields": {
    "@timestamp": [
      1538939790433
    ],
    "time": [
      1538939765802
    ]
  },
  "highlight": {
    "kubernetes.labels.app": [
      "@kibana-highlighted-field@simple@/kibana-highlighted-field@-@kibana-highlighted-field@aware@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1538939765802
  ]
}

The interpreted log line comes out in Kibana as this:

{"module":"Cluster.Logger","application":"libcluster","ts":"2018-10-07 19:16:05.801","node":"[email protected]","msg":"[libcluster:topologies] request to kubernetes failed!: {:failed_connect, [{:to_address, {'kubernetes.default.svc.cluster.local', 443}}, {:inet, [:inet], :nxdomain}]}","level":"error"}

Parser Config

    [PARSER]
        Name        elixir-json
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
        # Command      |  Decoder         | Field | Optional Action
        # =============|==================|=======|=========
        Decode_Field_As   escaped           log     do_next
        Decode_Field_As   json              log

Environment

  • Version: fluent/fluent-bit:0.14.3
  • Kubernetes:
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.0", GitCommit:"fc32d2f3698e36b93322a3465f63a14e9f0eaead", GitTreeState:"clean", BuildDate:"2018-03-26T16:44:10Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Additional context

This issue prevents Kibana and/or ElasticSearch from correctly interpreting log entries so they are not indexed correctly.

bug fixed

Most helpful comment

No. I changed to use fluentd mostly because it has an easy docker setup for forwarding to AWS CloudWatch. (Running in Kubernetes). I prefer fluent-bit tho. :slightly_smiling_face:

All 7 comments

I also tried using this Parser Config (wondering if that was creating the problem). No change.

    [PARSER]
        Name        elixir-json
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
        # Command      |  Decoder         | Field | Optional Action
        # =============|==================|=======|=========
        Decode_Field_As   json              log

@brainlid

I tried locally the text message and I could parse it without problems, would you please provide the full line coming from the Docker log ?

my test:

__1__ input file content:

{"module":"application_controller","ts":"2018-10-07 19:15:51.577","node":"[email protected]","msg":"Application mix started at :\"[email protected]\"","level":"info"}

__2__ Process with Fluent Bit:

$ ../../build/bin/fluent-bit -R ../../conf/parsers.conf -i tail -p path=test.log -p parser=json -o stdout -p format=json_lines -f 1
Fluent-Bit v0.14.5
Copyright (C) Treasure Data

[2018/10/26 19:11:37] [ info] [engine] started (pid=18005)
 {"date":1540577497.893470, "module":"application_controller", "ts":"2018-10-07 19:15:51.577", "node":"[email protected]", "msg":"Application mix started at :\"[email protected]\"", "level":"info"} 

my parser looks like this:

[PARSER]
    Name   json
    Format json
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z
    Decode_Field_As escaped msg

note that this config is only for the input file you provided, I need the real docker log to test it.

@edsiper Thank you for looking into this and for your response.

Here's a copy of the raw container logs via a kubectl logs <pod> command. There are two log lines. The 2nd is the one with the problem with the data in the msg field of the log.

{"module":"supervisor","ts":"2018-10-27 20:46:22.607","node":"[email protected]","msg":"Child Mix.ProjectStack of Supervisor Mix.Supervisor started\nPid: #PID<0.1537.0>\nStart Call: Mix.ProjectStack.start_link([])","level":"info"}
{"module":"application_controller","ts":"2018-10-27 20:46:22.607","node":"[email protected]","msg":"Application mix started at :\"[email protected]\"","level":"info"}

Following your parser example, I updated my parser to this:

    [PARSER]
        Name        elixir-json
        Format      json
        Time_Key    ts
        Time_Format %Y-%m-%d %H:%M:%S.%L
        Time_Keep   On
        # Command      |  Decoder         | Field | Optional Action
        # =============|==================|=======|=========
        Decode_Field_As   escaped             msg

In Kibana, all the logs with text like "msg":"Application mix started at :\"[email protected]\"" fail to parse the whole log line.

In Kibana, for a log that is not processed correctly, here's a sample of the JSON view:

{
  "_index": "logstash-2018.10.27",
  "_type": "flb_type",
  "_id": "AWa3WRrITTHZ2LfLhNpW",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-10-27T21:05:56.515Z",
    "log": "{\"module\":\"application_controller\",\"ts\":\"2018-10-27 21:05:56.515\",\"node\":\"[email protected]\",\"msg\":\"Application mix started at :\"[email protected]\"\",\"level\":\"info\"}\n",
    "stream": "stdout",
    "time": "2018-10-27T21:05:56.51569035Z",
    "kubernetes": {
      "pod_name": "simple-aware-deployment-5cc49dd8f8-dw6td",
      "namespace_name": "default",
      "pod_id": "152641c1-da2c-11e8-a109-080027f47166",
      "labels": {
        "app": "simple-aware",
        "pod-template-hash": "1770588494"
      },
      "annotations": {
        "fluentbit.io/parser": "elixir-json"
      },
      "host": "minikube",
      "container_name": "simple-aware",
      "docker_id": "74dba26b270fca6ec46bfbf6adb7c1580b138ae70caff843d0f9672aa8527636"
    }
  },
  "fields": {
    "@timestamp": [
      1540674356515
    ],
    "time": [
      1540674356515
    ]
  },
  "highlight": {
    "kubernetes.labels.app": [
      "@kibana-highlighted-field@simple@/kibana-highlighted-field@-@kibana-highlighted-field@aware@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1540674356515
  ]
}

It appears to me that the _source > log needs an extra escaping of the msg content. So it becomes invalid JSON when deserializing it.

Is there anything else I can provide to help track the issue down?

Hi @brainlid, did you manage to solve this problem?

No. I changed to use fluentd mostly because it has an easy docker setup for forwarding to AWS CloudWatch. (Running in Kubernetes). I prefer fluent-bit tho. :slightly_smiling_face:

Please check the following comment on #1278 :

https://github.com/fluent/fluent-bit/issues/1278#issuecomment-499583503

Issue already fixed, ref: #1278 (comment)

Was this page helpful?
0 / 5 - 0 ratings