Fluent-bit: Unable to get k8s metadata from systemd input

Created on 22 Feb 2018 · 21 comments · Source: fluent/fluent-bit

I am attempting to get the Kubernetes filter metadata from journald logs and output to Elasticsearch.
(From the thread https://github.com/fluent/fluent-bit/issues/332 I assume this is possible.)

We have an existing k8s setup with a number of pods logging to journald.

I created a new pod for fluent-bit:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: fluent-bit
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:0.12.14
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
        - name: fluent-bit-log
          mountPath: /fluent-bit/log/
      terminationGracePeriodSeconds: 10
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        hostPath:
          path: /var/Davra/fluent-bit
      - name: fluent-bit-log
        hostPath:
          path: /data

This uses the following fluent-bit config:

 # cat /var/Davra/fluent-bit/fluent-bit.conf
[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    Log_File     /fluent-bit/log/fluent-bit.log
    Parsers_File parsers.conf

[INPUT]
    Name            systemd
    Path            /var/log/journal
    Tag             k8s.*

[FILTER]
    Name            kubernetes
    Match           k8s.*

[OUTPUT]
    Name            es
    Match           *
    Host            192.168.10.203
    Port            9200
    Logstash_Format On
    Retry_Limit     False

The resultant docs in Elasticsearch show systemd metadata but no Kubernetes metadata. If I flip my log drivers to json-file and tail /var/log/containers/*.log on the input instead, I do see the Kubernetes metadata.

So, should I be able to get kubernetes metadata from journald input and, if so, any ideas of what may be wrong here?

Thanks.

Labels: fixed, question

All 21 comments

Could you give logs from the fluent-bit pod?
I think we could have similar problem. (#517 )

@bat9r I'd say this is unrelated to your issue - logs are good:

[2018/02/22 09:30:54] [ info] [engine] started
[2018/02/22 09:30:54] [ info] [filter_kube] https=1 host=kubernetes.default.svc.cluster.local port=443
[2018/02/22 09:30:54] [ info] [filter_kube] local POD info OK
[2018/02/22 09:30:54] [ info] [filter_kube] testing connectivity with API server...
[2018/02/22 09:30:54] [ info] [filter_kube] API server connectivity OK

Even at debug level the logs look healthy.

@garysmith-github I understand, okay.

Would you please check if the following recipes (and the 0.13-dev image) fix the issue?

https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev

I have used 0.13-dev, but I'm not entirely sure what you're asking me for there, as it has no recipe for taking input from systemd.

If I take the DaemonSet config as it stands, using the INPUT which tails from log files, then yes, that works fine and the k8s metadata is inserted into the logs sent to ES.

However, my problem is with taking INPUT from systemd. As such I updated your config as follows:
fluent-bit-configmap.yaml.txt
fluent-bit-ds.yaml.txt

With this setup I get no embedded k8s metadata, only logs of the form:

{
  "_index": "logstash-2018.02.23",
  "_type": "flb_type",
  "_id": "UMriwWEB7Ew-2Fy64adn",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-02-23T08:58:45.0Z",
    "PRIORITY": "6",
    "CONTAINER_NAME": "k8s_kairosdb_kairosdb-914055854-b63vq_default_d6c53deb-05a4-11e8-a8c4-080027435fb7_23",
    "CONTAINER_TAG": "",
    "CONTAINER_ID": "56e257661383",
    "CONTAINER_ID_FULL": "56e257661383836fac4cd90a23ee8a7a02ee1538c8f35657d1a90f3de1065a22",
    "_TRANSPORT": "journal",
    "_PID": "1719",
    "_UID": "0",
    "_GID": "0",
    "_COMM": "dockerd",
    "_EXE": "/usr/bin/dockerd",
    "_CMDLINE": "dockerd --host=fd:// --containerd=/var/run/docker/libcontainerd/docker-containerd.sock --log-driver=journald --mtu=1450 --selinux-enabled",
    "_CAP_EFFECTIVE": "3fffffffff",
    "_SYSTEMD_CGROUP": "/system.slice/docker.service",
    "_SYSTEMD_UNIT": "docker.service",
    "_SYSTEMD_SLICE": "system.slice",
    "_SELINUX_CONTEXT": "system_u:system_r:kernel_t:s0",
    "_BOOT_ID": "61cee272367249b7a64f35dc8839aaed",
    "_MACHINE_ID": "d5cbb0690d184677ad6b31d64880fcc8",
    "_HOSTNAME": "davra",
    "MESSAGE": "08:58:45.839 [qtp151442075-47] DEBUG [HttpParser.java:281] - filled 157/157",
    "_SOURCE_REALTIME_TIMESTAMP": "1519376325841637"
  },
  "fields": {
    "@timestamp": [
      "2018-02-23T08:58:45.000Z"
    ]
  },
  "sort": [
    1519376325000
  ]
}

Thoughts?

In the output above, check the MESSAGE field; that's your real log message:

"08:58:45.839 [qtp151442075-47] DEBUG [HttpParser.java:281] - filled 157/157",

Yes, but I'm expecting the Kubernetes filter to give me the same metadata that it does when I tail logs.

Something that includes fields like this:

    "kubernetes": {
      "pod_name": "announce-api-2527845369-9ttmj",
      "namespace_name": "default",
      "pod_id": "312f182e-05a8-11e8-95cf-080027435fb7",
      "labels": {
        "name": "announce-api",
        "pod-template-hash": "2527845369"
      },
      "annotations": {
        "kubernetes_io/created-by": "{\\\"kind\\\":\\\"SerializedReference\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"reference\\\":{\\\"kind\\\":\\\"ReplicaSet\\\",\\\"namespace\\\":\\\"default\\\",\\\"name\\\":\\\"announce-api-2527845369\\\",\\\"uid\\\":\\\"312d1acf-05a8-11e8-95cf-080027435fb7\\\",\\\"apiVersion\\\":\\\"extensions\\\",\\\"resourceVersion\\\":\\\"4317\\\"}}\\n"
      },
      "host": "192.168.10.169",
      "container_name": "announce-api",
      "docker_id": "652ce1959321ee676759ec8e591b5124150429e27bdacf2d6d2747986f7e4462"
    }

What I've been asking is: is it possible to get that same Kubernetes metadata from the filter when the input is systemd rather than tail?

If so, how?

Looks like a bug. Since I don't have a Kubernetes cluster with systemd logging running here, would you please provide the JSON output of the API server for that specific Pod:

$ kubectl proxy
$ curl http://127.0.0.1:8001/api/v1/namespaces/default/pods/kairosdb-914055854-b63vq > pod.json

Thanks Eduardo, file attached.

If it helps in your debugging: I'm flipping all my k8s pods between json-file and journald logging by changing the --log-driver switch in the docker.service ExecStart and restarting Docker.
kairosdb-pod.json.txt

@garysmith-github

The kubernetes filter in your ConfigMap must enable the property "Use_Journal On".

Note: since your sources are both files and systemd, you should use different tagging and set up two kubernetes filters, one for tail and the other for systemd-based records, e.g.:

    [INPUT]
        Name              tail
        Tag               kube_tail.*
        Path              /var/log/containers/*.log
        Parser            docker
        DB                /var/log/flb_kube_tail.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  10

    [INPUT]
        Name              systemd
        Tag               kube_systemd.*
        Path              /var/log/journal
        DB                /var/log/flb_kube_systemd.db

    [FILTER]
        Name                kubernetes
        Match               kube_tail.*
        Kube_URL            https://kubernetes.default.svc:443
        Merge_Log           On
        K8S-Logging.Parser  On

    [FILTER]
        Name                kubernetes
        Match               kube_systemd.*
        Kube_URL            https://kubernetes.default.svc:443
        Merge_Log           On
        K8S-Logging.Parser  On
        Use_Journal         On

Many thanks, Eduardo - that's working now.

When do you anticipate 0.13-dev will get released?

There still seem to be problems here: it is not filling in the correct pod details. For example:

{
  "_index": "logstash-2018.02.28",
  "_type": "flb_type",
  "_id": "8KB422EBJIqbAWYseZ1y",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2018-02-28T08:12:40.0Z",
    "PRIORITY": "6",
    "CONTAINER_TAG": "",
    "_TRANSPORT": "journal",
    "_PID": "1614",
    "_UID": "0",
    "_GID": "0",
    "_COMM": "dockerd",
    "_EXE": "/usr/bin/dockerd",
    "_CMDLINE": "dockerd --host=fd:// --containerd=/var/run/docker/libcontainerd/docker-containerd.sock --log-driver=journald --mtu=1450 --selinux-enabled",
    "_CAP_EFFECTIVE": "3fffffffff",
    "_SYSTEMD_CGROUP": "/system.slice/docker.service",
    "_SYSTEMD_UNIT": "docker.service",
    "_SYSTEMD_SLICE": "system.slice",
    "_SELINUX_CONTEXT": "system_u:system_r:kernel_t:s0",
    "_BOOT_ID": "481fa0d386a7418e84223d49064af28a",
    "_MACHINE_ID": "d5cbb0690d184677ad6b31d64880fcc8",
    "_HOSTNAME": "davra",
    "CONTAINER_ID": "8ae7eefda764",
    "CONTAINER_ID_FULL": "8ae7eefda7649136a0f52d668264b3520f5080a22d6c959b162ec52fdd1c5fb0",
    "CONTAINER_NAME": "k8s_announce-api_announce-api-2527845369-9ttmj_default_312f182e-05a8-11e8-95cf-080027435fb7_2",
    "MESSAGE": "     '    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1150:14)' ] }",
    "_SOURCE_REALTIME_TIMESTAMP": "1519805560815021",
    "kubernetes": {
      "pod_name": "kairosdb-914055854-b63vq",
      "namespace_name": "default",
      "pod_id": "d6c53deb-05a4-11e8-a8c4-080027435fb7",
      "labels": {
        "name": "kairosdb",
        "pod-template-hash": "914055854"
      },
      "annotations": {
        "kubernetes_io/created-by": "{\\\"kind\\\":\\\"SerializedReference\\\",\\\"apiVersion\\\":\\\"v1\\\",\\\"reference\\\":{\\\"kind\\\":\\\"ReplicaSet\\\",\\\"namespace\\\":\\\"default\\\",\\\"name\\\":\\\"kairosdb-914055854\\\",\\\"uid\\\":\\\"d6c2f841-05a4-11e8-a8c4-080027435fb7\\\",\\\"apiVersion\\\":\\\"extensions\\\",\\\"resourceVersion\\\":\\\"1592\\\"}}\\n"
      },
      "host": "192.168.10.169",
      "container_name": "kairosdb",
      "container_hash": ""
    }
  },
  "fields": {
    "@timestamp": [
      "2018-02-28T08:12:40.000Z"
    ]
  },
  "sort": [
    1519805560000
  ]
}

You can see this is:
"CONTAINER_NAME": "k8s_announce-api_announce-api-2527845369-9ttmj_default_312f182e-05a8-11e8-95cf-080027435fb7_2"
but
"kubernetes.pod_name": "kairosdb-914055854-b63vq"

Any ideas?

@garysmith-github Hmm, yeah, there is a problem. Parsing CONTAINER_NAME we should get:

name_prefix    = k8s
container_name = announce-api
container_hash =
pod_name       = announce-api-2527845369-9ttmj
namespace_name = default

Likely the hash key is not set correctly; I will troubleshoot.
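For reference, the breakdown above can be sketched as a small parser. This is a hypothetical illustration (the helper and field names are my own, not fluent-bit's), assuming the "k8s_<container>[.<hash>]_<pod>_<namespace>_<pod-uid>_<restart-count>" layout that kubelet uses when naming Docker containers:

```python
def parse_container_name(container_name):
    """Split a journald CONTAINER_NAME value into Kubernetes metadata.

    Returns None if the name does not follow the kubelet "k8s_..." layout.
    """
    parts = container_name.split("_")
    if len(parts) < 6 or parts[0] != "k8s":
        return None
    # The container segment may carry an optional ".<hash>" suffix.
    name, hash_ = (parts[1].split(".", 1) + [""])[:2]
    return {
        "name_prefix": parts[0],
        "container_name": name,
        "container_hash": hash_,
        "pod_name": parts[2],
        "namespace_name": parts[3],
        "pod_id": parts[4],
        "restart_count": parts[5],
    }

meta = parse_container_name(
    "k8s_announce-api_announce-api-2527845369-9ttmj_default_"
    "312f182e-05a8-11e8-95cf-080027435fb7_2")
# meta["pod_name"] → "announce-api-2527845369-9ttmj"
```

A mismatch like the one reported here would mean the filter picked up metadata from a different pod than the one this parse yields.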

@garysmith-github Please provide me the Pod metadata for the announce-api-252... pod:

$ kubectl proxy
$ curl http://127.0.0.1:8001/api/v1/namespaces/default/pods/announce-api-2527845369-9ttmj > pod.json

@edsiper See attached

announce-api-pod.json.txt

I've found the root cause of the problem. I will provide an update shortly.

@garysmith-github please check the new release (0.13-dev:0.10)

https://github.com/fluent/fluent-bit-kubernetes-logging/tree/0.13-dev

Thanks Eduardo, that seems to be fixed now.

Any idea when that will move to the non-dev branch?

My expectation is to release it by the third week of this month:

https://github.com/fluent/fluent-bit/milestone/5

Closing as fixed.

