Describe the bug
Intermittently, across multiple instances of our AWS EKS clusters, logs arrive with the Kubernetes enrichment and annotations missing; the entire kubernetes key is null.
We have tried restarting the fluent-bit pods, deleting the DaemonSet, and creating new clusters. Sometimes it works (logs arrive with the k8s enrichment and annotations), other times it does not.
To Reproduce
Here is an example of a log record that arrived without enrichment (note the null kubernetes key):
@i: 33b65867
@m: Waiting 1499 seconds to renew Vault token.
@t: 2019-10-29T14:25:31.9918803Z
delaySeconds: 1499
environment: Union
kubernetes: null
stream: stdout
time: 2019-10-29T14:25:31.992070193Z
version: 0.2.2215.24190
We haven't been able to pinpoint the issue or reproduce it consistently.
Expected behavior
When we define the input, filter, and parser to extract Kubernetes metadata and annotations from our logs and the API, that data should be added to our log records.
Your Environment
DaemonSet pod spec (fragment):
spec:
priorityClassName: system-cluster-critical
serviceAccountName: fluent-bit
containers:
- name: aws-for-fluent-bit
image: amazon/aws-for-fluent-bit:1.3.2
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: fluent-bit-config
mountPath: /fluent-bit/etc/
- name: mnt
mountPath: /mnt
readOnly: true
resources:
limits:
memory: 500Mi
requests:
cpu: 500m
memory: 100Mi
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: fluent-bit-config
configMap:
name: fluent-bit-config
- name: mnt
hostPath:
path: /mnt
---
apiVersion: v1
kind: ConfigMap
metadata:
name: fluent-bit-config
namespace: kube-system
labels:
app.kubernetes.io/name: fluentbit
data:
fluent-bit.conf: |
[SERVICE]
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_PORT 2020
[INPUT]
Name tail
# namespace_name, container_name, pod_name, docker_id must be present in the tag
# as they are required by the kubernetes filter
# This regex cribbed from https://github.com/fluent/fluent-bit/blob/v1.0.2/plugins/filter_kubernetes/kube_regex.h#L25
Tag_Regex (?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
Tag k8s_containers.<namespace_name>.<container_name>.<pod_name>.<docker_id>-
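# Worked example (names taken from the log output further below):
#   /var/log/containers/fluentbit-7g649_kube-system_aws-for-fluent-bit-<64-hex-id>.log
# produces the tag
#   k8s_containers.kube-system.aws-for-fluent-bit.fluentbit-7g649.<64-hex-id>-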
Path /var/log/containers/*.log
Parser docker
DB /var/log/flb_kube.db
DB.Sync Normal
Buffer_Chunk_Size 1M
Buffer_Max_Size 2M
Mem_Buf_Limit 256MB
Skip_Long_Lines On
Refresh_Interval 2
Docker_Mode On
[FILTER]
Name kubernetes
Match *
Kube_URL https://kubernetes.default.svc:443
Merge_Log On
Keep_Log Off
Annotations On
tls.verify Off
Regex_Parser k8s-custom-tag
Kube_Tag_Prefix k8s_containers.
# Add logging to cloudwatch by looping through Namespaces
[OUTPUT]
Alias kube-system
Name cloudwatch
Match k8s_containers.kube-system*
region us-east-1
log_group_name /eks/test/kube-system
log_stream_prefix kube-system-
auto_create_group true
parsers.conf: |
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%Z
Time_Keep On
# Command | Decoder | Field | Optional Action
# =============|==================|=================
# Decode_Field_As escaped_utf8 log do_next
Decode_Field_As json log
[PARSER]
# Kubernetes filter must be able to parse these fields out of the tag to work
Name k8s-custom-tag
Format regex
Regex (?<namespace_name>[^_]+)\.(?<container_name>.+)\.(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)\.(?<docker_id>[a-z0-9]{64})-$
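# Worked example: after Kube_Tag_Prefix strips "k8s_containers.", the remainder
#   kube-system.aws-for-fluent-bit.fluentbit-7g649.<64-hex-id>-
# parses into namespace_name=kube-system, container_name=aws-for-fluent-bit,
# pod_name=fluentbit-7g649 and docker_id=<64-hex-id>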
---
# Service added for ServiceMonitor (Prometheus)
apiVersion: v1
kind: Service
metadata:
name: fluentbit
namespace: kube-system
labels:
app: fluentbit
spec:
selector:
name: fluentbit
ports:
- port: 2020
protocol: TCP
targetPort: 2020
name: metrics
Additional context
Software development is severely impacted because we cannot trace where in the system things are failing. It also affects analytics, as values we rely on in our queries are missing.
Here is another example, with fluent-bit output written to a file:
k8s_containers.kube-system.aws-for-fluent-bit.fluentbit-7g649.c311243e64905c4f193c7bca600399fa8c7f8b48abfe15f4a41dfe11e196ea45-: [1572373624.454250, {"log":"2019-10-29T18:27:04.441400237Z stderr F [2019/10/29 18:27:04] [debug] [filter_kube] API Server (ns=kube-system, pod=kube-proxy-lvt7v) http_do=0, HTTP Status: 200","kubernetes":null}]
Interesting to note: when I manually give the DaemonSet AWS credentials and run it in a kind cluster, I still get intermittent "kubernetes":null. I also tried changing the output to file, and again saw intermittent failures.
Any thoughts, @kantica and @edsiper?
Today we greatly simplified the config and deployed it (kubectl apply -f) in a local kind cluster.
We seem to consistently get no k8s metadata ("kubernetes": null) for the fluent-bit pod's own logs, while the logs of all other pods carry the full data. Hope this may help get to the root of the issue.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: pod-log-reader
rules:
- apiGroups: [""]
resources:
- namespaces
- pods
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: pod-log-crb
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: pod-log-reader
subjects:
- kind: ServiceAccount
name: fluent-bit
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluent-bit
namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentbit
namespace: kube-system
labels:
app.kubernetes.io/name: fluentbit
spec:
selector:
matchLabels:
name: fluentbit
template:
metadata:
labels:
name: fluentbit
spec:
priorityClassName: system-cluster-critical
serviceAccountName: fluent-bit
containers:
- name: aws-for-fluent-bit
image: amazon/aws-for-fluent-bit:1.3.2
env:
- name: FLB_LOG_LEVEL
value: debug
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: fluent-bit-config
mountPath: /fluent-bit/etc/
- name: mnt
mountPath: /mnt
readOnly: true
- name: cache
mountPath: /fluent-bit/cache
resources:
limits:
memory: 500Mi
requests:
cpu: 500m
memory: 100Mi
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: fluent-bit-config
configMap:
name: fluent-bit-config
- name: mnt
hostPath:
path: /mnt
- name: cache
emptyDir: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: fluent-bit-config
namespace: kube-system
labels:
app.kubernetes.io/name: fluentbit
data:
fluent-bit.conf: |
[SERVICE]
Parsers_File parsers.conf
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_PORT 2020
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
Parser docker
DB /var/log/flb_kube.db
Mem_Buf_Limit 256MB
[FILTER]
Name kubernetes
Match kube.*
Kube_URL https://kubernetes.default.svc:443
Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
Kube_Tag_Prefix kube.var.log.containers.
Merge_Log On
Merge_Log_Key log_processed
Kube_meta_preload_cache_dir /fluent-bit/cache
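# Optional: preload pod metadata from JSON files in this directory before querying the API server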
[OUTPUT]
Alias kube-system
Name file
Match kube.*
Path /var/log/kube-system.log
parsers.conf: |
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
This issue has been extremely difficult to consistently reproduce. Only _sometimes_ does the request to the API seem to produce null annotations. And because fluent-bit caches the metadata for a pod/namespace combination, it can take a while to encounter it.
Here's a method to _eventually_ trigger the issue: create a kind cluster, patch fluent-bit so the metadata cache lookup always misses (forcing an API-server query for every record), build and load the patched image, deploy the manifests below, and churn short-lived pods:
kind create cluster
export KUBECONFIG="$(kind get kubeconfig-path --name="kind")"
diff --git a/plugins/filter_kubernetes/kube_meta.c b/plugins/filter_kubernetes/kube_meta.c
index 0d3df7e7..e2a59529 100644
--- a/plugins/filter_kubernetes/kube_meta.c
+++ b/plugins/filter_kubernetes/kube_meta.c
@@ -974,7 +974,7 @@ int flb_kube_meta_get(struct flb_kube *ctx,
ret = flb_hash_get(ctx->hash_table,
meta->cache_key, meta->cache_key_len,
&hash_meta_buf, &hash_meta_size);
- if (ret == -1) {
+ if (-1 == -1) {
/* Retrieve API server meta and merge with local meta */
ret = get_and_merge_meta(ctx, meta,
&tmp_hash_meta_buf, &hash_meta_size);
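# Build an image from the patched source and load it into the kind cluster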
docker build -t fluent-bit:test .
kind load docker-image fluent-bit:test
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: pod-log-reader
rules:
- apiGroups: [""]
resources:
- namespaces
- pods
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: pod-log-crb
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: pod-log-reader
subjects:
- kind: ServiceAccount
name: fluent-bit
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: fluent-bit
namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentbit
namespace: kube-system
labels:
app.kubernetes.io/name: fluentbit
spec:
selector:
matchLabels:
name: fluentbit
template:
metadata:
labels:
name: fluentbit
spec:
serviceAccountName: fluent-bit
containers:
- name: fluent-bit
image: fluent-bit:test
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: fluent-bit-config
mountPath: /fluent-bit/etc/
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: fluent-bit-config
configMap:
name: fluent-bit-config
---
apiVersion: v1
kind: ConfigMap
metadata:
name: fluent-bit-config
namespace: kube-system
labels:
app.kubernetes.io/name: fluentbit
data:
fluent-bit.conf: |
[SERVICE]
Parsers_File parsers.conf
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
Parser docker
DB /var/log/flb_kube.db
Refresh_Interval 2
[FILTER]
Name kubernetes
Kube_Tag_Prefix kube.var.log.containers.
Match kube.*
Merge_Log On
[OUTPUT]
Alias kube-system
Name file
Match kube.*
Path /var/log/fluent-bit-output.log
parsers.conf: |
[PARSER]
Name docker
Format json
Time_Key time
Time_Format %Y-%m-%dT%H:%M:%S.%L
Time_Keep On
# First grab the container id of the control plane
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cc590a19909f kindest/node:v1.15.3 "/usr/local/bin/entr…" 15 minutes ago Up 15 minutes 64283/tcp, 127.0.0.1:64283->6443/tcp kind-control-plane
# Then docker exec
$ docker exec -it cc590a19909f /bin/bash
# Inside the control-plane container, tail the log
tail -f /var/log/fluent-bit-output.log | grep -iE 'null|filter'
export KUBECONFIG="$(kind get kubeconfig-path --name="kind")"
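# Churn short-lived pods: each new pod name is a new cache key, so the
# kubernetes filter must fetch fresh metadata on every iteration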
while true; do \
kubectl run -it --rm --generator=run-pod/v1 --image=alpine alpine -- /bin/sh -c 'for i in 1 2 3 4 5; do date; sleep 1; done'; \
sleep 1; \
done
Eventually, you will see logs from the output being tailed that look like this:
kube.var.log.containers.alpine_default_alpine-c40b7545867c821b3dfe0b398983e636132c44a98ab9ea8463d83b76206d77cd.log: [1573048310.008641, {"log":"2019-11-06T13:51:49.0348335Z stdout F Wed Nov 6 13:51:49 UTC 2019","kubernetes":null}]
kube.var.log.containers.alpine_default_alpine-c40b7545867c821b3dfe0b398983e636132c44a98ab9ea8463d83b76206d77cd.log: [1573048310.036044, {"log":"2019-11-06T13:51:50.0358045Z stdout F Wed Nov 6 13:51:50 UTC 2019","kubernetes":null}]
I seem to be hitting this as well. Not sure what causes it.
I have found what is causing this, and it's inside kube_meta.c: extract_container_hash().
Basically, the part of extract_container_hash() that looks at "containerID" and "imageID" does not work correctly on, for example, a kind cluster. As a result, the kubernetes msgpack metadata map gets corrupted.
What is going on?
Inside kube_meta.c, the following is defined:
#define FLB_KUBE_META_CONTAINER_STATUSES_KEY "containerStatuses"
#define FLB_KUBE_META_CONTAINER_STATUSES_KEY_LEN \
(sizeof(FLB_KUBE_META_CONTAINER_STATUSES_KEY) - 1)
#define FLB_KUBE_META_INIT_CONTAINER_STATUSES_KEY "initContainerStatuses"
#define FLB_KUBE_META_INIT_CONTAINER_STATUSES_KEY_LEN \
(sizeof(FLB_KUBE_META_INIT_CONTAINER_STATUSES_KEY) - 1)
#define FLB_KUBE_META_CONTAINER_ID_PREFIX "docker://"
#define FLB_KUBE_META_CONTAINER_ID_PREFIX_LEN \
(sizeof(FLB_KUBE_META_CONTAINER_ID_PREFIX) - 1)
#define FLB_KUBE_META_IMAGE_ID_PREFIX "docker-pullable://"
#define FLB_KUBE_META_IMAGE_ID_PREFIX_LEN \
(sizeof(FLB_KUBE_META_IMAGE_ID_PREFIX) - 1)
CONTAINER_ID_PREFIX is defined as "docker://"
IMAGE_ID_PREFIX is defined as "docker-pullable://"
Looking at (kind) cluster metadata from the kube-apiserver, the following can be observed:
# curl -s -k -H "Authorization: Bearer $token" https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods | jq '.items[].status.containerStatuses[] | "\(.containerID) \(.imageID)"'
"containerd://1d13ee4a35ca70f86cef420871d4a552ad7b14e251265dcf23595add22508247 docker.io/library/alpine@sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a"
"containerd://9d53fd9649c4ff7e67f73e26f7c1e13fc14caa2417bfc3694f567645f5338620 docker.io/library/alpine@sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a"
# curl -s -k -H "Authorization: Bearer $token" https://kubernetes.default.svc.cluster.local/api/v1/namespaces/default/pods | egrep '(containerID|imageID)'
"imageID": ""
"imageID": "docker.io/library/alpine@sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a",
"containerID": "containerd://1ed452c3d02d8114e5739375e1d67405fa40c2f78797ec2d8e5bd13740424273"
"containerID": "containerd://1efbaa7537602817fbf902570011db53097fed4b45301017f4c2b5691bb63d47"
"imageID": "docker.io/library/alpine@sha256:c19173c5ada610a5989151111163d28a67368362762534d8a8121ce95cf2bd5a",
"containerID": "containerd://1efbaa7537602817fbf902570011db53097fed4b45301017f4c2b5691bb63d47"
Clearly, the actual imageID values do not carry the expected "docker-pullable://" prefix, and the actual containerID prefix is "containerd://" instead of the defined "docker://".
The relevant code inside kube_meta.c is https://github.com/fluent/fluent-bit/blob/master/plugins/filter_kubernetes/kube_meta.c#L385-L400
If you comment out those two else if blocks, the issue should be gone.
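To make the failure concrete, here is a minimal standalone C sketch (illustrative only, not the fluent-bit code itself) of what unconditional prefix stripping does when the reported imageID is empty, as seen in the kind output above:
#include <stdio.h>
#include <string.h>

#define IMAGE_ID_PREFIX "docker-pullable://"
#define IMAGE_ID_PREFIX_LEN (sizeof(IMAGE_ID_PREFIX) - 1)  /* 18 */

int main(void)
{
    /* As reported by the kind cluster above: "imageID": "" */
    char image_id[32] = "";
    size_t size = strlen(image_id);                     /* 0 */

    /* The unguarded code assumes the prefix is always present: */
    const char *hash = image_id + IMAGE_ID_PREFIX_LEN;  /* 18 bytes past the ID */
    size_t hash_len = size - IMAGE_ID_PREFIX_LEN;       /* 0 - 18 underflows */

    /* This huge length is later used as the string size of the container
     * hash, which is how the kubernetes msgpack map ends up corrupted. */
    printf("hash = %p, hash_len = %zu\n", (void *)hash, hash_len);
    return 0;
}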
IMHO the IMAGE_ID and CONTAINER_ID prefixes should be configurable with sane defaults, and the relevant code should be rewritten to be more robust.
Here is a quick-and-dirty patch which should probably work so that the metadata msgpack map is not corrupted :) It only strips the prefix when the value is at least as long as the prefix, so an empty or unprefixed imageID no longer underflows:
diff --git a/plugins/filter_kubernetes/kube_meta.c b/plugins/filter_kubernetes/kube_meta.c
index 0d3df7e7..790c53ca 100644
--- a/plugins/filter_kubernetes/kube_meta.c
+++ b/plugins/filter_kubernetes/kube_meta.c
@@ -395,8 +395,10 @@ static void extract_container_hash(struct flb_kube_meta *meta,
"imageID",
k2.via.str.size)) {
/* Strip "docker-pullable://" prefix */
- container_hash = v2.ptr + FLB_KUBE_META_IMAGE_ID_PREFIX_LEN;
- container_hash_len = v2.size - FLB_KUBE_META_IMAGE_ID_PREFIX_LEN;
+ if (v2.size >= FLB_KUBE_META_IMAGE_ID_PREFIX_LEN) {
+ container_hash = v2.ptr + FLB_KUBE_META_IMAGE_ID_PREFIX_LEN;
+ container_hash_len = v2.size - FLB_KUBE_META_IMAGE_ID_PREFIX_LEN;
+ }
}
}
if (name_found) {
Thanks everyone, what teamwork :) Based on the feedback and the solution provided by @kantica, I have pushed the following __test__ image:
edsiper/fluent-bit-1.3.3-next:3
Please give it a try; if it looks good we can do a release TODAY. Please test it!
Tested with edsiper/fluent-bit-1.3.3-next:3. Could not get a null. Well done.
I deleted and applied my deployment referenced earlier several times with the new image and could not reproduce a null in the logs of my kind cluster. Thank you!
We are testing the patch too; it seems we cannot get any nulls (so far).
When we applied just https://github.com/fluent/fluent-bit/commit/d62bd88b9d03260cd62fdad73c2779663aa18bd8 to our custom image, we still eventually saw some nulls. But with current master (now at https://github.com/fluent/fluent-bit/commit/cc6b7bbbbfe3af42eab47286449bbcdce35a247c) we haven't seen any in the last 2 hours.
The new version is coming out today. Packages are already out, and Docker images will take a few more hours to be ready; thanks for your patience.
@edsiper
With this fix, container_hash contains the hash only, so imageID docker-pullable://bitnami/mongodb@sha256:a93f75fec367ea5393f46f6050dea9747053f694a2659da44a9e9e06cf2cd039 leads to container_hash "a93f75fec367ea5393f46f6050dea9747053f694a2659da44a9e9e06cf2cd039", while the expectation was to strip only the docker-pullable:// prefix.
I think memrchr should be replaced with memchr here https://github.com/fluent/fluent-bit/blob/master/plugins/filter_kubernetes/kube_meta.c#L325 (so as to search for the leftmost colon).
EDIT: please ignore; @kantica explained that keeping only the hash is the intended behavior.
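For context, a tiny standalone sketch of the memrchr vs memchr difference that was being discussed (illustrative only; memrchr is a GNU extension, and the hash is shortened here):
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *id = "docker-pullable://bitnami/mongodb@sha256:a93f75fec367";
    size_t len = strlen(id);

    const char *last = memrchr(id, ':', len);  /* rightmost ':' */
    const char *first = memchr(id, ':', len);  /* leftmost ':'  */

    printf("after memrchr: %s\n", last + 1);   /* a93f75fec367 (hash only) */
    printf("after memchr : %s\n", first + 1);  /* //bitnami/mongodb@sha256:a93f75fec367 */
    return 0;
}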
@kantica can you submit a new PR to fix the use case mentioned above?
It would be good to include this as part of the next v1.3.4 release
Ok, I'll try to do it today.
As far as I understand, this is not yet in the 1.3 branch. Are there any plans to backport it?
Thanks for working on this!
Any plans to release a version including the fix?
@kantica I tried to rebase on top of 1.3 but there are some conflicts. Is it possible to backport this?
Ok, I'll rebase/patch this fix onto the 1.3 branch. I'll get back as soon as I have it there.
@edsiper is there a nightly build or something we can use until a release including the fix is available?
Sorry for the long wait; I could not get to it earlier.
Thanks @kantica! Just in time for the v1.3.11 release today :)