Fluent-bit: fluentbit.io/exclude works only on newly created pods

Created on 26 Feb 2019 · 9 comments · Source: fluent/fluent-bit

Bug Report

Describe the bug

The fluentbit.io/exclude annotation is honoured only for newly created pods. Annotating a pod that is already running has no effect, because fluent-bit caches the pod metadata and never refreshes it.
To Reproduce

  • Steps to reproduce the problem:
  1. Start fluent-bit as a DaemonSet.
  2. Apply the following pod:
apiVersion: v1
kind: Pod
metadata:
  name: logger
  namespace: default
spec:
  containers:
      - name: logger
        image: k8s.gcr.io/logs-generator:v0.1.1
        args:
          - /bin/sh
          - -c
          - |-
            /logs-generator --logtostderr --log-lines-total=${LOGS_GENERATOR_LINES_TOTAL} --run-duration=${LOGS_GENERATOR_DURATION}

            # Sleep forever to prevent restarts
            while true; do
              sleep 3600;
            done
        env:
        - name: LOGS_GENERATOR_LINES_TOTAL
          value: "100000"
        - name: LOGS_GENERATOR_DURATION
          value: "600s"
  3. Annotate the pod to exclude its logs:
$ kubectl annotate po logger fluentbit.io/exclude=true
  4. Observe that the fluent-bit local cache is not invalidated and the newly added fluentbit.io/exclude=true annotation is ignored: fluent-bit continues to stream logs from this pod. This is because kubectl annotate does not restart the pod. This behaviour is also not described in the docs; a warning could be added there.

Expected behavior

Fluent-bit should stop processing logs from the pod once it is annotated.

Your Environment

  • Version used: 1.0.4
  • Configuration:
[SERVICE]
        Flush           30
        Daemon          Off
        Log_Level       warn
        Parsers_File    parsers.conf

[INPUT]
        Name              tail
        Tag               kubernetes.*
        Path              /var/log/containers/*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     12MB
        Refresh_Interval  10
        ignore_older      1800s

[OUTPUT]
        Name            stdout
        Match           *

[FILTER]
        Name                kubernetes
        Match               kubernetes.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Buffer_Size         1M
        Merge_Log           On
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On
        Annotations         Off
        tls.verify          Off
  • Environment name and version (e.g. Kubernetes? What version?): K8S version 1.12.5
  • Server type and version:
  • Operating System and version:
  • Filters and plugins:

All 9 comments

Also fluentbit.io/exclude is unusable with StatefulSets:

  1. Apply the following sts:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: logger
  namespace: default
spec:
  serviceName: logger
  selector:
    matchLabels:
      app: logger
  template:
    metadata:
      labels:
        app: logger
    spec:
      nodeSelector:
        kubernetes.io/hostname: #put a node name here
      containers:
      - name: logger
        image: k8s.gcr.io/logs-generator:v0.1.1
        args:
          - /bin/sh
          - -c
          - |-
            /logs-generator --logtostderr --log-lines-total=${LOGS_GENERATOR_LINES_TOTAL} --run-duration=${LOGS_GENERATOR_DURATION}

            # Sleep forever to prevent restarts
            while true; do
              sleep 3600;
            done
        env:
        - name: LOGS_GENERATOR_LINES_TOTAL
          value: "100000"
        - name: LOGS_GENERATOR_DURATION
          value: "600s"
  2. Ensure that fluent-bit does not exclude the sts pod logger-0.
  3. Edit the sts and add fluentbit.io/exclude: "true":
$ kubectl edit sts logger
statefulset.apps/logger edited
  4. Observe that fluentbit.io/exclude: "true" is ignored and fluent-bit continues to receive events from the newly created pod logger-0. Note that the sts has a nodeSelector that makes it easy to schedule the pod on the same node (and therefore the same fluent-bit instance), which makes the issue easier to reproduce.

We need to implement watch.
We should probably move to a model where the watch is filtered by the node we are on and, when a change is delivered, the cache is invalidated.

@ialidzhikov on the StatefulSet, won't it do a RollingUpdate and thus get the annotation on all the Pods owned by the StatefulSet controller?

I would really like to see this feature. My team is experiencing the same problem when running FluentBit as a DaemonSet.

@ialidzhikov Did you add the fluentbit.io/exclude annotation to .metadata, or to .spec.template.metadata (which should be the right one)?

@donbowman , it does a RollingUpdate and sets the annotation on all of the pods. As far as I can see, the cache_key is built from <namespace>:<pod_name>:<container_name>. After the sts update a new pod is created with the newly added annotation, but the cache_key stays the same: the sts always follows the pattern <pod-name>-0, <pod-name>-1, <pod-name>-2 for the pod names. To recap: the cache_key does not change with the sts update, so fluent-bit does not go to kube-apiserver to read the newly added annotation, and fluentbit.io/exclude is completely ignored.

@lbogdan , I reproduced it just now, and I added the annotation to .spec.template.metadata. You can also give it a try with the steps above. : )

@donbowman We have the same issue. It would be really helpful to implement the watch model to catch the annotation "exclude".
We use fluent-bit in a production environment with more than 4,000 pods running, and it would be great to be able to stop collecting logs from specific pods, or even all pods in a given namespace, simply by adding the annotation.

/kind bug

@donbowman While waiting for the "watch" model, would it be possible to add a parameter to refresh the cache periodically, or an HTTP endpoint that takes the name of the resource to refresh (a pod or a namespace)?
Or add a TTL for the cache entries which can be set via configuration. That could be a good start:

  1. Add a timestamp when the cache entry is added.
  2. In flb_kube_meta_get, after the call to flb_hash_get returns the cache content, compare the timestamp against the configured TTL. If expired, release the cached data and call the API server again.

One use case I currently have is that I would like to exclude the fluent-bit DaemonSet itself.
I've seen this "fix itself" after a few hours of running, but I haven't seen anything deterministic yet.

If anyone knows how I can trigger the exclude, even manually, that would be great.
