Fluent-bit: Kubernetes output/index based on namespace

Created on 8 Sep 2018 · 24 Comments · Source: fluent/fluent-bit

I have Fluent Bit deployed to my Kubernetes cluster, sending everything to a single Elasticsearch index. Per our requirements, logs from namespaces with '-prod' in the name must go to the prod index and logs from namespaces with '-stage' must go to the non-prod index, because each index has a different retention policy we have to follow.
Is it possible to have Fluent Bit send to different outputs based on the namespace field set by the kubernetes filter?

Maybe something like:

    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker_no_time
        Tag              kube.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex        (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        Refresh_Interval 5
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On

    [OUTPUT]
        Name  es1
        Match kube.*-prod
        Host  elasticsearch1
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Time_Key @timestamp

    [OUTPUT]
        Name  es2
        Match kube.*-stag
        Host  elasticsearch2
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Time_Key @timestamp
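
To make the tag expansion concrete, here is a sketch of what Tag_Regex would produce for one log file. The file name is invented for illustration (the container ID is shortened); the captures follow from the regex above.

    # Hypothetical file (name invented for this example):
    #   /var/log/containers/web-5d8c_shop-prod_nginx-0123abcd.log
    #
    # Tag_Regex captures:
    #   pod_name       = web-5d8c
    #   namespace_name = shop-prod
    #   container_name = nginx
    #
    # Tag template:  kube.<namespace_name>.<pod_name>.<container_name>
    # Resulting tag: kube.shop-prod.web-5d8c.nginx

Note that the namespace ends up as the second dot-separated field of the tag, not the last one. If the wildcard in Match has to cover the whole tag (my reading of the router, not something confirmed in this thread), a pattern such as kube.*-prod.* would be needed so that the pod and container fields are covered too. Worth checking against your version.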

@donbowman That is awesome, but I don't see anything about Tag_Regex in the docs yet. I took a look at the code and it doesn't seem to be released yet. I'll keep watch.

Tag_Regex is indeed in the code already. I have an outstanding PR for the docs that adds it:
https://github.com/fluent/fluent-bit-docs/pull/102


@donbowman I tried your configuration above with the latest Docker image, and all I got for the tag was 'kube...'; the regex was simply ignored.

Tag_Regex seems to exist only in the master branch; release v0.14.4 doesn't have it.
https://github.com/fluent/fluent-bit/blob/cbc635c6373d15f02a37d31e05016de9073219e5/plugins/in_tail/tail_config.c

Ah yes, it's in the code :)
but not in the 0.14.4 Docker Hub image.


Cool! Thought so, else I wasn't sure what was going on.
I'll keep watch for a newer image.

I've set my Docker Hub to build from my fork:

donbowman/fluent-bit:latest --> latest master, should be in sync with upstream
donbowman/fluent-bit:integration --> all my changes merged

Feel free to try that if you wish. It's building now.


Merged


Is this closed now?

Sorry, do we know which release this will be available in?
I see it in master here: https://github.com/fluent/fluent-bit/blob/master/plugins/in_tail/tail_config.c#L243

Hi,
Any update on this, regarding the two previous questions by @donbowman and @kskewes?

The feature is in the code of the 1.x releases and the documentation PR has been accepted:
https://docs.fluentbit.io/manual/input/tail

@kskewes mentioned on the fluent-bit Slack that he is using it to steer to outputs based on namespace, so I think it's in use now.

Great, I'm going to try it with the current release as of today, which is v1.0.2.

Thanks

Works great, though I'm seeing +140m in CPU usage per pod.

I use two inputs that glob-match on the presence or exclusion of myappnamespace in the log filename:

  1. Path /var/log/containers/*_<myappnamespace>_*.log, tag with kube.<myapp>....
  2. Exclude_Path /var/log/containers/*_<myappnamespace>_*.log, tag with kube.infra....

Input (2), which matches everything except the specific namespace, is below:

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Exclude_Path      /var/log/containers/*_<myappnamespace>_*.log
        Tag               kube.infra.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex         (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        Parser            cri
        DB                /var/log/flb_kube_infra.db
        Mem_Buf_Limit     500KB
        Skip_Long_Lines   On
        Refresh_Interval  10

Two outputs match the tags set by the inputs. The infra one:

    [OUTPUT]
        Name            es
        Match           kube.infra.*
        Host            ${FLUENT_ELASTICSEARCH_INFRA_HOST}
        Port            ${FLUENT_ELASTICSEARCH_INFRA_PORT}
        Logstash_Format On
        Logstash_Prefix infra-container
        Time_Key        timestamp
        #Generate_ID     On
        Replace_Dots    On
        Retry_Limit     False
        tls             On
        tls.verify      Off
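
For completeness, input (1) and its output would presumably mirror the blocks above. This is only a sketch: the kube.myapp tag prefix, the DB path, the FLUENT_ELASTICSEARCH_APP_* variable names and the Logstash_Prefix value are assumptions, not taken from the original setup.

    [INPUT]
        Name              tail
        # Only pick up files from the application namespace
        Path              /var/log/containers/*_<myappnamespace>_*.log
        Tag               kube.myapp.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex         (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        Parser            cri
        # Separate position database so the two tail inputs do not share state (assumed path)
        DB                /var/log/flb_kube_myapp.db
        Mem_Buf_Limit     500KB
        Skip_Long_Lines   On
        Refresh_Interval  10

    [OUTPUT]
        Name            es
        Match           kube.myapp.*
        # Hypothetical variables pointing at the application Elasticsearch
        Host            ${FLUENT_ELASTICSEARCH_APP_HOST}
        Port            ${FLUENT_ELASTICSEARCH_APP_PORT}
        Logstash_Format On
        Logstash_Prefix myapp-container
        Time_Key        timestamp
        Replace_Dots    On
        Retry_Limit     False
        tls             On
        tls.verify      Off

Because the two Path/Exclude_Path globs are complementary, each log file should be read by exactly one of the two inputs.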

Hi,

I have set up the following ConfigMap, taken directly from the repo https://github.com/fluent/fluent-bit-kubernetes-logging, but I keep getting the error below:

````
k logs -f fluent-bit-dcshs
Fluent Bit v1.0.2
Copyright (C) Treasure Data

Output plugin 'ns-dev' cannot be loaded
Error: You must specify an output target. Aborting
````
The error is a little confusing, because I'm not sure whether it's failing to get the Elasticsearch host or whether the Match clause in the [OUTPUT] never matches anything. Any idea?

Here is the ConfigMap:

````
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         1
        Log_Level     info
        Daemon        off
        Parsers_File  parsers.conf
        HTTP_Server   On
        HTTP_Listen   0.0.0.0
        HTTP_Port     2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-elasticsearch.conf

  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Tag               kube.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex         (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        Parser            docker
        DB                /var/log/containers/fluentbit_kube.db
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On
        Refresh_Interval  5

  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        tls.verify          Off
        Merge_Log           On
        K8S-Logging.Parser  On

  output-elasticsearch.conf: |
    [OUTPUT]
        Name            es1
        Match           kube.*-dev
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Logstash_Prefix ns-dev
        Type            flb_type
        Time_Key        @timestamp
        Retry_Limit     False
        tls             On
        tls.verify      Off

    [OUTPUT]
        Name            es2
        Match           kube.*-uat
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Logstash_Prefix ns-uat
        Type            flb_type
        Time_Key        @timestamp
        Retry_Limit     False
        tls             On
        tls.verify      Off

  parsers.conf: |
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On
        Decode_Field_As   escaped    log

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S
````

Try changing your outputs from Name es1 and Name es2 to Name es.
Name is the output plugin name, eg: es, stdout.
See here: https://docs.fluentbit.io/manual/configuration/file

For some reason I ended up having to do 2x TAIL sections to split a namespace off to a different ES.
Please let me know how you get on.
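
Applied to the ConfigMap above, the fix would look roughly like this. Only Name changes; everything else is copied from the posted outputs, and the comments are mine.

    [OUTPUT]
        # Name selects the output plugin, so both sections use es
        Name            es
        Match           kube.*-dev
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Logstash_Prefix ns-dev
        Type            flb_type
        Time_Key        @timestamp
        Retry_Limit     False
        tls             On
        tls.verify      Off

    [OUTPUT]
        Name            es
        Match           kube.*-uat
        Host            ${FLUENT_ELASTICSEARCH_HOST}
        Port            ${FLUENT_ELASTICSEARCH_PORT}
        Logstash_Format On
        Logstash_Prefix ns-uat
        Type            flb_type
        Time_Key        @timestamp
        Retry_Limit     False
        tls             On
        tls.verify      Off

The two sections are still told apart by their Match and Logstash_Prefix values, not by Name.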

Wow, thanks a lot @kskewes! Of course, how did I miss that?!
I'm getting the logs in Elasticsearch now, with Kibana visualisation. I still need to tune the parser a little, but that's already a big step forward.

Thanks

You're most welcome. :)
Thanks to Don for the patch!

You can also consider using Match_Regex on the output with a single tail input, rather than the two tails. That should use less CPU, I think(?)

Thanks Don, waiting for pods to catch up on back logs.
For watchers the following combo is working - using simple glob match + regex negative match.

    <output 1>
    Match         kube.*myapp*.*
    ...
    <output 2>
    Match_Regex    ^kube\.(?~myapp)\.*$
    ...
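
Spelled out a little more, that combination might look like the sketch below: one tail input, one output selected by a glob Match, and one selected by a negative Match_Regex. The two patterns are the ones quoted above; the tag layout, parser, variable names and prefixes are placeholders I have assumed.

    [INPUT]
        Name              tail
        # Single input for every container log
        Path              /var/log/containers/*.log
        Tag               kube.<namespace_name>.<pod_name>.<container_name>
        Tag_Regex         (?<pod_name>[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-
        # Assumed parser; use whichever matches your container runtime
        Parser            cri
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On

    [OUTPUT]
        Name            es
        # Glob match: tags that contain myapp
        Match           kube.*myapp*.*
        # Hypothetical variables for the application Elasticsearch
        Host            ${FLUENT_ELASTICSEARCH_APP_HOST}
        Port            ${FLUENT_ELASTICSEARCH_APP_PORT}
        Logstash_Format On
        Logstash_Prefix myapp-container

    [OUTPUT]
        Name            es
        # Negative regex match: the remaining tags, as quoted in the comment above
        Match_Regex     ^kube\.(?~myapp)\.*$
        # Hypothetical variables for the infra Elasticsearch
        Host            ${FLUENT_ELASTICSEARCH_INFRA_HOST}
        Port            ${FLUENT_ELASTICSEARCH_INFRA_PORT}
        Logstash_Format On
        Logstash_Prefix infra-container

Each output uses either Match or Match_Regex, so every record should be routed by exactly one of the two patterns.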

I think this can be closed now @mitchellmaler ?

Yes it can.

Can anyone upload a full configuration for splitting namespaces into separate logs in Elasticsearch?

I landed here initially as well. I commented a solution on the open issue https://github.com/fluent/fluent-bit/issues/1775#issuecomment-678631531
That issue covers dynamically routing to different indices using one output.
