Fluent-bit: Implement tag-rewrite feature

Created on 13 Jun 2017 · 31 Comments · Source: fluent/fluent-bit

Hi

I'm evaluating Fluent Bit and it looks very promising. I know that record_modifier was just merged, and I would like to convert my old config to the Fluent Bit way. Is it possible to convert this Fluentd config fragment to Fluent Bit?

<filter kubernetes.**>
  @type record_modifier
  <record>
    enable_ruby true
    tag ${ record["logger_name"] ? "json" : "flat" }.kubernetes.${record["kubernetes"]["namespace_name"]}.${record["kubernetes"]["pod_name"]}-${record["kubernetes"]["container_name"]}
  </record>
</filter>
<match kubernetes.**>
  @type rewrite_tag_filter
  rewriterule1 tag (.+)       goto.$1
</match>

Labels: enhancement, fixed

All 31 comments

@dawidmalina the tag rewrite plugin was just sent in a PR but has not been reviewed yet.

This kind of string composition in a configuration is not supported yet; it will likely be implemented in version 0.13 (we are releasing 0.11.9 today and 0.12 around next week).

Renaming this topic to set it as an enhancement request

Fluentd's rewrite_tag_filter functionality would be important for us too, so that we can completely replace Fluentd as a forwarder. One use case is multiple Docker containers sending to a single forward input through the fluentd logging driver: containers can need different parsing formats and may also have different stdout and stderr formats.

For example, an nginx container needs stderr and stdout separated so that each can be matched to the parser for its format:

<match docker.nginx>
  @type rewrite_tag_filter
  remove_tag_prefix docker
  rewriterule1 source stdout ${tag}.access
  rewriterule2 source stderr ${tag}.error
</match>

@edsiper any news (plans) for this improvement?

I am looking for a way to rewrite the tag using another field, for example:

rewriterule1 tag  ${tag}.${component}

No ETA yet for this feature.

@edsiper any news about this feature?

@edsiper do you think this can be implemented for 0.15? I think there are a lot of users looking for this change.

Maybe you could post some guidelines here on how this could be implemented, so that someone could contribute.

It looks like this will complicate https://github.com/fluent/fluent-bit/pull/291,
but I think duplicating the record with a different tag would be enough.

Yes. After checking the sources, a much easier way is to write a meta plugin that has one input and many outputs with different tags (you can then filter each tag).
Right now we have "static" routing: in the source it looks like an "in_" plugin with filters and an output handler, and there is nothing like normal routing in the source at all.

This would be a great feature.
In our case we want to split the output into two different indices, depending on a specific value in a field. Right now it just seems not possible with Fluent-Bit?

Right now it just seems not possible with Fluent-Bit?

It depends on what exactly you are doing. If you have many inputs with different tags, it's possible; if you have only one input, it's not.
Also, sometimes you can just duplicate the input plugin with a different tag; it's not very efficient, but it should work.

But how would that work with Kubernetes, @stalkerg?
For example, we run a lot of services in k8s, such as nginx, Kafka and RabbitMQ, and I'd like to use a parser for each kind of service.
Any tips on how to detect the type of service and use the right parser?

@stalkerg
But isn't the Fluent Bit documentation "wrong" right now?
Without tag rewrite this picture is not achievable, is it?
[image]

A short update:

There is some progress on this. Note that this feature is not straightforward, since some prerequisite tasks are required. One of the primary tasks, recently completed, is a new buffering mechanism that implements a new way for input plugins to register records and tags.

So this is currently a work in progress.

What is the status of this?

What about adding this ability to Lua filters? We have the opportunity to modify timestamp and records, and we get the tag as an input, why not allow this to be changed there also?

Sitting in the deep dive talk at KubeCon right now, realising this should be possible with stream processing and creating a new stream with a new tag.
-> https://docs.fluentbit.io/stream-processing/getting_started/fluent_bit_sql

This worked for me to create a new event with specific tag for re-processing - only issue is that I wasn't able to find a clean way to discard the original event. YMMV.

  stream-processor.conf: |
    [STREAM_TASK]
        Name   rails
        Exec   CREATE STREAM results WITH (tag='rails') AS SELECT * FROM TAG:'kube.*' WHERE kubernetes['labels']['fluentbit']='rails';

only issue is that I wasn't able to find a clean way to discard the original event. YMMV.

In the [INPUT] section, set routable off.

Is there any way to do this with Lua filters? Stream processing seems to be too new.

Feature added to roadmap for next Major Release v1.4.

POC will be updated shortly

@edsiper where can I find the PoC? Can we do something else to push this forward? (testing, contributing?)

It would be massively useful to have a way to rewrite the tag based on the value of another field. I think this is not possible even in stream processing?

As far as I know it's only possible to use a fixed string in stream processing and not select a value of a field to be a part of the tag of that newly created stream. Correct me if I'm wrong.

Hello!

Just a heads up: tag rewrite is a feature planned for the v1.4 release (end of January 2020). You can track the progress of the v1.4 release on this milestone:

https://github.com/fluent/fluent-bit/milestone/7

--

@juho9000 yes, tag composition is only string-based for now, that will be improved.

Will it be on track for the coming release at the end of the month?

By the way, awesome feature!

Any updates on when this will be released? Is there any way I can help test this feature (or others gating the release)?

My team is interested in moving from fluentd to fluent-bit, but we would prefer to use this feature over the stream processing feature for routing.

It's currently a work in progress; the v1.4 release will include this new filter.

Why is it taking more time than expected?

Originally, Fluent Bit's design did not consider a way to re-tag records; at that time not even a filtering feature was available. Today it is a totally different story.

The future rewrite_tag filter plugin (this is a filter, not an output plugin as in Fluentd) will support sub-key matching and regex capture, so you will be able to create a rule like this:

Rule   $key1['sub1']['sub2']    ^([a-zA-Z]+)-([0-9]+)$     tag-$key2['something'].$1
                ^                         ^                           ^
            record key        regex / matching pattern         new record tag

As you can see, this implementation is a bit complex. We are implementing custom features in the core to make things easier from a plugin perspective, such as record sub-key access, matching groups, and tag/string composition.

I really appreciate your patience on this. No v1.4 will be released until this plugin is ready.

So you can start testing now (you need to build from GIT master):

This patch adds a new filter called 'rewrite_tag' that allows
re-tagging records based on matching rules using regular expressions.

This plugin is built on top of new Fluent Bit features exposed by
the record accessor, which make it possible to create custom tags
from record key content, tag content or any supported placeholder.

To get started, the filter needs to define one or more rules. Rules
are processed in order until one of them matches the required
criteria. A rule has the following format:

 Rule  KEY  REGEX  NEW_TAG  KEEP

 - KEY    : name of a record key prefixed with '$', e.g: $name
 - REGEX  : regular expression defining the matching criteria
 - NEW_TAG: new tag to be defined, here you can use any placeholder
            supported.
 - KEEP   : boolean value that defines whether, when the record
            matches, the original one must be preserved or discarded.
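
The rule ordering and KEEP behaviour described above can be sketched in Python. This is only an illustration of the semantics, not Fluent Bit's implementation: `Rule` and `apply_rules` are hypothetical names, only top-level keys are looked up, NEW_TAG is treated as a plain string (no placeholder expansion), and the order of the returned pairs is an arbitrary choice.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    key: str      # top-level record key, written here without the leading '$'
    regex: str    # matching criteria
    new_tag: str  # tag assigned on a match (placeholders are not expanded here)
    keep: bool    # True: the original (tag, record) pair is preserved as well


def apply_rules(rules, tag, record):
    """Return the list of (tag, record) pairs that should be routed onward."""
    for rule in rules:  # rules are tried in order; the first match wins
        value = record.get(rule.key)
        if isinstance(value, str) and re.search(rule.regex, value):
            routed = [(tag, record)] if rule.keep else []
            routed.append((rule.new_tag, record))
            return routed
    return [(tag, record)]  # no rule matched: the record keeps its tag


# The nginx stdout/stderr case from earlier in the thread, as two rules.
nginx_rules = [
    Rule("source", r"^stdout$", "nginx.access", keep=False),
    Rule("source", r"^stderr$", "nginx.error", keep=False),
]
routed = apply_rules(nginx_rules, "docker.nginx", {"source": "stderr"})
```

With KEEP set to false the record leaves only under the new tag; with KEEP set to true both the original and the re-tagged copy continue through the pipeline.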

== Usage ==

Consider the following JSON payload (formatted for readability):

  {
    "name": "abc-123",
    "ss": {
      "s1": {
        "s2": "flb"
      }
    }
  }

Assuming the Tag associated with the record is 'aa.bb', we can use the
following configuration to rewrite its Tag and discard the original
record:

  [FILTER]
      Name  rewrite_tag
      Match aa.bb
      Rule  $name ^([a-z]+)-([0-9]+)$  test.$TAG[0].$1.${HOSTNAME} false

  [OUTPUT]
      Name  stdout
      Match test.*

== About Placeholders ==

The tag composition is very flexible and supports the following
placeholders:

 - ${VAR}    => translated from an environment variable, e.g: ${HOSTNAME}

 - $TAG      => full Tag associated to the record

 - $TAG[n]   => Tag part, parts are values between dots e.g: aa.bb, where
                $TAG[0] will retrieve 'aa' and $TAG[1] will get 'bb'.

 - $key      => value from the record key called 'key', e.g: specifying
                $name will return 'abc-123'.

 - $key['x'] => if the record 'key' value is a map, using one or multiple
                square brackets you can specify the sub-key value desired.
                Considering the above JSON example we could specify the
                key $ss['s1']['s2'] to retrieve the value 'flb'.
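
To make the placeholder list concrete, here is a rough Python sketch of how such a tag template could be expanded. It is not the record accessor code from Fluent Bit: the `expand` function, its token regex, and the choice to substitute an empty string for anything missing are all assumptions made for illustration.

```python
import os
import re

# One alternative per placeholder kind; order matters ($TAG[n] before $TAG, etc.).
_TOKEN = re.compile(
    r"\$\{(?P<env>\w+)\}"                          # ${VAR}
    r"|\$TAG\[(?P<part>\d+)\]"                     # $TAG[n]
    r"|\$(?P<full>TAG)\b"                          # $TAG
    r"|\$(?P<group>\d+)"                           # $1, $2: regex capture groups
    r"|\$(?P<key>\w+)(?P<subs>(?:\['[^']*'\])*)"   # $key or $key['x']['y']
)


def expand(template, tag, record, match=None, env=None):
    """Expand placeholders in `template`; missing values become '' (assumption)."""
    env = os.environ if env is None else env
    parts = tag.split(".")

    def replace(m):
        if m.group("env") is not None:
            return env.get(m.group("env"), "")
        if m.group("part") is not None:
            index = int(m.group("part"))
            return parts[index] if index < len(parts) else ""
        if m.group("full") is not None:
            return tag
        if m.group("group") is not None:
            n = int(m.group("group"))
            if match is None or n > match.re.groups:
                return ""
            return match.group(n) or ""
        value = record.get(m.group("key"))
        for sub in re.findall(r"\['([^']*)'\]", m.group("subs")):
            value = value.get(sub) if isinstance(value, dict) else None
        return "" if value is None else str(value)

    return _TOKEN.sub(replace, template)


# The record and rule from the usage example; the hostname is passed explicitly.
record = {"name": "abc-123", "ss": {"s1": {"s2": "flb"}}}
rule_match = re.search(r"^([a-z]+)-([0-9]+)$", record["name"])
new_tag = expand("test.$TAG[0].$1.${HOSTNAME}", "aa.bb", record,
                 match=rule_match, env={"HOSTNAME": "node1"})
print(new_tag)  # test.aa.abc.node1
```

Under these assumptions, the rule from the usage example turns the tag 'aa.bb' into 'test.aa.abc.' followed by the value of HOSTNAME, which is what the 'test.*' output match relies on.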

Have fun ;)

Documentation extended with examples and Monitoring section.

Closing this enhancement request as fixed
