Fluent-bit: Feature Request: Change multiline behavior to support cri-o type logs

Created on 8 May 2019 · 8 Comments · Source: fluent/fluent-bit

Problem

It appears that Fluent Bit currently does not support cri-o style multiline logs. This is due to the way the multiline code works.

Given a log format of type A, which indicates a normal log line, and a log format of type B, which indicates a continuation line, the current multiline code only supports ABBB-type multiline logs; it does not support BBBA-type multiline logs (see Examples 1 and 2 below).

Cri-o has two logtags, P indicating a partial log line (one that will be continued by another partial log line, or by a 'full log line'), and F indicating a full log line. A P tagged log will be followed by a terminal F tagged log to close out the multiline log. For example, cri-o logs look like:

<timestamp> <stream> P foo
<timestamp> <stream> P bar
<timestamp> <stream> P baz
<timestamp> <stream> F bot

These should become foo bar baz bot when combined into a single multiline log. Currently, if Parser_Firstline is set to match (?<date>[^ ]+) (?<stream>[^ ]+) P (?<log>.*), every P line starts a new record, so instead of the single log foo bar baz bot you get three logs: foo, bar, and baz bot.
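To make the difference concrete, here is a small Python sketch (not Fluent Bit code) of first-line-based grouping as described above: a line matching the first-line pattern starts a new record, and any other line is appended to the record in progress. The function name, the ANY_LINE helper regex, and the space separator are assumptions made for illustration; FIRSTLINE is the regex above rewritten in Python's (?P<name>...) syntax.

```python
import re

# The Parser_Firstline regex from above, in Python's named-group syntax.
FIRSTLINE = re.compile(r"(?P<date>[^ ]+) (?P<stream>[^ ]+) P (?P<log>.*)")

# Assumed general shape of a cri-o line; used only in this sketch to pull
# the message out of lines that do not match FIRSTLINE.
ANY_LINE = re.compile(
    r"(?P<date>[^ ]+) (?P<stream>[^ ]+) (?P<logtag>[PF]) (?P<log>.*)"
)


def group_by_firstline(lines):
    """Group lines so that every FIRSTLINE match opens a new record."""
    records = []
    current = None  # message of the record in progress, if any
    for line in lines:
        first = FIRSTLINE.match(line)
        if first:
            # A first-line match always closes the previous record.
            if current is not None:
                records.append(current)
            current = first.group("log")
            continue
        other = ANY_LINE.match(line)
        text = other.group("log") if other else line
        if current is None:
            # No record open: emit the line on its own.
            records.append(text)
        else:
            # Continuation: append to the record in progress.
            current = current + " " + text
    if current is not None:
        records.append(current)
    return records


lines = [
    "<timestamp> <stream> P foo",
    "<timestamp> <stream> P bar",
    "<timestamp> <stream> P baz",
    "<timestamp> <stream> F bot",
]
print(group_by_firstline(lines))
```

With the four lines above, each P line closes the previous record, so only baz and bot end up joined. This is the foo, bar, baz bot split described in the text.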

Possible Solution

Instead of a Parser_Firstline and Parser_N, can we have a Parser_Multiline_Start, Parser_Multiline_Continue and Parser_Multiline_End?

Example 1

Given the following logs in a file /some/file which can be captured with the current multiline behavior:

2019/05/07 some
  multiline
  log
2019/05/07 another
  multiline
  log
2019/05/07 non multiline
2019/05/07 non multiline 2

given the following parsers:

[PARSER]
    Name full
    Format regex
    Regex (?<date>\d{4}/\d{2}/\d{2}) (?<log>.*)

[PARSER]
    Name partial
    Format regex
    Regex \s{2}(?<log>.*)

Could be configured as:

[INPUT]
    Name tail
    Path /some/file
    Multiline On
    Parser_Multiline_Start full
    Parser_Multiline_Continue partial
    Parser_Multiline_End partial

Example 2

Given the following cri-o logs in a file /some/file, which cannot be captured with the current multiline behavior:

2019-05-07T18:57:50.904275087+00:00 stdout P some
2019-05-07T18:57:51.904275087+00:00 stdout P multiline
2019-05-07T18:57:52.904275087+00:00 stdout F log
2019-05-07T18:57:53.904275087+00:00 stdout P another
2019-05-07T18:57:54.904275087+00:00 stdout P multiline
2019-05-07T18:57:55.904275087+00:00 stdout F log
2019-05-07T18:57:56.904275087+00:00 stdout F non multiline
2019-05-07T18:57:57.904275087+00:00 stdout F non multiline 2

given the following parsers:

[PARSER]
    Name full
    Format regex
    Regex (?<date>[^ ]+) (?<stream>[^ ]+) F (?<log>.*)

[PARSER]
    Name partial
    Format regex
    Regex (?<date>[^ ]+) (?<stream>[^ ]+) P (?<log>.*)

Could be configured as:

[INPUT]
    Name tail
    Path /some/file
    Multiline On
    Parser_Multiline_Start partial
    Parser_Multiline_Continue partial
    Parser_Multiline_End full

I believe the logic to use these parameters, Parser_Multiline_Start, Parser_Multiline_Continue, and Parser_Multiline_End should be fairly straightforward. Would this be something that would be acceptable if it appeared in a PR?
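As a rough illustration of that logic, here is a Python sketch (not Fluent Bit code) of one possible way to interpret the three proposed parameters. The function names and the precedence rules are assumptions: while a record is open, a continue match is tried first, then an end match; otherwise a start match closes any open record and opens a new one, and an end match with no open record is emitted on its own. Joining parts with a space is also an assumption.

```python
import re


def compile_parser(regex):
    # Fluent Bit regexes use (?<name>...); Python needs (?P<name>...).
    # This naive rewrite is only meant for the simple regexes in this issue.
    return re.compile(regex.replace("(?<", "(?P<"))


def assemble(lines, start, cont, end, sep=" "):
    """Group lines into records using start/continue/end parsers."""
    records = []
    buffer = None  # message parts of the record in progress, if any

    def flush():
        nonlocal buffer
        if buffer is not None:
            records.append(sep.join(buffer))
            buffer = None

    for line in lines:
        if buffer is not None and (m := cont.match(line)):
            buffer.append(m.group("log"))      # continuation of open record
        elif buffer is not None and (m := end.match(line)):
            buffer.append(m.group("log"))      # terminal line closes record
            flush()
        elif m := start.match(line):
            flush()                            # a new start closes the old one
            buffer = [m.group("log")]
        elif m := end.match(line):
            records.append(m.group("log"))     # lone end line, e.g. a single F
        else:
            flush()
            records.append(line)               # unparsed line passes through
    flush()
    return records


# Example 2: Start = partial, Continue = partial, End = full.
full = compile_parser(r"(?<date>[^ ]+) (?<stream>[^ ]+) F (?<log>.*)")
partial = compile_parser(r"(?<date>[^ ]+) (?<stream>[^ ]+) P (?<log>.*)")
crio_lines = [
    "2019-05-07T18:57:50.904275087+00:00 stdout P some",
    "2019-05-07T18:57:51.904275087+00:00 stdout P multiline",
    "2019-05-07T18:57:52.904275087+00:00 stdout F log",
    "2019-05-07T18:57:56.904275087+00:00 stdout F non multiline",
]
print(assemble(crio_lines, start=partial, cont=partial, end=full))
```

Under these assumed rules the same function also handles Example 1 (Start = full, Continue = partial, End = partial), because a record there is closed by the next start match rather than by an end match.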

Best

All 8 comments

Hi, we also need a solution for cri-o (containerd) partial logs. For example, it would be great if this solution https://github.com/fluent/fluent-bit/pull/852 could be extended.

Definitely, we will work on a solution...

...but let's be honest, how is it possible that yet another tool is adding more complexity to how multiline logs are generated?

any updates?

The following is a minimal version of the idea from #852 implemented as a Lua filter. Untested, may be buggy.

local reassemble_state = {}

function reassemble_cri_logs(tag, timestamp, record)
   -- IMPORTANT: reassemble_key must be unique for each parser stream
   -- otherwise entries from different sources will get mixed up.
   -- Either make sure that your parser tags satisfy this or construct
   -- reassemble_key some other way
   local reassemble_key = tag
   -- if partial line, accumulate
   if record.logtag == 'P' then
      -- parentheses are required: '..' binds tighter than 'or' in Lua
      reassemble_state[reassemble_key] = (reassemble_state[reassemble_key] or "") .. record.message
      return -1, 0, 0
   end
   -- otherwise it's a full line, concatenate with accumulated partial lines if any
   record.message = (reassemble_state[reassemble_key] or "") .. record.message
   reassemble_state[reassemble_key] = nil
   return 1, timestamp, record
end

We are looking for a solution to handle cri-o multiline logs as well. Is there any way to do this now or is it still a missing feature?

@strantalis You can write logs from your application(s) WITHOUT newlines by replacing '\n' with another symbol, collect the logs with Fluent Bit, send the messages to ES, and then replace that symbol with '\n' again in an Elasticsearch pipeline...

Sounds like a crutch, but it's working - we use this solution for handling multiline java stacktraces - see more
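The round trip behind this workaround can be sketched in a few lines of Python. This only illustrates the idea; it is not the commenter's actual setup. The placeholder character and both function names are assumptions, and in the setup described above the decoding step would run in the Elasticsearch pipeline rather than in Python.

```python
# Hypothetical placeholder; choose a symbol that never occurs in real messages.
PLACEHOLDER = "\u2424"  # the "symbol for newline" character


def encode_newlines(message, placeholder=PLACEHOLDER):
    """Application side: collapse a multiline message into a single line."""
    return message.replace("\n", placeholder)


def decode_newlines(message, placeholder=PLACEHOLDER):
    """Storage side: restore the original line breaks."""
    return message.replace(placeholder, "\n")


stacktrace = "java.lang.RuntimeException: boom\n\tat Foo.bar(Foo.java:1)"
one_line = encode_newlines(stacktrace)
print(one_line)
print(decode_newlines(one_line) == stacktrace)
```

The scheme only works if the placeholder cannot appear in the original message; otherwise decoding would introduce line breaks that were never there.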

The Lua filter I described in https://github.com/fluent/fluent-bit/issues/1316#issuecomment-617445617 is another approach. Been using something very similar in production for the past few months, works great.

Thanks @ealebed & @imarko. It's hard to get teams to log in a consistent format but the lua filter looks like it is working.
