Fluent Bit does not currently appear to support CRI-O style multiline logs, because of the way the multiline code works.
Given a log format A that indicates a normal log line and a log format B that indicates a continuation line, the current multiline behavior supports only ABBB-style multiline logs, not BBBA-style ones (to see what I mean, check out examples 1 and 2 below).
CRI-O has two log tags: P, indicating a partial log line (one that will be continued by another partial log line or by a full log line), and F, indicating a full log line. A run of P-tagged lines is closed out by a terminal F-tagged line. For example, CRI-O logs look like:
<timestamp> <stream> P foo
<timestamp> <stream> P bar
<timestamp> <stream> P baz
<timestamp> <stream> F bot
These should become foo bar baz bot when combined as a multiline log. Currently, if Parser_Firstline is set to match (?<date>[^ ]+) (?<stream>[^ ]+) P (?<log>.*), the result is three logs (foo, bar and baz bot) instead of the single log foo bar baz bot, because every P line starts a new record and only the trailing F line is appended to the record before it.
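To make the split concrete, here is a minimal sketch in Python of the grouping rule described above. It is a simplified model, not Fluent Bit's actual code: `group_by_firstline`, `payload` and `CRI_LINE` are hypothetical names, and stripping the timestamp/stream prefix from appended lines is an assumption made only to keep the output readable.

```python
import re

# Regex from the Parser_Firstline example above (Python spells named
# groups as (?P<name>...) instead of (?<name>...)).
FIRSTLINE = re.compile(r"(?P<date>[^ ]+) (?P<stream>[^ ]+) P (?P<log>.*)")

# Hypothetical helper pattern: strips the CRI-O prefix from any P or F line.
CRI_LINE = re.compile(r"[^ ]+ [^ ]+ [PF] (?P<log>.*)")


def payload(line):
    """Return the message part of a CRI-O line, or the raw line if it does not parse."""
    m = CRI_LINE.match(line)
    return m.group("log") if m else line


def group_by_firstline(lines):
    """Start a new record on every firstline match; append any other line to the last record."""
    records = []
    for line in lines:
        m = FIRSTLINE.match(line)
        if m:
            records.append(m.group("log"))
        elif records:
            records[-1] += " " + payload(line)
        else:
            # No open record yet: emit the line on its own.
            records.append(payload(line))
    return records


lines = [
    "t1 stdout P foo",
    "t2 stdout P bar",
    "t3 stdout P baz",
    "t4 stdout F bot",
]
print(group_by_firstline(lines))  # ['foo', 'bar', 'baz bot']
```

Because the P pattern is the "first line" pattern, each partial line opens a new record, which is the opposite of what CRI-O means by P.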
Instead of a Parser_Firstline and Parser_N, can we have a Parser_Multiline_Start, Parser_Multiline_Continue and Parser_Multiline_End?
Example 1: given the following logs in a file /some/file, which can be captured with the current multiline behavior:
2019/05/07 some
  multiline
  log
2019/05/07 another
  multiline
  log
2019/05/07 non multiline
2019/05/07 non multiline 2
given the following parsers:
[PARSER]
Name full
Format regex
Regex (?<date>\d{4}/\d{2}/\d{2}) (?<log>.*)
[PARSER]
Name partial
Format regex
Regex \s{2}(?<log>.*)
Could be configured as:
[INPUT]
Name tail
Path /some/file
Multiline On
Parser_Multiline_Start full
Parser_Multiline_Continue partial
Parser_Multiline_End partial
Example 2: given the following CRI-O logs in a file /some/file, which cannot be captured with the current multiline behavior:
2019-05-07T18:57:50.904275087+00:00 stdout P some
2019-05-07T18:57:51.904275087+00:00 stdout P multiline
2019-05-07T18:57:52.904275087+00:00 stdout F log
2019-05-07T18:57:53.904275087+00:00 stdout P another
2019-05-07T18:57:54.904275087+00:00 stdout P multiline
2019-05-07T18:57:55.904275087+00:00 stdout F log
2019-05-07T18:57:56.904275087+00:00 stdout F non multiline
2019-05-07T18:57:57.904275087+00:00 stdout F non multiline 2
given the following parsers:
[PARSER]
Name full
Format regex
Regex (?<date>[^ ]+) (?<stream>[^ ]+) F (?<log>.*)
[PARSER]
Name partial
Format regex
Regex (?<date>[^ ]+) (?<stream>[^ ]+) P (?<log>.*)
Could be configured as:
[INPUT]
Name tail
Path /some/file
Multiline On
Parser_Multiline_Start partial
Parser_Multiline_Continue partial
Parser_Multiline_End full
I believe the logic to use these parameters, Parser_Multiline_Start, Parser_Multiline_Continue, and Parser_Multiline_End should be fairly straightforward. Would this be something that would be acceptable if it appeared in a PR?
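A rough sketch of how that logic might work, written in Python rather than Fluent Bit's C so it can be run standalone. Everything here is an assumption for illustration: the function name `assemble`, the rule that a line matching both the continue and end parsers is treated as a continuation, and joining fragments with a single space. Parser_Multiline_Start, Parser_Multiline_Continue and Parser_Multiline_End are the proposed options, not existing ones.

```python
import re


def assemble(lines, start, cont, end, sep=" "):
    """Group lines into records using start/continue/end regexes.

    Each regex must define a named group 'log'. A line that matches
    'end' but not 'cont' closes the open record; a line that matches
    'cont' extends it; anything else closes the record implicitly.
    """
    records, buf = [], None
    for line in lines:
        if buf is not None:
            m_end, m_cont = end.match(line), cont.match(line)
            if m_end and not m_cont:
                # Unambiguous terminator (the F line in the CRI-O case).
                buf.append(m_end.group("log"))
                records.append(sep.join(buf))
                buf = None
                continue
            if m_cont:
                buf.append(m_cont.group("log"))
                continue
            # Neither continue nor end: flush, then handle the line below.
            records.append(sep.join(buf))
            buf = None
        m_start = start.match(line)
        if m_start:
            buf = [m_start.group("log")]
        else:
            # A lone terminator (e.g. a single F line) is a complete record.
            m_end = end.match(line)
            records.append(m_end.group("log") if m_end else line)
    if buf is not None:
        records.append(sep.join(buf))
    return records


# Example 2 parsers: start=partial, continue=partial, end=full.
partial = re.compile(r"(?P<date>[^ ]+) (?P<stream>[^ ]+) P (?P<log>.*)")
full = re.compile(r"(?P<date>[^ ]+) (?P<stream>[^ ]+) F (?P<log>.*)")
crio = [
    "2019-05-07T18:57:50.904275087+00:00 stdout P some",
    "2019-05-07T18:57:51.904275087+00:00 stdout P multiline",
    "2019-05-07T18:57:52.904275087+00:00 stdout F log",
    "2019-05-07T18:57:56.904275087+00:00 stdout F non multiline",
]
print(assemble(crio, start=partial, cont=partial, end=full))
```

The same function covers example 1 when called with start=full, continue=partial, end=partial: since the end parser is indistinguishable from the continue parser there, a record is closed only when the next start line (or end of input) arrives.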
Best
Hi, we also need a solution for partial CRI-O (containerd) logs. For example, it would be great if the solution in https://github.com/fluent/fluent-bit/pull/852 could be extended.
Definitely, we will work on a solution...
...but let's be honest, how is it possible that another tool is still adding more complexity when generating multiline logs?
any updates?
The following is a minimal version of the idea from #852 implemented as a Lua filter. Untested, may be buggy.
local reassemble_state = {}
function reassemble_cri_logs(tag, timestamp, record)
    -- IMPORTANT: reassemble_key must be unique for each parser stream
    -- otherwise entries from different sources will get mixed up.
    -- Either make sure that your parser tags satisfy this or construct
    -- reassemble_key some other way
    local reassemble_key = tag
    -- if partial line, accumulate it and drop the record (return code -1)
    if record.logtag == 'P' then
        -- parentheses are required: '..' binds tighter than 'or' in Lua
        reassemble_state[reassemble_key] = (reassemble_state[reassemble_key] or "") .. record.message
        return -1, 0, 0
    end
    -- otherwise it's a full line, concatenate with accumulated partial lines if any
    record.message = (reassemble_state[reassemble_key] or "") .. record.message
    reassemble_state[reassemble_key] = nil
    -- return code 1: the record was modified
    return 1, timestamp, record
end
We are looking for a solution to handle cri-o multiline logs as well. Is there any way to do this now or is it still a missing feature?
@strantalis You can write logs from your application(s) WITHOUT newlines by replacing '\n' with another symbol, collect the logs with Fluent Bit, send the messages to ES, and replace that symbol with '\n' again in an Elasticsearch pipeline.
It sounds like a crutch, but it works - we use this solution for handling multiline Java stack traces.
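For anyone wanting to picture the round trip, a minimal sketch in Python. The placeholder character and the function names are hypothetical; in the setup described above, the encode step would live in the application's logging code and the decode step in the Elasticsearch pipeline, not in one script.

```python
# Hypothetical placeholder: any symbol that never occurs in real log text.
# Here the ASCII "record separator" control character is assumed to be safe.
PLACEHOLDER = "\x1e"


def encode_newlines(message: str) -> str:
    """Application side: collapse a multiline message into one physical line."""
    return message.replace("\n", PLACEHOLDER)


def decode_newlines(message: str) -> str:
    """Pipeline side: restore the original newlines."""
    return message.replace(PLACEHOLDER, "\n")


trace = "java.lang.RuntimeException: boom\n\tat Foo.bar(Foo.java:1)"
single_line = encode_newlines(trace)
assert "\n" not in single_line            # the collector sees one line
assert decode_newlines(single_line) == trace
```

The trick only holds if the placeholder can never appear in genuine log output; otherwise the decode step would insert spurious newlines.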
The Lua filter I described in https://github.com/fluent/fluent-bit/issues/1316#issuecomment-617445617 is another approach. Been using something very similar in production for the past few months, works great.
Thanks @ealebed & @imarko. It's hard to get teams to log in a consistent format, but the Lua filter looks like it is working.