Problem
If an application in Kubernetes logs multiline messages, Docker splits each message into multiple JSON log records.
The actual output from the application:
[2019-02-15 10:36:31.224][38][debug][http] source/common/http/conn_manager_impl.cc:521] [C463][S12543431219240717937] request headers complete (end_stream=true):
':authority', 'customer1.demo1.acme.us'
':path', '/api/config/namespaces/test/routes'
':method', 'GET'
'user-agent', 'Go-http-client/1.1'
'cookie', 'X-ACME-GW-AUTH=eyJpc3N1ZWxxxxxxxx948b94'
'accept-encoding', 'gzip'
'connection', 'close'
In the Docker log this becomes the following, to be parsed by the Fluent Bit in_tail plugin (this example differs from the one above):
{"log":"[2019-02-15 11:00:08.688][9][debug][router] source/common/router/router.cc:303] [C0][S14319188767040639561] router decoding headers:\n","stream":"stderr","time":"2019-02-15T11:00:08.688733409Z"}
{"log":"':method', 'POST'\n","stream":"stderr","time":"2019-02-15T11:00:08.688736209Z"}
{"log":"':path', '/envoy.api.v2.ClusterDiscoveryService/StreamClusters'\n","stream":"stderr","time":"2019-02-15T11:00:08.688757909Z"}
{"log":"':authority', 'xds_cluster'\n","stream":"stderr","time":"2019-02-15T11:00:08.688760809Z"}
{"log":"':scheme', 'http'\n","stream":"stderr","time":"2019-02-15T11:00:08.688763609Z"}
{"log":"'te', 'trailers'\n","stream":"stderr","time":"2019-02-15T11:00:08.688766209Z"}
{"log":"'content-type', 'application/grpc'\n","stream":"stderr","time":"2019-02-15T11:00:08.688768809Z"}
{"log":"'x-envoy-internal', 'true'\n","stream":"stderr","time":"2019-02-15T11:00:08.688771609Z"}
{"log":"'x-forwarded-for', '192.168.6.6'\n","stream":"stderr","time":"2019-02-15T11:00:08.688774309Z"}
{"log":"\n","stream":"stderr","time":"2019-02-15T11:00:08.688777009Z"}
Docker_Mode On is documented to recombine split Docker log lines before passing them to any parser. I would expect it to apply to this case as well; however, it does not. My configuration is provided below.
Describe the solution you'd like
in_tail/docker_mode should be able to read Docker's json-log as a stream of the original text. The JSON parser here would act only as a pre-processor that buffers the "log" key, so multiline regex patterns can be applied later.
Describe alternatives you've considered
I believe this problem can be avoided if:
However:
Fluent Bit FILTERs are applied after parsing, so they cannot transform the stream early enough.
Additional context
The Fluent Bit config I am using:
input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/*.log
        Parser            docker
        DB                /var/log/flb_kube.db
        Skip_Long_Lines   Off
        Docker_Mode       On
        Refresh_Interval  10
        Chunk_Size        32k
        Buffer_Max_Size   2M
filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Merge_Log           On
        K8S-Logging.Parser  On

    [PARSER]
        Name         docker
        Format       json
        Time_Key     time
        Time_Format  %Y-%m-%dT%H:%M:%S.%L
        Time_Keep    On
        # Command         | Decoder      | Field | Optional Action
        # ================|==============|=======|================
        Decode_Field_As     escaped_utf8   log     do_next
        Decode_Field_As     escaped        log     do_next
        Decode_Field_As     json           log
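For reference, the JSON step of this parser can be mimicked in Python. Note that parsing the record already turns the escaped `\n` into a real newline; the Decode_Field_As chain above additionally covers log values that arrive doubly escaped or JSON-encoded (a rough sketch, not Fluent Bit internals):

```python
import json

# One of the Docker json-file records from the problem description.
raw = ('{"log":"[2019-02-15 11:00:08.688][9][debug][router] '
       'router decoding headers:\\n","stream":"stderr",'
       '"time":"2019-02-15T11:00:08.688733409Z"}')

record = json.loads(raw)   # Format json: one JSON object per line
log = record["log"]        # the "log" value now contains a real newline

print(repr(log))
```

The decoded `log` value ends with a real newline, which is exactly the per-line fragment that the multiline logic would need to buffer.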
Repository that can be used for testing: https://github.com/epcim/fluentbit-sandbox
Hey, I'm struggling with the same thing right now. Is there any additional planned feature or bug fix for this? Docker_Mode On is exactly what I want; my parsers can then extract fields. I'm struggling to find a solution for Spring Boot stack traces with Fluent Bit at all (using Multiline or Docker_Mode). Any update or feedback would be appreciated.
I'm struggling with this right now. Do you have any solution for multiline logs in k8s?
I am also stuck on the same issue; the multiline log parser is not working in Kubernetes.
I'm stuck with this as well; is there a set of input flags under which large (16 KB) input from Docker will work?
I'm also experiencing the same problem with not being able to parse multiline logs in Kubernetes cluster. I have tried solutions suggested in related threads in this repo but couldn't get it working.
My input config:
input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        Path              /var/log/containers/abc-*.log
        Parser            docker
        Parser_Firstline  multiline_parser_head
        Parser_1          multiline_parser_error
        Multiline         On
        DB                /var/log/flb_kube.db
        Mem_Buf_Limit     10MB
        Skip_Long_Lines   On
        Refresh_Interval  10
Parsers:
parsers.conf: |
    [PARSER]
        Name         json
        Format       json
        Time_Key     time
        Time_Format  %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name         docker
        Format       json
        Time_Key     time
        Time_Format  %Y-%m-%dT%H:%M:%S.%L
        Time_Keep    On

    [PARSER]
        Name    multiline_parser_head
        Format  regex
        Regex   /\d{4}-\d{1,2}-\d{1,2}/

    [PARSER]
        Name    multiline_parser_error
        Format  regex
        Regex   /(?<timestamp>[^ ]* [^ ]*) (?<level>[^\s]+:)(?<message>[\s\S]*)/
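As a sanity check on the multiline_parser_head pattern above, here is a quick Python probe (the sample lines are hypothetical; Fluent Bit uses Onig-style regexes, but this particular pattern behaves identically in Python):

```python
import re

# Parser_Firstline regex from the config above: a line containing a
# date such as 2019-02-15 starts a new logical message.
firstline = re.compile(r"\d{4}-\d{1,2}-\d{1,2}")

lines = [
    "2019-02-15 11:00:08 ERROR: something failed",  # head line
    "    at com.example.Foo.bar(Foo.java:42)",      # stack-trace continuation
    "Caused by: java.lang.NullPointerException",    # continuation
]
print([bool(firstline.search(l)) for l in lines])  # → [True, False, False]
```

Only the head line matches, so in principle the continuation lines should attach to it; the problem in this thread is that the Docker JSON wrapping is in the way before these regexes ever run.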
Same issue here. Any news on this?
@isurusiri Were you able to figure out a solution for this?
I have the same issue with multi-line JSON output in docker logs.
@isurusiri Based on my understanding of the documentation, the Parser directive is ignored in the tail input when Multiline is set to On; however, Parser_Firstline and Parser_N are not ignored.
Edit: Link to 1.3 documentation referenced above: https://docs.fluentbit.io/manual/v/1.3/input/tail#multiline
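Per that doc, here is the input from above with the ignored Parser directive dropped (an untested sketch; note it still leaves open the question of unwrapping the Docker JSON before the multiline regexes run, which is the core problem of this issue):

```
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/abc-*.log
    Multiline         On
    Parser_Firstline  multiline_parser_head
    Parser_1          multiline_parser_error
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     10MB
    Skip_Long_Lines   On
    Refresh_Interval  10
```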
Hi there, any update on that issue? Thank you
Is fluentd the only alternative to fix this issue?
No, I'm using Elastic Filebeat for this and it works like a charm.
Can you please share your solution to that? Thank you
I think @TomaszKlosinski is referring to using Elastic Filebeat as a log shipper instead of Fluent Bit.
Unfortunately, that only works if you're using the ELK stack, which is not much help to those of us using other products, e.g. Splunk.
@edsiper Are you able to look into the issue? The original author of the PR that added this feature in https://github.com/fluent/fluent-bit/pull/863 is no longer on GitHub. I took a look at plugins/in_tail/tail_dockermode to see if I could help, but the lack of code comments and the use of opaque abbreviations make it pretty inaccessible.
I prepared some changes to the dockermode plugin (#2043). I still need to bring it in line with the contributing guide, but I believe it's worth a look; feedback is welcome.
Output for the input from the issue:
[0] containers.var.log.containers.test.log: [1585073268.000318200, {"log"=>"{"log":"[2019-02-15 11:00:08.688][9][debug][router] source/common/router/router.cc:303] [C0][S14319188767040639561] router decoding headers:\n':method', 'POST'\n':path', '/envoy.api.v2.ClusterDiscoveryService/StreamClusters'\n':authority', 'xds_cluster'\n':scheme', 'http'\n'te', 'trailers'\n'content-type', 'application/grpc'\n'x-envoy-internal', 'true'\n'x-forwarded-for', '192.168.6.6'\n\n","stream":"stderr","time":"2019-02-15T11:00:08.688777009Z"}"}]
@sumo-drosiek Any updates on merging this?
@sumo-drosiek any updates?
@collardmsc @vishiy Sorry for the lack of updates. The PR has been reviewed; I'm working on the runtime tests and everything should be ready soon :)
@sumo-drosiek Thanks a bunch for your time and effort working on this!
PR is ready for another review!
:wave: Just wondering if anyone has any updates on this? It would really help me!
Looks like the issue was solved, brilliant. Which version number will support the new Docker_Mode_Parser field?
@Oduig AFAIK the 1.5.0 supports Docker_Mode_Parser
This is not documented yet, is it?
At least I can not find anything about the Docker_Mode_Parser within the official docs here:
https://docs.fluentbit.io/manual/pipeline/inputs/tail#docker_mode
@davelosert That's right, I haven't documented it yet.
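Until it lands in the docs, here is a sketch of the intended usage based on #2043 (unverified; check the option names against your Fluent Bit >= 1.5.0). Docker_Mode_Parser names a parser whose regex identifies the first line of a multiline message, so Docker mode can concatenate the following `log` lines onto it; the `firstline` parser below is a hypothetical example for the Envoy format in this issue:

```
[INPUT]
    Name                tail
    Tag                 kube.*
    Path                /var/log/containers/*.log
    Parser              docker
    Docker_Mode         On
    Docker_Mode_Parser  firstline
    DB                  /var/log/flb_kube.db

[PARSER]
    Name    firstline
    Format  regex
    Regex   /^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/
```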
I'm also stuck with Fluent Bit and multiline logs in EKS. Has anyone found a solution or workaround for this? If so, I'd appreciate your comments.
Hey @shake76, did you find any solution yet? If yes, please share a sample config for the docker parser combined with the multiline parser.
@shake76 @ankit1mg Is something wrong with docker_mode_parser and EKS?
Hey folks, for the latest reports of logs not working, I wanted to check whether this might be an issue of CRI-format vs. Docker-format log parsing (https://docs.fluentbit.io/manual/installation/kubernetes#container-runtime-interface-cri-parser), or whether the ask is entirely about multiline + Docker mode.
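For context: CRI runtimes (containerd, CRI-O) write plain-text lines like `2019-02-15T11:00:08.688Z stderr F <message>` instead of Docker's JSON, so the docker parser never matches them. The CRI regex from the linked page can be probed in Python, with its `(?<name>)` groups converted to Python's `(?P<name>)` syntax (the sample line is fabricated):

```python
import re

# CRI log-line regex from the Fluent Bit Kubernetes installation docs,
# rewritten with Python-style named groups.
cri = re.compile(
    r"^(?P<time>[^ ]+) (?P<stream>stdout|stderr) "
    r"(?P<logtag>[^ ]*) (?P<message>.*)$"
)

line = "2019-02-15T11:00:08.688Z stderr F ':method', 'POST'"
m = cri.match(line)
print(m.group("stream"), m.group("logtag"), m.group("message"))
```

The `logtag` field (`F` for a full line, `P` for a partial one) is what a CRI-aware multiline stage would key on, analogous to the trailing `\n` in Docker's JSON records.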