Fluent-bit: SIGSEGV in record modifier filter plugin w/ large JSON with many keys.

Created on 25 Sep 2020  路  1Comment  路  Source: fluent/fluent-bit

Bug Report

Describe the bug
We see fluent-bit crashing when we feed in large JSON object with many top level keys. We think the problem here https://github.com/fluent/fluent-bit/blob/master/plugins/filter_record_modifier/filter_modifier.c#L264.
CRASH:

[2020/09/24 19:20:03] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2020/09/24 19:20:03] [ info] [sp] stream processor started
[engine] caught signal (SIGSEGV)
#0  0x55ef62185b48      in  cb_modifier_filter() at plugins/filter_record_modifier/filter_modifier.c:306
#1  0xffffffffffffffff  in  ???() at ???:0

The SIGSEGV happens when we run it on cluster in our kubernetes environment but we are having issues reproducing it locally when we run the fluent-bit container, the behavior is that it's skipping the problematic log lines when we run it locally.

Localhost output with debugging (not reading in large json logs):

tseng@jjtseng-macOS:fluent-bit>./run-local.sh 
Fluent Bit v1.5.3
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2020/09/25 18:08:30] [ info] Configuration:
[2020/09/25 18:08:30] [ info]  flush time     | 1.000000 seconds
[2020/09/25 18:08:30] [ info]  grace          | 5 seconds
[2020/09/25 18:08:30] [ info]  daemon         | 0
[2020/09/25 18:08:30] [ info] ___________
[2020/09/25 18:08:30] [ info]  inputs:
[2020/09/25 18:08:30] [ info]      tail
[2020/09/25 18:08:30] [ info] ___________
[2020/09/25 18:08:30] [ info]  filters:
[2020/09/25 18:08:30] [ info]      grep.0
[2020/09/25 18:08:30] [ info]      lua.1
[2020/09/25 18:08:30] [ info]      modify.2
[2020/09/25 18:08:30] [ info]      modify.3
[2020/09/25 18:08:30] [ info]      parser.4
[2020/09/25 18:08:30] [ info]      modify.5
[2020/09/25 18:08:30] [ info]      nest.6
[2020/09/25 18:08:30] [ info]      record_modifier.7
[2020/09/25 18:08:30] [ info] ___________
[2020/09/25 18:08:30] [ info]  outputs:
[2020/09/25 18:08:30] [ info]      stdout.0
[2020/09/25 18:08:30] [ info] ___________
[2020/09/25 18:08:30] [ info]  collectors:
[2020/09/25 18:08:30] [ info] [engine] started (pid=1)
[2020/09/25 18:08:30] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2020/09/25 18:08:30] [debug] [storage] [cio stream] new stream registered: tail.0
[2020/09/25 18:08:30] [ info] [storage] version=1.0.5, initializing...
[2020/09/25 18:08:30] [ info] [storage] in-memory
[2020/09/25 18:08:30] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] inotify watch fd=20
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] scanning path /logging-volume/*.log
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] inode=12961276487 appended as /logging-volume/single-largejson.log
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] scan_glob add(): /logging-volume/single-largejson.log, inode 12961276487
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] 1 new files found on path '/logging-volume/*.log'
[2020/09/25 18:08:30] [debug] [filter:modify:modify.2] Initialized modify filter with 0 conditions and 1 rules
[2020/09/25 18:08:30] [debug] [filter:modify:modify.3] Initialized modify filter with 0 conditions and 1 rules
[2020/09/25 18:08:30] [debug] [filter:modify:modify.5] Initialized modify filter with 0 conditions and 9 rules
[2020/09/25 18:08:30] [debug] [router] match rule tail.0:stdout.0
[2020/09/25 18:08:30] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2020/09/25 18:08:30] [ info] [sp] stream processor started
[2020/09/25 18:08:30] [debug] [filter:modify:modify.2] Input map size 2 elements, output map size 2 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.2] Input map size 2 elements, output map size 2 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.2] Input map size 2 elements, output map size 2 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.2] Input map size 2 elements, output map size 2 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.5] Input map size 2 elements, output map size 11 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.5] Input map size 2 elements, output map size 11 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.5] Input map size 2 elements, output map size 11 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.5] Input map size 2 elements, output map size 11 elements
[2020/09/25 18:08:30] [debug] [filter:nest:nest.6] outer map size is 11, will be 5, nested map size will be 7
[2020/09/25 18:08:30] [debug] [filter:nest:nest.6] outer map size is 11, will be 5, nested map size will be 7
[2020/09/25 18:08:30] [debug] [filter:nest:nest.6] outer map size is 11, will be 5, nested map size will be 7
[2020/09/25 18:08:30] [debug] [filter:nest:nest.6] outer map size is 11, will be 5, nested map size will be 7
[2020/09/25 18:08:30] [debug] [filter:modify:modify.2] Input map size 2 elements, output map size 2 elements
[2020/09/25 18:08:30] [debug] [filter:modify:modify.5] Input map size 2 elements, output map size 11 elements
[2020/09/25 18:08:30] [debug] [filter:nest:nest.6] outer map size is 11, will be 5, nested map size will be 7
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] inode=12961276487 file=/logging-volume/single-largejson.log promote to TAIL_EVENT
[2020/09/25 18:08:30] [ info] inotify_fs_add(): inode=12961276487 watch_fd=1 name=/logging-volume/single-largejson.log
[2020/09/25 18:08:30] [debug] [task] created task=0x7fb975e302a0 id=0 OK
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] scanning path /logging-volume/*.log
[0] in_tail: [1601057310.114859819, {"event"=>"Start SigSegV Repro Case on localhost...", "index"=>"ethos-sandbox-ue1", "host"=>"tseng-local", "sourcetype"=>"ethos-sigsegv", "fields"=>{"file_path"=>"/logging-volume/single-largejson.log", "namespace"=>"ns-personal-tseng", "node_ip"=>"0.0.0.0", "node_name"=>"localhost", "pod_ip"=>"0.0.0.0", "pod_name"=>"ethos-tseng-test", "pod_uid"=>"ethos-sigsegv-localtest"}}]
[1] in_tail: [1601057310.115355491, {"event"=>"test log between large logs...", "index"=>"ethos-sandbox-ue1", "host"=>"tseng-local", "sourcetype"=>"ethos-sigsegv", "fields"=>{"file_path"=>"/logging-volume/single-largejson.log", "namespace"=>"ns-personal-tseng", "node_ip"=>"0.0.0.0", "node_name"=>"localhost", "pod_ip"=>"0.0.0.0", "pod_name"=>"ethos-tseng-test", "pod_uid"=>"ethos-sigsegv-localtest"}}]
[2] in_tail: [1601057310.116012573, {"event"=>"test json log between large logs...", "index"=>"ethos-sandbox-ue1", "host"=>"tseng-local", "sourcetype"=>"ethos-sigsegv", "fields"=>{"file_path"=>"/logging-volume/single-largejson.log", "namespace"=>"ns-personal-tseng", "node_ip"=>"0.0.0.0", "node_name"=>"localhost", "pod_ip"=>"0.0.0.0", "pod_name"=>"ethos-tseng-test", "pod_uid"=>"ethos-sigsegv-localtest"}}]
[3] in_tail: [1601057310.116373777, {"event"=>"test log between large logs...", "index"=>"ethos-sandbox-ue1", "host"=>"tseng-local", "sourcetype"=>"ethos-sigsegv", "fields"=>{"file_path"=>"/logging-volume/single-largejson.log", "namespace"=>"ns-personal-tseng", "node_ip"=>"0.0.0.0", "node_name"=>"localhost", "pod_ip"=>"0.0.0.0", "pod_name"=>"ethos-tseng-test", "pod_uid"=>"ethos-sigsegv-localtest"}}]
[4] in_tail: [1601057310.121352195, {"event"=>"Stop SigSegv localhost testing...", "index"=>"ethos-sandbox-ue1", "host"=>"tseng-local", "sourcetype"=>"ethos-sigsegv", "fields"=>{"file_path"=>"/logging-volume/single-largejson.log", "namespace"=>"ns-personal-tseng", "node_ip"=>"0.0.0.0", "node_name"=>"localhost", "pod_ip"=>"0.0.0.0", "pod_name"=>"ethos-tseng-test", "pod_uid"=>"ethos-sigsegv-localtest"}}]
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] scan_blog add(): dismissed: /logging-volume/single-largejson.log, inode 12961276487
[2020/09/25 18:08:30] [debug] [input:tail:tail.0] 0 new files found on path '/logging-volume/*.log'

To Reproduce
Feed a large JSON object with > 128 Keys. log example:
log.zip

{"key1`":"value1","key2":"value2",..."key129":"value129"}
  • Steps to reproduce the problem:
    K8s crash case:
    Apply manifest to kubernetes cluster see SIGSEGV crash.
kubectl -n <namespace> apply -f ./fluent-bit-crashloop-largejson.yaml

k8s-manifest.zip
localhost skipping json logs case:
Run local and add stdout plugin right after tail a see that is skipping the json logs.
log.zip
local-conf.zip

docker run --memory=512m -v /test-conf/:/fluent-bit/etc/ -v /log/:/logging-volume/ -v /db/:/var/fluent-bit/ fluent/fluent-bit:1.5.3-debug

Expected behavior
No SIGSEGV crash.

When running localhost reproduction should not skip JSON logs.

Screenshots

Your Environment

Version used: 1.4.6 & 1.5.3
Configuration:
Environment name and version Kubernetes 1.17 for crashing behavior.
Server type and version: AWS and Azure
Operating System and version: flatcar linux

  • Filters and plugins: record modifier.
troubleshooting

>All comments

I am troubleshooting this

Was this page helpful?
0 / 5 - 0 ratings