I have noticed tail plugin does not obey the state of the file when a parser is defined. Plugin will always start at the head of the file unless parser is blanked out. This issue can also be replicated using command line:
fluent-bit --parser=parsers.conf -i tail -p path=/var/log/syslog -p db=/var/log/xyz.sqlite -p parser=syslog-custom -o stdout
This command will work as expected if parser parameter is taken out.
This is my sosreport:
[Fluent Bit]
Edition Community Edition
Version 0.12.12
Built Flags JSMN_PARENT_LINKS JSMN_STRICT FLB_HAVE_TLS FLB_HAVE_SQLDB FLB_HAVE_BUFFERING FLB_HAVE_FLUSH_LIBCO FLB_HAVE_FORK FLB_HAVE_PROXY_GO FLB_HAVE_REGEX FLB_HAVE_C_TLS FLB_HAVE_SETJMP FLB_HAVE_ACCEPT4 FLB_HAVE_INOTIFY
[Operating System]
Name Linux
Release 4.1.15
Version #3 SMP PREEMPT Fri Jul 7 11:18:13 PDT 2017
[Hardware]
Architecture armv7l
Processors 1
[Built Plugins]
Inputs cpu mem kmsg tail proc disk netif dummy head health serial stdin tcp mqtt lib forward random syslog
Filters grep stdout kubernetes parser record_modifier
Outputs counter es exit file forward http influxdb kafka-rest nats null plot stdout retry td lib flowcounter
[SERVER] Runtime configuration
Flush 5
Daemon Off
Log_Level Debug
[INPUT] Instance
Name tail.0 (tail, id=0)
Flags DYN_TAG
Threaded No
Tag tail.0
Path /var/log/syslog
db /var/log/landing_files.sqlite
Parser syslog-custom
[OUTPUT] Instance
Name stdout.0 (stdout, mask_id=1)
Match *
TLS Active No
Retry Limit 1
I am hoping I have done something wrong on my end. Thanks!
@2x-bullet-time thanks for raising this issue.
I did some local tests and the problem that I found in in_tail plugin, is that if a parser fails against a log line (bad regex), that content is missed, the right behavior is to keep that information in "raw text" mode. I placed a fix for that behavior in GIT master and 0.12 branches.
Thank you for your work. I will give it a try!
@edsiper
I pulled down the latest version of master branch and recompiled it locally. When executing this command "fluent-bit -i tail -p path=/var/log/syslog -p db=/var/log/landing_files.db -o stdout", the tailing function works fine. However, when I run the following configuration file without a parser, in_tail will always read from beginning of file. This is my simple configuration file:
[SERVICE]
Flush 5
Parsers_File parsers.conf
Log_Level debug
[INPUT]
Name tail
Path /var/log/syslog
##DB /var/log/landing_files.db
##Parser syslog-custom
##[FILTER]
##Name grep
##Match *
##Regex process_name pppd
##[FILTER]
##Name record_modifier
##Match *
##Record deviceURN 1234
##Record deviceName Testing-Unit
[OUTPUT]
Name stdout
Match *
This is the sosreport associated with the above configuration file:
[Fluent Bit]
Edition Community Edition
Version 0.13.0
Built Flags JSMN_PARENT_LINKS JSMN_STRICT FLB_HAVE_TLS FLB_HAVE_SQLDB FLB_HAVE_BUFFERING FLB_HAVE_FLUSH_LIBCO FLB_HAVE_FORK FLB_HAVE_PROXY_GO FLB_HAVE_REGEX FLB_HAVE_C_TLS FLB_HAVE_ACCEPT4 FLB_HAVE_INOTIFY
[Operating System]
Name Linux
Release 4.1.15
Version #3 SMP PREEMPT Fri Jul 7 11:18:13 PDT 2017
[Hardware]
Architecture armv7l
Processors 1
[Built Plugins]
Inputs cpu mem kmsg tail proc disk netif dummy head exec health serial stdin tcp mqtt lib forward random syslog
Filters grep stdout kubernetes parser record_modifier
Outputs counter es exit file forward http influxdb kafka-rest nats null plot stdout retry td lib flowcounter
[SERVER] Runtime configuration
Flush 5
Daemon Off
Log_Level Debug
[INPUT] Instance
Name tail.0 (tail, id=0)
Flags DYN_TAG
Threaded No
Tag tail.0
Path /var/log/syslog
[OUTPUT] Instance
Name stdout.0 (stdout, mask_id=1)
Match *
TLS Active No
Retry Limit 1
This is the sosreport associated with the above command:
[Fluent Bit]
Edition Community Edition
Version 0.13.0
Built Flags JSMN_PARENT_LINKS JSMN_STRICT FLB_HAVE_TLS FLB_HAVE_SQLDB FLB_HAVE_BUFFERING FLB_HAVE_FLUSH_LIBCO FLB_HAVE_FORK FLB_HAVE_PROXY_GO FLB_HAVE_REGEX FLB_HAVE_C_TLS FLB_HAVE_ACCEPT4 FLB_HAVE_INOTIFY
[Operating System]
Name Linux
Release 4.1.15
Version #3 SMP PREEMPT Fri Jul 7 11:18:13 PDT 2017
[Hardware]
Architecture armv7l
Processors 1
[Built Plugins]
Inputs cpu mem kmsg tail proc disk netif dummy head exec health serial stdin tcp mqtt lib forward random syslog
Filters grep stdout kubernetes parser record_modifier
Outputs counter es exit file forward http influxdb kafka-rest nats null plot stdout retry td lib flowcounter
[SERVER] Runtime configuration
Flush 5
Daemon Off
Log_Level Info
[INPUT] Instance
Name tail.0 (tail, id=0)
Flags DYN_TAG
Threaded No
Tag tail.0
path /var/log/syslog
db /var/log/landing_files.db
Routes stdout.0
[OUTPUT] Instance
Name stdout.0 (stdout, mask_id=1)
Match *
TLS Active No
Retry Limit 1
in_tail plugin does not keep track of the state of the file when parser is used, either in command or config-file mode.
This is the sosreport associated with a configuration file with parser enabled:
[Fluent Bit]
Edition Community Edition
Version 0.13.0
Built Flags JSMN_PARENT_LINKS JSMN_STRICT FLB_HAVE_TLS FLB_HAVE_SQLDB FLB_HAVE_BUFFERING FLB_HAVE_FLUSH_LIBCO FLB_HAVE_FORK FLB_HAVE_PROXY_GO FLB_HAVE_REGEX FLB_HAVE_C_TLS FLB_HAVE_ACCEPT4 FLB_HAVE_INOTIFY
[Operating System]
Name Linux
Release 4.1.15
Version #3 SMP PREEMPT Fri Jul 7 11:18:13 PDT 2017
[Hardware]
Architecture armv7l
Processors 1
[Built Plugins]
Inputs cpu mem kmsg tail proc disk netif dummy head exec health serial stdin tcp mqtt lib forward random syslog
Filters grep stdout kubernetes parser record_modifier
Outputs counter es exit file forward http influxdb kafka-rest nats null plot stdout retry td lib flowcounter
[SERVER] Runtime configuration
Flush 5
Daemon Off
Log_Level Info
[INPUT] Instance
Name tail.0 (tail, id=0)
Flags DYN_TAG
Threaded No
Tag tail.0
Path /var/log/syslog
db /var/log/landing_files.db
Parser syslog-custom
[OUTPUT] Instance
Name stdout.0 (stdout, mask_id=1)
Match *
TLS Active No
Retry Limit 1
Thanks!
@edsiper
When executing command "fluent-bit -i tail -p path=/var/log/syslog -p db=/var/log/landing_files.db -o stdout --verbose", in_tail plugin is able to pick up the offset value according to this log:
[2018/02/05 23:51:06] [ info] [engine] started
[2018/02/05 23:51:06] [debug] [in_tail] inotify watch fd=20
[2018/02/05 23:51:06] [debug] [in_tail] scanning path /var/log/syslog
[2018/02/05 23:51:06] [debug] [in_tail] add to scan queue /var/log/syslog, offset=3594
[2018/02/05 23:51:06] [debug] [router] default match rule tail.0:stdout.0
[2018/02/05 23:51:06] [debug] [in_tail] file=/var/log/syslog promote to TAIL_EVENT
However, it fails to detect it in these scenario(config file without parser, config file with parser, command with parser) according to this log:
[2018/02/05 23:54:51] [ info] [engine] started
[2018/02/05 23:54:51] [debug] [in_tail] inotify watch fd=19
[2018/02/05 23:54:51] [debug] [in_tail] scanning path /var/log/syslog
[2018/02/05 23:54:51] [debug] [in_tail] add to scan queue /var/log/syslog, offset=0
[2018/02/05 23:54:51] [debug] [router] input=tail.0 'DYNAMIC TAG'
[2018/02/05 23:54:51] [debug] [input tail.0] [mem buf] size = 4291
[2018/02/05 23:54:51] [debug] [in_tail] file=/var/log/syslog read=3594 lines=41
[2018/02/05 23:54:51] [debug] [in_tail] file=/var/log/syslog promote to TAIL_EVENT
Thanks!
I cannot reproduce the problem you have stated.
This is my test log file (apache.log):
213.217.69.115 - - [06/feb/2018:11:23:01 -0600] "GET /churchy HTTP/1.0" 204 2216
193.56.73.36 - - [06/feb/2018:11:23:01 -0600] "GET /signable HTTP/1.0" 500 2216
18.240.253.3 - - [06/feb/2018:11:23:01 -0600] "GET /beclamour HTTP/1.0" 500 2216
205.219.243.165 - - [06/feb/2018:11:23:01 -0600] "GET /mowe HTTP/1.0" 200 2216
183.43.42.232 - - [06/feb/2018:11:23:01 -0600] "GET /Deimos HTTP/1.0" 200 2216
30.30.47.28 - - [06/feb/2018:11:23:01 -0600] "GET /basify HTTP/1.0" 204 2216
106.165.252.167 - - [06/feb/2018:11:23:01 -0600] "GET /triphase HTTP/1.0" 400 2216
204.168.117.164 - - [06/feb/2018:11:23:01 -0600] "GET /self-contentedly HTTP/1.0" 200 2216
251.244.238.241 - - [06/feb/2018:11:23:01 -0600] "GET /overblessedness HTTP/1.0" 400 2216
119.96.224.125 - - [06/feb/2018:11:23:01 -0600] "GET /senhora HTTP/1.0" 404 2216
$ fluent-bit -R parsers.conf -i tail -p path=apache.log -p parser=apache -p db=test.db -o stdout -f 1
Fluent-Bit v0.13.0
Copyright (C) Treasure Data
[2018/02/06 11:27:11] [ info] [engine] started
[0] tail.0: [1517937781.000000000, {"host"=>"213.217.69.115", "user"=>"-", "method"=>"GET", "path"=>"/churchy", "code"=>"204", "size"=>"2216", "referer"=>"", "agent"=>""}]
[1] tail.0: [1517937781.000000000, {"host"=>"193.56.73.36", "user"=>"-", "method"=>"GET", "path"=>"/signable", "code"=>"500", "size"=>"2216", "referer"=>"", "agent"=>""}]
[2] tail.0: [1517937781.000000000, {"host"=>"18.240.253.3", "user"=>"-", "method"=>"GET", "path"=>"/beclamour", "code"=>"500", "size"=>"2216", "referer"=>"", "agent"=>""}]
[3] tail.0: [1517937781.000000000, {"host"=>"205.219.243.165", "user"=>"-", "method"=>"GET", "path"=>"/mowe", "code"=>"200", "size"=>"2216", "referer"=>"", "agent"=>""}]
[4] tail.0: [1517937781.000000000, {"host"=>"183.43.42.232", "user"=>"-", "method"=>"GET", "path"=>"/Deimos", "code"=>"200", "size"=>"2216", "referer"=>"", "agent"=>""}]
[5] tail.0: [1517937781.000000000, {"host"=>"30.30.47.28", "user"=>"-", "method"=>"GET", "path"=>"/basify", "code"=>"204", "size"=>"2216", "referer"=>"", "agent"=>""}]
[6] tail.0: [1517937781.000000000, {"host"=>"106.165.252.167", "user"=>"-", "method"=>"GET", "path"=>"/triphase", "code"=>"400", "size"=>"2216", "referer"=>"", "agent"=>""}]
[7] tail.0: [1517937781.000000000, {"host"=>"204.168.117.164", "user"=>"-", "method"=>"GET", "path"=>"/self-contentedly", "code"=>"200", "size"=>"2216", "referer"=>"", "agent"=>""}]
[8] tail.0: [1517937781.000000000, {"host"=>"251.244.238.241", "user"=>"-", "method"=>"GET", "path"=>"/overblessedness", "code"=>"400", "size"=>"2216", "referer"=>"", "agent"=>""}]
[9] tail.0: [1517937781.000000000, {"host"=>"119.96.224.125", "user"=>"-", "method"=>"GET", "path"=>"/senhora", "code"=>"404", "size"=>"2216", "referer"=>"", "agent"=>""}]
$ fluent-bit -R parsers.conf -i tail -p path=apache.log -p db=test.db -o stdout -f 1
Fluent-Bit v0.13.0
Copyright (C) Treasure Data
[2018/02/06 11:27:57] [ info] [engine] started
no data output, because database offset is being used.
$ fluent-bit -R parsers.conf -i tail -p path=apache.log -p parser=apache -o stdout -f 1
Fluent-Bit v0.13.0
Copyright (C) Treasure Data
[2018/02/06 11:29:05] [ info] [engine] started
[0] tail.0: [1517937781.000000000, {"host"=>"213.217.69.115", "user"=>"-", "method"=>"GET", "path"=>"/churchy", "code"=>"204", "size"=>"2216", "referer"=>"", "agent"=>""}]
[1] tail.0: [1517937781.000000000, {"host"=>"193.56.73.36", "user"=>"-", "method"=>"GET", "path"=>"/signable", "code"=>"500", "size"=>"2216", "referer"=>"", "agent"=>""}]
[2] tail.0: [1517937781.000000000, {"host"=>"18.240.253.3", "user"=>"-", "method"=>"GET", "path"=>"/beclamour", "code"=>"500", "size"=>"2216", "referer"=>"", "agent"=>""}]
[3] tail.0: [1517937781.000000000, {"host"=>"205.219.243.165", "user"=>"-", "method"=>"GET", "path"=>"/mowe", "code"=>"200", "size"=>"2216", "referer"=>"", "agent"=>""}]
[4] tail.0: [1517937781.000000000, {"host"=>"183.43.42.232", "user"=>"-", "method"=>"GET", "path"=>"/Deimos", "code"=>"200", "size"=>"2216", "referer"=>"", "agent"=>""}]
[5] tail.0: [1517937781.000000000, {"host"=>"30.30.47.28", "user"=>"-", "method"=>"GET", "path"=>"/basify", "code"=>"204", "size"=>"2216", "referer"=>"", "agent"=>""}]
[6] tail.0: [1517937781.000000000, {"host"=>"106.165.252.167", "user"=>"-", "method"=>"GET", "path"=>"/triphase", "code"=>"400", "size"=>"2216", "referer"=>"", "agent"=>""}]
[7] tail.0: [1517937781.000000000, {"host"=>"204.168.117.164", "user"=>"-", "method"=>"GET", "path"=>"/self-contentedly", "code"=>"200", "size"=>"2216", "referer"=>"", "agent"=>""}]
[8] tail.0: [1517937781.000000000, {"host"=>"251.244.238.241", "user"=>"-", "method"=>"GET", "path"=>"/overblessedness", "code"=>"400", "size"=>"2216", "referer"=>"", "agent"=>""}]
[9] tail.0: [1517937781.000000000, {"host"=>"119.96.224.125", "user"=>"-", "method"=>"GET", "path"=>"/senhora", "code"=>"404", "size"=>"2216", "referer"=>"", "agent"=>""}]
I don't see any issue.
I gave it another try using your apache log and here is the result:
I can now isolate parser from contributing to the problem. My regex was missing an end of line. Opps :/
in_tail plugin behaves properly when it is invoked via command line. However, it does not utilize offset when it is executed using a configuration file.
Here is the configuration file (db + parser):
[SERVICE]
Flush 1
Parsers_File parsers.conf
Log_Level debug
Daemon Off
[INPUT]
Name tail
path /var/log/apache.log
DB /var/log/landing_files.db
Parser apache
[OUTPUT]
Name stdout
Match *
The same behavior is observed when parser is commented out.
An identical behavior also occurs when plug-in is tailing syslog + custom syslog parser + db.
This might be a system/conf specific problem. I used the same set up on my ubuntu VM and it was working just fine in either mode. I will close this issue for now. Thank you.
Hello :)
Could it be the problem of running fluent-bit on an armv7l? Fluent-bit PPA didn't seem to supply a build for armv7l. I had to compile fluent-bit source code using cmake on the armv7l machine. On the other hand, I was able to get an amd64 td-agent-bit build on my ubuntu VM from PPA.
@2x-bullet-time once compiled in your armv7l, do you still face the issue ?
@edsiper
opps. Sorry for not being clear. I had to compile fluent-bit source code on the armv7l machine because the machine could not fetch an armv7l td-agent-bit build from PPA. I also had to compile cmake because fluent-bit required cmake to compile. I have been facing this issue since then on the armv7l machine. I went through the same compilation steps on an Ubuntu VM running on windows 10 and in_tail plugin worked properly. The plugin also worked in the amd64 td-agent-bit build on the Ubuntu VM. I have also tried compiling fluent-bit on the armv7l machine with FLB_ALL flag to no avail. Could it be that certain differences might need to be accounted for when compiling fluent-bit for an armv7l architecture? Please excuse my limited knowledge working with low level machinery.
@2x-bullet-time ok, so there are two concerns:
I will focus on #2. would you please specify the kernel version and linux distribution that device/machine uses ?
@edsiper armv7l machine works properly if parameters are inserted as a part of the command as opposed to a config file. Is tail still an issue?
The kernel+dist come with a custom board. The kernel is a custom variation of 4.1.15 and distro is Debian GNU/Linux 8.10 (jessie). I think I am out of luck here.
@2x-bullet-time it's preferred to set all parameters either in the config file or in the command line only, avoid to mix them up (only use config file).
@edsiper Yea. I tested it with all parameters in config file vs. in command.
@2x-bullet-time to clarify: when you are using the configuration file, do you see an unexpected behavior ?
@edsiper Yes. The unexpected behavior is observed when a configuration file is used. This behavior is not observed on the ubuntu vm with the exact same configuration set up.
please paste "command line arguments" that works and the configuration file content..
command line arguments:
fluent-bit -R parsers.conf -i tail -p path=/var/log/apache.log -p parser=apache -p db=/var/log/landing_files.db -o stdout -f 1
configuration file content:
[SERVICE]
Flush 1
Parsers_File parsers.conf
Log_Level debug
Daemon Off
[INPUT]
Name tail
Path /var/log/apache.log
DB /var/log/landing_files.db
Parser apache
[OUTPUT]
Name stdout
Match *
@2x-bullet-time do you know if the problem can be reproduced in a Raspberry Pi or similar ?
@edsiper Hello. I am able to reproduce this problem on a bbb. Here is the sosreport from the exact same setup:
[Fluent Bit]
Edition Community Edition
Version 0.13.0
Built Flags JSMN_PARENT_LINKS JSMN_STRICT FLB_HAVE_TLS FLB_HAVE_SQLDB FLB_HAVE_FLUSH_LIBCO FLB_HAVE_FORK FLB_HAVE_PROXY_GO FLB_HAVE_REGEX FLB_HAVE_C_TLS FLB_HAVE_ACCEPT4 FLB_HAVE_INOTIFY
[Operating System]
Name Linux
Release 4.9.45-ti-r57
Version #1 SMP PREEMPT Fri Aug 25 22:58:38 UTC 2017
[Hardware]
Architecture armv7l
Processors 1
[Built Plugins]
Inputs cpu mem kmsg tail proc disk netif dummy head exec health serial stdin tcp mqtt lib forward random syslog
Filters grep stdout kubernetes parser record_modifier
Outputs counter es exit file forward http influxdb kafka-rest nats null plot stdout td lib flowcounter
[SERVER] Runtime configuration
Flush 1
Daemon Off
Log_Level Debug
[INPUT] Instance
Name tail.0 (tail, id=0)
Flags DYN_TAG
Threaded No
Tag tail.0
Path /var/log/apache.log
DB /var/log/landing_files.db
Parser apache
[OUTPUT] Instance
Name stdout.0 (stdout, mask_id=1)
Match *
TLS Active No
Retry Limit 1
bbb == beaglebone black ?, if so, which OS version (image URL?)
Yea. It is a beaglebone black. /etc/dogtag gives:
BeagleBoard.org Debian Image 2017-08-31
I checked BeagleBoard.org and this is the only image corresponds to that date:
http://debian.beagleboard.org/images/bone-debian-9.1-lxqt-armhf-2017-08-31-4gb.img.xz
@2x-bullet-time
I've tried to replicate without success the issue you just described in a beagle bone black board using the debian image you have provided. The database file offsets are set properly (except when Fluent Bit don't have write access to /var/log/... but a message is triggered)
would you please check if you can reproduce the problem with Fluent Bit 0.14.1 version ?
Closing due to timeout.