[[inputs.logparser]]
## files to tail.
files = ["/u0/log/jetty/*.request.log"]
## Read file from beginning.
from_beginning = false
Telegraf 1.0.0
Ubuntu 14.04.4
Jetty
Telegraf logparser picks up new file on the same pattern defined in the telegraf.conf
Telegraf does not pick up new rotated file automatically
Here's an fuser output of the log directory. The 7692 pid is Telegraf. It was started on 9/28.
/u0/log/jetty/2016_09_23.request.log: 7692
/u0/log/jetty/2016_09_24.request.log: 7692
/u0/log/jetty/2016_09_25.request.log: 7692
/u0/log/jetty/2016_09_26.request.log: 7692
/u0/log/jetty/2016_09_27.request.log: 1081 7692
/u0/log/jetty/2016_09_28.request.log: 1081 7692
/u0/log/jetty/2016_09_29.request.log: 1081 8769
The 2016_09_29.request.log was created by the 8769 process with is jetty. notice that 7692 does not hold a handle of that file.
Interestingly the pid 1081 belongs to logstash-forwarder, which is able to pick up the newly created file automatically.
+1 on this one.
It's working as intended. Jetty should be configured so that it always writes to the same log file and logrotate should be used to do the rotation.
@dgnorton Unfortunately this is not something that is out of the box standard to jetty.
Please refer to here:
http://wiki.eclipse.org/Jetty/Howto/Configure_Request_Logs
Either way, logparser should detect new files. There is not only 1 way to accomplish 1 task.
@akan1979 what happens if retainDays is set to 2 billion (> 5M years) and yyyy_mm_dd is removed from the file name?
This seems to be a problem when a file is (re)moved as well, e.g. given this config:
[[inputs.logparser]]
files = ["/tmp/telegraf_input.log"]
from_beginning=false
[inputs.logparser.grok]
patterns=["%{INT:measurement}"]
I see this behavior:
echo 1 > /tmp/telegraf_input.log
# start telegraf
echo 2 >> /tmp/telegraf_input.log
# telegraf picks up 2
rm /tmp/telegraf_input.log && touch /tmp/telegraf_input.log
# telegraf 1.0: nothing reported, telegraf 1.0.1 reports
# "Waiting for /tmp/telegraf_input.log to appear..."
echo 3 >> /tmp/telegraf_input.log
# Nothing happens :(
On the plus side, truncation detection seems to work, e.g.
echo 1 > /tmp/telegraf_input.log
echo 2 > /tmp/telegraf_input.log
# produce
# Re-opening truncated file /tmp/telegraf_input.log ...
# Successfully reopened truncated /tmp/telegraf_input.log
... but I suspect more than a few logging tools (e.g. logback) don't support rolling by truncation.
Internally telegraf has to spinup a thread for each file that it needs to tail. Currently it only does that once at startup, which is why deleted & created files are not handled.
I think it's worthwhile to handle these two cases in the tailer & logparser so I'm going to reopen this issue and put it into the 1.3 milestone.
Sounds good. FWIW, it looks like this lib solves addresses these problems. I haven't dug into the source yet, but it has a lot of 猸愶笍s so must be good :)
@njhartwell what lib are you talking about?
Sorry, forgot to paste it https://github.com/hpcloud/tail
we are already using that library, it doesn't solve all of this
Is this still slated to go into the next release of Telegraf? (1.3)
Other than restarting the Telegraf service after each log rotation, are there any other workarounds you can suggest for this?
We are trying to replace Logstash (which handles this case) with Telegraf, we can not proceed until this issue is resolved.
yes, still planned
nope, there is no other known workaround
@mottati I believe https://github.com/influxdata/telegraf/pull/2141 should solve this. Are you having problems with newly created files or only logrotated files?
Our particular use case is we have Telegraf pointing to a test machine where we periodically reinstall our software with newer versions. The deployment is fairly simplistic, we shutdown our servers, rename the old deployment directory to backup.directory (which includes the log file we are monitoring), install the new software, and bring our services back up.
We can add a restart of the Telegraf service to this process until 1.3 is released.
Interestingly enough when I went to look into this this morning, I noticed that Telegraf was still holding on to a deleted file, our software had been reinstalled twice since I had last looked so the file was renamed into the backup directory in the first round, and deleted in the second.
sudo lsof -p 43888
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
telegraf 43888 telegraf cwd DIR 8,1 16384 2 /
telegraf 43888 telegraf rtd DIR 8,1 16384 2 /
telegraf 43888 telegraf txt REG 8,1 32548479 543967 /usr/bin/telegraf
telegraf 43888 telegraf mem REG 8,1 1921096 442382 /lib64/libc-2.12.so
telegraf 43888 telegraf mem REG 8,1 142640 442406 /lib64/libpthread-2.12.so
telegraf 43888 telegraf mem REG 8,1 154520 442375 /lib64/ld-2.12.so
telegraf 43888 telegraf 0r CHR 1,3 0t0 4160 /dev/null
telegraf 43888 telegraf 1w CHR 1,3 0t0 4160 /dev/null
telegraf 43888 telegraf 2w REG 253,0 43741 28721156 /var/log/telegraf/telegraf.log
telegraf 43888 telegraf 3r CHR 1,9 0t0 4165 /dev/urandom
telegraf 43888 telegraf 4r REG 253,0 183324583 63406516 /data/solo.backup/soloots/logs/tasks.log (deleted)
telegraf 43888 telegraf 5u REG 0,9 0 4158 [eventpoll]
telegraf 43888 telegraf 7r IPv4 2235795701 0t0 TCP localhost.localdomain:45245->localhost.localdomain:sun-as-jpda (ESTABLISHED)
telegraf 43888 telegraf 8r DIR 0,10 0 1 inotify
telegraf 43888 telegraf 9u REG 0,9 0 4158 [eventpoll]
telegraf 43888 telegraf 10r FIFO 0,8 0t0 2166411407 pipe
telegraf 43888 telegraf 11w FIFO 0,8 0t0 2166411407 pipe
telegraf 43888 telegraf 12u IPv4 2226014191 0t0 TCP localhost.localdomain:55572->localhost.localdomain:d-s-n (ESTABLISHED)
Note the deleted tasks.log file that Telegraf has open above.
Than could be another issue you may want to investigate.
:+1:
The tail plugin Readme explicitly states that it should behave like a tail -F meaning that :
it will follow the name of the given file, so that it will be compatible with log-rotated files, and that it will retry on inaccessible files.
but this is not the case as it keeps open the renamed/deleted file.
So it looks more a bug than an enhancement.
this was closed by https://github.com/influxdata/telegraf/pull/2141, it will be available in 1.3
I tried testing this today using the same technique that was listed at the beginning of this issue. I used the latest nightly build of Telegraf. Here is what I found:
telegraf version:
telegraf --version
Telegraf vdev~n201703100819 (git: master 13f314a5076a47b208aeef5bdab12470837e26c1)
Configuration File
[agent]
flush_interval=2
[[inputs.logparser]]
files = ["/tmp/telegraf/input.log"]
from_beginning=true
[inputs.logparser.grok]
patterns=["%{INT:measurement}"]
[[outputs.file]]
files = ["stdout"]
Test Script
#!/bin/bash
# Start with 1 line of data in the input.log
echo 0 > input.log
rm nohup.out
nohup telegraf --config config --debug 2>/dev/null &
pid=$!
for i in {1..5}
do
echo "Writing line $i"
echo $i >> input.log
sleep 10
rm input.log
done
kill -9 $pid
cat nohup.out
Test Execution
$ ./test
Writing line 1
Writing line 2
Writing line 3
Writing line 4
Writing line 5
logparser_grok,host=node2087.svc.devpg.pdx.wd measurement="0" 1489182278368364318
logparser_grok,host=node2087.svc.devpg.pdx.wd measurement="1" 1489182278368407112
Conclusion
Assuming that this PR is actually in the build I used, it does not appear that this fix accomplishes what was requested e.g. discovery of newly created files with the same name of the original file.
We are currently using LogStash to parse our log that has this behavior. This is the last remaining issue we have that will allow us to use a pure TICK stack.
I can duplicate your report @mottati. It seems that recreated files are not detected, but truncated files are. There is also a bug where if the file is truncated with a file of the same size it is not detected.
I merged a fix for the problem into our tail fork.
@danielnelson Thanks! Do you know when this fix will make it into a released version of Telegraf?
The fix for this was in the 1.3.0 release, though I have heard that there are remaining issues. https://github.com/influxdata/telegraf/issues/2847
This still seems like an issue for [inputs.tail] for Telegraf 1.14.2. Once the file has been deleted and recreated with same name tail doesn't work anymore.
This still seems like an issue for [inputs.tail] for
Telegraf 1.14.2. Once the file has been deleted and recreated with same name tail doesn't work anymore.
Yes, got this bug at telegraf 1.17.0 with [inputs.tail] plugin on ubuntu 20.04
I run telegraf and tail -F --lines=0 /var/log/voodoo/voodoo_metric.log in separate window/
I wrote some loglines to my logfile, and both - telegraf and tail see it and telegraf parse it correctly.
Then, I remove /var/log/voodoo/voodoo_metric.log and touch it, and tail say:
tail: '/var/log/voodoo/voodoo_metric.log' has become inaccessible: No such file or directory
tail: '/var/log/voodoo/voodoo_metric.log' has appeared; following new file
server_iops{operation="data_read"} 976 1594660047713
but telegraf still dont see new logfile(
This still seems like an issue for [inputs.tail] for
Telegraf 1.14.2. Once the file has been deleted and recreated with same name tail doesn't work anymore.Yes, got this bug at telegraf 1.17.0 with [inputs.tail] plugin on ubuntu 20.04
I run telegraf andtail -F --lines=0 /var/log/voodoo/voodoo_metric.login separate window/
I wrote some loglines to my logfile, and both - telegraf and tail see it and telegraf parse it correctly.
Then, I remove /var/log/voodoo/voodoo_metric.log and touch it, and tail say:
tail: '/var/log/voodoo/voodoo_metric.log' has become inaccessible: No such file or directory tail: '/var/log/voodoo/voodoo_metric.log' has appeared; following new file server_iops{operation="data_read"} 976 1594660047713
but telegraf still dont see new logfile(
I have the same issue when using Telegraf (1.17.0) [inputs.tail] and Grok Pattern with Linux (Debian Buster latest version) build-in logrotation that rotates the logfile. After the log file gets rotatet Telegraf don't log the new data and output this error:
2021-01-05T23:00:00Z E! [inputs.tail] Tailing "/var/log/mylogfile.log": Unable to open file /var/log/mylogfile.log: open /var/log/mylogfile.log: permission denied
2021-01-05T23:01:01Z E! [inputs.tail] Stopping tail on "/var/log/mylogfile.log": Unable to open file /var/log/mylogfile.log: open /var/log/mylogfile.log: permission denied
After restarting the telegraf service all works fine again until the next log rotation. As workaround I restart the telegraf service after the log rotation, but thats not a clean solution.
Most helpful comment
This still seems like an issue for [inputs.tail] for
Telegraf 1.14.2. Once the file has been deleted and recreated with same name tail doesn't work anymore.