The fixes implemented in https://github.com/influxdata/telegraf/issues/1829 did not address the case where a file being watched by the Logparser plugin is deleted and recreated.
[agent]
flush_interval=2
[[inputs.logparser]]
files = ["/tmp/telegraf/input.log"]
from_beginning=true
[inputs.logparser.grok]
patterns=["%{INT:measurement}"]
[[outputs.file]]
files = ["stdout"]
Telegraf v1.3.0 (git: release-1.3 2bc5594b44145368823d7aa78bfb753ab51e9235)
Tested on CentOS 6.7
Run this bash script. It first tests the case similar to log rolling where the original file is moved. It next tests the delete case where the original file is deleted and a new file of the same name is created.
#!/bin/bash
function testCase {
    echo "$1 Test"
    rm -fr /tmp/telegraf
    mkdir /tmp/telegraf
    cat > /tmp/telegraf/config << EOF
[agent]
flush_interval=2
[[inputs.logparser]]
files = ["/tmp/telegraf/input.log"]
from_beginning=true
[inputs.logparser.grok]
patterns=["%{INT:measurement}"]
[[outputs.file]]
files = ["stdout"]
EOF
    # Start with 1 line of data in the input.log
    echo 0 > /tmp/telegraf/input.log
    rm -f nohup.out
    nohup telegraf --config /tmp/telegraf/config --debug 2>/dev/null &
    pid=$!
    for i in {1..5}
    do
        echo "Writing line $i"
        echo $i >> /tmp/telegraf/input.log
        sleep 10
        # Treat every non "move" value as "delete".
        if [ "$1" = "move" ]; then
            mv /tmp/telegraf/input.log /tmp/telegraf/input.log$i
        else
            rm /tmp/telegraf/input.log
        fi
    done
    echo
    kill -9 $pid
    echo "Results for $1"
    cat nohup.out
    echo
}
testCase move
sleep 2
testCase delete
Results
move Test
Writing line 1
Writing line 2
Writing line 3
Writing line 4
Writing line 5
Results for move
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="0" 1495561759145829922
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="1" 1495561759145871841
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="2" 1495561769118308036
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="3" 1495561779120012459
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="4" 1495561789121931675
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="5" 1495561799123925397
./xx: line 45: 26230 Killed nohup telegraf --config /tmp/telegraf/config --debug 2> /dev/null
delete Test
Writing line 1
Writing line 2
Writing line 3
Writing line 4
Writing line 5
Results for delete
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="0" 1495561811159941941
logparser_grok,host=node2055.svc.devpg.pdx.wd measurement="1" 1495561811159987661
The results of the test should be the same regardless of whether the file is deleted or moved.
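To make that comparison mechanical rather than visual, the captured output can be counted. The following is a minimal sketch, not part of the original report: `count_points` and `check_points` are hypothetical helper names, and it assumes Telegraf's stdout was captured in a file such as the `nohup.out` used by the script above.

```shell
#!/bin/bash
# Sketch: count the logparser_grok points in a captured Telegraf stdout file
# and compare the count with the number of lines the test wrote.
# count_points and check_points are illustrative names, not Telegraf tooling.

count_points() {
    # grep -c prints the number of matching lines (0 when none match).
    grep -c '^logparser_grok,' "$1"
}

check_points() {
    local file=$1 expected=$2 actual
    # grep exits non-zero on zero matches; the count is still printed.
    actual=$(count_points "$file") || true
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $actual points"
    else
        echo "FAIL: got $actual points, expected $expected"
        return 1
    fi
}
```

With the script above, `check_points nohup.out 6` is the expected outcome for both cases (the initial line `0` plus the five appended lines). In the posted results the move case yields 6 points and the delete case only 2.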
When the file is deleted, the Logparser plugin no longer detects that a new file with the same name has been created.
Our use case is that the log we are monitoring is always in the same place. It gets purged and recreated when the service doing the logging is upgraded.
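The purge-and-recreate sequence can be illustrated without Telegraf at all. The sketch below is plain bash under POSIX file semantics and says nothing about how the plugin is implemented; it only shows that a reader which keeps its descriptor open continues to see the deleted file, while the path now names a different file.

```shell
#!/bin/bash
# Illustration only: a descriptor opened before the delete keeps pointing at
# the old (deleted) file; the recreated file is only reachable via the path.
dir=$(mktemp -d)
log="$dir/input.log"

echo "old line" > "$log"
exec 3< "$log"             # open the file as a long-running reader would
rm "$log"                  # delete it while the descriptor is still open
echo "new line" > "$log"   # recreate a file with the same name

from_fd=$(cat <&3)         # reads from the deleted file
from_path=$(cat "$log")    # reads from the recreated file
exec 3<&-                  # close the descriptor

echo "descriptor sees: $from_fd"
echo "path sees:       $from_path"
rm -rf "$dir"
```

A watcher therefore has to notice the delete and reopen the path before it can see anything written to the new file.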
Here is a test case #2965
I can also confirm this issue with Telegraf 1.3.4 on Debian Jessie, with a Python logger doing the logging.
Hi guys, was this issue fixed in the latest version of Telegraf?
No, this is still an issue.
The problem still reproduces in version 1.4.5.
I think this is related: https://github.com/hpcloud/tail/issues/122.
https://github.com/hpcloud/tail/pull/125/files fixes the issue for me (keeping deleted file open).
Maybe that commit can be merged into https://github.com/influxdata/tail if the upstream library doesn't take it.
@piotr1212 That change looks like it introduces a race condition. I don't think it is the right change to make, but it might be a clue to how to fix this issue. Also, it should probably be opened in the fsnotify project and not in tail.
That is unfortunate. I forgot to mention that we had a full disk because the logparser plugin kept a deleted (rotated) file open. We switched to the "poll" method, which does not seem to be affected.
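For anyone who wants to check whether a running process is holding deleted files open (the situation that fills the disk), here is a minimal sketch. It assumes Linux, where `/proc/<pid>/fd` exposes open descriptors as symlinks and deleted targets carry a ` (deleted)` suffix; `deleted_open_files` is a hypothetical helper name, not a Telegraf command.

```shell
#!/bin/bash
# Sketch (Linux /proc only): list files that a process still holds open
# after they were deleted. deleted_open_files is an illustrative name.
deleted_open_files() {
    local pid=$1 fd target
    for fd in /proc/"$pid"/fd/*; do
        # Each entry is a symlink to the open file; skip any that vanish.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *" (deleted)") echo "$target" ;;
        esac
    done
}
```

Calling it with the Telegraf process ID, for example `deleted_open_files 26230`, prints one line per deleted file that is still open. Reading another user's `/proc/<pid>/fd` usually requires running as that user or as root.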