[agent]
flush_interval = "10s"
flush_jitter = "5s"
interval = "10s"
metric_buffer_limit = 20000
round_interval = true
Telegraf version 1.15.1, Amazon Linux 2018.03
/var/log/telegraf directory exists, but has no files
Telegraf starts, and writes logs to /var/log/telegraf/telegraf.log
sudo service telegraf start
Starting the process telegraf [ OK ]
sh: /var/log/telegraf/telegraf.log: Permission denied
Telegraf does not start
When rolling back our version to 1.14.5, telegraf starts correctly and logs to /var/log/telegraf/telegraf.log.
Alternatively, if I manually create a blank log file named /var/log/telegraf/telegraf.log and chown it to telegraf, then version 1.15.1 will start correctly.
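The manual workaround above can be sketched as a small shell function. This is only an illustration, not part of the Telegraf packaging: the function name `ensure_telegraf_log` is hypothetical, and it assumes the service runs as a user named `telegraf`.

```shell
# Hypothetical helper: pre-create the log file and hand it to the service user.
# Arguments: log directory, owner (user or user:group).
ensure_telegraf_log() {
    log_dir="$1"
    owner="$2"
    # Create the directory if it is missing, then an empty log file inside it.
    mkdir -p "$log_dir" || return 1
    touch "$log_dir/telegraf.log" || return 1
    # Give the file to the service user so the init script can append to it.
    chown "$owner" "$log_dir/telegraf.log"
}

# Example, run as root on an affected host (assumed user name):
#   ensure_telegraf_log /var/log/telegraf telegraf
```

This only changes the log file, not the directory, so it matches the workaround as described and leaves the packaged directory ownership untouched.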
This unexpectedly broke metrics reporting for us today on new hosts.
This seems like a permission problem. Telegraf should have write permissions to the /var/log/telegraf folder. Make sure that the user running the telegraf process, or its group, has write access to this folder.
@ssoroka I rolled back telegraf to 1.14.5 on the same host, with no permissions changes or any other changes to the host besides the yum downgrade, and the process started just fine.
This is the only reason I filed an issue here with telegraf.
Did telegraf 1.15.1 add some new requirement for creating this log file? I'm working around things on my end, but would like to understand why a new telegraf version was responsible for breaking things.
We are unable to start the telegraf service on fresh installations on EL nodes as well. This looks like an RPM packaging change.
In the 1.14.5 RPM /var/log/telegraf is owned by telegraf:telegraf
# rpm -qvl telegraf
-rw-r--r-- 1 root root 131 Jun 30 19:20 /etc/logrotate.d/telegraf
drwxr-xr-x 2 root root 0 Jun 30 19:21 /etc/telegraf
-rw-r--r-- 1 root root 235766 Jun 30 19:20 /etc/telegraf/telegraf.conf
drwxr-xr-x 2 root root 0 Jun 30 19:21 /etc/telegraf/telegraf.d
-rwxr-xr-x 1 root root 69213184 Jun 30 19:20 /usr/bin/telegraf
drwxr-xr-x 2 root root 0 Jun 30 19:21 /usr/lib/.build-id
drwxr-xr-x 2 root root 0 Jun 30 19:21 /usr/lib/.build-id/87
lrwxrwxrwx 1 root root 28 Jun 30 19:21 /usr/lib/.build-id/87/58b4a9009b5001278739cd097e59d24c18f23e -> ../../../../usr/bin/telegraf
-rw-r--r-- 1 root root 5803 Jun 30 19:20 /usr/lib/telegraf/scripts/init.sh
-rw-r--r-- 1 root root 492 Jun 30 19:20 /usr/lib/telegraf/scripts/telegraf.service
drwxr-xr-x 2 telegraf telegraf 0 Jun 30 19:21 /var/log/telegraf
In the 1.15.1 RPM it is owned by root:root
# rpm -qvl telegraf
-rw-r--r-- 1 root root 131 Jul 22 22:21 /etc/logrotate.d/telegraf
-rw-r--r-- 1 root root 250761 Jul 22 22:21 /etc/telegraf/telegraf.conf
drwxr-xr-x 2 root root 0 Jul 22 22:21 /etc/telegraf/telegraf.d
-rwxr-xr-x 1 root root 69730912 Jul 22 22:21 /usr/bin/telegraf
drwxr-xr-x 2 root root 0 Jul 22 22:21 /usr/lib/.build-id
drwxr-xr-x 2 root root 0 Jul 22 22:21 /usr/lib/.build-id/3c
lrwxrwxrwx 1 root root 28 Jul 22 22:21 /usr/lib/.build-id/3c/1b944565dc487f5646d216f361977b5c6bb4c0 -> ../../../../usr/bin/telegraf
-rwxr-xr-x 1 root root 5803 Jul 22 22:21 /usr/lib/telegraf/scripts/init.sh
-rw-r--r-- 1 root root 492 Jul 22 22:21 /usr/lib/telegraf/scripts/telegraf.service
drwxr-xr-x 2 root root 0 Jul 22 22:21 /var/log/telegraf
It looks like the Debian packages are working because there is a chown in the post-install script: https://github.com/influxdata/telegraf/blob/master/scripts/deb/post-install.sh#L52 This is not present in the RPM post-install script.
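To confirm on an installed host whether the log directory has the ownership seen in the 1.14.5 listing, a check along these lines may help. It is a rough sketch: the function names `path_owner` and `check_owner` are hypothetical, and it relies on the owner and group being the third and fourth fields of `ls -ld` output.

```shell
# Hypothetical helper: print "owner:group" for a path, parsed from `ls -ld`.
path_owner() {
    ls -ld -- "$1" | awk '{ print $3 ":" $4 }'
}

# Hypothetical check: succeed only if the path has the expected owner:group.
check_owner() {
    [ "$(path_owner "$1")" = "$2" ]
}

# Example on an affected host:
#   check_owner /var/log/telegraf telegraf:telegraf \
#     || echo "unexpected owner: $(path_owner /var/log/telegraf)"
```

With the 1.15.1 RPM listing shown above, the example would be expected to report root:root for /var/log/telegraf.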
Thanks for the follow up! Reopening
This should be resolved. You might have to wait for the nightly RPM build to test it. I reproduced the problem locally in a VM and the fix seems to resolve it, so I think this should work.