Beats: Use filepath instead of inode

Created on 1 May 2018 · 10 Comments · Source: elastic/beats

I have followed many threads about Filebeat having problems with inode re-use. Many of them end without a solution. We are using Filebeat, and having inode re-use issues as well, even when the filenames are unique. When this happens, Filebeat starts reading a new file at the wrong offset. This results in fields being unaligned. One solution to this is to have an option to use the full filepath and filename as the hashed key into the Filebeat file registry, instead of the inode.
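To make the failure mode concrete, here is a minimal sketch of an offset registry keyed by inode versus one keyed by path. It is a toy model, not Filebeat's actual registry code: the `Registry` class, the `key_by` option, the file paths, and the inode number are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Registry:
    """Toy offset registry. `key_by` is "inode" or "path" (hypothetical option)."""
    key_by: str
    offsets: dict = field(default_factory=dict)

    def _key(self, path: str, inode: int):
        # Choose which identity the saved offset is filed under.
        return inode if self.key_by == "inode" else path

    def save(self, path: str, inode: int, offset: int) -> None:
        self.offsets[self._key(path, inode)] = offset

    def resume_offset(self, path: str, inode: int) -> int:
        # Offset at which a reader would start; 0 if the file is unknown.
        return self.offsets.get(self._key(path, inode), 0)


def simulate(key_by: str) -> int:
    reg = Registry(key_by)
    # The first file is read up to byte 500 and is then deleted by rotation.
    reg.save("/var/log/app-001.csv", inode=1234, offset=500)
    # The filesystem hands the freed inode number to a brand-new file
    # with a different, unique name.
    return reg.resume_offset("/var/log/app-002.csv", inode=1234)


if __name__ == "__main__":
    print("inode key:", simulate("inode"))  # 500: new file starts mid-record
    print("path key:", simulate("path"))    # 0: new file starts at the beginning
```

With the inode as the key, the new file inherits the old file's offset, which matches the misaligned fields described above. With the path as the key, it starts from the beginning, but only because the filenames are unique.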

Some might say this is a feature request, but I think opening the wrong file, and reading from the wrong offset is a bug.

Labels: Filebeat, Stalled, enhancement, needs_team

All 10 comments

This seems to be a duplicate of https://github.com/elastic/beats/issues/4368 ?

There are several reasons I can see why we should add this "feature ;-)", and I elaborated on the benefits in another issue (which I can't find at the moment). In summary, this should be a separate prospector type, because not having to worry about file rotation etc. simplifies things.

@spacepacket Can you share your exact inode reuse use case that you couldn't solve with the current options in filebeat?

I have described more details about the inode reuse issue here: https://discuss.elastic.co/t/filebeat-to-logstash-error-parsing-csv/127665

We are seeing the same issue using a CIFS mount that seemingly provides changing inodes. Our log filenames are unique, so we would prefer to use the filename as the identifier.

Yes, I think this feature would be very useful in many situations. In many cases we don't need the inode, and the file path is enough to identify the file.

Logstash too...

I want this feature.
We have a reproducer that clearly shows Filebeat losing data.
I strongly suspect inode reuse is the culprit.
We even have a support contract with Elastic. We gave Elastic support the reproducer, and they still haven't found the problem after weeks.

I have the same problem...

Elastic support has agreed this is an inode-reuse issue.
They have documentation that says "don't do that".
Our solution was to use a combination of settings that tells the since_db to refresh more often than we rotate files. You can make this work, but it's disappointing that there is no option to use filename instead of inode as the unique key in the since_db.
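The idea behind that workaround can be sketched roughly as follows. This is a toy model only: the comment above does not name the actual settings, so `ExpiringRegistry`, `ttl`, and `rotation_interval` are hypothetical stand-ins for "state expires after some inactivity" and "time until an inode is reused".

```python
class ExpiringRegistry:
    """Toy inode-keyed registry whose entries expire after `ttl` seconds of inactivity."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.entries = {}  # inode -> (offset, last_update_time)

    def save(self, inode: int, offset: int, now: float) -> None:
        self.entries[inode] = (offset, now)

    def clean(self, now: float) -> None:
        # Drop state for files that have not been updated within the TTL.
        self.entries = {
            inode: (offset, seen)
            for inode, (offset, seen) in self.entries.items()
            if now - seen < self.ttl
        }

    def resume_offset(self, inode: int, now: float) -> int:
        self.clean(now)
        offset, _ = self.entries.get(inode, (0, now))
        return offset


def offset_after_rotation(ttl: float, rotation_interval: float) -> int:
    reg = ExpiringRegistry(ttl)
    # The old file was fully read (500 bytes) at time 0 and then rotated away.
    reg.save(inode=1234, offset=500, now=0.0)
    # A new file reuses inode 1234 one rotation interval later.
    return reg.resume_offset(inode=1234, now=rotation_interval)


if __name__ == "__main__":
    print(offset_after_rotation(ttl=60, rotation_interval=3600))    # 0
    print(offset_after_rotation(ttl=7200, rotation_interval=3600))  # 500
```

When the stored state expires before the inode can be reused, the new file starts at offset 0. When the state outlives the rotation interval, the stale offset is picked up again.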

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

This issue doesn't have a Team:<team> label.
