From our chat:
"file sink" does not recreate a file if old one was deleted using "rm" - is it correct behaviour?
We should decide how to handle this.
@MOZGIII, @afoninsky reported this and I want to make sure the new file sink addresses this. Either in the initial PR or as a follow up.
is it correct behaviour
But isn't it correct behaviour? I think it is, or at least that's what I'd expect. Changing this to automatically reopen the fd would rule out some fancy tricks one could do.
Maybe we should provide a way to explicitly trigger file reopening rather than doing what is requested here - reopening the file after the removal?
Do you mean an option on the file sink itself?
More like via a signal or an external invocation, e.g. pkill -15 vector to force the file sink to reopen all of its files.
Eh, I don't think that makes sense. The file sink currently offers dynamic partitioning, so it's already introducing the capabilities to create files on the fly. If a file is removed, we should simply open a new file and proceed as usual, in my opinion. I assume we can react to write errors to recreate the file (with a backoff)?
You are right! However, when a file is removed, vector's writes to it won't start failing; they will continue to just work. Removing the file only removes the directory entry that maps the path to the inode; the inode itself stays alive for as long as vector holds an open fd to it.
Independently of this, if we get a file write error, we should indeed gracefully reopen the file (and create a new one if needed). This is what should be happening with the new file sink implementation.
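For anyone who wants to see the unlink behaviour for themselves, here is a minimal Python sketch. It is not vector code: the file name sink.log and the function write_after_unlink are made up for illustration, and it assumes a POSIX file system such as those typically used on Linux.

```python
import os
import tempfile


def write_after_unlink(directory: str) -> dict:
    """Open a file, unlink its path, then keep writing through the open fd."""
    path = os.path.join(directory, "sink.log")  # hypothetical file name
    with open(path, "w") as f:
        f.write("before unlink\n")
        f.flush()

        os.remove(path)  # same effect as running `rm` on the path

        # This write does not fail: the fd refers to the inode, not the path.
        f.write("after unlink\n")
        f.flush()

        stat = os.fstat(f.fileno())
        return {
            "path_exists": os.path.exists(path),
            "link_count": stat.st_nlink,
            "bytes_on_inode": stat.st_size,
        }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_after_unlink(tmp))
```

Both writes go through without an error, the path is gone, and the data written after the removal ends up on an inode that no longer has a name. That is why reacting to write errors alone would not catch this case.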
However, when a file is removed, vector's writes to it won't start failing; they will continue to just work. Removing the file only removes the directory entry that maps the path to the inode.
I assume we can figure something out though, like listening to inode events or using some kind of checkpointing strategy. It would be worth looking at other implementations.
Yes, we can register for inotify events for the files we open. Doable, should even be pretty straightforward, but, I mean - do we want to?
Silently dropping data is not acceptable, so we should try to resolve this. #2057 is tangentially related and might be a better lens to view this through.
To be clear, when the user removes the file, the data is not dropped: we still write to the deleted file. This is what users are normally used to on Linux.
To actually make vector reopen the file, users should just force-close the fd that vector holds to that file after they remove it.
I'd like to avoid any inotify shenanigans if possible.
There is a small delay between when the unlink happens and when inotify's event arrives, so we could still lose some logs.
How about checking st_nlink (st_nlink == 0 means unlinked) before every write, in combination with a buffered writer to mitigate the performance cost of checking st_nlink that often?
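A rough Python sketch of that idea, to make it concrete. This is only an illustration under assumptions, not how vector's file sink is or will be implemented: the class ReopeningFileWriter, its buffer_limit parameter and the reopen_count counter are all hypothetical names.

```python
import os
import tempfile


class ReopeningFileWriter:
    """Hypothetical writer that reopens its path once the file is unlinked."""

    def __init__(self, path: str, buffer_limit: int = 64 * 1024):
        self.path = path
        self.buffer_limit = buffer_limit
        self._buffer = bytearray()
        self._file = open(path, "ab")
        self.reopen_count = 0  # only here to make the behaviour observable

    def write(self, data: bytes) -> None:
        # Buffer in memory so the fstat below runs once per flush,
        # not once per event.
        self._buffer += data
        if len(self._buffer) >= self.buffer_limit:
            self.flush()

    def flush(self) -> None:
        if not self._buffer:
            return
        # st_nlink == 0 means no directory entry points at this inode anymore.
        if os.fstat(self._file.fileno()).st_nlink == 0:
            self._file.close()
            self._file = open(self.path, "ab")  # creates a fresh file
            self.reopen_count += 1
        self._file.write(self._buffer)
        self._file.flush()
        self._buffer.clear()

    def close(self) -> None:
        self.flush()
        self._file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sink.log")  # hypothetical file name
        writer = ReopeningFileWriter(path, buffer_limit=1)
        writer.write(b"first\n")
        os.remove(path)
        writer.write(b"second\n")
        writer.close()
        print(writer.reopen_count, os.path.exists(path))
```

Two caveats with this approach: there is still a window between the fstat call and the write in which the file can be removed, and a file that was moved rather than removed keeps st_nlink at 1, so this check alone would not notice it.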
I'm skeptical of making any deep changes here and agree with @MOZGIII that sticking to standard Linux file writing behavior is likely to be the least confusing for users.
That being said, we could definitely look around and see if any similar tools are doing tricks here to adjust their behavior to this use case.