This conversation on Twitter:
https://twitter.com/redtwitdown/status/1283600428941246466
highlighted a difference in behaviour between File on Windows and on Unix.
On Windows, the flush method triggers a datasync operation down through the OS disk cache and disk layers.
On Unix, the same call is a no-op.
This is an attractive nuisance. BufWriter is a user-space construct designed to batch up small writes and hand the operating system large ones, so that things like fmt can write a single byte at a time if needed without paying the death-by-a-thousand-cuts toll at the OS boundary.
However, forcing every file to do a FlushFileBuffers / fdatasync is inappropriate: it will cause performance issues in many cases where a later explicit fdatasync, or even an fsync, would not. Consider, for instance, untarring a tgz: committing any single file to storage is irrelevant to the user. What matters is that when they are told 'all done' their data as a whole is safe, and this is most efficiently done by a single sync at the end of the entire operation, allowing ext4 and other file systems to optimise group placement on disk as much as possible.
I think the right thing to do here is to remove the FlushFileBuffers call from the Windows fs module, bringing it into alignment with the Unix module.
The other options are:
I haven't checked blame for the Windows source; I'm pretty sure the Unix source has been like this forever.
BufWriter::flush() first flushes user-space buffers to the underlying writer, then calls that writer's flush. If that's File, it results in OS writes and then the inconsistency here. Dropping BufWriter has correctness concerns because of the possibility of losing the user-space content, even if File::flush() is a NOP (and even if it tries to catch this in the drop).
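To make that ordering concrete, here is a minimal sketch using a made-up inner writer. `Recorder` and `byte_at_a_time` are hypothetical names for illustration, not anything in std; the sketch only counts which calls reach the writer underneath a BufWriter.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical inner writer that records every call it receives.
#[derive(Default)]
struct Recorder {
    writes: Vec<usize>, // size of each write() that reached the inner writer
    flushes: usize,     // number of flush() calls that reached the inner writer
}

impl Write for Recorder {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.flushes += 1;
        Ok(())
    }
}

/// Writes `n` single bytes through a BufWriter, flushes once,
/// and hands back the recorder so the calls can be inspected.
fn byte_at_a_time(n: usize) -> io::Result<Recorder> {
    let mut w = BufWriter::new(Recorder::default());
    for _ in 0..n {
        w.write_all(b"x")?; // stays in the user-space buffer
    }
    w.flush()?; // pushes the buffer down, then calls Recorder::flush
    Ok(w.into_inner()?)
}
```

With `n` well below the BufWriter capacity, the recorder should see one batched write followed by one flush, which is the point at which a File underneath would decide whether flush means anything.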
File::flush() is / can be a NOP because it has no internal buffer to flush, and it has separate sync operations.
But users are all just going through the Write trait, and code written against those traits is not supposed to know or care what it has been given. The File::sync* operations are not part of the Write trait, so BufWriter can't call them when it's just stacked on top of a Write impl.
So in a sense this is a consequence of that disconnect, plus perhaps some different platform history, expectation or previous issue.
Right - someone passing a writer around who also needs to fdatasync each file would presumably want that on all platforms, so a file-specific adapter that knows how to fdatasync on flush, inserted into the stack they create, would work well.
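A rough sketch of what such an adapter could look like. `SyncOnFlush` and `write_durably` are hypothetical names, not part of std; the only real APIs assumed are File::sync_data and the Write trait.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical adapter: a File whose flush() also asks the OS
/// to persist the data (fdatasync on Unix).
struct SyncOnFlush {
    file: File,
}

impl SyncOnFlush {
    fn new(file: File) -> Self {
        SyncOnFlush { file }
    }
}

impl Write for SyncOnFlush {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.file.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.file.flush()?;   // no internal buffer, kept for symmetry
        self.file.sync_data() // the part File::flush does not do
    }
}

/// Stacks BufWriter on top of the adapter, so one flush() drains the
/// user-space buffer and then syncs the file, on every platform.
fn write_durably(path: &Path, data: &[u8]) -> io::Result<()> {
    let file = File::create(path)?;
    let mut w = BufWriter::new(SyncOnFlush::new(file));
    w.write_all(data)?;
    w.flush()
}
```

Code that only sees a Write impl keeps calling flush as usual; whether that implies a sync is decided by whoever builds the stack.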
Code passing a file to a helper for a short bit of writing can also, when control returns, decide to fdatasync at that point.
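That pattern might look roughly like this. `write_lines` and `save_report` are hypothetical names for illustration: the helper only knows about Write, and the caller, which still owns the File, syncs once the helper returns.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical helper: knows only about `Write`, never about files or syncing.
fn write_lines<W: Write>(out: &mut W, lines: &[&str]) -> io::Result<()> {
    for line in lines {
        writeln!(out, "{}", line)?;
    }
    Ok(())
}

/// Hypothetical caller: owns the File, so it decides when durability matters.
fn save_report(path: &Path, lines: &[&str]) -> io::Result<()> {
    let file = File::create(path)?;
    {
        // Borrow the file for buffering; `&File` implements Write.
        let mut buffered = BufWriter::new(&file);
        write_lines(&mut buffered, lines)?;
        buffered.flush()?; // drain the user-space buffer into the OS
    }
    file.sync_data() // control is back with the caller: sync here
}
```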
You are linking to fdatasync on Windows and flush on Unix in the original post. The Windows implementation of flush does nothing, just like Unix's: https://github.com/rust-lang/rust/blob/master/src/libstd/sys/windows/fs.rs#L438-L440.
I am so embarrassed; how did I fluff that up? Sorry for the noise.