Rust: fd.flush() inconsistent between un*x and windows OS families

Created on 16 Jul 2020 · 4 Comments · Source: rust-lang/rust

This conversation on twitter:

https://twitter.com/redtwitdown/status/1283600428941246466

highlighted a difference in the behaviour of File on Windows and Unix.

On Windows, the flush method triggers a datasync operation down through the OS disk cache and disk layers.

On Unix, the same call is a no-op.

This is an attractive nuisance. BufWriter is a user-space construct designed to make it easy to batch up small writes and give the operating system large ones, so that things like fmt can write a single byte at a time if needed without paying the death-by-a-thousand-cuts toll at the OS boundary.

However, forcing every file to do a FlushFileBuffers / fdatasync is inappropriate: it will cause performance issues in many cases where a later explicit fdatasync, or even an fsync, would not. Consider, for instance, untarring a tgz: committing any single file to storage is irrelevant to the user. What matters is that when they are told 'all done' their data as a whole is safe, and this is most efficiently done by a sync/fsync at the end of the entire operation, allowing ext4 and other file systems to optimise group placement on disk as much as possible.

I think the right thing to do here is to remove the FlushFileBuffers call from the windows fs module, bringing it into alignment with the unix module.

The other options are:

Having flush() be inconsistent between platforms, making anything that uses it slow on Windows and fast on Unix, leading to a perception that Windows is slow when Rust is causing the issue.
Having flush() be slow on both platforms, so that rather than BufWriter making things faster by insulating high-level code from context-switching overheads, it will now also introduce IO overheads, which are typically 100-1000x slower than context switches.
Encouraging people not to use flush at all, and instead to just drop BufWriters, which has poor implications for correctness.

Most helpful comment

You are linking to fdatasync on Windows and flush on Unix in the original post. The Windows implementation of flush does nothing, just like Unix's: https://github.com/rust-lang/rust/blob/master/src/libstd/sys/windows/fs.rs#L438-L440.

sfackler on 16 Jul 2020

👍3

All 4 comments

BufWriter::flush() first flushes user-space buffers to the underlying Writer, then calls that Writer's flush. If that's File, it results in OS writes and then the inconsistency here. Dropping BufWriter has correctness concerns because of the possibility of losing the user-space content, even if File::flush() is a NOP (and even if it tries to catch this in the drop).
File::flush() is / can be a NOP because it has no internal buffer to flush, and it has separate sync operations.
But users are all just using this via the Write trait, and we're not supposed to really know or care what we've been given if we are just using those traits. And the File::sync* operations are not part of the Write trait, so BufWriter can't call them when it's just stacked on top of a Write impl.

So in a sense this is a consequence of that disconnect, plus perhaps some different platform history, expectation or previous issue.
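To illustrate the layering described above, here is a minimal sketch of code that knows it holds a File and therefore can ask for durability explicitly. The function name `write_durably` is made up for this example; the calls themselves (`BufWriter::flush`, `BufWriter::into_inner`, `File::sync_data`) are from the standard library.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical helper: write `data` through a BufWriter, then explicitly
/// ask the OS to persist the file contents.
fn write_durably(path: &Path, data: &[u8]) -> io::Result<()> {
    let file = File::create(path)?;
    let mut writer = BufWriter::new(file);
    writer.write_all(data)?;

    // Pushes the user-space buffer down to the File (OS writes),
    // then calls File::flush.
    writer.flush()?;

    // Recover the File. into_inner flushes again and, unlike a plain drop,
    // reports any error instead of discarding it.
    let file = writer.into_inner().map_err(|e| e.into_error())?;

    // Durability is a separate, File-specific request; it is not part of
    // the Write trait, so BufWriter could not have issued it for us.
    file.sync_data()
}
```

Generic code that only sees a `Write` has no equivalent of the last step, which is the disconnect in question.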

dcarosone on 16 Jul 2020

Right - someone passing a Writer around who also needs to do fdatasync on each file would presumably want this on all platforms, so having a file-specific adapter that knows how to fdatasync on flush, and inserting that into the stack they create would work well.
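A rough sketch of what such an adapter might look like follows. `SyncOnFlush` is a hypothetical name, not something in the standard library, and mapping flush to `sync_data` is just one possible policy.

```rust
use std::fs::File;
use std::io::{self, Write};

/// Hypothetical adapter: wraps a File so that `flush` also requests a
/// data sync (fdatasync-like) from the OS.
struct SyncOnFlush {
    file: File,
}

impl SyncOnFlush {
    fn new(file: File) -> Self {
        SyncOnFlush { file }
    }
}

impl Write for SyncOnFlush {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Plain pass-through; no extra buffering here.
        self.file.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.file.flush()?;
        // The File-specific step that the Write trait alone cannot express.
        self.file.sync_data()
    }
}
```

It can then be inserted into the stack like any other writer, for example `BufWriter::new(SyncOnFlush::new(file))`, so that every flush of the BufWriter ends in a data sync on all platforms.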

Code passing a file to a helper for a short bit of writing etc. can, when control returns, also decide to fdatasync at that point.
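For example (a sketch only; `write_header` and `create_with_header` are hypothetical names), the helper sees nothing but a `Write`, and the caller, which still owns the File, decides about syncing afterwards:

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: it only knows it was handed some `Write`.
fn write_header(out: &mut impl Write) -> io::Result<()> {
    out.write_all(b"# header\n")?;
    out.flush()
}

/// The caller owns the File, so it can sync once control returns.
fn create_with_header(path: &Path) -> io::Result<()> {
    let mut file = File::create(path)?;
    write_header(&mut file)?;
    // Durability decision made here, not inside the helper.
    file.sync_data()
}
```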

rbtcollins on 16 Jul 2020

👍1