Description:
https://github.com/nodejs/node/pull/3163 added file descriptor support to *File() functions (for example fs.readFile).
Issue: Writing then reading on the same FD does not read the data when using fs.readFile(fd). Only reading (witout writing) works as expected.
The demo opens a file descriptor, writes data to it and tries to read the data using the same FD and fs.readFile. The demo code uses sync methods for simplicity, non-sync version has the same behavior.
const fs = require('fs');
const path = require('path');
const filepath = path.resolve(__dirname, './test-write-and-read.txt');
// 'w+' - Open file for reading and writing.
// The file is created (if it does not exist)
// or truncated (if it exists).
const fd = fs.openSync(filepath, 'w+');
// 'r+' - Open file for reading and writing.
// An exception occurs if the file does not exist.
// const fd = fs.openSync(filepath, 'r+');
// fs.writeFileSync(fd, 'foo bar'); // no issue
fs.writeSync(fd, 'FOOBAR'); // ISSUE: readFileSync(fd) will not read content
// fs.fsyncSync(fd); // <- seems to have no impact
// always reads content (no issue)
const buf = Buffer.alloc(20);
fs.readSync(fd, buf, 0, 20, 0);
console.log('read with fd:', buf.toString())
// always reads content (no issue)
console.log('readFileSync with filepath:', fs.readFileSync(filepath, { encoding: 'utf8' }));
// ISSUE: does not read content if previously writeSync(fd, 'foo bar')
console.log('readFileSync with fd:', fs.readFileSync(fd, { encoding: 'utf8' }));
fs.closeSync(fd);
Expected output:
read with fd: FOOBAR
readFileSync with filepath: FOOBAR
readFileSync with fd: FOOBAR
Actual output:
read with fd: FOOBAR
readFileSync with filepath: FOOBAR
readFileSync with fd:
(last line is missing the file content FOOBAR)
Note:
fs.fsync seems to have no impact. Assuming this could help and flush the data to the filesystemCan reproduce with Node.js 8.1.0 on Windows 7 x64.
This is something like https://github.com/nodejs/node/issues/7099, but not valid for readFile* cases: these methods should not read from the current fd position.
cc @nodejs/fs
@thisconnect Flushing the data to the filesystem seems unneeded, the file size is correctly changed before reading, if I get this right:
const fs = require('fs');
const path = require('path');
const filepath = path.resolve(__dirname, './test-write-and-read.txt');
const fd = fs.openSync(filepath, 'w+');
console.log(fs.statSync(filepath).size);
fs.writeSync(fd, 'FOOBAR');
console.log(fs.statSync(filepath).size);
console.log(fs.readFileSync(fd, { encoding: 'utf8' }));
0
6
// empty string
readFileSync() seems to be this: in this call chain the reading position is undefined:https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L576
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L542
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L680
Which may fall into this note from the doc:
If position is null, data will be read from the current file position.
Does the binding function equal undefined and null here?
fs.readFile() the same happens in this call chain:https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L348
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L426
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L458
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L396
Does the binding function equal -1 and null here?
It seems we should transfer 0 as a position in these both cases.
This happens because when called with an fd, readFile[Sync] reads from the current file position. I tried to document this behavior in https://github.com/nodejs/node/pull/10853, but it got stalled and then closed.
It seems this better be fixed than documented. readFile[Sync] are supposed to return the full file content according to method names, and the current behavior seems confusing.
@vsemozhetbyt Please read the discussion in that PR. TL;DR some fd are unseekable, so the behavior would be inconsistent between seekable and unseekable fds.
Seems to be a duplicate of the https://github.com/nodejs/node/issues/9671
@thisconnect are you seeing the same failure in 6.10.x?
edit: this appears to fail on 6.10.x and 6.9.x... so it is the same older bug not a new regression
@MylesBorins yes same on 6.10.x
@thisconnect Flushing the data to the filesystem seems unneeded, the file size is correctly changed before reading, if I get this right
@vsemozhetbyt flushing was just an attempt to rule out this being a possible problem. I too get correct sizes with stats.
After reading why to not use fs.exists (see RECOMMENDED section of fs.exists in https://nodejs.org/api/fs.html#fs_fs_exists_path_callback) wrote my own small(ish) module, that nobody uses :) . See fildes (https://github.com/thisconnect/fildes), it promises some fs methods, and takes care about creating missing dir structure if flag is 'w', 'w+', 'a' or 'a+'. I was exited that readFile and writeFile support FD, that landed in Node 5.x and started using it with Node 6.x LTS.
I wasn't aware about the fact that fd have something like a position or could open non-seekable files (sockets?). After reading through the issues https://github.com/libuv/libuv/pull/1357 , #7099, #9671, #9699, #10809, #10853
As a naive Node.js user, this is what I suspect:
I pretty much agree with the expectations, FYI, here is some more info:
fd=openSync() and readFile twice from the same fd, the second read will get no data :-(fs.seek() API, I honestly thought there was but just realized its not there. I suspect it can't be implemented on Windows. However you can do positional reads, specifying the offset, so in a way there is an api to work with position. it works slightly differently on Windows and POSIX though: https://github.com/libuv/libuv/pull/1357@sam-github Thanks for the info!
- fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))
- that simple use does, but if you fd=openSync() and readFile twice from the same fd, the second read will get no data :-(
Just for the record. This behavior is unexpected for me and feels like a bug
Should we create an issue to discuss what we're going to do with readFile and fds (document it or change the behavior)?
IMO The expected cases should be fixed and behavior changed according to the current documentation. readFile should read whole file. writeFile should write whole file.
The weird edge cases with non-seekable fd's (files without a beginning) should be documented (assuming they already kind of work as currently documented).
EDIT: read @sam-github's summary https://github.com/nodejs/node/issues/13572#issuecomment-307542356
@nodejs/fs What should happen with this? Close it? Doc change? Code change? Discussion issue? Something else?
Below are some observations but to summarize: it doesn't matter what code changes we make, we'll still need to document the gotchas.
The example is about sync methods; an async version (or a mix of sync and async) would be rife with race conditions because concurrent file operations would observe the file position moving around.
fs.readFile() could be taught to use positional reads (fs.readFileSync() already does) but positional reads aren't atomic on Windows so (1) rears its ugly head again.
fs.appendFile() and fs.writeFile() in append-only mode are inherently non-positional and therefore also suffer from (1)...
...as does any file that isn't seekable for whatever reason.
@bzoz Would it be possible to teach uv_fs_read() and uv_fs_write() to do positional ops without affecting the file position? ReadFile() and WriteFile() take an OVERLAPPED that lets you specify the offset, don't they?
@bnoordhuis positional ReadFile/WriteFile moves the file pointer, thus we have a workaround with restoring the position in https://github.com/libuv/libuv/pull/1357
We can make readFile use positional reads, using the same fd in multiple async operations is asking for trouble anyway. The current behavior however however has been in Node for quite some time, it is possible that we would break something with such change.
OTOH, maybe we should just add fs.seek?
New discussion is in https://github.com/nodejs/node/issues/23433 so I'm going to close this out.
Most helpful comment
@sam-github Thanks for the info!
Just for the record. This behavior is unexpected for me and feels like a bug