Node: 'fs' can not read data if same FD is used for write and readFile

Created on 9 Jun 2017  路  20Comments  路  Source: nodejs/node

  • Version: v6.11.0
  • Platform: Darwin computer.local 16.6.0 Darwin Kernel Version 16.6.0: Fri Apr 14 16:21:16 PDT 2017; root:xnu-3789.60.24~6/RELEASE_X86_64 x86_64
  • Subsystem: fs

Description:

https://github.com/nodejs/node/pull/3163 added file descriptor support to *File() functions (for example fs.readFile).

Issue: Writing then reading on the same FD does not read the data when using fs.readFile(fd). Only reading (witout writing) works as expected.

The demo opens a file descriptor, writes data to it and tries to read the data using the same FD and fs.readFile. The demo code uses sync methods for simplicity, non-sync version has the same behavior.

const fs = require('fs');
const path = require('path');

const filepath = path.resolve(__dirname, './test-write-and-read.txt');

// 'w+' - Open file for reading and writing.
// The file is created (if it does not exist)
// or truncated (if it exists).
const fd = fs.openSync(filepath, 'w+');

// 'r+' - Open file for reading and writing.
// An exception occurs if the file does not exist.
// const fd = fs.openSync(filepath, 'r+');

// fs.writeFileSync(fd, 'foo bar'); // no issue
fs.writeSync(fd, 'FOOBAR'); // ISSUE: readFileSync(fd) will not read content

// fs.fsyncSync(fd); // <- seems to have no impact

// always reads content (no issue)
const buf = Buffer.alloc(20);
fs.readSync(fd, buf, 0, 20, 0);
console.log('read with fd:', buf.toString())

// always reads content (no issue)
console.log('readFileSync with filepath:', fs.readFileSync(filepath, { encoding: 'utf8' }));

// ISSUE: does not read content if previously writeSync(fd, 'foo bar')
console.log('readFileSync with fd:', fs.readFileSync(fd, { encoding: 'utf8' }));

fs.closeSync(fd);

Expected output:

read with fd: FOOBAR
readFileSync with filepath: FOOBAR
readFileSync with fd: FOOBAR

Actual output:

read with fd: FOOBAR
readFileSync with filepath: FOOBAR
readFileSync with fd:

(last line is missing the file content FOOBAR)

Note:

  • fs.fsync seems to have no impact. Assuming this could help and flush the data to the filesystem
  • readFileSync with fd does successfully read data if previously no data has been written to the FD before and the test file contains content.
fs

Most helpful comment

@sam-github Thanks for the info!

  • fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))

    • that simple use does, but if you fd=openSync() and readFile twice from the same fd, the second read will get no data :-(

Just for the record. This behavior is unexpected for me and feels like a bug

All 20 comments

Can reproduce with Node.js 8.1.0 on Windows 7 x64.

This is something like https://github.com/nodejs/node/issues/7099, but not valid for readFile* cases: these methods should not read from the current fd position.

cc @nodejs/fs

@thisconnect Flushing the data to the filesystem seems unneeded, the file size is correctly changed before reading, if I get this right:

const fs = require('fs');
const path = require('path');

const filepath = path.resolve(__dirname, './test-write-and-read.txt');
const fd = fs.openSync(filepath, 'w+');

console.log(fs.statSync(filepath).size);
fs.writeSync(fd, 'FOOBAR');
console.log(fs.statSync(filepath).size);
console.log(fs.readFileSync(fd, { encoding: 'utf8' }));
0
6
  // empty string
  1. The cause for readFileSync() seems to be this: in this call chain the reading position is undefined:

https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L576
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L542
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L680

Which may fall into this note from the doc:

If position is null, data will be read from the current file position.

Does the binding function equal undefined and null here?

  1. For the fs.readFile() the same happens in this call chain:

https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L348
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L426
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L458
https://github.com/nodejs/node/blob/47b9772f52aba7693eed4df535b35de65ac22c49/lib/fs.js#L396

Does the binding function equal -1 and null here?

It seems we should transfer 0 as a position in these both cases.

This happens because when called with an fd, readFile[Sync] reads from the current file position. I tried to document this behavior in https://github.com/nodejs/node/pull/10853, but it got stalled and then closed.

It seems this better be fixed than documented. readFile[Sync] are supposed to return the full file content according to method names, and the current behavior seems confusing.

@vsemozhetbyt Please read the discussion in that PR. TL;DR some fd are unseekable, so the behavior would be inconsistent between seekable and unseekable fds.

Seems to be a duplicate of the https://github.com/nodejs/node/issues/9671

@thisconnect are you seeing the same failure in 6.10.x?

edit: this appears to fail on 6.10.x and 6.9.x... so it is the same older bug not a new regression

@MylesBorins yes same on 6.10.x

@thisconnect Flushing the data to the filesystem seems unneeded, the file size is correctly changed before reading, if I get this right

@vsemozhetbyt flushing was just an attempt to rule out this being a possible problem. I too get correct sizes with stats.

After reading why to not use fs.exists (see RECOMMENDED section of fs.exists in https://nodejs.org/api/fs.html#fs_fs_exists_path_callback) wrote my own small(ish) module, that nobody uses :) . See fildes (https://github.com/thisconnect/fildes), it promises some fs methods, and takes care about creating missing dir structure if flag is 'w', 'w+', 'a' or 'a+'. I was exited that readFile and writeFile support FD, that landed in Node 5.x and started using it with Node 6.x LTS.

I wasn't aware about the fact that fd have something like a position or could open non-seekable files (sockets?). After reading through the issues https://github.com/libuv/libuv/pull/1357 , #7099, #9671, #9699, #10809, #10853

As a naive Node.js user, this is what I suspect:

  • fs.readFile should read the whole file, else seems like a bug
  • fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))
  • FD's having something like a "current file position" is not a feature nor documented (afaik)
  • if FD would have postion support, there would be other API needed to work with it
  • fs.read should be used if one does not want the whole file
  • fs.read and fs.write chaning the position is unexpected
  • fs.readFile non-seekable should either read everything it can or throw an error

I pretty much agree with the expectations, FYI, here is some more info:

  • fs.readFile should read the whole file, else seems like a bug

    • agreed

  • fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))

    • that simple use does, but if you fd=openSync() and readFile twice from the same fd, the second read will get no data :-(

  • FD's having something like a "current file position" is not a feature nor documented (afaik)

    • Its fundamental to POSIX fd handling, and as such, not documented in the node fs docs, but it is indeed a feature. fds come in two flavours, seekable and not-seekable. files are seekable, because random access is possible, sockets and pipes are not seekable, you can only read the next data in the stream.

  • if FD would have postion support, there would be other API needed to work with it

    • there should probably be a fs.seek() API, I honestly thought there was but just realized its not there. I suspect it can't be implemented on Windows. However you can do positional reads, specifying the offset, so in a way there is an api to work with position. it works slightly differently on Windows and POSIX though: https://github.com/libuv/libuv/pull/1357

  • fs.read should be used if one does not want the whole file

    • agreed

  • fs.read and fs.write chaning the position is unexpected

    • well, undocumented, yes, but expected (EDIT: at least by those familiar with the underlying C calls) because its what happens on Windows, Linux,and all other runtimes that use fds to read and write files (all of them?)

  • fs.readFile non-seekable should either read everything it can or throw an error

    • agreed

@sam-github Thanks for the info!

  • fs.readFileSync(path) should equal fs.readFileSync(fs.openSync(path))

    • that simple use does, but if you fd=openSync() and readFile twice from the same fd, the second read will get no data :-(

Just for the record. This behavior is unexpected for me and feels like a bug

Should we create an issue to discuss what we're going to do with readFile and fds (document it or change the behavior)?

10853 already dicusses, it could be reopened.

IMO The expected cases should be fixed and behavior changed according to the current documentation. readFile should read whole file. writeFile should write whole file.

The weird edge cases with non-seekable fd's (files without a beginning) should be documented (assuming they already kind of work as currently documented).

10853 is already too long and hard to follow. I suggest to open a new issue with a proposal like this (if you agree) that people can agree or decline. To me this is a bug and should be fixed in 6.x too

EDIT: read @sam-github's summary https://github.com/nodejs/node/issues/13572#issuecomment-307542356

@nodejs/fs What should happen with this? Close it? Doc change? Code change? Discussion issue? Something else?

Below are some observations but to summarize: it doesn't matter what code changes we make, we'll still need to document the gotchas.

  1. The example is about sync methods; an async version (or a mix of sync and async) would be rife with race conditions because concurrent file operations would observe the file position moving around.

  2. fs.readFile() could be taught to use positional reads (fs.readFileSync() already does) but positional reads aren't atomic on Windows so (1) rears its ugly head again.

  3. fs.appendFile() and fs.writeFile() in append-only mode are inherently non-positional and therefore also suffer from (1)...

  4. ...as does any file that isn't seekable for whatever reason.

@bzoz Would it be possible to teach uv_fs_read() and uv_fs_write() to do positional ops without affecting the file position? ReadFile() and WriteFile() take an OVERLAPPED that lets you specify the offset, don't they?

@bnoordhuis positional ReadFile/WriteFile moves the file pointer, thus we have a workaround with restoring the position in https://github.com/libuv/libuv/pull/1357

We can make readFile use positional reads, using the same fd in multiple async operations is asking for trouble anyway. The current behavior however however has been in Node for quite some time, it is possible that we would break something with such change.

OTOH, maybe we should just add fs.seek?

New discussion is in https://github.com/nodejs/node/issues/23433 so I'm going to close this out.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

fanjunzhi picture fanjunzhi  路  3Comments

akdor1154 picture akdor1154  路  3Comments

vsemozhetbyt picture vsemozhetbyt  路  3Comments

danialkhansari picture danialkhansari  路  3Comments

srl295 picture srl295  路  3Comments