Node: readFile(fd) does not read the whole file (at least on Windows, possibly other platforms)

Created on 18 Nov 2016  路  24Comments  路  Source: nodejs/node

  • Version: v7.1.0
  • Platform: Windows 10 64-bit
  • Subsystem: fs

I wrote a small test that passes on Mac and Linux but throws on Windows:

var fs = require('fs');
var buffer = Buffer.alloc(8192);
var path = 'test-sparse-read-file';
var fd = fs.openSync(path, 'w+');
console.log('created or truncated, sparse file size is now: ' + fs.fstatSync(fd).size);

console.log('writing 1 byte at offset ' + (8192 - 1));
fs.writeSync(fd, Buffer.alloc(1, 1), 0, 1, 8192 - 1);
console.log('sparse file size is now: ' + fs.fstatSync(fd).size + '\r\n');

console.log('reading 8192 bytes using readSync()...');
var bytesRead = fs.readSync(fd, buffer, 0, 8192, 0);
if (bytesRead !== 8192) {
  throw new Error('readSync() read ' + bytesRead + ' bytes');
} else {
  console.log('readSync() read ' + bytesRead + ' bytes');
}

console.log('\r\nreading entire file using readFileSync()...');
var result = fs.readFileSync(fd);
if (result.length !== buffer.length) {
  throw new Error('readFileSync returned a buffer with length ' + result.length);
} else {
  console.log('readFileSync returned a buffer with length ' + result.length);
}

fs.closeSync(fd);
fs.unlinkSync(path);
console.log('\r\nPASSED');

Linux:

created or truncated, sparse file size is now: 0
writing 1 byte at offset 8191
sparse file size is now: 8192

reading 8192 bytes using readSync()...
readSync() read 8192 bytes

reading entire file using readFileSync()...
readFileSync returned a buffer with length 8192

PASSED

Windows:

created or truncated, sparse file size is now: 0
writing 1 byte at offset 8191
sparse file size is now: 8192

reading 8192 bytes using readSync()...
readSync() read 8192 bytes

reading entire file using readFileSync()...
C:\Users\Joran\test-sparse.js:22
  throw new Error('readFileSync returned a buffer with length ' + result.length);
  ^

Error: readFileSync returned a buffer with length 0
    at Object.<anonymous> (C:\Users\Joran\test-sparse.js:22:9)
    at Module._compile (module.js:573:32)
    at Object.Module._extensions..js (module.js:582:10)
    at Module.load (module.js:490:32)
    at tryModuleLoad (module.js:449:12)
    at Function.Module._load (module.js:441:3)
    at Module.runMain (module.js:607:10)
    at run (bootstrap_node.js:420:7)
    at startup (bootstrap_node.js:139:9)
    at bootstrap_node.js:535:3

The test shows that the problem does not appear to be with read() but only with readFileSync().

readFile() has the same issue.

confirmed-bug fs

Most helpful comment

It seems I've found the cause.

fs.readFileSync calls tryReadSync() at line 485. tryReadSync() calls fs.readSync() at line 456, but with wrong arguments.

Compare fs.readSync() signature at line 622:

fs.readSync = function(fd, buffer, offset, length, position) {

and the call from tryReadSync() at line 456

bytesRead = fs.readSync(fd, buffer, pos, len);

so position is allways undefined, ie current.

All 24 comments

It seems sparseness is not the cause. This is also throws (Node.js 7.1.0 on Windows 7 x64):

var fs = require('fs');
var buffer = Buffer.alloc(8192);
var path = 'test-read-file';
var fd = fs.openSync(path, 'w+');
console.log('created or truncated, file size is now: ' + fs.fstatSync(fd).size);

console.log('writing 8192 bytes at offset ' + (0));
fs.writeSync(fd, Buffer.alloc(8192, 1), 0, 8192, 0);
console.log('file size is now: ' + fs.fstatSync(fd).size + '\r\n');

console.log('reading 8192 bytes using readSync()...');
var bytesRead = fs.readSync(fd, buffer, 0, 8192, 0);
if (bytesRead !== 8192) {
  throw new Error('readSync() read ' + bytesRead + ' bytes');
} else {
  console.log('readSync() read ' + bytesRead + ' bytes');
}

console.log('\r\nreading entire file using readFileSync()...');
var result = fs.readFileSync(fd);
if (result.length !== buffer.length) {
  throw new Error('readFileSync returned a buffer with length ' + result.length);
} else {
  console.log('readFileSync returned a buffer with length ' + result.length);
}

fs.closeSync(fd);
fs.unlinkSync(path);
console.log('\r\nPASSED');

It seems the cause is this: after writeSync and readSync, the readFileSync reads from their end fd position. See how error message changes in this case:

// ...
console.log('reading 8191 bytes using readSync()...');
var bytesRead = fs.readSync(fd, buffer, 0, 8191, 0);
if (bytesRead !== 8191) {
  throw new Error('readSync() read ' + bytesRead + ' bytes');
} else {
  console.log('readSync() read ' + bytesRead + ' bytes');
}
// ...
Error: readFileSync returned a buffer with length 1

Or in this case:

// ...
console.log('reading 1 bytes using readSync()...');
var bytesRead = fs.readSync(fd, buffer, 0, 1, 0);
if (bytesRead !== 1) {
  throw new Error('readSync() read ' + bytesRead + ' bytes');
} else {
  console.log('readSync() read ' + bytesRead + ' bytes');
}
// ...
Error: readFileSync returned a buffer with length 8191

While if you use path string instead of fd in your initial code all is OK:

// ...
var result = fs.readFileSync('test-sparse-read-file');
// ...
created or truncated, sparse file size is now: 0
writing 1 byte at offset 8191
sparse file size is now: 8192

reading 8192 bytes using readSync()...
readSync() read 8192 bytes

reading entire file using readFileSync()...
readFileSync returned a buffer with length 8192

PASSED

So the bug seems to be this one: readFile() and readFileSync() don't rewind fd position after writeSync and readSync on the same fd on Windows, while they do rewind on Mac and Linux (I can't confirm the last though).

Your script at https://github.com/nodejs/node/issues/9671#issuecomment-261521377 passes on OS X, I don't have a Windows to try on.

The function used to read files is fs.readSync(fd, buffer, pos, len), it calls http://man7.org/linux/man-pages/man2/pread.2.html, and it reads from a specific offset, and it does not change the file offset (per docs). In other words, no explicit "rewinding" of the fd is needed on Linux, readFile reads from explicit offsets, not the O/S maintained offset.

I wonder, could uv's emulation of pread for Windows not be correct, somehow? Could you try some experiments with fs.read/fs.readSync to see if they work incorrectly, bypassing the readFile wrappers?

@sam-github I'll try, but let's simplify it first. This throws with Node.js 7.1.0 on Windows 7 x64:

'use strict';

const assert = require('assert');
const fs = require('fs');

const filleLength = 2;
const path = 'test-read-file';
const fd = fs.openSync(path, 'w+');

fs.writeSync(fd, Buffer.alloc(filleLength, 1), 0, filleLength, 0);

assert.strictEqual(fs.readFileSync(fd).length, filleLength);
AssertionError: 0 === 2

Does it throw on Linux or Mac?

Does not throw on unixen, as expected.

Please remove the complex ball of js code that is readFile* from the picture, and replace that last line with assert.equal(filleLength, fs.readSync(fd, buf, 0, 10, 0), which I strongly suspect will not pass on Windows.

So, if I get it right, with some corrections (buf => Buffer.alloc(filleLength, 1) and 10 => filleLength), this is the new test code:

const assert = require('assert');
const fs = require('fs');

const filleLength = 2;
const path = 'test-read-file';
const fd = fs.openSync(path, 'w+');

fs.writeSync(fd, Buffer.alloc(filleLength, 1), 0, filleLength, 0);

assert.equal(filleLength, fs.readSync(fd, Buffer.alloc(filleLength, 1), 0, filleLength, 0));

It passes with Node.js 7.1.0 on Windows 7 x64. But this throws (nota bene position is null, so it is current after writeSync):

const assert = require('assert');
const fs = require('fs');

const filleLength = 2;
const path = 'test-read-file';
const fd = fs.openSync(path, 'w+');

fs.writeSync(fd, Buffer.alloc(filleLength, 1), 0, filleLength, 0);

assert.equal(filleLength, fs.readSync(fd, Buffer.alloc(filleLength, 1), 0, filleLength, null));
AssertionError: 2 == 0

Does this last code throw on Linux or Mac? It should, if I get it right. So maybe we should return to previous one.

/to @nodejs/platform-windows

and it does not change the file offset (per docs).

Are there any fs functions that _are_ supposed to increment file offset?

It seems I've found the cause.

fs.readFileSync calls tryReadSync() at line 485. tryReadSync() calls fs.readSync() at line 456, but with wrong arguments.

Compare fs.readSync() signature at line 622:

fs.readSync = function(fd, buffer, offset, length, position) {

and the call from tryReadSync() at line 456

bytesRead = fs.readSync(fd, buffer, pos, len);

so position is allways undefined, ie current.

@seishun fs.read/write[Sync] update the file offset, they use read(2) and write(2).

@vsemozhetbyt got time to PR a fix + regression test? It would be great if you did.

I'm afraid I have not the experience and knowledge required to edit the core and to make appropriate tests(

@sam-github I see. I guess the documentation could be clearer on this.

So the issue is that fs.read/write[Sync] shouldn't change current offset when reading/writing to/from a specific offset, but they do on Windows?

@seishun No, it appears from https://github.com/nodejs/node/issues/9671#issuecomment-261568603 that the issue is that the position is not actually being specified to fs.read in readFileSync, leading to the current OS maintained offset being used. The OS offset is correct only under assumption that readFile was called with a path, opened a new file, so started from offset of 0. Why this is windows specific, I don't know, perhaps because of different buffering strategies?

Or so it is described above, I might have time next week to look at this if nobody else grabs it.

Sorry for my previous deleted comment, I do can reproduce the failure with the readFile:

const assert = require('assert');
const fs = require('fs');

const filleLength = 2;
const path = 'test-read-file';
const fd = fs.openSync(path, 'w+');

fs.writeSync(fd, Buffer.alloc(filleLength, 1), 0, filleLength, 0);

fs.readFile(fd, (err, buf) => {
  if (err) throw err;
  assert.strictEqual(buf.length, filleLength);
});
AssertionError: 0 === 2

I'll try to find the cause.

It seems there's a bug on Linux too:

'use strict';

const assert = require('assert');
const fs = require('fs');

const fileLength = 2;
const path = 'test-read-file';
const fd = fs.openSync(path, 'w+');

fs.writeSync(fd, Buffer.alloc(fileLength, 1), 0, fileLength); // note: not specifying offset

assert.strictEqual(fs.readFileSync(fd).length, fileLength);

readFileSync is supposed to read "the entire contents of a file", but it actually reads starting from current offset.

I've tried to track down readFile() calls chain and I've ended up here:

binding.read(this.fd, buffer, offset, length, -1, req);

Somehow position is -1. Should it be this.pos?

@vsemozhetbyt -1 here means "read from current file position". It works when the initial file position is 0, but not in this case, when the file is already "seeked" ("sought"? ugh).

@seishun So, if ReadFileContext is used only in fs.readFile, this -1 in the ReadFileContext.prototype.read could be safely replaced by this.pos (ie 0 for the first call), because fs.readFile always must read from the start position?

It seems so. At any rate, tests pass with this.pos instead of -1.

Reopening because #9699 had to be reverted in #10809

Um how did this get closed?

This should remain open?

The provided test case no longer reproduces on master, so closing.

Was this page helpful?
0 / 5 - 0 ratings