Node: fs.readSync throws EOF Error instead of returning 0 (Windows, if STDIN is a pipe)

Created on 6 Nov 2020  ·  35Comments  ·  Source: nodejs/node

  • Version: 14.15.0
  • Platform: Windows 10 x86_64
  • Subsystem: fs

What steps will reproduce the bug?

Given the following node script:

/// script.js
const fs = require('fs');

console.error('Node: ' + process.execPath);
console.error('Version: ' + process.version);
console.error('Platform: ' + process.platform);

console.error('process.stdin: ' + process.stdin.constructor.name);

const fd = 0; // STDIN
const buffer = Buffer.alloc(4_096);

let read;
do {
  read = fs.readSync(fd, buffer, 0, buffer.length, null);
  if (read != 0) {
      console.log('STDIN: ' + buffer.slice(0, read).toString('utf-8'))
  }
} while (read != 0);

console.log('EOF was (normally) reached.')

Will behave differently based on how it is invoked:

$ node script.js
Node: C:\Program Files\nodejs\node.exe
Version: v14.15.0
Platform: win32
process.stdin: ReadStream
Data!
STDIN: Data!
^Z
EOF was (normally) reached.


$ echo 'Data!' | node script.js
Node: C:\Program Files\nodejs\node.exe
Version: v14.15.0
Platform: win32
process.stdin: Socket
STDIN: Data!
internal/fs/utils.js:308
    throw err;
    ^

Error: EOF: end of file, read
    at Object.readSync (fs.js:592:3)
    at Object.<anonymous> (H:\repro\test\script.js:14:13)
    at Module._compile (internal/modules/cjs/loader.js:1063:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1092:10)
    at Module.load (internal/modules/cjs/loader.js:928:32)
    at Function.Module._load (internal/modules/cjs/loader.js:769:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)
    at internal/main/run_main_module.js:17:47 {
  errno: -4095,
  syscall: 'read',
  code: 'EOF'
}

How often does it reproduce? Is there a required condition?

This issue happens 100% of the time when process.stdin is a Socket instance.

What is the expected behavior?

The expected and documented behavior is that fs.readSync returns 0 when EOF is reached.

What do you see instead?

Instead, an Error: EOF is thrown, which is unexpected.

Additional information

This problem was surfaced as users are starting to see failures on Windows at https://github.com/aws/aws-cdk/issues/11314.

We have additionally seen issues where in similar conditions (process.stdin is a socket/pipe), attempting to perform fs.readSync on file descriptor 0 (aka STDIN), occasionally results in unexpected EAGAIN errors being thrown, which might be related. That was highlighted in https://github.com/aws/aws-cdk/issues/5187 and we are currently working with a pretty ugly workaround that would be nice if removed (but that other issue is a lot more difficult to steadily reproduce, as it does not always trigger, and may be caused by side-effects of another operation).

All 35 comments

The stack trace of the error points at the following code:

https://github.com/nodejs/node/blob/c1da528bc25c9cc5a8240a7b4f136f5968f6e113/lib/fs.js#L590-L592

I basically expect the windows binding returns different codes for Socket than for ReadStream and in turn causes the difference in behavior.

You will get much better results with:

/// script.js
const fs = require('fs');

console.error('Node: ' + process.execPath);
console.error('Version: ' + process.version);
console.error('Platform: ' + process.platform);

console.error('process.stdin: ' + process.stdin.constructor.name);

const buffer = Buffer.alloc(4_096);

let read;
do {
    read = fs.readFileSync('/dev/stdin', buffer, 0, buffer.length, null);
    if (read != 0) {
        console.log('STDIN: ' + buffer.slice(0, read).toString('utf-8'));
    }
} while (read != 0);

console.log('EOF was (normally) reached.');

Don't fiddle with the 0, 1 and 2 file descriptors. They are open in non-blocking mode and libuv doesn't expect this when you are passing a file descriptor.

Hey @mmomtchev,

read = fs.readFileSync('/dev/stdin', buffer, 0, buffer.length, null);

This is interesting and all... But /dev/stdin is not a thing on Windows, and I cannot require WSL... 🤷🏻‍♂️
Or does fs have some magic tricks to make it look like this is a file that exists?

No, you are right, I can't think of any easy way to synchronously read from stdin on Windows
There are npm modules or you could go the async way
There is an awful solution here http://corpus.hubwiz.com/2/node.js/3430939.html that will simulate blocking access, repeating the call on EAGAIN and breaking the loop on EOF

But the real answer is that Node is built for async I/O

We already have the ugly code to loop on EAGAIN...
Unfortunately, the use-case I have does not lend itself to async exchange here... As there are ordering guarantees I must maintain.

The real thing I'm concerned with here, is how fs.readSync throws an EOF error... When it is documented to never do this and instead return 0.

It says that the descriptor you are passing must be opened in blocking mode.

It says that the descriptor you are passing must be opened in blocking mode.

Where though?

I'm looking at: https://nodejs.org/api/fs.html#fs_fs_readsync_fd_buffer_offset_length_position, and the only place on this page where a requirement for fd to be in blocking (or non-blocking for that matters) mode is on createReadStream and createWriteStream, none of which I am using...


In any case, in light of your confirming that we are doing things likely to trip libuv in weird and awkward ways, I've been implementing a more comprehensive (dirty) workaround in aws/jsii#2238 that also catches the EOF error; which would buy us time to figure out if/how we can proceed without needing to do synchronous I/O on our STD{IN,OUT,ERR}...

@RomainMuller in fact, the stdin is not opened in non-blocking mode on Windows, that is the case only on Linux and OSX, so maybe there is a solution to your problem

On Windows libuv does not properly handle pipes when operating in file mode. I wonder if it is supposed to, because normally, there is a separate pipe mode, but it is a trivial change. In any case, this issue should be taken to https://github.com/libuv/libuv
It happens because the writing process exits and closes the anonymous pipe.
Feel free to open it there, referencing this issue.

This is everything that is needed to make it work:

--- a/deps/uv/src/win/fs.c
+++ b/deps/uv/src/win/fs.c
@@ -918,7 +918,7 @@ void fs__read(uv_fs_t* req) {
     SET_REQ_RESULT(req, bytes);
   } else {
     error = GetLastError();
-    if (error == ERROR_HANDLE_EOF) {
+    if (error == ERROR_HANDLE_EOF || error == ERROR_BROKEN_PIPE) {
       SET_REQ_RESULT(req, bytes);
     } else {
       SET_REQ_WIN32_ERROR(req, error);

But then again, you will end up with Windows-specific JS code, which is not the Node way

@ronag is this supposed to work?

Sorry, sync api's is not my area of expertise.

It is not specific to synchronous mode. This is getting stranger by the minute.
On Linux this leaves stdin in non-blocking mode:

const fs = require('fs');
console.error('process.stdin: ' + process.stdin.constructor.name);
(async () => {
    let read;
    read = fs.createReadStream(null, { fd: 0 })
    for await (const r of read)
        console.log(r)
    console.log('EOF was (normally) reached.')
})();

but this leaves it in blocking mode:

const fs = require('fs');

(async () => {
    let read;
    read = fs.createReadStream(null, { fd: 0 })
    for await (const r of read)
        console.log(r)
    console.log('EOF was (normally) reached.')
})();

The difference is accessing process.stdin.constructor.name. Adding a simple console.error doesn't change anything.

Easiest way to check if a file descriptor is in blocking or non-blocking mode is cat /proc/<pid>/fdinfo/<fd> - look at flags, the fourth digit from the right to the left will be 4 for non-blocking mode

@RomainMuller, please disregard what I said, it seems that Node has a problem with handling consistently stdin across different situations and different platforms

@joyeecheung, the type of stdin is reassigned in the process.stdin getter??

If users are not supposed to be playing with 0, 1 and 2 file descriptors anymore, it should probably be mentioned, and maybe even add a warning
I have seen tons of examples with these, there is a widely used AsyncResource tutorial which does writeSync(1, ...)

So there are three separate issues

  • Should the type of stdin be reassigned in the process.stdin getter - meaning getting the value of process.stdin has side effects
  • Should stdin be a non-blocking tty on Linux - meaning that the 0, 1 and 2 fds become unusable for the user
  • A pipe stdin on Windows when used as a file, does not accept EPIPE as a valid EOF leading to @RomainMuller's original issue

The 2nd bullet point there has two tracking issues for it already (https://github.com/libuv/libuv/issues/2962 and https://github.com/libuv/libuv/issues/2062)

There are some examples I think have the same reason.
Windows 7 via bash from Git for Windows, Node v13.14.0
works as expected:

$ echo test | node.exe -p 'require("fs").readFileSync(0)'
$ echo test | node.exe -e 'require("fs").readFile(0, console.log)'

doesn't works:

$ curl -s http://ifconfig.me/all.json | node.exe -p 'require("fs").readFileSync(0)'
<...>
Error: EOF: end of file, read
    at Object.readSync (fs.js:578:3)
    at tryReadSync (fs.js:353:20)
    at Object.readFileSync (fs.js:390:19)
    at [eval]:1:15
    at Script.runInThisContext (vm.js:131:20)
    at Object.runInThisContext (vm.js:297:38)
    at Object.<anonymous> ([eval]-wrapper:10:26)
    at Module._compile (internal/modules/cjs/loader.js:1118:30)
    at evalScript (internal/process/execution.js:94:25)
    at internal/main/eval_string.js:23:3 {
  errno: -4095,
  syscall: 'read',
  code: 'EOF'
}

$ curl -s http://ifconfig.me/all.json | node.exe -e 'require("fs").readFile(0, console.log)'
<...>
[Error: EOF: end of file, read] {
  errno: -4095,
  code: 'EOF',
  syscall: 'read'
}

In all cases constructor name of stdin is Socket

@joyeecheung @ronag @mcollina

There are three separate issues here:

  • The first and most important, is that stdio seems to change type based on an auto-detection code. I think that this is a design decision by @joyeecheung . If this is indeed the case, than the users need clarification on the 0, 1 and 2 file descriptors. As, depending on the type of the stdio (pipe, tty, file...), these can switch to non-blocking mode, they should not be used anymore as they are not compatible with the file operations of libuv. This should be made clear, especially since I found at least one popular tutorial on AsyncResource which suggests that people make use of writeFileSync(1, ...) in order to avoid the asynchronous operation of console.log. Currently using the 0, 1 and 2 file descriptors will lead to completely random results, depending on the stdio's type and OS. This is not limited to writeFileSync, all synchronous and asynchronous file operations that bypass the ReadStream on these descriptors are affected.
    Changing this requires rewriting lots of code.
  • The second one is that fact that the stdio reassignment takes places in the getter of process.stdin / process.stdout. I think that this was an oversight when trying to lazy-load the ReadStream. So user code will behave in a radically different manner if one adds a console.log(process.stdin.constructor.name) as @RomainMuller did.
    Fixing this is trivial.

  • The status of the third issue - that a pipe stdin on Windows when used as a file, does not accept EPIPE as a valid EOF leading to @RomainMuller's original issue - depends on the decisions for the first two

The first and most important, is that stdio seems to change type based on an auto-detection code. I think that this is a design decision by @joyeecheung . If this is indeed the case, than the users need clarification on the 0, 1 and 2 file descriptors. As, depending on the type of the stdio (pipe, tty, file...), these can switch to non-blocking mode, they should not be used anymore as they are not compatible with the file operations of libuv. This should be made clear, especially since I found at least one popular tutorial on AsyncResource which suggests that people make use of writeFileSync(1, ...) in order to avoid the asynchronous operation of console.log. Currently using the 0, 1 and 2 file descriptors will lead to completely random results, depending on the stdio's type and OS. This is not limited to writeFileSync, all synchronous and asynchronous file operations that bypass the ReadStream on these descriptors are affected.
Changing this requires rewriting lots of code.

This is not really a design decision but a necessity. STDIO can be of different types and the operations on the file descriptors MUST match those cases. I stumbled upon this quite a lot myself. Check out https://www.npmjs.com/package/sonic-boom. This handles most of the edge cases related to use fd and writeFileSync. Note that what looks like _random_ results are in fact not random at all.

https://github.com/nodejs/node/blob/master/lib/internal/bootstrap/switches/is_main_thread.js contains all the logic that create a different implementation depending on the type of the file descriptors.

Note that you can find a lot of __deeply incorrect information__ on blog posts about Node.js.

The second one is that fact that the stdio reassignment takes places in the getter of process.stdin / process.stdout. I think that this was an oversight when trying to lazy-load the ReadStream. So user code will behave in a radically different manner if one adds a console.log(process.stdin.constructor.name) as @RomainMuller did.
Fixing this is trivial.

Can you please articulate this? A PR would be welcomed. It's important that we lazy-load things to speed up the boostrap of Node.js.

The status of the third issue - that a pipe stdin on Windows when used as a file, does not accept EPIPE as a valid EOF leading to @RomainMuller's original issue - depends on the decisions for the first two

This looks like a bug. PR?

@mcollina

Can you please articulate this? A PR would be welcomed. It's important that we lazy-load things to speed up the boostrap of Node.js.

For example, currently, on Linux when stdin is a tty (ie without redirection), fd 0, 1 and 2 are blocking and support readFileSync(0, ...) and readFile(0, ...). As soon as one accesses process.stdin - for example with console.log(process.stdin.constructor.name), they switch to non-blocking mode and readFile(0, ...) fails with EAGAIN as non-blocking file descriptors are not supported by libuv. I don't think this is by design? Unless the design decision is that these descriptors are not to be used by the user code anymore which should be made clear? Maybe this is the best solution if these FDs are going go be handled behind the scenes, switching blocking mode on and off in a transparent way.

This looks like a bug. PR?

If the user is allowed to use FDs 0, 1 and 2, than yes, maybe. The problem is that when stdin is a pipe, it is handled as a Socket and libuv doesn't like it when one uses the file operations on it - this is the libuv issue I created - https://github.com/libuv/libuv/issues/3043 - but its status is not very clear as it is not clear whether this should be allowed or not

@mcollina , on Linux readFileSync('/dev/stdin') works perfectly in all cases, but there is no Windows equivalent. The readStream works perfectly in all cases too. Maybe the best decision is to simply disallow usage of 0, 1 and 2 instead of adding a huge amount of complexity - it already bad enough.

For example, currently, on Linux when stdin is a tty (ie without redirection), fd 0, 1 and 2 are blocking and support readFileSync(0, ...) and readFile(0, ...). As soon as one accesses process.stdin - for example with console.log(process.stdin.constructor.name), they switch to non-blocking mode and readFile(0, ...) fails with EAGAIN as non-blocking file descriptors are not supported by libuv.

It's totally possible to handle EAGAIN with synchronous code. See https://github.com/mcollina/sonic-boom/blob/master/index.js#L94-L105 for more details. Note that the sleep operation that is performed is https://github.com/davidmarkclements/atomic-sleep/blob/b8149d3ca276c84a54fa8fa1478f9cc79aabc15a/index.js#L18.

@mcollina , on Linux readFileSync('/dev/stdin') works perfectly in all cases, but there is no Windows equivalent. The readStream works perfectly in all cases too. Maybe the best decision is to simply disallow usage of 0, 1 and 2 instead of adding a huge amount of complexity - it already bad enough.

I'll be -1 to that.

Ok, there is a solution to the non-blocking problem - I find it awful since it is essentially a way around the fundamentals of libuv and Node, but never mind, it is possible and it seems to be what people do - there is another example of this in my comments above

In this case the only thing that needs solving is the pipe closing on Windows - if this is to be allowed - and in this case this is a libuv problem which should support reading from pipes in file mode - https://github.com/libuv/libuv/issues/3043

@mcollina Just to make it clear - the official position on that one is that FD 0, 1 and 2 can be both blocking or non-blocking depending on the situation and the OS, and it is up to the user code to handle this?

Just to make it clear - the official position on that one is that FD 0, 1 and 2 can be both blocking or non-blocking depending on the situation and the OS, and it is up to the user code to handle this?

Yes.

I don't think this is by design? Unless the design decision is that these descriptors are not to be used by the user code anymore which should be made clear? Maybe this is the best solution if these FDs are going go be handled behind the scenes, switching blocking mode on and off in a transparent way.

I don't remember what the original behavior was but it wasn't by design. I vaguely remember (could be wrong) that we used to turn them non-blocking from the start and allowing them to be blocking until you access process.stdin and friends should be just an artifact of lazy-loading them.

@joyeecheung, the current solution when it comes to FDs 0, 1 and 2 is absolutely horrible - ie they tend to transparently switch from blocking to non-blocking, requiring the user to reimplement the async read loop in his code by eventually microsleeping... However me too (and I didn't take @mcollina word for granted, I spent some time reading/testing) I came to the conclusion that there is simply no other solution. I still think that Node should not incite people to write horrible code, so this should not be official :smile: , but jokes apart, people are using these FDs, and most of the time they don't even realize something very weird is going on.

They only viable long-term solution I see is that libuv supports reading from regular files in non-blocking mode - something that would be totally useless except to give Node (and the user code) an uniform abstraction layer.

When it comes to the lazy-loading - at this point it doesn't change anything to not do it - the user code will still have to expect these FDs to be in either mode.

I will only try pushing a PR to libuv for the EOF problem on Windows - now that we all agree that reading from a pipe in file mode should be supported.

If the confusing part is that the FD switches from blocking to non-blocking once you access process.stdin etc. (due to lazy loading), I guess we could just do something early and make sure that the mode is consistently non-blocking for users (which is probably the old behavior).

Yes, it is exactly what made that issue so confusing for me. Changing this however this will have a detrimental performance effect on process creation for everyone and it won't really accomplish anything - the user will still need to support both blocking and non-blocking FDs

Changing it to non-blocking is often likely going to be problematic for any native C libraries that nodejs interacts with, as well as any programs that are execv from nodejs. Thus, I don't recommend doing that. It's also potentially quite problematic in the (very infrequent) case where some other external library calls dup2 to replace the stdio handles with something else.

Me too, as an old-school UNIX guy, I was initially appalled by the design decision to have a non-blocking stdio - it is something that is known to break things. But how do you implement pipes in libuv? They are currently built around net.Socket and net.Socket is always in non-blocking mode. And at this point it is not even about blocking or non-blocking - it is about it being sometimes blocking and sometimes non-blocking - which is the major problem. But having a consistent stdio requires that libuv fully supports both files, pipes and ttys in one of those modes - which is not the case. You are libuv, would you consider a PR that adds a completely useless support for blocking I/O on pipes/sockets/ttys (through threads) or non-blocking supports for files (that actually blocks and also requires threads)? And the only reason for doing it would be to have a consistent stdio in Node across all platforms? I am not sure it is worth the effort.
By the way, while researching this, I stumbled upon an effort by DJ Bernstein to standardize a new POSIX file API - that will have separate read_nonblocking / write_nonblocking calls instead of a flag attached to the descriptor. There was even an attempt to implement it in Linux (readv2 / writev2 calls) but the PR has been frozen for years.

the following patch appears to work (tested with piped curl)

fs.readSync = function(origReadSync) {
  return function newReadSync() {
    try {
      return origReadSync.apply(this, arguments);
    } catch (e) {
      if (e.code === 'EOF')
        return 0;
      else
        throw e;
    }
  };
}(fs.readSync);

@bughit it is a remarkably ugly solution, so I think it will be very appropriate to include it in the read loop with the microsleep 😄
@RomainMuller there is a work around for your issue, this is libuv, so the final fix won't happen before Node 16

Was this page helpful?
0 / 5 - 0 ratings

Related issues

danielstaleiny picture danielstaleiny  ·  3Comments

filipesilvaa picture filipesilvaa  ·  3Comments

cong88 picture cong88  ·  3Comments

loretoparisi picture loretoparisi  ·  3Comments

dfahlander picture dfahlander  ·  3Comments