Node: process.stdin example from docs no longer works in node 10

Created on 3 May 2018 · 24Comments · Source: nodejs/node

Version: 10.0.0 (using nvm)
Platform: 4.13.0-39-generic #44~16.04.1-Ubuntu
Subsystem:

I am running process.stdin example from docs:

process.stdin.setEncoding('utf8');

process.stdin.on('readable', () => {
  const chunk = process.stdin.read();
  if (chunk !== null) {
    process.stdout.write(`data: ${chunk}`);
  }
});

process.stdin.on('end', () => {
  process.stdout.write('end');
});

I expect the same results as in node v9 and below: command line should wait for my input and return it prefixed with data:. Instead, process is closed right away. I believe this is regression as example works fine in node v9 and v8.

_{(edited by @addaleax: syntax highlighting)}

confirmed-bug process stream v10.x

Source

apieceofbart

👍5

Most helpful comment

I'd like to work on the issue, I seem to have found the underlying cause and hopefully a correct solution =). Will submit a PR soon.

lundibundi on 6 Jul 2018

👍2

All 24 comments

Bisecting this turned into trisecting, because it seems like we introduced this issue gradually:

1e0f3315c77033ef0e01bb37c3d41c8e1d65e686 made a first change in behaviour for the test case, because now only the first chunk of data is being read before the process exits on its own
- 4383c0add89c9075a7c195d07f2c7dfe595f811b from #19377 already takes care of this part of the problem. Yay!
0778f79cb37526c3f4f8bff525fc4d4ca9b86e78 made the change in behaviour to break this fully. I’m not quite sure (yet) why this is happening, though.

@nodejs/streams

addaleax on 3 May 2018

I finally narrowed down the problem I was having with Node 10 and node-pg-query-stream to this and lo and behold someone else had already reported it! What I noticed was that I'd get the _first_ batch of results out of the stream, but then it would abort prematurely without emitting end.

The commit message for 0778f79 looks at odds with the conditional if (!state.destroyed && (state.length || state.ended)) though -- if it's not supposed to emit readable when the stream has ended, shouldn't it be negating state.ended at minimum? In the testcase accompanying the change it looks like the conditional might be satisfied by state.length.

dmfay on 4 May 2018

Can confirm, Changing the condition to if (!state.destroyed && (state.length || !state.ended)) causes the first chunk of data to be read, after which the process exits (which I suppose is fixed by 4383c0a as @addaleax mentioned)

AyushG3112 on 4 May 2018

I think the idea behind (state.length || state.ended) was that we do emit the readable event both for incoming data, and for the end-of-data marker. So readable is supposed to be emitted even if the stream has ended, just not after it has been destroyed.

/cc @nodejs/streams @mafintosh

addaleax on 4 May 2018

My commit message there ended up a bit confusing after some back and
fourths in the PR sorry. The intent there in the if to add a generic
conditional of when a readable event is okay to emit (stream not destroyed,
and has data or is ended).

Can someone share a concrete example that is failing now? Then I'll help
take a look.

On Fri, May 4, 2018, 09:35 Anna Henningsen notifications@github.com wrote:

I think the idea behind (state.length || state.ended) was that we do emit
the readable event both for incoming data, and for the end-of-data
marker. So readable is supposed to be emitted even if the stream has
ended, just not after it has been destroyed.

/cc @nodejs/streams https://github.com/orgs/nodejs/teams/streams
@mafintosh https://github.com/mafintosh

—
You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
https://github.com/nodejs/node/issues/20503#issuecomment-386525906, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAW_VeW1n2TbpOixo-9h8fs-ZT374Y5_ks5tvATZgaJpZM4Txle2
.

mafintosh on 4 May 2018

@mafintosh The PR description has a standalone test case, if you want

addaleax on 4 May 2018

@addaleax ah, didn't see that in the email thread but indeed here.

mafintosh on 4 May 2018

https://github.com/nodejs/node/commit/1e0f3315c77033ef0e01bb37c3d41c8e1d65e686 and https://github.com/nodejs/node/commit/4383c0add89c9075a7c195d07f2c7dfe595f811b should stay in ideally (with a fix if needed).

mcollina on 4 May 2018

@mafintosh this might be more involved than you're looking for but the problem I was having is reducible to the "fast-reader" test case in https://github.com/brianc/node-pg-query-stream if you check out that repository and run the tests.

dmfay on 4 May 2018

Any update on that? It breaks some of our tools in development

klimashkin on 22 Jun 2018

I'd like to work on the issue, I seem to have found the underlying cause and hopefully a correct solution =). Will submit a PR soon.

lundibundi on 6 Jul 2018

👍2

Fixed in fe47b8b

mcollina on 10 Aug 2018

🎉1

Version: v11.6.0
Platform: Linux 4.19.12-arch1-1-ARCH #1 SMP PREEMPT Fri Dec 21 13:56:54 UTC 2018

The process.stdin example that triggered this issue still does not work for me in v11.6.0: it still only accepts one character before the program immediately ends. In node v9 the program continuously waits and prints the characters keyed. Does the example need to be amended or is the fix not complete?

sergiogiogio on 4 Jan 2019

👍1

@sergiogiogio I’m not sure why it’s ending after one character in your case (are you using your TTY in raw mode in some way?), but yes, this currently only accepts a single line of input afaict and is therefore still broken.

Also cc @mcollina, the regression test in fe47b8b6a529233ada74dfb979e6164111737fc2 looks like it should test this but apparently it might not?

addaleax on 4 Jan 2019

Looking into it again … I think the example is buggy, and should look like this?

process.stdin.setEncoding('utf8');

process.stdin.on('readable', () => {
  let chunk;
  while ((chunk = process.stdin.read()) !== null) {
    process.stdout.write(`data: ${chunk}`);
  }
});

process.stdin.on('end', () => {
  process.stdout.write('end');
});

addaleax on 4 Jan 2019

I agree with @addaleax.

mcollina on 4 Jan 2019

Opened a documentation PR at https://github.com/nodejs/node/pull/25344.

addaleax on 4 Jan 2019

Thanks for the prompt follow-up and action on this issue. I have few issues though:
1- The modified example with the "while loop" even works in node v10 (i.e. before this fix): the program accepts multiple lines of input. Hence the question: Which test case is this fix addressing?
2- I am not sure the "while loop" is actually expected because the example works as expected with a filesystem stream, and a stdin stream should work the same way.

After few more tests, here are 2 notables differences between node v9 and v10 (and later):
1- in v9 the 'readable' and 'end' events are always active, while in v10 these events are activated at startup thanks to the 'on' function but we then need to explicitly call process.stdin.read() from the callback so that subsequent event can be triggered. The while loop can be replaced a single call to process.stdin.read() and the program then works, it seems that the second call process.stdin.read is needed to activate the 'readable' events

process.stdin.setEncoding('utf8');

process.stdin.on('readable', () => {
  console.log("readable");
  const chunk = process.stdin.read();
  console.log("read");
  if (chunk !== null) {
    process.stdout.write(`data: ${chunk}`);
  }
  process.stdin.read(); // <== without this function call the callback is not called and the program exists abruptly
});

process.stdin.on('end', () => {
  process.stdout.write('end');
});

2- in v9 the 'readable' event is fired immediately at program startup even without any user input (and process.stdin.read() returns null in this initial run), in v10 'readable' is not fired immediately at program startup but only upon actual user input (and process.stdin.read() returns the actual input in this initial run)

In conclusion I think the example is correct since it works fine on filesystem streams. There may be an issue with how callbacks are activated which would require further attention.

sergiogiogio on 5 Jan 2019

The modified example with the "while loop" even works in node v10 (i.e. before this fix): the program accepts multiple lines of input.

That’s because it’s arguably more correct – in fact, this should work for all streams since the introduction of streams 3 (i.e. streams with a readable event), including object-mode streams.

Hence the question: Which test case is this fix addressing?

I don’t quite understand this question.

I am not sure the "while loop" is actually expected because the example works as expected with a filesystem stream, and a stdin stream should work the same way.

I’m not sure, but this could also mean that you’re relying on a peculiarity of filesystem streams?

@mcollina One thing I’m wondering, .read() always returns all data, right? So shouldn’t a .read() call also trigger further _read()s on its own?

addaleax on 5 Jan 2019

I am not sure the "while loop" is actually expected because the example works as expected with a filesystem stream, and a stdin stream should work the same way.

The fact that they work is not assured, and it depends on low-level file system timing, the business of the operating system, the type of disk, caches, etc. The while loop is necessary in all cases.

@mcollina One thing I’m wondering, .read() always returns all data, right? So shouldn’t a .read() call also trigger further _read()s on its own?

It returns a chunk of data, the chunks are not concatenated. The while loop is necessary to consume all data, and it will trigger a new _read once the data in the buffer is below highWaterMark. When .read() returns null, a new 'readable' will be emitted when there is more data.

mcollina on 6 Jan 2019

@mcollina When .read() returns null, a new 'readable' will be emitted when there is more data.

Thanks a lot for this clarification - I believe it would be useful to make this explicit in the Stream documentation (https://nodejs.org/docs/latest-v11.x/api/stream.html#stream_event_readable), and maybe consider updating the example there too. Beyond Stream and Process, a few more documentation pages use "if" instead of "while" and may benefit from an update too: Crypto (https://nodejs.org/api/crypto.html) and Buffer (https://nodejs.org/api/buffer.html)

@addaleax Hence the question: Which test case is this fix addressing?

My understanding now is that the original issue opened by @apieceofbart is caused by a wrong example rather than a node bug. The correct example (with the while loop) works fine for me in v10. So the code injected in https://github.com/nodejs/node/commit/fe47b8b6a529233ada74dfb979e6164111737fc2 is probably fixing another issue.

In any case thanks a lot for the help, I am satisfied with the change to "while" in my code so I will not bother you further on this issue.

sergiogiogio on 6 Jan 2019

The docs for this feature still aren't very clear...

Has anyone tried piping in a file longer than 65536 bytes via stdin?

The example from the docs shows this code:

process.stdin.on('readable', () => {
  let chunk;
  // Use a loop to make sure we read all available data.
  while ((chunk = process.stdin.read()) !== null) {
    process.stdout.write(`data: ${chunk}`);
  }
});

Note the code comment:

Use a loop to make sure we read all available data.

Perhaps it's stupid of me, but I assumed this meant the while loop there would cause it to read all of stdin, so I made some code like:

process.stdin.on('readable', () => {
  let chunk, chunks, content;
  chunks = [];
  // Use a loop to make sure we read all available data.
  while ((chunk = process.stdin.read()) !== null) {
    chunks.push(chunk);
  }
  content = chunks.join('');
  console.log(content.length);
  render(content);
});

What I see from this naïve code though is that my content.length is always 65536, but I know the file is longer than that - what I've got in content is truncated.

Unfortunately because the code was trying to render(content), exceptions prevented me immediately noticing what actually happens: which is that the on('readable' ...) event handler gets called multiple times... then eventually the end event when there is no more to read.

So I have to structure the code more like this:

      var chunk, chunks, content;
      chunks = [];
      process.stdin.on('end', () => {
        content = chunks.join('');
        console.error(content.length); // 88106 - yay!
        return content;
      });
      process.stdin.on('readable', function() {
        console.error('READABLE')
        while ((chunk = process.stdin.read()) !== null) {
          chunks.push(chunk);
          console.error('CHUNK')
        }
      });

This makes perfect sense, but I think something of this behaviour ought to be mentioned in the docs.

https://nodejs.org/api/process.html#process_process_stdin
Just says:

The process.stdin property returns a stream connected to stdin (fd 0). It is a net.Socket (which is a Duplex stream) unless fd 0 refers to a file, in which case it is a Readable stream.

and the docs for Readable streams waffle on quite a bit but I don't see anything about getting multiple readable events:
https://nodejs.org/api/stream.html#stream_readable_streams

I understand from the long discussion in the thread above that what the while loop is really for is to ensure that the stream doesn't end early, if there is initially nothing to read?

Perhaps this could all be made a bit clearer somehow.