Although it's not technically documented, a lot of code assumes that once a stream emits an 'error' event, it will not subsequently emit 'data' or 'end' events. This is pretty much necessary, because otherwise it's impossible for a stream consumer to know when a stream has come to rest and will emit no more events. I've reproduced a case where Node reliably emits an 'end' event after an 'error' event, which violates the expectations of code that assumes a stream has come to rest after 'error'.
To be really precise about where I tested it:
Works as expected on v0.10 (perhaps only because the 'end' event is emitted first):
Does not work as expected on v0.12 and later:
Here's a test case that's commented with what it's doing. I've tried to simplify it as much as possible, but since it's a race condition, it's tricky to get the timing just right.
/*
 * test-stream.js: demonstrates a case where a Node stream can see an 'end'
 * event after an 'error' event.
 *
 * This test case works as follows:
 *
 * (1) Set up a TCP server socket and connect to it using a TCP client.
 *     Server: set up listeners for 'end' and 'error'.
 *     Client: set up listener for 'error'.
 *
 * (2) Client: write 65536 bytes of data.
 *
 * (3) Pause one second. Behind the scenes, Node will detect that the
 *     server's socket has become readable and read all 65536 bytes. These
 *     will be buffered in JavaScript.
 *
 * (4) Server: read 65535 bytes of data from the Socket. There will be
 *     one byte left buffered on the Socket in JavaScript.
 *     Client: destroy the socket. This will send a FIN to the server.
 *
 * (5) Asynchronously (via setImmediate):
 *     Server: read 1 byte of data from the Socket. This will trigger Node
 *     to read from the underlying socket again, where it will read 0
 *     bytes, signifying the end of the stream.
 *
 *     Server: write data to the socket. Since the socket is now
 *     disconnected, eventually these writes will report EPIPE/SIGPIPE.
 *     This generally happens synchronously with respect to the write()
 *     call, but the error will be emitted asynchronously.
 *
 * (6) Asynchronously (via setImmediate):
 *     Server: read another byte from the socket. At this point, we're
 *     reading past end-of-stream, and Node will schedule an 'end' event to
 *     be emitted, but an 'error' event has already been scheduled as well,
 *     so we'll see 'error' and then 'end', which should be invalid.
 */
var mod_net = require('net');
var mod_os = require('os');
/* IP address and port used for this test case. */
var ip = '127.0.0.1';
var port = 16404;
/* We'll use this buffer as a chunk of data. */
var bufsz = 64 * 1024;
var buf;
/* State for this test */
var server; /* server's listening socket */
var ssock; /* server's connection socket */
var csock; /* client socket */
var end = false; /* server has seen "end" on its connection socket */
var error = false; /* server has seen "error" on its connection socket */
function main()
{
    console.log('versions:',
        process.version, process.arch, mod_os.platform());
    buf = new Buffer(bufsz);
    buf.fill(0);

    /*
     * (1) Set up client and server.
     */
    server = mod_net.createServer({ 'allowHalfOpen': true });
    server.on('connection', function (s) {
        console.log('server: client connected');
        ssock = s;
        ssock.on('end', function () {
            console.log('server: saw "end" on client socket');
            if (error) {
                console.log('reproduced issue!');
                process.abort();
            }
            end = true;
        });
        ssock.on('error', function (err) {
            console.log('server: saw "error" on client socket', err);
            if (error || end) {
                console.log('bailing out after server error');
                process.exit(0);
            }
            // ssock.read(1);
            error = true;
        });

        /*
         * (2) Client writes data.
         */
        csock.write(buf);

        /*
         * (3) Pause until the server sees that data.
         */
        ssock.once('readable', triggerIssue);
    });

    server.listen(port, function () {
        console.log('server: listening');
        csock = mod_net.createConnection(port, ip);
        csock.on('connect', function () {
            console.log('client: connected');
        });
        csock.on('end', function () {
            console.log('client: saw "end" on server socket');
        });
    });
}

function triggerIssue()
{
    console.log('triggering issue by destroying client socket');

    /*
     * (4) Read _most_ of the data from the socket and have the client
     * destroy the socket.
     */
    ssock.read(bufsz - 1);
    csock.destroy();

    setImmediate(function () {
        /*
         * (5) Read 1 byte of data from the socket and write data to it.
         */
        ssock.read(1);
        ssock.write(buf);
        ssock.write(buf);

        setImmediate(function () {
            /*
             * (6) Read one more byte.
             */
            ssock.read(1);
        });
    });
}
main();
The detailed output for each test I ran is here:
https://gist.github.com/davepacheco/84d450d2c25f6212a99a984a8f089b4c
@nodejs/streams Is this a bug? Documentation issue? Neither? Both?
I would need to look into it.
Isn't this unspecified and dependent on the individual stream?
As @michaelnisi noted, this is unspecified and depends on the individual stream. To the best of my knowledge, there is no mechanism/enforcement in net to prevent this. For Writable and Duplex streams (including sockets), there is the guarantee that 'error' will only be emitted once.
Also, 'error' will be emitted before 'end', 'close' and 'finish'. The correct approach is to remove all the listeners when 'error' happens.
That being said, I cannot reproduce the issue on node v4.4.2 (the one which output was linked in the issue), 4.8, 6.11, 8, on Mac OS X 10.12. @davepacheco is this still an issue? If that's the case, can you update the script to reproduce it, or maybe send a PR with a failing test?
The script currently hangs with:
versions: v4.0.0 x64 darwin
server: listening
server: client connected
client: connected
triggering issue by destroying client socket
server: saw "end" on client socket
Is there documentation that stream consumers should remove their 'close', 'end', and 'finish' listeners when 'error' happens? It seems like that makes using streams robustly a lot more complex.
I'm not seeing this on OS X El Capitan (v10.11.6), but I think that just means something about the timing has changed, not that the underlying issue was OS-specific or went away.
@davepacheco No, I do not think there is. However, most streams respect this generic pattern. Given the code we have in Node 8, I do not see how this can happen, as we destroy the handle synchronously in case of an internal error. I would be happy to take a look if you can reproduce it.
The implementations in core should not have the problem you are describing, so if they do, it is a bug. It might even be OS-specific, and maybe in libuv.
Hello. We know anyone can connect to this server, but we want only specific users to connect. How can we prevent unwanted connections and allow only known users?
There has been no movement on this in nearly a year and no reports of it happening again. I'm going to close it out, but if any new info has surfaced or you believe this should remain open, please feel free to reopen it.
Has anybody re-run the test program to see if it's still happening?
@davepacheco The situation has changed. The script hangs:
$ node s
versions: v4.8.0 x64 darwin
server: listening
server: client connected
client: connected
triggering issue by destroying client socket
server: saw "end" on client socket
Or I'm getting a nice abort:
$ node s
versions: v8.11.1 x64 darwin
server: listening
server: client connected
client: connected
triggering issue by destroying client socket
server: saw "error" on client socket { Error: write EPIPE
at _errnoException (util.js:1022:11)
at WriteWrap.afterWrite [as oncomplete] (net.js:880:14) code: 'EPIPE', errno: 'EPIPE', syscall: 'write' }
server: saw "end" on client socket
reproduced issue!
1: node::Abort() [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
2: node::Chdir(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
3: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
4: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
5: v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/Users/matteo/.nvm/versions/node/v8.11.1/bin/node]
6: 0x7a6215042fd
Abort trap: 6
This is also confirmed on master.
Here is some output from llnode:
* thread #1, stop reason = signal SIGSTOP
* frame #0: 0x00007fff73a70b6e libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff73c3b080 libsystem_pthread.dylib`pthread_kill + 333
frame #2: 0x00007fff739cc1ae libsystem_c.dylib`abort + 127
frame #3: 0x0000000100028b77 node`node::Abort() at node.cc:1550 [opt]
frame #4: 0x000000010002c7e2 node`node::Abort(args=<unavailable>) at node.cc:1572 [opt]
frame #5: 0x00000001001a3033 node`v8::internal::FunctionCallbackArguments::Call(this=0x00007ffeefbfe210, handler=<unavailable>) at api-arguments.cc:29 [opt]
frame #6: 0x00000001001fdc0f node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(isolate=0x0000000102801e00, function=<unavailable>, new_target=<unavailable>, fun_data=<unavailable>, receiver=<unavailable>, args=BuiltinArguments @ 0x00007ffeefbfe2a0) at builtins-api.cc:107 [opt]
frame #7: 0x00000001001fd2e8 node`v8::internal::Builtin_Impl_HandleApiCall(args=<unavailable>, isolate=0x0000000102801e00) at builtins-api.cc:137 [opt]
frame #8: 0x00003a0a4fc0427d
frame #9: 0x00003a0a4fc144f7
frame #10: 0x00003a0a4fc144f7
frame #11: 0x00003a0a4fc144f7
frame #12: 0x00003a0a4fc144f7
frame #13: 0x00003a0a4fc118d5
frame #14: 0x00003a0a4fc096a1
frame #15: 0x00000001004faa86 node`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::Object>, v8::internal::Execution::MessageHandling, v8::internal::Execution::Target) [inlined] v8::internal::GeneratedCode<v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, v8::internal::Object*, int, v8::internal::Object***>::Call(args=<unavailable>, args=<unavailable>, args=<unavailable>, args=0) at simulator.h:110 [opt]
frame #16: 0x00000001004faa74 node`v8::internal::(anonymous namespace)::Invoke(isolate=0x0000000102801ec8, is_construct=<unavailable>, target=<unavailable>, receiver=<unavailable>, argc=0, args=0x0000000000000000, new_target=<unavailable>, message_handling=kReport, execution_target=kCallable) at execution.cc:153 [opt]
frame #17: 0x00000001004fa768 node`v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) [inlined] v8::internal::(anonymous namespace)::CallInternal(isolate=0x0000000102801e00, callable=Handle<v8::internal::Object> @ r12, argc=0, argv=0x0000000000000000, message_handling=kReport, target=kCallable) at execution.cc:189 [opt]
frame #18: 0x00000001004fa6ed node`v8::internal::Execution::Call(isolate=0x0000000102801e00, callable=Handle<v8::internal::Object> @ r12, receiver=<unavailable>, argc=0, argv=0x0000000000000000) at execution.cc:200 [opt]
frame #19: 0x0000000100187c16 node`v8::Function::Call(this=0x00000001028564a0, context=<unavailable>, recv=<unavailable>, argc=0, argv=0x0000000000000000) at api.cc:5089 [opt]
frame #20: 0x00000001000278f1 node`node::InternalCallbackScope::Close(this=0x00007ffeefbfe850) at node.cc:989 [opt]
frame #21: 0x0000000100027a84 node`node::InternalMakeCallback(env=0x00007ffeefbfea38, recv=<unavailable>, callback=<unavailable>, argc=<unavailable>, argv=0x0000000000000000, asyncContext=<unavailable>) at node.cc:1025 [opt]
frame #22: 0x0000000100027c12 node`node::MakeCallback(isolate=<unavailable>, recv=<unavailable>, callback=<unavailable>, argc=<unavailable>, argv=<unavailable>, asyncContext=<unavailable>) at node.cc:1079 [opt]
frame #23: 0x000000010001a8dd node`node::Environment::CheckImmediate(handle=<unavailable>) at env.cc:419 [opt]
frame #24: 0x000000010094de0a node`uv__run_check(loop=0x0000000101646750) at loop-watcher.c:67 [opt]
frame #25: 0x00000001009494c1 node`uv_run(loop=0x0000000101646750, mode=UV_RUN_DEFAULT) at core.c:369 [opt]
frame #26: 0x0000000100032061 node`node::Start(isolate=0x0000000102801e00, isolate_data=<unavailable>, argc=2, argv=<unavailable>, exec_argc=<unavailable>, exec_argv=<unavailable>) at node.cc:4504 [opt]
frame #27: 0x0000000100031860 node`node::Start(event_loop=0x0000000101646750, argc=2, argv=0x00000001024067d0, exec_argc=1, exec_argv=0x00000001024068a0) at node.cc:4583 [opt]
frame #28: 0x00000001000314cb node`node::Start(argc=<unavailable>, argv=0x00000001024067d0) at node.cc:4640 [opt]
frame #29: 0x0000000100001034 node`start + 52
Do you know why it's missing the JS stacks? It's taken from master.
New V8. llnode hasn't been updated yet.
Also, the Ignition interpreter complicates frame->function mapping. In the naive solution you only see the interpreter frames, like you do with perf(1) now.
@mmarchini has been working on that however, and with respectable stamina too, I may add.
@bnoordhuis I thought llnode was handling interpreted functions (at least on Node.js 8). Unfortunately, what I've been working on will only fix perf(1), but I'll take a look into llnode to see if it's handling interpreted frames correctly.
Looking at the output from @mcollina though, looks like bt was used instead of v8 bt, or the postmortem metadata is missing from the binary. At least some of those missing frames should appear as "\
My bad, I was using the wrong command. Anyway, I tracked down the problem and a PR is on the way.
Sorry to bomb this thread, but I'm looking for info on whether a TCP connection can recover after an 'error' event is emitted. I'm guessing not, but I'm not really sure. Currently, if 'error' is emitted, I call .end() on the current connection and try to open an entirely new one.
Unless you are emitting 'error' yourself, you can safely assume that the connection is gone. You don't need to call .end().
If you want to be safe, call .destroy()
@mcollina thanks appreciated
@ronag do you know if this is still an issue or if this is already fixed?
@BridgeAR: It still seems to be an issue. As far as I can tell, the fix was reverted and the follow up was closed... I haven't followed the entire history and rationale of this.
@mcollina mentioned the following https://github.com/nodejs/node/pull/20571#issuecomment-431650497
This can be closed for the moment, it will be fixed with the destroy work @mafintosh has been preparing for some time.
But as far as I can see from master, this is still a problem, i.e. Readable can emit 'end' after 'error'.
I wouldn't mind picking this up, but it would be useful if those involved so far could chime in. @mafintosh @mcollina
@BridgeAR: Now I think this is fixed :)
Fix in https://github.com/nodejs/node/pull/20104.