I am experiencing a memory leak in an app that uses process.fork() a lot. These child processes are sent messages via process.send() with a sendHandle and are terminated later on.
I ran into memory management issues here. Heap dumps show that even after the child processes have exited, the ChildProcess instances are retained in the master process. I learned that using subprocess.disconnect() partly fixes the issue, but one more retainer can be found here:
https://github.com/nodejs/node/blob/20259f90927a8b2923a0ad3210f6400d3a29966b/lib/net.js#L1665
How, where and when should this socketList be removed from the _workers array?
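For context, the subprocess.disconnect() mitigation mentioned above can be sketched roughly like this (cleanupChild is a hypothetical helper name, not from my app):

```javascript
// Hedged sketch: close the IPC channel once the child exits, so the parent
// drops one of its references to the ChildProcess instance sooner.
// cleanupChild is a hypothetical name used only for illustration.
function cleanupChild(child) {
  child.once('exit', () => {
    // disconnect() throws if the channel is already closed, so guard on it
    if (child.connected) {
      child.disconnect();
    }
  });
}
```

e.g. `cleanupChild(fork('worker.js'))` right after forking.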
Do you have code that reproduces the memory leak?
I tried to make a simplified setup of my app:
server.js
const http = require('http');
const { fork } = require('child_process');

const server = http.createServer((req, res) => {
  const workerProcess = fork('worker.js', [], { execArgv: [] });
  workerProcess.on('message', (status) => {
    switch (status) {
      case 'ready':
        workerProcess.send({
          method: req.method,
          headers: req.headers,
          url: req.url, // IncomingMessage has no `path`/`query`; use `url`
          httpVersionMajor: req.httpVersionMajor,
        }, res.socket);
        break;
      case 'done':
        res.socket.end();
        break;
    }
  });
});

server.listen(8080);

setInterval(() => {
  console.log(process.memoryUsage());
}, 10000);
worker.js
const http = require('http');

process.on('message', (req, socket) => {
  const res = new http.ServerResponse(req);
  res._finish = function _finish() {
    res.emit('prefinish');
    socket.end();
  };
  res.assignSocket(socket);
  socket.once('finish', () => {
    process.send('done', () => {
      process.exit(0);
    });
  });
  res.end(process.pid.toString());
});

process.send('ready');
When simulating load on this server using ApacheBench, memory usage of the server process slowly rises.
ab -n 1000000 -c 50 http://localhost:8080/
You'll have to be patient: it takes a few hours on my machine to become critical...
Node 8.6.0 has the same issue.
When limiting the available memory for the server to, say, 100 MB, it dies within a few minutes:
node --max_old_space_size=100 server.js
...
<--- Last few GCs --->
[15664:0x39d9410] 504893 ms: Mark-sweep 252.2 (299.0) -> 252.2 (268.0) MB, 219.9 / 0.1 ms (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 220 ms) last resort
[15664:0x39d9410] 505119 ms: Mark-sweep 252.2 (268.0) -> 252.2 (268.0) MB, 226.7 / 0.1 ms last resort
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x3b40f62a8799 <JSObject>
2: text(aka utf8Text) [string_decoder.js:201] [bytecode=0x4695cc3cbd1 offset=29](this=0x54b29377d39 <StringDecoder map = 0x16700ae4d0a9>,buf=0x17c2ab58e4b1 <Uint8Array map = 0x139076a04e01>,i=0)
4: write [string_decoder.js:85] [bytecode=0x4695cc3c8c1 offset=97](this=0x54b29377d39 <StringDecoder map = 0x16700ae4d0a9>,buf=0x17c2ab58e4b1 <Uint8Array map = 0x139076a04e01>)
6: onread [int...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: node::Abort() [node]
2: 0x13576dc [node]
3: v8::Utils::ReportOOMFailure(char const*, bool) [node]
4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [node]
5: v8::internal::Factory::NewStruct(v8::internal::InstanceType) [node]
6: v8::internal::Factory::NewTuple3(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>) [node]
7: v8::internal::LoadIC::LoadFromPrototype(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Smi>) [node]
8: v8::internal::LoadIC::GetMapIndependentHandler(v8::internal::LookupIterator*) [node]
9: v8::internal::IC::ComputeHandler(v8::internal::LookupIterator*) [node]
10: v8::internal::LoadIC::UpdateCaches(v8::internal::LookupIterator*) [node]
11: v8::internal::LoadIC::Load(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Name>) [node]
12: v8::internal::Runtime_LoadIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*) [node]
13: 0x1745398046fd
Aborted (core dumped)
@Reggino Can you check that #15679 fixes the issue for you?
@bnoordhuis Thanks! While I do think your change is part of the solution, this issue still isn't completely resolved: even now the server process keeps leaking memory until it crashes...
Do you mean with the example you posted or with a larger application?
The example...
The issue still exists...
I was going to remove the Fixes: tag but James merged the PR before I could do so. Inconvenient.
@Reggino I can't reproduce after #15679 anymore. I let your test case run for a couple of hours with --max_old_space_size=32 and it's rock solid.
Just checked out and built https://github.com/nodejs/node/commit/2e59ec0c2d27baa0f0bc140ead72d15b1ed9b29c again and ran the test on my 64-bit Linux environment.
Output of server.js:
./node --max_old_space_size=32 server.js
{ rss: 50212864,
heapTotal: 24485888,
heapUsed: 8344752,
external: 8609 }
{ rss: 57155584,
heapTotal: 50176000,
heapUsed: 12755848,
external: 17203 }
{ rss: 65495040,
heapTotal: 55943168,
heapUsed: 17016640,
external: 17179 }
{ rss: 73113600,
heapTotal: 62234624,
heapUsed: 21116696,
external: 17261 }
{ rss: 80306176,
heapTotal: 68001792,
heapUsed: 25077032,
external: 17243 }
{ rss: 86573056,
heapTotal: 73244672,
heapUsed: 28898720,
external: 17203 }
{ rss: 92934144,
heapTotal: 78487552,
heapUsed: 32581744,
external: 17267 }
{ rss: 96854016,
heapTotal: 80060416,
heapUsed: 36295896,
external: 17309 }
{ rss: 98816000,
heapTotal: 80060416,
heapUsed: 39852152,
external: 17210 }
<--- Last few GCs --->
[19835:0x30f51b0] 92906 ms: Mark-sweep 38.3 (77.9) -> 38.3 (46.9) MB, 37.0 / 0.0 ms (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 37 ms) last resort GC in old space requested
[19835:0x30f51b0] 92944 ms: Mark-sweep 38.3 (46.9) -> 38.3 (46.9) MB, 37.4 / 0.0 ms last resort GC in old space requested
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x20b4bb5a5e49 <JSObject>
1: pushValueToArray [bootstrap_node.js:~245] [pc=0x202eb9386617](this=0x2c7b79a1b319 <JSArray[1]>)
2: arguments adaptor frame: 6->0
==== Details ================================================
[1]: pushValueToArray [bootstrap_node.js:~245] [pc=0x202eb9386617](this=0x2c7b79a1b319 <JSArray[1]>) {
// optimized frame
--------- s o u r c e c o d e ---------
function pushValueToArray() {\...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: node::Abort() [./node]
2: 0x128126c [./node]
3: v8::Utils::ReportOOMFailure(char const*, bool) [./node]
4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [./node]
5: v8::internal::Factory::NewFixedArray(int, v8::internal::PretenureFlag) [./node]
6: v8::internal::DeoptimizationInputData::New(v8::internal::Isolate*, int, v8::internal::PretenureFlag) [./node]
7: v8::internal::compiler::CodeGenerator::PopulateDeoptimizationData(v8::internal::Handle<v8::internal::Code>) [./node]
8: v8::internal::compiler::CodeGenerator::FinalizeCode() [./node]
9: v8::internal::compiler::PipelineImpl::FinalizeCode() [./node]
10: v8::internal::compiler::PipelineCompilationJob::FinalizeJobImpl() [./node]
11: v8::internal::Compiler::FinalizeCompilationJob(v8::internal::CompilationJob*) [./node]
12: v8::internal::OptimizingCompileDispatcher::InstallOptimizedFunctions() [./node]
13: v8::internal::StackGuard::HandleInterrupts() [./node]
14: v8::internal::Runtime_StackGuard(int, v8::internal::Object**, v8::internal::Isolate*) [./node]
15: 0x202eb920463d
Aborted (core dumped)
@bnoordhuis What exact commit did you run the test with? Or could it be platform-related?
Today's master, 4f339b54e9. That's about 70 commits ahead of 2e59ec0. It might be a platform-specific issue but I can't reproduce on my x86_64 Ubuntu 16.04 box nor on my MBA.
Hmmm... tried again with https://github.com/nodejs/node/commit/4f339b54e9cd8a2cb69b41d87832ad8ca3a6b5e2 but no cigar: I get the same output.
Not sure what to do next... Should we close this issue since it isn't reproducible?
@fruitl00p @AubreyHewes @mishavantol any luck in reproducing this issue?
@Reggino Perhaps you can investigate with llnode? Its findjsobjects command can tell you what's on the heap.
@Reggino what version of Linux are you using?
Tested both Ubuntu 16.04 and 17.04, both with the same issue.
@bnoordhuis Did you see the worker PID when requesting http://localhost:8080 ? Did ab report all requests successful (up until stopping it)?
@Reggino did you get a chance to look into this more with llnode or heapdumps?
I did investigate it, but couldn't find the cause. I can still reproduce the issue however...
I would like to bump this again as I am now seeing this in production environments. Will post reproduction code once available
I am also experiencing a memory leak with child_process.fork in one of my apps. There is a MQTT client connected to a MQTT broker in index.js, and on receiving a message it forwards it to a child process for processing using child_process.send(). In the child process I catch this message using process.on('message') and then do the relevant processing.
There are 5 sensors sending data every 5s to the app, and every 5 minutes or so the child process exits with SIGABRT; I see heap usage going to 1 GB+ just around that instant.
node version 8.9.4
https://stackoverflow.com/questions/48801587/nodejs-child-process-ipc-communication-issue-on-server
I think the problem with the original code is that after passing the socket to the worker, the underlying handle is automatically closed, but the http objects (HTTPParser, etc.) associated with the socket don't notice, so they are never freed. Is this something the child_process module should handle correctly itself?
To avoid this you can pass the option { keepOpen: true } when sending the socket, so the handle is not automatically closed after being sent. This way, when the 'done' message is received, you can end the socket connection normally and the http resources associated with it will be freed.
@santigimeno Thank you for your suggestion.
I tried it and indeed the server stays alive a bit longer and doesn't seem to run out of memory, but it stops responding after about 2 minutes... weird?
@Reggino can you try sending the 'done' message from the worker on the 'close' event instead of on 'finish'?
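In worker.js terms, the suggestion amounts to something like the following (attachDoneReporter is a hypothetical helper name used only for illustration):

```javascript
// Sketch: report 'done' only when the connection is fully torn down
// ('close') rather than when writing finishes ('finish').
// attachDoneReporter is a hypothetical name, not part of any API.
function attachDoneReporter(socket, send) {
  socket.once('close', () => send('done'));
}
```

In the worker you would then call `attachDoneReporter(socket, (msg) => process.send(msg, () => process.exit(0)))` instead of listening for 'finish'.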
@santigimeno same result, the server always crashes, with or without { keepOpen: true }
Weird, I ran the code for several hours and the heap size seemed pretty stable. Is it the same crash? Did it take longer to crash?
Can you try this patch, without { keepOpen: true }?:
diff --git a/lib/internal/child_process.js b/lib/internal/child_process.js
index 58bb46e8a2..91e4cbc978 100644
--- a/lib/internal/child_process.js
+++ b/lib/internal/child_process.js
@@ -18,6 +18,8 @@ const SocketList = require('internal/socket_list');
 const { convertToValidSignal } = require('internal/util');
 const { isUint8Array } = require('internal/util/types');
 const spawn_sync = process.binding('spawn_sync');
+const { HTTPParser } = process.binding('http_parser');
+const { freeParser } = require('_http_common');
 
 const {
   UV_EAGAIN,
@@ -94,6 +96,14 @@ const handleConversion = {
       if (!options.keepOpen) {
         handle.onread = nop;
         socket._handle = null;
+        socket.setTimeout(0);
+        // In case of an HTTP connection socket, release the associated resources
+        if (socket.parser && socket.parser instanceof HTTPParser) {
+          freeParser(socket.parser, null, socket);
+          if (socket._httpMessage) {
+            socket._httpMessage.detachSocket(socket);
+          }
+        }
       }
 
       return handle;
@santigimeno Thanks for the patch. I tried it, but now, when I try some requests on the server, I get an error like:
TypeError: Cannot read property 'end' of null
    at ChildProcess.workerProcess.on (/test/server.js:23:28)
    at ChildProcess.emit (events.js:129:13)
    at emit (internal/child_process.js:793:12)
    at process._tickCallback (internal/process/next_tick.js:115:19)
To make sure you (and @bnoordhuis) are able to run the test, I set up a repo containing it here: https://github.com/Reggino/node-issue-15651-test . It ensures the failing test is reproducible and that we all use the same node flags and load-testing/benchmarking tool (this example uses https://github.com/carlos8f/slam ).
@Reggino yeah, with the patch the socket is detached from the response, so it's set to null. In fact, you don't need to call socket.end() at all, as the socket is automatically closed after being sent to the child. Can you comment out that line?
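Concretely, the server.js message handler would end up looking roughly like this (makeMessageHandler is a hypothetical refactoring for illustration, not code from the repo):

```javascript
// Sketch of the adjusted handler: with the patch applied, the 'done' branch
// no longer touches res.socket, since the handle was already closed (and the
// parser freed) when the socket was sent to the worker.
// makeMessageHandler is a hypothetical name used only for illustration.
function makeMessageHandler(workerProcess, req, res) {
  return (status) => {
    switch (status) {
      case 'ready':
        workerProcess.send({
          method: req.method,
          headers: req.headers,
          httpVersionMajor: req.httpVersionMajor,
          url: req.url,
        }, res.socket);
        break;
      case 'done':
        // res.socket.end() removed: the socket is detached (null) here
        break;
    }
  };
}
```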
@santigimeno awesome! Indeed, this seems to fix the issue! :bowing_man:
Hi,
Can someone help me with my issue. https://github.com/nodejs/help/issues/1123
Is it a close candidate?
@santigimeno - was your patch intended to be a workaround, or is it going into the code base?
FWIW, I am able to reproduce the leak consistently on my Mac with the latest.
@gireeshpunathil I forgot about this one. I'll try to prepare a PR and see how it goes.
thanks, you may ping me for any interim testing as required, as I have a reliable recreate!
Hello,
I updated my application to Node.js 10.15; however, the memory leak is still present.
I ran a test on OpenShift with a simple "hello world" and a liveness TCP health check:
https://drive.google.com/file/d/1bTOkFKETvcWLO4HTbq5eFKXH8NrlHpe_/view?usp=sharing
Regards
@ellisium can you post code that reproduces the leak? Thanks
It's boilerplate used at my work, so I'm not allowed to share it. I will confirm it with another test using only native Node.js functions and will report back with results by the end of this week.
@ellisium could you try https://github.com/Reggino/node-issue-15651-test ? I was able to reproduce the issue and validate the actual fix with it...
Ok, I will try, but I'm focusing on testing cluster.fork first. Sorry, maybe it's a different use case; if so, you can ignore my last messages.
Hello, after a few days of testing, the curve has stabilized and it seems OK. I guess it depends on request frequency, which impacts the memory allocation ramp-up.