const { Worker, isMainThread } = require('worker_threads');

if (isMainThread) {
  for (let i = 0; i < 2000; i++) {
    new Worker(__filename);
  }
} else {
  console.log(JSON.stringify(process.memoryUsage()));
  setInterval(() => {
    // Keep the thread alive
  }, 1000);
}
This problem always occurs. I have to run at least 2000 worker threads at the same time, and the script crashes with a random GC error. There are two problems that I encounter:
1. worker_threads consume a lot of memory, about 5 MB of RSS per empty thread, so I end up with 1500 threads at about 8 GB of RAM (and more if the threads actually do something). But this wasn't the real problem, because my server has a large amount of RAM (>100 GB) and I run with --max-old-space-size=81920 --max-semi-space-size=81920.
2. The error is still there when RSS reaches 8 GB.
Output of the script:
// 1486 lines, 1487th line below
{"rss":8157556736,"heapTotal":4190208,"heapUsed":2382936,"external":802056}
<--- Last few GCs --->
[19127:0x7f5f4442fa80] 26606 ms: Scavenge 2.0 (2.7) -> 1.6 (3.7) MB, 1.8 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure
<--- JS stacktrace --->
Cannot get stack trace in GC.
FATAL ERROR: NewSpace::Rebalance Allocation failed - JavaScript heap out of memory
 1: 0x9ef190 node::Abort() [node]
 2: 0x9f13b2 node::OnFatalError(char const*, char const*) [node]
 3: 0xb5da9e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb5de19 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xd0a765 [node]
 6: 0xd545ee [node]
 7: 0xd58797 v8::internal::MarkCompactCollector::CollectGarbage() [node]
 8: 0xd16c39 v8::internal::Heap::MarkCompact() [node]
 9: 0xd179a3 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
10: 0xd18515 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
11: 0xd1afcc v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [node]
12: 0xce7cae v8::internal::Factory::NewMap(v8::internal::InstanceType, int, v8::internal::ElementsKind, int) [node]
13: 0xede9db v8::internal::Map::RawCopy(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, int, int) [node]
14: 0xedf104 v8::internal::Map::CopyDropDescriptors(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>) [node]
15: 0xedf1a6 v8::internal::Map::ShareDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::DescriptorArray>, v8::internal::Descriptor*) [node]
16: 0xedfcae v8::internal::Map::CopyAddDescriptor(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Descriptor*, v8::internal::TransitionFlag) [node]
17: 0xedfe29 v8::internal::Map::CopyWithField(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::FieldType>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::Representation, v8::internal::TransitionFlag) [node]
18: 0xee15f2 v8::internal::Map::TransitionToDataProperty(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::PropertyConstness, v8::internal::StoreOrigin) [node]
19: 0xed1cdf v8::internal::LookupIterator::PrepareTransitionToDataProperty(v8::internal::Handle<v8::internal::JSReceiver>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::StoreOrigin) [node]
20: 0xf05566 v8::internal::Object::AddDataProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::StoreOrigin) [node]
21: 0xeb08e0 v8::internal::JSObject::DefineOwnPropertyIgnoreAttributes(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::JSObject::AccessorInfoHandling) [node]
22: 0xeb0bec v8::internal::JSObject::SetOwnPropertyIgnoreAttributes(v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes) [node]
23: 0x10283ef [node]
24: 0x102c399 [node]
25: 0x102d303 v8::internal::Runtime_CreateObjectLiteral(int, unsigned long*, v8::internal::Isolate*) [node]
26: 0x13a71b9 [node]
(The same FATAL ERROR and stack trace repeat many times, interleaved across the crashing threads and with further process.memoryUsage() lines showing RSS plateauing around 8.16 GB; one thread instead fails with "FATAL ERROR: Committing semi space failed. Allocation failed - JavaScript heap out of memory".)
I am able to recreate this. The report data shows:
"javascriptHeap": {
"totalMemory": 4059136,
"totalCommittedMemory": 3299464,
"usedMemory": 2861680,
"availableMemory": 104855004168,
"memoryLimit": 104857600000,
"heapSpaces": {
"read_only_space": {
"memorySize": 262144,
"committedMemory": 33328,
"capacity": 33040,
"used": 33040,
"available": 0
},
"new_space": {
"memorySize": 1048576,
"committedMemory": 1047944,
"capacity": 1047424,
"used": 633768,
"available": 413656
},
"old_space": {
"memorySize": 1654784,
"committedMemory": 1602320,
"capacity": 1602528,
"used": 1600304,
"available": 2224
},
"code_space": {
"memorySize": 430080,
"committedMemory": 170720,
"capacity": 154336,
"used": 154336,
"available": 0
},
"map_space": {
"memorySize": 528384,
"committedMemory": 309984,
"capacity": 309120,
"used": 309120,
"available": 0
},
"large_object_space": {
"memorySize": 135168,
"committedMemory": 135168,
"capacity": 131112,
"used": 131112,
"available": 0
},
"code_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 0,
"used": 0,
"available": 0
},
"new_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 1047424,
"used": 0,
"available": 1047424
}
}
}
and top (just before the crash):
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10138 root 20 0 0.319t 7.563g 13644 S 350.8 0.8 1:46.31 node
There are many spaces seen as exhausted, such as code_space and map_space. How do I increase those? I am not sure which flags from node --v8-options to use.
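For reference, the heap-related flags that V8 exposes can be listed with a quick grep (flag names and output vary by Node version; this is just a way to browse them, not a recommendation of specific flags):

```shell
# Show the V8 flags whose help text mentions heap spaces
node --v8-options | grep -i "space"
```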
@nodejs/v8
I think the big hint there might actually be VIRT reporting as 0.319t – maybe the process is running out of virtual memory? (That would be somewhat related to https://github.com/nodejs/node/issues/25933)
But I have unlimited virtual memory:
virtual memory (kbytes, -v) unlimited
Plus, the failing stack in the referenced issue has node::NewIsolate in it, which is not the case here; it looks like we are doing GC?
@oh-frontend1 - what does your ulimit -v show?
@gireeshpunathil
> ulimit -v
unlimited
and my report.json
"javascriptHeap": {
"totalMemory": 4452352,
"totalCommittedMemory": 3517904,
"usedMemory": 1448464,
"availableMemory": 85947560576,
"memoryLimit": 85949677568,
"heapSpaces": {
"read_only_space": {
"memorySize": 262144,
"committedMemory": 33088,
"capacity": 32808,
"used": 32808,
"available": 0
},
"new_space": {
"memorySize": 2097152,
"committedMemory": 1683416,
"capacity": 1047456,
"used": 188368,
"available": 859088
},
"old_space": {
"memorySize": 1396736,
"committedMemory": 1368440,
"capacity": 1064504,
"used": 897832,
"available": 166672
},
"code_space": {
"memorySize": 430080,
"committedMemory": 170400,
"capacity": 154016,
"used": 154016,
"available": 0
},
"map_space": {
"memorySize": 266240,
"committedMemory": 262560,
"capacity": 175440,
"used": 175440,
"available": 0
},
"large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 0,
"used": 0,
"available": 0
},
"code_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 0,
"used": 0,
"available": 0
},
"new_large_object_space": {
"memorySize": 0,
"committedMemory": 0,
"capacity": 1047456,
"used": 0,
"available": 1047456
}
}
},
Thanks @oh-frontend1 - so our failing contexts seem to match. Let me see if I can figure out what caused the GC to fail.
$ grep "ENOMEM" strace.txt | grep "mmap" | wc -l
1372184
There are a huge number of mmap calls that fail with ENOMEM. Looking at the mmap manual, the second probable reason stated for ENOMEM is exhaustion of the process's maximum number of mappings.
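A quick way to see how close a process is to that ceiling is to compare its mapping count with the kernel setting (a sketch; in practice substitute the node process's PID for $$):

```shell
# Count this process's memory mappings and compare with the kernel limit.
# Each worker's V8 isolate adds many mmap'd regions, so a few thousand
# workers can exhaust the default limit of 65530 mappings.
maps=$(wc -l < /proc/$$/maps)
limit=$(cat /proc/sys/vm/max_map_count)
echo "mappings: $maps / limit: $limit"
```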
$ sysctl vm.max_map_count
vm.max_map_count = 65530
$ sysctl -w vm.max_map_count=655300
vm.max_map_count = 655300
$ node --max-heap-size=100000 foo
{"rss":11898519552,"heapTotal":62894080,"heapUsed":32157136,"external":940898,"arrayBuffers":9386}
{"rss":14497640448,"heapTotal":62894080,"heapUsed":32192184,"external":940938,"arrayBuffers":9386}
{"rss":15572897792,"heapTotal":62894080,"heapUsed":32200208,"external":940938,"arrayBuffers":9386}
{"rss":16316686336,"heapTotal":62894080,"heapUsed":32203928,"external":940938,"arrayBuffers":9386}
...
$ top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
25537 root 20 0 0.575t 0.014t 13636 S 209.0 1.4 2:25.30 node
So by increasing the mapping count, I am able to create 4K threads, consuming up to 0.5t of virtual memory and 15 GB of RSS. It looks like adjusting the maximum mapping count in the kernel is the solution for this. @oh-frontend1 - can you please verify?
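Note for anyone landing here: a sysctl -w change does not survive a reboot. To persist it, the setting can also be written to sysctl configuration (the file name below is illustrative; size the value to your thread count):

```shell
# Apply immediately (root required)
sysctl -w vm.max_map_count=655300
# Persist across reboots via a drop-in config file (hypothetical name)
echo 'vm.max_map_count=655300' > /etc/sysctl.d/90-node-workers.conf
sysctl --system   # reload all sysctl configuration files
```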
@gireeshpunathil thank you, this solution also works on my real code.
@oh-frontend1 ... I was wondering if you wouldn't mind expanding on the reason why you need a worker thread pool of several thousand workers. What is the scenario / app use case you're exploring here? The reason I'm asking is that we (NearForm) are doing some investigation into worker thread perf diagnostics, and the dynamics of profiling small worker pools (in the 4-50 range) are much different than profiling pools in the 2k-4k range, so we'd like to understand the use case a bit more.
@jasnell the application is confidential. In short: in the future, I have to monitor a large number of IoT devices. Having 500 network I/O operations on the same thread causes a large CPU bottleneck, but splitting across child_process is hard to manage and to communicate with from the master, so I decided to use worker_threads.
The simple case is one I/O per thread. If I cannot resolve this problem, I will increase the number of I/Os per thread, but that will increase code complexity.
In the real application, as I benchmarked, I could only create about ~200 threads before this error happened, so I created a minimal source to reproduce it (in that case the number of threads reached 1.5k before the error occurred).
Ok thank you! That is super helpful information @oh-frontend1!
@oh-frontend1 what do you mean by "500 network IO"? Is it 500 client connections? If that's true and your application is I/O bound, Node should be able to handle much more than that. In most cases, you just need to follow the golden rule of Node (don't block the event loop).
And if it's CPU-bound, then it's better to keep the number of worker threads close to the number of CPU cores and queue tasks when all members are busy (just like ThreadPoolExecutor in Java does). Otherwise, if you run CPU-bound tasks on a large number of threads, you will waste memory (due to the footprint of each worker thread) and CPU time on OS-level context switching.
Sorry in advance if I misunderstood your needs.