Node: maxBuffer default too small

Created on 28 Nov 2016 · 15 Comments · Source: nodejs/node

  • Version: v6.2.1
  • Platform: Darwin me.local 15.5.0 Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64 x86_64
  • Subsystem: child_process

Currently maxBuffer for child_process.exec is set to 200 * 1024 bytes (200 KiB). I ran into an issue where my child process was being terminated, and tracking it down was quite tough; it turned out the child was producing enough output to exceed maxBuffer.

I think the buffer size is too small, and this behavior (terminating the child) is drastic enough that it should only happen when a child produces a much larger amount of output.

I'm not sure what's sane here; perhaps 5 MB or more?
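
As a point of reference, raising the limit per call looks roughly like this (a sketch, not from the thread; 'some-command' and the 10 MB value are placeholders):

```js
// A sketch, not from the thread: raising the limit per call.
// 'some-command' and the 10 MB value are placeholders.
const { exec } = require('child_process');

exec('some-command', { maxBuffer: 10 * 1024 * 1024 }, (error, stdout, stderr) => {
  if (error) {
    // If even this limit is exceeded, the child is killed and the
    // callback receives an error mentioning maxBuffer.
    console.error(error.message);
    return;
  }
  console.log(stdout);
});
```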


All 15 comments

I would go with 10 MB (which is what I'm using as the default in execa).

How about we make maxBuffer mandatory?

@sam-github That sounds like an extremely breaking change and would be very inconvenient. Just set it to an arbitrary high value so most users won't hit it.

Yeah, bit of a strawman... but any value is likely to be smaller than someone needs.

@sam-github Yes, but 200 KB is IMHO way too low. Anything higher will at least improve the situation.

The flip side is that the memory footprint for a program that spawns twenty simultaneous child processes goes up from max 4 MB to max 200 MB. That's a pretty steep increase.

I infer that you use maxBuffer as a convenient way to collect output but it's intended to be used as a circuit breaker for runaway processes.

Also, I'm curious why you are using child_process.exec/execFile for programs that print a lot to stdout. You'd be better served by child_process.spawn(), it knows how to stream the data.

> The flip side is that the memory footprint for a program that spawns twenty simultaneous child processes goes up from max 4 MB to max 200 MB. That's a pretty steep increase.

That's only if they actually use all 10 MB each, right?

Honestly, even maxBuffer at 1 MB would be a huge improvement.

> Also, I'm curious why you are using child_process.exec/execFile for programs that print a lot to stdout. You'd be better served by child_process.spawn(), it knows how to stream the data.

.spawn() is inconvenient to use when you just want the output as a whole and do something with it, which is my most common use-case for child_process. Not very often, but once in a while, I have output that takes more than 200 KB, like here.
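
For illustration, collecting the whole output with spawn() looks roughly like this (a sketch; 'some-command' is a placeholder):

```js
// A sketch of collecting a child's whole output with spawn(), which
// streams and is not subject to maxBuffer. 'some-command' is a placeholder.
const { spawn } = require('child_process');

const child = spawn('some-command', ['--some-flag']);
const chunks = [];

child.stdout.on('data', (chunk) => chunks.push(chunk));
child.on('close', (code) => {
  const output = Buffer.concat(chunks).toString();
  console.log(`exited with code ${code}; collected ${output.length} characters`);
});
```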

@bnoordhuis That is a good point, though it doesn't solve the core issue that developers hit this limit and their child processes terminate without much indication as to why.

Adding to what @sindresorhus said: if you're using exec to run a command and view its output, you're explicitly dealing with a single process, so in theory you're not going to spawn a bunch of children without monitoring their output.

I'd be OK with upping it to 1 MB if someone PRs it; I've run into the limit too (but I just set maxBuffer when I do). 10 MB seems excessive; if you want that much output, explicitly configuring node to expect it is reasonable.

Or perhaps the limit should be the maximum string size, and runaway processes become the user's problem to protect against?

Having the maxBuffer at all is the bug. It gives users an option they don't want to have to specify.

Much better to remove the maxBuffer option and have the buffer grow dynamically. Programs with small streams use small amounts of memory; programs with large streams use more memory. Everything just works, and the programmer doesn't need to know about the issue at all.

@peterhal I take it you didn't read this comment?

> it's intended to be used as a circuit breaker for runaway processes

Having a maxBuffer option is enough to let a developer create their own circuit breaker when they want one; i.e., a developer can set maxBuffer to a value they choose.

The problem is that Node is trying to do what the developer should be doing. My humble opinion is that maxBuffer should default to Infinity and simply be available for developers to use as a circuit breaker when they feel it's appropriate.

Otherwise, as has been noted above, any default value for maxBuffer is likely to be too small in certain circumstances, resulting in unforeseen errors in various applications that didn't think to override whatever the default value was.
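
For illustration, the division of labor proposed above might look like this: no built-in limit, with the caller supplying their own circuit breaker (a sketch; 'some-command' and the 5 MB cap are placeholders):

```js
// A sketch of a caller-supplied circuit breaker, assuming no built-in
// limit. 'some-command' and the 5 MB cap are placeholders.
const { spawn } = require('child_process');

const LIMIT = 5 * 1024 * 1024;
const child = spawn('some-command');
const chunks = [];
let buffered = 0;

child.stdout.on('data', (chunk) => {
  buffered += chunk.length;
  if (buffered > LIMIT) {
    // Our own policy, not Node's: stop the runaway child ourselves.
    child.kill();
    console.error('output exceeded limit; child killed');
    return;
  }
  chunks.push(chunk);
});
```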

Is this a change we're likely to consider? Should this remain open?

There is an open but stalled pull request. Seeing there hasn't been much movement, I'll close this out. If someone wants to adopt #11196, feel free.

edit: I just closed the PR; didn't seem likely its author was going to pick it up again.

Can someone help me change the default buffer size? I have the same issue executing az status vm list skus.
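
There's no global default to change; maxBuffer has to be passed per call. A sketch (the command is copied from the question above; the 50 MB value is an arbitrary illustration):

```js
// maxBuffer is a per-call option, not a global setting. The command is
// copied from the question above; the 50 MB value is arbitrary.
const { execSync } = require('child_process');

const out = execSync('az status vm list skus', { maxBuffer: 50 * 1024 * 1024 });
console.log(out.toString());
```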
