Given the following code:
const cluster = require('cluster')
if (cluster.isMaster) {
cluster.setupMaster({
execArgv: [
'--debug=1337'
]
})
cluster.fork()
} else {
setInterval(() => true, 1000)
}
I get the following output on the console at runtime:
Debugger listening on port 5859
The reason for this is because the current code assumes that whatever flag present in execArgv will be the same as the one being passed to the master process, and therefore extracts the initial port from process.debugPort instead (ref: https://github.com/nodejs/node/blob/master/lib/internal/cluster/master.js#L98-L119). It also assumes that the first port will be used to debug the master, and automatically increment the port for the first worker. Finally, given the submitted code, one would arguably expect no increments to happen at all; in the actual use-case where I wish to make sure of this pattern, I use cluster with one and only one worker at a time, so re-using the same port would be perfectly fine.
I would be more than happy to contribute a fix, but given the current behaviour and the fact that I don't know how I could actually distinguish programatic setup from the initial extraction of execArgv passed to the master process, I am having a bit of a hard time to figure out how to approach this issue. Suggestions more than welcome.
disclaimer: I'm a coworker of @stelcheck
It seems very reasonable to me to want to be able to set these on child processes instead of the master process, in situations like process managers etc. I would welcome a fix to this where if the flag was not set on the master process, the port value provided is used, and both process.debugPort and debugPortOffset are ignored.
[Just my suggestion]
Add a second argument, opt, to cluster.fork() that will trickle down to createWorkerProcess and allow you to override all of the preset opts:
{
env: workerEnv,
silent: cluster.settings.silent,
execArgv: execArgv,
stdio: cluster.settings.stdio,
gid: cluster.settings.gid,
uid: cluster.settings.uid
}
(maybe not env because that will cause redundancy)
[other option]
Get --inspect-port to be whitelisted in NODE_OPTIONS (#12028)
--inspect-port should have been in NODE_OPTIONS: https://github.com/nodejs/node/pull/13002
I think what @stelcheck is trying to do is perfectly reasonable, and agree that with the special casing its a bit hard to get all the corner cases.
@refack Hi, I'd like to take this one. I've read over your suggestion, the related code, and made a gist. You mean something like this?
https://gist.github.com/arturgvieira/493772fb633acfb1c8dcaba2d94cfa12
@arturgvieira I could be wrong, but I suspect https://github.com/nodejs/node/pull/13002 is the solution more likely to gain favor than adding a new argument to cluster.fork(). It hasn't landed yet, though, so I'm somewhat speculating.
(Because there's a likely solution proposed and undergoing active review, I'm going to remove the help wanted label from this issue.)
@Trott Thanks, I must have missed the reference to the open PR.
So, I'm a bit confused, but did #13002 fix this? Or was it a requirement in order to fix this at all?
No, #13002 did not fix it, or try to. Its an open bug, waiting for a taker.
No, #13002 did not fix it, or try to. Its an open bug, waiting for a taker.
My mistake. I've re-added the help wanted label. Sorry for any confusion I've caused.
@ronkorving if you take the latest nightly build, and pass env = {NODE_OPTIONS: '--inspect-port=1337}' as an argument to fork it _might_ work... (I'll need to read the code again to see who will win in this case argv, or NODE_OPTIONS)
@arturgvieira yes that's pretty much what I meant. It will open up all the other parameters to overriding as well.
^ @stelcheck
My expectation though is that it won't work, because in a child process I think the port is completely ignored.
refack: Hi! I'm working on https://github.com/nodejs/node/issues/12941 and I am implementing your idea to add a 'opt' argument to cluster.fork() Here is the thing, how do I cover the case where env (the first paramenter) is not passed in. Thanks in advance.
Yeah it's a tricky one...:
silent, execArgv, stdio, gid, uid) get a special prefix like _NODE_CLUSTER_ARG_****silent, execArgv, stdio, gid, uid) it's optper-worker customization doesn't align well with the overall design of cluster, which is about symetric TCP servers, where by symetric, I mean each worker is is identical. Per-worker customization breaks that, and moves cluster out of the use-pattern they were designed for. That it creates this subtle API WTF is an indication of the problem. Node has lots of API sigs where we look at the arg types to see which of the optional args were provided, but not any (to my memory), where we look at the keys in an Object argument to try to heuristically decide which of the args it is, this seems like an API oddity. For people who really want per-worker customization, it is a feature already, https://nodejs.org/api/cluster.html#cluster_cluster_setupmaster_settings can be called before cluster.fork(), and cluster.settings can be used to restore the settings if it was saved before calling setupMaster(). This satisfies, I think, the use-case, but its roundabout nature emphasizes that while its possible, its not intended.
That it creates this subtle API WTF is an indication of the problem.
Makes sense.
That it creates this subtle API WTF is an indication of the problem.
Makes sense.
Except for the debug case... Where the port incrementation is a bit of a kludge...
So how about adding debugPort to cluster.setupMaster, or a getDebugPort lambda?
/cc @arturgvieira
I had an idea for a while of adding a second kind of worker ID, one which was unique at any moment, but not over all time (current worker IDs increment infinitely). This would mean that if you had 40 workers, even as they died and were restarted, the stable worker IDs would always be <= 40, and they could have various uses... such as being used to generate a debug port, the current approach uses the current worker ID, which can get large enough the port is no longer in range. Never got around to it, maybe its not the best idea, but I'm throwing it out there.
@sam-github I think there's quite a need for that (or at least more than none :)). I've wanted that since forever. It can make things like logging aggregation much easier (predictable and configurable IDs, instead of ever incrementing new data sources).
I had an idea for a while of adding a second kind of worker ID, one which was unique at any moment, but not over all time
Maybe @arturgvieira will be interested...
Sounds interesting. I am wondering where to start, to make it happen.
https://github.com/nodejs/node/blob/master/lib/internal/cluster/master.js#L154
I would guess add ref-counting in the on('exit') to keep track of released ids
Then look at https://github.com/nodejs/node/blob/master/lib/internal/cluster/master.js#L113 and use the id instead of debugPortOffset
Ok, I'll take a look, and let you know if I run into any problems.
I took a look and I am trying to find a good method to do the ref-counting. I thought about using a simple array. @refack what do you think?
I took a look and I am trying to find a good method to do the ref-counting. I thought about using a simple array. @refack what do you think?
Yeah, no need for anything fancy. But maybe a Set will be better, you have add(), delete(), has(), and you can find the maximal element with Math.max(...s.values())
Cross-ref: https://github.com/nodejs/node/pull/9659
I'm trying to change this behaviour like half a year =)
In https://github.com/nodejs/node/pull/9659#issuecomment-307211154, I've mentioned my new plan. I'm on stage two now (https://github.com/nodejs/node/pull/13619), after that, I'm planning to do stage three.
@refack Does the work @mutantcornholio is doing conflict with the id ref-counting that I am doing?
Hmm, I wasn't going to implement that, but we should totally do it.
If you want to implement it, go for it!
I would like to ask you to wait for #13619 to merge, so our I wouldn't get to rewrite all my tests :(
Next PR I'm planning is to allow to set debug port number in cluster.settings and bypass port increment.
That shouldn't conflict with changes you're planning (I don't care which logic I'm about to bypass;)), so we can work in parallel then.
Great, let's work in parallel then. As for the timetable, I don't think we'll conflict.
@refack Does the work @mutantcornholio is doing conflict with the id ref-counting that I am doing?
AFAIK @mutantcornholio second phase is to give the user a manual opt-out of auto-incrementing or a way disconnect the workerID from the debugger port.
ref-counting workerIDs will be good anyway.
@refack Ok, also I wanted to run some code by you. Here is a short version of the cluster/master.js file https://gist.github.com/arturgvieira/493772fb633acfb1c8dcaba2d94cfa12
I made the changes above but not sure where to go from here.
You mentioned on 'exit' to track the release of the ids, I think that is what I'll do next.
You mentioned on 'exit' to track the release of the ids, I think that is what I'll do next.
Yes, commented just that on the gist.