supervisord Starts All Processes at the Same Time

Created on 31 May 2012 · 218 Comments · Source: Supervisor/supervisor

My config:

[program:redis-testapp]
command=/opt/bcs/bin/redis-server /apps/testapp/releases/current/environments/all/redis.conf
stdout_logfile=/var/log/redis_testapp_log
stderr_logfile=/var/log/redis_testapp_log
startsecs=30
priority=1
autostart=true
autorestart=true

[program:celerybeat-testapp]
command=python -O manage.py celerybeat --loglevel=INFO --schedule=/apps/testapp/db/celerybeat_schedule_db
stdout_logfile=/var/log/celerybeat_testapp_log
stderr_logfile=/var/log/celerybeat_testapp_log
priority=999
startsecs=5
autostart=true

[program:celery-testapp]
command=python -O manage.py celeryd --loglevel=INFO --events
stdout_logfile=/var/log/celeryd_testapp_log
stderr_logfile=/var/log/celeryd_testapp_log
priority=100
startsecs=10
autostart=true

[program:gunicorn-testapp]
command=gunicorn_django --workers=10 --log-level info --timeout 500 --bind=127.0.0.1:8004
stdout_logfile=/var/log/gunicorn_testapp_log
stderr_logfile=/var/log/gunicorn_testapp_log
priority=999
startsecs=10
autostart=true

[program:memcached-testapp]
command=/opt/bcs/bin/memcached -m 128 -l 127.0.0.1 -p 11212 -u nobody -P /apps/testapp/run/memcached.pid
stdout_logfile=/var/log/memcached_testapp_log
stderr_logfile=/var/log/memcached_testapp_log
priority=11
autostart=true
autorestart=true

My output =>

2012-05-30 22:37:33,181 INFO daemonizing the supervisord process
2012-05-30 22:37:33,182 INFO supervisord started with pid 16230
2012-05-30 22:37:34,195 INFO spawned: 'redis-testapp' with pid 16232
2012-05-30 22:37:34,206 INFO spawned: 'memcached-testapp' with pid 16233
2012-05-30 22:37:34,214 INFO spawned: 'celery-testapp' with pid 16234
2012-05-30 22:37:34,238 INFO spawned: 'celerybeat-testapp' with pid 16235
2012-05-30 22:37:34,477 INFO spawned: 'gunicorn-testapp' with pid 16241
2012-05-30 22:37:35,240 INFO success: memcached-testapp entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2012-05-30 22:37:39,434 INFO success: celerybeat-testapp entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2012-05-30 22:37:44,434 INFO success: celery-testapp entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2012-05-30 22:37:44,435 INFO success: gunicorn-testapp entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2012-05-30 22:38:04,197 INFO success: redis-testapp entered RUNNING state, process has stayed up for > than 30 seconds (startsecs)

What I expected to happen =>

redis would start and supervisord would wait 30 seconds before starting any lower priority processes.

fredpalmer

👍185 ❤6 👀4 🚀2 👎1

Most helpful comment

Someone on the Supervisor project needs to take ownership of this, or close the ticket with a reasonable explanation of why it won't be addressed, e.g. perhaps it doesn't need to be addressed because supervisor can re-attempt a restart on an error condition (a solution that has worked for me).

This issue has been open since 2012 and it's getting kind of ridiculous that a ticket would remain open for this length of time.

orrery on 21 Aug 2017

👍26 👀1

All 218 comments

I don't see the point of using supervisor if it does not get this right.

@mnaberez If there is an open issue for this, you should post the link so we can find it.

disposable-ksa98 on 2 Jul 2013

👍8

I also need the ability to start processes in a particular order.

An event-based approach could work. For example, if I have a [program:agent] and a [program:client] I could subscribe the client to start up when the agent emits a started event. By default, if a program does not subscribe to any event, it will be started when supervisord starts.
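
Supervisor's existing event listener mechanism can approximate this idea. The following is a minimal Python sketch of a listener loop that reacts to PROCESS_STATE_RUNNING events, assuming it runs as an [eventlistener:x] program subscribed to that event. The names `parse_tokens`, `handle_events`, and the `on_running` callback are illustrative and not part of Supervisor.

```python
import sys
from typing import Callable, Dict, TextIO


def parse_tokens(line: str) -> Dict[str, str]:
    """Parse a 'key:value key:value' line as used in event headers and payloads."""
    return dict(token.split(":", 1) for token in line.split())


def handle_events(
    stdin: TextIO,
    stdout: TextIO,
    on_running: Callable[[str], None],
) -> None:
    """Listener loop: call on_running(processname) for every RUNNING event."""
    while True:
        # Tell supervisord we are ready for the next event.
        stdout.write("READY\n")
        stdout.flush()
        header_line = stdin.readline()
        if not header_line:
            return  # stdin was closed, nothing more to read
        header = parse_tokens(header_line)
        # 'len' is a byte count; for ASCII payloads it equals the character count.
        payload = parse_tokens(stdin.read(int(header["len"])))
        if header["eventname"] == "PROCESS_STATE_RUNNING":
            on_running(payload["processname"])
        # Acknowledge the event so supervisord can send the next one.
        stdout.write("RESULT 2\nOK")
        stdout.flush()


if __name__ == "__main__" and not sys.stdin.isatty():
    # Hypothetical wiring: log to stderr which program became RUNNING.
    handle_events(sys.stdin, sys.stdout, lambda name: print(name, file=sys.stderr))
```

In this sketch, `on_running` is where a dependent program would be started, for example by calling `supervisor.startProcess` over Supervisor's XML-RPC interface with the client configured as `autostart=false`. That wiring is an assumption and is left out here.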

rca on 9 Jul 2013

👍5

sevastos on 3 Mar 2014

maximilize on 6 Mar 2014

tomislacker on 9 Apr 2014

+1, this is a major issue.

dmitriy-kiriyenko on 21 Apr 2014

jefferai on 22 Apr 2014

hunterloftis on 23 Apr 2014

orarbel on 1 May 2014

abierbaum on 2 May 2014

gigaroby on 13 May 2014

:+1:

KenjiTakahashi on 14 May 2014

karanlyons on 30 May 2014

voleg on 10 Jun 2014

toxsick on 10 Jun 2014

must have

harmy on 13 Jun 2014

is there a dependency in supervisor? if not, why not?

xbeta on 23 Jun 2014

analytically on 27 Jun 2014

devrim on 8 Jul 2014

alp82 on 10 Jul 2014

mtrienis on 12 Aug 2014

Pindar on 2 Sep 2014

Not wanting to add to any perceived pressure, but I'd be +1 on the very same feature (process dependencies). :-)

miguno on 3 Sep 2014

this would be very useful and mean less ugly hacks like sleep

nand0p on 4 Sep 2014

👍2

jowenn on 6 Sep 2014

pangon on 14 Sep 2014

+1. Maybe an option to make process loading synchronous.

ricardosasilva on 15 Sep 2014

:+1:

brianholcomb on 23 Sep 2014

clintecker on 23 Sep 2014

:+1:

kgadek on 23 Sep 2014

vladfr on 24 Sep 2014

bataras on 26 Sep 2014

Anyone got any other implementation thoughts on this? I'm considering making a run at it since no one else seems to care.

tomislacker on 26 Sep 2014

lostsnow on 29 Sep 2014

@tomislacker well, IMHO the best solution is to make a dependency solver à la Puppet. As far as I can see, a simple topological sort with DFS should be more than enough. Then add a "requiresstarted" field (or "waitforstarted", or something similar). This issue would be fixed along the way.

(edit) Well, not sure what would happen if while having A -> B -> C (A depends on B, etc), someone would try to stop B. Should A be stopped as well? Well… /depends/…
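
A minimal Python sketch of the DFS-based topological sort suggested here. The dependency mapping is passed in as a plain dict standing in for a "requiresstarted"-style option; the function name `start_order` and the program names are illustrative assumptions, not Supervisor code.

```python
from typing import Dict, List


def start_order(deps: Dict[str, List[str]]) -> List[str]:
    """Return program names so that every program comes after its dependencies.

    `deps` maps a program name to the names it must wait for.
    Raises ValueError on a circular or unknown dependency.
    """
    order: List[str] = []
    state: Dict[str, str] = {}  # name -> "visiting" | "done"

    def visit(name: str) -> None:
        if name not in deps:
            raise ValueError(f"unknown program: {name}")
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"circular dependency involving {name}")
        state[name] = "visiting"
        for dep in deps[name]:
            visit(dep)  # dependencies are emitted before the dependent
        state[name] = "done"
        order.append(name)

    for name in sorted(deps):  # sorted for a deterministic result
        visit(name)
    return order


if __name__ == "__main__":
    # A depends on B, B depends on C (the A -> B -> C case above).
    print(start_order({"A": ["B"], "B": ["C"], "C": []}))  # prints ['C', 'B', 'A']
```

The sort only answers the start-order question. What should happen to A when B is stopped is a separate policy decision that this sketch does not address.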

kgadek on 29 Sep 2014

:+1:

sinchb on 4 Nov 2014

swindmill on 5 Nov 2014

Since everyone seems to need/want to do this a slightly different way -- how about a slightly different tack? Instead of attempting to express dependencies directly in a configuration directive, how about instead adding a directive for a 'helper' script, and a couple example helpers that suit 2-3 general use cases with differing requirements as @kgadek pointed out?

Here's a good example to demonstrate what I mean:
http://www.serfdom.io/docs/recipes/event-handler-router.html

There's some stuff out there for doing this sort of thing easily in bash,
https://github.com/progrium/pluginhook

And you certainly can't discount the nicety of "I felt like writing it in python" (Or any other language of admin's choice)
https://github.com/garethr/serf-master

It seems to me like this would be a reasonable win for everyone without consuming a lot of developer time on supervisor itself.

kamilion on 15 Nov 2014

Hey,

The serf model looks very promising. Having a collection of scripts instead of just one handler seems appropriate (e.g. checking for 2 deps before starting)

So, you would define a handler to see if process/group:x "can_be_started", and if all scripts return 0, then it's all good. Sounds right?

Imho this makes "priority" obsolete, am I wrong?

vladfr on 18 Nov 2014

For my needs, simply having a per-process delay from the time the supervisor daemon starts would be enough.

jefferai on 21 Nov 2014

yet another +1 for this

binhex on 1 Dec 2014

+1. Per process delay would be enough tho.

samet on 11 Dec 2014

I've been able to work around this a bit. Quick and dirty.

# Deal with updating our repositories.
supervisorctl start source-code-deploy

# Check for the oneshot process to complete.
while ! supervisorctl status source-code-deploy | grep -q 'EXITED'; do sleep 1; done
# Wait for the while loop to break out signalling success.

# Start the late boot process, now that the deployment is complete.
supervisorctl start system-boot-late

# Check for the oneshot process to complete.
while ! supervisorctl status system-boot-late | grep -q 'EXITED'; do sleep 1; done
# And now we should become EXITED to supervisord and any other tasks relying on the above.

kamilion on 11 Dec 2014

👍6

NicholasTurner on 12 Dec 2014

msierks on 17 Dec 2014

Hello,
I'd like to have a go at this in the coming weeks, let me know if anyone else is working on it.

vladfr on 5 Jan 2015

Attempting to start many processes at exactly the same time tends to push the machines to the limit and sometimes even beyond.

Having a parameter like "startdelay" would be very useful for me especially since some processes tend to use a lot of resources when first starting, resources which are released after a short while.

Unfortunately there are quite a few ways to implement such a feature and finding the best one could take some time.

@vladfr Perhaps we could collaborate on this.

My first "dirty" crack at it:
https://github.com/liutec/supervisor/commit/eab7cc1e04ad49768593183e8134298604459827
(I really don't like having to use sleep this way.)

liutec on 5 Jan 2015

Hi @liutec, look above at what @kamilion is saying about a hook model - I think this is useful to implement custom checks, and could be used to just sleep for a simple case.
You could run the hook scripts in subprocesses and they can sleep all they want.

vladfr on 6 Jan 2015

In my situation, the goal is to be able to have supervisor manage around 240 distinct programs some of which may require more than one instance.

The start delay is only useful for the first time a program is started, simply to avoid the otherwise unlikely situation when all programs start at the exact same time.

Most of the programs are consumers and will only run for a short amount of time before they shutdown and are restarted by the supervisor -- in this case having a start delay will not do any good.

I've considered both @kamilion 's solution as well as the solution given by @mnaberez (autostart autorestart) but unfortunately none actually produce an easily configurable "startdelay".

liutec on 6 Jan 2015

+1. If supervisord had "sequence control" that would be great. Kafka depends on Zookeeper etc., and supervisor cannot be used the way we would like.

ripasapa on 12 Jan 2015

jackwilsdon on 26 Jan 2015

I am using supervisord to start a bunch of exotic qemu instances (for a compilers course) and some of these boot up quickly by themselves but take quite a long time if started all at once. A simple "startdelay" thing would work just fine: Sleep for the given number of seconds after starting this process before moving on to the next one in the list. I don't know exactly how you decide to order these (the priority keyword doesn't seem to order strictly as far as I can tell from the created pids) but whatever it is, being able to delay for a few seconds after each process would help a whole bunch in my case. So :+1: for sure. Twice, actually. :-)

phf on 29 Jan 2015

I think a startdelay option would be very useful. I have a supervisor to manage an AMQP server and some consumers. Some consumers implement a sleep before starting, some loop over a try to connect / error / sleep sequence (different implementations). I know the maximum time my AMQP server needs to start, so adding a startdelay option to the consumers would avoid those ugly hacks.

I think this may be very helpful in many situations.

After checking the code, I don't see any BC issues; the Supervisor behaviour is also unchanged when using the default value.
I can't wait for this to be merged.

skafandri on 1 Feb 2015

+1 for dependency support

kforner on 9 Feb 2015

sukrit007 on 13 Feb 2015

danros on 25 Feb 2015

maparent on 5 Mar 2015

5d on 9 Mar 2015

+10 ;)

karl-forner-quartz-bio on 9 Mar 2015

+1.

Bessonov on 9 Mar 2015

This issue needs to be renamed, because I don't think that startsecs is relevant here. I feel like the OP might've misinterpreted what startsecs is, thinking that it's a delay to wait before starting the process.

startsecs

The total number of seconds which the program needs to stay running after a startup to consider
the start successful. If the program does not stay up for this many seconds after it has
started, even if it exits with an “expected” exit code (see exitcodes), the startup will be
considered a failure. Set to 0 to indicate that the program needn’t stay running for any
particular amount of time.

msabramo on 9 Mar 2015

arinto on 30 Mar 2015

mcneilcode on 31 Mar 2015

important feature
+1 as well

anduslim on 6 Apr 2015

seanmcl on 8 Apr 2015

ahosie on 16 Apr 2015

piotr-piatkowski on 19 Apr 2015

maxexcloo on 11 May 2015

arturhoo on 11 May 2015

Simple workaround I use:

[program:uwsgi]
command=bash -c 'sleep 5 && uwsgi /etc/uwsgi.ini'

debdude on 12 May 2015

👍2

ReSTARTR on 19 May 2015

MorganAntonsson on 2 Jun 2015

Anyone have thoughts on how we may be able to solve this issue in a more responsible fashion? What if there were to be a command that could be invoked to validate the _"online"_ state of a process? Such as:

[program:myapp]
command=/usr/local/bin/myapp-microservice
checkcommand=curl -s http://localhost:54321/v1/_ping
checkfreq=1
checktimeout=3
startsecs=5

The above example would imply:

The command would be run as normal
The checkcommand would be executed every checkfreq seconds after the above invocation until startsecs, possibly in conjunction with checktimeout, causes an abort or the command returns 0.

_Example: If each call to checkcommand takes <=3 seconds, checkfreq=1, checktimeout=3, and startsecs=5; checkcommand gets run at +1.0s and, if it failed then, again at +5.0s, and the service is marked as failed on the conclusion of the second checkcommand invocation._

If any invocation of checkcommand returns an exit status of 0, then service is considered online.
If checkcommand does not return an exit status of 0 after startsecs, or the last possible invocation of checkcommand hangs for >=checktimeout seconds, the service start is considered a failure and the _"normal"_ logic that is already in place applies. _(Mark the service as a failure, output the same log content, etc...)_
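
One possible reading of these semantics, as a Python sketch. None of `checkcommand`, `checkfreq`, or `checktimeout` exist in Supervisor; they are the names from this proposal, and `run_check` and `wait_until_online` are hypothetical helpers written for illustration.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_check(argv: Sequence[str], checktimeout: float) -> bool:
    """Run the check command once; True only if it exits 0 within checktimeout."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=checktimeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A hung or missing check command counts as a failed check.
        return False
    return result.returncode == 0


def wait_until_online(
    check: Callable[[], bool],
    checkfreq: float,
    startsecs: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` every `checkfreq` seconds until it passes or `startsecs` elapse.

    Returns True as soon as one check succeeds, False once a check has
    failed and the startsecs deadline has been reached.
    """
    deadline = clock() + startsecs
    while True:
        sleep(checkfreq)
        if check():
            return True
        if clock() >= deadline:
            return False
```

With the example config above, this would be called as `wait_until_online(lambda: run_check(["curl", "-s", "http://localhost:54321/v1/_ping"], 3), checkfreq=1, startsecs=5)`. The `clock` and `sleep` parameters exist only so the timing can be exercised without real waiting.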

It just seems to me that we want the problem solved but we haven't described it well enough in this issue. The low-hanging-fruit answer is certainly to make sure that services with different priorities are not invoked until the _max(startsecs)_ of all higher priority services has elapsed. But aren't we only asking that in hopes of being able to loosely choreograph our intended result instead of being able to validate it on the way there?

tomislacker on 5 Jun 2015

👍2

:+1:

bazilio91 on 16 Jun 2015

@tomislacker I think for many uses loose choreography is an acceptable 90% case even if full validation is the 100% case.

Once you get into true dependency management you start looking a lot more like a real init and it starts becoming a much more complicated problem. But there are definitely low-hanging-fruit scenarios.

jefferai on 16 Jun 2015

:+1:

vincent-io on 18 Jun 2015

+1 I need this

ghost on 17 Jul 2015

@tomislacker this is a good idea, and easy to use because you just need the current config file.
But how do you handle an actual dependency? If my_program depends on Postgres, I check for it in checkcommand for a couple of times and hope it will start?

To express a dependency, I think you want to run checkcommand before the actual command.

vladfr on 21 Jul 2015

@tomislacker I would prefer there be simple dependencies between the programs

[program:A]
command=/usr/local/bin/A

[program:C]
command=/usr/local/bin/C
dependson=A

and let the user decide if the programs are going to check for more complicated conditions. For example: We can insert [program:B] that will perform a sophisticated check to ensure A is running properly; and fail if not. So, A starts. B depends on A so will only start when Supervisor considers A is RUNNING. When B starts, it will perform its analysis and fail if A is not running properly. C only starts when B ran (or is running successfully).

My suggestion does not cover the extra features you suggest, like number of checks, the timing of those checks, and when to give up; but I also advocate giving Supervisor some Cron-like features so programs like B are easy to define:

[program:A]
command=/usr/local/bin/A

[program:B]
command=curl -s http://localhost:54321/v1/_ping
dependson=A
startretries=3
restartintervalsec=5

[program:C]
command=/usr/local/bin/C
dependson=B

klahnakoski on 21 Jul 2015

> I also advocate giving Supervisor some Cron-like features

Please see response in #635. This has been considered at length by the Supervisor developers in the past and it was decided that cron-like functionality is out of scope for this project. Sorry.

mnaberez on 21 Jul 2015

:+1: for a way to deliver sequential execution in a feasible fashion.

Using something like startuppriority would be a good robust way to do this IMO.

ain on 22 Jul 2015

SimaWB on 30 Jul 2015

@klahnakoski in your first example, C depending on A means that supervisor needs to know A is ready. Right now, it only knows it's started.

vladfr on 30 Jul 2015

@vladfr yes, you are correct. That is why I also suggested B, which is a more sophisticated check to determine if A is ready. Ideally B is triggered periodically, and will fail if A is not ready.

klahnakoski on 30 Jul 2015

TalonOne on 14 Sep 2015

zjjott on 15 Sep 2015

+1 ?? This is from May, 2012, I think that supervisor will never have dependencies.

Could someone suggest alternatives with dependency management?

carlopires on 22 Sep 2015

+1, this would make things much more elegant

aisengard on 1 Oct 2015

naktinis on 14 Oct 2015

gastounage on 16 Oct 2015

cocasema on 17 Oct 2015

+1, it's sad that this is not doable after 3 years, using systemd units for now

Esya on 28 Oct 2015

 .----------------.  .----------------. 
| .--------------. || .--------------. |
| |      _       | || |     __       | |
| |     | |      | || |    /  |      | |
| |  ___| |___   | || |    `| |      | |
| | |___   ___|  | || |     | |      | |
| |     | |      | || |    _| |_     | |
| |     |_|      | || |   |_____|    | |
| |              | || |              | |
| '--------------' || '--------------' |
 '----------------'  '----------------'

Or please, add it.

citizen-stig on 10 Nov 2015

👍1

vellamike on 23 Nov 2015

cutewalker on 24 Nov 2015

wojss on 24 Nov 2015

macayaven on 3 Dec 2015

ddzialak on 3 Dec 2015

+10086

chenjie4255 on 7 Dec 2015

benjaminjones on 17 Dec 2015

ravihuang on 18 Dec 2015

pmec on 21 Dec 2015

wicksy on 22 Dec 2015

feisuzhu on 25 Dec 2015

vishwaraja on 28 Dec 2015

benma on 30 Dec 2015

weissjeffm on 30 Dec 2015

fxstein on 31 Dec 2015

m30m on 6 Jan 2016

maguowei on 11 Jan 2016

tobixx on 14 Jan 2016

rpzatkoff on 14 Jan 2016

jorgegarciadev on 23 Jan 2016

Wow, really? How come no-one cares about this? The "dependson" sounds like a good idea here. The workaround I've heard mentioned was to launch a script as the program, where the script checks for dependencies and otherwise bombs out. By virtue of bombing out, the script will be re-launched later.
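
A rough Python sketch of that wrapper-script workaround, assuming the dependency can be detected by a TCP port being open. The script name `wait_for_dep.py`, its argument layout, and the helper `port_open` are all hypothetical. It relies on supervisord retrying a failed start (bounded by `startretries`).

```python
import os
import socket
import sys
from typing import Sequence


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def main(argv: Sequence[str]) -> int:
    # Hypothetical usage: wait_for_dep.py HOST PORT COMMAND [ARGS...]
    if len(argv) < 4:
        print("usage: wait_for_dep.py HOST PORT COMMAND [ARGS...]", file=sys.stderr)
        return 2
    host, port, command = argv[1], int(argv[2]), list(argv[3:])
    if not port_open(host, port):
        # Bomb out: a non-zero exit makes supervisord treat the start as failed.
        print(f"dependency {host}:{port} not reachable, exiting", file=sys.stderr)
        return 1
    # Replace this process with the real program so supervisord supervises it directly.
    os.execvp(command[0], command)
    return 0  # not reached if exec succeeds


# Do nothing when run without arguments (e.g. when imported or exercised by tests).
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

In a [program:x] section this would be used as something like `command=python wait_for_dep.py 127.0.0.1 6379 /usr/local/bin/myapp`, where the port and program path are placeholders.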

boyvinall on 26 Jan 2016

GonZo on 27 Jan 2016

mtokumaru on 29 Jan 2016

vac on 11 Feb 2016

josehenriqueventura on 12 Feb 2016

+1. 2016 right now...

zhuangsirui on 25 Feb 2016

Pithikos on 29 Feb 2016

ppmathis on 5 Mar 2016

aplantenga on 16 Mar 2016

dotpmrcunha on 16 Mar 2016

Guys, please use the Like button on the original topic instead of the old-fashioned +1 comment :) This is what Github made it for.

Thanks!

ain on 16 Mar 2016

> Guys, please use the Like button on the original topic instead of the old-fashioned +1 comment :) This is what Github made it for.

+1 :trollface:

vincent-io on 16 Mar 2016

😄2

My vote would be for a simple dependson style system where you express dependencies and supervisord waits for them to start before starting their dependents. It would be great if supervisord would also stop them in the correct order when it was shutting down. However, if I manually stop a single service I don't think it is necessary for supervisord to try to stop dependents in that scenario. If someone wanted to get fancy they could add a flag to stop that shut down dependents.

I need supervisord to be smart about ordering in situations where it is unattended, e.g. startup and shutdown. I don't need it to be smart in response to specific commands I trigger, e.g. stop and start.

jheiss on 22 Mar 2016

👍10

Hi @mnaberez, I was wondering what the status of the process dependency feature is. You mentioned in 2012 that it was on the to-do list. Is this still coming?

orrery on 3 Apr 2016

> I was wondering what the status of the process dependency feature is. You mentioned in 2012 that it was on the to-do list. Is this still coming?

It is on the to-do list (TODO.txt) and this issue is still open. I don't think any developers are currently working on it.

mnaberez on 3 Apr 2016

confessin on 4 Apr 2016

Is this really difficult to implement in supervisor? I was wondering if there are other similar tools to supervisor that provide this feature?

orrery on 5 Apr 2016

AJNOURI on 22 Apr 2016

khotkevych on 8 Jun 2016

maxivak on 8 Jun 2016

adamgoose on 14 Jun 2016

Any updates? When will we have this feature? It would be a huge win for supervisord in our application.

madhusudhanane on 15 Jun 2016

I think supervisord is mostly used as a lightweight init manager instead of a heavyweight solution such as systemd (upstart, sysv-init, ...). But systemd manages service start dependencies really well, while supervisord has lacked such a feature since this issue was reported in 2012.

Why can't some developer here implement a simple dependency mechanism like systemd's? Is this project without a maintainer?

https://fedoramagazine.org/systemd-unit-dependencies-and-order/

Wants=sshd-keygen.service
After=network.target sshd-keygen.service

:+1:

c0b on 20 Jun 2016

👍5

Hello all!

I finally had some time and gave this a try. This implements "dependson" as a config option for both [program] and [group]. You can specify other programs and groups as dependencies, and supervisor will ensure they will be started before your program.

Example config:

[program:a]
command=a

[program:monitord]
command = ./monitord.sh
dependson=a
startsecs=5

[group:monitoring]
programs=c1,c2
dependson=monitord,a

[program:c1]
command = bash c1.sh
dependson=anotherprogram ; this doesn't matter because c1 is part of a group
autorestart=true

[program:c2]
command = bash c2.sh
autorestart=true

This config will ensure the following start order:

a
monitord
monitoring group, which will start c1 and c2

Notice that program:c1 dependson=anotherprogram
Since c1 is part of a group, its dependson value will be overridden by the group value. In my view, members of a group should depend on the same stuff.
Any thoughts here?

Please see the changes, and feel free to checkout the branch "dependson_122" to play around. There's still stuff to do (circular deps, finish tests, etc.) but first, I want some feedback!

https://github.com/Supervisor/supervisor/compare/master...vladfr:dependson_122?expand=1

Thank you in advance.

vladfr on 24 Jun 2016

👍12 ❤2

sorry for the mistakenly created #776

c0b on 24 Jun 2016

Just to summarize the systemd semantics referenced in https://github.com/Supervisor/supervisor/issues/122#issuecomment-227241447:

Requires= This directive lists any units upon which this unit essentially depends. If the current unit is activated, the units listed here must successfully activate as well, else this unit will fail. These units are started in parallel with the current unit by default.
Wants= This directive is similar to Requires=, but less strict. Systemd will attempt to start any units listed here when this unit is activated. If these units are not found or fail to start, the current unit will continue to function. This is the recommended way to configure most dependency relationships. Again, this implies a parallel activation unless modified by other directives.
BindsTo= This directive is similar to Requires=, but also causes the current unit to stop when the associated unit terminates.
Before= The units listed in this directive will not be started until the current unit is marked as started if they are activated at the same time. This does not imply a dependency relationship and must be used in conjunction with one of the above directives if this is desired.
After= The units listed in this directive will be started before starting the current unit. This does not imply a dependency relationship and one must be established through the above directives if this is required.
Conflicts= This can be used to list units that cannot be run at the same time as the current unit. Starting a unit with this relationship will cause the other units to be stopped.

From https://www.digitalocean.com/community/tutorials/understanding-systemd-units-and-unit-files#unit-section-directives

This subset of commands is extremely powerful, and I strongly suspect that the well-formed dependency-tree + event-generation you would need to implement any one of these directives makes the rest of them quite straightforward.

psigen on 26 Jun 2016

@vladfr (https://github.com/Supervisor/supervisor/issues/122#issuecomment-228416456)

> Since c1 is part of a group, its dependson value will be overridden by the group value. In my view, members of a group should depend on the same stuff.
> Any thoughts here?

I think a clearer semantic is to union the dependencies together.

This is useful in cases where a group as a whole depends on a few processes, but there is a member that might depend on an additional process: e.g. a web application where one component is interacting with a database server, but the other components are just serving static files.

In this case, many members of the group could be started earlier than the one member dependent on an extra process.
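
The union semantic can be sketched in a few lines of Python. This assumes the group-level and member-level dependson values have already been parsed into lists; `effective_deps` is a hypothetical helper and not part of the dependson branch.

```python
from typing import Dict, Iterable, Set


def effective_deps(
    group_deps: Iterable[str],
    member_deps: Dict[str, Iterable[str]],
) -> Dict[str, Set[str]]:
    """Union the group's dependencies with each member's own dependencies."""
    shared = set(group_deps)
    return {name: shared | set(own) for name, own in member_deps.items()}


if __name__ == "__main__":
    # Values taken from the example config earlier in the thread.
    deps = effective_deps(
        ["monitord", "a"],
        {"c1": ["anotherprogram"], "c2": []},
    )
    for name in sorted(deps):
        print(name, sorted(deps[name]))
```

Here c2 ends up depending only on the group's dependencies, while c1 additionally waits for anotherprogram instead of having that value discarded.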

psigen on 26 Jun 2016

krisdigitx on 12 Jul 2016

+1, old-fashioned way :smile:

danielzheng on 13 Jul 2016

+1 plz

marc2982 on 20 Jul 2016

👍1

check this:
https://mmonit.com/monit/documentation/monit.html#SERVICE-DEPENDENCIES

sangechen on 3 Aug 2016

Is there any logical reason why this is not being merged?! It's been 4 years and dozens of people have requested this feature... I'm one of them. This is frustrating.

rrei on 5 Aug 2016

Hi! wow, people are really interested in this.

@psigen sorry for the delay on this. Merging deps of a group makes the config clearer, but it's not better functionally. Due to the way supervisor groups are implemented, you need to start all-or-nothing.

In other words, this won't happen: "In this case, many members of the group could be started earlier than the one member dependent on an extra process." - b/c I can't start just some members of a group.

However, I'm not opposed to merging, because it makes it super-clear from the config which service has the actual dependency.

I'll spend some time next week to implement this, add tests and rewrite docs. The monit page looks really good, I'll do a review and see if there are any low-hanging fruit.

Any other comments on the general approach? @rrei I know there are many people that could use this today, so I'd really like to know if this can be used as-is, before submitting the actual PR.

Thank you!

vladfr on 5 Aug 2016

@vladfr I see, that's alright, it's just an optimization after all. Thanks for the feedback.

I do think that union-ing dependencies makes it clearer to trace dependencies and avoid duplicating information at the group level (and means you don't have to re-evaluate group dependencies every time a member is added or removed).

Overall, if you have only a single DEPENDS directive, the monit-style semantics seem most appropriate (with the appropriate translation to supervisord run states, and allowing the directive to refer to both members and groups as dependencies):

https://mmonit.com/monit/documentation/monit.html#SERVICE-DEPENDENCIES

The syntax for the depend statement is simply:

DEPENDS on service[, service [,...]]
Where service is a check service entry name used in your .monitrc file, for instance apache or datafs.

You may add more than one service name of any type or use more than one depend statement in an entry.

Services specified in a depend statement will be checked during stop/start/monitor/unmonitor operations.

If a service is stopped or unmonitored it will stop/unmonitor any services that depends on itself.

If the service is started, all services which this service depends on will be started before starting this service. If the start of some service fails, the service with prerequisites will NOT be started, but will remember that it should start and will retry next cycle.

If a service is restarted, it will first stop any active services that depend on it and after it is started, start all depending services that were active before the restart again.

The more complex systemd semantics cover an even wider set of use cases, and would probably be desirable at some point, but are obviously more intricate and may not fit the internal state that supervisor is using right now.

psigen on 5 Aug 2016

Ok, thanks @psigen! I'll finish the simple approach of "dependson".
I added circular deps detection and docs. I need to finish tests and that's it.
This feature works as expected with group deps, [fcgi-process], and supervisorctl.

It's been suggested here that Stop/restart should be manual, i.e. dependson applies as a start-only restriction. So this is what's happening now: after a program with dependson runs, it's independent: if A depends on B, A is not stopped when B terminates. This is mentioned in the docs.

Here are a few more points of discussion, feedback welcomed.

supervisorctl: start will respect deps. Do you feel there is a need to add an override for RPC calls? process.spawn could receive an argument to skip deps checking. This can be a config in the [supervisorctl] section. We can even include this in the [supervisord] section.
Do we want to trigger an event whenever a program can't start because of an unmet dep?
Here is an example of the WARN message for an unmet dep; is this ok as a WARN? I can see it as a DEBUG, because it's kinda spammy, and this message is expected.

2016-08-08 19:21:00,419 WARN process 'tom' cannot start - group 'cats' depends on processes which are not started yet: ['mouse', 'night']

So @mnaberez how is all of this looking to you so far?

vladfr on 8 Aug 2016

> So @mnaberez how is all of this looking to you so far?

I haven't looked at any of this. My initial feedback is that if some sort of dependency mechanism was to be added, it would need to have good user feedback in order to not become a support burden. If it were to block a command like supervisorctl start, I think it has to surface that information to RPC consumers so supervisorctl can print the appropriate messages. That may imply new process states or RPC response codes. My experience answering support tickets is that most users are not reading the main log as much as they should. If the dependency code blocks supervisorctl commands, that information needs to be visible in supervisorctl.

mnaberez on 8 Aug 2016

@mnaberez Thank you, I'll give that some thought! Right now, the process goes straight to an error state (abnormal termination).

For the record, this has been a personal thing for some 18 months now, so if it's considered for merge, it's a bonus. In any case, it sounds like a candidate for 4.0.

vladfr on 8 Aug 2016

jneu on 22 Aug 2016

donreilly on 23 Aug 2016

alexzam on 23 Aug 2016

@vladfr I think that in your supervisord branch, making 'dependson' a member of the class GroupProcessConfig would be a better choice, because that makes it easier to iteratively call 'removeGroup()' in 'rpcinterface.py'.

FAKERINHEART on 28 Aug 2016

So enthusiastic~

FAKERINHEART on 28 Aug 2016

Yeah I think I had it like that at some point but changed it to allow for merging deps at group-level. I haven't looked at rpc yet but I have to, thank you for the suggestion!

vladfr on 28 Aug 2016

@vladfr The commands you enter at the supervisorctl command line are translated into calls to the methods of the RPC interface class. Those RPC methods in turn call the member functions of the supervisor class in 'supervisord.py'.

FAKERINHEART on 28 Aug 2016

too many users, too few developers . orz

codeskyblue on 29 Aug 2016

@FAKERINHEART This is the most amazing thing I have seen. This situation makes me wonder whether this feature request is really needed, or whether the source code is just too hard to modify. Haha, it really surprises me; after all, this is a two-year-old issue.

codeskyblue on 29 Aug 2016

@FAKERINHEART I did try to write another supervisor in Go, because I only need part of supervisor's functionality, especially the web management page. Still in progress: https://github.com/codeskyblue/gosuv

codeskyblue on 29 Aug 2016

artinnok on 2 Sep 2016

👎1

treevesvarndell on 6 Sep 2016

👎1

nicobouliane on 13 Sep 2016

👎1

Edit: I am currently using chaperone which supports service ordering/dependencies.

strarsis on 26 Sep 2016

👍1

I've tried customizing supervisor to handle dependencies among programs, partly based on the code provided by vladfr. However, after reading the original supervisor source code, I think its architecture has a weakness. I found that the basic unit of subprogram config in supervisor is always converted to a 'config group' (processGroupconfig), whether its config key is 'program' (a homogeneous group) or it is merged into a 'group' (a heterogeneous group) in the config file.
The 'supervisord' class object and its transition loop work on the array of subprocess groups (process_groups) that is generated from the 'config group' (processGroupconfig) mentioned above. I think the 'config group' is too coarse a unit for supervisor to manage well. It also makes dependencies among subprograms harder to implement, if that feature is in your plans.
I think it would be better if the basic unit of a subprogram were a 'config program' (perhaps called 'processProgramconfig'), with the 'config group' (processGroupconfig) used only to put them together.

FAKERINHEART on 28 Sep 2016

❤1

+100

gaojice on 2 Oct 2016

I really needed this, and an event listener didn't sound too hard so I wrote one. It won't solve everyone's problem, it doesn't do dependencies, it's just ordered startup:
https://pypi.python.org/pypi/ordered-startup-supervisord/
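
To illustrate the event-listener idea in general terms, here is a minimal sketch. It is not the code of the package linked above. It assumes a listener subscribed to PROCESS_STATE_RUNNING events, and the program names in STARTUP_ORDER and the start_program callback are hypothetical placeholders.

```python
import sys

# Hypothetical startup order; replace with your own program names.
STARTUP_ORDER = ["redis", "memcached", "celery"]


def parse_tokens(line):
    """Parse the 'key:value key:value' format used by event headers and payloads."""
    return dict(token.split(":", 1) for token in line.split())


def next_program(order, eventname, payload):
    """Return the program that follows the one that just reached RUNNING, else None."""
    if eventname != "PROCESS_STATE_RUNNING":
        return None
    name = parse_tokens(payload).get("processname")
    if name not in order:
        return None
    position = order.index(name)
    return order[position + 1] if position + 1 < len(order) else None


def run_listener(start_program, stdin=sys.stdin, stdout=sys.stdout):
    """Event loop; start_program is a caller-supplied callback (hypothetical)."""
    while True:
        stdout.write("READY\n")  # tell supervisord we can accept an event
        stdout.flush()
        header_line = stdin.readline()
        if not header_line:  # stdin was closed, so stop listening
            break
        headers = parse_tokens(header_line)
        payload = stdin.read(int(headers["len"]))
        upcoming = next_program(STARTUP_ORDER, headers["eventname"], payload)
        if upcoming is not None:
            start_program(upcoming)
        stdout.write("RESULT 2\nOK")  # acknowledge the event
        stdout.flush()
```

For this to order anything, every program after the first would need autostart=false, and start_program would have to ask supervisord to start the named program, for example over its XML-RPC interface.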

jasoncorbett on 8 Oct 2016

👍5 ❤1

+1 ...

petrem on 27 Oct 2016

👎1

@FAKERINHEART apologies for the late reply. I think you are spot on - re: what you described in your last comment

the basic unit of subprogram config in supervisor is always converted to a 'config group' (processGroupconfig), whether its config key is 'program' (a homogeneous group) or it is merged into a 'group' (a heterogeneous group) in the config file.

That is exactly the issue, and why this task is not straightforward.
Did you find an improvement or another solution here? I'd love to chat more about it, hit me up.

vladfr on 27 Oct 2016

Gmanweb on 3 Nov 2016

👍4 👎1

joy4eg on 10 Nov 2016

👎1

jerome-plumecoq on 16 Nov 2016

👎2

ZzAntares on 18 Nov 2016

👎2 👍1

rastislavs on 2 Dec 2016

👍3 👎1

rdrsss on 7 Dec 2016

👎1

jacohend on 12 Dec 2016

👎1

bipeens on 20 Dec 2016

👎1

goddanao on 5 Jan 2017

👎1

For those who are interested, I have addressed this feature in https://github.com/julien6387/supvisors through the 'start_sequence' process' rule of the deployment file.

Supvisors is designed for distributed applications but "if you can move mountains you can move molehills".
It works with a single Supervisor instance.

julien6387 on 5 Jan 2017

👍4

GraphicsEmpire on 5 Jan 2017

👎1

afonsosilva91 on 25 Jan 2017

👎1

davidraleigh on 30 Jan 2017

👎1

livelace on 9 Feb 2017

👍3 👎2

keyndark on 22 Feb 2017

👍3 👎1

alexhudici on 23 Feb 2017

👍3 👎1

https://github.com/FAKERINHEART/supervisor
See the Introduction.txt and use the branch dev-3.3.1-sr1.
This branch adds dependency support among groups (programs) to supervisor, and is partly based on the code suggested by @vladfr.
If you find any bugs, please let me know.

FAKERINHEART on 24 Feb 2017

prapdm on 23 Mar 2017

👎1 👍1

Clamb94 on 6 Apr 2017

👎1 👍1

This issue has been open since 2012, does anyone have ownership over it? It's clearly important to many users; is there any consensus amongst supervisor contributors?

orrery on 17 Apr 2017

👍17

swapdavv on 3 May 2017

👍3 👎1

Constantin07 on 21 Jul 2017

👎2

bogdanm on 21 Aug 2017

👎3

This issue has been open since 2012 and it's getting kind of ridiculous that a ticket would remain open for this length of time.

orrery on 21 Aug 2017

👍26 👀1

+1...

greenaussie on 14 Sep 2017

👎3

ecebuzz on 24 Oct 2017

👎4 👍1

pickledgator on 24 Oct 2017

👎5

chezbut on 6 Dec 2017

👎1

FYI: replying to an issue just to say '+1' sends a notification to everyone subscribed to the issue. You can just react to the issue's initial comment with the 👍 emoji instead.

alecbz on 6 Dec 2017

👍11

@AlecBenzer

FYI: replying to an issue just to say '+1' sends a notification to everyone subscribed to the issue.

But that's kind of the point, isn't it? I'm subscribed to this issue, and I'm very interested to see that other people are interested as well. This issue is more than 5 years old. It is a good thing to actually ping the developers to gently remind them that people are still interested. It is also good for those who subscribed, to confirm that there are others who care too, not just that one person ;)

FYI: There is an "Unsubscribe" button to the right of the thread, if you do not like to receive the notification from this thread, this is perfectly fine, you can use this button to opt-out.

AndrewSav on 6 Dec 2017

👍15 👎1

I can confirm, that starting processes in a specific order is an important requirement. Sorry, I was not able to confirm it earlier. Will it be implemented now?

LMCom on 12 Dec 2017

Sharing my solution to this problem, your mileage may vary. While waiting for this to be fixed I transitioned most of the pieces of our infrastructure relying on this feature to Docker/Kubernetes. The various Docker orchestration frameworks (docker-compose, swarm, Kubernetes, etc) typically have mechanisms to control startup ordering (often with their own set of challenges). So if you are waiting for this to be fixed maybe use it as an excuse to play around with a simple docker-compose project to see if it provides an alternative approach to this problem. I recognize this may only apply to a small fraction of the users waiting for this to be fixed.

jkemp101 on 12 Dec 2017

Hi everyone

I've implemented an event plugin with support for starting services in order specified by dependent services.
Take a look at this example.

https://github.com/bendikro/supervisord-dependent-startup
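
As a rough illustration of what dependency-based ordering involves, here is a minimal sketch that derives a start order from a dependency map using the standard library (Python 3.9+). It is not the plugin's code; the DEPENDS_ON map and both function names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: program -> programs that must be up first.
DEPENDS_ON = {
    "celery": {"redis"},
    "celerybeat": {"celery"},
    "gunicorn": {"redis", "memcached"},
}


def startup_order(depends_on):
    """Return program names ordered so each one comes after its dependencies.

    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


def blocked_by(program, depends_on, running):
    """List the dependencies of `program` that are not running yet."""
    return sorted(set(depends_on.get(program, ())) - set(running))
```

With this map, startup_order(DEPENDS_ON) places redis before celery and celery before celerybeat, and blocked_by("gunicorn", DEPENDS_ON, {"redis"}) reports that memcached is still missing.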

bendikro on 2 Mar 2018

👍11

An alternative to @bendikro's awesome plugin could also be the Docker community's wait-for-it.sh script. It's a bit easier to set up if you have a complex automation system managing it, though it isn't as pretty as a native plugin:
https://github.com/vishnubob/wait-for-it
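
The same wait-for-a-port idea can be sketched in Python. This is not the wait-for-it.sh script itself; the function name and its default timeout and interval are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A wrapper script used as a program's command could call wait_for_port for the service it depends on and only then exec the real command.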

CoryPulm on 26 Mar 2018

Also, you can do a simple wait for certain processes like this:

# Wait until PHP FPM is up before starting, since supervisord likes to start everything at once...
# See
# Poll for the PHP FPM master process; the [p] keeps grep from matching itself
while [[ $(ps aux | grep "[p]hp-fpm: master process") == "" ]];
do
    echo "Waiting for PHP FPM to come up..."
    sleep 2
done
echo "PHP FPM looks to be started, continuing with Icinga daemon initialization"
ps aux | grep "[p]hp-fpm: master process"

I only have a few, so I settled for this method

mtdeguzis on 5 Nov 2018

Something I just realized: let's say I use supervisor to host my database and my server. The server service has autorestart=true, so as long as it fails to connect to the database it is just going to retry, right? So it works out.

DenLilleMand on 29 Dec 2018

CBEPX on 13 Feb 2019

zgdgod on 5 Aug 2019

eugene-kulak on 8 Oct 2019

Please click the thumbs up on the original post instead of just commenting "+1". Gets my hopes up when I see a notification :D

mtdeguzis on 8 Oct 2019

👍6

One option to solve this problem is to set autostart to false for the program in supervisor.conf, for example:

[program:test]
autostart=false
command=/home/user/test

The second step is to use a bash script to start the program through supervisorctl, for example:

gnome-terminal -e "supervisord"
gnome-terminal -e "supervisorctl"
sleep 10
gnome-terminal -e "supervisorctl start test"

*Thanks to Avner gidron for giving me the idea
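
The fixed sleep could also be replaced by asking supervisord itself to start the programs one after another. The following is a minimal sketch over supervisor's XML-RPC interface; it assumes an [inet_http_server] section listening on port 9001 and programs configured with autostart=false. The helper names and the URL are hypothetical.

```python
import xmlrpc.client

# States in which supervisord already treats a process as started.
ALREADY_UP = {"STARTING", "RUNNING", "BACKOFF"}


def start_in_order(supervisor, names):
    """Start each named program in turn; return the names this call started."""
    started = []
    for name in names:
        if supervisor.getProcessInfo(name)["statename"] in ALREADY_UP:
            continue  # nothing to do for this program
        # wait=True asks supervisord to return only once the process has started
        supervisor.startProcess(name, True)
        started.append(name)
    return started


def connect(url="http://localhost:9001/RPC2"):
    """Return the 'supervisor' RPC namespace; the URL is an assumption."""
    return xmlrpc.client.ServerProxy(url).supervisor
```

A call such as start_in_order(connect(), ["test"]) would then take the place of the sleep and the last supervisorctl line.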

AdiBrucker on 2 Feb 2020

Hi,
I have the following configuration in the supervisor.cnf file:

[program:zookeeper]
startsecs=60
directory= /app
command=/bin/bash -c "java -jar zoo.jar"
priority=1
autostart = true
autorestart = true

[program:kafka]
startsecs=60
directory= /app
command=/bin/bash -c "java -jar kaf.jar"
stdout_logfile=/var/log/supervisor/%(program_name)s.log
stderr_logfile=/var/log/supervisor/%(program_name)s.log
priority=999
autostart = true
autorestart = true

I want to start Zookeeper first, and then Kafka once Zookeeper is up. This configuration sometimes does not work as expected.
What would be the way to handle this through supervisor?
Please suggest.

kumarshorav11 on 28 Feb 2020

@kumarshorav11 it's not pretty, but what I do is have whatever needs to wait monitor for the other process to come up first. This is an example only.

httpd.conf

[program:httpd]
command=/opt/supervisor/httpd_supervisor
autorestart=true
startretries=3
# Start only after PHP FPM
priority=2
# redirect output to stdout/stderr and do not use a regular logfile
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0

/opt/supervisor/httpd_supervisor:

#!/bin/bash

# Wait until PHP FPM is up to start, since supervisord likes to start everything at once...
# See https://github.com/Supervisor/supervisor/issues/122

echo -e "\n==> Starting httpd"
# PHP FPM
# Poll for the PHP FPM master process; the [p] keeps grep from matching itself
while [[ $(ps aux | grep "[p]hp-fpm: master process") == "" ]];
do
    echo "Waiting for PHP FPM to come up to start httpd..."
    sleep 2
done
echo "PHP FPM looks to be started, continuing with Icinga daemon initialization"
ps aux | grep "[p]hp-fpm: master process"

# Another project had this, but we won't be using systemd
# service httpd start
/usr/sbin/httpd

# Allow any signal which would kill a process to stop server
trap "service httpd stop" HUP INT QUIT ABRT ALRM TERM TSTP

while pgrep -u apache httpd > /dev/null; do sleep 5; done

mtdeguzis on 2 Mar 2020

👍1

Another team with this problem here: we need to start an app only after dnsmasq has started correctly. If not, the app will resolve and cache the wrong DNS.

There is already https://github.com/jasoncorbett/ordered-startup-supervisord, so why is this not merged into the main supervisor? Is there any reason for this not to be fixed?

edit: another similar fork https://github.com/bendikro/supervisord-dependent-startup

danielmotaleite on 16 Mar 2020

👍6

Any elegant solution?

OrangePJ on 11 Jun 2020

I'm all for this feature

mrkeyiano on 26 Aug 2020

It has been fun tracking this issue for all these years, but I finally unsubscribed 😂

benma on 26 Aug 2020

😄15 👍2

In my case I just want to control the stopping order when I restart things manually, outside of supervisord's own control: A must stop before B and C.
A simple solution is to run the commands one by one in a shell script:
supervisorctl stop A:*
supervisorctl stop B:* C:*
supervisorctl start A:* B:* C:*
Or, if there are groups other than A, B and C, filter them out with grep like this:
supervisorctl status | grep -v -e '^A:' -e '^B:' -e '^C:' | awk -F':' '{print $1":*"}' | sort -u | xargs supervisorctl restart