Supervisor: Allow a custom "stop-command" to manage graceful shutdown of processes

Created on 6 Sep 2012 · 57 Comments · Source: Supervisor/supervisor

I am trying to use supervisord to manage instances of Play Framework applications. The framework supplies commands for starting and stopping instances, which allow an application to shut down properly and plugins to execute their lifecycle events. An example is the proper closing of a Lucene search index, which cannot be opened again after an ungraceful shutdown because a "write-lock" file remains in the index directory.

A great addition would be the ability to specify something like this:

[program:my-play-web-app]
command=/opt/play-1.2.5/play start
stop-command=/opt/play-1.2.5/play stop
directory=/path/to/home-of-play-web-app/

Is there any means that I can do this with supervisord right now? The only workaround I have found is the following:

[program:my-play-web-app]
command=/path/to/home-of-play-web-app/startup.sh
directory=/path/to/home-of-play-web-app/

With startup.sh being:

trap "{ echo Stopping play app; /opt/play-1.2.5/play stop; exit 0; }" EXIT
echo Starting play app
/opt/play-1.2.5/play start

(taken from here: http://stackoverflow.com/questions/7732371/how-to-properly-manage-rabbitmq-with-supervisord )

grandfatha

👍21

All 57 comments

I'd like to see this, too. It seems to be a common Java daemon problem. I can't stop my Wowza instances gracefully.

kraiz on 27 Sep 2012

hckjck on 9 Aug 2013

I just published a post that may provide a workaround (for Play 2): "How to make Play Framework 2 work with Supervisord and Monit". Not sure it will work in your case, though.

FGRibreau on 2 Sep 2013

genworld on 11 Jan 2014

dzwicker on 5 Feb 2014

dustinlacewell on 25 Mar 2014

kaaloo on 29 Apr 2014

tylerhenthorn on 6 May 2014

brianclements on 10 May 2014

eropple on 3 Jun 2014

I would also like the option of having a command sent to the process directly as if through fg mode.

So for example, given a game server that requires a /stop command to cleanly shut down and save the server, I'd like to be able to do the following in my config file:

[program:my-awesome-server]
command=/some/dir/start-awesome-server.sh
fg-stop-command=/stop

where /stop is the command I would normally enter manually while in supervisorctl fg <process>
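
Supervisor has no fg-stop-command option, but a similar effect can be approximated with a wrapper that owns the server's stdin. The following is only a minimal sketch, assuming the server reads its commands line by line from stdin; run_with_stdin_stop is a hypothetical helper name, not part of supervisor.

```shell
#!/bin/sh
# Hypothetical helper (not part of supervisor):
#   run_with_stdin_stop START_CMD STOP_LINE
# Runs START_CMD with its stdin connected to a FIFO and, when the wrapper
# receives TERM or INT, writes STOP_LINE to that stdin.
run_with_stdin_stop() {
    start_cmd=$1
    stop_line=$2

    # A private FIFO that becomes the child's stdin.
    fifo_dir=$(mktemp -d)
    fifo=$fifo_dir/stdin
    mkfifo "$fifo"

    sh -c "$start_cmd" < "$fifo" &
    child=$!

    # Keep the write end open on fd 3 so the child does not see EOF early.
    exec 3> "$fifo"

    # supervisord's stop signal arrives here; translate it into the command.
    trap 'printf "%s\n" "$stop_line" >&3' TERM INT

    # wait returns early when a trapped signal arrives, so loop until the
    # child has really exited.
    while kill -0 "$child" 2>/dev/null; do
        wait "$child" || :
    done

    trap - TERM INT
    exec 3>&-
    rm -rf "$fifo_dir"
}

# Example, using the path from the config above:
#   run_with_stdin_stop /some/dir/start-awesome-server.sh /stop
```

The wrapper script, rather than the server itself, would then be what command= points to, so that supervisor's signal reaches the trap.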

JohannesMP on 21 Jun 2014

foyo23 on 22 Jun 2014

mihneagiurgea on 24 Jun 2014

+1
Is there some reason this still hasn't been addressed?

tomislacker on 25 Jun 2014

Probably because nobody's done it. I'm sure they take pull requests.

eropple on 25 Jun 2014

+1 for sure.

rogthefrog on 25 Jun 2014

+1
This would be very useful when supervisor has to manage Docker containers that are started from a bash script, since sending a kill signal to the bash process does not stop the Docker container.

miki725 on 25 Jul 2014

👍3

pankajtakawale on 11 Aug 2014

eduardo-matos on 18 Aug 2014

For better or worse, supervisor requires that the processes it starts remain direct children of the supervisord process (see http://supervisord.org/subprocess.html#nondaemonizing-of-subprocesses), and that sending some signal to that child process will stop it cleanly. You can specify which signal supervisor should send to cleanly shut the process down (see stopsignal in http://supervisord.org/configuration.html#program-x-section-settings), but supporting an entirely different model for stopping a process than sending a signal to a direct child process is out of scope for Supervisor. As a result, the workaround in the original issue post (the bash program which, in this case, acts as a sort of proxy, using trap to handle the signal) is "correct" for some definition of correct. There may be a "more correct" way to start the program (play in this case), one which would start it in such a way that sending it a signal like SIGTERM would allow it to shut down cleanly (as described in http://blog.fgribreau.com/2013/09/how-to-make-play-framework-2-work-with.html, linked to by @FGRibreau above), but the workaround is reasonable if there isn't.

As a result, I'm afraid I'm going to need to close this issue. There's close to zero chance of supervisor supporting a different mechanism for signaling to a program that it should shut down other than sending an actual UNIX signal to a direct child process.

My personal opinion, which colors the above: if the program cannot be shut down gracefully using a UNIX signal, it's arguably broken, as run-of-the-mill commands like "kill" won't work against it either. And that's not really sane.
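
The proxy approach described above can be sketched as a small wrapper that stays a direct child of supervisord, receives the stop signal (stopsignal, TERM by default), and translates it into the vendor stop command. This is only an illustration under stated assumptions: run_with_stop_hook is a hypothetical helper name, and the start command is assumed to stay in the foreground.

```shell
#!/bin/sh
# Hypothetical helper (not part of supervisor):
#   run_with_stop_hook START_CMD STOP_CMD
# Runs START_CMD as a child of this wrapper and runs STOP_CMD when the
# wrapper receives TERM or INT.
run_with_stop_hook() {
    start_cmd=$1
    stop_cmd=$2

    # Start the real program; it is assumed to stay in the foreground.
    sh -c "$start_cmd" &
    child=$!

    # supervisord's stop signal lands here; run the vendor stop command.
    trap 'sh -c "$stop_cmd"' TERM INT

    # wait returns early when a trapped signal arrives, so loop until the
    # child has really exited.
    while kill -0 "$child" 2>/dev/null; do
        wait "$child" || :
    done
    trap - TERM INT
}

# Example with the commands from the original post, assuming the start
# command does not daemonize:
#   run_with_stop_hook "/opt/play-1.2.5/play start" "/opt/play-1.2.5/play stop"
```

Using wait instead of a sleep loop lets the trap fire as soon as the signal arrives, and the wrapper exits only after the program itself has exited.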

mcdonc on 18 Aug 2014

👍11 👎8 😕2

silvax on 20 Nov 2014

Thank you for the excellent explanation of why it works that way. The "stop" command usually just organizes a graceful shutdown and runs some application level shutdown hooks and user code. If that is being done when simply sending a signal to the process, then I agree there is no need for a change.

However there might be a lot of other programs that don't do this that people might want to run with supervisor. Instead of changing the process mechanism of supervisor, would it be feasible to have a "pre-shutdown" command? Instead of having custom control of the stop procedure and taking that control away from supervisor, just a means of reacting to that event assuming the user program does not correctly handle the unix signal.

grandfatha on 21 Nov 2014

👍2

I'm afraid that's not in the cards either as it would add a good deal of complexity to the shutdown code for processes (executing the command, logging its output, coping with a command that does not exit). Sorry.

mcdonc on 21 Nov 2014

Well, essentially you could treat it just like you treat the actual process right now. But it makes no sense to discuss, since you have good reasons why it is not a good idea and I am not stepping up with a PR, so why even bother. Back to init.d scripts then.

grandfatha on 21 Nov 2014

👍1

The normal way to handle this need is to use a wrapper script around your daemon which would receive supervisor's signals and issue meaningful instructions to the underlying daemon.

israelshirk on 5 Dec 2014

Using some sort of vendor stop command that likely relies on a pid file is definitely a step backwards. A better way to achieve the desired result would be sending 'Ctrl-C' or 'Ctrl-\' via STDIN as described in #555.

dtoubelis on 14 Jan 2015

lifeofguenter on 7 Apr 2016

ghost on 13 May 2016

jkryanchou on 24 May 2016

ghost on 1 Jun 2016

Still not clear to me why it's so hard to issue a stop command when a supervised process stops for any reason. I have a perfect example of why it would be useful:

I have a Raspberry Pi bittorrent client. It just sits there and downloads and seeds various torrents via Transmission (bittorrent client). However, it also runs OpenVPN and I don't want it doing any torrent-related activity unless OpenVPN is running and connected.

Ideally I would set up Supervisor to keep OpenVPN running, but in the case that OpenVPN ever crashed or supervisor needed to restart it for any reason, after it crashes it would immediately run the bash command:

# transmission-remote --torrents all --stop

In order to stop all the torrent activity. And then I would start them again once OpenVPN makes a connection.

Simply being able to specify a bash command to run when a process stops or crashes would be extremely useful.
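
Within supervisor's signal-only model, the behaviour requested above can be approximated with a wrapper that runs the program, waits for it, and then runs a cleanup command however the program ended. A minimal sketch follows; run_then_cleanup is a hypothetical helper name and the OpenVPN config path in the example is made up. The hook cannot run if the wrapper itself is killed with SIGKILL.

```shell
#!/bin/sh
# Hypothetical helper (not part of supervisor):
#   run_then_cleanup MAIN_CMD CLEANUP_CMD
# MAIN_CMD must be a single simple command, because it is exec'd so that
# signals sent to the child reach the program itself.
run_then_cleanup() {
    main_cmd=$1
    cleanup_cmd=$2

    sh -c "exec $main_cmd" &
    child=$!

    # Forward supervisord's stop signal to the program.
    trap 'kill -TERM "$child" 2>/dev/null || :' TERM INT

    # wait returns early when a trapped signal arrives, so loop until the
    # program has really exited.
    while kill -0 "$child" 2>/dev/null; do
        wait "$child" || :
    done
    trap - TERM INT

    # The program is gone (clean exit, crash, or requested stop): run the hook.
    sh -c "$cleanup_cmd"
}

# Example (the config path is hypothetical; the cleanup command is the one
# quoted above):
#   run_then_cleanup "openvpn --config /etc/openvpn/client.conf" \
#       "transmission-remote --torrents all --stop"
```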

Jakobud on 2 Jun 2016

Systemd supports a custom stop command if you'd wanna use that.

LeifW on 2 Jun 2016

shuaiming on 2 Jun 2016

infinityhacks on 11 Jul 2016

Any reason this was closed? I don't see this feature yet.

shatil on 8 Sep 2016

@shatil read up ^^^^

It's just a design decision to use signals and only a single child process per application/process. This avoids all sorts of situations like accidental fork-bombing with kill processes and other unexpected behaviors that happen in systemd and similar methodologies. If the process you're running doesn't handle that, use a Bash script and trap to catch the signal and run a custom command.

israelshirk on 8 Sep 2016

👍1

Also, some applications start one process, spawn another, and drop privileges. Sending a kill signal to the one started by supervisord doesn't do anything useful. A shining example is mysqld_safe on Alpine Linux, which is easily stopped using mysqladmin shutdown, but not supervisorctl stop mysqld.

shatil on 8 Sep 2016

supervisor has different design goals and it is likely the wrong tool to use with those kinds of applications. It actually requires applications to be designed in a specific way rather than accommodating everything in the universe. Instead of forcing a square peg into a round hole, you may want to try something like daemontools, runit, systemd, etc.

dtoubelis on 8 Sep 2016

😕23 👍3 😄2

That's a shame, honestly. This feature is essential for any service management software.

Oloremo on 7 Aug 2017

👍9

Five years later, still no "stop-command" to manage graceful shutdown of processes, that's a shame, honestly.

blunt1973 on 28 Sep 2017

👍14

UnR34L on 1 Dec 2017

You are asking for supervisor to be more like systemd or init but why don't you use systemd or init instead and leave supervisor alone? The absence of stop command is what makes supervisor so awesome. My vote is for "no stop command ever"!

dtoubelis on 1 Dec 2017

😄1

The absence of stop command is what makes supervisor so awesome.

How?..

Oloremo on 1 Dec 2017

In order to implement a stop command, one would need to keep track of the process pid. This is usually done by storing a pid file somewhere in the file system. This creates at least two problems: 1) the pid file can be deleted (intentionally, unintentionally, or through fs corruption); 2) the pid file is not deleted upon a system crash or ungraceful shutdown (i.e. a pull-the-plug event), and another process may start with the same pid after reboot. These are rare circumstances, but I have experienced all of them more times than I ever expected.

The only reliable way to monitor the process state is by forking it from a parent (supervisor) and using kernel facilities to control/monitor the child.
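
The difference between the two approaches can be illustrated in a few lines of shell. This is a toy demonstration rather than supervisor's actual implementation: a parent that forked the child learns its fate directly from the kernel via wait, with no pid file involved.

```shell
#!/bin/sh
# Start a child that "crashes" with exit status 3.
sh -c 'exit 3' &
child=$!

# The parent asks the kernel for the fate of its own child. No pid file is
# involved, so there is nothing stale to read and no pid to confuse with an
# unrelated process.
status=0
wait "$child" || status=$?
echo "child $child exited with status $status"

# By contrast, a pid-file check such as: kill -0 "$(cat app.pid)"
# (app.pid is a hypothetical file) only shows that some process currently
# has that number, not that it is the process that was started.
```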

The start/stop/pid approach works in 99% of cases and is suitable for generic applications where failures are a common occurrence and not a big deal. However, there are applications out there where 99% is not good enough. Think of industrial automation, robots, car controls, flight controls, etc.

There are very few tools that do process control in this way (daemontools, runit, systemd, supervisor, inittab, pm2), there are even fewer that do it right, and supervisor is one of the best. Not supporting unreliable process control mechanisms, having a small footprint, and having a clear design are what set supervisor apart from the rest, and I want it to stay this way.

dtoubelis on 1 Dec 2017

In order to implement stop command one would need to keep track of process pid.

Hm... no. In order to implement a stop command, one would need to add it as an optional alternative to the send_signal logic. A custom stop command may not use the pid at all.

The only real problem is what to do if the stop command doesn't stop the process. I'd suggest falling back to SIGKILL here.
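
The fallback suggested above mirrors what supervisor already does for signals, where stopsignal is followed by SIGKILL once stopwaitsecs has elapsed. A standalone sketch of that logic follows, assuming a hypothetical helper named stop_with_fallback. It polls by pid purely for illustration; it is not a patch to supervisor, which would presumably track its own child instead.

```shell
#!/bin/sh
# Hypothetical helper (not part of supervisor):
#   stop_with_fallback PID STOP_CMD TIMEOUT_SECONDS
# Runs STOP_CMD, waits up to TIMEOUT_SECONDS for PID to disappear, and sends
# SIGKILL if it is still there. Returns 0 on a graceful stop, 1 if it had
# to escalate.
stop_with_fallback() {
    pid=$1
    stop_cmd=$2
    timeout=$3

    # Ask nicely first; the stop command's own exit status is ignored.
    sh -c "$stop_cmd" || :

    # Poll once per second until the process is gone or the timeout expires.
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null || :
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (the mysqladmin command is the one mentioned earlier in this thread):
#   stop_with_fallback "$pid" "mysqladmin -u root shutdown" 10
```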

Oloremo on 1 Dec 2017

👍1

Ugh... I have to use init.d don't I.

NHuebner1983 on 9 Mar 2018

👎1

tahayk on 21 May 2018

hermes-pimentel on 25 May 2018

cough
+1

However - awesome work, thanks for maintaining this package, and I respect the maintainers' design decisions.

florianm on 13 Jul 2018

👍5

The fun thing is, supervisord uses stop-command by itself:
https://github.com/Supervisor/initscripts/blob/master/centos-systemd-etcs
:)

astorath on 24 Aug 2018

😄2

For more than 3 years, people have asked to make it possible to stop applications with a command or script, but nothing has changed. It's sad.

rirmak on 19 Sep 2018

👍4

@rirmak http://www.lmgtfy.com/?q=bash+handle+signal

The Linux Journal one has copy pasta.

israelshirk on 19 Sep 2018

bash+handle+signal

Supervisor 3.3.0:
stopsignal
The signal used to kill the program when a stop is requested. This can be any of TERM, HUP, INT, QUIT, KILL, USR1, or USR2.

rirmak on 19 Sep 2018

👍2

justinas-kazanavicius on 4 Dec 2018

Also, some applications start one process, spawn another, and drop privileges. Sending a kill signal to the one started by supervisord doesn't do anything useful. A shining example is mysqld_safe on Alpine Linux, which is easily stopped using mysqladmin shutdown, but not supervisorctl stop mysqld.

I had the problem with supervisor and mysql. I solved that with this small bash script:

#!/bin/bash

# Start script (that would ordinarily have been entered into supervisor config file)
/usr/bin/mysqld_safe &

# Stop script
stop_script() {
    mysqladmin -u root shutdown
    exit 0
}
# Wait for supervisor to stop script
trap stop_script SIGINT SIGTERM

while true
do
    sleep 1
done

johnbrannstrom on 14 Jan 2019

👍1

I've been looking for a solution for a similar problem.

I have a PHP app that runs RabbitMQ consumers, started by supervisor. These run on EC2/Docker.
Once I initiate a scale-down or redeploy with the latest code, the EC2 instance/Docker container gets destroyed. Once this happens, supervisor/RabbitMQ does not let the process detach as a connection/consumer from the queues it's connected to. Thus I have lots of orphaned consumers/connections on my queues.

I need a graceful shutdown of the consumer processes, so I found a workaround in the meantime. I set up a Python script on Jenkins to check for draining targets on a load balancer. This lets me see the instances/containers that are being destroyed; once I see them, the script executes a supervisor stop command via ssh against the IP addresses obtained from the AWS CLI.

This script runs whenever I deploy new code and new instances are booted up. With auto scaling, however, there is not yet a way to trigger these jobs, so for now I'm forced to run a periodic cron job every 15 minutes or so to look for draining targets. This catches most of the instances/containers as they are destroyed by auto scaling.

I hope someone here can use this as a workaround too.

I will let you guys know if I have better solutions/workarounds.