Supervisor: Running `supervisorctl restart <name>` causes xmlrpclib.Fault

Created on 9 Nov 2011 · 20 Comments · Source: Supervisor/supervisor

Sometimes (for some reason not all the time), running sudo supervisorctl restart someservice returns:

<class 'xmlrpclib.Fault'>, <Fault 6: 'SHUTDOWN_STATE'>: file: /usr/lib/python2.6/xmlrpclib.py line: 838

In some odd cases this kills supervisord completely, printing "unix:///var/run/supervisor.sock no such file" whenever I run a command in supervisorctl manually. /etc/init.d/supervisor restart does not fix it; instead I need /etc/init.d/supervisor stop && /etc/init.d/supervisor start to get it running again.

I'm using supervisor 3.0a8 on Ubuntu 10.04.3 LTS (Kernel 2.6.32-21-server) 64bit.

Any idea how I can make the restart command not kill my supervisor and actually work?

ojii

👍8

All 20 comments

The SHUTDOWN_STATE fault will be returned to supervisorctl when any command is issued but supervisord is no longer accepting commands because it is in the process of either shutting itself down or restarting itself. It is normal to encounter this after the supervisorctl shutdown or supervisorctl reload commands are issued.

> In some odd cases this kills supervisord completely, printing "unix:///var/run/supervisor.sock no such file" whenever I run a command in supervisorctl manually

It is normal to see this in supervisorctl if you have configured it to communicate with supervisord via a domain socket and the file has gone away. This usually means that supervisord has exited.

Since you first see the SHUTDOWN_STATE fault and then supervisorctl can't connect at all, it sounds like supervisord was in the process of shutting itself down and then finally was shut down.

> Any idea how I can make the restart command not kill my supervisor ... ?

I'm not aware of any code path where restarting a subprocess with supervisorctl restart <name> would trigger supervisord to start shutting itself down.

Could it be that the supervisorctl shutdown or supervisorctl reload commands were issued?

mnaberez on 9 Nov 2011

👍1

yes indeed, just before calling supervisorctl restart I issue supervisorctl reload. This is all part of our automated deployment. I've added a 3 second sleep between the two and that works most of the time, but not all the time.

ojii on 9 Nov 2011
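
A fixed 3 second sleep is a guess about how long the reload takes, which may be why it only works most of the time. One alternative is to retry until supervisord accepts commands again. The following is a minimal sketch, not part of supervisor: `retry_while_shutting_down` is a hypothetical helper, and `call` stands for whatever function your deployment uses to talk to supervisord over XML-RPC. It assumes the fault code 6 shown in the error output above, and treats a connection error as "the socket is not back yet".

```python
import time
import xmlrpc.client

SHUTDOWN_STATE = 6  # fault code from the error output quoted in this thread


def retry_while_shutting_down(call, timeout=120.0, interval=1.0,
                              sleep=time.sleep, clock=time.monotonic):
    """Call `call()` until it succeeds or `timeout` seconds have passed.

    Only two conditions are retried, because both mean "not ready yet":
    the SHUTDOWN_STATE fault, and a connection error while the socket
    file is missing. Any other fault is re-raised immediately.
    """
    deadline = clock() + timeout
    while True:
        try:
            return call()
        except xmlrpc.client.Fault as fault:
            if fault.faultCode != SHUTDOWN_STATE:
                raise  # a different fault: do not hide it
        except OSError:
            pass  # socket missing or connection refused during the restart
        if clock() >= deadline:
            raise TimeoutError("supervisord did not become ready in time")
        sleep(interval)
```

The `timeout` should be longer than the slowest shutdown you expect, otherwise the helper gives up while supervisord is still waiting for its child processes.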

Was there any development on this issue? I currently have the same problem.

When I run the command manually it works every time. When it's my deployment script it faults every time.

h3 on 10 Oct 2013

👍1

Just in case anybody stumbles onto this: I got the same error, but it was because the filesystem had been remounted read-only after a filesystem error.

friedcell on 14 Apr 2014

solution, kill process!

ps -aux|grep supervisor
kill -9 the_pid
supervisord
supervisorctl

lokielse on 21 Oct 2014

👍19 ❤5 😕4

Can we get a less-useless error message reported by supervisorctl?

Perhaps something like:

Supervisor is restarting, please wait a few minutes or find and kill the supervisor process manually.

nhooey on 22 Jul 2015

👍9

> solution, kill process!
>
> ps -aux|grep supervisor
> kill -9 the_pid

kill -9 will orphan any child processes running under supervisord and is not recommended. If you have issued the supervisorctl shutdown command but supervisord has not exited, it is almost always because it is waiting for its child processes to exit. The log will have messages during shutdown.

mnaberez on 22 Jul 2015

👍3

Could you put that explanation in the error log so it's clear why supervisorctl won't work as expected?

nhooey on 22 Jul 2015

It's already there. If supervisord is blocking shutdown waiting for child processes to die, it prints "waiting for processname to die" in the log at regular intervals.

mnaberez on 22 Jul 2015

It's there, in the logs, but not in the console output from supervisorctl.

Here's what it looks like:

# supervisorctl status
<class 'xmlrpclib.Fault'>, <Fault 6: 'SHUTDOWN_STATE'>: file: /usr/lib/python2.6/xmlrpclib.py line: 838

I'm saying that it should print something informative instead of just a cryptic error code and code reference. You could check the log for more information, but the tool itself should say things that make sense to the user.

nhooey on 22 Jul 2015

👍2
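
As an illustration of the request above, here is a minimal sketch of how a client could turn the fault into a readable message. This is not supervisorctl's code: `describe_fault` and the `HINTS` table are hypothetical names, and the wording of the hint is only a suggestion based on the explanations given in this thread.

```python
import xmlrpc.client

# Hypothetical mapping from fault code to a human-readable hint.
# Code 6 is the SHUTDOWN_STATE fault shown in the output above.
HINTS = {
    6: ("supervisord is shutting down or restarting and is not accepting "
        "commands. Check its log for 'waiting for ... to die' messages, "
        "then retry once it has finished."),
}


def describe_fault(fault):
    """Return a readable message for an xmlrpc.client.Fault."""
    hint = HINTS.get(fault.faultCode)
    if hint is not None:
        return hint
    # Unknown fault: fall back to the raw code and string.
    return "supervisord returned fault %d: %s" % (
        fault.faultCode, fault.faultString)
```

A caller would wrap its XML-RPC request in `try`/`except xmlrpc.client.Fault` and print `describe_fault(fault)` instead of the raw traceback line.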

That sounds like a pretty reasonable addition to supervisorctl to me. I'll leave this issue open, feel free to open a pull request.

mnaberez on 22 Jul 2015

👍1

I ran into this issue just now. It occurs after running supervisorctl reload, and I have to kill supervisord to solve the problem.

eromoe on 23 Jul 2015

> I ran into this issue just now. It occurs after running supervisorctl reload, and I have to kill supervisord to solve the problem.

Instead of killing supervisord, look at its log for information about why it is blocking shutdown and address that root problem.

If you see messages in the log like waiting for processname to die, you may need to change the signal sent to stop the process (stopsignal), or the amount of time supervisord waits before resorting to sending SIGKILL to the process (stopwaitsecs).

Example: if stopsignal=TERM but the process doesn't exit after it receives SIGTERM, and stopwaitsecs=90, then supervisord is going to block shutdown for a full 90 seconds while it waits for the process to exit.

mnaberez on 23 Jul 2015

👍2
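
To make the timing in that example concrete, the sketch below mimics the behaviour described: send the stop signal, wait up to `stopwaitsecs`, and only then fall back to SIGKILL. It is an illustration written with the standard `subprocess` module, not supervisord's implementation; `stop_with_escalation` is a hypothetical name, and the code assumes a POSIX system.

```python
import signal
import subprocess


def stop_with_escalation(proc, stopsignal=signal.SIGTERM, stopwaitsecs=10.0):
    """Stop `proc` (a subprocess.Popen) the way the comment above describes.

    Sends `stopsignal`, waits up to `stopwaitsecs` seconds for the process
    to exit, and sends SIGKILL if it is still running after that.
    Returns True if SIGKILL was needed, False if the process exited on
    its own.
    """
    proc.send_signal(stopsignal)
    try:
        proc.wait(timeout=stopwaitsecs)
        return False
    except subprocess.TimeoutExpired:
        # The caller is blocked for the full stopwaitsecs before this runs.
        proc.kill()
        proc.wait()
        return True
```

With a process that ignores SIGTERM and `stopwaitsecs=90`, the call blocks for the full 90 seconds before SIGKILL is sent, which matches the delay described above. Choosing a `stopsignal` the process actually handles, or a shorter `stopwaitsecs`, shortens that wait.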

Killing supervisord and restarting it works for me.

ghost on 7 Sep 2017

Hi guys, I noticed a similar issue and filed it here: https://github.com/Supervisor/supervisor/issues/1041

acannon828 on 11 Jan 2018

I guess that issue was closed as a duplicate, so let me repeat what I think are the relevant parts here:

I'm pretty sure this is what is happening, though I don't follow enough of the XMLRPC code to verify it, can anyone else?

1. The supervisord process shuts down
2. Meanwhile, a new supervisord process is started, with a new XMLRPC server
3. The shutdown signal from (1) is sent to the new XMLRPC server, shutting down the new supervisord process

This seems like a race condition that supervisor should guard against.

acannon828 on 11 Jan 2018

supervisorctl shutdown commands supervisord to shut down and it then returns immediately. It doesn't currently have a way to wait until supervisord has actually exited. This issue is still open because it probably should.

As described above, the SHUTDOWN_STATE error is returned if supervisord receives any more commands after it has already been commanded to shut down. It is refusing to process any more commands because it is in the middle of shutting down.

A workaround is to insert a time delay after you run supervisorctl shutdown (or supervisorctl reload) and before you try another command. It may take supervisord several seconds or even minutes to shut down, depending on the largest value of stopwaitsecs in the config file. You must wait longer than that to ensure it has fully shut down or reloaded.

mnaberez on 11 Jan 2018
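
To pick a delay for the workaround above, a deployment script could read the largest stopwaitsecs from the config file. The following is a rough sketch under several assumptions: `max_stopwaitsecs` is a hypothetical helper, it only looks at `[program:x]` sections, it does not follow `[include]` files, and it uses 10 seconds (supervisor's documented default) for programs that do not set the option.

```python
import configparser

DEFAULT_STOPWAITSECS = 10  # default used when a program omits stopwaitsecs


def max_stopwaitsecs(config_text):
    """Return the largest stopwaitsecs across [program:x] sections.

    Returns 0 when the config defines no programs.
    """
    parser = configparser.ConfigParser(
        interpolation=None,              # leave %(...)s expansions untouched
        inline_comment_prefixes=(";",),  # allow "value ; comment" lines
        strict=False,
    )
    parser.read_string(config_text)
    waits = [
        parser.getint(section, "stopwaitsecs", fallback=DEFAULT_STOPWAITSECS)
        for section in parser.sections()
        if section.startswith("program:")
    ]
    return max(waits, default=0)
```

The result is a lower bound for the delay, not an exact shutdown time, so a script should wait somewhat longer than the value returned.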

FYI: I get this without a shutdown or reload :(

DylanYoung on 10 Oct 2019

After exploring a bit, I can see that it only comes immediately after sending a HUP to one of the managed programs (gunicorn), very strange.

DylanYoung on 15 Oct 2019

systemctl restart supervisord.service
supervisorctl status

ykfq on 9 Jul 2020
