Supervisor: "supervisorctl reload" fails to re-start program

Created on 25 May 2012 · 64 Comments · Source: Supervisor/supervisor

Running "supervisorctl reload" on a running supervisord instance causes the program to stop and not restart.

I've noticed this happening since upgrading from 3.0a10 to 3.0a12. I can provide more information if it's not reproducible.

All 64 comments

There were no changes related to this between 3.0a10 and 3.0a12 as far as I can tell. Do you have autostart=true configured for the program?

On Ubuntu 10.04, supervisorctl reload causes the supervisord process to fail. Subsequent attempts to run supervisorctl result in

unix:///var/run/supervisor.sock no such file

Or the more obscure

error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1

The supervisord process must be manually restarted with sudo supervisord.

This is really bad.

@leopd That sounds unrelated to this bug. The messages from supervisorctl are saying that it can't connect to supervisord. You'll have to check the supervisord log to see what happened. You might also want to run it in the foreground (supervisord -n) and see if it exits during reload.

@mnaberez Thanks for the debugging tip. I'm not sure why you think what I'm seeing is any different from what jbrehm originally reported. They sound identical to me. I'll happily open a new bug on this if you explain why the cases are different, but I see no reason to.

If I run supervisord in the foreground as you advised, running supervisorctl reload causes supervisord to crash with this:

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 81, in emit
    self.stream.write(msg)
ValueError: I/O operation on closed file
Traceback (most recent call last):
  File "/usr/local/bin/supervisord", line 9, in <module>
    load_entry_point('supervisor==3.0b1', 'console_scripts', 'supervisord')()
  File "/usr/local/lib/python2.6/dist-packages/supervisor/supervisord.py", line 360, in main
    go(options)
  File "/usr/local/lib/python2.6/dist-packages/supervisor/supervisord.py", line 370, in go
    d.main()
  File "/usr/local/lib/python2.6/dist-packages/supervisor/supervisord.py", line 77, in main
    info_messages)
  File "/usr/local/lib/python2.6/dist-packages/supervisor/options.py", line 1274, in make_logger
    self.logger.critical(msg)
  File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 313, in critical
    self.log(LevelsByName.CRIT, msg, **kw)
  File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 319, in log
    handler.emit(record)
  File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 214, in emit
    self.doRollover()
  File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 223, in doRollover
    if not (self.stream.tell() >= self.maxBytes):
ValueError: I/O operation on closed file

I'm running 3.0b1 if that matters.

... and the problem goes away if I revert to 3.0a10.

I'm not sure why you think what I'm seeing is any different from what jbrehm originally reported.

I could be wrong but I had interpreted the original report as a program running under supervisord does not restart, not that supervisord crashes.

and the problem goes away if I revert to 3.0a10.

Thanks for the backtrace and this additional info. This same crash was reported in #130. It looks like we introduced this bug in d2bc68561f96b3d8d523c751879429ead8e87894.

I'm having the same issue.

Here's a sample session:

$ sudo supervisorctl -c conf/supervisord.conf reload
Restarted supervisord
$ sudo supervisorctl -c conf/supervisord.conf status
unix:///tmp/supervisord.sock no such file

I'm also having this issue. Supervisord crashes when calling reload. Is the only fix to downgrade to 3.0a10?

@landreville The bug was introduced in 3.0b1. If you are going to downgrade, you probably want the previous version, which was 3.0a12.

+1 for this issue

same with sending sighup

supervisord -n -u nobody -c ../conf/supervisor.ini --pid=/tmp/supervisor.pid
2012-12-17 16:08:20,208 CRIT Set uid to user 65534
2012-12-17 16:08:20,211 INFO supervisord started with pid 5016
2012-12-17 16:08:21,215 INFO spawned: 'celeryd' with pid 5019
2012-12-17 16:08:31,556 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2012-12-17 16:08:34,444 WARN received SIGHUP indicating restart request
2012-12-17 16:08:34,444 INFO waiting for celeryd to die
2012-12-17 16:08:34,582 INFO stopped: celeryd (exit status 0)
Traceback (most recent call last):
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 81, in emit
    self.stream.write(msg)
ValueError: I/O operation on closed file
Traceback (most recent call last):
  File "/var/www/configtest/env/bin/supervisord", line 8, in <module>
    load_entry_point('supervisor==3.0b1', 'console_scripts', 'supervisord')()
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/supervisord.py", line 360, in main
    go(options)
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/supervisord.py", line 370, in go
    d.main()
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/supervisord.py", line 77, in main
    info_messages)
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/options.py", line 1274, in make_logger
    self.logger.critical(msg)
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 313, in critical
    self.log(LevelsByName.CRIT, msg, **kw)
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 319, in log
    handler.emit(record)
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 214, in emit
    self.doRollover()
  File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 223, in doRollover
    if not (self.stream.tell() >= self.maxBytes):
ValueError: I/O operation on closed file

Fixed in 25303d45eb75c980e97a5ea3eca307a464ad8e2b.

anyway to recover in this situation?

So is it really true that simply reloading config files on versions older than a month ago leaves you in a state where supervisor stops working?

This issue only occurs on the 3.0b1 version when the log rotation options are enabled. If you are using an earlier version or you are not using the log rotation options, this will not affect you.

anyway to recover in this situation?

Unfortunately, no. If supervisord exits unexpectedly, its subprocesses may be orphaned and they may keep running on their own for some time. However, they will have to be killed manually at some point before supervisord is restarted. If a new supervisord instance is started, it will not know about any processes that may have been orphaned by another instance.

The 25303d4 fix did not actually fix this issue for me. Same error.

I rescind my last comment. I'm able to reload now without the socket error using master (3.0b2-dev). The recent problem seems to have been caused by a new supervisor conf whose process would not start upon a supervisor reload. That caused the same-looking unix socket error and so threw me off, but once the process was fixed, reloads started working.

This bug still exists.

the bug exists for me, too.

on supervisorctl reload

Logfile says:

2013-06-01 13:58:54,697 INFO waiting for memmon, celerydb, celerycam, gunicorn to die
2013-06-01 13:58:54,698 INFO stopped: celerycam (terminated by SIGTERM)
2013-06-01 13:58:54,868 INFO stopped: gunicorn (exit status 0)
2013-06-01 13:58:54,871 INFO stopped: celerydb (exit status 0)
2013-06-01 13:58:54,871 INFO stopped: memmon (terminated by SIGTERM)

(No further entries)

supervisor exists in a virtualenv, version is supervisor==3.0b2
Python 2.7.3

Still exists on supervisor==3.0b2; maybe it was reintroduced? It might be related to a custom socket in the configuration file, but I don't really know.

We have been using supervisor==3.0b1 for about 6 months now, and only recently we started facing the issue described here. Specifically, we use Jenkins for our build/deployment purposes, driven by a fabric file. Here are the supervisor logs when started:

2013-06-23 14:03:10,623 INFO daemonizing the supervisord process
2013-06-23 14:03:10,624 INFO supervisord started with pid 24211
2013-06-23 14:03:11,627 INFO spawned: 'websocket' with pid 24212
2013-06-23 14:03:11,628 WARN received SIGHUP indicating restart request
2013-06-23 14:03:11,628 INFO waiting for websocket to die
2013-06-23 14:03:15,051 INFO waiting for websocket to die
2013-06-23 14:03:18,674 INFO waiting for websocket to die
2013-06-23 14:03:21,678 WARN killing 'websocket' (24212) with SIGKILL
2013-06-23 14:03:21,678 INFO waiting for websocket to die
2013-06-23 14:03:21,685 INFO stopped: websocket (terminated by SIGKILL)

As you can see, immediately after daemonizing and spawning processes, supervisor receives SIGHUP. I'm not entirely sure where the SIGHUP signal is being sent from.

Is this fixed with 3.0.b2? The changelog here doesn't indicate that.

The above comment doesn't sound like a bug. It sounds like something is sending supervisor a SIGHUP and supervisor is responding normally to that SIGHUP. You'll need to find out what is sending that signal; supervisor does not send signals to itself.

This particular issue report is a little out of control, I think. There seem to be at least four different issues being reported as if they were the same one.

I solved this problem by starting the service:

sudo service supervisord start

After this you can run:

sudo supervisorctl reload

@harph I'd rather try to avoid running it as root.

@aneumeier Well, assign permissions to a user to take care of this service.

@harph
It worked for me - great, thanks!

@harph I cannot agree - when the user can _start_ the service, why can't he _restart_ the service?

@aneumeier I didn't say that. What I said is that you should start the service before reloading it. I did it using sudo in the example, but if you assign a user the right permissions, they will be able to start and restart the service.

for me sudo service supervisor start works good :) Ubuntu 12.04 LTS

sudo service supervisor start
It also worked for me - very thanks!

@harph +1

:+1:
I removed Supervisor from the CentOS repository today and installed it using easy_install. supervisord works better than ever, but I also have the problem with supervisorctl.

For me the service supervisor start solution is not working. I also tried service supervisor restart, then start, stop, start... on and off... restart... nothing.

$ supervisorctl reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
$ supervisorctl restart foo
unix:///var/tmp/supervisor.sock no such file

supervisor-3.1.1-py2.6 on CentOS release 6.5 (Final)

Supervisorctl needs access to supervisord.conf in custom configuration.
Supervisorctl looks for supervisord.conf files in the following
locations (in this order):

etc/supervisord.conf
supervisord.conf
/etc/supervisord.conf

Try:

supervisorctl -c /path/to/the/supervisorctl.conf
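
The lookup described above can be sketched in Python to see which file would be picked up from the current directory. This is only an illustration: the candidate list is copied from this comment (real supervisor versions may check more locations), and find_config is a hypothetical helper, not part of supervisor.

```python
import os

# Search order as listed in the comment above; an assumption, since
# actual supervisor releases may look in additional places.
CANDIDATES = ["etc/supervisord.conf", "supervisord.conf", "/etc/supervisord.conf"]


def find_config(candidates=CANDIDATES, cwd=None):
    """Return the first candidate path that exists as a file, or None."""
    base = cwd or os.getcwd()
    for candidate in candidates:
        # Relative candidates are resolved against the working directory.
        path = candidate if os.path.isabs(candidate) else os.path.join(base, candidate)
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    found = find_config()
    print(found or "no supervisord.conf found; pass -c explicitly")
```

If this prints a different file than the one supervisord was started with, passing the same -c path to both commands avoids the mismatch.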

Thank you.
File is at /etc/supervisord.conf

[supervisord]
http_port=/var/tmp/supervisor.sock ; (default is to run a UNIX domain socket server)
logfile=/var/log/supervisor/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB       ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10          ; (num of main logfile rotation backups;default 10)
loglevel=debug              ; (logging level;default info; others: debug,warn)
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
minfds=1024                 ; (min. avail startup file descriptors;default 1024)
minprocs=200                ; (min. avail process descriptors;default 200)
user=root                   ; (default is current user, required if root)

[supervisorctl]
serverurl=unix:///var/tmp/supervisor.sock ; use a unix:// URL  for a unix socket
username=root               ; should be same as http_username if set
password=[mypassword]       ; should be same as http_password if set

[include]
files = /var/www/websites/supervisor/*.conf

Running the command:

# supervisorctl -c /etc/supervisord.conf
unix:///var/tmp/supervisor.sock no such file

And the file is indeed not there. Supervisord works and all the daemons are online and working.

Supervisord was not started using this configuration file, then. You
can start supervisord with that configuration file the same way:

supervisord -c /etc/supervisord.conf

We should take this to the maillist, I didn't realize we weren't on it..
this is far afield of the OP's bug report.

It must be the configuration file that is loaded, for two reasons:

  1. It's the only file I have:

locate supervisord.conf
/etc/supervisord.conf
/etc/supervisord.conf.backup
/etc/supervisord.conf.rpmsave

  2. The include from /var/www/websites/supervisor/*.conf is working, and there is no chance that this folder is set in another default or old file, as I only set it in this file today.

But anyway, I've tried:

# service supervisord stop
Shutting down supervisord:                                 [  OK  ]
# supervisord -c /etc/supervisord.conf
# supervisorctl reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
# supervisorctl restart foo
unix:///var/tmp/supervisor.sock no such file

Thank you again. Do you want me to post again to the maillist? though it's similar to some of the errors others reported here.

It's totally unrelated to the original post, so yeah.

But anyway, since we've polluted it this far, try:

supervisord -c /etc/supervisord.conf
supervisorctl -c /etc/supervisord.conf reload
supervisorctl -c /etc/supervisord.conf restart foo

Thank you.

# supervisord -c /etc/supervisord.conf
# supervisorctl -c /etc/supervisord.conf reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
# supervisorctl -c /etc/supervisord.conf restart foo
unix:///var/tmp/supervisor.sock no such file

I've also tried to locate supervisor.sock

# updatedb
# locate supervisor.sock
# 

No such file.

No clue. Let's definitely stop talking about this here. I'd suggest
taking it to the maillist.

I am running into the same issue as @ET-CS: I cannot find the sock file. I'm using Vagrant to spin up an Ubuntu VM and adding supervisor support using

sudo supervisord -c /vagrant/scripts/supervisord.conf

i get the same errors as posted above .. sock missing

same problem for 3.0b2 in ubuntu 14.04, reboot won't fix

error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224

I found the reason:

stdout_logfile_maxbytes=100M

should be

stdout_logfile_maxbytes=100MB
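
A size value like this can be checked before it ever reaches supervisord. The sketch below is a hypothetical validator, not supervisor's own parser; it assumes the accepted suffixes are KB, MB, and GB (or a bare number of bytes), which is why "100M" is rejected.

```python
import re

# Assumed grammar: digits, optionally followed by KB, MB, or GB.
_SIZE_RE = re.compile(r"^(\d+)(KB|MB|GB)?$", re.IGNORECASE)
_MULTIPLIERS = {None: 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(value):
    """Convert a size string such as '100MB' to bytes, or raise ValueError."""
    match = _SIZE_RE.match(value.strip())
    if match is None:
        raise ValueError("invalid size %r: use a number with KB, MB, or GB" % value)
    number, suffix = match.groups()
    return int(number) * _MULTIPLIERS[suffix.upper() if suffix else None]


if __name__ == "__main__":
    for candidate in ("100MB", "100M"):
        try:
            print(candidate, "->", parse_size(candidate))
        except ValueError as exc:
            print(candidate, "->", exc)
```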

I had a similar issue.
My problem was that the path to the log file was not created or accessible.

i ran into the same issue. i killed supervisord and tried to restart it using

supervisord -c /path/to/supervisord.conf

and ran into the following error

Error: Another program is already listening on a port that one of our HTTP servers is configured to use.  Shut this program down first before starting supervisord.

so deleted the supervisor socket at /var/run and reran

supervisord -c /path/to/supervisord.conf

which worked. hope this helps

@harph good work! It's useful for me! Thanks!

Just run supervisorctl -c /etc/supervisor/supervisord.conf
Otherwise supervisorctl cannot find the config file.

supervisord -n helped me find the detailed error info, thanks.

I got the

unix:///var/run/supervisor.sock no such file

error when trying to access supervisorctl. To solve this you need to make sure that the supervisor service is running:

sudo service supervisor status

will tell you if that's your issue, if you get that it's not running, simply start it with

sudo service supervisor start

I have found that when running supervisor in a Vagrant VM, the service will not autostart for some reason, necessitating a manual service start.

Found this error as well on my Raspberry Pi with updated Wheezy.

sudo supervisorctl reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224

Reading above I did:

sudo supervisord
Error: The directory named as part of the path /var/log/supervisor/supervisord.log does not exist.

Which showed an issue with logging. Indeed, I removed * from /var/log/ a day or so back.
I recreated the directory, and after another sudo supervisord, all good.

Thanks @JayH5, it worked for me

I had the same issue of the .sock file not being found. @mistermoe's suggestions helped me solve the issue.

$ sudo supervisorctl reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224
$ sudo supervisorctl status
unix:///var/run/supervisor.sock no such file

After restarting supervisorctl, this issue was solved!

change
[supervisord]
;;;;;;http_port=/var/tmp/supervisor.sock ; (default is to run a UNIX domain socket server)
http_port=127.0.0.1:9001 ; (alternately, ip_address:port specifies AF_INET)

to
[supervisord]
http_port=/var/tmp/supervisor.sock ; (default is to run a UNIX domain socket server)
;;;;;;http_port=127.0.0.1:9001 ; (alternately, ip_address:port specifies AF_INET)

solved my problem

If someone still has the issue:

Create the file /etc/supervisord.conf and add the following configuration
(please note: change the user name accordingly)

[unix_http_server]
file = /tmp/supervisor.sock
chmod = 0777
chown= mywebsiteuser:mywebsiteusergroup


[supervisord]
logfile = /tmp/supervisord.log
logfile_maxbytes = 50MB
logfile_backups=10
loglevel = info
pidfile = /tmp/supervisord.pid
nodaemon = false
minfds = 1024
minprocs = 200
umask = 022
user = mywebsiteuser
identifier = supervisor
directory = /tmp
nocleanup = true
childlogdir = /tmp
strip_ansi = false

[supervisorctl]
##serverurl = unix:///tmp/supervisor.sock
serverurl = http://localhost:9001

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /home/mywebsiteuser/public_html/artisan queue:work database --sleep=3 --tries=3 --daemon
autostart=true
autorestart=true
user=mywebsiteuser
numprocs=8
redirect_stderr=true
stdout_logfile=/home/mywebsiteuser/worker.log

Save and exit the editor.

Find the existing supervisord process and kill it

pgrep -fl supervisord

Run the following commands and you should have your worker running

supervisord -c /etc/supervisord.conf
supervisorctl -c /etc/supervisord.conf reload
supervisorctl -c /etc/supervisord.conf start laravel-worker:*

thanks @omsobliga

Hi guys!
I have a problem like this.
I followed this link: https://serversforhackers.com/monitoring-processes-with-supervisord, but when I run the command "supervisorctl reread" this message appears:
"error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 228"
What exactly does it mean?
Thank you.

Having this issue using puppet

Error: /Stage[main]/Supervisor/Service[supervisor]: Failed to call refresh: Could not restart Service[supervisor]: Execution of 'supervisorctl reload' returned 2: error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224
Error: /Stage[main]/Supervisor/Service[supervisor]: Could not restart Service[supervisor]: Execution of 'supervisorctl reload' returned 2: error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224

Maybe No such file or directory has something to do with the issue?

The solution is to

  1. Make sure supervisord is running. Use your OS's service manager (start, /etc/init.d, service etc) to start it
  2. Then reload with supervisorctl.

supervisorctl talks to the supervisord daemon. If the daemon isn't running, the tool has nothing to send instructions to: when it tries to connect, it can't find the socket file.
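
To tell "daemon not running" apart from the stale-socket case mentioned earlier in the thread, the socket can be probed directly. This is a minimal sketch, assuming a UNIX domain socket such as /var/run/supervisor.sock; check_socket and its three return labels are made up for illustration and are not part of supervisor.

```python
import os
import socket


def check_socket(path):
    """Classify a UNIX socket path as 'missing', 'stale', or 'listening'."""
    if not os.path.exists(path):
        # Nothing at this path: supervisord was never started with this
        # configuration, or the file was cleaned up.
        return "missing"
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
    except OSError:
        # The path exists but nothing accepts connections on it.
        return "stale"
    finally:
        client.close()
    return "listening"


if __name__ == "__main__":
    # The path below is one of the socket locations quoted in this thread.
    print(check_socket("/var/run/supervisor.sock"))
```

A result of "missing" points at starting supervisord (or fixing the serverurl path), while "stale" suggests a leftover socket file from a supervisord that died.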

This has been an issue that's plagued us from time to time on some of our older servers. For rhel/centos, check this bug report. In short, the supervisor 3.0-1.el7 release places sock file at /var/tmp/supervisor.sock, which is cleaned up after 30 days. This is why, for us, the issue appeared intermittent. Fixed by this commit in the 3.1.3 release.

@harph Thanks!
sudo service supervisord start
it works fine!

This comment helps me a lot, @efusionsoft , thank you.

But I have some improvements for your solution.
1) When I launched supervisor with efusionsoft's config, I got an error:

Please check that the [rpcinterface:supervisor] section is enabled in the configuration file (see sample.conf).

If you have the same error, add this code after the [supervisord] section:

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

2) serverurl = http://localhost:9001 does not work for me, and I don't have any idea why. Fortunately serverurl = unix:///tmp/supervisor.sock works fine for me.
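
Both points can be caught before launching by checking which sections a config file contains. The following is a rough sketch using Python's configparser; the list of sections treated as required is an assumption drawn from the configs in this thread, and missing_sections is a hypothetical helper rather than anything supervisor ships.

```python
import configparser

# Sections the working config in this thread relies on (an assumption,
# not an official list of mandatory sections).
EXPECTED = ("supervisord", "supervisorctl", "rpcinterface:supervisor")
SERVERS = {"unix_http_server", "inet_http_server"}


def missing_sections(config_text):
    """Return the names of expected sections absent from the config text."""
    parser = configparser.ConfigParser(
        interpolation=None,            # leave %(program_name)s untouched
        inline_comment_prefixes=(";",),
        strict=False,
    )
    parser.read_string(config_text)
    present = set(parser.sections())
    missing = [name for name in EXPECTED if name not in present]
    # supervisorctl needs some server section to connect to.
    if not present & SERVERS:
        missing.append("unix_http_server or inet_http_server")
    return missing


if __name__ == "__main__":
    sample = "[supervisord]\nlogfile = /tmp/supervisord.log\n"
    print(missing_sections(sample))
```

A config that sets serverurl to an http:// address but has no [inet_http_server] section would show up here as having no matching server to talk to.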

Here is full version of my /etc/supervisord.conf:

[unix_http_server]
file  = /tmp/supervisor.sock
chmod = 0777
chown = mywebsiteuser:mywebsiteusergroup

[supervisord]
logfile = /tmp/supervisord.log
logfile_maxbytes = 50MB
logfile_backups=10
loglevel = info
pidfile = /tmp/supervisord.pid
nodaemon = false
minfds = 1024
minprocs = 200
umask = 022
user = mywebsiteuser
identifier = supervisor
directory = /tmp
nocleanup = true
childlogdir = /tmp
strip_ansi = false

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl = unix:///tmp/supervisor.sock
##serverurl = http://127.0.0.1:9001

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /home/mywebsiteuser/public_html/artisan queue:work database --sleep=3 --tries=3 --daemon
autostart=true
autorestart=true
user=mywebsiteuser
numprocs=1
redirect_stderr=true
stdout_logfile=/home/mywebsiteuser/worker.log

After saving config:
1) find all running supervisor instances:
pgrep -fl supervisord
2) kill all running supervisor processes
sudo kill {your process PID}
3) launch supervisor with new config:
supervisord -c /etc/supervisord.conf
4) check the supervisor status
supervisorctl -c /etc/supervisord.conf status
5) Enjoy!

This worked for me

sudo systemctl start supervisor
sudo systemctl enable supervisor
sudo supervisorctl reload

Why not add an appropriate error message saying that supervisord is not running, instead of:

error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib64/python2.7/socket.py line: 224

My problem was that worker's log directory did not exist, just like @YAmikep's. Creating the directory solved the problem and supervisor successfully launched and started the workers.
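
Since several reports in this thread came down to a missing log directory, a quick pre-flight check may help. This is a sketch only: missing_log_dirs is a hypothetical helper, it inspects just the logfile, stdout_logfile, and stderr_logfile keys, and it does not expand expressions such as %(program_name)s or follow [include] files.

```python
import configparser
import os

LOG_KEYS = ("logfile", "stdout_logfile", "stderr_logfile")


def missing_log_dirs(config_text):
    """Return (section, key, directory) for log paths whose directory is absent."""
    parser = configparser.ConfigParser(
        interpolation=None,            # keep %(...)s expressions as literal text
        inline_comment_prefixes=(";",),
        strict=False,
    )
    parser.read_string(config_text)
    missing = []
    for section in parser.sections():
        for key in LOG_KEYS:
            value = parser.get(section, key, fallback=None)
            if not value or not value.startswith("/"):
                continue  # skip unset keys and non-path values such as AUTO or NONE
            directory = os.path.dirname(value)
            if not os.path.isdir(directory):
                missing.append((section, key, directory))
    return missing


if __name__ == "__main__":
    with open("/etc/supervisord.conf") as handle:  # path taken from this thread
        for section, key, directory in missing_log_dirs(handle.read()):
            print("%s: %s points into missing directory %s" % (section, key, directory))
```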

This is how I solved the unix:///tmp/supervisor.sock no such file problem on RHEL

ps -ef | grep supervisord # get supervisord PID
kill -s SIGTERM 1234 # where 1234 is the PID
ps -ef | grep supervisord # check that supervisord is killed
supervisord # restart the daemon

I had the same problem. After I ran supervisord, the log showed that the problem came from an old service that supervisor was trying to start. I had deleted that program's folder some time ago, but not its entry in the supervisor conf.d/ folder, so supervisor was trying to start a program that didn't exist, and this caused the problem.
