Running "supervisorctl reload" on a running supervisord instance causes the program to stop and not restart.
I've noticed this happening since upgrading from 3.0a10 to 3.0a12. I can provide more information if it's not reproducible.
There were no changes related to this between 3.0a10 and 3.0a12 as far as I can tell. Do you have autostart=true configured for the program?
On Ubuntu10.04, supervisorctl reload causes the supervisord process to fail. Subsequent attempts to run supervisorctl result in
unix:///var/run/supervisor.sock no such file
Or the more obscure
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
The supervisord process must be manually restarted with sudo supervisord.
This is really bad.
@leopd That sounds unrelated to this bug. The messages from supervisorctl are saying that it can't connect to supervisord. You'll have to check the supervisord log to see what happened. You might also want to run it in the foreground (supervisord -n) and see if it exits during reload.
@mnaberez Thanks for the debugging tip. I'm not sure why you think what I'm seeing is any different from what jbrehm originally reported. They sound identical to me. I'll happily open a new bug on this if you explain why the cases are different, but I see no reason to.
If I run supervisord in the foreground as you advised, running supervisorctl reload causes supervisord to crash with this:
Traceback (most recent call last):
File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 81, in emit
self.stream.write(msg)
ValueError: I/O operation on closed file
Traceback (most recent call last):
File "/usr/local/bin/supervisord", line 9, in <module>
load_entry_point('supervisor==3.0b1', 'console_scripts', 'supervisord')()
File "/usr/local/lib/python2.6/dist-packages/supervisor/supervisord.py", line 360, in main
go(options)
File "/usr/local/lib/python2.6/dist-packages/supervisor/supervisord.py", line 370, in go
d.main()
File "/usr/local/lib/python2.6/dist-packages/supervisor/supervisord.py", line 77, in main
info_messages)
File "/usr/local/lib/python2.6/dist-packages/supervisor/options.py", line 1274, in make_logger
self.logger.critical(msg)
File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 313, in critical
self.log(LevelsByName.CRIT, msg, **kw)
File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 319, in log
handler.emit(record)
File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 214, in emit
self.doRollover()
File "/usr/local/lib/python2.6/dist-packages/supervisor/loggers.py", line 223, in doRollover
if not (self.stream.tell() >= self.maxBytes):
ValueError: I/O operation on closed file
I'm running 3.0b1 if that matters.
... and the problem goes away if I revert to 3.0a10.
I'm not sure why you think what I'm seeing is any different from what jbrehm originally reported.
I could be wrong but I had interpreted the original report as a program running under supervisord does not restart, not that supervisord crashes.
and the problem goes away if I revert to 3.0a10.
Thanks for the backtrace and this additional info. This same crash was reported in #130. It looks like we introduced this bug in d2bc68561f96b3d8d523c751879429ead8e87894.
I'm having the same issue.
Here's a sample session:
$ sudo supervisorctl -c conf/supervisord.conf reload
Restarted supervisord
$ sudo supervisorctl -c conf/supervisord.conf status
unix:///tmp/supervisord.sock no such file
I'm also having this issue. Supervisord crashes when calling reload. Is the only fix to downgrade to 3.0a10?
@landreville The bug was introduced in 3.0b1. If you are going to downgrade, you probably want the previous version, which was 3.0a12.
+1 for this issue
same with sending sighup
supervisord -n -u nobody -c ../conf/supervisor.ini --pid=/tmp/supervisor.pid
2012-12-17 16:08:20,208 CRIT Set uid to user 65534
2012-12-17 16:08:20,211 INFO supervisord started with pid 5016
2012-12-17 16:08:21,215 INFO spawned: 'celeryd' with pid 5019
2012-12-17 16:08:31,556 INFO success: celeryd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2012-12-17 16:08:34,444 WARN received SIGHUP indicating restart request
2012-12-17 16:08:34,444 INFO waiting for celeryd to die
2012-12-17 16:08:34,582 INFO stopped: celeryd (exit status 0)
Traceback (most recent call last):
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 81, in emit
self.stream.write(msg)
ValueError: I/O operation on closed file
Traceback (most recent call last):
File "/var/www/configtest/env/bin/supervisord", line 8, in
load_entry_point('supervisor==3.0b1', 'console_scripts', 'supervisord')()
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/supervisord.py", line 360, in main
go(options)
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/supervisord.py", line 370, in go
d.main()
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/supervisord.py", line 77, in main
info_messages)
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/options.py", line 1274, in make_logger
self.logger.critical(msg)
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 313, in critical
self.log(LevelsByName.CRIT, msg, **kw)
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 319, in log
handler.emit(record)
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 214, in emit
self.doRollover()
File "/var/www/configtest/env/lib/python2.7/site-packages/supervisor/loggers.py", line 223, in doRollover
if not (self.stream.tell() >= self.maxBytes):
ValueError: I/O operation on closed file
Fixed in 25303d45eb75c980e97a5ea3eca307a464ad8e2b.
anyway to recover in this situation?
So is it really true that simply reloading config files on versions older than a month ago
leaves you in a state where supervisor stops working?
This issue only occurs on the 3.0b1 version when the log rotation options are enabled. If you are using an earlier version or you are not using the log rotation options, this will not affect you.
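The tracebacks earlier in this thread end in a rollover check that calls tell() on a stream that has already been closed. A minimal sketch of that failure mode, using hypothetical function names and an in-memory stream, is below. It only illustrates why the ValueError appears; it is not supervisor's code and not necessarily how commit 25303d4 fixed it.

```python
import io


def rollover_check(stream, max_bytes):
    # Same shape as the check in the traceback: ask the stream for its
    # position and compare it with the rotation threshold.
    # Raises ValueError if the stream has been closed.
    return stream.tell() >= max_bytes


def safe_rollover_check(stream, max_bytes):
    # Hypothetical defensive variant: a closed stream cannot need rotation.
    if stream.closed:
        return False
    return stream.tell() >= max_bytes


if __name__ == "__main__":
    log = io.StringIO()
    log.write("some log output")
    log.close()  # stands in for the log being closed during a reload
    try:
        rollover_check(log, 10)
    except ValueError as exc:
        print("unguarded check failed:", exc)
    print("guarded check:", safe_rollover_check(log, 10))
```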
anyway to recover in this situation?
Unfortunately, no. If supervisord exits unexpectedly, its subprocesses may be orphaned and they may keep running on their own for some time. However, they will have to be killed manually at some point before supervisord is restarted. If a new supervisord instance is started, it will not know about any processes that may have been orphaned by another instance.
The 25303d4 fix did not actually fix this issue for me. Same error.
I rescind my last comment. I'm able to reload now without the socket error using master (3.0b2-dev). The recent problem seems to have been caused by a new supervisor conf whose process would not start upon a supervisor reload. That caused the same-looking unix socket error and so threw me off, but once the process was fixed, reloads started working.
This bug still exists.
the bug exists for me, too.
on supervisorctl reload
Logfile says:
2013-06-01 13:58:54,697 INFO waiting for memmon, celerydb, celerycam, gunicorn to die
2013-06-01 13:58:54,698 INFO stopped: celerycam (terminated by SIGTERM)
2013-06-01 13:58:54,868 INFO stopped: gunicorn (exit status 0)
2013-06-01 13:58:54,871 INFO stopped: celerydb (exit status 0)
2013-06-01 13:58:54,871 INFO stopped: memmon (terminated by SIGTERM)
(No further entries)
supervisor exists in a virtualenv, version is supervisor==3.0b2
Python 2.7.3
Still exists on supervisor==3.0b2, maybe it was re-introduced ? It might be related to custom socket in the configuration file, I don't know really.
We have been using supervisor==3.0b1 for about 6 months now, and only recently did we start facing the issue described here. Specifically, we use Jenkins for our build and deployment, driven by a fabric file. Here are the supervisor logs at startup:
2013-06-23 14:03:10,623 INFO daemonizing the supervisord process
2013-06-23 14:03:10,624 INFO supervisord started with pid 24211
2013-06-23 14:03:11,627 INFO spawned: 'websocket' with pid 24212
2013-06-23 14:03:11,628 WARN received SIGHUP indicating restart request
2013-06-23 14:03:11,628 INFO waiting for websocket to die
2013-06-23 14:03:15,051 INFO waiting for websocket to die
2013-06-23 14:03:18,674 INFO waiting for websocket to die
2013-06-23 14:03:21,678 WARN killing 'websocket' (24212) with SIGKILL
2013-06-23 14:03:21,678 INFO waiting for websocket to die
2013-06-23 14:03:21,685 INFO stopped: websocket (terminated by SIGKILL)
As you can see, immediately after daemonizing and spawning processes, supervisor receives SIGHUP. I'm not entirely sure where the SIGHUP signal sent to the supervisor process is coming from.
Is this fixed with 3.0.b2? The changelog here doesn't indicate that.
The above comment doesn't sound like a bug. It sounds like something is sending supervisor a SIGHUP and supervisor is responding normally to that SIGHUP. You'll need to find out what is sending that signal; supervisor does not send signals to itself.
This particular issue report is a little out of control, I think. There seem to be at least four different issues being reported as if they were the same one.
I solved this problem by starting the service:
sudo service supervisord start
After this you can run:
sudo supervisorctl reload
@harph I'll rather try and avoid running it as root.
@aneumeier well, assign permissions to a user to take care of this service.
@harph
It worked for me - great thanks!
@harph I cannot agree - when the user can _start_ the service, why can't he _restart_ the service?
@aneumeier I didn't say that. What I said is that you should start the service before reloading it. I did it using sudo on the example but if you assign to a user the right permissions, he will be able to start and restart the service.
for me sudo service supervisor start works good :) Ubuntu 12.04 LTS
sudo service supervisor start
It also worked for me - very thanks!
@harph +1
:+1:
Today I removed Supervisor from the CentOS repository and installed it using easy_install. supervisord works better than ever, but I still have the problem with supervisorctl.
For me the service supervisor start solution is not working. I also tried service supervisor restart, then start, stop, start... on and off... restart... nothing.
$ supervisorctl reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
$ supervisorctl restart foo
unix:///var/tmp/supervisor.sock no such file
supervisor-3.1.1-py2.6 on CentOS release 6.5 (Final)
On 08/18/2014 05:00 PM, Etay Cohen-Solal wrote:
I've removed today Supervisor from CentOS repository and installed using
easy_install. supervisord works better then ever. but I also have the
problem with supervisorctl.
Supervisorctl needs access to supervisord.conf in custom configuration.
Supervisorctl looks for supervisord.conf files in the following
locations (in this order):
etc/supervisord.conf
supervisord.conf
/etc/supervisord.conf
Try:
supervisorctl -c /path/to/the/supervisorctl.conf
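The search order described above can be sketched as a small lookup that returns the first candidate that exists. The candidate list is copied from the comment and the function name is hypothetical; the actual list may differ between supervisor versions, so treat this as an illustration of "first match wins" rather than supervisorctl's real logic.

```python
import os

# Candidate locations as listed in the comment above. Relative entries are
# resolved against the working directory; this list is an assumption taken
# from the thread, not read from supervisor's source.
CANDIDATES = ["etc/supervisord.conf", "supervisord.conf", "/etc/supervisord.conf"]


def find_config(candidates=CANDIDATES, cwd=None):
    """Return the first existing candidate path, or None if none exists."""
    base = cwd if cwd is not None else os.getcwd()
    for candidate in candidates:
        # os.path.join ignores `base` when `candidate` is absolute.
        path = os.path.join(base, candidate)
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    found = find_config()
    print(found if found else "no config found; pass one explicitly with -c")
```

If this kind of lookup returns a different file than the one supervisord was started with, the two tools will disagree about the socket path, which is why passing -c to both is the safer habit.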
Thank you.
File is at /etc/supervisord.conf
[supervisord]
http_port=/var/tmp/supervisor.sock ; (default is to run a UNIX domain socket server)
logfile=/var/log/supervisor/supervisord.log ; (main log file;default $CWD/supervisord.log)
logfile_maxbytes=50MB ; (max main logfile bytes b4 rotation;default 50MB)
logfile_backups=10 ; (num of main logfile rotation backups;default 10)
loglevel=debug ; (logging level;default info; others: debug,warn)
pidfile=/var/run/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
minfds=1024 ; (min. avail startup file descriptors;default 1024)
minprocs=200 ; (min. avail process descriptors;default 200)
user=root ; (default is current user, required if root)
[supervisorctl]
serverurl=unix:///var/tmp/supervisor.sock ; use a unix:// URL for a unix socket
username=root ; should be same as http_username if set
password=[mypassword] ; should be same as http_password if set
[include]
files = /var/www/websites/supervisor/*.conf
Running the command:
# supervisorctl -c /etc/supervisord.conf
unix:///var/tmp/supervisor.sock no such file
And the file is indeed not there. Supervisord works and all the daemons are online and working.
On 08/18/2014 05:48 PM, Etay Cohen-Solal wrote:
[include]
files = /var/www/websites/supervisor/*.conf
Running the command:
# supervisorctl -c /etc/supervisord.conf
unix:///var/tmp/supervisor.sock no such file
And the file is indeed not there. Supervisord works and all the daemons
are online and working.
Supervisord was not started using this configuration file, then. You
can start supervisord with that configuration file the same way:
supervisord -c /etc/supervisord.conf
We should take this to the maillist, I didn't realize we weren't on it..
this is far afield of the OP's bug report.
It must be the configuration file that is loaded, for two reasons:
It's the only file I have
locate supervisord.conf
/etc/supervisord.conf
/etc/supervisord.conf.backup
/etc/supervisord.conf.rpmsave
/var/www/websites/supervisor/*.conf is working, and there is no chance that this folder is set in another default or old file, as I only set it in this file today. But anyway, I've tried:
# service supervisord stop
Shutting down supervisord: [ OK ]
# supervisord -c /etc/supervisord.conf
# supervisorctl reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
# supervisorctl restart foo
unix:///var/tmp/supervisor.sock no such file
Thank you again. Do you want me to post again to the maillist? though it's similar to some of the errors others reported here.
It's totally unrelated to the original post, so yeah.
But anyway, since we've polluted it this far, try:
supervisord -c /etc/supervisord.conf
supervisorctl -c /etc/supervisord.conf reload
supervisorctl -c /etc/supervisord.conf restart foo
Thank you.
# supervisord -c /etc/supervisord.conf
# supervisorctl -c /etc/supervisord.conf reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
# supervisorctl -c /etc/supervisord.conf restart foo
unix:///var/tmp/supervisor.sock no such file
I've also tried to locate supervisor.sock
# updatedb
# locate supervisor.sock
#
No such file.
On 08/18/2014 06:09 PM, Etay Cohen-Solal wrote:
Thank you.
# supervisord -c /etc/supervisord.conf
# supervisorctl -c /etc/supervisord.conf reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
# supervisorctl -c /etc/supervisord.conf restart foo
unix:///var/tmp/supervisor.sock no such file
I've also tried to locate supervisor.sock
No clue. Let's definitely stop talking about this here. I'd suggest
taking it to the maillist.
I am running into the same issue as @ET-CS
can not find the sock file
using vagrant to spin up a ubuntu vm
and using
sudo supervisord -c /vagrant/scripts/supervisord.conf
i get the same errors as posted above .. sock missing
same problem for 3.0b2 in ubuntu 14.04, reboot won't fix
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224
I found the reason:
stdout_logfile_maxbytes=100M
should be
stdout_logfile_maxbytes=100MB
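To see why "100M" and "100MB" can behave differently, here is a simplified stand-in for a size parser that only recognises two-letter suffixes. The suffix table and function name are assumptions made for illustration; this is not supervisor's actual parser, only a sketch consistent with the fix reported above.

```python
# Assumed suffix table: only KB, MB and GB are recognised (case-insensitive).
SUFFIXES = {"kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_byte_size(value):
    """Convert strings such as '50MB' to a byte count.

    Raises ValueError for anything that is neither a plain integer nor an
    integer followed by a recognised suffix (for example '100M').
    """
    text = value.strip().lower()
    for suffix, multiplier in SUFFIXES.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * multiplier
    return int(text)


if __name__ == "__main__":
    print(parse_byte_size("100MB"))
    try:
        parse_byte_size("100M")
    except ValueError as exc:
        print("rejected:", exc)
```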
I had a similar issue.
My problem was that the path to the log file was not created or accessible.
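A quick pre-flight check for this case might look like the sketch below: given the log file paths from your configuration, it reports the parent directories that are missing or not writable by the current user. The function name and the example paths are hypothetical.

```python
import os


def missing_log_dirs(logfile_paths):
    """Return parent directories that do not exist or are not writable."""
    problems = []
    for path in logfile_paths:
        directory = os.path.dirname(os.path.abspath(path))
        if not os.path.isdir(directory) or not os.access(directory, os.W_OK):
            problems.append(directory)
    return problems


if __name__ == "__main__":
    # Example paths only; substitute the logfile and stdout_logfile values
    # from your own configuration.
    for directory in missing_log_dirs(["/var/log/supervisor/supervisord.log"]):
        print("create or fix permissions on:", directory)
```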
i ran into the same issue. i killed supervisord and tried to restart it using
supervisord -c /path/to/supervisord.conf
and ran into the following error
Error: Another program is already listening on a port that one of our HTTP servers is configured to use. Shut this program down first before starting supervisord.
so deleted the supervisor socket at /var/run and reran
supervisord -c /path/to/supervisord.conf
which worked. hope this helps
@harph good work! It's useful for me! Thanks!
Just run supervisorctl -c /etc/supervisor/supervisord.conf
supervisorctl can not find config file
supervisord -n helped me find the detailed error info, thanks.
I got the
unix:///var/run/supervisor.sock no such file
error when trying to access supervisorctl. To solve this you need to make sure that the supervisor service is running:
sudo service supervisor status
will tell you if that's your issue, if you get that it's not running, simply start it with
sudo service supervisor start
I have found that when running supervisor in a Vagrant VM, the service will not autostart for some reason, necessitating a manual service start.
Found this error as well on my Raspberry Pi with updated Wheezy.
sudo supervisorctl reload
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224
Reading above I did:
sudo supervisord
Error: The directory named as part of the path /var/log/supervisor/supervisord.log does not exist.
Which showed an issue with logging. Indeed, I removed * from /var/log/ a day or so back.
I recreated the directory, and after another sudo supervisord, all good.
Thanks @JayH5, it worked for me
I had the same issue of the .sock file not being found. @mistermoe's suggestions helped me solve the issue.
$ sudo supervisorctl reload
error:
$ sudo supervisorctl status
unix:///var/run/supervisor.sock no such file
After restarting supervisorctl, the issue was solved!
change
[supervisord]
;;;;;;http_port=/var/tmp/supervisor.sock ; (default is to run a UNIX domain socket server)
http_port=127.0.0.1:9001 ; (alternately, ip_address:port specifies AF_INET)
to
[supervisord]
http_port=/var/tmp/supervisor.sock ; (default is to run a UNIX domain socket server)
;;;;;;http_port=127.0.0.1:9001 ; (alternately, ip_address:port specifies AF_INET)
solved my problem
If someone still has the issue:
create the file /etc/supervisord.conf and add the following configuration
(Please note: change the user name accordingly)
[unix_http_server]
file = /tmp/supervisor.sock
chmod = 0777
chown= mywebsiteuser:mywebsiteusergroup
[supervisord]
logfile = /tmp/supervisord.log
logfile_maxbytes = 50MB
logfile_backups=10
loglevel = info
pidfile = /tmp/supervisord.pid
nodaemon = false
minfds = 1024
minprocs = 200
umask = 022
user = mywebsiteuser
identifier = supervisor
directory = /tmp
nocleanup = true
childlogdir = /tmp
strip_ansi = false
[supervisorctl]
##serverurl = unix:///tmp/supervisor.sock
serverurl = http://localhost:9001
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /home/mywebsiteuser/public_html/artisan queue:work database --sleep=3 --tries=3 --daemon
autostart=true
autorestart=true
user=mywebsiteuser
numprocs=8
redirect_stderr=true
stdout_logfile=/home/mywebsiteuser/worker.log
Save and exit the editor.
Find the existing supervisord process and kill it
pgrep -fl supervisord
Run the following commands and you should have your worker running
supervisord -c /etc/supervisord.conf
supervisorctl -c /etc/supervisord.conf reload
supervisorctl -c /etc/supervisord.conf start laravel-worker:*
thanks @omsobliga
Hi guys!
I have a problem like this.
I followed this link: https://serversforhackers.com/monitoring-processes-with-supervisord, but when I run the command "supervisorctl reread" this message appears:
"error:
What exactly does it mean?
Thank you.
Having this issue using puppet
Error: /Stage[main]/Supervisor/Service[supervisor]: Failed to call refresh: Could not restart Service[supervisor]: Execution of 'supervisorctl reload' returned 2: error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224
Error: /Stage[main]/Supervisor/Service[supervisor]: Could not restart Service[supervisor]: Execution of 'supervisorctl reload' returned 2: error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib/python2.7/socket.py line: 224
Maybe No such file or directory has something to do with the issue?
The solution is to
supervisorctl talks to the supervisord daemon. If the daemon isn't running, this tool can't instruct it to do anything: when it tries to connect, it can't find the socket file.
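The two situations seen in this thread (no socket file at all, or a socket file left behind with nothing listening on it) can be told apart with a small probe like the sketch below. The function name and return values are hypothetical and the example path is only one of the paths mentioned above; this is a diagnostic aid, not part of supervisor.

```python
import os
import socket


def probe_unix_socket(path):
    """Classify a UNIX socket path as 'missing', 'stale' or 'listening'."""
    if not os.path.exists(path):
        # Matches the "no such file" message: the daemon is probably not
        # running, or it was started with a different configuration.
        return "missing"
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
    except ConnectionRefusedError:
        # The file exists but nothing accepts connections on it.
        return "stale"
    else:
        return "listening"
    finally:
        client.close()


if __name__ == "__main__":
    print(probe_unix_socket("/var/run/supervisor.sock"))
```

A "missing" result points at starting supervisord (or pointing supervisorctl at the right config); a "stale" result matches the earlier report where deleting the leftover socket file allowed supervisord to start again.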
This has been an issue that's plagued us from time to time on some of our older servers. For rhel/centos, check this bug report. In short, the supervisor 3.0-1.el7 release places sock file at /var/tmp/supervisor.sock, which is cleaned up after 30 days. This is why, for us, the issue appeared intermittent. Fixed by this commit in the 3.1.3 release.
@harph Thanks!
sudo service supervisord start
it works fine!
This comment helps me a lot, @efusionsoft , thank you.
But I have some improvements for your solution.
1) When I launched supervisor with efusionsoft's config, I got an error:
Please check that the [rpcinterface:supervisor] section is enabled in the configuration file (see sample.conf).
If you have the same error, add this code after the [supervisord] section:
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
2) serverurl = http://localhost:9001 does not work for me, and I don't have any idea why. Fortunately, serverurl = unix:///tmp/supervisor.sock works fine for me.
Here is full version of my /etc/supervisord.conf:
[unix_http_server]
file = /tmp/supervisor.sock
chmod = 0777
chown = mywebsiteuser:mywebsiteusergroup
[supervisord]
logfile = /tmp/supervisord.log
logfile_maxbytes = 50MB
logfile_backups=10
loglevel = info
pidfile = /tmp/supervisord.pid
nodaemon = false
minfds = 1024
minprocs = 200
umask = 022
user = mywebsiteuser
identifier = supervisor
directory = /tmp
nocleanup = true
childlogdir = /tmp
strip_ansi = false
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[supervisorctl]
serverurl = unix:///tmp/supervisor.sock
##serverurl = http://127.0.0.1:9001
[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /home/mywebsiteuser/public_html/artisan queue:work database --sleep=3 --tries=3 --daemon
autostart=true
autorestart=true
user=mywebsiteuser
numprocs=1
redirect_stderr=true
stdout_logfile=/home/mywebsiteuser/worker.log
After saving config:
1) find all running supervisor instances:
pgrep -fl supervisord
2) kill all running supervisor processes
sudo kill {your process PID}
3) launch supervisor with new config:
supervisord -c /etc/supervisord.conf
4) check the supervisor status
supervisorctl -c /etc/supervisord.conf status
5) Enjoy!
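One likely reason the http://localhost:9001 serverurl failed with these configs: an HTTP serverurl needs a matching [inet_http_server] section on the daemon side, and the configs posted here define only [unix_http_server]. The sketch below checks that pairing with Python's configparser. The function name and the returned messages are hypothetical, and it deliberately does not compare ports or credentials.

```python
import configparser


def check_ctl_matches_server(conf_text):
    """Report whether [supervisorctl] serverurl has a matching server section."""
    parser = configparser.ConfigParser(
        interpolation=None,              # keep %(program_name)s literal
        inline_comment_prefixes=(";",),  # allow trailing "; comment" text
        strict=False,
    )
    parser.read_string(conf_text)
    serverurl = parser.get("supervisorctl", "serverurl", fallback=None)
    if serverurl is None:
        return "no serverurl set in [supervisorctl]"
    if serverurl.startswith("unix://"):
        sock = parser.get("unix_http_server", "file", fallback=None)
        if sock is None:
            return "unix serverurl but no [unix_http_server] file"
        if sock != serverurl[len("unix://"):]:
            return "socket path mismatch"
        return "ok"
    if serverurl.startswith("http://"):
        if not parser.has_section("inet_http_server"):
            return "http serverurl but no [inet_http_server] section"
        return "ok"
    return "unrecognised serverurl"


if __name__ == "__main__":
    with open("/etc/supervisord.conf") as handle:
        print(check_ctl_matches_server(handle.read()))
```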
This worked for me
sudo systemctl start supervisor
sudo systemctl enable supervisor
sudo supervisorctl reload
Why not add an appropriate error message saying that supervisord is not running, instead of:
error: <class 'socket.error'>, [Errno 2] No such file or directory: file: /usr/lib64/python2.7/socket.py line: 224
My problem was that worker's log directory did not exist, just like @YAmikep's. Creating the directory solved the problem and supervisor successfully launched and started the workers.
This is how I solved the unix:///tmp/supervisor.sock no such file problem on RHEL
ps -ef | grep supervisord # get supervisord PID
kill -s SIGTERM 1234 # where 1234 is the PID
ps -ef | grep supervisord # check that supervisord is killed
supervisord # restart the daemon
I had the same problem. After I ran supervisord, the log showed that the problem came from an old service that supervisor was trying to start. I had deleted that program's folder some time ago, but not its entry in supervisor's conf.d/ folder, so supervisor was trying to start a program that no longer existed, and this caused the problem.