I use supervisor on CentOS. Here is a situation I have seen more than 10 times in the past 2 years: I can no longer use the
supervisorctl command to control tasks. I have tried all the solutions mentioned here: https://github.com/Supervisor/supervisor/issues/1084 but none of them works.
What I can do is kill supervisor forcibly and restart supervisor. It definitely should not be the standard way to solve this problem. BTW, I am using supervisor (3.3.2) now.
How can I prevent or solve this problem?
Do you get any error message when running supervisorctl?
I have the same problem with all my ec2 machines. Supervisor will start, and I can use supervisorctl just fine. After some time the *.sock disappears.
Let me log into one machine; we can see that the *.sock has disappeared:
[ec2-user@ip-172-31-1-82 ~]$ sudo supervisorctl
unix:///tmp/supervisor.sock no such file
supervisor> exit
[ec2-user@ip-172-31-1-82 ~]$ sudo ls -al /tmp/supervisor.sock
ls: cannot access /tmp/supervisor.sock: No such file or directory
supervisord is still running
[ec2-user@ip-172-31-1-82 ~]$ ps -aux | grep supervisord
ec2-user 13900 0.0 0.0 119464 976 pts/0 S+ 18:48 0:00 grep --color=auto supervisord
root 20120 0.0 0.1 236060 17072 ? Ss Jan21 19:26 /usr/bin/python2 /bin/supervisord -c /etc/supervisord.conf
and it will restart any processes I happen to kill. I can send a SIGINT to supervisord, and have it stop.
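The state described above (daemon alive, socket file gone) can be checked with a small script. This is only a sketch: the function names check_supervisor_socket and diagnose_supervisor are made up for illustration, and the default path is simply the one from the transcript.

```shell
#!/bin/sh
# Sketch only: classify the state of a supervisord control socket.
# The default path is the one from the transcript; pass your own as $1.
check_supervisor_socket() {
    sock="${1:-/tmp/supervisor.sock}"
    if [ -S "$sock" ]; then
        echo "present: $sock"
    elif [ -e "$sock" ]; then
        echo "not a socket: $sock"
        return 2
    else
        echo "missing: $sock"
        return 1
    fi
}

# Hypothetical wrapper: flag the case where the daemon is alive but the
# socket file is gone. pgrep -f matches the full command line, because the
# process shows up as "python2 /bin/supervisord" in ps.
diagnose_supervisor() {
    if check_supervisor_socket "${1:-}"; then
        return 0
    fi
    if pgrep -f supervisord >/dev/null 2>&1; then
        echo "supervisord is running but its socket is unusable"
    else
        echo "supervisord does not appear to be running"
    fi
    return 1
}
```

Calling diagnose_supervisor /tmp/supervisor.sock from cron or a monitoring check would at least tell you when the socket vanished, instead of finding out the next time you need supervisorctl.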
Let me run an experiment. I will stop supervisord, start it, and confirm it is working...
[ec2-user@ip-172-31-1-160 ~]$ sudo supervisord -c /etc/supervisord.conf
[ec2-user@ip-172-31-1-160 ~]$ sudo supervisorctl
es RUNNING pid 32277, uptime 0:00:09
push_to_es:00 STARTING
supervisor> exit
[ec2-user@ip-172-31-1-160 ~]$ date
Sat Apr 6 18:25:01 UTC 2019
[ec2-user@ip-172-31-1-160 ~]$ sudo supervisorctl status
es RUNNING pid 32277, uptime 0:00:52
push_to_es:00 RUNNING pid 32558, uptime 0:00:12
The *.sock exists:
[ec2-user@ip-172-31-1-160 ~]$ sudo ls -al /tmp/supervisor.sock
srwx------ 1 root root 0 Apr 6 18:24 /tmp/supervisor.sock
Here is some version info:
[ec2-user@ip-172-31-1-160 ~]$ sudo supervisord -v
3.3.5
[ec2-user@ip-172-31-1-160 ~]$ sudo cat /etc/*release*
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
Amazon Linux release 2 (Karoo)
cpe:2.3:o:amazon:amazon_linux:2
I will be back in a few days when the *.sock is gone.
It may be that the /tmp/supervisor.sock is being scrubbed by some other process. So, I will also restart another machine using /etc/supervisor.sock.
[ec2-user@ip-172-31-1-82 ~]$ sudo kill -SIGINT 20120
[ec2-user@ip-172-31-1-82 ~]$ sudo nano /etc/supervisord.conf
[ec2-user@ip-172-31-1-82 ~]$ sudo supervisord -c /etc/supervisord.conf
[ec2-user@ip-172-31-1-82 ~]$ sudo supervisorctl
es RUNNING pid 15133, uptime 0:00:07
push_to_es:00 RUNNING pid 15204, uptime 0:00:02
supervisor> exit
[ec2-user@ip-172-31-1-82 ~]$ sudo ls -al /etc/supervisor*
-rw-r--r-- 1 root root 1241 Apr 6 19:11 /etc/supervisord.conf
srwx------ 1 root root 0 Apr 6 19:11 /etc/supervisor.sock
I will wait and get back to you.
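To see which socket path a given machine is actually configured with, something like the following could be used. It is a rough sketch that assumes the socket is set with the file option in the [unix_http_server] section of supervisord.conf; the helper names socket_path_from_conf and is_under_tmp are hypothetical, and treating /tmp and /var/tmp as risky is my own assumption based on this thread.

```shell
#!/bin/sh
# Sketch only: print the "file" value of the [unix_http_server] section.
socket_path_from_conf() {
    awk '
        /^\[/ { in_section = ($0 ~ /^\[unix_http_server\]/) }
        in_section && /^[ \t]*file[ \t]*=/ {
            sub(/^[ \t]*file[ \t]*=[ \t]*/, "")   # drop the key
            sub(/[ \t]+;.*$/, "")                  # drop an inline comment
            sub(/[ \t]+$/, "")                     # drop trailing blanks
            print
            exit
        }
    ' "${1:-/etc/supervisord.conf}"
}

# Hypothetical policy check: treat /tmp and /var/tmp as risky locations.
is_under_tmp() {
    case "$1" in
        /tmp/*|/var/tmp/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_under_tmp "$(socket_path_from_conf /etc/supervisord.conf)" && echo "socket lives under a temp directory" would flag the first machine in this thread but not the one moved to /etc.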
Confirmed. /tmp is a dangerous place for the supervisor.sock file:
[ec2-user@ip-172-31-1-160 ~]$ sudo supervisorctl
unix:///tmp/supervisor.sock no such file
/etc is safer:
[ec2-user@ip-172-31-1-82 ~]$ sudo supervisorctl
es RUNNING pid 15133, uptime 26 days, 3:01:59
push_to_es:00 RUNNING pid 2581, uptime 14 days, 9:58:02
Adding the docs label. It sounds like at least one of the commenters here is running with the output of echo_supervisord_conf, which has the socket file (and all other files) in /tmp.
The output of echo_supervisord_conf is intended to be for example only. It will likely allow supervisord to start on many systems for demonstration purposes. For actual use, it will always be necessary to modify the configuration for the target system. We can't predict how supervisord might interact with tempfile cleaners, logfile rotators, init systems, etc. The user performing the installation must know the system and adapt supervisord.conf to it.
We will add some notes to the docs to clarify and warn about this issue.
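As one way of getting to know the system, the sketch below lists systemd-tmpfiles rules that apply to /tmp. Nothing in this thread confirms that systemd-tmpfiles is the cleaner involved here, so treat that as an assumption; the function name tmp_cleanup_rules is also made up. It relies only on the documented tmpfiles.d field order (Type Path Mode User Group Age Argument).

```shell
#!/bin/sh
# Sketch only: list systemd-tmpfiles rules whose path field is /tmp or below.
# Fields in tmpfiles.d are: Type Path Mode User Group Age Argument.
tmp_cleanup_rules() {
    # Default directories are the standard tmpfiles.d locations.
    [ "$#" -gt 0 ] || set -- /etc/tmpfiles.d /run/tmpfiles.d /usr/lib/tmpfiles.d
    for dir in "$@"; do
        for f in "$dir"/*.conf; do
            [ -f "$f" ] || continue
            awk -v src="$f" '
                /^[ \t]*(#|$)/ { next }     # skip comments and blank lines
                $2 == "/tmp" || index($2, "/tmp/") == 1 {
                    print src ": " $0
                }
            ' "$f"
        done
    done
}
```

If a rule printed by tmp_cleanup_rules carries an age in its sixth field, files under /tmp older than that age are candidates for removal, which would be consistent with a socket that disappears after some days.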
Editing the configuration file supervisord.conf to put the supervisor.sock file in a safer directory solved this problem.
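When moving the socket, the file option in [unix_http_server] and the serverurl option in [supervisorctl] both have to point at the new location, otherwise supervisorctl keeps looking in the old place. Here is a minimal sketch of a check for that; conf_value and socket_settings_agree are hypothetical helper names, and the parsing is deliberately simple (it is not supervisor's own config parser).

```shell
#!/bin/sh
# Sketch only: read KEY from [SECTION] of an INI-style supervisord.conf.
conf_value() {   # usage: conf_value FILE SECTION KEY
    awk -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = (index($0, section) == 1) }
        in_section {
            line = $0
            sub(/[ \t]+;.*$/, "", line)      # drop an inline comment
            eq = index(line, "=")
            if (eq == 0) next
            k = substr(line, 1, eq - 1)
            v = substr(line, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
            if (k == key) { print v; exit }
        }
    ' "$1"
}

# Succeeds only if supervisorctl is pointed at the socket the server creates.
socket_settings_agree() {   # usage: socket_settings_agree FILE
    server=$(conf_value "$1" unix_http_server file)
    client=$(conf_value "$1" supervisorctl serverurl)
    [ -n "$server" ] && [ "unix://$server" = "$client" ]
}
```

With the /etc/supervisor.sock location used earlier in this thread, the check passes when file=/etc/supervisor.sock and serverurl=unix:///etc/supervisor.sock, and fails if only one of the two was edited.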