After several weeks of successful use, the salt-minion service started to crash on startup:
2018-03-28 13:17:15,009 [salt.log.setup :1133][ERROR ][20736] An un-handled exception was caught by salt's global exception handler:
TypeError: unorderable types: NoneType() < int()
Traceback (most recent call last):
File "c:\salt\bin\Scripts\salt-minion", line 26, in <module>
salt_minion()
File "c:\salt\bin\lib\site-packages\salt\scripts.py", line 168, in salt_minion
minion.start()
File "c:\salt\bin\lib\site-packages\salt\cli\daemons.py", line 342, in start
super(Minion, self).start()
File "c:\salt\bin\lib\site-packages\salt\utils\parsers.py", line 1039, in start
self.prepare()
File "c:\salt\bin\lib\site-packages\salt\cli\daemons.py", line 299, in prepare
if self.check_running():
File "c:\salt\bin\lib\site-packages\salt\utils\parsers.py", line 1022, in check_running
if self.check_pidfile() and self.is_daemonized(pid):
File "c:\salt\bin\lib\site-packages\salt\utils\parsers.py", line 1028, in is_daemonized
return os_is_running(pid)
File "c:\salt\bin\lib\site-packages\salt\utils\process.py", line 187, in os_is_running
return psutil.pid_exists(pid)
File "c:\salt\bin\lib\site-packages\psutil\__init__.py", line 1438, in pid_exists
if pid < 0:
TypeError: unorderable types: NoneType() < int()
For some reason, C:\salt\var\run\salt-minion.pid contains exactly 4 null bytes.
I suppose a couple of cold system reboots could have produced this file content, but as a result salt-minion can't start and crashes here.
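To illustrate the failure mode in the traceback: the PID read from the file ends up as None, and psutil.pid_exists(None) then fails on the `pid < 0` comparison. Below is a minimal sketch of a defensive PID file reader, assuming invalid content should simply be treated as "no running process". The names read_pid and pidfile_process_alive are hypothetical and are not Salt's actual code or the eventual fix; the pid_exists argument stands in for a function such as psutil.pid_exists.

```python
def read_pid(pidfile_path):
    """Return the PID stored in pidfile_path, or None if missing or invalid.

    Hypothetical helper for illustration; not part of Salt.
    """
    try:
        with open(pidfile_path, "rb") as handle:
            raw = handle.read()
    except OSError:
        # A missing or unreadable file means there is no usable PID.
        return None
    # Strip whitespace and NUL bytes; a file of four null bytes becomes empty.
    text = raw.strip(b"\x00 \t\r\n").decode("ascii", errors="replace")
    if not text.isdigit():
        return None
    return int(text)


def pidfile_process_alive(pidfile_path, pid_exists):
    """Return True only if the PID file holds a valid PID that is running.

    pid_exists is any callable taking an int, e.g. psutil.pid_exists
    (an assumption here; it is passed in so the sketch has no dependencies).
    """
    pid = read_pid(pidfile_path)
    if pid is None:
        # Never hand None to the OS-level check; this is the case that crashed.
        return False
    return pid_exists(pid)
```

With a guard like this, a PID file full of null bytes would be reported as "not running" instead of raising TypeError during startup.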
But the issue itself is not about starting salt-minion.
I only noticed this situation after some time, and during all that time the salt-minion Windows service kept trying to start the minion endlessly, which led to continuously high CPU usage.
Over a long uptime, this led to 100% CPU usage and further overall system performance degradation.
Windows 10 Pro x64
C:\salt\var\run\salt-minion.pid
Salt Version:
Salt: 2017.7.4
Dependency Versions:
cffi: 1.10.0
cherrypy: unknown
dateutil: 2.6.0
docker-py: Not Installed
gitdb: 2.0.3
gitpython: 2.1.3
ioflo: Not Installed
Jinja2: 2.9.6
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: 1.0.6
msgpack-pure: Not Installed
msgpack-python: 0.4.8
mysql-python: Not Installed
pycparser: 2.17
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)]
python-gnupg: 0.4.0
PyYAML: 3.12
PyZMQ: 16.0.2
RAET: Not Installed
smmap: 2.0.3
timelib: 0.2.4
Tornado: 4.5.1
ZMQ: 4.1.6
System Versions:
dist:
locale: cp1251
machine: AMD64
release: 10
system: Windows
version: 10 10.0.16299 SP0 Multiprocessor Free
@dwoz or @twangboy can one of yall take a look at this?
Thanks,
Daniel
I am able to reproduce this when. The key is salt being run as a service.
It's not a permissions issue. SYSTEM has full control of all directories in the tree, including salt-minion.pid.
@landergate The above PR will fix the issue. We'll see what the reviewers say to see if this is the best fix or not.
@landergate Does #46786 fix this issue for you?
@rallytime Absolutely. Many thanks =)
No more CPU peaks, and the *.pid file is now filled with a proper new PID even if it previously contained invalid data. Things are going smoothly.
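For anyone curious what "overwriting invalid data with a proper PID" can look like, here is a minimal sketch. It is only an illustration and not what #46786 actually does; write_pid is a hypothetical name. It writes to a temporary file, flushes it to disk, and then swaps it into place, which may also reduce the chance of a half-written PID file after a cold reboot.

```python
import os


def write_pid(pidfile_path, pid=None):
    """Write pid (default: the current process) to pidfile_path.

    Hypothetical helper for illustration; any previous content,
    valid or not, is replaced.
    """
    pid = os.getpid() if pid is None else pid
    tmp_path = pidfile_path + ".tmp"
    with open(tmp_path, "w") as handle:
        handle.write(str(pid))
        # Push the data to disk before the file is swapped into place.
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace overwrites an existing destination on Windows and POSIX.
    os.replace(tmp_path, pidfile_path)
    return pid
```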
@landergate That's wonderful news! :D