Similar to #32517, but in this instance the minion points to only one master and it's a different error. At the head of 2015.8, when a minion cannot reach a master that is down, there is an error in the logs and the minion shows that it is restarting.
/etc/salt/minion:
master: ipaddress
[root@ch3ll-cent7 ~]# salt-minion -ldebug
[DEBUG ] Reading configuration from /etc/salt/minion
[DEBUG ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG ] Using cached minion ID from /etc/salt/minion_id: ch3ll-cent7.c7.saltstack.net
[DEBUG ] Configuration file path: /etc/salt/minion
[WARNING ] Insecure logging configuration detected! Sensitive data may be logged.
[INFO ] Setting up the Salt Minion "ch3ll-cent7.c7.saltstack.net"
[DEBUG ] Created pidfile: /var/run/salt-minion.pid
[DEBUG ] Reading configuration from /etc/salt/minion
[DEBUG ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[WARNING ] IMPORTANT: Do not use md5 hashing algorithm! Please set "hash_type" to sha256 in Salt Minion config!
[INFO ] The Salt Minion is starting up
[INFO ] Minion is starting as user 'root'
[DEBUG ] AsyncEventPublisher PUB socket URI: ipc:///var/run/salt/minion/minion_event_9126913ec3_pub.ipc
[DEBUG ] AsyncEventPublisher PULL socket URI: ipc:///var/run/salt/minion/minion_event_9126913ec3_pull.ipc
[INFO ] Starting pub socket on ipc:///var/run/salt/minion/minion_event_9126913ec3_pub.ipc
[INFO ] Starting pull socket on ipc:///var/run/salt/minion/minion_event_9126913ec3_pull.ipc
[DEBUG ] Minion 'ch3ll-cent7.c7.saltstack.net' trying to tune in
[DEBUG ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'ch3ll-cent7.c7.saltstack.net', 'tcp://54.159.114.180:4506')
[DEBUG ] Generated random reconnect delay between '1000ms' and '11000ms' (2882)
[DEBUG ] Setting zmq_reconnect_ivl to '2882ms'
[DEBUG ] Setting zmq_reconnect_ivl_max to '11000ms'
[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'ch3ll-cent7.c7.saltstack.net', 'tcp://54.159.114.180:4506', 'clear')
[INFO ] The Salt Minion is shut down
[WARNING ] /usr/lib/python2.7/site-packages/salt/scripts.py:83: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
log.error('Minion failed to start: {0}'.format(exc.message), exc_info=True)
[ERROR ] Minion failed to start: Attempt to authenticate with the salt master failed
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/salt/scripts.py", line 81, in minion_process
minion.start()
File "/usr/lib/python2.7/site-packages/salt/cli/daemons.py", line 320, in start
self.minion.tune_in()
File "/usr/lib/python2.7/site-packages/salt/minion.py", line 1699, in tune_in
self.sync_connect_master()
File "/usr/lib/python2.7/site-packages/salt/minion.py", line 760, in sync_connect_master
raise six.reraise(*future_exception)
File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 876, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python2.7/site-packages/salt/minion.py", line 767, in connect_master
master, self.pub_channel = yield self.eval_master(self.opts, self.timeout, self.safe)
File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 870, in run
value = future.result()
File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
raise_exc_info(self._exc_info)
File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 876, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python2.7/site-packages/salt/minion.py", line 503, in eval_master
yield pub_channel.connect()
File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 870, in run
value = future.result()
File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
raise_exc_info(self._exc_info)
File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 876, in run
yielded = self.gen.throw(*exc_info)
File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 338, in connect
yield self.auth.authenticate()
File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 870, in run
value = future.result()
File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
raise_exc_info(self._exc_info)
File "<string>", line 3, in raise_exc_info
SaltClientError: Attempt to authenticate with the salt master failed
[WARNING ] ** Restarting minion **
[INFO ] Sleeping random_reauth_delay of 6 seconds
[DEBUG ] Reading configuration from /etc/salt/minion
[DEBUG ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
The minion does restart, and when the master comes back online the minion is able to connect to it just fine. The issue is the error and the minion restarting.
Also note that I edited scripts.py to use exc_info=True instead of False so I could get the stack trace:
log.error('Minion failed to start: {0}'.format(exc.message), exc_info=True)
[root@ch3ll-cent7 ~]# salt --versions-report
Salt Version:
Salt: 2015.8.8-324-g492ebfc
Dependency Versions:
Jinja2: 2.7.2
M2Crypto: 0.21.1
Mako: Not Installed
PyYAML: 3.11
PyZMQ: 14.7.0
Python: 2.7.5 (default, Nov 20 2015, 02:00:19)
RAET: Not Installed
Tornado: 4.2.1
ZMQ: 4.0.5
cffi: Not Installed
cherrypy: Not Installed
dateutil: Not Installed
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
libgit2: Not Installed
libnacl: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.7
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pygit2: Not Installed
python-gnupg: Not Installed
smmap: Not Installed
timelib: Not Installed
System Versions:
dist: centos 7.2.1511 Core
machine: x86_64
release: 3.10.0-327.4.5.el7.x86_64
system: CentOS Linux 7.2.1511 Core
Committed for 2015.8.9
@Ch3LL there are two points:
[WARNING ] /usr/lib/python2.7/site-packages/salt/scripts.py:83: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
log.error('Minion failed to start: {0}'.format(exc.message), exc_info=True)
I've fixed it in #32556
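For readers hitting the same DeprecationWarning: the sketch below shows one common way to avoid `exc.message`, which is to format the exception object itself. It is not a copy of the change in #32556. The `SaltClientError` class here is a local stand-in so the snippet runs on its own, and `describe_failure` is a hypothetical helper name.

```python
import logging

log = logging.getLogger("minion_sketch")  # logger name is arbitrary


class SaltClientError(Exception):
    """Local stand-in for Salt's exception class, defined so the sketch runs."""


def describe_failure(exc):
    # Formatting the exception object uses str(exc), so the deprecated
    # BaseException.message attribute is never touched.
    return 'Minion failed to start: {0}'.format(exc)


try:
    raise SaltClientError('Attempt to authenticate with the salt master failed')
except SaltClientError as exc:
    message = describe_failure(exc)
    # exc_info=True would additionally attach the traceback to the log record.
    log.error(message, exc_info=False)
```

The logged text is the same as before; only the way the message is read from the exception changes.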
[ERROR ] Minion failed to start: Attempt to authenticate with the salt master failed
This message appears because the master is down. The minion tries to authenticate several times, each attempt with a timeout, and then reports this error. After that, the minion restarts itself.
I see no problem here. Am I wrong?
Also, there is a config option, auth_tries. A user can set it to some huge value to avoid minion restarts in single-master mode. I'm not sure whether we should set it to a huge value automatically for single-master mode, because that could break some use cases, for example re-reading the config on restart.
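To illustrate why a larger auth_tries avoids the restart, here is a deliberately simplified model of a bounded retry loop. It is not Salt's code: `authenticate`, `AuthGaveUp`, and `master_back_on` are hypothetical names, the real minion also waits for a timeout on each attempt, and the numbers 7, 10, and 1000 are illustrative only.

```python
class AuthGaveUp(Exception):
    """Raised when every attempt failed (hypothetical name for this sketch)."""


def authenticate(try_once, auth_tries):
    """Call try_once() up to auth_tries times; return the attempt that succeeded."""
    for attempt in range(1, auth_tries + 1):
        if try_once():
            return attempt
    raise AuthGaveUp('gave up after {0} tries'.format(auth_tries))


def master_back_on(attempt_number):
    """Build a fake try_once() that starts succeeding on the given attempt."""
    state = {'calls': 0}

    def try_once():
        state['calls'] += 1
        return state['calls'] >= attempt_number

    return try_once


# Small budget: the master returns on attempt 10, but we stop after 7,
# which in the real minion is the point where it would restart.
try:
    authenticate(master_back_on(10), auth_tries=7)
    gave_up = False
except AuthGaveUp:
    gave_up = True

# Huge budget: the same outage is ridden out inside a single process.
succeeded_on = authenticate(master_back_on(10), auth_tries=1000)
```

The trade-off mentioned above still applies: a process that never restarts also never re-reads its config.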
I've updated the code. Now, if the single master is unreachable, the minion will report:
Attempt to authenticate with the salt master failed with timeout error
There was an actual logic mistake: we were hiding the original exception reason by replacing it with a constant string.
@Ch3LL thank you for the catch!
@DmitryKuzmenko oh, that error makes a lot more sense and is not as confusing. Thanks! I tested this and I'm now seeing the new timeout error. I'm just waiting for #32556 to be merged, and then I will test and close this issue.
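The difference between hiding and keeping the original reason can be sketched as follows. This is an illustration of the pattern only, not the actual patch: both exception classes are local stand-ins, and `send_sign_in`, `sign_in_hiding_reason`, and `sign_in_keeping_reason` are hypothetical names.

```python
class SaltReqTimeoutError(Exception):
    """Local stand-in for a request timeout raised by the transport."""


class SaltClientError(Exception):
    """Local stand-in for the error reported to the user."""


def send_sign_in(master_up):
    # Pretend to contact the master; time out when it is down.
    if not master_up:
        raise SaltReqTimeoutError('timeout error')
    return 'ok'


def sign_in_hiding_reason(master_up):
    try:
        return send_sign_in(master_up)
    except Exception:
        # Every failure collapses into one constant string, so the cause is lost.
        raise SaltClientError('Attempt to authenticate with the salt master failed')


def sign_in_keeping_reason(master_up):
    try:
        return send_sign_in(master_up)
    except SaltReqTimeoutError as exc:
        # The original reason is appended, so the log says why it failed.
        raise SaltClientError(
            'Attempt to authenticate with the salt master failed with {0}'.format(exc))


def error_text(func):
    """Return the message raised by func(False), i.e. with the master down."""
    try:
        func(False)
    except SaltClientError as exc:
        return str(exc)


hidden = error_text(sign_in_hiding_reason)
kept = error_text(sign_in_keeping_reason)
```

With the master down, `hidden` carries only the generic text, while `kept` ends in "with timeout error", matching the new message quoted above.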
@Ch3LL I will try to get that fix in this morning. You should be able to re-test and close this by the end of the day. Thanks.
@cachedout thanks, I went ahead and tested on the head of 2015.8 and I'm not seeing the error anymore. Once again, thanks @DmitryKuzmenko !!