After our minions were upgraded from 2019.2.0 to 2019.2.1, the communication (IPv6 only) with the master (either 2019.2.0 or 2019.2.1) is broken.
Minion is stuck at:
SaltReqTimeoutError, retrying. (2/7)
Via tcpdump communication - but with length "0" - is observable.
Master says:
Sep 26 11:59:36 salt salt-master[9266]: [WARNING ] /usr/lib/python3/dist-packages/zmq/eventloop/ioloop.py:211: RuntimeWarning: IOLoop.current expected instance of <class 'zmq.eventloop.ioloop.ZMQIOLoop'>, got <tornado.platform.epoll.EPollIOLoop object at 0x7fd9e798feb8>
Sep 26 11:59:36 salt salt-master[9266]: ioloop.IOLoop.instance() is IOLoop.instance(), "tornado IOLoop already initialized"
IPv6-only communication, no special configuration.
After minion is started:
Sep 26 12:33:52 salt-minion[15491]: [DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'minion', 'tcp://<master-ip>', 'clear')
Sep 26 12:33:52 salt-minion[15491]: [DEBUG ] Connecting the Minion to the Master URI (for the return server): tcp://<master-ip>
Sep 26 12:33:52 salt-minion[15491]: [DEBUG ] Trying to connect to: tcp://<master-ip>
Sep 26 12:33:52 salt-minion[15491]: [DEBUG ] salt.crypt.get_rsa_pub_key: Loading public key
Sep 26 12:33:57 salt-minion[15491]: [DEBUG ] SaltReqTimeoutError, retrying. (1/7)
Sep 26 12:34:02 salt-minion[15491]: [DEBUG ] SaltReqTimeoutError, retrying. (2/7)
Sep 26 12:34:07 salt-minion[15491]: [DEBUG ] SaltReqTimeoutError, retrying. (3/7)
Sep 26 12:34:12 salt-minion[15491]: [DEBUG ] SaltReqTimeoutError, retrying. (4/7)
Sep 26 12:34:17 salt-minion[15491]: [DEBUG ] SaltReqTimeoutError, retrying. (5/7)
Sep 26 12:34:22 salt-minion[15491]: [DEBUG ] SaltReqTimeoutError, retrying. (6/7)
Sep 26 12:34:27 salt-minion[15491]: [DEBUG ] SaltReqTimeoutError, retrying. (7/7)
Sep 26 12:34:32 salt-minion[15491]: [DEBUG ] Re-init ZMQ socket: Message timed out
Sep 26 12:34:32 salt-minion[15491]: [DEBUG ] Trying to connect to: tcp://<master-ip>
Sep 26 12:34:32 salt-minion[15491]: [DEBUG ] Closing AsyncZeroMQReqChannel instance
Sep 26 12:34:32 salt-minion[15491]: [ERROR ] Error while bringing up minion for multi-master. Is master at <master-hostname> responding?
Master:
Sep 26 11:59:36 salt salt-master[9266]: [WARNING ] /usr/lib/python3/dist-packages/zmq/eventloop/ioloop.py:211: RuntimeWarning: IOLoop.current expected instance of <class 'zmq.eventloop.ioloop.ZMQIOLoop'>, got <tornado.platform.epoll.EPollIOLoop object at 0x7fd9e798feb8>
Sep 26 11:59:36 salt salt-master[9266]: ioloop.IOLoop.instance() is IOLoop.instance(), "tornado IOLoop already initialized"
With downgrade to 2019.2.0 of the minion everything works fine.
Master:
```Salt Version:
Salt: 2019.2.1
Dependency Versions:
cffi: 1.12.3
cherrypy: Not Installed
dateutil: 2.5.3
docker-py: Not Installed
gitdb: 2.0.3
gitpython: 2.1.7
ioflo: Not Installed
Jinja2: 2.9.4
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.8
mysql-python: Not Installed
pycparser: 2.19
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.5.3 (default, Sep 27 2018, 17:25:39)
python-gnupg: Not Installed
PyYAML: 3.12
PyZMQ: 16.0.2
RAET: Not Installed
smmap: 2.0.3
timelib: Not Installed
Tornado: 4.4.3
ZMQ: 4.2.1
System Versions:
dist: debian 9.11
locale: UTF-8
machine: x86_64
release: 4.19.0-0.bpo.5-amd64
system: Linux
version: debian 9.11
Minion:
Salt Version:
Salt: 2019.2.1
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: 2.5.3
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.9.4
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.8
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.5.3 (default, Sep 27 2018, 17:25:39)
python-gnupg: Not Installed
PyYAML: 3.12
PyZMQ: 16.0.2
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.4.3
ZMQ: 4.2.1
System Versions:
dist: debian 9.11
locale: UTF-8
machine: x86_64
release: 4.19.0-0.bpo.5-amd64
system: Linux
version: debian 9.11
```
This is affecting our systems as well. We have a hybrid ipv4
and ipv6
setup, which no longer works. I haven't been able to link it to any of the items posted in the changelog yet though.
Only our ipv4
minions are connected. Thankfully, we don't heavily depend on ipv6 yet, so removing the ipv6
record to the master resolves it for now.
All our minions broke because of this issue ... switching to v4 is not an option here as not all hosts even have a v4 address.
I suspect an issue in file:
https://github.com/saltstack/salt/blob/develop/salt/utils/zeromq.py#L81
line 81
Or the method is not called at all ...
Since the connect string on old minions looks like this:
And on new minions like this:
[DEBUG ] Trying to connect to: tcp://2001:608:a01:ffff::28:4506
So the brackets are missing which leads to a none connectable address.
The issue could be caused by this commit/PR:
https://github.com/saltstack/salt/commit/24bb94fb9e9e74ca165d147ef7bec5a361a0e6aa
https://github.com/saltstack/salt/pull/49755
IPV6 is broken in 2019.2.1
Thanks everyone for the clear reports - check out #54784 and let me know if you see any issues :)
Went to ansible for a quick fix. Inventory came from salt-key
. For us we enabled IPv6 in /etc/salt/minion.d/ipv6.conf
, so killing that file and restarting restored connection.
Adapt as needed.
YMMV.
- name: salt_disable_ipv6
hosts: salt
become: yes
become_method: sudo
handlers:
- name: restart salt-minion
service:
name: salt-minion
state: restarted
tasks:
- name: "Remove /etc/salt/minion.d/ipv6.conf"
file:
path: /etc/salt/minion.d/ipv6.conf
state: absent
notify: restart salt-minion
With #54784 applied, either over the release version of 2019.2.1, or just using the head of branch 2019.2.1, errors still triggered as below.
2019-09-29 12:59:40,157 [salt.minion :1049][CRITICAL][2819594] Unexpected error while connecting to HOSTNAME.DOMAIN
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/salt/minion.py", line 1026, in _connect_minion
yield minion.connect_master(failed=failed)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 261, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/usr/local/lib/python2.7/dist-packages/salt/minion.py", line 1224, in connect_master
master, self.pub_channel = yield self.eval_master(self.opts, self.timeout, self.safe, failed)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 261, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 1141, in run
yielded = self.gen.throw(*exc_info)
File "/usr/local/lib/python2.7/dist-packages/salt/minion.py", line 749, in eval_master
yield pub_channel.connect()
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 1133, in run
value = future.result()
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 261, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 326, in wrapper
yielded = next(result)
File "/usr/local/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 509, in connect
log.debug('Connecting the Minion to the Master publish port, using the URI: %s', self.master_pub)
File "/usr/local/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 520, in master_pub
source_port=self.opts.get('source_publish_port'))
File "/usr/local/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 84, in _get_master_uri
master_ip=ip_bracket(master_ip), master_port=master_port)
File "/usr/local/lib/python2.7/dist-packages/salt/utils/zeromq.py", line 86, in ip_bracket
addr = ipaddress.ip_address(addr)
File "/usr/local/lib/python2.7/dist-packages/salt/ext/ipaddress.py", line 120, in ip_address
address)
ValueError: u'[xxxx:xxxx::xxxx:xxxx]' does not appear to be an IPv4 or IPv6 address
Thanks for that stack trace!
modules/status.py:ping_master
minion.py:eval_master
The above functions have a side effect of updating opts['master_ip']
to the resolved ip address.
opts['master_ip']
is later passed to _get_master_uri
, which assumes a bare ip.
connect_master
and reconnect
are two of several functions that call eval_master.
We'll need to do make sure we're consistent in doing the conversion, and then we can assume we're dealing with URI-formatted ips everywhere.
Shouldn't this be closed now that https://github.com/saltstack/salt/pull/54823 is merged?
Most helpful comment
The issue could be caused by this commit/PR:
https://github.com/saltstack/salt/commit/24bb94fb9e9e74ca165d147ef7bec5a361a0e6aa
https://github.com/saltstack/salt/pull/49755