Salt: 2018.3 regression - unable to create Caller in a runner

Created on 5 Apr 2018  Â·  32Comments  Â·  Source: saltstack/salt

Description of Issue/Question

Runners which create salt.client.Caller throw exception listed below.
This stops us from upgrading to 2018.3 release.

Steps to Reproduce Issue

On a CentOS 7.4 minimally installed VM, do the following -

# curl -o bootstrap-salt.sh -L https://bootstrap.saltstack.com
# sudo sh bootstrap-salt.sh -M git 2018.3

Make sure that following returns true -

[root@base2 ~]# salt-call test.ping
local:
    True

Now create a test runner with the following -
cat /srv/_runners/test_runner.py

# -*- coding: utf-8 -*-
'''
test_runner.py - implementation of test runner api
Custom salt runner module
'''

import logging
import salt.client

__virtualname__ = 'test_runner'

#pylint: disable=bad-indentation, line-too-long, invalid-name

def __virtual__():
  return __virtualname__

def _get_logger():
  '''
  Get current logger.
  '''
  logger = logging.getLogger(__name__)
  return logger

def ping():
  '''
  Verify presence and installation of the module
  CLI Example:
     salt-run test_runner.ping
  '''
  local = salt.client.Caller()
  _get_logger().info('%s pinged', __name__)
  return True

Running (after configuring runners etc.) the salt-run test_runner.ping returns the following exception -

2018-04-05 12:41:06,259 [tornado.application:611 ][ERROR   ][9823] Exception in callback <functools.partial object at 0x2800ec0>
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 591, in _run_callback
    ret = callback()
  File "/usr/lib64/python2.7/site-packages/tornado/stack_context.py", line 342, in wrapped
    raise_exc_info(exc)
  File "/usr/lib64/python2.7/site-packages/tornado/stack_context.py", line 313, in wrapped
    ret = fn(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 597, in <lambda>
    self.add_future(ret, lambda f: f.result())
  File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
    raise_exc_info(self._exc_info)
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 230, in wrapper
    yielded = next(result)
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 595, in _authenticate
    io_loop=self.io_loop)
  File "/usr/lib/python2.7/site-packages/salt/transport/client.py", line 108, in factory
    return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 133, in __new__
    obj.__singleton_init__(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 197, in __singleton_init__
    kwargs={'io_loop': self._io_loop})
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 927, in __init__
    super(AsyncReqMessageClientPool, self).__init__(AsyncReqMessageClient, opts, args=args, kwargs=kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/__init__.py", line 71, in __init__
    self.message_clients = [tgt(*args, **kwargs) for _ in range(sock_pool_size)]
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 974, in __init__
    self._init_socket()
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 1028, in _init_socket
    self.stream = zmq.eventloop.zmqstream.ZMQStream(self.socket, io_loop=self.io_loop)
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 114, in __init__
    self._init_io_state()
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 535, in _init_io_state
    self.io_loop.add_handler(self.socket, self._handle_events, self._state)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 703, in add_handler
    self._impl.register(fd, events | self.ERROR)
TypeError: argument must be an int, or have a fileno() method.

If I manually update the pyzmq using pip to 17.0.0 version, this exception disappears.

Observation - By default repo.saltstack.com is supplying python-zmq version 15.3, should this be updated?

Versions Report

Salt Version:
           Salt: 2018.3.0-n/a-a064a3e

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.1
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Aug  4 2017, 00:39:18)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.4.1708 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.17.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.4.1708 Core
Core severity-high team-core

Most helpful comment

@akbhat - for what it's worth, I used pip to upgrade pyzmq to version 17 (as recommended by @isbm on April 12), and found this to be a suitable workaround for the time being:

# pip install --upgrade pyzmq
pyzmq (17.0.0)         - Python bindings for 0MQ
INSTALLED: 15.3.0 <- Version we were on
LATEST:    17.0.0 <- Version we upgraded to via pip

All 32 comments

@dmurphy18 @rallytime @Ch3LL thoughts?

We should be supported both the new 17.0 version and the older versions of pyzmq, so we need to get this fixed. @garethgreenaway Were you able to reproduce this? Looks like @golmaal is working off of the 2018.3 branch.

@isbm was the one who updated the support for 17.0 somewhat recently in preparation for the 2018.3.0 release. @isbm Perhaps you know where the fix for this might be since you were in this code most recently.

PyZMQ 15... OK, I have to look at this then. I will be anyway putting all that together for 2017.7

Overview

It looks like we're also striking this bug that @golmaal reported (and note that we're also using a custom runner which makes use of salt.client.Caller).
This bug was induced by upgrading from 2017.7.5 to 2018.3.0 on one of our salt master nodes.

Traceback

# salt-run state.event
[ERROR   ] Exception in callback <functools.partial object at 0x3e901b0>
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 591, in _run_callback
    ret = callback()
  File "/usr/lib64/python2.7/site-packages/tornado/stack_context.py", line 274, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 597, in <lambda>
    self.add_future(ret, lambda f: f.result())
  File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
    raise_exc_info(self._exc_info)
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 230, in wrapper
    yielded = next(result)
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 596, in _authenticate
    io_loop=self.io_loop)
  File "/usr/lib/python2.7/site-packages/salt/transport/client.py", line 108, in factory
    return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 132, in __new__
    obj.__singleton_init__(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 196, in __singleton_init__
    kwargs={'io_loop': self._io_loop})
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 929, in __init__
    super(AsyncReqMessageClientPool, self).__init__(AsyncReqMessageClient, opts, args=args, kwargs=kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/__init__.py", line 71, in __init__
    self.message_clients = [tgt(*args, **kwargs) for _ in range(sock_pool_size)]
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 976, in __init__
    self._init_socket()
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 1030, in _init_socket
    self.stream = zmq.eventloop.zmqstream.ZMQStream(self.socket, io_loop=self.io_loop)
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 114, in __init__
    self._init_io_state()
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 535, in _init_io_state
    self.io_loop.add_handler(self.socket, self._handle_events, self._state)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 703, in add_handler
    self._impl.register(fd, events | self.ERROR)
TypeError: argument must be an int, or have a fileno() method.

Versions Report

Salt Version:                                                                                                                                                                                                                                                                                                                
           Salt: 2018.3.0

Dependency Versions:
           cffi: 1.6.0
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: 0.6.4
      gitpython: 1.0.1
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: 0.24.6
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.1
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.24.2
         Python: 2.7.5 (default, Aug  4 2017, 00:39:18)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: 0.9.0
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.4.1708 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.21.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.4.1708 Core

Yuck... Sorry for this, guys. It was pretty heavily tested with 14 and then 17. Part of this because at SUSE we had 14 and then upgraded it to the latest in upcoming SLE 15 and Leap. And so I had naÑ—ve trust to ZMQ's claim it is different only after 16, but till 16 it is same as 13. But seems I have shouldn't trust that... Also strange that this popped up only now. No Beta testers around?

If you are unable to wait, just upgrade your PyZMQ to 17 and libzmq (or upgrade to Leap over CentOS, then it will work flawlessly out of the box — Salt is default there now, hehe :stuck_out_tongue: ). It has anyway quite dramatic bugfixes to ZMQ you _want_ to have.

Jokes aside, in the meanwhile I will try to fix this for you.

Thanks @isbm!

Some update: I cannot reproduce this anymore in develop branch. Since I am on virtenv, I have my configs elsewhere, so I modified the thing a bit:

import logging
import salt.client
import salt.config
from salt.client import Caller

__opts__ = salt.config.minion_config('/home/isbm/sandbox/salt/etc/minion')
__opts__['file_client'] = 'local'

def __virtual__():
    return __virtualname__

def _get_logger():
    '''
    Get current logger.
    '''
    logger = logging.getLogger(__name__)
    return logger

def foo():
    Caller(mopts=__opts__)
    _get_logger().info('%s pinged', __name__)
    return True

Call:

$ salt -V | grep PyZMQ
          PyZMQ: 15.0.0

$ salt-run --version
salt-run 2018.3.0-386-g8f3f314 (Oxygen)

$ salt-run test_runner.foo
True

Is this problem is still relevant?

@binocvlar BTW, the whole way of doing this sort of things in _runner_ is wrong. You should not import modules directly, but use utils, as per documentation.

Needless to say, there was couple of PRs regards to ZMQ from @DmitryKuzmenko and myself, so probably this was fixed in between. It just looks like totally irrelevant to the issue, because ZMQ registrar was touched indeed. And in some cases it probably was wrongly registering itself with the specific ZMQ version, and thus could not further process the salt-run which itself initialises all that stuff. I remember the latter PRs was exactly addressing this sort of type of issues.

I will try to look at other branches later.

@binocvlar BTW, the whole way of doing this sort of things in runner is wrong. You should not import modules directly, but use utils, as per documentation.

@isbm, when you write "the whole way of doing this sort of things in runner is wrong", are you saying that @golmaal's specific example is wrong? Or that _ever_ using import salt.client within a runner is wrong?

For context, what I've done is as follows:

  1. Setup a reactor, which triggers a runner
  2. When the runner is triggered, it transforms some input data, and then passes that data into a call to a statefile.

Are you suggesting that instead of running the statefile via the runner, I instead run a custom utility module instead?

I've read over the documentation for runners and utils, and can't see anything that seems to suggest that what I'm doing is in error. Can you please elaborate?

@binocvlar well, I meant that you normally don't do import foo in runners, but using it as properly documented. :wink: On the other hand, seems like you're one of many people stumbling upon this, since naturally it is indeed just to import modules as usual. But in this case.

As of the issue, do you still getting the problem? Since I cannot reproduce it anymore (unless my environment has something hidden I am missing and I am blind seeing it).

I meant that you normally don't do import foo in runners, but using it as properly documented

@isbm - there is an import statement in the very simple demo runner that's listed on the Salt Runners documentation page:

# Import salt modules
import salt.client

def up():
    '''
    Print a list of all of the minions that are up
    '''
    client = salt.client.LocalClient(__opts__['conf_file'])
    minions = client.cmd('*', 'test.ping', timeout=1)
    for minion in sorted(minions):
        print minion

Additionally, I just sampled five runners from https://github.com/saltstack/salt/tree/develop/salt/runners , and they all contain import statements.

As I mentioned earlier:

I've read over the documentation for runners and utils, and can't see anything that seems to suggest that what I'm doing is in error.

With regard to this question:

As of the issue, do you still getting the problem?

You mentioned earlier that something seems to have been fixed within the develop branch. When I performed my testing, I was using the official 2018.3.0 package. I'll probably wait for an updated package prior to testing again.

We import salt.client into runners all the time.

https://github.com/saltstack/salt/blob/2017.7/salt/runners/salt.py#L39

The clients are not exposed through utils.

@binocvlar we made some changes in after that package has been released, can you try using the 2018.3.1 branch and install from git to test and see if the new release is going to fix your issues?

@binocvlar arrgggh, my bad. The standard imports are OK, of course. I just saw import.... and somewhat missed it was a standard lib. Sorry.

The issue is still was not really with the imports, but with the initialisation of the ZMQ. But there was number of fixes to this topic. Please try what @gtmanfred said and tell us if you still have the same issue reappearing.

Thanks!

@golmaal any updates so far?

@isbm: I've deployed a UAT saltmaster that I can use to test a develop branch master deployment on. I'm busy today and tomorrow, but am hoping to tackle this on Monday or Tuesday. I'll keep you posted.

I'm seeing this issue now in 2017.7.6 release as well. Here as well if I upgrade the pyzmq through pip before installing salt through bootstrap, I'm then able to use Caller class as expected from within runner methods.

@isbm really sorry for such a late reply. Is there anything in particular you're looking for?

Upgraded from 2017.7.5 to 2017.7.6 and now seeing the same crash when executing a runner. Dependency and Systems versions report is identical (PyZMQ: 15.2.0, ZMQ: 4.1.4, Tornado: 4.2.1). Same crash with 2018.3.1 as well.

[DEBUG ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'seed1', 'tcp://192.168.122.97:4506', 'clear')
[ERROR ] Exception in callback
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/tornado/ioloop.py", line 592, in _run_callback
ret = callback()
File "/usr/lib/python2.7/dist-packages/tornado/stack_context.py", line 343, in wrapped
raise_exc_info(exc)
File "/usr/lib/python2.7/dist-packages/tornado/stack_context.py", line 314, in wrapped
ret = fn(args, *kwargs)
File "/usr/lib/python2.7/dist-packages/tornado/ioloop.py", line 598, in
self.add_future(ret, lambda f: f.result())
File "/usr/lib/python2.7/dist-packages/tornado/concurrent.py", line 215, in result
raise_exc_info(self._exc_info)
File "/usr/lib/python2.7/dist-packages/tornado/gen.py", line 230, in wrapper
yielded = next(result)
File "/usr/lib/python2.7/dist-packages/salt/crypt.py", line 510, in _authenticate
io_loop=self.io_loop)
File "/usr/lib/python2.7/dist-packages/salt/transport/client.py", line 108, in factory
return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, *kwargs)
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 130, in __new__
obj.__singleton_init__(opts, *
kwargs)
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 193, in __singleton_init__
kwargs={'io_loop': self._io_loop})
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 907, in __init__
super(AsyncReqMessageClientPool, self).__init__(AsyncReqMessageClient, opts, args=args, kwargs=kwargs)
File "/usr/lib/python2.7/dist-packages/salt/transport/__init__.py", line 69, in __init__
self.message_clients = [tgt(args, *kwargs) for _ in range(sock_pool_size)]
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 954, in __init__
self._init_socket()
File "/usr/lib/python2.7/dist-packages/salt/transport/zeromq.py", line 1007, in _init_socket
self.stream = zmq.eventloop.zmqstream.ZMQStream(self.socket, io_loop=self.io_loop)
File "/usr/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 114, in __init__
self._init_io_state()
File "/usr/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 535, in _init_io_state
self.io_loop.add_handler(self.socket, self._handle_events, self._state)
File "/usr/lib/python2.7/dist-packages/tornado/ioloop.py", line 704, in add_handler
self._impl.register(fd, events | self.ERROR)
TypeError: argument must be an int, or have a fileno() method.

@gtmanfred, @isbm:

This morning, I applied yum updates on our non-production saltmaster. From the list of upgraded packages, here are the most interesting ones:

salt-2017.7.6-1.el7.noarch # was 2017.7.5-1
salt-minion-2017.7.6-1.el7.noarch # was 2017.7.5-1
salt-master-2017.7.6-1.el7.noarch # was 2017.7.5-1
zeromq-4.1.4-7.el7.x86_64 # was 4.1.4-6

Note that python-zmq-15.3.0-2.el7.x86_64 remained unchanged.

After this change, attempting to inspect the message bus would yield a traceback (which not the case prior to updating):

# salt-run state.event pretty=True
[ERROR   ] Exception in callback <functools.partial object at 0x7ff739a556d8>
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 591, in _run_callback
    ret = callback()
  File "/usr/lib64/python2.7/site-packages/tornado/stack_context.py", line 274, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 597, in <lambda>
    self.add_future(ret, lambda f: f.result())
  File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
    raise_exc_info(self._exc_info)
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 230, in wrapper
    yielded = next(result)
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 510, in _authenticate
    io_loop=self.io_loop)
  File "/usr/lib/python2.7/site-packages/salt/transport/client.py", line 108, in factory
    return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 130, in __new__
    obj.__singleton_init__(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 193, in __singleton_init__
    kwargs={'io_loop': self._io_loop})
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 907, in __init__
    super(AsyncReqMessageClientPool, self).__init__(AsyncReqMessageClient, opts, args=args, kwargs=kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/__init__.py", line 69, in __init__
    self.message_clients = [tgt(*args, **kwargs) for _ in range(sock_pool_size)]
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 954, in __init__
    self._init_socket()
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 1007, in _init_socket
    self.stream = zmq.eventloop.zmqstream.ZMQStream(self.socket, io_loop=self.io_loop)
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 114, in __init__
    self._init_io_state()
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 535, in _init_io_state
    self.io_loop.add_handler(self.socket, self._handle_events, self._state)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 703, in add_handler
    self._impl.register(fd, events | self.ERROR)
TypeError: argument must be an int, or have a fileno() method.

I decided to try downgrading zeromq to the EPEL version, in order to see whether the traceback was a result of the salt updates, or the zeromq ones:

# yum downgrade zeromq-4.1.4-5.el7
Loaded plugins: fastestmirror, ps
Repository base is listed more than once in the configuration
Repository updates is listed more than once in the configuration
Loading mirror speeds from cached hostfile
 * base: mirror.overthewire.com.au
 * extras: mirror.overthewire.com.au
 * updates: mirror.overthewire.com.au
Resolving Dependencies
--> Running transaction check
---> Package zeromq.x86_64 0:4.1.4-5.el7 will be a downgrade
--> Processing Dependency: libsodium.so.13()(64bit) for package: zeromq-4.1.4-5.el7.x86_64
---> Package zeromq.x86_64 0:4.1.4-7.el7 will be erased
--> Running transaction check
---> Package libsodium13.x86_64 0:1.0.5-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=====================================================================================================================================
 Package                           Arch                         Version                             Repository                  Size
=====================================================================================================================================
Downgrading:
 zeromq                            x86_64                       4.1.4-5.el7                         epel                       556 k
Installing for dependencies:
 libsodium13                       x86_64                       1.0.5-1.el7                         epel                       144 k

Transaction Summary
=====================================================================================================================================
Install               ( 1 Dependent package)
Downgrade  1 Package

Total download size: 700 k
Is this ok [y/d/N]: y
Downloading packages:
(1/2): zeromq-4.1.4-5.el7.x86_64.rpm                                                                          | 556 kB  00:00:00
(2/2): libsodium13-1.0.5-1.el7.x86_64.rpm                                                                     | 144 kB  00:00:00
-------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                3.7 MB/s | 700 kB  00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : libsodium13-1.0.5-1.el7.x86_64                                                                                    1/3
  Installing : zeromq-4.1.4-5.el7.x86_64                                                                                         2/3
  Cleanup    : zeromq-4.1.4-7.el7.x86_64                                                                                         3/3
  Verifying  : libsodium13-1.0.5-1.el7.x86_64                                                                                    1/3
  Verifying  : zeromq-4.1.4-5.el7.x86_64                                                                                         2/3
  Verifying  : zeromq-4.1.4-7.el7.x86_64                                                                                         3/3

Removed:
  zeromq.x86_64 0:4.1.4-7.el7

Installed:
  zeromq.x86_64 0:4.1.4-5.el7

Dependency Installed:
  libsodium13.x86_64 0:1.0.5-1.el7

Complete!

Unfortunately, running salt-run state.event pretty=True still throws the same traceback as above.

Finally, note that we're still seeing this behaviour after upgrading to 2018.3.1.

... and once again, if I remove the runner_dirs: directive from /etc/salt/master, the problem disappears.

Any luck figuring this out?

@akbhat - for what it's worth, I used pip to upgrade pyzmq to version 17 (as recommended by @isbm on April 12), and found this to be a suitable workaround for the time being:

# pip install --upgrade pyzmq
pyzmq (17.0.0)         - Python bindings for 0MQ
INSTALLED: 15.3.0 <- Version we were on
LATEST:    17.0.0 <- Version we upgraded to via pip

Hello there, not sure but I think I've got an issue related to this.

Context
Minion send an event
Reactor catch the event and launch a runner
Runner is calling caller and vault module to write some secrets

caller = salt.client.Caller()
caller.cmd('vault.write_secret', ...)

It works fine a couple of time and then

Traceback (most recent call last):
  File \"/usr/lib/python2.7/dist-packages/salt/client/mixins.py\", line 387, in _low
    data['return'] = self.functions[fun](*args, **kwargs)
  File \"/srv/runners/alkivi.py\", line 86, in sync
    caller = salt.client.Caller()
  File \"/usr/lib/python2.7/dist-packages/salt/client/__init__.py\", line 2094, in __init__
    self.sminion = salt.minion.SMinion(self.opts)
  File \"/usr/lib/python2.7/dist-packages/salt/minion.py\", line 793, in __init__
    lambda: self.eval_master(self.opts, failed=True)
  File \"/usr/lib/python2.7/dist-packages/tornado/ioloop.py\", line 452, in run_sync
    self.start()
  File \"/usr/lib/python2.7/dist-packages/tornado/ioloop.py\", line 752, in start
    raise RuntimeError(\"IOLoop is already running\")
RuntimeError: IOLoop is already running

I think the underlying issue is that the module vault.write_secret itself calls a runner as well.

Any thought on that ?

@akbhat yeah, sorry about that. But if you are on PIP, then there is no much reasons not to upgrade. I would understand if you are on some enterprise-like OS where only validated vendor-supported patches are allowed...

@isbm I upgraded pyzmq to ver 17.0 and then installed 2017.7.6. Now I see a completely different error which may or may not be related though I tried the same install with 2017.7.5 and pyzmq 15.2 and it works without error. Here is the error I am encountering:

 Comment: An exception occurred in this state: Traceback (most recent call last):
            File "/usr/lib/python2.7/dist-packages/salt/state.py", line 1879, in call
              **cdata['kwargs'])
            File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1826, in wrapper
              return f(*args, **kwargs)
            File "/usr/lib/python2.7/dist-packages/salt/states/saltmod.py", line 511, in function
              if mdata.get('retcode'):
          AttributeError: 'str' object has no attribute 'get'

Salt Version:
Salt: 2017.7.6

Dependency Versions:
cffi: 1.11.5
cherrypy: unknown
dateutil: Not Installed
docker-py: Not Installed
gitdb: 0.6.4
gitpython: 1.0.1
ioflo: Not Installed
Jinja2: 2.8
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: 0.21.1
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.6
mysql-python: Not Installed
pycparser: 2.18
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 2.7.12 (default, Dec 4 2017, 14:50:18)
python-gnupg: Not Installed
PyYAML: 3.11
PyZMQ: 17.0.0
RAET: Not Installed
smmap: 0.9.0
timelib: Not Installed
Tornado: 4.2.1
ZMQ: 4.1.6

System Versions:
dist: Ubuntu 16.04 xenial
locale: UTF-8
machine: x86_64
release: 4.4.0-38-generic
system: Linux
version: Ubuntu 16.04 xenial

Aren't we're talking about 2018.3?

TypeError: argument must be an int, or have a fileno() method.

This one looks related to #48370.
We have to avoid using of direct IOLoop construction in salt.utils.event. Also we have to call install_zmq() somewhere in runners. We done it in master and minion processes, but not in engines that is the problem.

Just wanted to say thanks for the recommended workaround. Upgrading from pyzmq 14.0.1 to 17.1.0 fixed my custom runners.

I have quite similar problem. I see a lot of errors like this:

2018-11-29 16:27:58,655 [py.warnings      :156 ][WARNING ][12991] /usr/lib/python2.7/dist-packages/zmq/eventloop/ioloop.py:211: RuntimeWarning: IOLoop.current expected instance of <class 'zmq.eventloop.ioloop.ZMQIOLoop'>, got <tornado.platform.epoll.EPollIOLoop object at
0x7fe7f1712c50>
  ioloop.IOLoop.instance() is IOLoop.instance(), "tornado IOLoop already initialized"

Upgrading pyzmq is not an option as python-zmq is provided by ubuntu and this package is required by salt-minion and salt-master:

Collecting pyzmq
  Using cached https://files.pythonhosted.org/packages/ab/00/57fec02fdd27aaa38f8d570a1736a6aa9534d242690a89b55eeee902a757/pyzmq-17.1.2-cp27-cp27mu-manylinux1_x86_64.whl
Installing collected packages: pyzmq
  Found existing installation: pyzmq 16.0.2
Cannot uninstall 'pyzmq'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
The following packages have unmet dependencies:
 salt-master : Depends: python-zmq (>= 14.4.0) but it is not going to be installed
 salt-minion : Depends: python-zmq (>= 14.4.0) but it is not going to be installed

I'm using 2018.3.3 with tcp transport protocol.

I found another variation of this problem... and the workaround listed above doesn't seem to help.

I have several runners that create caller clients. Here's a simple example:

# my_file_roots/my_state_dir/_runnners/charles_test.py

def test():
    caller_client = salt.client.Caller()
    pillar_key = 'salt'
    pillar = caller_client.cmd('pillar.get', pillar_key, pillarenv='base')
    return pillar

After my recent upgrade to 2018.3.4, I made sure to have pyzmq==17.1.0 as mentioned here.

Runners which create caller clients work for me after pinning pyzmq when I run them via salt-run, but calling them over the http API (rest_cherrypy) brings me back to the IOLoop is already running error..

[root@ops-saltmastertest-1c001 salt]# salt-run charles_test.test
[WARNING ] The 'environment' minion config option has been renamed to 'saltenv'. Using cmclaughlin as the 'saltenv' config value.
[ERROR   ] Broken CPE_NAME format in /etc/os-release!
[WARNING ] The 'environment' minion config option has been renamed to 'saltenv'. Using cmclaughlin as the 'saltenv' config value.
[ERROR   ] Broken CPE_NAME format in /etc/os-release!
minion:
    ----------
    master_tries:
        -1
    mine_functions:
        ----------
        network.interface_ip:
            - eth0
    pillarenv_from_saltenv:
        True
restart_via_at:
    True
[INFO    ] Runner completed: 20190520203558135448

# Same runner over http...
[root@ops-saltmastertest-1c001 salt]# pepper --client runner charles_test.test
{
    "return": [
        "Exception occurred in runner charles_test.test: Traceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/salt/client/mixins.py\", line 387, in _low\n    data['return'] = self.functions[fun](*args, **kwargs)\n  File \"/salt/extension_modules/runners/charles_test.py\", line 10, in test\n    caller_client = salt.client.Caller()\n  File \"/usr/lib/python2.7/site-packages/salt/client/__init__.py\", line 2114, in __init__\n    self.sminion = salt.minion.SMinion(self.opts)\n  File \"/usr/lib/python2.7/site-packages/salt/minion.py\", line 806, in __init__\n    lambda: self.eval_master(self.opts, failed=True)\n  File \"/usr/lib64/python2.7/site-packages/tornado/ioloop.py\", line 439, in run_sync\n    self.start()\n  File \"/usr/lib64/python2.7/site-packages/zmq/eventloop/ioloop.py\", line 162, in start\n  File \"/usr/lib64/python2.7/site-packages/tornado/ioloop.py\", line 730, in start\n    raise RuntimeError(\"IOLoop is already running\")\nRuntimeError: IOLoop is already running\n"
    ]
}

Should this be considered a new issue?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

twangboy picture twangboy  Â·  3Comments

mooperd picture mooperd  Â·  3Comments

saurabhnemade picture saurabhnemade  Â·  3Comments

golmaal picture golmaal  Â·  3Comments

Oloremo picture Oloremo  Â·  3Comments