saltutil.cmd giving an exception on the minion end

Created on 23 Oct 2019 · 17 Comments · Source: saltstack/salt

Description of Issue

Trying to issue this on the salt-master:

salt 'minion1' saltutil.cmd '*' grains.items

or

salt-call saltutil.cmd '*' grains.items

results in the exception below. The log from the target minion:

2019-10-23 18:21:45,663 [salt.minion      :1605][INFO    ][9981] Starting a new job 20191023182145625298 with PID 9981
2019-10-23 18:21:45,668 [salt.utils.lazy  :107 ][DEBUG   ][9981] Could not LazyLoad {0}.allow_missing_func: '{0}.allow_missing_func' is not available.
2019-10-23 18:21:45,669 [salt.config      :2161][DEBUG   ][9981] Reading configuration from /etc/salt/minion
2019-10-23 18:21:45,671 [salt.config      :2265][DEBUG   ][9981] Missing configuration file: /root/.saltrc
2019-10-23 18:21:45,672 [salt.utils.event :324 ][DEBUG   ][9981] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
2019-10-23 18:21:45,673 [salt.utils.event :327 ][DEBUG   ][9981] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
2019-10-23 18:21:45,698 [salt.client      :1709][ERROR   ][9981] Unable to connect to the salt master publisher at /var/run/salt/master
2019-10-23 18:21:45,699 [salt.minion      :1738][WARNING ][9981] The minion function caused an exception
2019-10-23 18:21:45,699 [salt.utils.event :324 ][DEBUG   ][9981] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_7d1840618d_pub.ipc
2019-10-23 18:21:45,699 [salt.utils.event :327 ][DEBUG   ][9981] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_7d1840618d_pull.ipc
2019-10-23 18:21:45,699 [salt.transport.ipc:261 ][DEBUG   ][9981] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_7d1840618d_pull.ipc
2019-10-23 18:21:45,701 [salt.utils.event :742 ][DEBUG   ][9981] Sending event: tag = _salt_error; data = {'message': u'The minion function caused an exception', 'args': ('The minion function caused an exception',), u'_stamp': '2019-10-23T16:21:45.701169'}
2019-10-23 18:21:45,701 [salt.transport.ipc:405 ][DEBUG   ][9981] Closing IPCMessageClient instance
2019-10-23 18:21:45,702 [salt.minion      :2355][DEBUG   ][1553] Minion of x.x.x.x is handling event tag '_salt_error'
2019-10-23 18:21:45,702 [salt.minion      :2528][DEBUG   ][1553] Forwarding salt error event tag=_salt_error
2019-10-23 18:21:45,702 [salt.minion      :767 ][DEBUG   ][9981] Minion return retry timer set to 10 seconds (randomized)
2019-10-23 18:21:45,703 [salt.minion      :1941][INFO    ][9981] Returning information for job: 20191023182145625298
2019-10-23 18:21:45,703 [salt.transport.zeromq:132 ][DEBUG   ][1553] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'minion1', u'tcp://x.x.x.x:4506', u'aes')
2019-10-23 18:21:45,703 [salt.transport.zeromq:132 ][DEBUG   ][9981] Initializing new AsyncZeroMQReqChannel for (u'/etc/salt/pki/minion', u'minion1', u'tcp://x.x.x.x:4506', u'aes')
2019-10-23 18:21:45,703 [salt.crypt       :463 ][DEBUG   ][9981] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'minion1', u'tcp://x.x.x.x:4506')
2019-10-23 18:21:45,703 [salt.crypt       :463 ][DEBUG   ][1553] Initializing new AsyncAuth for (u'/etc/salt/pki/minion', u'minion1', u'tcp://x.x.x.x:4506')
2019-10-23 18:21:45,704 [salt.transport.zeromq:203 ][DEBUG   ][9981] Connecting the Minion to the Master URI (for the return server): tcp://x.x.x.x:4506
2019-10-23 18:21:45,704 [salt.transport.zeromq:203 ][DEBUG   ][1553] Connecting the Minion to the Master URI (for the return server): tcp://x.x.x.x:4506
2019-10-23 18:21:45,704 [salt.transport.zeromq:1178][DEBUG   ][9981] Trying to connect to: tcp://x.x.x.x:4506
2019-10-23 18:21:45,705 [salt.transport.zeromq:1178][DEBUG   ][1553] Trying to connect to: tcp://x.x.x.x:4506
2019-10-23 18:21:45,774 [salt.transport.zeromq:227 ][DEBUG   ][1553] Closing AsyncZeroMQReqChannel instance
2019-10-23 18:21:45,793 [salt.transport.zeromq:227 ][DEBUG   ][9981] Closing AsyncZeroMQReqChannel instance
2019-10-23 18:21:45,793 [salt.minion      :1783][DEBUG   ][9981] minion return:{u'fun_args': [u'minion1', u'grains.items'], u'jid': u'20191024113408001798', u'return': u'The minion function caused an exception: Traceback (most recent call last):\n  File "/usr/lib/python2.7/dist-packages/salt/minion.py", line 1660, in _thread_return\n    return_data = minion_instance.executors[fname](opts, data, func, args, kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/executors/direct_call.py", line 12, in execute\n    return func(*args, **kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1490, in cmd\n    client, tgt, fun, arg, timeout, tgt_type, ret, kwarg, **kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1454, in _exec\n    for ret_comp in _cmd(**cmd_kwargs):\n  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 841, in cmd_iter\n    **kwargs)\n  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 341, in run_job\n    \'The salt master could not be contacted. Is master running?\'\nSaltClientError: The salt master could not be contacted. Is master running?\n', u'retcode': 1, u'success': False, u'fun': u'saltutil.cmd', u'out': u'nested'}

The result returned to the master is:

    The minion function caused an exception: Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/salt/minion.py", line 1660, in _thread_return
        return_data = minion_instance.executors[fname](opts, data, func, args, kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/executors/direct_call.py", line 12, in execute
        return func(*args, **kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1490, in cmd
        client, tgt, fun, arg, timeout, tgt_type, ret, kwarg, **kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 1454, in _exec
        for ret_comp in _cmd(**cmd_kwargs):
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 841, in cmd_iter
        **kwargs)
      File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 341, in run_job
        'The salt master could not be contacted. Is master running?'
    SaltClientError: The salt master could not be contacted. Is master running?
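For reference, the traceback points at Salt's LocalClient on the minion side (cmd_iter calling run_job). A rough, illustrative sketch of that call path (not the exact Salt source) looks like this:

    # Illustrative only: roughly the call path shown in the traceback above.
    # LocalClient needs to reach the master's publish/return interfaces;
    # run_job() raises SaltClientError when it cannot.
    import salt.client

    client = salt.client.LocalClient()  # by default reads the master config
    for chunk in client.cmd_iter('*', 'grains.items', timeout=30):
        print(chunk)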


Setup

Salt master and minion are running Fluorine (2019.2.0) with Python 2.7.

Steps to Reproduce Issue

Installed Salt Fluorine on both master and minion, then issued:

salt 'minion1' saltutil.cmd '*' grains.items

Versions Report

master:

Salt Version:
           Salt: 2019.2.0

Dependency Versions:
           cffi: Not Installed
       cherrypy: unknown
       dateutil: 2.5.3
      docker-py: Not Installed
          gitdb: 2.0.0
      gitpython: 2.1.1
          ioflo: Not Installed
         Jinja2: 2.9.4
        libgit2: 0.24.5
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.24.2
         Python: 2.7.13 (default, Sep 26 2018, 18:42:22)
   python-gnupg: Not Installed
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.1
        timelib: Not Installed
        Tornado: 4.4.3
            ZMQ: 4.2.1

System Versions:
           dist: debian 9.8
         locale: UTF-8
        machine: x86_64
        release: 4.9.0-8-amd64
         system: Linux
        version: debian 9.8

minion:

Salt Version:
           Salt: 2019.2.0

Dependency Versions:
           cffi: 1.12.3
       cherrypy: Not Installed
       dateutil: 2.5.3
      docker-py: 4.0.2
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.9.4
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: Not Installed
      pycparser: 2.19
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.13 (default, Sep 26 2018, 18:42:22)
   python-gnupg: Not Installed
         PyYAML: 3.13
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.4.3
            ZMQ: 4.2.1

System Versions:
           dist: debian 9.11
         locale: UTF-8
        machine: x86_64
        release: 4.9.0-9-amd64
         system: Linux
        version: debian 9.11

Pending Discussion


All 17 comments

@waheedi, this might be a silly question, but can you connect to both TCP ports 4505 and 4506 on the host running the salt-master?
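A minimal check from the minion host could look like this (illustrative only; x.x.x.x stands for the master's address):

    # Simple TCP reachability check for the master's publish (4505) and
    # request/return (4506) ports. Replace x.x.x.x with the master address.
    import socket

    for port in (4505, 4506):
        s = socket.create_connection(('x.x.x.x', port), timeout=5)
        print('port %d reachable' % port)
        s.close()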

@arizvisa, it's a totally legitimate question, thank you. I will check whether anything is blocking the ports; I have no access at the moment, but I suspect they can be reached normally, since some other saltutil functions work just fine while the cmd function does not. I will check and report back tomorrow.

@arizvisa I telnetted to both ports and they worked fine. Connectivity is valid, up and running.

I'm not a maintainer or anything, so whoever picks up this issue may have more insight, but master/minion communication is handled by tornado and zmq. According to your backtrace, the failure happens while the results of the command are being iterated, which involves the returner, so either updating those two modules or tinkering with the returner options might help.

I've had numerous problems with the different returners until I rewrote the etcd returner, and I have been using that since. So if updating those two modules doesn't work, and you're in a tight spot while waiting for a developer to be assigned to this issue, you could try using a different returner.
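For illustration only (not a verified fix): saltutil.cmd accepts a ret argument, as the traceback above shows, so an alternative returner, assuming it is installed and configured on the minion, can be selected per invocation:

salt 'minion1' saltutil.cmd '*' grains.items ret=etcd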

I just gave this a test run and here are my results...

root@DESKTOP-SQIU2I1:/home/eacott# salt-call saltutil.cmd '*' grains.items
local:
    ----------
    DESKTOP-SQIU2I1.localdomain:
        ----------
        jid:
            20191031135824729292
        out:
            nested
        ret:
            ----------
... (snipped) ...

And a versions report

root@DESKTOP-SQIU2I1:/home/eacott# salt --versions-report
Salt Version:
           Salt: 2019.2.0

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: 4.1.0
          gitdb: 2.0.3
      gitpython: 2.1.8
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.7
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.15+ (default, Oct  7 2019, 17:39:04)
   python-gnupg: 0.4.1
         PyYAML: 3.13
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.2.5

System Versions:
           dist: Ubuntu 18.04 bionic
         locale: UTF-8
        machine: x86_64
        release: 4.4.0-18362-Microsoft
         system: Linux
        version: Ubuntu 18.04 bionic

@xeacott Thanks a lot for your reply, and thanks for the investigation @arizvisa. I think the difference in module versions makes total sense, as I have been able to make it work with a different tornado/pyzmq/zmq combination. @xeacott it would be great if you could get this working with Python 3+, as it would not make sense to keep using Python 2.7. If you can report the versions for Python 3+, that would be appreciated; a quick version-check snippet is included below for reference.

Much appreciate the efforts.

All my best,

Waheed
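For comparing setups, the following snippet (illustrative; it uses standard pyzmq and tornado attributes) prints the tornado/pyzmq/libzmq versions the minion's Python actually imports, independent of the Salt versions report:

    # Print the pyzmq, libzmq, and tornado versions this Python interpreter sees.
    import zmq
    import tornado

    print(zmq.__version__)    # pyzmq version
    print(zmq.zmq_version())  # linked libzmq version
    print(tornado.version)    # tornado version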

I tried again using Python 3.5, with the versions report below:

Salt Version:
           Salt: 2019.2.2

Dependency Versions:
           cffi: 1.12.3
       cherrypy: 3.5.0
       dateutil: 2.5.3
      docker-py: Not Installed
          gitdb: 2.0.0
      gitpython: 2.1.1
          ioflo: Not Installed
         Jinja2: 2.9.4
        libgit2: 0.28.2
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: 2.19
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.28.2
         Python: 3.5.3 (default, Sep 27 2018, 17:25:39)
   python-gnupg: Not Installed
         PyYAML: 3.12
          PyZMQ: 18.1.0
           RAET: Not Installed
          smmap: 2.0.1
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.3.2

System Versions:
           dist: debian 9.11
         locale: UTF-8
        machine: x86_64
        release: 4.9.0-11-amd64
         system: Linux
        version: debian 9.11

I now get this error message:

minion_1:
    The minion function caused an exception: Traceback (most recent call last):
      File "/usr/local/lib/python3.5/dist-packages/salt/client/__init__.py", line 1734, in pub
        payload = channel.send(payload_kwargs, timeout=timeout)
      File "/usr/local/lib/python3.5/dist-packages/salt/utils/asynchronous.py", line 64, in wrap
        ret = self._block_future(ret)
      File "/usr/local/lib/python3.5/dist-packages/salt/utils/asynchronous.py", line 74, in _block_future
        return future.result()
      File "/usr/local/lib/python3.5/dist-packages/tornado/concurrent.py", line 237, in result
        try:
      File "<string>", line 3, in raise_exc_info
      File "/usr/local/lib/python3.5/dist-packages/tornado/gen.py", line 1021, in run
        def set_result(self, key, result):
      File "/usr/local/lib/python3.5/dist-packages/salt/transport/zeromq.py", line 381, in send
        ret = yield self._uncrypted_transfer(load, tries=tries, timeout=timeout)
      File "/usr/local/lib/python3.5/dist-packages/tornado/gen.py", line 1015, in run
        def is_ready(self, key):
      File "/usr/local/lib/python3.5/dist-packages/tornado/concurrent.py", line 237, in result
        try:
      File "<string>", line 3, in raise_exc_info
      File "/usr/local/lib/python3.5/dist-packages/tornado/gen.py", line 1021, in run
        def set_result(self, key, result):
      File "/usr/local/lib/python3.5/dist-packages/salt/transport/zeromq.py", line 370, in _uncrypted_transfer
        tries=tries,
      File "/usr/local/lib/python3.5/dist-packages/tornado/gen.py", line 1015, in run
        def is_ready(self, key):
      File "/usr/local/lib/python3.5/dist-packages/tornado/concurrent.py", line 237, in result
        try:
      File "<string>", line 3, in raise_exc_info
    salt.exceptions.SaltReqTimeoutError: Message timed out

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.5/dist-packages/salt/client/__init__.py", line 337, in run_job
        **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/salt/client/__init__.py", line 1737, in pub
        'Salt request timed out. The master is not responding. You '
    salt.exceptions.SaltReqTimeoutError: Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.5/dist-packages/salt/minion.py", line 1664, in _thread_return
        return_data = minion_instance.executors[fname](opts, data, func, args, kwargs)
      File "/usr/local/lib/python3.5/dist-packages/salt/executors/direct_call.py", line 12, in execute
        return func(*args, **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/salt/modules/saltutil.py", line 1503, in cmd
        client, tgt, fun, arg, timeout, tgt_type, ret, kwarg, **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/salt/modules/saltutil.py", line 1467, in _exec
        for ret_comp in _cmd(**cmd_kwargs):
      File "/usr/local/lib/python3.5/dist-packages/salt/client/__init__.py", line 841, in cmd_iter
        **kwargs)
      File "/usr/local/lib/python3.5/dist-packages/salt/client/__init__.py", line 349, in run_job
        raise SaltClientError(general_exception)
    salt.exceptions.SaltClientError: Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.

So, I was able to reproduce this. Here is my setup...

  • Ubuntu minion is running 2019.2.0 with Python 2.7.14
  • Windows minion is running 2019.2.0 with Python 3.5.4

I ran salt '*' saltutil.cmd '*' grains.items and the Ubuntu minion returned, but my Windows minion did not; it gave me this instead:

windows:
    The minion function caused an exception: Traceback (most recent call last):
      File "c:\dev\salt\salt\client\__init__.py", line 1730, in pub
        raise SaltReqTimeoutError()
    salt.exceptions.SaltReqTimeoutError

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "c:\dev\salt\salt\client\__init__.py", line 334, in run_job
        **kwargs)
      File "c:\dev\salt\salt\client\__init__.py", line 1734, in pub
        'Salt request timed out. The master is not responding. You '
    salt.exceptions.SaltReqTimeoutError: Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "c:\dev\salt\salt\minion.py", line 1664, in _thread_return
        return_data = minion_instance.executors[fname](opts, data, func, args, kwargs)
      File "c:\dev\salt\salt\executors\direct_call.py", line 12, in execute
        return func(*args, **kwargs)
      File "c:\dev\salt\salt\modules\saltutil.py", line 1503, in cmd
        client, tgt, fun, arg, timeout, tgt_type, ret, kwarg, **kwargs)
      File "c:\dev\salt\salt\modules\saltutil.py", line 1467, in _exec
        for ret_comp in _cmd(**cmd_kwargs):
      File "c:\dev\salt\salt\client\__init__.py", line 838, in cmd_iter
        **kwargs)
      File "c:\dev\salt\salt\client\__init__.py", line 346, in run_job
        raise SaltClientError(general_exception)
    salt.exceptions.SaltClientError: Salt request timed out. The master is not responding. You may need to run your command with `--async` in order to bypass the congested event bus. With `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.

So far it seems that 2019.2.0 with Python 3 is where we see this.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

This issue was successfully reproduced by @xeacott and therefore has not gone stale as it's still just trying to live its life.

*snaps whip at bot*

Thank you for updating this issue. It is no longer marked as stale.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

not stale. been repro'd by a contributor.

Thank you for updating this issue. It is no longer marked as stale.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

not stale. has been repro'd by a contributor.

Thank you for updating this issue. It is no longer marked as stale.
