Salt: [2016.11.0rc1] salt calls sometimes have permisison errors on cache

Created on 8 Oct 2016  路  14Comments  路  Source: saltstack/salt

Description of Issue/Question

[ERROR   ] An un-handled exception was caught by salt's global exception handler:
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'
Traceback (most recent call last):
  File "/opt/local/bin/salt", line 10, in <module>
    salt_main()
  File "/opt/local/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/opt/local/lib/python2.7/site-packages/salt/cli/salt.py", line 163, in run
    for full_ret in cmd_func(**kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/opt/local/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/opt/local/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/opt/local/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'
Traceback (most recent call last):
  File "/opt/local/bin/salt", line 10, in <module>
    salt_main()
  File "/opt/local/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/opt/local/lib/python2.7/site-packages/salt/cli/salt.py", line 163, in run
    for full_ret in cmd_func(**kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/opt/local/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/opt/local/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/opt/local/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
salt.exceptions.SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'

Setup

####
## SaltStack Master Configuration
####
##########################################
## publisher_acl
publisher_acl:
  sjorge:
    - .*

# vim: tabstop=2 expandtab shiftwidth=2 softtabstop=2

Steps to Reproduce Issue

salt '*' grains.get saltversion

as sjorge

Versions Report

Salt Version:
           Salt: 2016.11.0rc1

Dependency Versions:
           cffi: 1.4.2
       cherrypy: 3.8.0
       dateutil: 2.4.0
          gitdb: 0.6.4
      gitpython: 2.0.8
          ioflo: 1.6.1
         Jinja2: 2.7.3
        libgit2: Not Installed
        libnacl: 1.5.0
       M2Crypto: 0.22
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
         pygit2: Not Installed
         Python: 2.7.12 (default, Jul 20 2016, 14:22:30)
   python-gnupg: 0.3.9
         PyYAML: 3.11
          PyZMQ: 14.4.1
           RAET: 0.6.5
          smmap: 0.9.0
        timelib: 0.2.4
        Tornado: 4.3
            ZMQ: 4.1.3

System Versions:
           dist:
        machine: i86pc
        release: 5.11
         system: SunOS
        version: Not Installed
Bug Core P3 ZRELEASED - Carbon fixed-pending-your-verification severity-critical severity-high

Most helpful comment

Seems fixed!

All 14 comments

@sjorge i'm having a hard time replicating this one. Is there anything else setup in your conf files? also did you set permissions similar to the docs here

EDIT: Also can you list the permissions on that data.p file? I'm assuming it has incorrect permissions

I cleaned the cache by running 'salt-call saltutil.clear_cache' as root. After that it worked fine, so I am unable to replicate it currently.

Some more info, I usually run salt/salt-call as sjorge and have a publisher_acl set to allow this.

I guess we can close this for now, it was probably a hickup from the 2016.3.3->2016.11.0rc1 upgrade.

I just his this problem again:

[cronos :: sjorge][~]
[.]$ salt -p '*' state.apply
[ERROR   ] An un-handled exception was caught by salt's global exception handler:###################################           | Elapsed Time: 0:00:23 Returns: [10/11]
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'
Traceback (most recent call last):
  File "/opt/local/bin/salt", line 10, in <module>
    salt_main()
  File "/opt/local/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/opt/local/lib/python2.7/site-packages/salt/cli/salt.py", line 142, in run
    for progress in cmd_func(**kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/opt/local/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/opt/local/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/opt/local/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'
Traceback (most recent call last):
  File "/opt/local/bin/salt", line 10, in <module>
    salt_main()
  File "/opt/local/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/opt/local/lib/python2.7/site-packages/salt/cli/salt.py", line 142, in run
    for progress in cmd_func(**kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/opt/local/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/opt/local/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/opt/local/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
salt.exceptions.SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'
[!]$ ls -l /var/cache/salt/master/minions/             
total 6
drwxr-xr-x 2 root root 3 Oct 11 15:02 apollo
drwxr-xr-x 2 root root 3 Oct  9 01:34 ares
drwxr-xr-x 2 root root 4 Oct 11 15:02 artemis
drwxr-xr-x 2 root root 3 Oct 11 14:57 core
drwxr-xr-x 2 root root 4 Oct 11 14:47 cronos
drwxr-xr-x 2 root root 3 Oct 11 14:55 crust
drwxr-xr-x 2 root root 3 Oct 11 14:43 eris
drwxr-xr-x 2 root root 3 Oct 11 14:47 eros
drwxr-xr-x 2 root root 3 Oct 11 14:45 hermes
drwxr-xr-x 2 root root 3 Oct 11 14:53 isotope
drwxr-xr-x 2 root root 3 Oct 11 14:57 mantle
[cronos :: sjorge][~]
[!]$ ls -l /var/cache/salt/master/minions/*/data.p
-rw------- 1 root root  25608 Oct 11 15:02 /var/cache/salt/master/minions/apollo/data.p
-rw------- 1 root root  25502 Oct  9 01:34 /var/cache/salt/master/minions/ares/data.p
-rw------- 1 root root  24562 Oct 11 15:02 /var/cache/salt/master/minions/artemis/data.p
-rw------- 1 root root 129471 Oct 11 14:57 /var/cache/salt/master/minions/core/data.p
-rw------- 1 root root  22685 Oct 11 14:47 /var/cache/salt/master/minions/cronos/data.p
-rw------- 1 root root  26880 Oct 11 14:55 /var/cache/salt/master/minions/crust/data.p
-rw------- 1 root root  27184 Oct 11 14:43 /var/cache/salt/master/minions/eris/data.p
-rw------- 1 root root  27234 Oct 11 14:47 /var/cache/salt/master/minions/eros/data.p
-rw------- 1 root root  24822 Oct 11 14:45 /var/cache/salt/master/minions/hermes/data.p
-rw------- 1 root root  25430 Oct 11 14:53 /var/cache/salt/master/minions/isotope/data.p
-rw------- 1 root root  65191 Oct 11 14:57 /var/cache/salt/master/minions/mantle/data.p

It's weird that it only fails at minion 10, all the permissions are the same for all files? Oddly enough it is on apollo again :s

Just running apollo works fine:

[cronos :: sjorge][~]
[.]$ salt apollo state.apply 
apollo:
  Name: nl_BE.UTF-8 - Function: locale.present - Result: Clean Started: - 15:14:35.897316 Duration: 265.63 ms
  Name: en_US.UTF-8 - Function: locale.present - Result: Clean Started: - 15:14:36.163564 Duration: 261.965 ms
  Name: en_US.UTF-8 - Function: locale.system - Result: Clean Started: - 15:14:36.431999 Duration: 98.723 ms
  Name: Europe/Brussels - Function: timezone.system - Result: Clean Started: - 15:14:36.539638 Duration: 4.227 ms
  Name: /etc/resolv.conf - Function: file.managed - Result: Clean Started: - 15:14:36.558299 Duration: 107.298 ms
  Name: /etc/hosts - Function: file.managed - Result: Clean Started: - 15:14:36.665875 Duration: 75.204 ms
  Name: /usr/sbin/route -p add -inet 0.0.0.0/0 172.16.30.1 -ifp net0 -static - Function: cmd.run - Result: Clean Started: - 15:14:36.743186 Duration: 121.713 ms
  Name: /usr/sbin/route -p add -inet6 ::/0 2001:470:7ee7:30::1 -ifp net0 -static - Function: cmd.run - Result: Clean Started: - 15:14:36.866078 Duration: 129.654 ms
  Name: /etc/motd - Function: file.managed - Result: Clean Started: - 15:14:36.996404 Duration: 166.471 ms
  Name: zsh - Function: pkg.installed - Result: Clean Started: - 15:14:40.002823 Duration: 565.639 ms
  Name: bash - Function: pkg.installed - Result: Clean Started: - 15:14:40.569029 Duration: 55.164 ms
  Name: root - Function: user.present - Result: Clean Started: - 15:14:40.628029 Duration: 105.452 ms
  Name: sysadmin - Function: group.present - Result: Clean Started: - 15:14:40.735606 Duration: 1.48 ms
  Name: user - Function: group.present - Result: Clean Started: - 15:14:40.737348 Duration: 0.919 ms
  Name: service - Function: group.present - Result: Clean Started: - 15:14:40.738538 Duration: 0.915 ms
  Name: sudo - Function: pkg.installed - Result: Clean Started: - 15:14:40.739710 Duration: 56.817 ms
  Name: sjorge - Function: user.present - Result: Clean Started: - 15:14:40.796805 Duration: 107.813 ms
  Name: AAAAB3NzaC1yc2EAAAABIwAAAQEA7st+KRfqBdNGhyEDBxzssIv9xVH7yl5fMOUVHLk1VTmpOyCpYi/+yI2QsMIdndFqWK7rM4xgCK+osKP2kj/edy7FTzi78c6vP3Jli+Dt17ZgC/CxlEReJe+o0/l1tdWbl+kg9nB0DtGMSvntFcIdQmn5ZiPOzGrLKvI9Tngt0BWLrYXlQzBkq65MBbizoxkErR4nHppt9mLeTR9ZY6YsRi2roVVNUJWKat/RvC6ncPs4+Mpy56330MA+40cGlFFlucQnlzIx6ghey13P+RdZYuexJP6JunyZOntCcieahNJ+FklCEeeVChhjO0tFtioz49udCEOxo9kmfDUOiqi61w== - Function: ssh_auth.present - Result: Clean Started: - 15:14:40.908600 Duration: 8.833 ms
  Name: /opt/local/etc/sudoers.d/sjorge - Function: file.managed - Result: Clean Started: - 15:14:40.920890 Duration: 4.075 ms
  Name: sjorge - Function: rbac.managed - Result: Clean Started: - 15:14:40.935855 Duration: 3.83 ms
  Name: /home/sjorge - Function: archive.extracted - Result: Clean Started: - 15:14:42.042753 Duration: 1.057 ms
  Name: /home/sjorge - Function: archive.extracted - Result: Clean Started: - 15:14:42.046068 Duration: 0.912 ms
  Name: /home/sjorge - Function: archive.extracted - Result: Clean Started: - 15:14:42.049212 Duration: 0.892 ms
  Name: /home/sjorge - Function: file.directory - Result: Clean Started: - 15:14:42.061907 Duration: 27.289 ms
  Name: /home/sjorge/.salt/profile-default - Function: file.managed - Result: Clean Started: - 15:14:42.108180 Duration: 3.045 ms
  Name: /home/sjorge/.salt/profile-shell - Function: file.managed - Result: Clean Started: - 15:14:42.129875 Duration: 3.035 ms
  Name: /home/sjorge/.salt/profile-user - Function: file.managed - Result: Clean Started: - 15:14:42.151567 Duration: 3.027 ms
  Name: tmux - Function: pkg.installed - Result: Clean Started: - 15:14:42.154881 Duration: 55.339 ms
  Name: watch - Function: pkg.installed - Result: Clean Started: - 15:14:42.210468 Duration: 58.177 ms
  Name: package::schedule-upgrade - Function: schedule.present - Result: Clean Started: - 15:14:42.268892 Duration: 22.558 ms
  Name: states - Function: grains.present - Result: Clean Started: - 15:14:42.291694 Duration: 2.174 ms
  Name: openssl - Function: pkg.installed - Result: Clean Started: - 15:14:42.294115 Duration: 55.558 ms
  Name: perl - Function: pkg.installed - Result: Clean Started: - 15:14:42.349919 Duration: 54.381 ms
  Name: /opt/local/etc/openssl/certs - Function: file.directory - Result: Clean Started: - 15:14:42.404548 Duration: 1.737 ms
  Name: /opt/local/etc/openssl/certs/internal-ca.crt - Function: x509.pem_managed - Result: Clean Started: - 15:14:42.409250 Duration: 1.795 ms
  Name: /opt/local/bin/c_rehash - Function: cmd.wait - Result: Clean Started: - 15:14:42.413337 Duration: 0.86 ms
  Name: /opt/local/etc/pki - Function: file.directory - Result: Clean Started: - 15:14:42.414421 Duration: 1.591 ms
  Name: /opt/local/etc/pki/goto.acheron.be.key - Function: x509.private_key_managed - Result: Clean Started: - 15:14:42.438258 Duration: 3.756 ms
  Name: /opt/local/etc/pki/goto.acheron.be.crt - Function: x509.certificate_managed - Result: Clean Started: - 15:14:42.463873 Duration: 400.842 ms
  Name: salt:minion - Function: service.running - Result: Clean Started: - 15:14:42.864972 Duration: 778.577 ms
  Name: /salt/config/minion - Function: file.managed - Result: Clean Started: - 15:14:43.644727 Duration: 67.841 ms
  Name: /salt/config/minion.d - Function: file.directory - Result: Clean Started: - 15:14:43.712838 Duration: 2.371 ms
  Name: /salt/config/minion.d/smtp.conf - Function: file.managed - Result: Clean Started: - 15:14:43.718829 Duration: 53.865 ms
  Name: salt.minion::schedule-apply - Function: schedule.present - Result: Clean Started: - 15:14:43.772978 Duration: 18.788 ms
  Name: /etc/ssh/sshd_config - Function: file.managed - Result: Clean Started: - 15:14:43.792043 Duration: 71.102 ms
  Name: ssh - Function: service.running - Result: Clean Started: - 15:14:43.866116 Duration: 718.305 ms
  Name: py27-cherrypy - Function: pkg.installed - Result: Clean Started: - 15:14:44.585092 Duration: 60.133 ms
  Name: py27-yaml - Function: pkg.installed - Result: Clean Started: - 15:14:44.645500 Duration: 56.173 ms
  Name: bottle - Function: pkg.installed - Result: Clean Started: - 15:14:44.701942 Duration: 56.234 ms
  Name: py27-OpenSSL - Function: pkg.installed - Result: Clean Started: - 15:14:44.758426 Duration: 56.103 ms
  Name: /opt/goto/sbin/goto-daemon - Function: file.managed - Result: Clean Started: - 15:14:44.818087 Duration: 27.808 ms
  Name: /opt/goto/sbin/.goto-daemon.xml - Function: file.managed - Result: Clean Started: - 15:14:44.848945 Duration: 23.485 ms
  Name: svccfg import /opt/goto/sbin/.goto-daemon.xml - Function: cmd.wait - Result: Clean Started: - 15:14:44.875556 Duration: 1.006 ms
  Name: /opt/goto/etc/goto.yaml - Function: file.managed - Result: Clean Started: - 15:14:44.893674 Duration: 167.271 ms
  Name: goto - Function: service.running - Result: Clean Started: - 15:14:45.068551 Duration: 765.134 ms
  Name: /opt/goto/etc/goto-mapping.yaml - Function: file.managed - Result: Clean Started: - 15:14:45.839759 Duration: 39.688 ms
  Name: salt.minion::restarter - Function: at.watch - Result: Clean Started: - 15:14:45.885189 Duration: 1.144 ms

Summary for apollo
-------------
Succeeded: 57
Failed:     0
-------------
Total states run:     57
Total run time:    5.805 s

Apollo with a few friends:

[cronos :: sjorge][~]
[.]$ salt -L apollo,artemis,cronos -p state.apply
100% |#######################################################################################################################################################| Elapsed Time: 0:00:10 Returns: [3/3]

-------------------------------------------
Summary
-------------------------------------------
# of minions targeted: 3
# of minions returned: 3
# of minions that did not return: 0
# of minions with errors: 0
-------------------------------------------
[cronos :: sjorge][~]
[!]$ salt -L curst,artemis,cronos,core,mantle,ares -p state.apply
[ERROR   ] An un-handled exception was caught by salt's global exception handler:#############################################                               | Elapsed Time: 0:00:18 Returns: [4/5]
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'
Traceback (most recent call last):
  File "/opt/local/bin/salt", line 10, in <module>
    salt_main()
  File "/opt/local/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/opt/local/lib/python2.7/site-packages/salt/cli/salt.py", line 142, in run
    for progress in cmd_func(**kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/opt/local/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/opt/local/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/opt/local/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'
Traceback (most recent call last):
  File "/opt/local/bin/salt", line 10, in <module>
    salt_main()
  File "/opt/local/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/opt/local/lib/python2.7/site-packages/salt/cli/salt.py", line 142, in run
    for progress in cmd_func(**kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/opt/local/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/opt/local/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/opt/local/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/opt/local/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
salt.exceptions.SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/apollo/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/apollo/data.p'

5 minions in one go and I get the error.
<5 minions and it's fine.

When running with more than 5 minions it always fails, doesn't matter which minions I run on.
Even if apollo is not in the list of minions, it still errors out on apolo/data.p!

I think there is a very funky race condition somewhere.

Aha! one of my minions had died (not Apollo btw!)

How to replicate:

  1. setup publisher_acl for user XYZ (make sure you have a few minions)
  2. [as XYZ] salt -p '*' state.apply -> all should be fine
  3. [as root] salt -p '*' state.apply -> all should be fine
  4. stop the minion on a random host
  5. [as XYZ] salt -p '*' state.apply -> get error
  6. [as root] salt -p '*' state.apply -> all should be fine, one minion did not return
  7. start the minion again
  8. all good again

@Ch3LL took a while before I could constantly replicate

It always complains about the first minion's data.p (alphabetically) which totally threw me of.

Perfect! Thank you for the investigation work on this one as well! I am now able to replicate it with just two minions connected to a master and using your test case as you mention exactly above, but just ot make it easier for anyone else replicating I did the following:

  1. useradd ch3ll
  2. add following to /etc/salt/master:
publisher_acl:
  ch3ll:
    - .*
  1. add two minions
  2. chown -R ch3ll:ch3ll /var/log/salt/
  3. chmod 755 /var/cache/salt /var/cache/salt/master /var/cache/salt/master/jobs /var/run/salt /var/run/salt/master
  4. then run the steps as you posted above 1-6

And here is my sstack trace:

[ERROR   ] An un-handled exception was caught by salt's global exception handler:
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/cent-test-M1/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/cent-test-M1/data.p'
Traceback (most recent call last):
  File "/bin/salt", line 10, in <module>
    salt_main()
  File "/usr/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/salt.py", line 163, in run
    for full_ret in cmd_func(**kwargs):
  File "/usr/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/usr/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/usr/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/usr/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/usr/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/cent-test-M1/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/cent-test-M1/data.p'
Traceback (most recent call last):
  File "/bin/salt", line 10, in <module>
    salt_main()
  File "/usr/lib/python2.7/site-packages/salt/scripts.py", line 446, in salt_main
    client.run()
  File "/usr/lib/python2.7/site-packages/salt/cli/salt.py", line 163, in run
    for full_ret in cmd_func(**kwargs):
  File "/usr/lib/python2.7/site-packages/salt/client/__init__.py", line 638, in cmd_cli
    **kwargs):
  File "/usr/lib/python2.7/site-packages/salt/client/__init__.py", line 1332, in get_cli_event_returns
    connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
  File "/usr/lib/python2.7/site-packages/salt/utils/minions.py", line 583, in connected_ids
    mdata = self.cache.fetch('minions/{0}'.format(id_), 'data')
  File "/usr/lib/python2.7/site-packages/salt/cache/__init__.py", line 146, in fetch
    return self.modules[fun](bank, key)
  File "/usr/lib/python2.7/site-packages/salt/cache/localfs.py", line 68, in fetch
    key_file, exc
salt.exceptions.SaltCacheError: There was an error reading the cache file "/var/cache/salt/master/minions/cent-test-M1/data.p": [Errno 13] Permission denied: '/var/cache/salt/master/minions/cent-test-M1/data.p'

And as you stated above this is not the minion that is actually stopped. The minion where the permission denied error shows up is actually running

I have a truss (solaris) output of the syscalls if that helps.

56934:  stat("/var/cache/salt/master/minions/apollo/data.p", 0xFFFFFD7FFFDFE690) = 0
56934:  open("/var/cache/salt/master/minions/apollo/data.p", O_RDONLY) Err#13 EACCES [file_dac_read]

is probably the only interesting bit.

Ok, after doing a git bisect with these awesome steps to reproduce the error, it looks like #35388 broke this support, which makes sense. I'll keep digging, but I wanted to update everyone here.

Ok, so the changes in #35388 didn't _break_ this per se. In 2016.3 and older, when we try to access the cache files and we don't have permission, we are catching the IOErrors when permission is denied and continuing. The pluggable minion database change made it so that when we see an IOError, we're explicitly raising the SaltCacheError, which is exposing the stack trace on the CLI to the non-root user.

PR #37020 fixes this up by catching the SaltCacheError and continuing, like we did before with the various IOErrors.

Thanks to @Ch3LL for providing the reproducible test case and to @sjorge for filing this and helping us track it down! With the fix applied, the minion now has the correct "not connected" response:

cent-test-min1.c7.saltstack.net:
    Minion did not return. [No response]

Instead of stack tracing. :)

@rallytime I patched salt/utils/minion.py and I am still getting it.

I also tried putting the patched minion.py in _utils/ and syncing it to all nodes. I still get the error.

@sjorge The patch is in salt.utils.minions.py (note the s, not minion.py). I just want to make sure you're patching the right file and that isn't a typo. Can you confirm?

@rallytime woops, wrong file. Let met reset both and patch the right one

Seems fixed!

Was this page helpful?
0 / 5 - 0 ratings