Salt: Clearing pillar cache from Master leaves stray files & does not appear to clear it

Created on 18 Oct 2016 · 12 comments · Source: saltstack/salt

Description of Issue/Question

Using the pillar cache with a disk backend. The cache.clear_pillar runner appears to have cleared the cache when cache.pillar is called immediately afterwards, but the cache files persist on disk, and as long as those files are within the cache TTL, the cached pillar data is still served from them.

This is notable because we use a _custom external pillar_ that takes about 30 seconds to retrieve its data, so it is very obvious whether or not the cached data is being returned. I emphasize that it is a custom external pillar in case this issue is a complication of using one.
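
(For anyone reproducing this, a crude way to see the difference is simply to time the call. Hypothetical sketch; the ~30 s figure is specific to our external pillar, and the 10 s threshold below is arbitrary.)

# Timing probe: with an external pillar that takes ~30 s, wall-clock
# time of pillar.items distinguishes a cache hit from a cache miss.
import subprocess
import time

start = time.time()
subprocess.call(['salt-call', 'pillar.items'])
elapsed = time.time() - start
print('pillar.items took %.1fs -> %s'
      % (elapsed,
         'probable cache hit' if elapsed < 10 else 'external pillar ran'))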

Setup

(Please provide relevant configs and/or SLS files (Be sure to remove sensitive info).)
Master config:

pillar_cache: True
pillar_cache_backend: disk
pillar_cache_ttl: 86400
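
With pillar_cache_ttl: 86400 the master treats a cached entry as fresh for 24 hours. A minimal sketch for spotting entries it will still serve, assuming (as the behaviour below suggests) that freshness roughly tracks each cache file's mtime; the directory path is the default on this master:

# Sketch, assuming freshness roughly tracks each cache file's mtime.
import os
import time

TTL = 86400  # pillar_cache_ttl from the config above
CACHE_DIR = '/var/cache/salt/master/pillar_cache'  # default location (assumption)

for name in sorted(os.listdir(CACHE_DIR)):
    age = time.time() - os.path.getmtime(os.path.join(CACHE_DIR, name))
    status = 'fresh (still served)' if age < TTL else 'expired'
    print('%-16s %8.0fs old  %s' % (name, age, status))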

Steps to Reproduce Issue

(Include debug logs if possible and relevant.)
1) Populate the cache: salt-call pillar.items.
2) Attempt to clear the cache with the runner. It returns True and cache.pillar then reports the minions as empty, yet the cache files remain on disk (see the inspection sketch after this transcript):

(salt) root@:pillar_cache # salt-run cache.clear_pillar '*'
True
(salt) root@:pillar_cache # salt-run cache.pillar
i-ot7rj6eg:
    ----------
i-v8vpgxba:
    ----------
(salt) root@:pillar_cache # ls -l
total 40
-rw-r--r-- 1 root root 17956 Oct 18 18:31 i-ot7rj6eg
-rw-r--r-- 1 root root 17956 Oct 18 18:54 i-v8vpgxba
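
The leftover files are not empty stubs; they still unpack to the full cached pillar. A quick inspection sketch, assuming the disk backend's files are msgpack-serialized (which they appear to be here):

# Inspection sketch: unpack each leftover cache file and show its
# top-level keys (assumes msgpack serialization by the disk backend).
import os
import msgpack

CACHE_DIR = '/var/cache/salt/master/pillar_cache'

for minion_id in sorted(os.listdir(CACHE_DIR)):
    with open(os.path.join(CACHE_DIR, minion_id), 'rb') as fh:
        cached = msgpack.unpack(fh)
    print('%s: %d top-level keys: %s' % (minion_id, len(cached), sorted(cached)))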

3) Run salt-call pillar.items again. Because the files are still within the TTL, the cached data is served without re-running the external pillar:

[ERROR   ] Unable to read instance data, giving up
[INFO    ] Not an EC2 instance, skipping
[DEBUG   ] Connecting to master. Attempt 1 of 1
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506')
[DEBUG   ] Generated random reconnect delay between '1000ms' and '11000ms' (9163)
[DEBUG   ] Setting zmq_reconnect_ivl to '9163ms'
[DEBUG   ] Setting zmq_reconnect_ivl_max to '11000ms'
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506', 'clear')
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_4b91cc8893_pub.ipc
[DEBUG   ] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_4b91cc8893_pull.ipc
[DEBUG   ] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_4b91cc8893_pull.ipc
[DEBUG   ] Sending event: tag = salt/auth/creds; data = {'_stamp': '2016-10-18T18:58:14.714370', 'creds': {'publish_port': 4505, 'aes': 'bWRJlUX08onQaBa1U4b9AV5JmEiS/QXwgZwWyNDrB1umL7ww38V2jr4NqNvDo1EUXQePfPsAtDo=', 'master_uri': 'tcp://172.254.254.100:4506'}, 'key': ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506')}
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Determining pillar cache
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506')
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] LazyLoaded pillar.items
[DEBUG   ] Determining pillar cache
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506')
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506')
[DEBUG   ] LazyLoaded nested.output

4) Repeat step 2, then manually delete the files in pillar_cache (a sketch of that deletion follows the log below).
5) Run salt-call pillar.items once more: the pillar now refreshes its data from the external pillar as expected. The SaltReqTimeoutErrors in the following log are safe to ignore.

[ERROR   ] Unable to read instance data, giving up
[INFO    ] Not an EC2 instance, skipping
[DEBUG   ] Connecting to master. Attempt 1 of 1
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506')
[DEBUG   ] Generated random reconnect delay between '1000ms' and '11000ms' (10703)
[DEBUG   ] Setting zmq_reconnect_ivl to '10703ms'
[DEBUG   ] Setting zmq_reconnect_ivl_max to '11000ms'
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'i-v8vpgxba', 'tcp://172.254.254.100:4506', 'clear')
[DEBUG   ] SaltReqTimeoutError, retrying. (1/7)
[DEBUG   ] SaltReqTimeoutError, retrying. (2/7)
[DEBUG   ] SaltReqTimeoutError, retrying. (3/7)
[DEBUG   ] SaltReqTimeoutError, retrying. (4/7)
[DEBUG   ] SaltReqTimeoutError, retrying. (5/7)
[DEBUG   ] SaltReqTimeoutError, retrying. (6/7)
[DEBUG   ] SaltReqTimeoutError, retrying. (7/7)
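
For reference, the manual deletion in step 4 amounts to removing every file under the cache directory, e.g. this sketch (equivalent to rm -f /var/cache/salt/master/pillar_cache/*):

# Sketch of the manual cache wipe used in step 4.
import glob
import os

CACHE_DIR = '/var/cache/salt/master/pillar_cache'

for path in glob.glob(os.path.join(CACHE_DIR, '*')):
    os.remove(path)  # equivalent to rm -f on the directory contents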

Versions Report

(Provided by running salt --versions-report. Please also mention any differences in master/minion versions.)

Masters and minions use the same version.

Salt Version:
           Salt: 2016.3.3-186-g86ac8bd

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.5.3
          gitdb: 0.6.4
      gitpython: 2.0.2
          ioflo: Not Installed
         Jinja2: 2.8
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
         Python: 2.7.9 (default, Jun 29 2016, 13:08:31)
   python-gnupg: Not Installed
         PyYAML: 3.12
          PyZMQ: 16.0.0
           RAET: Not Installed
          smmap: 0.9.0
        timelib: Not Installed
        Tornado: 4.4.2
            ZMQ: 4.1.5

System Versions:
           dist: debian 8.6 
        machine: x86_64
        release: 3.16.0-4-amd64
         system: Linux
        version: debian 8.6
Labels: Aluminium, Bug, Community, Confirmed, Pillar, Platform, TEAM Platform, phase-plan, severity-medium, status-in-prog


All 12 comments

I am able to replicate this on 2016.3.3 with just regular pillar files.

[root@salt ~]# cat /srv/pillar/test.sls
this: one
that: two
those:
  - three
  - four
  - five
[root@salt ~]# salt-call pillar.items
[WARNING ] /usr/lib/python2.7/site-packages/salt/grains/core.py:1493: DeprecationWarning: The "osmajorrelease" will be a type of an integer.

local:
    ----------
    that:
        two
    this:
        one
    those:
        - three
        - four
        - five
[root@salt ~]# sed -iv 's/five/six/' /srv/pillar/test.sls
[root@salt ~]# cat !$
cat /srv/pillar/test.sls
this: one
that: two
those:
  - three
  - four
  - six
[root@salt ~]# salt-call pillar.items
[WARNING ] /usr/lib/python2.7/site-packages/salt/grains/core.py:1493: DeprecationWarning: The "osmajorrelease" will be a type of an integer.

local:
    ----------
    that:
        two
    this:
        one
    those:
        - three
        - four
        - five
[root@salt ~]# salt-run cache.clear_pillar '*'
[WARNING ] /usr/lib/python2.7/site-packages/salt/grains/core.py:1493: DeprecationWarning: The "osmajorrelease" will be a type of an integer.

True
[root@salt ~]# salt-call pillar.items
[WARNING ] /usr/lib/python2.7/site-packages/salt/grains/core.py:1493: DeprecationWarning: The "osmajorrelease" will be a type of an integer.

local:
    ----------
    that:
        two
    this:
        one
    those:
        - three
        - four
        - five

It appears that cache.clear_pillar is not actually invalidating the pillar cache.
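
One plausible (unverified) explanation: the runner clears the master's per-minion data cache under /var/cache/salt/master/minions, while the disk pillar cache backend keeps its own files in a separate pillar_cache directory the runner never touches. A sketch to check which tree actually changes after running salt-run cache.clear_pillar (paths assume a default install):

# Diagnostic sketch: snapshot both cache trees before/after the runner
# to see which one cache.clear_pillar actually modifies. The split-tree
# explanation above is a hypothesis, not confirmed against the source.
import os

TREES = (
    '/var/cache/salt/master/minions',       # per-minion data cache
    '/var/cache/salt/master/pillar_cache',  # disk pillar cache backend
)

for root in TREES:
    if not os.path.isdir(root):
        print('%s: missing' % root)
        continue
    mtimes = [os.path.getmtime(os.path.join(dirpath, fname))
              for dirpath, _, files in os.walk(root) for fname in files]
    print('%s: %d files, newest mtime %s'
          % (root, len(mtimes), max(mtimes) if mtimes else 'n/a'))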

Any news? I have old data in my pillar on 2016.11.

Any news on this one?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

I have the same problem with 2017.7.8. The only option I found is to rm -f /var/cache/salt/master/pillar_cache/*, but I know that is not a clean way to do it.
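
A slightly safer sketch of the same workaround: stop the master first so it cannot rewrite entries mid-delete, and only remove the entries matching the minions you want cleared (mirroring the runner's target glob). Paths assume a default install:

# Hedged workaround sketch: wipe only matching cache entries while the
# master is stopped (systemctl stop salt-master / start it afterwards).
import fnmatch
import os

CACHE_DIR = '/var/cache/salt/master/pillar_cache'
TARGET = '*'  # same glob you would pass to salt-run cache.clear_pillar

for minion_id in os.listdir(CACHE_DIR):
    if fnmatch.fnmatch(minion_id, TARGET):
        os.remove(os.path.join(CACHE_DIR, minion_id))

The next pillar.items after restarting the master repopulates the cache from the real pillar sources.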

Thank you for updating this issue. It is no longer marked as stale.

ping?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

Thank you for updating this issue. It is no longer marked as stale.

I'm seeing this same behavior with version 3000.3.

It probably goes without saying, but we experience the same issue on 2019.2.5 too.

Just yesterday I had to quickly roll back a broken build of one of our apps that someone had pushed and that had somehow passed its tests and QA, and the quickest way to do that was (or should have been) to update the pillar file to pin the version to the previous release.

In my haste I forgot about this bug, and spent several minutes (much longer than it should have taken) tracking it down so I could remember why the cache wouldn't clear. It was not a good look.

We are seeing this with version 3000.2. What is the safest way to remove the cache items?
