Salt: pillars not updated on minions until salt-minion is restarted

Created on 15 Mar 2016 · 25 comments · Source: saltstack/salt

Description of Issue/Question

When using git_pillar, if I make a change to the pillar data in the repo and run:

salt '*' saltutil.refresh_pillar

The pillar data is not updated. Only after the minion is restarted do the new pillars show up when I run

salt '*' pillar.item pillar_name

The git_pillar config file in /etc/salt/master.d/pillar_config.conf:

   git_pillar_provider: pygit2
   ext_pillar:
     - git:
       - master gitlab@repository_url/repository_name.git:
         - root: pillar/base
         - env: base
         - privkey: /root/.ssh/id_rsa
         - pubkey: /root/.ssh/id_rsa.pub
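
For context, with root: pillar/base the master expects a top file inside the repo at pillar/base/top.sls. A minimal sketch might look like the following, assuming the key queried above lives in a pillar/base/pillar_name.sls file (the SLS name here is purely illustrative):

    # pillar/base/top.sls in the git_pillar repo (sketch)
    base:
      '*':
        - pillar_name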

Steps to reproduce:

I am not sure how to reproduce this. I am using the same repo for gitfs and git_pillar, and all hosts are RHEL 6/7 running in a virtual environment (VMware).

Versions Report

Salt Version:
           Salt: 2015.8.7

Dependency Versions:
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
           Mako: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.7.0
         Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
           RAET: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
           cffi: 0.8.6
       cherrypy: Not Installed
       dateutil: 1.5
          gitdb: 0.5.4
      gitpython: 0.3.2 RC1
          ioflo: Not Installed
        libgit2: 0.21.0
        libnacl: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
         pygit2: 0.21.4
   python-gnupg: Not Installed
          smmap: 0.8.1
        timelib: Not Installed

System Versions:
           dist: redhat 7.2 Maipo
        machine: x86_64
        release: 3.10.0-229.el7.x86_64
         system: Red Hat Enterprise Linux Server 7.2 Maipo

All 25 comments

@aabognah, thanks for reporting. What happens when you do salt '*' pillar.get pillar_name?

salt '*' pillar.get pillar_name returns the old pillar value.

@aabognah, thanks for confirming. This is possibly related to #23391 and #25160.

I updated to the new 2015.8.8.2 version but the problem still exists.

I have another setup with another master (running on RHEL6), and I don't see the problem on that one.

Here is the versions report of the WORKING master (where minions DO NOT need to be restarted for pillar updates to show up):

salt --versions-report
Salt Version:
       Salt: 2015.8.8.2

Dependency Versions:
     Jinja2: unknown
   M2Crypto: 0.20.2
       Mako: Not Installed
     PyYAML: 3.11
      PyZMQ: 14.5.0
     Python: 2.6.6 (r266:84292, May 22 2015, 08:34:51)
       RAET: Not Installed
    Tornado: 4.2.1
        ZMQ: 4.0.5
       cffi: Not Installed
   cherrypy: 3.2.2
   dateutil: 1.4.1
      gitdb: 0.5.4
  gitpython: 0.3.2 RC1
      ioflo: Not Installed
    libgit2: 0.20.0
    libnacl: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.6
mysql-python: Not Installed
  pycparser: Not Installed
   pycrypto: 2.6.1
     pygit2: 0.20.3
python-gnupg: Not Installed
      smmap: 0.8.1
    timelib: Not Installed

System Versions:
       dist: redhat 6.7 Santiago
    machine: x86_64
    release: 2.6.32-573.12.1.el6.x86_64
     system: Red Hat Enterprise Linux Server 6.7 Santiago

And here is the versions report of the NON-WORKING master (where the minions NEED TO BE RESTARTED after the pillars are updated for changes to take effect):

Salt Version:
       Salt: 2015.8.8.2

Dependency Versions:
     Jinja2: 2.7.2
   M2Crypto: 0.21.1
       Mako: Not Installed
     PyYAML: 3.11
      PyZMQ: 14.7.0
     Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
       RAET: Not Installed
    Tornado: 4.2.1
        ZMQ: 4.0.5
       cffi: 0.8.6
   cherrypy: Not Installed
   dateutil: 1.5
      gitdb: 0.5.4
  gitpython: 0.3.2 RC1
      ioflo: Not Installed
    libgit2: 0.21.0
    libnacl: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.4.7
mysql-python: Not Installed
  pycparser: 2.14
   pycrypto: 2.6.1
     pygit2: 0.21.4
python-gnupg: Not Installed
      smmap: 0.8.1
    timelib: Not Installed

System Versions:
       dist: redhat 7.2 Maipo
    machine: x86_64
    release: 3.10.0-229.el7.x86_64
     system: Red Hat Enterprise Linux Server 7.2 Maipo

The minions for both masters look similar and are all RHEL 6/7 or OEL. Here is a versions report of one minion:

Salt Version:
           Salt: 2015.8.8.2

Dependency Versions:
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
           Mako: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.7.0
         Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
           RAET: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
        libgit2: Not Installed
        libnacl: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
   python-gnupg: Not Installed
          smmap: Not Installed
        timelib: Not Installed

System Versions:
           dist: redhat 7.2 Maipo
        machine: x86_64
        release: 3.10.0-327.10.1.el7.x86_64
         system: Red Hat Enterprise Linux Server 7.2 Maipo

@jfindlay is there a workaround that I can implement to fix this?

@aabognah, not that I know of.

Hmm, my first reaction here is that this might be related to the difference in git provider libs. If you take pygit2 down to the version on the working master, does this problem go away?

Does the fact that I have two masters in a redundant-master setup have anything to do with this?

The setup was based on this walkthrough:
https://docs.saltstack.com/en/latest/topics/tutorials/multimaster.html

The two masters have the same key, minions are configured to check-in with both masters, and both masters look at the same repository for gitfs and git_pillar.
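
For reference, the minion side of that multimaster tutorial amounts to roughly the following (a sketch; hostnames are placeholders):

    # /etc/salt/minion (sketch)
    master:
      - salt-master-1.example.com
      - salt-master-2.example.com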

Do I need to keep the local cache files on each master in sync in order to solve this?

Same here - here is our minion:

salt-call --versions
Salt Version:
           Salt: 2016.3.2

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
         Python: 2.7.5 (default, Oct 11 2015, 17:47:16)
   python-gnupg: Not Installed
         PyYAML: 3.10
          PyZMQ: 14.5.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5

System Versions:
           dist: redhat 7.2 Maipo
        machine: x86_64
        release: 3.10.0-327.13.1.el7.x86_64
         system: Linux
        version: Red Hat Enterprise Linux Server 7.2 Maipo

We're struggling with this issue and we're running a multimaster topology with master_shuffle: True. The masters and minions are running 2016.3.4. We're using a custom pillar backend and pillar_cache: False and minion_data_cache: False.
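
For reference, the settings described above amount to roughly the following sketch (master_shuffle goes in the minion config, the cache options in the master config):

    # /etc/salt/minion (sketch)
    master_shuffle: True

    # /etc/salt/master (sketch)
    pillar_cache: False
    minion_data_cache: False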

Changes in pillars are not reflected in pillar.get and pillar.item calls when you target minions from one of the masters after you've executed saltutil.refresh_pillar. This is also the case when the pillar.get/item command is run from within a custom module on the targeted minion. A pillar.items call without arguments does hand you fresh pillar data, and running salt-call pillar.get or pillar.item on the minion also works fine.

There seems to be little activity on this and the related issues linked here. Is it something that's being worked on or are there workarounds we can use, perhaps a different multimaster topology?

I can confirm this issue is still occurring in 2016.3.8:

Salt Version:
           Salt: 2016.3.8

Dependency Versions:
           cffi: 1.1.2
       cherrypy: 11.0.0
       dateutil: 2.2
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.9.6
        libgit2: 0.22.2
        libnacl: Not Installed
       M2Crypto: 0.21.1
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: 1.2.5
      pycparser: 2.14
       pycrypto: 2.6.1
         pygit2: 0.22.0
         Python: 2.7.10 (default, Oct 14 2015, 16:09:02)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: Not Installed
        timelib: 0.2.4
        Tornado: 4.5.2
            ZMQ: 4.1.6

System Versions:
           dist: Ubuntu 15.10 wily
        machine: x86_64
        release: 4.2.0-42-generic
         system: Linux
        version: Ubuntu 15.10 wily

The only workaround, even in a simple master/minion setup, is to restart the salt-minion. Neither of the associated issues has been addressed.

This issue is still occurring on 2017.7.2

Salt Version:
           Salt: 2017.7.2

Dependency Versions:
           cffi: 1.7.0
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.7
   mysql-python: Not Installed
      pycparser: 2.10
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 3.6.4 (default, Jan  2 2018, 01:25:35)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 16.0.3
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.5.2
            ZMQ: 4.2.2

System Versions:
           dist:
         locale: US-ASCII
        machine: amd64
        release: 11.1-RELEASE
         system: FreeBSD
        version: Not Installed

Still reproducible on 2018.3

Salt Version:
           Salt: 2018.3.2

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.6.1
      docker-py: Not Installed
          gitdb: 2.0.3
      gitpython: 2.1.8
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.7
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.15rc1 (default, Apr 15 2018, 21:51:34)
   python-gnupg: 0.4.1
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: 2.0.3
        timelib: Not Installed
        Tornado: 4.5.3
            ZMQ: 4.2.5

System Versions:
           dist: Ubuntu 18.04 bionic
         locale: UTF-8
        machine: x86_64
        release: 4.15.0-23-generic
         system: Linux
        version: Ubuntu 18.04 bionic

Is it going to be addressed soon?

I'm hitting the same issue in 2018.3.2.

Same here with 2018.3.2

I ran into something similar, but a restart of the minion didn't help. My problem was with how we use packer and the salt-masterless provisioner.

The provisioner copies pillar and salt files under /srv due to remote_pillar_roots and remote_state_tree. Once a salt-master EC2 instance was launched from this new AMI, it would merge data from both /srv (left over from the build process) and gitfs/ext_pillar. I was able to see this after running the salt-master in debug mode.

The fix was to do a clean up (rm -rf /srv) during the packer build (shell provisioner) before it actually created the AMI.

https://www.packer.io/docs/provisioners/salt-masterless.html

One interesting thing: salt-call -l debug saltutil.refresh_pillar and salt-call -l debug pillar.items don't show which files were read for the pillar information.

Seeing similarly strange things in 2019.2.0 where, having updated a plain YAML/SLS-style pillar file on the master (and tried restarting the master and minions, saltutil.refresh_pillar, etc.), the minions are not getting the same values.

A bit more debugging running commands on a minion:

root@elder# salt-call pillar.get key1:key2 pillarenv=base
    primary:
        elder
    secondary:
        gage

root@elder# salt-call pillar.get key1:key2
    primary:
        gage
    secondary:
        elder

The pillar that you see when I pass pillarenv=base reflects the "source of truth" (the pillar file on the master). There is only one pillar defined on this deployment; it's a very small-scale single-master setup, not really using many clever SaltStack tricks.

I was expecting to have stumbled onto a weird edge case of my own making, but I'm instead surprised by how long this has been a problem for other users. How can we help you get more information to fix this? It's fundamental to why people use Salt: repeatability. Right now, Salt can literally deploy the wrong things on the wrong hosts.

I found that I could workaround my problems by doing this on the master:

pillar_cache: False

I found that I could workaround my problems by doing this on the master:

pillar_cache: False

I needed to do the above, but also renamed a cache directory under /var/cache/salt/master/git_pillar (and restarted the service)

I'm having the same problem. I just noticed that one of my (Windows) minions is not refreshing its in-memory pillar data after a saltutil.refresh_pillar. I'm on salt 3000 on my minions and salt 3001 on the master. I don't see what setting _pillar_cache: False_ on the master would do, since that's supposed to be the default, but I'm trying it anyway. I have done that, and I have deleted all of the directories in _/var/cache/salt/master/minions_ just to see what happens.

I also notice that pillar based scheduling stops doing anything on these minions once the refresh stops working.

In my particular case there could be some kind of timeout issue lurking in the background. I schedule a _saltutil.refresh_pillar_, but in the scheduling specification I don't see how to include a timeout value. If the salt master is not available to the minion at the time the function is called, it's possible that the job never returns, which may be the cause of what I'm seeing (somehow).
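
For context, a schedule spec like the one described looks roughly like this (a sketch; the job name and interval are illustrative, and there is no obvious place in it for a timeout):

    # minion config or pillar (sketch)
    schedule:
      periodic_pillar_refresh:
        function: saltutil.refresh_pillar
        minutes: 30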

Sorry for this stream-of-consciousness babble; I'm trying to understand what's going on. What I said about refresh_pillar makes no sense, since that just causes a signal to be sent. I am seeing this happening on machines that I believe are suspending (usually laptops) and then waking up. I notice that, since the pillar is apparently frozen, _schedule.next_fire_time_ for all of the events specified in the pillar also becomes frozen, and all the times become times in the past.

Alright, I apologise for the previous post; I wasn't really ready to say anything, but I am now (sort of).
AFAICS it only happens on minions that experience network disruptions, or at the very least it happens way more on those machines. In particular it happens a lot with laptops, and I assume that this is because people are closing the lids and putting them to sleep, or they're going to sleep on their own. I don't know if this bug happens on Linux minions, because I haven't got Linux installed on any minions that aren't always connected. I don't think that whatever sleep mode a machine goes into would make any difference (assuming the sleep mode is properly implemented), so I think it has to be the network disruption.

Some things I have observed:

  • _pillar.items_ always gives correct, up-to-date pillar data (as we would hope)
  • Pillar schedule events stop firing (or appear to). In some cases the fire times reported by _schedule.show_next_fire_time_ are in the past and stop updating; in other cases the fire times are updating but the events stop firing anyway.

That's all I've got. I have no idea how I could possibly triangulate this. I hope that this can be looked at, because I consider it to be a quite serious problem with core functionality. If it is due to network disruptions and cannot be fixed (for instance, due to how zeromq is implemented), then the FAQ should have workarounds for that situation. On Windows machines I believe I can have the scheduler restart minions after waking up (which I will try next, I think). This may be an adequate workaround, if not ideal (fingers crossed).


This seems to be 90-100% resolved for Windows minions by having the salt-minion service restart after waking up from sleep. I don't know what the situation is for linux minions. I now have much more reliability with minions (specifically the laptops) reporting in regularly and actually carrying out their scheduled events.

I have the same problem with my salt-minion version 3000. I couldn't refresh the pillars I created, and I couldn't see my pillar data on the minion even after restarting the minion. Only the static /tmp/deletemeplease.txt state in my init.sls is applied, not the user-creation state that obtains multiple users' data from pillars. Attaching my sample init.sls for the user-creation state, which needs to read data from pillars through a Jinja template, as well as my sample QA environment for the users state and its pillar directory structure. I am learning Salt to deploy our infrastructure as code via Jenkins. Your help is much appreciated here to expedite my learning process.
my-qa-env

My init.sls code to create multiple users with one state via Jinja and pillar data:

    {% for user, data in pillar.get('admin_users', {}).items() %}
    user_{{ user }}:
      user.present:
        - name: {{ user }}
        - fullname: {{ data['fullname'] }}
        - shell: {{ data['shell'] }}
        - home: {{ data['home'] }}
        - uid: {{ data['uid'] }}
        - groups: {{ data['groups'] }}

    {{ user }}_key:
      ssh_auth.present:
        - user: {{ user }}
        - name: {{ data['ssh_key'] }}

    {% endfor %}

    /tmp/deletemeplease.txt:
      file.absent
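
For completeness, the pillar structure that the loop above expects would look roughly like this (a sketch; user names and values are purely illustrative):

    # pillar sketch for the admin_users key
    admin_users:
      jdoe:
        fullname: Jane Doe
        shell: /bin/bash
        home: /home/jdoe
        uid: 2001
        groups:
          - wheel
        ssh_key: ssh-rsa AAAAB3Nza... jdoe@example.com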

I am new to Salt and was following an older tutorial when I ran into the same issue. It seems that the expected folder structure changed.

The tutorial said that I should store both my state and pillar data in the /srv/salt/ directory. According to the master config /etc/salt/master, the actual default pillar root is /srv/pillar/:

#####         Pillar settings        #####
##########################################
# Salt Pillars allow for the building of global data that can be made selectively
# available to different minions based on minion grain filtering. The Salt
# Pillar is laid out in the same fashion as the file server, with environments,
# a top file and sls files. However, pillar data does not need to be in the
# highstate format, and is generally just key/value pairs.
#pillar_roots:
#  base:
#    - /srv/pillar

Once I moved my files to the correct folders everything started to work :)
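
For anyone else following an older tutorial, a minimal layout under the default pillar root might look like this (a sketch; the SLS name is illustrative):

    # /srv/pillar/top.sls (sketch)
    base:
      '*':
        - users

    # /srv/pillar/users.sls then holds the actual key/value data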

EDIT: Another beginner problem I ran into - when creating sub-folders in /srv/salt don't forget to set permissions.
