Salt: service does not restart after watched config changes

Created on 14 Jul 2014  ·  66 Comments  ·  Source: saltstack/salt

os: opensuse
service_provider: rh_service
salt-call --versions-report
Salt: 2014.1.5
Python: 2.7.2 (default, Aug 19 2011, 20:41:43) [GCC]
Jinja2: 2.6
M2Crypto: 0.21.1
msgpack-python: 0.2.4
msgpack-pure: Not Installed
pycrypto: 2.3
PyYAML: 3.10
PyZMQ: 13.0.0
ZMQ: 3.2.2

scenario:

  • on opensuse install postgresql-server package,
  • start the service so it generates its default configs in /var/lib/pgsql
  • edit pg_hba with file.replace
  • expect the service to restart, but Salt just checks that postgresql is running and returns true

If we edit pg_hba so that file.replace runs again, the service restarts fine, but not on the first run. I think it's somehow related to postgres_first_run starting the service, but I can't find anything
relevant in the logs.

example states from debug output:

postgresql:
  pkg:
    - installed
    - name: postgresql-server
  service:
    - running
    - enable: True
    - watch:  
      - file: /var/lib/pgsql/data/pg_hba.conf 

/var/lib/pgsql/data/pg_hba.conf: 
  file.replace:
    - pattern: (^host.*127\.0\.0\.1/32\s*).*$ 
    - repl: \1md5 
    - watch: 
      - pkg: postgresql 

postgres_first_run: 
  cmd.wait:
    - name: /etc/init.d/postgresql start >/dev/null 2>&1; sleep 5; echo -e '\nchanged=yes\n' 
    - stateful: True 
    - watch: 
      - pkg: postgresql 
    - watch_in: 
      - file: /var/lib/pgsql/data/pg_hba.conf 
Labels: Bug, P2, Platform, severity-medium

Most helpful comment

@cmclaughlin Okay, so I discovered that you are definitely finding some unexpected behavior! It took a while for me to nail down exactly what's happening - I'm not sure if it's just a bug in our documentation, or if it's an actual bug.

What's happening here is kind of crazy.

If you remove the enable: True, or split it into

enable-apache:
  service.enabled:
    - name: httpd24

Everything works 🎉

The question you might have then is, "But _why_??"

Which is a darn good question 🤣

Turns out that the answer is pretty simple though - the only thing that's going wrong here is that by starting (but not enabling) a service, when Salt looks at the service it says, "Ah, yeah, running is good. Oh, but let me enable this service. Cool, I've made some changes to it. What, you have a watch? No, I don't need to do anything, I've already changed this service!"

Oops :trollface:

I definitely agree that it seems strange, or perhaps even unintuitive, but it is at least consistent, and now I understand exactly why 🙂


Just had a quick conversation, and this is definitely intended behavior - I'm going to see about adding some information to our docs that calls it out, and perhaps gives some better explanation.
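
To spell the workaround out as a complete state, here is a minimal sketch; the httpd service name and the /tmp/test file state are borrowed from the reproduction case later in this thread, so adjust them to your setup:

httpd:
  service.running:
    # deliberately no `enable: True` here; enabling is handled by the
    # separate state below, so the "enabled" change cannot mask the watch
    - watch:
      - file: /tmp/test

enable-httpd:
  service.enabled:
    - name: httpd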

All 66 comments

Thanks for the report @kevich! I think what's happening here, at least according to your example state, is some incorrect spacing with yaml. I've changed your file so that it has the correct number of spaces. Can you give the following a try and let us know if that fixes it?

postgresql:
  pkg:
    - installed
    - name: postgresql-server 
  service:
    - running
    - enable: True
    - watch:  
      - file: /var/lib/pgsql/data/pg_hba.conf 

/var/lib/pgsql/data/pg_hba.conf: 
  file.replace:
    - pattern: (^host.*127\.0\.0\.1/32\s*).*$ 
    - repl: \1md5 
    - watch: 
      - pkg: postgresql 

postgres_first_run: 
  cmd.wait:
    - name: /etc/init.d/postgresql start >/dev/null 2>&1; sleep 5; echo -e '\nchanged=yes\n' 
    - stateful: True 
    - watch: 
      - pkg: postgresql 
    - watch_in: 
      - file: /var/lib/pgsql/data/pg_hba.conf 

Ah, sorry for the wrong example states at first; the spaces disappeared while I was preparing the issue text. In the real states the spacing is ok, as in your example. And it didn't work out. =(

@kevich Would you mind editing your original post and copying your states in again, preserving the spacing? I just want to make sure there aren't any slight differences that could be causing this issue.

@basepi sure. I've now edited the initial issue post with the correct yaml states.

It certainly looks correct to me. I wonder if this is an opensuse service issue? We'll investigate this.

@basepi any progress on this issue?

@kevich Sorry, I haven't had time to look into this yet. So many bugs to fix...

os: ubuntu

I have a similar issue but am not using a block like first_run. I have the service watch the file and the file is modified with file.replace. In my case it's sshd_config. Is something like *_first_run needed or is the following enough:

postgresql:
  pkg:
    - installed
    - name: postgresql-server 
  service:
    - running
    - enable: True
    - watch:  
      - file: /var/lib/pgsql/data/pg_hba.conf 

/var/lib/pgsql/data/pg_hba.conf: 
  file.replace:
    - pattern: (^host.*127\.0\.0\.1/32\s*).*$ 
    - repl: \1md5 
    - watch: 
      - pkg: postgresql 

@basepi still no time for this issue?

@kevich No time for any issues, at the moment. I've been doing salt trainings the last few weeks. This is on our radar, there are just a lot of open bugs currently.

Happy new year =)
Sorry for being a bit intrusive, but are there any status changes on this issue?

@kevich Not intrusive at all. We just have too many issues to keep up with at the moment. I'm going to bump the severity on this one to get more eyes on it. Restarting services with watch is a common use case, so we want to make sure it works. I expect it's something specific to opensuse, because I think this is working in general.

@basepi thanks for the info! And yes, I'm also quite sure that it's an issue specific to opensuse.

@basepi any luck with this issue as High Severity?

Unfortunately, not yet. We've been working on getting a couple of delayed releases out. Hopefully we'll have time to start attacking the less high severity "High Severity" bugs soon. ;)

Looks like I'm also having this issue on debian (wheezy and jessie).
My processes are never restarted/reloaded when a watched file changes. Happens for mysql, redis, nginx... basically everything...

Same for me on Debian 8 (jessie). It doesn't seem to be service-specific. Here is an example for the ferm service:

ferm:
  pkg.installed:
    - name: ferm
  service.running:
    - name: ferm
    - enable: True
    - require:
      - pkg: ferm
    - watch:
      - file: /etc/ferm/ferm.conf
      - file: /etc/ferm.d/*

ferm.conf:
  file:
    - managed
    - name: /etc/ferm/ferm.conf
    - source: salt://ferm/ferm.conf
    - makedirs: True

After making changes to ferm.conf, the new configuration file gets deployed to my server, but the service ferm does not restart.

Same issue here on Debian Jessie

@basepi hi there! Is there any progress on the issue? Is it possible that the fix will be in the 2015.8 release?

It will not be in the 2015.8 release, as we're just putting the finishing touches on that release. However, I did label it such that it should get eyes on it for the next release.

First I thought that I had the same issue, but then this thread helped me find a programming error of my own.

This state works for services created with nssm, but FAILED for services created with InstallUtil (on Windows 7):

myservice:
  service.running:
    - enable: True
    - watch:
      - file: /somewhere/file.exe

In fact Windows write-protects files when they constitute a service.

When I update the file, the service does not stop, because the file never actually updates: it is locked by the running service.

I need to shut down the service based "on the expected results of a state that has not yet been executed":

Stop_service_before_file_changes:
  service.dead:
    - name: myservice
    - prereq:
      - file: /somewhere/file.exe

I probably still need watch, because the service could be stopped initially.
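
Putting the two together, a minimal sketch using the same hypothetical myservice / file.exe names (the salt:// source path is illustrative):

/somewhere/file.exe:
  file.managed:
    - source: salt://somewhere/file.exe   # illustrative source location

Stop_service_before_file_changes:
  service.dead:
    - name: myservice
    - prereq:
      # stop the service only if the file state *would* report changes
      - file: /somewhere/file.exe

myservice:
  service.running:
    - enable: True
    - watch:
      # (re)start the service once the new binary is in place
      - file: /somewhere/file.exe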

This is not an issue anymore for me.

Same issue with the apache service on Ubuntu 12.04.
This is a very basic scenario for this kind of tool...

No problems with 50+ service starts/stops on 700+ Windows minions using prereq, which is more complex than watch.

@doc75, could you paste your state file? I would like to understand what Jenkins testing can do.

Here is an extract of my state file:

in apache/install.sls:

apache_service:
  service.running:
    - name: apache2
    - enable: True

in apache/site.sls:

/etc/apache2/sites-available/{{ name }}{{ site_extension }}:
  file.managed:
    - user: root
    - group: root
    - mode: 644
    - source: salt://apache/files/apache_conf/{{ name }}{{ site_extension }}
    - watch_in:
      - service: apache_service

I cannot see what is wrong with it.

As a temporary bypass I put this extra state in apache/install.sls (onchanges is just there to ensure it is not launched if nothing changed):

apache_service_restart:
  cmd.run:
    - name: service apache2 restart
    - user: root
    - group: root
    - onchanges:
      - service: apache_service

and in apache/site.sls:

/etc/apache2/sites-available/{{ name }}{{ site_extension }}: 
  file.managed: 
    - user: root 
    - group: root 
    - mode: 644 
    - source: salt://apache/files/apache_conf/{{ name }}{{ site_extension }} 
    - onchanges_in: 
      - cmd: apache_service_restart 

And it works fine.

I cannot figure out what is wrong in my first version.

@doc75

I'm also having the same problem with watched files not triggering a restart. I applied your workaround and can get the service to restart, but I agree that watch and/or watch_in should achieve the same functionality and currently do not.

Same problem here on FreeBSD. If a watched config file is changed, the service only gets enabled and not restarted.

I hit the same issue in our production environment: the service didn't get restarted after replacing the binary.

https://github.com/kubernetes/kubernetes/blob/v1.5.3/cluster/saltbase/salt/kubelet/init.sls#L67-L85

And it's inconsistent: for some kubernetes clusters the kubelet service gets restarted, and for some clusters it doesn't.

Same here... Amazon Linux and Apache. My scenario is:

  • Install Apache
  • Manage with service.running
  • Add new config file or cert with watch_in on the service
  • Service does not restart

However, editing the config file or cert after the initial write to the filesystem and re-applying the state does restart the service. At least in my case, it seems watch_in only applies to changes to existing files.

@rallytime can you please escalate this issue? I tried bumping up my minion to trace level logging, but don't see anything useful. I also looked for mod_watch, but don't see anything obvious.

Here's a very simple test case to reproduce this bug:


httpd24:
  pkg.installed

httpd:
  service.running:
    - enable: True

service httpd status 1:
  cmd.run:
    - name: service httpd status

/tmp/test:
  file.managed:
    - source: salt://watchtest/files/test
    - watch_in:
      - service: httpd

service httpd status 2:
  cmd.run:
    - name: service httpd status

Output below - note the Apache pid doesn't change... perhaps not a great test, since the watch_in changes the order. Maybe that's part of the problem? I expect service.running to be applied, then file.managed, and then a restart of the service should be triggered:

          ID: httpd24
    Function: pkg.installed
      Result: True
     Comment: The following packages were installed/updated: httpd24
     Started: 08:50:42.914899
    Duration: 8398.427 ms
     Changes:
              ----------
              httpd24:
                  ----------
                  new:
                      2.4.27-3.75.amzn1
                  old:
----------
          ID: /tmp/test
    Function: file.managed
      Result: True
     Comment: File /tmp/test updated
     Started: 08:50:51.332453
    Duration: 30.196 ms
     Changes:
              ----------
              diff:
                  New file xxx
              mode:
                  0644
----------
          ID: httpd
    Function: service.running
      Result: True
     Comment: Service httpd has been enabled, and is running
     Started: 08:50:51.362954
    Duration: 408.962 ms
     Changes:
              ----------
              httpd:
                  True
----------
          ID: service httpd status 1
    Function: cmd.run
        Name: service httpd status
      Result: True
     Comment: Command "service httpd status" run
     Started: 08:50:51.773693
    Duration: 33.523 ms
     Changes:
              ----------
              pid:
                  9289
              retcode:
                  0
              stderr:
              stdout:
                  httpd (pid  9214) is running...
----------
          ID: service httpd status 2
    Function: cmd.run
        Name: service httpd status
      Result: True
     Comment: Command "service httpd status" run
     Started: 08:50:51.807502
    Duration: 33.122 ms
     Changes:
              ----------
              pid:
                  9305
              retcode:
                  0
              stderr:
              stdout:
                  httpd (pid  9214) is running...

I tried to narrow it down... I think the problem is somewhere in state.py, but I'm really unsure. With this test case and some log statements added, I can see mod_watch in states/service.py is only called when existing files are updated.

It seems that the watch behaves only as a requisite in this case. service.running returns with changes, namely that the service is now running and enabled. And it seems that the subsequent watch_in has no effect because of the changes already recorded on the service. Just a guess... I'd love to fix this, but still don't know exactly where the problem is.

I have the same issue on Debian stretch. Is there any progress on solving this?

Same issue on Debian Stretch. watch_in in the same init.sls works for php7.0-fpm but does not work for nagios-nrpe-server.

I have a simple example:

# openvpn/server.sls
openvpn-server:
  pkg.installed:
    - name: openvpn
  service.running:
    - name: openvpn
    - enable: True

openvpn-server-config:
  file.managed:
    - name: /etc/openvpn/server.conf
    - source: salt://openvpn/server.conf
    - template: jinja
    - user: root
    - watch_in:
      - service: openvpn

Application of said state follows. Note that the server.conf file is put in place for the first time but the service does not restart.

salt tang-master state.apply openvpn.server
tang-master:
----------
          ID: openvpn-server
    Function: pkg.installed
        Name: openvpn
      Result: True
     Comment: All specified packages are already installed
     Started: 01:26:36.644242
    Duration: 22.522 ms
     Changes:   
----------
          ID: openvpn-server-config
    Function: file.managed
        Name: /etc/openvpn/server.conf
      Result: True
     Comment: File /etc/openvpn/server.conf updated
     Started: 01:26:36.669657
    Duration: 34.687 ms
     Changes:   
              ----------
              diff:
                  New file
              mode:
                  0644
----------
          ID: openvpn-server
    Function: service.running
        Name: openvpn
      Result: True
     Comment: Service openvpn is already enabled, and is running
     Started: 01:26:36.704622
    Duration: 68.776 ms
     Changes:   
              ----------
              openvpn:
                  True

Summary for tang-master
------------
Succeeded: 3 (changed=2)
Failed:    0
------------
Total states run:     3
Total run time: 125.985 ms

Versions:

root@tang-master:/# salt-minion --versions
Salt Version:
           Salt: 2018.3.0

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.4.2
      docker-py: Not Installed
          gitdb: 0.6.4
      gitpython: 1.0.1
          ioflo: Not Installed
         Jinja2: 2.8
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.3
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.12 (default, Dec  4 2017, 14:50:18)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.2.0
           RAET: Not Installed
          smmap: 0.9.0
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: Ubuntu 16.04 xenial
         locale: ANSI_X3.4-1968
        machine: x86_64
        release: 4.13.0-41-generic
         system: Linux
        version: Ubuntu 16.04 xenial

This is a 4-year-old bug and it still exists in the latest version, 2018.3.2. Any idea when or whether this will get a fix? @basepi @rallytime

I keep running into this issue, as also reported in https://github.com/saltstack-formulas/nginx-formula/issues/191.
It seems that if a service is already running, salt simply ignores the restart/reload request, as shown by @cmclaughlin.
I have to manually add a systemctl restart nginx with an onchanges requisite.
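
For reference, a minimal sketch of that workaround (the state IDs and paths are illustrative):

nginx-config:
  file.managed:
    - name: /etc/nginx/nginx.conf
    - source: salt://nginx/nginx.conf

force-nginx-restart:
  cmd.run:
    - name: systemctl restart nginx
    - onchanges:
      # run the restart only when the config state reports changes
      - file: nginx-config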

Hello @basepi @rallytime, we stumbled upon this as well. Any news on when this will be scheduled? Thanks!

@marco-m Unfortunately neither myself nor @rallytime work at SaltStack anymore. Hopefully someone on @saltstack/team-triage can help you out with that.

Hey @marco-m, thanks for reaching out. I just started here, and this looks like something I can take a look at.

Hello @basepi, thanks for answering, appreciate it.
Hello @waynew, thanks! Happy to hear this :-)

Okay, so from what I can tell it looks like this is only an issue if the server is installed, started, and the config file installed in the same run.

Has anyone seen any other behavior?

I'll have to check the code to verify, but my current theory is that the service is either not yet running when the file state is applied, or, because it went from not running to running, Salt just assumes that it was "restarted", so it doesn't bother with a restart. In the interim, as a workaround, one should be able to do:

salt minion file.remove <your config file>
salt minion state.highstate   # or state.apply <yourstate>

Which, at least in my tests, does trigger a restart. Is that what everyone else sees?

@cmclaughlin / @rsyring / @marco-m

At least one of us is definitely misunderstanding something 🙂

(tl;dr it appears some services don't need a restart because they weren't started in the first place)

I just cloned the bootstrap repo, and used the Vagrantfile to start up a vm.

In that vm I changed /etc/salt/minion, setting file_client: local to make it a standalone minion. Then I stuck these files in /srv/salt:

root@ubuntu-xenial:/srv/salt# tree
.
├── nginx.sls
├── openvpn
│   └── server.conf
├── openvpn.sls
├── top.sls
└── watchtest
    └── files
        ├── nginx.conf
        └── test

I wasn't sure what to put in the openvpn server.conf, so I just added a #comment line. For nginx.conf it was pretty much just a copy of the default /etc/nginx/nginx.conf - I just changed it from including the sites-available to having a basic config that I could easily change:

        server {
                listen 80 default_server;
                listen [::]:80 default_server;

                root /var/www/a;   # it's super easy to just change this line to `b`
                index index.html index.htm index.nginx-debian.html;

                server_name _;

                location / {
                        # First attempt to serve request as file, then
                        # as directory, then fall back to displaying a 404.
                        try_files $uri $uri/ =404;
                }
        }

I created /var/www/a/index.html and /var/www/b/index.html with just A is my name and B is my name, respectively.

The openvpn.sls file was just what I copied from this comment

openvpn-server:
  pkg.installed:
    - name: openvpn
  service.running:
    - name: openvpn
    - enable: True

openvpn-server-config:
  file.managed:
    - name: /etc/openvpn/server.conf
    - source: salt://openvpn/server.conf
    - template: jinja
    - user: root
    - watch_in:
      - service: openvpn

nginx.sls is also pretty simple:

nginx-installed:
  pkg.installed:
    - name: nginx

nginx:
  service.running:
    - enable: True

/etc/nginx/nginx.conf:
  file.managed:
    - source: salt://watchtest/files/nginx.conf
    - watch_in:
      - service: nginx

/tmp/test:
  file.managed:
    - source: salt://watchtest/files/test
    - watch_in:
      - service: nginx

What I observed is that openvpn did not restart, while nginx did say it was restarted. On further inspection (setting log_level: debug in the minion file) I discovered this:

[INFO    ] File changed:
New file
[INFO    ] Completed state [/etc/openvpn/server.conf] at time 17:24:13.845676 (duration_in_ms=32.92)
[INFO    ] Running state [openvpn] at time 17:24:13.846255
[INFO    ] Executing state service.running for [openvpn]
[DEBUG   ] stdout: * openvpn.service - OpenVPN service
   Loaded: loaded (/lib/systemd/system/openvpn.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Tue 2019-01-08 16:48:32 UTC; 35min ago

It appears that openvpn is not started when it's first installed. A few lines later in the log:

[INFO    ] Executing command [u'systemd-run', u'--scope', u'systemctl', 'start', u'openvpn.service'] in directory '/home/vagrant'

As it turns out there's no need to restart the service after installing the config file the first time, since it wasn't running.

For nginx, I saw something different:

[INFO    ] Completed state [/etc/nginx/nginx.conf] at time 17:24:14.111745 (duration_in_ms=10.813)
[INFO    ] Running state [nginx] at time 17:24:14.112401
[INFO    ] Executing state service.running for [nginx]
[INFO    ] Executing command [u'systemctl', 'status', u'nginx.service', u'-n', u'0'] in directory '/home/vagrant'
[DEBUG   ] stdout: * nginx.service - A high performance web server and a reverse proxy server
   Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-01-08 17:22:55 UTC; 1min 18s ago

The service was already started and running. Indeed:

[INFO    ] Executing command [u'systemd-run', u'--scope', u'systemctl', 'restart', u'nginx.service'] in directory '/home/vagrant'

Appeared a few lines later, indicating the nginx restart. I also verified that nginx had the correct config with curl localhost. Swapping the directory names in the nginx config produced an nginx restart, and curl localhost verified that it was pointing to the right directory.

I also removed nginx and tried with this apache.sls:

apache:
  pkg.installed:
    - name: apache2

apache-running:
  service.running:
    - name: apache2
    - enable: True

/tmp/test:
  file.managed:
    - source: salt://watchtest/files/test
    - watch_in:
      - service: apache2

And saw it restart.

Based on what I've seen, I'm not seeing any bugs so I'm going to go ahead and close this issue. But @marco-m (or anyone else), if you can find me a case I can reproduce I'll definitely re-open it!

It was a year ago or so. Of course I did not keep the broken configurations. But the issue was first reported a long time before me. If memory serves me well, the existence of the problem was acknowledged years ago.

Regards,
Igor


@waynew Please stand by, we will provide a repro setup tomorrow. Unfortunately the issue is still present.

@odormond Could you please follow-up with @waynew ?

This is still a problem for me.

@cmclaughlin are you seeing results that differ from what I saw, using the same steps? If so, can you provide your salt-minion --versions?

@waynew I did some more tests...

I don't see the problem on Ubuntu, but I still see it on AWS Linux (so it probably happens on all yum based distros but not on Debian based distros).

Here's a simple top file and state to reproduce the problem:

# top.sls
base:
  '*':
    - apache

# apache.sls

{% if grains['os_family'] == 'Debian' %}
{% set pkg, service = ('apache2', 'apache2') %}
{% else %}
{% set pkg, service = ('httpd24', 'httpd') %}
{% endif %}

apache2-pkg:
  pkg.installed:
    - name: {{ pkg }}

apache-service:
  service.running:
    - name: {{ service }}
    - enable: True

/tmp/testwatch:
  file.managed:
    - contents: hello
    - watch_in:
      - service: apache-service

On my AWS Linux minion here's what I see in the state output:

----------
          ID: apache2-pkg
    Function: pkg.installed
        Name: httpd24
      Result: True
     Comment: The following packages were installed/updated: httpd24
     Started: 22:10:10.738857
    Duration: 3325.217 ms
     Changes:
              ----------
              httpd24:
                  ----------
                  new:
                      2.4.37-1.83.amzn1
                  old:
----------
          ID: /tmp/testwatch
    Function: file.managed
      Result: True
     Comment: File /tmp/testwatch updated
     Started: 22:10:14.092117
    Duration: 8.407 ms
     Changes:
              ----------
              diff:
                  New file
----------
          ID: apache-service
    Function: service.running
        Name: httpd
      Result: True
     Comment: Service httpd has been enabled, and is running
     Started: 22:10:14.100807
    Duration: 370.312 ms
     Changes:
              ----------
              httpd:
                  True

Summary for ip-172-18-10-5.ec2.internal
------------
Succeeded: 3 (changed=3)
Failed:    0
------------
Total states run:     3
Total run time:   3.704 s

My AWS Linux minion logs show the service is not restarted:

[DEBUG   ] Rendered data from file: /var/cache/salt/minion/files/base/apache.sls:




apache2-pkg:
  pkg.installed:
    - name: httpd24

apache-service:
  service.running:
    - name: httpd
    - enable: True

/tmp/testwatch:
  file.managed:
    - contents: hello
    - watch_in:
      - service: apache-service

[DEBUG   ] LazyLoaded config.get
[DEBUG   ] Results of YAML rendering:
OrderedDict([('apache2-pkg', OrderedDict([('pkg.installed', [OrderedDict([('name', 'httpd24')])])])), ('apache-service', OrderedDict([('service.running', [OrderedDict([('name', 'httpd')]), OrderedDict([('enable', True)])])])), ('/tmp/testwatch', OrderedDict([('file.managed', [OrderedDict([('contents', 'hello')]), OrderedDict([('watch_in', [OrderedDict([('service', 'apache-service')])])])])]))])
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/apache.sls' using 'yaml' renderer: 0.0049889087677
[DEBUG   ] LazyLoaded pkg.install
[DEBUG   ] LazyLoaded pkg.installed
[DEBUG   ] LazyLoaded boto.assign_funcs
[DEBUG   ] Failed to import utils mako:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1440, in _load_module
    mod = imp.load_module(mod_namespace, fn_, fpath, desc)
  File "/usr/lib/python2.7/dist-packages/salt/utils/mako.py", line 12, in <module>
    from mako.lookup import TemplateCollection, TemplateLookup  # pylint: disable=import-error,3rd-party-module-not-gated
  File "/usr/lib/python2.7/dist-packages/salt/utils/mako.py", line 12, in <module>
    from mako.lookup import TemplateCollection, TemplateLookup  # pylint: disable=import-error,3rd-party-module-not-gated
ImportError: No module named lookup
[DEBUG   ] Failed to import utils raetevent:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1440, in _load_module
    mod = imp.load_module(mod_namespace, fn_, fpath, desc)
  File "/usr/lib/python2.7/dist-packages/salt/utils/raetevent.py", line 24, in <module>
    from raet import raeting, nacling
ImportError: No module named raet
[DEBUG   ] Failed to import utils raetlane:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1440, in _load_module
    mod = imp.load_module(mod_namespace, fn_, fpath, desc)
  File "/usr/lib/python2.7/dist-packages/salt/utils/raetlane.py", line 63, in <module>
    from raet import raeting, nacling
ImportError: No module named raet
[DEBUG   ] Could not LazyLoad boto3.assign_funcs: 'boto3' __virtual__ returned False
[ERROR   ] Exception raised when processing __virtual__ function for salt.loaded.int.module.boto_elbv2. Module will not be loaded: 'boto3.assign_funcs'
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1717, in _process_virtual
    virtual = getattr(mod, virtual_func)()
  File "/usr/lib/python2.7/dist-packages/salt/modules/boto_elbv2.py", line 71, in __virtual__
    __utils__['boto3.assign_funcs'](__name__, 'elbv2')
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1121, in __getitem__
    func = super(LazyLoader, self).__getitem__(item)
  File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 101, in __getitem__
    raise KeyError(key)
KeyError: 'boto3.assign_funcs'
[WARNING ] salt.loaded.int.module.boto_elbv2.__virtual__() is wrongly returning `None`. It should either return `True`, `False` or a new name. If you're the developer of the module 'boto_elbv2', please fix this.
[DEBUG   ] LazyLoaded boto3.assign_funcs
[DEBUG   ] Error loading module.boto_iot: __init__ failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1504, in _load_module
    module_init(self.opts)
  File "/usr/lib/python2.7/dist-packages/salt/modules/boto_iot.py", line 109, in __init__
    __utils__['boto3.assign_funcs'](__name__, 'iot')
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1121, in __getitem__
    func = super(LazyLoader, self).__getitem__(item)
  File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 98, in __getitem__
    return self._dict[key]
KeyError: 'boto3.assign_funcs'
[DEBUG   ] LazyLoaded boto3.assign_funcs
[ERROR   ] Exception raised when processing __virtual__ function for salt.loaded.int.module.boto_kinesis. Module will not be loaded: 'boto3.assign_funcs'
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1717, in _process_virtual
    virtual = getattr(mod, virtual_func)()
  File "/usr/lib/python2.7/dist-packages/salt/modules/boto_kinesis.py", line 78, in __virtual__
    __utils__['boto3.assign_funcs'](__name__, 'kinesis')
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1121, in __getitem__
    func = super(LazyLoader, self).__getitem__(item)
  File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 98, in __getitem__
    return self._dict[key]
KeyError: 'boto3.assign_funcs'
[WARNING ] salt.loaded.int.module.boto_kinesis.__virtual__() is wrongly returning `None`. It should either return `True`, `False` or a new name. If you're the developer of the module 'boto_kinesis', please fix this.
[DEBUG   ] LazyLoaded boto3.assign_funcs
[DEBUG   ] Error loading module.boto_lambda: __init__ failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1504, in _load_module
    module_init(self.opts)
  File "/usr/lib/python2.7/dist-packages/salt/modules/boto_lambda.py", line 148, in __init__
    __utils__['boto3.assign_funcs'](__name__, 'lambda')
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1121, in __getitem__
    func = super(LazyLoader, self).__getitem__(item)
  File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 98, in __getitem__
    return self._dict[key]
KeyError: 'boto3.assign_funcs'
[DEBUG   ] LazyLoaded boto3.assign_funcs
[DEBUG   ] Error loading module.boto_rds: __init__ failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1504, in _load_module
    module_init(self.opts)
  File "/usr/lib/python2.7/dist-packages/salt/modules/boto_rds.py", line 144, in __init__
    __utils__['boto3.assign_funcs'](__name__, 'rds')
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1121, in __getitem__
    func = super(LazyLoader, self).__getitem__(item)
  File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 98, in __getitem__
    return self._dict[key]
KeyError: 'boto3.assign_funcs'
[DEBUG   ] LazyLoaded boto3.assign_funcs
[DEBUG   ] Error loading module.boto_s3_bucket: __init__ failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1504, in _load_module
    module_init(self.opts)
  File "/usr/lib/python2.7/dist-packages/salt/modules/boto_s3_bucket.py", line 107, in __init__
    __utils__['boto3.assign_funcs'](__name__, 's3')
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1121, in __getitem__
    func = super(LazyLoader, self).__getitem__(item)
  File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 98, in __getitem__
    return self._dict[key]
KeyError: 'boto3.assign_funcs'
[DEBUG   ] LazyLoaded boto3.assign_funcs
[DEBUG   ] Error loading module.boto_vpc: __init__ failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1504, in _load_module
    module_init(self.opts)
  File "/usr/lib/python2.7/dist-packages/salt/modules/boto_vpc.py", line 204, in __init__
    __utils__['boto3.assign_funcs'](__name__, 'ec2',
  File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1121, in __getitem__
    func = super(LazyLoader, self).__getitem__(item)
  File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 98, in __getitem__
    return self._dict[key]
KeyError: 'boto3.assign_funcs'
[DEBUG   ] SMBIOS: neither dmidecode nor smbios found!
[DEBUG   ] DSC: Only available on Windows systems
[DEBUG   ] Module PSGet: Only available on Windows systems
[DEBUG   ] Could not LazyLoad pkg.ex_mod_init: 'pkg.ex_mod_init' is not available.
[INFO    ] Running state [httpd24] at time 22:10:10.738857
[INFO    ] Executing state pkg.installed for [httpd24]
[INFO    ] Executing command ['rpm', '-qa', '--queryformat', '%{NAME}_|-%{EPOCH}_|-%{VERSION}_|-%{RELEASE}_|-%{ARCH}_|-(none)'] in directory '/root'
[DEBUG   ] Could not LazyLoad pkg.check_db: 'pkg.check_db' is not available.
[INFO    ] Executing command ['yum', '--quiet', '--assumeyes', 'clean', 'expire-cache'] in directory '/root'
[INFO    ] Executing command ['yum', '--quiet', '--assumeyes', 'check-update'] in directory '/root'
[DEBUG   ] Failed to get holds, versionlock plugin is probably not installed
[INFO    ] Executing command ['yum', '-y', 'install', 'httpd24'] in directory '/root'
[INFO    ] Executing command ['rpm', '-qa', '--queryformat', '%{NAME}_|-%{EPOCH}_|-%{VERSION}_|-%{RELEASE}_|-%{ARCH}_|-(none)'] in directory '/root'
[INFO    ] Made the following changes:
'httpd24' changed from 'absent' to '2.4.37-1.83.amzn1'

[DEBUG   ] Refreshing modules...
[INFO    ] Loading fresh modules for state activity
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] LazyLoaded saltutil.refresh_modules
[DEBUG   ] LazyLoaded event.fire
[DEBUG   ] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_a3ed1b4b2d_pub.ipc
[DEBUG   ] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_a3ed1b4b2d_pull.ipc
[DEBUG   ] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_a3ed1b4b2d_pull.ipc
[DEBUG   ] Sending event: tag = module_refresh; data = {'_stamp': '2019-01-10T22:10:14.057018'}
[DEBUG   ] Minion of "salt" is handling event tag 'module_refresh'
[DEBUG   ] Refreshing modules. Notify=False
[DEBUG   ] Reading configuration from /etc/salt/minion
[INFO    ] Completed state [httpd24] at time 22:10:14.064074 duration_in_ms=3325.217
[DEBUG   ] LazyLoaded config.option
[DEBUG   ] LazyLoaded systemd.booted
[DEBUG   ] LazyLoaded service.start
[DEBUG   ] LazyLoaded service.running
[DEBUG   ] LazyLoaded file.managed
[INFO    ] Running state [/tmp/testwatch] at time 22:10:14.092118
[INFO    ] Executing state file.managed for [/tmp/testwatch]
[DEBUG   ] LazyLoaded file.source_list
[INFO    ] File changed:
New file
[INFO    ] Completed state [/tmp/testwatch] at time 22:10:14.100524 duration_in_ms=8.407
[INFO    ] Running state [httpd] at time 22:10:14.100805
[INFO    ] Executing state service.running for [httpd]
[DEBUG   ] LazyLoaded cmd.retcode
[INFO    ] Executing command '/sbin/service httpd status' in directory '/root'
[DEBUG   ] stdout: httpd is stopped
[DEBUG   ] retcode: 3
[INFO    ] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
[DEBUG   ] stdout: httpd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
[INFO    ] Executing command '/sbin/runlevel' in directory '/root'
[DEBUG   ] stdout: N 3
[INFO    ] Executing command '/sbin/runlevel' in directory '/root'
[DEBUG   ] stdout: N 3
[INFO    ] Executing command '/sbin/service httpd start' in directory '/root'
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] stdout: Starting httpd:                         [  OK  ]
[INFO    ] Executing command '/sbin/service httpd status' in directory '/root'
[DEBUG   ] stdout: httpd (pid  3581) is running...
[INFO    ] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
[DEBUG   ] stdout: httpd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
[INFO    ] Executing command '/sbin/runlevel' in directory '/root'
[DEBUG   ] stdout: N 3
[INFO    ] Executing command '/sbin/runlevel' in directory '/root'
[DEBUG   ] stdout: N 3
[INFO    ] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
[DEBUG   ] stdout: httpd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
[INFO    ] Executing command '/sbin/runlevel' in directory '/root'
[DEBUG   ] stdout: N 3
[INFO    ] Executing command '/sbin/runlevel' in directory '/root'
[DEBUG   ] stdout: N 3
[INFO    ] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
[DEBUG   ] stdout: httpd            0:off   1:off   2:off   3:off   4:off   5:off   6:off
[INFO    ] Executing command '/sbin/chkconfig httpd on' in directory '/root'
[DEBUG   ] Please install 'virt-what' to improve results of the 'virtual' grain.
[INFO    ] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
[DEBUG   ] stdout: httpd            0:off   1:off   2:on    3:on    4:on    5:on    6:off
[INFO    ] Executing command '/sbin/runlevel' in directory '/root'
[DEBUG   ] stdout: N 3
[INFO    ] {'httpd': True}
[INFO    ] Completed state [httpd] at time 22:10:14.471119 duration_in_ms=370.312
[DEBUG   ] File /var/cache/salt/minion/accumulator/140708361483408 does not exist, no need to cleanup.
[DEBUG   ] Minion return retry timer set to 7 seconds (randomized)
[INFO    ] Returning information for job: 20190110221010264379
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'ip-172-18-10-5.ec2.internal', 'tcp://172.18.10.69:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'ip-172-18-10-5.ec2.internal', 'tcp://172.18.10.69:4506')
[DEBUG   ] minion return: {'fun_args': ['apache'], 'jid': '20190110221010264379', 'return': {'service_|-apache-service_|-httpd_|-running': {'comment': 'Service httpd has been enabled, and is running', '__id__': 'apache-service', 'name': 'httpd', 'duration': 370.312, '__run_num__': 2, 'start_time': '22:10:14.100807', '__sls__': u'apache', 'changes': {'httpd': True}, 'result': True}, 'file_|-/tmp/testwatch_|-/tmp/testwatch_|-managed': {'comment': 'File /tmp/testwatch updated', 'pchanges': {}, 'name': '/tmp/testwatch', 'start_time': '22:10:14.092117', 'result': True, 'duration': 8.407, '__run_num__': 1, '__sls__': u'apache', 'changes': {'diff': 'New file'}, '__id__': '/tmp/testwatch'}, 'pkg_|-apache2-pkg_|-httpd24_|-installed': {'comment': 'The following packages were installed/updated: httpd24', 'name': 'httpd24', 'start_time': '22:10:10.738857', 'result': True, 'duration': 3325.217, '__run_num__': 0, '__sls__': u'apache', 'changes': {'httpd24': {'new': '2.4.37-1.83.amzn1', 'old': ''}}, '__id__': 'apache2-pkg'}}, 'retcode': 0, 'success': True, 'fun': 'state.apply'}

Here's my Ubuntu minion state output...

----------
          ID: apache2-pkg
    Function: pkg.installed
        Name: apache2
      Result: True
     Comment: The following packages were installed/updated: apache2
     Started: 22:15:39.733726
    Duration: 7429.408 ms
     Changes:
              ----------
              apache2:
                  ----------
                  new:
                      2.4.7-1ubuntu4.20
                  old:
----------
          ID: /tmp/testwatch
    Function: file.managed
      Result: True
     Comment: File /tmp/testwatch updated
     Started: 22:15:47.543790
    Duration: 1.953 ms
     Changes:
              ----------
              diff:
                  New file
----------
          ID: apache-service
    Function: service.running
        Name: apache2
      Result: True
     Comment: Service restarted
     Started: 22:15:47.579603
    Duration: 2189.78 ms
     Changes:
              ----------
              apache2:
                  True

Summary for ip-172-18-10-69.ec2.internal
------------
Succeeded: 3 (changed=3)
Failed:    0
------------
Total states run:     3
Total run time:   9.621 s

And my Ubuntu minion logs show the restart...

[DEBUG   ] Rendered data from file: /var/cache/salt/minion/files/base/apache.sls:




apache2-pkg:
  pkg.installed:
    - name: apache2

apache-service:
  service.running:
    - name: apache2
    - enable: True

/tmp/testwatch:
  file.managed:
    - contents: hello
    - watch_in:
      - service: apache-service

[DEBUG   ] LazyLoaded config.get
[DEBUG   ] Results of YAML rendering:
OrderedDict([('apache2-pkg', OrderedDict([('pkg.installed', [OrderedDict([('name', 'apache2')])])])), ('apache-service', OrderedDict([('service.running', [OrderedDict([('name', 'apache2')]), OrderedDict([('enable', True)])])])), ('/tmp/testwatch', OrderedDict([('file.managed', [OrderedDict([('contents', 'hello')]), OrderedDict([('watch_in', [OrderedDict([('service', 'apache-service')])])])])]))])
[PROFILE ] Time (in seconds) to render '/var/cache/salt/minion/files/base/apache.sls' using 'yaml' renderer: 0.00413703918457
[DEBUG   ] LazyLoaded pkg.install
[DEBUG   ] LazyLoaded pkg.installed
[DEBUG   ] LazyLoaded systemd.booted
[DEBUG   ] You should upgrade pyOpenSSL to at least 0.14.1 to enable the use of X509 extensions in the tls module
[DEBUG   ] DSC: Only available on Windows systems
[DEBUG   ] Module PSGet: Only available on Windows systems
[DEBUG   ] Could not LazyLoad pkg.ex_mod_init: 'pkg.ex_mod_init' is not available.
[INFO    ] Running state [apache2] at time 22:15:39.733726
[INFO    ] Executing state pkg.installed for [apache2]
[INFO    ] Executing command ['dpkg-query', '--showformat', '${Status} ${Package} ${Version} ${Architecture}', '-W'] in directory '/root'
[DEBUG   ] Could not LazyLoad pkg.normalize_name: 'pkg.normalize_name' is not available.
[DEBUG   ] Could not LazyLoad pkg.check_db: 'pkg.check_db' is not available.
[INFO    ] Executing command ['apt-cache', '-q', 'policy', 'apache2'] in directory '/root'
[DEBUG   ] Could not LazyLoad pkg.normalize_name: 'pkg.normalize_name' is not available.
[INFO    ] Executing command ['apt-get', '-q', 'update'] in directory '/root'
[INFO    ] User root Executing command saltutil.find_job with jid 20190110221544174506
[DEBUG   ] Command details {'tgt_type': 'list', 'jid': '20190110221544174506', 'tgt': ['ip-172-18-10-69.ec2.internal'], 'ret': '', 'user': 'root', 'arg': ['20190110221539118759'], 'fun': 'saltutil.find_job'}
[INFO    ] Starting a new job with PID 13006
[DEBUG   ] LazyLoaded saltutil.find_job
[DEBUG   ] LazyLoaded direct_call.get
[DEBUG   ] Minion return retry timer set to 9 seconds (randomized)
[INFO    ] Returning information for job: 20190110221544174506
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'ip-172-18-10-69.ec2.internal', 'tcp://127.0.0.1:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'ip-172-18-10-69.ec2.internal', 'tcp://127.0.0.1:4506')
[DEBUG   ] minion return: {'fun_args': ['20190110221539118759'], 'jid': '20190110221544174506', 'return': {'tgt_type': 'grain', 'jid': '20190110221539118759', 'tgt': 'os:Ubuntu', 'pid': 12959, 'ret': '', 'user': 'root', 'arg': ['apache'], 'fun': 'state.apply'}, 'retcode': 0, 'success': True, 'fun': 'saltutil.find_job'}
[INFO    ] Executing command ['dpkg', '--get-selections', '*'] in directory '/root'
[INFO    ] Executing command ['apt-get', '-q', '-y', '-o', 'DPkg::Options::=--force-confold', '-o', 'DPkg::Options::=--force-confdef', 'install', 'apache2'] in directory '/root'
[INFO    ] Executing command ['dpkg-query', '--showformat', '${Status} ${Package} ${Version} ${Architecture}', '-W'] in directory '/root'
[INFO    ] Made the following changes:
'apache2' changed from 'absent' to '2.4.7-1ubuntu4.20'

[DEBUG   ] Refreshing modules...
[INFO    ] Loading fresh modules for state activity
[DEBUG   ] LazyLoaded jinja.render
[DEBUG   ] LazyLoaded yaml.render
[DEBUG   ] LazyLoaded saltutil.refresh_modules
[DEBUG   ] LazyLoaded event.fire
[DEBUG   ] SaltEvent PUB socket URI: /var/run/salt/minion/minion_event_7ddd07e1a4_pub.ipc
[DEBUG   ] SaltEvent PULL socket URI: /var/run/salt/minion/minion_event_7ddd07e1a4_pull.ipc
[DEBUG   ] Initializing new IPCClient for path: /var/run/salt/minion/minion_event_7ddd07e1a4_pull.ipc
[DEBUG   ] Sending event: tag = module_refresh; data = {'_stamp': '2019-01-10T22:15:47.162443'}
[INFO    ] Completed state [apache2] at time 22:15:47.163134 duration_in_ms=7429.408
[DEBUG   ] Minion of "salt" is handling event tag 'module_refresh'
[DEBUG   ] Refreshing modules. Notify=False
[DEBUG   ] Reading configuration from /etc/salt/minion
[DEBUG   ] LazyLoaded config.option
[DEBUG   ] LazyLoaded systemd.booted
[DEBUG   ] Including configuration from '/etc/salt/minion.d/_schedule.conf'
[DEBUG   ] Reading configuration from /etc/salt/minion.d/_schedule.conf
[DEBUG   ] Please install 'virt-what' to improve results of the 'virtual' grain.
[DEBUG   ] LazyLoaded config.merge
[DEBUG   ] LazyLoaded mine.update
[DEBUG   ] You should upgrade pyOpenSSL to at least 0.14.1 to enable the use of X509 extensions in the tls module
[DEBUG   ] LazyLoaded service.start
[DEBUG   ] LazyLoaded service.running
[DEBUG   ] LazyLoaded file.managed
[INFO    ] Running state [/tmp/testwatch] at time 22:15:47.543791
[INFO    ] Executing state file.managed for [/tmp/testwatch]
[INFO    ] File changed:
New file
[INFO    ] Completed state [/tmp/testwatch] at time 22:15:47.545743 duration_in_ms=1.953
[INFO    ] Running state [apache2] at time 22:15:47.545985
[INFO    ] Executing state service.running for [apache2]
[INFO    ] Executing command ['runlevel', '/run/utmp'] in directory '/root'
[DEBUG   ] stdout: N 2
[INFO    ] Executing command ['service', 'apache2', 'status'] in directory '/root'
[DEBUG   ] stdout:  * apache2 is running
[INFO    ] The service apache2 is already running
[INFO    ] Completed state [apache2] at time 22:15:47.579432 duration_in_ms=33.445
[INFO    ] Running state [apache2] at time 22:15:47.579603
[INFO    ] Executing state service.mod_watch for [apache2]
[INFO    ] Executing command ['service', 'apache2', 'status'] in directory '/root'
[DEBUG   ] stdout:  * apache2 is running
[INFO    ] Executing command ['service', 'apache2', 'restart'] in directory '/root'
[DEBUG   ] stdout:  * Restarting web server apache2
   ...done.
[INFO    ] {'apache2': True}
[INFO    ] Completed state [apache2] at time 22:15:49.769383 duration_in_ms=2189.78

Here are the versions I'm testing with...

Master

Salt Version:
           Salt: 2017.7.8

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: 0.5.4
      gitpython: 0.3.2 RC1
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 0.9.1
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: 1.2.3
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.6 (default, Nov 23 2017, 15:49:48)
   python-gnupg: 0.3.6
         PyYAML: 3.10
          PyZMQ: 14.0.1
           RAET: Not Installed
          smmap: 0.8.2
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5

System Versions:
           dist: Ubuntu 14.04 trusty
         locale: UTF-8
        machine: x86_64
        release: 3.13.0-161-generic
         system: Linux
        version: Ubuntu 14.04 trusty

Ubuntu minion

Salt Version:
           Salt: 2017.7.8

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 1.5
      docker-py: Not Installed
          gitdb: 0.5.4
      gitpython: 0.3.2 RC1
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 0.9.1
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: 1.2.3
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.6 (default, Nov 23 2017, 15:49:48)
   python-gnupg: 0.3.6
         PyYAML: 3.10
          PyZMQ: 14.0.1
           RAET: Not Installed
          smmap: 0.8.2
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5

System Versions:
           dist: Ubuntu 14.04 trusty
         locale: UTF-8
        machine: x86_64
        release: 3.13.0-161-generic
         system: Linux
        version: Ubuntu 14.04 trusty

AWS Linux minion

Salt Version:
           Salt: 2017.7.8

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.1
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.12 (default, Nov  2 2017, 19:20:38)
   python-gnupg: Not Installed
         PyYAML: 3.10
          PyZMQ: 14.5.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5

System Versions:
           dist:
         locale: UTF-8
        machine: x86_64
        release: 4.9.76-3.78.amzn1.x86_64
         system: Linux
        version: Not Installed

The master was launched using ubuntu-trusty-14.04-amd64-server-20181022 (ami-05a36d3b9aa4a17ac).
And the AWS Linux minion Was launched using AWS Linux 2017.09.1.20180115 (AMI ami-97785bed).

Last time I looked at this, I think I recall digging through the mod_watch source. I even tried stepping through with pdb... but never could figure it out. Would love to see if you get farther than me. Until then how about we reopen this?

Thanks

@cmclaughlin Just glancing at your example I do see that the service is not being restarted - I'll double check tomorrow using a Yum based installation - but I'm not seeing (at a glance) where the config is not being loaded.

I was able to reproduce the behavior of the service not being restarted - but that was because the service was not started before the file was added.

In simpler terms:

  1. Install/Start Service
  2. Add Conf file
  3. Restart Service

That's what people are expecting, and for some services that is indeed what happens. But for at least one service, I observed:

  1. Install Service
  2. Add Conf file
  3. Start service

Which is confusing because, from Salt's perspective, the service didn't need restarting (having never been started in the first place), while the user expects a restart. However, Salt is actually doing the right thing here: rather than starting the service, copying the config, and then restarting the service, it simply waits until the config file is in place to bring the service online.

Weird, huh?

But I'll double check your sample tomorrow to see if it's actually exhibiting buggy behavior, or just confusing 😬 😀

@marco-m @rongshengfang would you please review my last comment and compare against your environments? I'm curious what Salt versions and OSs you're using, etc.

Curious if you have any info to help @waynew.

Thanks!

@cmclaughlin can you try something like this? (I was on CentOS 7, using the Vagrantfile below.)

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  config.vm.provision "shell", inline: "yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm; yum install -y salt-minion;"
  #config.vm.synced_folder "../states", "/srv/salt"
end


apache2-pkg:
  pkg.installed:
    - name: httpd

apache-service:
  service.running:
    - name: httpd
    - enable: True

/path/to/A/index.html:
  file.managed:
    - makedirs: True
    - contents: I am your page
    - watch_in:
      - service: apache-service
    - mode: 644
    - user: root
    - group: root

/etc/httpd/conf/httpd.conf:
  file.managed:
    - makedirs: True
    - watch_in:
      - service: apache-service
    - contents: |
          ServerRoot "/etc/httpd"
          Listen 80
          Include conf.modules.d/*.conf
          User apache
          Group apache
          ServerAdmin root@localhost
          <Directory />
              AllowOverride none
              Require all denied
          </Directory>

          # Change these directories to match
          DocumentRoot "/path/to/A"
          <Directory "/path/to/A">
              Options Indexes FollowSymLinks
              AllowOverride None
              Require all granted
          </Directory>

          <IfModule dir_module>
              DirectoryIndex index.html
          </IfModule>
          <Files ".ht*">
              Require all denied
          </Files>
          ErrorLog "logs/error_log"
          LogLevel warn
          <IfModule log_config_module>
              LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
              LogFormat "%h %l %u %t \"%r\" %>s %b" common
              <IfModule logio_module>
                LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %I %O" combinedio
              </IfModule>
              CustomLog "logs/access_log" combined
          </IfModule>
          <IfModule alias_module>
              ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
          </IfModule>
          <Directory "/var/www/cgi-bin">
              AllowOverride None
              Options None
              Require all granted
          </Directory>
          <IfModule mime_module>
              TypesConfig /etc/mime.types
              AddType application/x-compress .Z
              AddType application/x-gzip .gz .tgz
              AddType text/html .shtml
              AddOutputFilter INCLUDES .shtml
          </IfModule>
          AddDefaultCharset UTF-8
          <IfModule mime_magic_module>
              MIMEMagicFile conf/magic
          </IfModule>
          EnableSendfile on
          IncludeOptional conf.d/*.conf

You should expect to curl localhost and see "I am your page", rather than the default apache startup page. If you see something different, please let me know!

Commenting from my phone and can test more later... but I would like to clarify: the problem is that the service is not reloaded after new files are added, where there is a watch between the new file and the already-running service.

I don't think the last Apache test mentioned above demonstrates the problem I'm seeing, because Apache will always serve the contents of an html file. A service restart is not needed for that.

@cmclaughlin Yes, but out of a random directory?

where there is a watch between the new file and the already-running service.

That's the crux of the matter. The service isn't already running in your test. The program gets installed (1), the file gets added (2), and the service gets started (3).

There's nothing for Salt to restart because the file already existed, in the correct state, before the service was started. Restarting the service won't actually do anything, because there's nothing to do.

If you want to see what I mean - comment this block out:

/tmp/testwatch:
  file.managed:
    - contents: hello
    - watch_in:
      - service: apache-service

Then run state.apply/highstate. Then add it back in and run your state (ensuring, of course, that /tmp/testwatch doesn't already exist). You should see your service restart - that's what I've seen in every test I've tried.

I'll take another look today... sorry, I didn't take a close look at the CentOS example over the weekend / was reading on my phone. I see your point about serving out of a random directory... that makes sense now.

Installing Vagrant now... but I think this is where our expectations are different:

That's the crux of the matter. The service isn't already running in your test. The program gets installed (1), the file gets added (2), and the service gets started (3).

Based on my example above, here's what I expect:

1) Package gets installed
2) Service gets started
3) File with watch gets added
4) Service restarts because of watch on service

Thinking it through a bit more, it makes sense that I don't see the problem on Debian-based systems, since apt packages typically start the service automatically.

I suppose I'm proposing Salt's requisite system should handle this special case.

@cmclaughlin and that's what I've failed to communicate well :trollface:

Salt doesn't need to handle the special case because nothing wrong is happening. It's just weird, because it's not what we're used to seeing.

We're used to seeing your expectations - package installed, service gets started, file gets added, and service gets restarted. Why is the service restarted? Because it needs to load the changed file. And for Debian systems, apparently that's the norm - installing the package starts the service.

What we're seeing here is actually OK - because the package is installed, the files get updated, and then the service gets started. It would be silly to start the service just to restart it after adding files, or to add a file, start a service, and then restart the service just because we've already added the file 😛

I think we find it so weird and unexpected because when you or I manually install a server we're going to fire it up first to make sure it's working, then we'll futz with the config files, and then restart the server. When we do our setup in Salt... we've pre-futzed our files :trollface: 😆

Glad we're on the same page now...

I'd like to come up with a way for Salt to handle this situation... at the very least we should have some documentation noting that this is a common problem which can lead to very undesirable consequences.

I could easily follow the onchanges example/work-around posted above... but I'm thinking service.running should start the service ASAP (by injecting another entry into the low state?) in this situation. Perhaps service.running should have an additional option like early_start?

I started poking around at the state and service sources... but I don't have a very good grasp on the code flow here. If anyone has ideas how to improve this I'd love to know.
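
For reference, the onchanges work-around mentioned above presumably looks something like this - just a sketch with hypothetical IDs and paths, restarting via an explicit command whenever the watched file reports changes instead of relying on the service state's watch:

# hypothetical IDs and paths, not from this thread
apache-conf:
  file.managed:
    - name: /etc/httpd/conf.d/app.conf
    - contents: '# app config here'

restart-apache-on-conf-change:
  cmd.run:
    - name: service httpd restart
    # onchanges runs this command only when the file state reports changes,
    # so it fires regardless of whether the service state recorded changes
    # of its own (the case that suppresses mod_watch)
    - onchanges:
      - file: apache-conf

Since cmd.run with onchanges doesn't depend on the service state's own change tracking, it sidesteps the enable-counts-as-a-change problem entirely.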

I'm curious - what undesirable consequences are you imagining would happen here?

Thanks for your patience... I'm all over the place on this one. Maybe my internal code doesn't match up with the tests we've come up with here. I'll take a closer look at my internal Salt code and see if I can track down the problem.

Thanks again for your patience. I took a look at my internal salt repo and I think I can finally provide a decent example.

In at least one place we hit the problem: we started the service manually before service.running was called, but didn't enable the service.

I've concluded:

  • watch does not restart the service if it was already running, but not enabled.
  • watch does restart the service if the service was already running and enabled.

Example:

apache-pkg:
  pkg.installed:
    - name: httpd24

# imagine module installation here
restart-apache-after-module-installation:
  cmd.run:
    - name: service httpd restart

apache-service:
  service.running:
    - name: httpd
    - enable: True

/etc/httpd/foo.crt:
  file.managed:
    - contents: hi
    - watch_in:
      - service: apache-service

State output:

----------
          ID: apache-pkg
    Function: pkg.installed
        Name: httpd24
      Result: True
     Comment: The following packages were installed/updated: httpd24
     Started: 06:23:03.421665
    Duration: 8289.552 ms
     Changes:
              ----------
              httpd24:
                  ----------
                  new:
                      2.4.37-1.83.amzn1
                  old:
----------
          ID: restart-apache-after-module-installation
    Function: cmd.run
        Name: service httpd restart
      Result: True
     Comment: Command "service httpd restart" run
     Started: 06:23:11.714577
    Duration: 134.233 ms
     Changes:
              ----------
              pid:
                  29860
              retcode:
                  0
              stderr:
              stdout:
                  Stopping httpd: [FAILED]
                  Starting httpd: [  OK  ]
----------
          ID: /etc/httpd/wildcard.crt
    Function: file.managed
      Result: True
     Comment: File /etc/httpd/wildcard.crt updated
     Started: 06:23:11.864664
    Duration: 13.596 ms
     Changes:
              ----------
              diff:
                  New file
----------
          ID: apache-service
    Function: service.running
        Name: httpd
      Result: True
     Comment: Service httpd has been enabled, and is in the desired state
     Started: 06:23:11.878634
    Duration: 259.397 ms
     Changes:
              ----------
              httpd:
                  True

Log output:

2019-01-16 06:25:57,323 [salt.state                                                       ][INFO    ][30096] Made the following changes:
'httpd24' changed from 'absent' to '2.4.37-1.83.amzn1'

2019-01-16 06:25:57,350 [salt.state                                                       ][INFO    ][30096] Loading fresh modules for state activity
2019-01-16 06:25:57,402 [salt.state                                                       ][INFO    ][30096] Completed state [httpd24] at time 06:25:57.402692 duration_in_ms=8236.931
2019-01-16 06:25:57,406 [salt.state                                                       ][INFO    ][30096] Running state [service httpd restart] at time 06:25:57.406222
2019-01-16 06:25:57,406 [salt.state                                                       ][INFO    ][30096] Executing state cmd.run for [service httpd restart]
2019-01-16 06:25:57,410 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command 'service httpd restart' in directory '/root'
2019-01-16 06:25:57,530 [salt.state                                                       ][INFO    ][30096] {'pid': 30143, 'retcode': 0, 'stderr': '', 'stdout': 'Stopping httpd: [FAILED]\r\nStarting httpd: [  OK  ]'}
2019-01-16 06:25:57,531 [salt.state                                                       ][INFO    ][30096] Completed state [service httpd restart] at time 06:25:57.531133 duration_in_ms=124.91
2019-01-16 06:25:57,553 [salt.state                                                       ][INFO    ][30096] Running state [/etc/httpd/foo.crt] at time 06:25:57.553259
2019-01-16 06:25:57,553 [salt.state                                                       ][INFO    ][30096] Executing state file.managed for [/etc/httpd/foo.crt]
2019-01-16 06:25:57,569 [salt.state                                                       ][INFO    ][30096] File changed:
New file
2019-01-16 06:25:57,569 [salt.state                                                       ][INFO    ][30096] Completed state [/etc/httpd/foo.crt] at time 06:25:57.569583 duration_in_ms=16.325
2019-01-16 06:25:57,570 [salt.state                                                       ][INFO    ][30096] Running state [httpd] at time 06:25:57.569983
2019-01-16 06:25:57,570 [salt.state                                                       ][INFO    ][30096] Executing state service.running for [httpd]
2019-01-16 06:25:57,571 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/service httpd status' in directory '/root'
2019-01-16 06:25:57,675 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
2019-01-16 06:25:57,683 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/runlevel' in directory '/root'
2019-01-16 06:25:57,704 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/runlevel' in directory '/root'
2019-01-16 06:25:57,721 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
2019-01-16 06:25:57,742 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/runlevel' in directory '/root'
2019-01-16 06:25:57,756 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/runlevel' in directory '/root'
2019-01-16 06:25:57,778 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
2019-01-16 06:25:57,786 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/chkconfig httpd on' in directory '/root'
2019-01-16 06:25:57,809 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
2019-01-16 06:25:57,815 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30096] Executing command '/sbin/runlevel' in directory '/root'
2019-01-16 06:25:57,821 [salt.state                                                       ][INFO    ][30096] {'httpd': True}
2019-01-16 06:25:57,821 [salt.state                                                       ][INFO    ][30096] Completed state [httpd] at time 06:25:57.821821 duration_in_ms=251.837

After changing cmd.run to service httpd restart && /sbin/chkconfig httpd on... the service is running and enabled when the watch runs... and the watch/restart works:

State output:

          ID: apache-pkg
    Function: pkg.installed
        Name: httpd24
      Result: True
     Comment: The following packages were installed/updated: httpd24
     Started: 06:27:36.354670
    Duration: 8040.531 ms
     Changes:
              ----------
              httpd24:
                  ----------
                  new:
                      2.4.37-1.83.amzn1
                  old:
----------
          ID: restart-apache-after-module-installation
    Function: cmd.run
        Name: service httpd restart && /sbin/chkconfig httpd on
      Result: True
     Comment: Command "service httpd restart && /sbin/chkconfig httpd on" run
     Started: 06:27:44.406192
    Duration: 151.919 ms
     Changes:
              ----------
              pid:
                  30390
              retcode:
                  0
              stderr:
              stdout:
                  Stopping httpd: [FAILED]
                  Starting httpd: [  OK  ]
----------
          ID: /etc/httpd/foo.crt
    Function: file.managed
      Result: True
     Comment: File /etc/httpd/foo.crt updated
     Started: 06:27:44.569892
    Duration: 16.433 ms
     Changes:
              ----------
              diff:
                  New file
----------
          ID: apache-service
    Function: service.running
        Name: httpd
      Result: True
     Comment: Service restarted
     Started: 06:27:44.726029
    Duration: 290.689 ms
     Changes:
              ----------
              httpd:
                  True

Summary for dev-charlestest-1b053.shopstyleops.com
------------
Succeeded: 4 (changed=4)
Failed:    0
------------
Total states run:     4
Total run time:   8.500 s

Log output:

2019-01-16 06:27:44,308 [salt.state                                                       ][INFO    ][30342] Made the following changes:
'httpd24' changed from 'absent' to '2.4.37-1.83.amzn1'

2019-01-16 06:27:44,334 [salt.state                                                       ][INFO    ][30342] Loading fresh modules for state activity
2019-01-16 06:27:44,395 [salt.state                                                       ][INFO    ][30342] Completed state [httpd24] at time 06:27:44.395201 duration_in_ms=8040.531
2019-01-16 06:27:44,406 [salt.state                                                       ][INFO    ][30342] Running state [service httpd restart && /sbin/chkconfig httpd on] at time 06:27:44.406191
2019-01-16 06:27:44,406 [salt.state                                                       ][INFO    ][30342] Executing state cmd.run for [service httpd restart && /sbin/chkconfig httpd on]
2019-01-16 06:27:44,410 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30342] Executing command 'service httpd restart && /sbin/chkconfig httpd on' in directory '/root'
2019-01-16 06:27:44,557 [salt.state                                                       ][INFO    ][30342] {'pid': 30390, 'retcode': 0, 'stderr': '', 'stdout': 'Stopping httpd: [FAILED]\r\nStarting httpd: [  OK  ]'}
2019-01-16 06:27:44,558 [salt.state                                                       ][INFO    ][30342] Completed state [service httpd restart && /sbin/chkconfig httpd on] at time 06:27:44.558111 duration_in_ms=151.919
2019-01-16 06:27:44,569 [salt.state                                                       ][INFO    ][30342] Running state [/etc/httpd/foo.crt] at time 06:27:44.569892
2019-01-16 06:27:44,571 [salt.state                                                       ][INFO    ][30342] Executing state file.managed for [/etc/httpd/foo.crt]
2019-01-16 06:27:44,585 [salt.state                                                       ][INFO    ][30342] File changed:
New file
2019-01-16 06:27:44,586 [salt.state                                                       ][INFO    ][30342] Completed state [/etc/httpd/foo.crt] at time 06:27:44.586325 duration_in_ms=16.433
2019-01-16 06:27:44,586 [salt.state                                                       ][INFO    ][30342] Running state [httpd] at time 06:27:44.586681
2019-01-16 06:27:44,587 [salt.state                                                       ][INFO    ][30342] Executing state service.running for [httpd]
2019-01-16 06:27:44,587 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30342] Executing command '/sbin/service httpd status' in directory '/root'
2019-01-16 06:27:44,705 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30342] Executing command '/sbin/chkconfig --list httpd' in directory '/root'
2019-01-16 06:27:44,713 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30342] Executing command '/sbin/runlevel' in directory '/root'
2019-01-16 06:27:44,725 [salt.state                                                       ][INFO    ][30342] The service httpd is already running
2019-01-16 06:27:44,725 [salt.state                                                       ][INFO    ][30342] Completed state [httpd] at time 06:27:44.725779 duration_in_ms=139.097
2019-01-16 06:27:44,726 [salt.state                                                       ][INFO    ][30342] Running state [httpd] at time 06:27:44.726029
2019-01-16 06:27:44,726 [salt.state                                                       ][INFO    ][30342] Executing state service.mod_watch for [httpd]
2019-01-16 06:27:44,727 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30342] Executing command '/sbin/service httpd status' in directory '/root'
2019-01-16 06:27:44,816 [salt.loaded.int.module.cmdmod                                    ][INFO    ][30342] Executing command '/sbin/service httpd restart' in directory '/root'
2019-01-16 06:27:45,016 [salt.state                                                       ][INFO    ][30342] {'httpd': True}
2019-01-16 06:27:45,016 [salt.state                                                       ][INFO    ][30342] Completed state [httpd] at time 06:27:45.016718 duration_in_ms=290.689
2019-01-16 06:27:45,018 [salt.minion

@waynew I finally had time to come up with a streamlined setup of what we are doing when the problem occurs. I'm attaching a tarball that makes it 100% reproducible with the help of Terraform and AWS (only free-tier resources are used). Running the exhibit.sh script will do the trick, and it should be fairly easy to follow.

In short, it spins up an Ubuntu server, installs the latest Salt in masterless mode, copies the state files and pillar, and applies the highstate. Then it changes a pillar value and re-applies the highstate, at which point the bug is triggered and the service is not restarted as expected.

In this case, it boils down to a bad interaction between onchanges and watch. Here is the sls for reference:

/tmp/test/index.html:
  file.managed:
    - makedirs: True
    - contents:
      - Hello World
{# this will break the watch on `test_running` #}
    - onchanges_in:
      - test_running

test_systemd_unit:
  file.managed:
    - name: /etc/systemd/system/test.service
    - source: salt://test.service.jinja
    - template: jinja
    - mode: 600
  module.run:
    - name: service.systemctl_reload
    - onchanges:
      - file: test_systemd_unit

test_running:
  service.running:
    - name: test
    - enable: True
    - watch:
      - module: test_systemd_unit

After changing the pillar, the salt-call output is:

local:
----------
          ID: /tmp/test/index.html
    Function: file.managed
      Result: True
     Comment: File /tmp/test/index.html is in the correct state
     Started: 15:48:28.159757
    Duration: 26.06 ms
     Changes:
----------
          ID: test_systemd_unit
    Function: file.managed
        Name: /etc/systemd/system/test.service
      Result: True
     Comment: File /etc/systemd/system/test.service updated
     Started: 15:48:28.186076
    Duration: 24.858 ms
     Changes:
              ----------
              diff:
                  ---
                  +++
                  @@ -9,7 +9,7 @@
                   Type=simple
                   Restart=on-failure
                   WorkingDirectory=/tmp/test
                  -ExecStart=/usr/bin/python3 -m http.server 8080
                  +ExecStart=/usr/bin/python3 -m http.server 8081

                   [Install]
                   WantedBy=multi-user.target
----------
          ID: test_systemd_unit
    Function: module.run
        Name: service.systemctl_reload
      Result: True
     Comment: Module function service.systemctl_reload executed
     Started: 15:48:28.212099
    Duration: 474.369 ms
     Changes:
              ----------
              ret:
                  True
----------
          ID: test_running
    Function: service.running
        Name: test
      Result: True
     Comment: State was not run because none of the onchanges reqs changed
     Started: 15:48:28.687950
    Duration: 0.015 ms
     Changes:

Summary for local
------------
Succeeded: 4 (changed=2)
Failed:    0
------------
Total states run:     4
Total run time: 525.302 ms

As you can see, test_systemd_unit is triggered and produces changes, which are picked up by the onchanges on module.run, and the systemd unit is reloaded. But then the test_running state is not triggered, claiming that none of the onchanges reqs changed. That is technically true - its onchanges points to the file.managed of /tmp/test/index.html, which didn't change - but it happily ignores the watch, which does trigger correctly if you remove the onchanges_in from index.html.
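
For what it's worth, dropping the onchanges_in and expressing both dependencies as watches on the service should avoid the interaction - a sketch along those lines (untested):

/tmp/test/index.html:
  file.managed:
    - makedirs: True
    - contents:
      - Hello World

test_running:
  service.running:
    - name: test
    - enable: True
    # watch both the reloaded unit and the content: either change triggers
    # mod_watch (a restart) instead of gating whether the state runs at all
    - watch:
      - module: test_systemd_unit
      - file: /tmp/test/index.html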

Cheers,

Olivier

@cmclaughlin Okay, so I discovered that you are definitely finding some unexpected behavior! It took a while for me to nail down exactly what's happening - I'm not sure if it's just a bug in our documentation, or if it's an actual bug.

What's happening here is kind of crazy.

If you remove the enable: True, or split it into

enable-apache:
  service.enabled:
    - name: httpd24

Everything works 🎉

The question you might have then is, "But _why_??"

Which is a darn good question 🤣

Turns out that the answer is pretty simple though - the only thing that's going wrong here is that by starting (but not enabling) a service, when Salt looks at the service it says, "Ah, yeah, running is good. Oh, but let me enable this service. Cool, I've made some changes to it. What, you have a watch? No, I don't need to do anything, I've already changed this service!"

Oops :trollface:

I definitely agree that it seems strange, or perhaps even unintuitive, but it is at least consistent, and now I understand exactly why 🙂


Just had a quick conversation, and this is definitely intended behavior - I'm going to see about adding some information to our docs that calls it out and perhaps gives a better explanation.
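
To make the workaround concrete, the full shape would presumably be something like this (a sketch; I'm assuming httpd is the service name behind the httpd24 package):

apache-pkg:
  pkg.installed:
    - name: httpd24

# enabling lives in its own state, so apache-service stays change-free on a
# normal run and its mod_watch (the restart) can still fire
enable-apache:
  service.enabled:
    - name: httpd

apache-service:
  service.running:
    - name: httpd

/etc/httpd/foo.crt:
  file.managed:
    - contents: hi
    - watch_in:
      - service: apache-service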

@waynew can you give an overview of why it's intended behavior here?

@cmclaughlin Sure!

Watch looks for changes in one state and, if the watching state hasn't changed, triggers its watching behavior.

If A is the watched state, and B is the watcher:

Does A have changes? Does B need changes? Tell B to change.
Does A have changes? Has B already changed? Cool cool, nothing to do here.

That's it. That's what watching does.
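
A minimal illustration of that rule, with made-up names:

A:
  file.managed:
    - name: /tmp/watched-file      # hypothetical path
    - contents: new content

B:
  service.running:
    - name: myservice              # hypothetical service
    - watch:
      - file: A

# If A reports changes and B's own run produced no changes, B's mod_watch
# runs - for a service, that's a restart. If B already reported changes of
# its own (say, it just enabled the service), mod_watch is skipped.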


In the confusing case we have the following:

Does the conf file have changes? Yeah, cool, does apache have changes? Oh, we enabled the service? Yeah, cool, nothing more to do here.

It's following the pattern, because _enabling a service is a change_. And this change happened after the config file was changed, so as far as watch is concerned, everything is A-OK and in the correct state.

It's confusing because we all expected service.running to be what was watched, but it's actually the whole apache-service state, and since that state had changes... It's a bit like a magician using misdirection, but instead of surprise and delight, we were all just annoyed 😝

The takeaway here is that the watcher is only required to change if it has not already.

apache-service had changes, and from the watch perspective there's no way to distinguish whether that change was "service started" or "service enabled". I don't even think there should be. I was thinking about the idea of creating an RFC to allow states to mark a change as "ignore this change for watches", but that would create terribly inconsistent behavior. Consider these examples:

apache enabled:
  service.enabled:
    - name: httpd24

apache running:
  service.running:
    - name: httpd24

And

apache running and enabled:
  service.running:
    - name: httpd24
    - enable: True

If we want this to match our misunderstanding then service.running should be triggered if we just enable the service, right? Okay, say we do that.

What if someone wants to add watch_in to their apache enabled state? Are we going to say that because it's standalone it gets executed, but when it's part of another state it doesn't? Or are we just blacklisting enabling a service from watches/requisites?

There's really no consistent way we could change the behavior.

I do think that it's worth adding this kind of example to our docs, to explicitly call out the fact that the watch behavior only fires when the watching state has no changes of its own, and that those changes may not match expectations.

Does that all make sense, or is there still some confusion?

Thanks, but it still doesn't really make sense to me.

I can see how there's a difference in behavior between watches on service.running and service.enabled... but if enabled is going to be a supported option on service.running I would expect it to behave exactly like a separate call to service.enabled.

The part about "enabling a service is a change" so the restart doesn't happen is totally counterintuitive to me. I would expect when file.managed is run, that's when the service status would be checked... and Salt would see the service is running, therefore a restart is required. From my end-user perspective, I don't see any reason why Salt can't enable the service and restart it. Why can only one change happen?

Thanks, but it still doesn't really make sense to me.

I think you actually just explained where our disconnect is 🙂 Let's see if it helps:

The part about "enabling a service is a change" so the restart doesn't happen is totally counterintuitive to me. I would expect when file.managed is run, that's when the service status would be checked...

That's a totally natural assumption... but how would you architect that? Would you couple file.managed to service.running? Where would you stop that coupling?

and Salt would see the service is running, therefore a restart is required. From my end-user perspective, I don't see any reason why Salt can't enable the service and restart it. Why can only one change happen?

Sure! I'd love to make that possible... but how would I implement that? I started to think about how to make that work but then I ran into this problem. How would you ensure that:

apache enabled:
  service.enabled:
    - name: httpd24

apache running:
  service.running:
    - name: httpd24  

and

apache running and enabled:
  service.running:
    - name: httpd24
    - enable: True

Behave identically when they're watching another state?

I couldn't come up with any good plans - either I ended up with tight coupling, or with a bunch of extra logic for each individual special case. Or they started making Salt behave inconsistently, which I think is probably the worst thing of all.

Do you have any ideas how we could make that happen?
