Here is a simple Salt state to enable and start the jenkins service:
# jenkins.sls
activate_jenkins_service:
  service.running:
    - name: jenkins
    - enable: True
The official Jenkins installation on RedHat/CentOS/Fedora uses init.d/sysv scripts.
Manually enabling and starting the service through init.d/sysv scripts works perfectly even on systemd-based Fedora 20:
systemctl enable jenkins
jenkins.service is not a native service, redirecting to /sbin/chkconfig.
Executing /sbin/chkconfig jenkins on
systemctl start jenkins
On the other hand, Salt fails to execute the state:
salt-call -l all state.sls jenkins
...
ID: activate_jenkins_service
Function: service.running
Name: jenkins
Result: False
Comment: The named service jenkins is not available
Changes:
...
The problem stems from the fact that Salt runs the systemctl list-unit-files command, which lists only native systemd unit files and excludes init.d/sysv scripts:
...
[INFO ] Executing state service.running for jenkins
[INFO ] Executing command 'systemctl --full list-unit-files | col -b' in directory '/root'
...
Because Salt doesn't see the required jenkins service in the list of unit files, it never passes the next step to systemctl for enabling/starting/... the service, and it doesn't let systemctl tell "authoritatively" whether the service actually exists.
This issue is very closely related to issue #8444 (as far as the proposed solution is concerned) and is described in this comment.
Rather than executing any pre-validation logic (i.e. looking up the service name somewhere), Salt should rely on systemd (and its systemctl command) to determine whether states to enable/start/... the service failed or succeeded. In other words, Salt should optimistically execute systemctl with any arbitrary service name and report the result of that execution instead of trying to predict its outcome.
Again, see issue #8444.
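Purely as an illustration of that idea, here is a hypothetical shell sketch (not actual Salt code) of what "optimistic" execution means here:

svc=jenkins    # same flow for native units and init.d/sysv scripts
# No pre-scan of 'systemctl list-unit-files' - just run the commands and trust their exit codes.
if systemctl enable "$svc" && systemctl start "$svc"; then
    echo "service $svc enabled and started"
else
    echo "service $svc could not be enabled/started" >&2
fi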
The master and minion are the same host, running Fedora 20 x86_64:
salt --versions-report
Salt: 2014.1.1
Python: 2.7.5 (default, Feb 19 2014, 13:47:28)
Jinja2: 2.7.1
M2Crypto: 0.21.1
msgpack-python: 0.1.13
msgpack-pure: Not Installed
pycrypto: 2.6.1
PyYAML: 3.10
PyZMQ: 13.0.2
ZMQ: 3.2.4
I agree with what you're saying here. Let's try and get this in.
@mtorromeo says this should be fixed by #11921. @uvsmtid can you verify?
@cachedout and @mtorromeo Thanks for the updates!
I cherry-picked both 90bece1 and 9617d33 on top of 2014.1 (latest develop had some unrelated issues) in my virtualenv.
I used a state similar to the one mentioned there in its example:
activate_vpn_service:
  service.running:
    - name: [email protected]
    - enable: True
Indeed, commit 9617d33 handles @ in systemd unit names to make it work.
And while it still uses the systemctl --full list-units command (see the problems with init.d/sysv services next), parameterized services were listed in all my tries.
See the example of the Jenkins service state at the beginning of this issue.
After trying variations of enable/disable and start/stop, I can conclude that it does not work in the general case. And here is why...
The code after commit 90bece1 still uses the command systemctl --full list-units, which simply does not list init.d/sysv services until they are started on the system (only starting matters: enable/disable won't change the listing).
For example, start jenkins service manually and try to list it:
sudo systemctl start jenkins
systemctl --full list-units | grep jenkins
jenkins.service
# OK
Then stop jenkins service manually and execute:
sudo systemctl stop jenkins
systemctl --full list-units | grep jenkins
# ERROR: no output captured by grep
Although this looks more like an issue with systemd (I have even updated it here), the fastest fix is still possible through Salt alone. The argument is that systemctl --full list-units is not required to manage a service.
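As a rough illustration of that argument (output is approximate and varies between systemd versions), systemctl status already distinguishes a stopped-but-present init.d/sysv service from a truly missing unit through its Loaded: line, so no unit listing is needed:

sudo systemctl stop jenkins
systemctl status jenkins | grep Loaded
# Loaded: loaded (/etc/rc.d/init.d/jenkins)                 <- present, just stopped
systemctl status no-such-service | grep Loaded
# Loaded: not-found (Reason: No such file or directory)     <- really missing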
@uvsmtid This is great feedback, thank you! I'll go ahead and close #8444 then and we'll keep working on this one.
This is still broken in 2014.1.10 on Fedora-20. While there is a kludgy workaround, this really does need to be fixed. The workaround, for those in a CI/CD environment who need to clear out any blocks in their pipeline, is ugly but works (this example is for Centrify, which also uses sysv init-style files but is manageable with systemd under FC20):
centrify-service:
  service.running:
    - name: centrifydc
    - enable: True
    - reload: True
    - watch:
      - file: /etc/centrifydc/centrifydc.conf
    - require:
      - pkg: centrify-packages
      - file: centrify-config
      - cmd: centrify-adjoin
{%- if salt['grains.get']('osfinger', 'undefined') == 'Fedora-20' %}
    - provider: service
{%- endif %}
Hello,
A little update on a strange thing:
salt-call service.available registrator.service
[INFO ] Executing command 'systemctl --all --full --no-legend --no-pager list-units | col -b' in directory '/root'
[INFO ] Executing command 'systemctl --full --no-legend --no-pager list-unit-files | col -b' in directory '/root'
[INFO ] Legacy init script: "README".
[INFO ] Legacy init script: "functions".
[INFO ] Legacy init script: "netconsole".
[INFO ] Legacy init script: "network".
local:
False
But:
salt-call service.available registrator
[INFO ] Executing command 'systemctl --all --full --no-legend --no-pager list-units | col -b' in directory '/root'
[INFO ] Executing command 'systemctl --full --no-legend --no-pager list-unit-files | col -b' in directory '/root'
[INFO ] Legacy init script: "README".
[INFO ] Legacy init script: "functions".
[INFO ] Legacy init script: "netconsole".
[INFO ] Legacy init script: "network".
local:
True
Why isn't the ".service" suffix supported? On systemd both forms work :/
(and it gave me a bit of a headache to track this down...)
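For reference (assuming the registrator unit from above), systemctl itself treats the two spellings as the same unit and simply appends the .service suffix when it is omitted, roughly:

systemctl status registrator | head -n 1
systemctl status registrator.service | head -n 1
# both print the same unit header ("registrator.service - ...")

so it would be reasonable for service.available to accept both forms as well.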
This happens to me when using hadoop-formula's hadoop.hdfs state. It starts three different services. The first service started by the highstate during a fresh run is not found. The rest of the services are found and function as normal. A second highstate run proceeds normally. This possibly indicates that Salt is reloading systemd later in the process than needed.
State:
{% if hdfs.is_namenode or hdfs.is_datanode %}
hdfs-services:
  service.running:
    - enable: True
    - names:
      {% if hdfs.is_namenode %}
      - hadoop-secondarynamenode
      - hadoop-namenode
      {% endif %}
      {% if hdfs.is_datanode %}
      - hadoop-datanode
      {% endif %}
{% endif %}
I also extend hdfs-services with provider: debian_service. I've tried it with the default for Debian Jessie (provider: systemd) with the same results.
/var/log/salt/minion:
[INFO ] Executing command 'service hadoop-namenode status' in directory '/root'
[ERROR ] Command 'service hadoop-namenode status' failed with return code: 3
[ERROR ] output: * hadoop-namenode.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
[INFO ] Executing command 'service hadoop-namenode start' in directory '/root'
[ERROR ] Command 'service hadoop-namenode start' failed with return code: 6
[ERROR ] output: Failed to start hadoop-namenode.service: Unit hadoop-namenode.service failed to load: No such file or directory.
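One hypothetical way to check the reload-timing guess above: since a second highstate finds the services, the units clearly exist after the first run, so forcing the reload by hand and re-checking should show whether a daemon-reload at the right moment is all that is missing:

sudo systemctl daemon-reload          # re-generate units from the freshly installed init scripts
sudo service hadoop-namenode status   # should now report the unit as found instead of not-found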
Versions report:
Salt: 2015.5.0
Python: 2.7.9 (default, Mar 1 2015, 12:57:24)
Jinja2: 2.7.3
M2Crypto: 0.21.1
msgpack-python: 0.4.2
msgpack-pure: Not Installed
pycrypto: 2.6.1
libnacl: Not Installed
PyYAML: 3.11
ioflo: Not Installed
PyZMQ: 14.4.0
RAET: Not Installed
ZMQ: 4.0.5
Mako: 1.0.0
Debian source package: 2015.5.0+ds-1~bpo8+1
I would like to debug this further but haven't debugged Salt much since I switched from Salt SSH to a Master/Minion setup. Suggestions?
Running CentOS 7, Salt version 2015.8.8.2. Cassandra is affected by this as well. As a workaround, running this kludge works:
cassandra_kludge:
  cmd.run:
    - name: systemctl enable cassandra
    - unless: systemctl -a | grep cassandra

cassandra_service:
  service.running:
    - name: cassandra
    - init_delay: 10
    - require:
      - cmd: cassandra_kludge
This even made me update the bug in systemd again.
My tests still confirm that systemd offers no known way to list disabled services based on init.d/sysv scripts. The best current solution would be to enable/start/stop/disable the service and check the error code returned by systemctl - it will fail if there is no such service and succeed if there is one, without any need to know about it upfront.
I have discovered a somewhat similar problem on Debian Jessie when I deploy a new sysv script and try to use the service.running state. I get:
2016-11-28 13:59:10,206 [salt.state ][INFO ][1092] Running state [pgbouncer-web-login] at time 13:59:10.205771
2016-11-28 13:59:10,207 [salt.state ][INFO ][1092] Executing state service.running for pgbouncer-web-login
2016-11-28 13:59:10,209 [salt.loaded.int.module.cmdmod][INFO ][1092] Executing command ['systemctl', 'status', 'pgbouncer-web-login.service', '-n', '0'] in directory '/root'
2016-11-28 13:59:10,229 [salt.loaded.int.module.cmdmod][DEBUG ][1092] output: * pgbouncer-web-login.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
2016-11-28 13:59:10,230 [salt.state ][ERROR ][1092] The named service pgbouncer-web-login is not available
The whole idea is to create an /etc/init.d/pgbouncer-web-login daemon which is a modified copy of /etc/init.d/pgbouncer (pgbouncer does not yet support systemd) with different ports, configs, etc., because of the need to run multiple pgbouncer pools - but those are details.
I had no problem on Wheezy, but on Jessie with systemd it seems that I have to execute systemctl daemon-reload (using module.wait -> cmd.run) to make the new init.d script "visible" and service.running work.
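The rough manual sequence (paths from above, simplified) is:

sudo cp /etc/init.d/pgbouncer /etc/init.d/pgbouncer-web-login   # deploy the new sysv script
systemctl status pgbouncer-web-login                            # Loaded: not-found, so service.running refuses to act
sudo systemctl daemon-reload                                    # let systemd (re)generate units from /etc/init.d
systemctl status pgbouncer-web-login                            # now loaded, and service.running works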
But does that mean that service.running should always reload the systemd configuration? Would that be... "bad" in any case?
Still see this same issue:
Salt Version:
Salt: 2016.3.5
Dependency Versions:
cffi: 0.8.6
cherrypy: Not Installed
dateutil: 1.5
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.7.2
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: 0.21.1
Mako: 0.8.1
msgpack-pure: Not Installed
msgpack-python: 0.4.8
mysql-python: 1.2.5
pycparser: 2.14
pycrypto: 2.6.1
pygit2: Not Installed
Python: 2.7.5 (default, Sep 15 2016, 22:37:39)
python-gnupg: Not Installed
PyYAML: 3.11
PyZMQ: 15.3.0
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.2.1
ZMQ: 4.1.4
System Versions:
dist: centos 7.2.1511 Core
machine: x86_64
release: 4.4.52-2.el7.centos.x86_64
system: Linux
version: CentOS Linux 7.2.1511 Core
Using a very simple file.managed + service.running/enable:
/etc/init.d/vxlan:
  file.managed:
    - source: salt://services/vxlan/vxlan
    - user: root
    - group: root
    - mode: 755
    - require_in:
      - service: vxlan

vxlan:
  service.running:
    - enable: True
If I run chkconfig --add vxlan and then re-run these states, there is no problem. BTW, this appears to be a regression, as I don't recall having this issue in 2016.3.4. I haven't tested 2016.11.3, which came out today, as we're not quite ready to move to that yet. Although, I'm inclined to just change this to a systemd service, given that I have full control over this one, regardless of the bug in Salt.
I ran into the same issue with the Cassandra init service on CentOS 7. @gtmanfred suggested using the provider option for service.running, which fixed the issue for me.
https://docs.saltstack.com/en/latest/ref/states/providers.html
start cassandra:
  service.running:
    - name: cassandra
    - provider: rh_service
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.