Example:
On a single machine, I have 10 instances of some server software.
There's a stop task and a start task, which stop or start all of the instances.
It takes 5-10 seconds for a single instance to cleanly shutdown.
The result is that the sequential script takes ~60 seconds to execute, while the parallel bash version takes just ~5 seconds...
(of course you could write such a task as a pure Python module so you don't clutter the SLS files...)
I came up with a cool parallel execution module (which keeps the usual SLS syntax) that I've been using ever since, so I wanted to hear some opinions upstream about it.
It appends the output from all tasks to the main task's output.
Unfortunately it doesn't track execution time per task.
I tried to use state.single, which delivers nice output and tracks execution time, but for some reason it made the module 10x slower:
# processes[name] = pool.apply_async(__salt__['state.single'], ("cmd.run", "echo hello!"), {"concurrent": True})
_states/multiexec.py
----
import logging
from multiprocessing.pool import ThreadPool

log = logging.getLogger(__name__)


def multiexec_state_sls(name, tasks):
    """Execute several states in parallel using a thread pool."""
    ret = {"name": name, "changes": {}, "result": True}
    processes = {}
    pool = ThreadPool(processes=20)  # TODO: allow the pool size to be specified
    ok = True
    tasks = tasks or {}
    for name, cmd in tasks.items():
        _name = name  # Keep the original key (it may be an int) for the results dict
        name = str(name)
        log.info("Executing command @ %s = %s", name, cmd)
        # Expected task structure:
        # task_1:
        #   cmd.run:
        #     - name: echo Testing!
        task = list(cmd.keys())[0]  # One state function per task ID only
        args = {}
        for x in cmd[task]:
            args.update(dict(x))
        if "name" not in args:
            args["name"] = name
        log.debug("task = %s", task)
        log.debug("cmd = %s", cmd[task])
        log.debug("args = %s", args)
        # __states__ is injected by the Salt loader; dispatch asynchronously: (func, args, kwargs)
        processes[_name] = pool.apply_async(__states__[task], (), args)
    for key, p in processes.items():
        out = p.get()  # Blocks until this task has finished
        if not out["result"]:
            ok = False
        ret["changes"][key] = {"out": out, "cmd": tasks[key]}
    pool.close()
    pool.join()
    if not ok:
        ret["comment"] = "Some tasks failed"
        ret["result"] = False
        return ret
    ret["comment"] = "Tasks successfully executed!"
    return ret
----
Example task:
Stop servers:
  multiexec.multiexec_state_sls:
    - tasks:
        {% for server in target_servers %}
        {{ server }}:
          cmd.run:
            - name: cd /home/servers/{{ server }} && bash stop.sh
        {% endfor %}
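After Jinja rendering (assuming, purely for illustration, that target_servers contains server_a and server_b), this expands to something like:

Stop servers:
  multiexec.multiexec_state_sls:
    - tasks:
        server_a:
          cmd.run:
            - name: cd /home/servers/server_a && bash stop.sh
        server_b:
          cmd.run:
            - name: cd /home/servers/server_b && bash stop.sh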
Salt 2016.11.1
@ninja- I'd like to get input from others on the core team who have more knowledge around this. Ping @cachedout @terminalmage @DmitryKuzmenko @rallytime: any opinions/ideas on this module?
We'd lose the ability to enforce requisites on individual tasks. As long as this caveat is stated plainly in the documentation, I'd have no problem with this.
Given that we're using the __states__ dunder dict to reference the function, including requisite arguments such as require, onchanges, etc. would nearly always result in tracebacks, so we'd probably want to filter them out. salt/state.py contains an attribute called STATE_INTERNAL_KEYWORDS; any keyword in this list which does not begin with an underscore would need to be filtered out before we execute the function. However, we would not want to filter out an argument that the function itself supports, so we'd want to use something like argspec to look for positional parameters in state funcs which are also in STATE_INTERNAL_KEYWORDS.
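To sketch the idea (a rough illustration, not actual Salt code; filter_internal_keywords is a hypothetical helper):
----
import inspect

from salt.state import STATE_INTERNAL_KEYWORDS


def filter_internal_keywords(func, args):
    """Drop internal/requisite keywords that the state function does not accept."""
    # Parameters the state function actually declares
    # (Python 2-era Salt; on Python 3 use inspect.getfullargspec)
    accepted = set(inspect.getargspec(func).args)
    filtered = {}
    for key, val in args.items():
        # Filter out anything in STATE_INTERNAL_KEYWORDS that doesn't start with
        # an underscore, unless the state function explicitly accepts it.
        if key in STATE_INTERNAL_KEYWORDS and not key.startswith("_") and key not in accepted:
            continue
        filtered[key] = val
    return filtered
----
The multiexec module above could then call filter_internal_keywords(__states__[task], args) before handing the args to apply_async.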
One thing I'm worried about is that there are (or at least were) known issues with multiprocessing on Windows. I'm not sure of the specifics on this though; maybe @cachedout or @thatch45 would have more insight.
ok, here is how to do this and make it work really nicely...
In the state system when we execute a state we wait for the return from that state and then add those returns to a dict called running. When we evaluate a state to see if its requisites are met we check this running dict.
So what we could do is allow a global flag in states, so a user could say parallel: True in the SLS, and then we would run that state in another process. The way to make requisites work would be to add the parallel running state data to another dict, so that if a requisite is found to still be running we just sleep for, say, 0.1 seconds and check again before evaluating whether it has finished.
Then you could just mark parallel: True for the states you want to run in parallel, and requisites would still be honored. Also, this would potentially destroy the value of the "order" option, but that is logically implied by the "parallel" option.
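For example, an SLS using the proposed flag might look something like this (the state IDs and paths here are just illustrative):

stop_server_1:
  cmd.run:
    - name: cd /home/servers/server_1 && bash stop.sh
    - parallel: True

stop_server_2:
  cmd.run:
    - name: cd /home/servers/server_2 && bash stop.sh
    - parallel: True

all_stopped:
  cmd.run:
    - name: echo "all servers stopped"
    - require:
      - cmd: stop_server_1
      - cmd: stop_server_2

The two stop states would run concurrently in separate processes, while all_stopped would wait (polling the parallel running data) until both of its requisites have finished.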
What do you think?
Oooh. I like that idea.
Closing as it's now implemented. Yay 😃
Backflip completed! I am really excited about this; we have been trying to figure out an elegant way to do parallel states for a LONG time! This is going to be fantastic!