This might be trickier than just that; at the very least, Salt should reload the modules on the minion.
+1 on this issue. I have plans on modifying the salt master's configuration very, very often, and would prefer to reload the config rather than needing to restart the daemon (which would cause salt outages).
I will slate this for the next release then and investigate how viable it will be
I have added some code to help mitigate some issues when restarting the salt master. I am still looking into how to manage this one
I am running into this issue with upstart. When upstart signals a reload, it sends SIGHUP to the parent salt-master, which then promptly dies without any error or debug output, orphaning all of the child salt-master processes. Upstart then thinks the process is dead, leaving lots of orphaned children.
A subsequent "start saltmaster" immediately dies as well, since the port is still in use by the orphaned children.
At the very least, the salt-master should catch the signal and kill off its children, e.g. in salt/master.py, in Master.start() (line 343):
signal.signal(signal.SIGHUP, sigterm_clean)
Ooh, thanks for the report -- even if we don't properly reload, we need to handle SIGHUP more gracefully than that.
I don't want to be just another +1, but I too am planning to modify the configuration files. Mainly, I already have an /etc/salt/master.d/nodegroups.conf file where I define my nodegroups (I am still not convinced by the available external node classifiers). As such, any modification of said file must trigger a salt-master restart which, if I'm in the middle of a highstate call, causes not-nice things to happen.
I tried to read salt's code to see if it was relatively easy to add this, but alas I did not understand much :)
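For reference, such a nodegroups drop-in is just plain master config; a minimal sketch with made-up group names and targets:
# /etc/salt/master.d/nodegroups.conf (illustrative values only)
nodegroups:
  webservers: 'L@web1.example.com,web2.example.com'
  dbservers: 'G@roles:db and G@os:Debian'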
Actually I've been digging around the daemons quite a bit recently and although I don't think this will be _easy_, it should be possible. I'll add this to my list of things to hack on ;)
The reason this does not work is that catching signals in Python interrupts the ZMQ threads and we get crashes or lost packets. We hope to get this working with Salt RAET once we have implemented more of the major components.
We already catch quite a few signals in the daemons and IIRC we don't have stuff exploding all over... I think the larger problem is going to be the varying classes that keep copies of the opts dict all over creation ;)
Good point, but just keep that in mind. Generally we are only catching SIGINT and SIGTERM, which do not care if there are ZMQ issues. Just making sure you know the issues we have already seen here. This is also one reason why we daemonize child procs instead of using SIGCHLD, which will also cause issues with propagating the SIGHUP.
Yea, I'm thinking we'll want the parent to catch the sighup and coordinate the reload amongst the children. I have some ideas, but I'll need to make some time to mess with it :)
+1: I'm looking for a way to reload the nodegroups configuration only, without restarting the whole salt-master
@bbinet Pretty sure that currently works. At least, last time I tried. Nodegroups don't require master restart.
It does not work for me:
$ salt -c /config -N a test.ping
No minions matched the target. No command was sent, no jid was assigned.
The strange thing is that in the master logs, I can see that the "a" nodegroup had actually matched the correct list of minions:
2015-07-03 12:16:48,382 [salt.master ][INFO ][178] Clear payload received with command publish
2015-07-03 12:16:48,384 [salt.utils.minions][DEBUG ][178] Evaluating final compound matching expr: ( ( set(['hl-lxc-1-dev', 'hl-mc-9999-dev', 'hl-mc-3-dev', 'hl-mc-8888-dev', 'hl-mc-4-dev', 'hl-mc-5-dev']) ) & ( set(['cm-mc-1-dev']) ) )
But I don't know why we then get the "No minions matched the target" message.
Note that it works correctly when I target a nodegroup that was already existing when the salt-master was started.
Should I create a new issue specifically for that?
If you look at that log line, that expression will evaluate to an empty set. It's &-ing two sets which have no strings in common. This is why it says no minions were matched. Are you certain your new nodegroup actually does match minions?
You're right, so this is working as expected.
Thanks and sorry for the noise.
No problem, just wanted to make sure we fixed a bug if there was one! ;)
Is it possible today to refresh just the gitfs_remotes or the file roots without restarting the salt-master?
@sametimesolutions No, not yet.
Is there any update for this, to reload salt-minion without restarting?
Unfortunately no, this is a very hard problem to solve that would require intensive re-architecting for the minion/master. It's hard to propagate reloaded configs across the various processes that are a part of the master, for one.
When it comes to the minion, a restart should not be so onerous. Minion restarts are fairly lightweight and minimize downtime. The master is where the problems are, since all the minions need to re-auth once the master comes back up, which can cause some load issues.
For us a restart is very problematic. We need to reload the minion after setting up a new VM to enable reporting data with the mine mechanism. This has to be executed as one of the first steps, so we can't use the "atd" scenario with a delayed restart of salt-minion as a last step. When we execute a highstate, it first fails (because the minion was restarted), so we have to run it twice on a fresh machine.
Have you considered configuring the mine in pillar rather than in the minion config? It solves this very problem.
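A minimal sketch of what that looks like on the pillar side (the mined function and interface here are just examples):
# Pillar assigned to the minion; picked up on the next mine update after a
# pillar refresh, no minion restart needed
mine_functions:
  network.ip_addrs:
    - eth0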
Sounds like a great suggestion; I did not know about this feature, but it looks very promising. Is it also possible to configure mine_interval with pillar, or are we limited to mine_functions? Even if so, it would be possible to work around a longer mine interval with some sleep command. Your help is greatly appreciated!
I cannot remember off the top of my head whether mine_interval works configured through pillar. It might, I just can't remember.
@marek-obuchowicz We work around this problem by using a set of startup_states. Within those is a simple service watch which restarts the minion when something important changes, works a treat here.
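Roughly, the shape of such a startup state (the watched file and source paths are assumptions, not their actual setup):
# Restart the minion whenever a config drop-in it depends on changes
/etc/salt/minion.d/extra.conf:
  file.managed:
    - source: salt://minion/files/extra.conf

salt-minion:
  service.running:
    - enable: True
    - watch:
      - file: /etc/salt/minion.d/extra.conf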
Also, I recently made #6691 happen, so starting in Boron, cmd.run_bg can quite likely also be used in some capacity.
Subscribing because this also affects #13558. Although the issue is now >4 years old so I'm guessing not much is happening.
If catching signals is problematic, how about a workaround where we use something else (maybe a salt event) to trigger the config reload?
To give some clarity: the reason this is "difficult" is not so much catching the signal (which has some gotchas to deal with), but primarily that the various classes and subsystems are initialized with certain config values at start, and we'd need to make it so that those subsystems (the minion, reactor, etc.) would know how to "reload" their config. So, it's not impossible, but it will require a decent amount of work to do correctly.
+1
This is a long thread. Is the feature documented somewhere? What should I do when I add or update configurations/modules? This is a very basic problem for daily tasks. Thanks.
I solved my minion-restart problems with a little salt script that copies (using file.managed) a small Python program to the minion to do the restart, then runs it using "cmd.run" with "bg: true".
See the gist.
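Not the actual gist, but the general shape of such a state, with a made-up script name and paths:
# Push a small restart helper to the minion, then launch it in the background
# so the restart does not kill the state run that triggered it
/usr/local/bin/restart-salt-minion.py:
  file.managed:
    - source: salt://scripts/restart-salt-minion.py
    - mode: '0755'

restart-salt-minion:
  cmd.run:
    - name: /usr/local/bin/restart-salt-minion.py
    - bg: true
    - onchanges:
      - file: /usr/local/bin/restart-salt-minion.py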
ZD-1632
Is there any work going into this, or has it been postponed to a future release?
+1
We at Juniper also have many schedulers configured and use the influxdb returner to post data to the DB. When we change the DB config in /srv/salt/proxy, we have to restart the proxy to pick up the config, and with that, all the schedulers are gone. Any way to avoid a proxy restart?
@cachedout @cr0hn
+1
+1
+1
+1
I know this quote is nearly 5 years old, but this is not correct:
@bbinet Pretty sure that currently works. At least, last time I tried. Nodegroups don't require master restart.
Docs:
When adding or modifying nodegroups to a master configuration file, the master must be restarted for those changes to be fully recognized.
A limited amount of functionality, such as targeting with -N from the command-line may be available without a restart.
https://docs.saltstack.com/en/latest/topics/targeting/nodegroups.html
If you are testing on the CLI (salt -N group1 test.ping), then you are good, but according to the docs not all functionality is available without a restart.
I only bring this up because we abandoned nodegroups a long time ago due to the fact that we needed a restart. When I read this comment and tested it, I thought maybe we didn't need a restart after all, but after re-reading the nodegroup docs I realized my initial thought was correct and that a restart _is_ in fact required.
The docs don't mention which functionality is _not_ available, but I assume top file targeting would not be, as that is where I need this.
This is all a moot point anyway because currently there is a bug in 2019.2 that prevents compound matching of NodeGroups.
https://github.com/saltstack/salt/issues/52678
So +1 for a way to reload master/minion configs. I want a way to programmatically get a list of minions and assign them roles. This would mean nodegroups would be constantly changing, and the master would need to be restarted multiple times per day.
+1
Personally, I am working on my custom reactors. The master has about 200 minions, so it is quite a pain in the @ss to restart the whole master just to apply those changes.
What if we create a patch as a workaround until this is implemented?
+1
+1