Salt: Expose Prometheus metrics

Created on 8 Oct 2018 · 12 Comments · Source: saltstack/salt

Description of Issue/Question

Prometheus is quickly becoming the de facto standard for monitoring and alerting. Its pull model, rather than push, would fit well with the salt-master/salt-minion topology. Exposing metrics from salt-minion and salt-master would allow more flexibility in monitoring large Salt environments.

Here is a list of other software that exposes metrics:
https://prometheus.io/docs/instrumenting/exporters/

As an example of metric data that could be exposed by the minion:

salt_minion_last_run_state_completion_time
salt_minion_last_run_state_executed
salt_minion_last_run_state_error
salt_minion_last_run_state_success
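
To make the proposal concrete, here is a minimal sketch of how those four metrics might look in the Prometheus text exposition format. It uses only the Python standard library. The function name, its parameters, and the meaning given to each metric (for example, treating the completion time as a Unix timestamp) are assumptions for illustration; none of this is an existing Salt API.

```python
def render_minion_metrics(completion_time, executed, errors, successes):
    """Render the four proposed gauges in Prometheus text exposition format.

    All arguments are hypothetical inputs; how Salt would supply them is
    not specified in this issue.
    """
    metrics = [
        ("salt_minion_last_run_state_completion_time",
         "Unix timestamp at which the last state run completed (assumed meaning)",
         completion_time),
        ("salt_minion_last_run_state_executed",
         "Number of states executed in the last run (assumed meaning)",
         executed),
        ("salt_minion_last_run_state_error",
         "Number of states that failed in the last run (assumed meaning)",
         errors),
        ("salt_minion_last_run_state_success",
         "Number of states that succeeded in the last run (assumed meaning)",
         successes),
    ]
    lines = []
    for name, help_text, value in metrics:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The exposition format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_minion_metrics(1539000000, 42, 1, 41), end="")
```

Gauges are used here because each value describes the most recent run rather than a running total.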

Feature

gnalsa

👍16

Most helpful comment

I would like to suggest some possible salt-master side metrics to expose.

salt_master_keys{key_state="accepted"}
salt_master_gitfs_lock
salt_master_number_of_scheduled_jobs
salt_master_number_of_threads
salt_master_number_of_jobs_active
salt_master_number_of_minions_return
salt_master_running_process
salt_syndic_running_process
salt_syndic_master_sync

I think these are some of the things I would like to be able to track and possibly alert on. For the masters, part of it is knowing the master is healthy, but then also being able to track load over time as more minions are added to it. These types of metrics can help figure out that right balance of resources and scale.
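
As a rough illustration of the labelled metric salt_master_keys{key_state="accepted"}, the sketch below counts keys per state and renders one sample per label value. The input is assumed to be a dictionary shaped like a salt-key listing (lists named minions, minions_pre, minions_rejected, minions_denied); the function name and the label values other than "accepted" are assumptions, not something proposed in this thread.

```python
# Assumed mapping from key-listing fields to label values; only "accepted"
# appears in the proposal above, the other label names are illustrative.
KEY_STATES = {
    "minions": "accepted",
    "minions_pre": "pending",
    "minions_rejected": "rejected",
    "minions_denied": "denied",
}


def render_key_metrics(key_listing):
    """Render salt_master_keys with one sample per key state."""
    lines = [
        "# HELP salt_master_keys Number of minion keys known to the master, by state",
        "# TYPE salt_master_keys gauge",
    ]
    for field, state in KEY_STATES.items():
        count = len(key_listing.get(field, []))
        lines.append(f'salt_master_keys{{key_state="{state}"}} {count}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    listing = {"minions": ["web1", "web2", "db1"], "minions_pre": ["new1"]}
    print(render_key_metrics(listing), end="")
```

Using a single metric name with a key_state label, rather than one metric per state, keeps it easy to sum or filter across states in queries.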

justindesilets on 20 Aug 2020

👍6

All 12 comments

There is already an exporter out there https://github.com/BonnierNews/saltstack_exporter

But I don't see a reason not to add an engine that can do this inside of Salt.

Marked as a feature request.

gtmanfred on 9 Oct 2018

👍 I would really appreciate internal metrics.

@gtmanfred Yes, there is an exporter, but it needs to run regular dry Highstates to get the data you want. The default is every 5 minutes, which also makes sense if you want to see the changes that would happen on your next Highstate before you run it yourself.
But in large environments, running dry Highstates every 5 minutes will cause many side effects (blocked minions, high load on the master, ...) which will affect your system negatively.

Real internal metrics which are being collected continuously are a whole different and much more reliable story.
Additionally I would like to have the internal metrics not only for the minion but also for the master service to be able to get an idea how many minions are connected, how many states are being run over time and what the success ratio is.
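
One way to picture continuous collection, as opposed to periodic dry runs, is a set of counters that is updated whenever a state run reports back. The sketch below assumes each job return is a dictionary of state entries carrying a boolean "result" field; the class and method names are hypothetical and do not exist in Salt.

```python
class StateRunCounters:
    """Accumulate state results as they arrive, without triggering extra runs."""

    def __init__(self):
        self.succeeded = 0
        self.failed = 0

    def record(self, state_return):
        """Count every state entry in one return by its 'result' field.

        Entries whose result is neither True nor False (for example None)
        are ignored in this sketch.
        """
        for state in state_return.values():
            result = state.get("result")
            if result is True:
                self.succeeded += 1
            elif result is False:
                self.failed += 1

    def success_ratio(self):
        """Return succeeded / (succeeded + failed), or None if nothing was counted."""
        total = self.succeeded + self.failed
        return self.succeeded / total if total else None


if __name__ == "__main__":
    counters = StateRunCounters()
    counters.record({"pkg_nginx": {"result": True}, "svc_nginx": {"result": False}})
    print(counters.succeeded, counters.failed, counters.success_ratio())
```

Because the counters only grow, they could be exposed as Prometheus counters and the success ratio computed at query time instead of inside Salt.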

Does it make sense to start a list of the metrics we want from such an internal metrics endpoint before somebody starts implementing it?

Armadill0 on 19 Feb 2019

This sounds like a job for a custom engine. It would be super awesome if it were contributed back to Salt, but it is probably something that the community will need to do.
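
For anyone wondering what such an engine could look like, here is a minimal sketch using only the Python standard library. It assumes the usual engine convention of a module exposing a start() function that blocks. The helper names, the default address and port, and the placeholder collector are all hypothetical, and a real engine would gather its values from Salt rather than return a constant.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def collect_metrics():
    """Placeholder collector (hypothetical); returns a fixed sample."""
    return (
        "# TYPE salt_minion_last_run_state_success gauge\n"
        "salt_minion_last_run_state_success 0\n"
    )


def make_server(address, port, collect=collect_metrics):
    """Build an HTTP server that answers GET /metrics with the collector output."""

    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = collect().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Suppress per-request logging to stderr in this sketch.
            pass

    return HTTPServer((address, port), MetricsHandler)


def start(address="127.0.0.1", port=9000):
    """Engine entry point: serve metrics until the process is stopped."""
    make_server(address, port).serve_forever()
```

Keeping the server construction in make_server() separate from start() makes the handler testable without running the blocking loop.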

gtmanfred on 19 Feb 2019

@gtmanfred Does this exist in Saltstack Enterprise? It sounds like it must...

anitakrueger on 23 Sep 2019

👍1

I don't really know anything about enterprise. I also no longer work for salt.

gtmanfred on 23 Sep 2019

Oh sorry @gtmanfred didn't realize. Hope you are well though!

anitakrueger on 24 Sep 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale[bot] on 7 Jan 2020

not stale

Armadill0 on 8 Jan 2020

Thank you for updating this issue. It is no longer marked as stale.

stale[bot] on 8 Jan 2020

Any progress?

b-a-t on 26 May 2020

This is very much needed for our use case. We have a few multi-master SaltStack deployments, each with a few hundred minions connected through automation. Monitoring based on Prometheus, an exporter, and Grafana would help a lot.

kphatak on 28 May 2020

👍3
