Same metrics as:
# cat /proc/loadavg
0.74 0.50 0.51 2/1215 17143
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 6.86
# HELP node_load15 15m load average.
# TYPE node_load15 gauge
node_load15 6.53
# HELP node_load5 5m load average.
# TYPE node_load5 gauge
node_load5 7.04
# HELP node_procs_running Number of processes in runnable state.
# TYPE node_procs_running gauge
node_procs_running 14
The total number of running processes.
It would be nice to see the total number of processes so that it would be easier to detect a sudden jump in running processes.
The counter "node_forks" doesn't give a clear indication of such a jump on systems whose applications spawn and kill a lot of child processes.
We currently expose procs_running and procs_blocked from the contents of /proc/stat, but that file doesn't contain a count of all processes.
It should be simple enough to extract the process count number from /proc/loadavg.
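To make that concrete, here is a minimal sketch of pulling the count out of a /proc/loadavg line. It is only an illustration in Python, not code from node_exporter; the names `LoadAvg` and `parse_loadavg` are made up for this example.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    load1: float
    load5: float
    load15: float
    runnable: int   # number before the slash in the fourth field
    total: int      # number after the slash in the fourth field
    last_pid: int   # PID of the most recently created process


def parse_loadavg(line: str) -> LoadAvg:
    """Parse one line in /proc/loadavg format (hypothetical helper)."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    runnable, total = fields[3].split("/")
    return LoadAvg(
        float(fields[0]), float(fields[1]), float(fields[2]),
        int(runnable), int(total), int(fields[4]),
    )


# The sample line from the top of this issue.
sample = "0.74 0.50 0.51 2/1215 17143"
print(parse_loadavg(sample).total)  # prints 1215
```

On a live system the input would be the contents of /proc/loadavg instead of the hard-coded sample.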
Any update on this?
So, when thinking about implementing this, the following questions come to mind:
procs_total would sound appropriate. However, I think this is confusing: in reality, the value after the slash is the number of kernel scheduling entities that currently exist on the system (cf. man 5 proc). In other words, it counts processes and threads. It will not match ps aux | wc -l, but rather ps -eLF | wc -l.
```
$ curl -s localhost:9100/metrics | grep '^node_processes_threads ' && cat /proc/loadavg && ps -eFL | wc -l
node_processes_threads 459
0.21 0.42 0.42 1/458 6143
460
$ curl -s localhost:9100/metrics | grep '^node_processes_pids ' && ps aux | wc -l
node_processes_pids 159
160
```
(Yes, in both examples the numbers are off by one -- probably due to curl and/or wc running in parallel)
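For anyone who wants to see the process/thread distinction without ps, here is a rough sketch that counts both by walking a procfs tree. It assumes the usual Linux layout (one numeric directory per process, with one numeric entry per thread under its task/ subdirectory). It is not the collector from #950, and `count_pids_and_threads` is a name invented for this example.

```python
import os
from typing import Tuple


def count_pids_and_threads(proc_root: str = "/proc") -> Tuple[int, int]:
    """Return (processes, threads) found under proc_root (hypothetical helper)."""
    pids = 0
    threads = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "sys" or "loadavg"
        task_dir = os.path.join(proc_root, entry, "task")
        try:
            tids = [t for t in os.listdir(task_dir) if t.isdigit()]
        except OSError:
            # The process went away (or has no task/ directory); skip it.
            continue
        pids += 1
        threads += len(tids)
    return pids, threads


if __name__ == "__main__" and os.path.isdir("/proc"):
    processes, thread_total = count_pids_and_threads()
    print(f"processes={processes} threads={thread_total}")
```

The thread total should land close to the number after the slash in /proc/loadavg, and the process total close to the ps aux line count, though both change while the walk is in progress.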
@Lemmy: So, maybe #950 already solves all your needs (once merged)?
Ah, I didn't see #950. That does indeed look like what I'm looking for, and I think this could be closed in favour of #950. Thanks!
Yes, I think this is all included in #950. If we missed something, we can re-open it!