Describe the enhancement:
It is not uncommon to run into applications that expose unfiltered Go runtime metrics at some endpoint. While not every application's runtime metrics can be curated the same way, we can come up with a sensible default as a starting point for most applications.
As a proposal:
As of now, some of these metrics are being exposed and consumed by metricbeat, at least in the etcd and coredns modules.
Thank you for opening this, @odacremolbap. I guess you are referring to modules based on our Prometheus helper?
Could you please link to examples on how etcd and coredns are exposing these?
Yes, but not only the Prometheus helper.
Metricbeat's golang module uses expvar, gathering JSON-formatted metrics.
Here are the links to the modules:
expvar:
https://github.com/elastic/beats/blob/master/metricbeat/module/golang/heap/data.go#L33-L110
etcd:
https://github.com/elastic/beats/blob/master/metricbeat/module/etcd/metrics/metrics.go#L45
coredns:
Well, I was inaccurate: we are not using these metrics right now, but this issue was opened because we need to add memory- and process-related metrics for coredns, and this is
To get an idea of what applications might be exposing, check the links in the issue above.
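As a minimal sketch of what those modules consume: importing expvar registers a /debug/vars handler that serves the runtime's memstats as JSON. The helper below exercises that handler in-process via httptest instead of a real listener, which is an arbitrary choice for this sketch.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http/httptest"
)

// fetchVars runs a request through expvar's handler and decodes the JSON
// body — the same document metricbeat's golang module consumes.
func fetchVars() map[string]json.RawMessage {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/debug/vars", nil)
	expvar.Handler().ServeHTTP(rec, req)
	var vars map[string]json.RawMessage
	if err := json.Unmarshal(rec.Body.Bytes(), &vars); err != nil {
		panic(err)
	}
	return vars
}

func main() {
	vars := fetchVars()
	// "memstats" carries runtime.MemStats fields: HeapAlloc, NextGC, NumGC, ...
	fmt.Println("memstats present:", len(vars["memstats"]) > 0)
}
```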
This is my take on this:
Insights
NumGoroutine
should help measure the internal load of applications that spin up new goroutines when processing data
HeapAlloc
Bytes of allocated heap objects, including unreachable ones that haven't been GCed yet.
HeapObjects
Not sure on this one, I think HeapAlloc gives us the info we need.
NextGC
Gives us an idea of the heap size the runtime is aiming for before launching the next GC cycle
NumCgoCalls
This is probably very specific.
GC debug metrics (https://golang.org/pkg/runtime/debug/#ReadGCStats)
These are interesting, though HeapAlloc + NextGC probably cover the information most applications will need by default.
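For reference, all of the candidates above can be read straight from the runtime; a minimal sketch, roughly what the golang module ends up reporting via expvar:

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// snapshot gathers the candidate default metrics discussed above.
func snapshot() map[string]int64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m) // stops the world briefly
	var gc debug.GCStats
	debug.ReadGCStats(&gc)
	return map[string]int64{
		"goroutines":   int64(runtime.NumGoroutine()), // internal load
		"heap_alloc":   int64(m.HeapAlloc),            // allocated heap bytes, incl. not-yet-collected
		"heap_objects": int64(m.HeapObjects),          // allocated heap object count
		"next_gc":      int64(m.NextGC),               // heap size target for the next GC cycle
		"cgo_calls":    runtime.NumCgoCall(),
		"gc_runs":      gc.NumGC,                      // from runtime/debug GC stats
	}
}

func main() {
	for k, v := range snapshot() {
		fmt.Printf("%s=%d\n", k, v)
	}
}
```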
Proposal
I wouldn't go _very minimal_: I'd include metrics that are not application-specific by default, let's say
Edit:
I think using Alloc is closer to what users might expect, so I'm choosing it instead of HeapAlloc
Sounds reasonable to me :+1:
It would be interesting to standardize the way we get these. Would all of them go into the same field names?
+1 on standardising these. I'm wondering about the naming scheme we should standardise these into. Perhaps under the go prefix? If yes, it makes me wonder if we should do something similar for other languages like Java with JVM metrics. Or should HeapAlloc for Java and Go be in the same field?
There are nuances in how each virtual machine (Go, JVM) measures each of these components. I read that Go will be adding new functions under the runtime package because of some _inaccuracies_.
This means that we can start abstracting as much as possible, but there is a decent chance that at some point users will request more specific metrics that are a lot more tied to the runtime used.
To categorize/standardize, we need to check at least JMX and WMI.
Then come up with the abstract categories (memory, threading, GC, ...(?) ) and probably make room for specific ones. It could be something like
memory.heap_alloc
memory.heap.objects_count
memory.gc.syncs_total
threading.thread_count
threading.lock_count
...
threading.go.routines
(I'm not proposing the structure above; that's a rough draft.)
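Flattening runtime metrics into a scheme like the draft above could be sketched as follows; the field names are the illustrative draft from this thread, not a settled schema:

```go
package main

import (
	"fmt"
	"runtime"
)

// collect maps a few Go runtime readings onto the draft field names
// (memory.*, threading.*) sketched in the comment above.
func collect() map[string]uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return map[string]uint64{
		"memory.heap_alloc":         m.HeapAlloc,
		"memory.heap.objects_count": m.HeapObjects,
		"threading.go.routines":     uint64(runtime.NumGoroutine()),
	}
}

func main() {
	for k, v := range collect() {
		fmt.Printf("%s=%d\n", k, v)
	}
}
```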
wondering ... is APM by any chance capturing some of that data?
sounds like we need to scrape what they excel at ... @roncohen
To kick this off and keep it moving, we could first prefix all the metrics with jmx.*, wmi.*, and in this case go.*. As soon as we have these, we can standardise and potentially even make them part of ECS (@webmat).
The alternative is that we already abstract out the most obvious ones now, but keep the more specific ones under a prefix like go.*. This would be similar to what we do in log modules today with the fields that don't fit into ECS.
If we generalise, I think we should put all of them under a common prefix like runtime.* or virtualmachine.* (other suggestions?).
I like your second suggestion a lot more, if both are weighted the same.
Your second suggestion would group all these runtime metrics under a common prefix, which IMO would make it easier for users to notice. I much prefer runtime over virtualmachine, since VM would be inaccurate for non-GC, non-framework languages.
This (which, if I understood correctly, is what you are proposing) looks nicely grouped:
runtime.memory.*
runtime.process.*
runtime.jmx.*
runtime.wmi.*
runtime.go.*
I'll change the go _namespacing_ here; I like the runtime.go proposal better than the nesting under runtime.process I was doing
Yeah I like the idea of nesting under runtime.*
Closing this one (very specific to Go) in favor of #11836
https://github.com/elastic/beats/blob/36fd113203691a2ac2640e87cf2ecacd24308bc9/metricbeat/module/coredns/stats/stats.go#L45-L47