Describe the enhancement:
It is not uncommon to run into applications that expose unfiltered Go runtime metrics at some endpoint. While not every application's runtime metrics can be curated the same way, we can come up with a sensible default as a starting point for most applications.
As a proposal:
As of now, some of these metrics are being exposed and consumed by metricbeat, at least in the etcd and coredns modules.
Thank you for opening this, @odacremolbap. I guess you are referring to modules based on our Prometheus helper?
Could you please link to examples on how etcd and coredns are exposing these?
Yes, but not only the Prometheus helper.
Metricbeat's golang module uses expvar, gathering JSON-formatted metrics.
Here are the links to the modules:
expvar:
https://github.com/elastic/beats/blob/master/metricbeat/module/golang/heap/data.go#L33-L110
etcd:
https://github.com/elastic/beats/blob/master/metricbeat/module/etcd/metrics/metrics.go#L45
coredns:
Well, I was inaccurate: we are not using these metrics right now, but this issue was opened because we need to add memory- and process-related metrics for coredns, and this is
To get an idea of what applications might be exposing, check the links in the issue above.
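As a minimal sketch of what those modules consume: importing expvar registers a /debug/vars handler that serves the runtime's memstats as JSON. The helper below exercises that handler in-process via httptest instead of a real listener, which is an arbitrary choice for this sketch.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http/httptest"
)

// fetchVars runs a request through expvar's handler and decodes the JSON
// body — the same document metricbeat's golang module consumes.
func fetchVars() map[string]json.RawMessage {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/debug/vars", nil)
	expvar.Handler().ServeHTTP(rec, req)
	var vars map[string]json.RawMessage
	if err := json.Unmarshal(rec.Body.Bytes(), &vars); err != nil {
		panic(err)
	}
	return vars
}

func main() {
	vars := fetchVars()
	// "memstats" carries runtime.MemStats fields: HeapAlloc, NextGC, NumGC, ...
	fmt.Println("memstats present:", len(vars["memstats"]) > 0)
}
```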
This is my take on this:
Insights
NumGoroutine
should help measure the internal load of applications that spin up new goroutines when processing data
HeapAlloc
Bytes of allocated heap objects, including unreachable ones that haven't been GCed yet.
HeapObjects
Not sure on this one, I think HeapAlloc gives us the info we need.
NextGC
Gives us an idea of the heap size the runtime is aiming for before launching the next GC cycle
NumCgoCalls
This is probably very specific.
GC debug metrics (https://golang.org/pkg/runtime/debug/#ReadGCStats)
These are interesting, though HeapAlloc + NextGC probably cover the information most applications will need by default.
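For reference, all of the candidates above can be read straight from the runtime; a minimal sketch, roughly what the golang module ends up reporting via expvar:

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// snapshot gathers the candidate default metrics discussed above.
func snapshot() map[string]int64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m) // stops the world briefly
	var gc debug.GCStats
	debug.ReadGCStats(&gc)
	return map[string]int64{
		"goroutines":   int64(runtime.NumGoroutine()), // internal load
		"heap_alloc":   int64(m.HeapAlloc),            // allocated heap bytes, incl. not-yet-collected
		"heap_objects": int64(m.HeapObjects),          // allocated heap object count
		"next_gc":      int64(m.NextGC),               // heap size target for the next GC cycle
		"cgo_calls":    runtime.NumCgoCall(),
		"gc_runs":      gc.NumGC,                      // from runtime/debug GC stats
	}
}

func main() {
	for k, v := range snapshot() {
		fmt.Printf("%s=%d\n", k, v)
	}
}
```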
Proposal
I wouldn't go _very minimal_: I'd include metrics that are not application-specific by default, let's say
Edit:
I think using Alloc is closer to what users might expect, so I'm choosing it instead of HeapAlloc
Sounds reasonable to me :+1:
It would be interesting to standardize the way we get these. Would all of them go into the same field names?
+1 on standardising these. I'm wondering about the naming scheme we should standardise these into. Perhaps under the go prefix? If yes, it makes me wonder if we should do something similar for other languages like Java with JVM metrics. Or should HeapAlloc for Java and Go be in the same field?
There are nuances in how each virtual machine (Go, JVM) measures each of these components. I read that Go will be adding new functions under the runtime package because of some _inaccuracies_.
This means that we can start abstracting as much as possible, but there is a decent chance that at some point users will request more specific metrics that are a lot more tied to the runtime used.
To categorize/standardize, we need to check at least JMX and WMI.
Then come up with the abstract categories (memory, threading, GC, ...(?) ) and probably make room for specific ones. It could be something like
memory.heap_alloc
memory.heap.objects_count
memory.gc.syncs_total
threading.thread_count
threading.lock_count
...
threading.go.routines
(I'm not proposing the structure above; that's a rough draft.)
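Flattening runtime metrics into a scheme like the draft above could be sketched as follows; the field names are the illustrative draft from this thread, not a settled schema:

```go
package main

import (
	"fmt"
	"runtime"
)

// collect maps a few Go runtime readings onto the draft field names
// (memory.*, threading.*) sketched in the comment above.
func collect() map[string]uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return map[string]uint64{
		"memory.heap_alloc":         m.HeapAlloc,
		"memory.heap.objects_count": m.HeapObjects,
		"threading.go.routines":     uint64(runtime.NumGoroutine()),
	}
}

func main() {
	for k, v := range collect() {
		fmt.Printf("%s=%d\n", k, v)
	}
}
```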
wondering ... is APM by any chance capturing some of that data?
sounds like we need to scrape what they excel at ... @roncohen
To kick this off and keep it moving, we could first prefix all the metrics with jmx.*, wmi.*, and in this case go.*. As soon as we have these, we can standardise and potentially even make them part of ECS (@webmat).
The alternative is that we already abstract out the most obvious ones now, but keep the more specific ones under a prefix like go.*. This would be similar to what we do in log modules today with the fields that don't fit into ECS.
If we generalise, I think we should put all of them under a common prefix like runtime.* or virtualmachine.* (other suggestions?).
I like your second suggestion a lot more, if both are weighted the same.
Your second suggestion would group all these runtime metrics under a common prefix, which IMO would make it easier for users to notice. I much prefer runtime over virtualmachine, since VM would be inaccurate for non-GC, non-framework languages.
This (which, if I understood correctly, is what you are proposing) looks nicely grouped:
runtime.memory.*
runtime.process.*
runtime.jmx.*
runtime.wmi.*
runtime.go.*
I'll change the go _namespacing_ here; I like the runtime.go proposal better than the nesting under runtime.process I was doing
Yeah I like the idea of nesting under runtime.*
Closing this one (very specific to Go) in favor of #11836
https://github.com/elastic/beats/blob/36fd113203691a2ac2640e87cf2ecacd24308bc9/metricbeat/module/coredns/stats/stats.go#L45-L47