Telegraf: inputs.jolokia2_agent.metric not sending metrics of paths containing spaces

Created on 14 Jun 2018  ·  10 Comments  ·  Source: influxdata/telegraf

Hello,

Telegraf v1.7.0 (git: release-1.7 f4d22dd4)
CentOS release 6.6
Kernel 2.6.32-431.el6.x86_64

I've got the following input setup:

[[inputs.jolokia2_agent.metric]]
name = "hbasemaster_Hadoop_JvmMetrics"
mbean = "Hadoop:name=JvmMetrics,service=HBase"
paths = ["GcCountG1 Old Generation", "GcCountG1 Young Generation", "GcTimeMillisG1 Old Generation", "GcTimeMillisG1 Young Generation", "GcCount", "GcTimeMillis", "LogError", "LogFatal", "LogInfo", "LogWarn", "MemHeapCommittedM", "MemHeapMaxM", "MemHeapUsedM", "MemMaxM", "MemNonHeapCommittedM", "MemNonHeapMaxM", "MemNonHeapUsedM"]

With these four paths containing spaces in the config, I see no metrics being sent from telegraf. I'm using outputs.influxdb with a UDP URL, for example:

[[outputs.influxdb]]
urls = ["udp://myserver:9096"]
timeout = "5s"

I'm using tcpdump to capture the traffic on port 9096, and I see the metrics leave and arrive at the destination only if I remove these four (4) paths with spaces.

Now, there is more to this issue that I can't exactly understand, but for some reason, on another server with the same exact setup, I was able to see this metric using these paths with spaces. So I can't really understand why it would work on one server but not another. They have the exact same telegraf binary.

I tried to run telegraf in debug, but debug is really not informative. It simply adds output of what batch it wrote and the buffer fullness. Nothing about why it isn't sending the metrics I requested. Nothing about syntax or anything else that might be causing it not to do what I wanted.

jolokia_list.txt

I've attached jolokia_list.txt so you can see these paths in the mbeans for yourselves.

All 10 comments

OK, so after a day of investigating, and just minutes after finally deciding to post this issue, I figured out why these paths worked on some servers and not others. It seems that one of our clusters has been set up to use the G1 GC via the -XX:+UseG1GC flag. That's why this cluster exposed these metric paths and why telegraf was able to send their metrics.

As for the last paragraph of my OP, I still don't understand why running telegraf in debug mode didn't reveal that I was requesting paths that simply don't exist. This would have given me ample reason to re-evaluate my configuration.

@dylanmei Do you think we should log any specified paths that are not found when in debug mode, or do you think this is a common issue that should only be implied by the lack of a metric being produced?

This is very common. In many cases, not just here where the OP has servers with drifting configurations, metrics don't exist until a certain condition is met. For example, there are no Kafka topic metrics until a topic is created. Jolokia won't tell you this, but it can be inferred. It is a nice idea to give more feedback in debug mode.

Although, if it is very common then maybe we don't want to spend the resources to log it?

If the debug logger is using up significant cycles even at less verbose log levels, then I wouldn't suggest doing this.

The main cost would be any string formatting, though this can be reduced by checking the log level first. This is totally arbitrary, but I wouldn't think twice about logging several times per gather; anything more than that should be guarded with if wlog.LogLevel() == wlog.DEBUG
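
The guard described above can be illustrated outside of Telegraf. The following is only a Python analogue of the idea, not Telegraf's Go code and not the wlog API: the logger name and the log_missing_paths helper are hypothetical, and the standard library's isEnabledFor stands in for the log-level check.

```python
import logging

logger = logging.getLogger("jolokia_sketch")  # hypothetical logger name


def log_missing_paths(mbean, requested, returned):
    """Log requested paths absent from the response; return them as a list."""
    missing = [p for p in requested if p not in returned]
    # Check the level first so the joined message is only built
    # when debug output is actually enabled.
    if missing and logger.isEnabledFor(logging.DEBUG):
        logger.debug("mbean %s: paths not found: %s", mbean, ", ".join(missing))
    return missing
```

With the level check in place, the cost when debug logging is off is one list comprehension per mbean, with no string formatting.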

I do appreciate you taking the time to discuss this issue.

I would also like to point out that not all users have such extensive experience: what is a "common" issue for some can be a time-consuming learning curve for others. If it is decided not to add this to the debug output, this common behavior should at least be added as a note or comment in the documentation, IMHO.

The best way to troubleshoot / answer the question, “why don’t I see my JMX metrics?” is by factoring Telegraf out and invoking the Jolokia HTTP endpoint directly.
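
As a rough sketch of what invoking the endpoint directly could look like for an attribute whose name contains spaces, the Python below builds a Jolokia GET read URL and fetches it. The base URL (localhost, port 8086, /jolokia) is taken from the curl examples in this thread and may differ on your agent; read_url and read_attribute are hypothetical helper names, and the sketch assumes the names contain no "/" or "!" characters, which need Jolokia's own escaping.

```python
import json
import urllib.parse
import urllib.request


def read_url(base, mbean, attribute):
    """Build a Jolokia GET read URL, percent-encoding spaces in the names."""
    # Assumption: names contain no "/" or "!", which need Jolokia's own escaping.
    parts = [urllib.parse.quote(p, safe=":=,") for p in (mbean, attribute)]
    return base.rstrip("/") + "/read/" + "/".join(parts)


def read_attribute(base, mbean, attribute, timeout=5.0):
    """Fetch one attribute and return the decoded JSON response."""
    url = read_url(base, mbean, attribute)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

For example, read_attribute("http://localhost:8086/jolokia", "Hadoop:name=JvmMetrics,service=HBase", "GcCountG1 Old Generation") queries the agent without Telegraf in the path. If the attribute does not exist on that JVM, the response should show an error rather than a value.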

I believe covering this as an exercise in the documentation could help demystify the plugin and prove useful in situations like these.

I like the idea of adding a troubleshooting section to the documentation.

One of the easiest ways to get every mbean exposed by JMX, and therefore by jolokia, is to make a curl call to the jolokia list endpoint.

Here is an example, assuming jolokia is exposed on localhost on port 8086:

curl http://localhost:8086/jolokia/list

You can also pipe the result through jq . to get more readable output:

curl http://localhost:8086/jolokia/list | jq .

This prints every mbean and its attributes as indented JSON.
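
Once the list output is saved, checking configured paths against it can be automated. The sketch below is a minimal, hypothetical helper (missing_paths is not part of Telegraf or Jolokia). It assumes the usual list layout of domain, then key properties, then an "attr" map, and it only handles top-level attribute names, not nested paths such as "HeapMemoryUsage/used".

```python
def _key(props):
    # Compare key properties order-independently ("a=1,b=2" vs "b=2,a=1").
    return frozenset(props.split(","))


def missing_paths(list_response, mbean, paths):
    """Return the requested paths that the list output does not expose."""
    domain, _, props = mbean.partition(":")
    beans = list_response.get("value", {}).get(domain, {})
    attrs = {}
    for name, info in beans.items():
        if _key(name) == _key(props):
            attrs = info.get("attr", {})
            break
    # If the mbean is absent, attrs stays empty and every path is reported.
    return [p for p in paths if p not in attrs]
```

The list_response argument is the dictionary obtained by loading the saved curl output with json.load. On a JVM that is not running G1, a call with the paths from the original post would be expected to return the four "G1" entries.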

I can write short examples and improve documentation.
