Describe the bug
The pod keeps restarting over and over because its liveness/readiness probes report it as unhealthy.
Version of Helm and Kubernetes:
Helm version: v3.0.2
Kubernetes: 1.16.0
FYI, Kubernetes is deployed in a bare-metal environment.
Which chart:
stable/telegraf
What happened:
The pod restarted once "Readiness probe failed: HTTP probe failed with statuscode: 503" and "Liveness probe failed: HTTP probe failed with statuscode: 503" events were reported.
What you expected to happen:
Pod to keep running.
How to reproduce it (as minimally and precisely as possible):
Deploy the chart via the standard command with a modified values.yaml and wait a few minutes for the probes to fail.
Never mind, it turns out the outputs.health "buffer_size" field doesn't get recognized for some reason. I did verify that the field exists in InfluxDB, though. Anyway, I switched the health output to monitor my own custom metrics and the health check is no longer failing /shrugs
This issue comes from the fact that the buffer_size metric is produced by the internal input plugin, which isn't enabled.
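For illustration, here is a minimal sketch of the kind of rendered telegraf.conf this describes, with the internal input left out; the listen address and threshold are assumptions for the example, not the chart's actual values:

```toml
# Health output serving the HTTP endpoint the liveness/readiness probes hit.
[[outputs.health]]
  ## Illustrative listen address, not necessarily the chart's default.
  service_address = "http://:8888"

  ## Only evaluate the agent's own write metrics.
  namepass = ["internal_write"]

  ## Report healthy only while the write buffer stays below the threshold.
  [[outputs.health.compares]]
    field = "buffer_size"
    lt = 5000.0

# The internal_write measurement (and its buffer_size field) is only produced
# when the internal input plugin is enabled; with the block below left out,
# the probes fail with 503 as described above.
# [[inputs.internal]]
#   collect_memstats = true
```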
Perhaps it would be good to amend #19607 to either enable the internal input or to switch the monitored metric to one that is available without it.
What do you think @WyriHaximus?
@fmauNeko in retrospect I should have added the internal input with that PR. I'll create a PR later this week amending #19607 with that. Another option would be to comment out the internal input and health output with a remark that they can be used for pod health status probes.
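As a sketch of the amendment being discussed, enabling the internal input alongside the existing health output would look roughly like this (whether it ships enabled or commented out with a remark is the open question; the option shown is Telegraf's, not a chart default):

```toml
# Collect Telegraf's own agent metrics; this produces the internal_write
# measurement and its buffer_size field that the health output's
# compares check looks for.
[[inputs.internal]]
  ## Also collect Go runtime memory stats (optional).
  collect_memstats = true
```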