Describe the bug
The pod keeps restarting over and over because its liveness/readiness probes report it as unhealthy.
Version of Helm and Kubernetes:
Helm version: v3.0.2
Kubernetes: 1.16.0
FYI, Kubernetes is deployed in a bare-metal environment.
Which chart:
stable/telegraf
What happened:
The pod restarted once "Readiness probe failed: HTTP probe failed with statuscode: 503" and "Liveness probe failed: HTTP probe failed with statuscode: 503" events were reported.
What you expected to happen:
Pod to keep running.
How to reproduce it (as minimally and precisely as possible):
Deploy the chart via the standard command with a modified values.yaml and wait a few minutes for the probes to fail.
Never mind, it turns out the outputs.health "buffer_size" field doesn't get recognized for some reason. I did verify that the field exists in InfluxDB, though. Anyway, I switched the health output to monitor my own custom metrics and the health check is no longer failing /shrugs
This issue comes from the fact that the buffer_size metric is produced by the internal input plugin, which isn't enabled.
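For illustration, here is a minimal sketch of the kind of rendered telegraf.conf this describes, with the internal input left out; the listen address and threshold are assumptions for the example, not the chart's actual values:

```toml
# Health output serving the HTTP endpoint the liveness/readiness probes hit.
[[outputs.health]]
  ## Illustrative listen address, not necessarily the chart's default.
  service_address = "http://:8888"

  ## Only evaluate the agent's own write metrics.
  namepass = ["internal_write"]

  ## Report healthy only while the write buffer stays below the threshold.
  [[outputs.health.compares]]
    field = "buffer_size"
    lt = 5000.0

# The internal_write measurement (and its buffer_size field) is only produced
# when the internal input plugin is enabled; with the block below left out,
# the probes fail with 503 as described above.
# [[inputs.internal]]
#   collect_memstats = true
```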
Perhaps it would be good to amend #19607 to either enable the internal input or to switch the monitored metric to one that is available without it.
What do you think @WyriHaximus?
@fmauNeko in retrospect I should have added the internal input with that PR. I'll create a PR later this week amending #19607 with that. Another option would be to comment out the internal input and health output with a remark that they can be used for pod health status probes.
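As a sketch of the amendment being discussed, enabling the internal input alongside the existing health output would look roughly like this (whether it ships enabled or commented out with a remark is the open question; the option shown is Telegraf's, not a chart default):

```toml
# Collect Telegraf's own agent metrics; this produces the internal_write
# measurement and its buffer_size field that the health output's
# compares check looks for.
[[inputs.internal]]
  ## Also collect Go runtime memory stats (optional).
  collect_memstats = true
```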