The number of restarts is attached as a label to all metrics for a container. Because the identity of a time series is defined by its full set of labels, every restart breaks all time series for that container.
The restart count is a time series itself and should only be exported as such.
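To illustrate the identity problem, here is a minimal Go sketch. It assumes that a series is identified by its metric name plus the sorted label set. `seriesID` is a hypothetical helper, not Prometheus or cAdvisor code, and the label name `container_label_restartcount` is used only as an example.

```go
package main

import (
	"sort"
	"strings"
)

// seriesID is a hypothetical helper that builds a canonical identity string
// from a metric name and its full label set. Two samples belong to the same
// time series only if this string is identical for both.
func seriesID(name string, labels map[string]string) string {
	// Sort the label names so the result does not depend on map order.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(name)
	b.WriteByte('{')
	for i, k := range keys {
		if i > 0 {
			b.WriteByte(',')
		}
		b.WriteString(k)
		b.WriteString(`="`)
		b.WriteString(labels[k])
		b.WriteByte('"')
	}
	b.WriteByte('}')
	return b.String()
}
```

Calling `seriesID` for the same container with `container_label_restartcount` set to "1" and then to "2" yields two different strings, so every metric of that container starts a new series after each restart.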
+1
On Thu, May 26, 2016 at 9:05 AM, Fabian Reinartz wrote:
The number of restarts is attached as a label to all metrics for a container. This causes all time series for a container to break as it restarts as identity is defined by the full set of labels.
The restart count is a time series itself and should only be exported as such.
https://github.com/google/cadvisor/issues/1312
These are set by kubernetes: https://github.com/kubernetes/kubernetes/blob/b0ea89c2f6b954a27ba0f545da955db4f25009d0/pkg/kubelet/dockertools/labels.go#L37-L50
Any suggestions on how to solve this? None of these labels make sense to export as metric labels. Filtering them in cAdvisor might be a bit awkward, as it adds references back to Kubernetes. Changing this in Kubernetes might require quite a few changes, though.
I'm starting to understand what's going on here. Kubernetes attaches all these Docker labels as a fallback so it can retrieve them again later. I guess the easiest fix would be to filter out all labels that start with the io.kubernetes. prefix. What do you think @timstclair @vishh?
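A rough Go sketch of the prefix filter suggested here. `filterLabels` is a hypothetical helper and not part of cAdvisor; it assumes the container labels are available as a plain string map.

```go
package main

import "strings"

// filterLabels is a hypothetical helper that returns a copy of labels
// without any key that starts with prefix. The input map is left unchanged.
func filterLabels(labels map[string]string, prefix string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		if strings.HasPrefix(k, prefix) {
			continue // drop labels such as io.kubernetes.container.restartCount
		}
		out[k] = v
	}
	return out
}
```

Calling `filterLabels(labels, "io.kubernetes.")` would keep user-defined labels and drop the ones Kubernetes attaches for its own bookkeeping.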
By the way, I believe cAdvisor shouldn't try to parse these labels and create a restart counter from them. That would be the job of Kubernetes, especially since Docker containers don't have the concept of restarts.
Yes, the labels should all be treated as opaque by cAdvisor.
Actually, this isn't only a Kubernetes label; Docker exposes it as well (when using the --restart=always directive):
docker inspect nginx1 |grep -i restartcount
"RestartCount": 2,
This is plain Docker 17.03, with no Kubernetes. I'm going to check what it would take to change the cAdvisor code so that it also exports a metric containing the RestartCount of the containers.
kubectl describe po nginx1 | grep "Restart Count"
Restart Count: 0
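As a sketch of what exporting the value as a metric could look like outside cAdvisor, the Go snippet below parses the JSON printed by docker inspect and renders one text-format sample per container. `inspectEntry` and `restartCountMetrics` are hypothetical names, and the metric name container_restart_count is an assumption, not an existing cAdvisor metric.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// inspectEntry holds the two fields of a `docker inspect` result that this
// sketch needs. All other fields are ignored by the decoder.
type inspectEntry struct {
	Name         string `json:"Name"`
	RestartCount int    `json:"RestartCount"`
}

// restartCountMetrics is a hypothetical helper that turns `docker inspect`
// JSON (an array of container objects) into sample lines of the form
// container_restart_count{name="..."} N.
func restartCountMetrics(inspectJSON []byte) ([]string, error) {
	var entries []inspectEntry
	if err := json.Unmarshal(inspectJSON, &entries); err != nil {
		return nil, fmt.Errorf("decoding docker inspect output: %w", err)
	}
	lines := make([]string, 0, len(entries))
	for _, e := range entries {
		// Docker reports container names with a leading slash.
		name := strings.TrimPrefix(e.Name, "/")
		// %q uses Go string quoting, which is enough for simple names.
		lines = append(lines, fmt.Sprintf("container_restart_count{name=%q} %d", name, e.RestartCount))
	}
	return lines, nil
}
```

Because the restart count is the sample value here rather than a label, the series identity stays stable across restarts.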
This is still not fixed for users who are using cAdvisor directly. RestartCount is a label, which breaks everything and makes it practically impossible to write alerts based on it (e.g. alert if a restart happened more than X times in the last Y minutes).
@fabxc @dashpole @vishh @timstclair
Any news here? It would be useful to have a container_restart_count metric to set up alerting when using the bare Docker engine without Kubernetes.
Any news here?
Any news?
cAdvisor is stateless and does not track container restarts itself. Some container runtimes attach a restart count label to metrics, but cAdvisor treats labels as opaque, as pointed out above. If the RestartCount label is causing problems, set --store_container_labels=false and use --whitelisted_container_labels to keep the ones you want.
@dashpole I think the issue here is that folks want to use container_label_restartcount as a vector for alerting -- not that it is causing problems. It's just that as a label and not a vector, it is impossible to do so.
I understand Docker is what is exposing this, so maybe this is an issue to file on Docker itself? Or is there some future where cAdvisor could support remapping label values into vectors themselves?
I have a very nasty hack for anyone wanting a container_restart_count metric that works today: https://gist.github.com/slimsag/85e06781eb0d4d35beee12916aefac5f