I noticed that the CPU usage of cadvisor is the highest of all the containers I'm running, which I feel is unexpected. It's still not crazy (an average of 7%), but I feel it should be lower.
I'm using the following query to calculate CPU usage (in grafana):
sum(rate(container_cpu_usage_seconds_total{image!=""}[1m])) by (id,name)
Some info about my instance:
-docker_only --disable_metrics disk,tcp,udp
Is this the expected CPU usage? Is there something I can improve? Or am I monitoring CPU usage wrong?
Edit: I managed to bring it down to ~3% by setting the --housekeeping_interval to 10s. But it would be nice if it was even lower.
You should add a number of things to the disable_metrics list. See the help text for the flag: https://github.com/google/cadvisor/blob/master/cmd/cadvisor.go#L137
I am also experiencing this issue.
On a small server (CPU: quad-core Intel Xeon E3-1265L V2), even though I have disabled a large number of metrics, cAdvisor is using an average of 8% CPU. That is at least 8 times higher than the most CPU-consuming of the other containers.
It's annoying to have the monitoring tool eat far more resources than the monitored containers.
Best regards
@dashpole I have a docker-compose setup for cadvisor. Where do I add --housekeeping_interval to set it to 10s?
@gowrisankar22 to the container args. I'm not sure where those are specified...
@gowrisankar22 This issue is about high cpu usage, not about docker-compose content.
Anyway:
# cadvisor:
#   image: gcr.io/google-containers/cadvisor:v0.36.0
#   container_name: cadvisor
#   command:
#     - '--housekeeping_interval=55s'
#     - '--docker_only'
I've disabled all statistics listed, specifically:
disk,diskIO,network,tcp,udp,percpu,sched,process
Note that the statistics cpu_topology, hugetlb and referenced_memory were mentioned, but I couldn't actually disable them: it gave me an error about an invalid argument.
Now the CPU usage is down to ~1.5%. This is still the highest average CPU usage of all 31 containers running on my machine (followed by Prometheus itself at ~1.3%).
But I wouldn't say the issue is quite resolved: I expect cadvisor to be able to report statistics with intervals of ~15-30 seconds, without taking more than 1% of the CPU load/time, and I didn't expect to have to disable all statistics I can to get near that number.
1% of how many cores? Also, the query interval isn't what really matters; it is the housekeeping interval. cAdvisor collects metrics in the background, and serves them from its cache.
The machine has two cores, so I believe it would be ~1.5% of a single core. It is calculated as:
sum(rate(container_cpu_usage_seconds_total{image!=""}[1m])) by (id,name)
But note that this usage is when disabling all metrics I could disable, and setting the housekeeping interval to 15s.
With default settings, cadvisor would take up ~15% CPU usage, which in my opinion is too much for the default settings.
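For anyone comparing numbers in this thread, here is a small sketch of the arithmetic behind that query. rate() over container_cpu_usage_seconds_total gives CPU-seconds consumed per second, so multiplied by 100 it is a percentage of one core, not of the whole machine. The function name and the sample counter values below are made up for illustration; they are not taken from a real cAdvisor instance.

```python
def cpu_share(cpu_seconds_start, cpu_seconds_end, window_seconds, num_cores):
    """Convert two samples of a cumulative CPU-seconds counter into usage figures.

    Returns (cores_used, pct_of_one_core, pct_of_machine).
    """
    # Same idea as rate(counter[window]): counter increase divided by elapsed time.
    cores_used = (cpu_seconds_end - cpu_seconds_start) / window_seconds
    pct_of_one_core = cores_used * 100.0
    # Dividing by the core count expresses usage relative to total machine capacity.
    pct_of_machine = pct_of_one_core / num_cores
    return cores_used, pct_of_one_core, pct_of_machine


# Hypothetical samples: 0.9 CPU-seconds consumed over a 60 s window on a 2-core host.
cores, one_core, machine = cpu_share(0.0, 0.9, 60.0, 2)
print(f"{one_core:.2f}% of one core, {machine:.2f}% of the machine")
```

With these assumed samples the result is about 1.5% of one core, which is about 0.75% of a two-core machine. That is why the core count matters when comparing percentages between hosts.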
I'm having this same problem and followed the suggestions in this issue and it corrected it for a while, but the high CPU use came back after restarting the container.

This is my docker-compose.
cadvisor:
  image: google/cadvisor:latest
  container_name: cadvisor
  restart: always
  command:
    - '--docker=tcp://socket-proxy:2375'
    - '--housekeeping_interval=15s'
    - '--docker_only=true'
    - '--disable_metrics=disk,network,tcp,udp,percpu,sched,process'
  networks:
    - socket_proxy
    - database
  depends_on:
    - socket-proxy
    - prometheus
  security_opt:
    - no-new-privileges:true
  ports:
    - '$CADVISOR_PORT:8080'
  volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:rw
    - /sys:/sys:ro
    - /dev/disk/:/dev/disk:ro