We are using cAdvisor to collect Docker container metrics with Prometheus. There are many cgroup-related metrics being exposed that are not relevant to us, and we want cAdvisor to filter them out.
The --docker_only flag does not change anything for some reason; we continue to see all the /system.slice/* metrics being exposed. After some googling, I came across the rawPrefixWhiteList = "/system.slice/kubelet.service, /system.slice/docker.service.." setting (#2048, #1926).
But I cannot figure out how to use that setting or how it relates to the --docker_only flag. Could anyone explain in more detail how to use rawPrefixWhiteList? It does not seem to be documented anywhere.
--docker_only makes cAdvisor collect only Docker container metrics, plus machine-level metrics. We don't actually expose rawPrefixWhiteList as a flag. We may want to consider consuming this in the kubelet to get that behavior.
Thanks for your quick response.
Does that mean we should NOT see /system.slice/* metrics when using the --docker_only flag? And what do you mean by machine-level metrics?
I'm just looking for a way to tell cAdvisor not to collect or expose /system.slice/* metrics, as there is a noticeable performance impact. We don't use Kubernetes.
With --docker_only, you only get cgroup metrics for cgroups that belong to Docker containers; e.g. /kubepods/pod<pod_uid>/<container_id> will be collected, but /kubepods/pod<pod_uid> will not.
Machine metrics are metrics from the / (root) cgroup, as well as, for example, disk metrics from filesystems on the node. We collect these metrics when --docker_only is specified, even though they don't come from a Docker container cgroup.
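To make the distinction concrete, here is a small sketch of how the two kinds of series typically look in cAdvisor's /metrics output. The sample lines and values below are made up for illustration; they are not taken from a real scrape:

```shell
# Abridged, hypothetical sample of a cAdvisor /metrics payload.
cat > sample_metrics.txt <<'EOF'
machine_cpu_cores 4
machine_memory_bytes 16777216000
container_memory_usage_bytes{id="/"} 9000000000
container_memory_usage_bytes{id="/docker/abc123"} 104857600
EOF

# Machine-level series carry no cgroup id label at all...
grep '^machine_' sample_metrics.txt

# ...while the root cgroup shows up as a container series with id="/"
grep 'id="/"' sample_metrics.txt
```

Both of these survive --docker_only, since they describe the node itself rather than any one cgroup below the root.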
So, is there a way to not collect the /system.slice/* metrics?
We don't use Kubernetes either, but we want to use cAdvisor to collect Docker container metrics. It collects all cgroup metrics under /, and it takes Prometheus a long time to scrape them.
I wanted to elaborate more on our use case for cAdvisor and Prometheus.
We're running fleets of spot instances that come and go many times a day, peaking at 1500-2000 instances at a time. Each spot instance hosts a dozen Docker containers, which are not static either. All in all it is a pretty dynamic environment, which in turn produces millions of distinct time series and leads to extremely high-cardinality metrics.
Basically we need just a handful of metrics to monitor the CPU and memory usage of every Docker container at a resolution of a few seconds.
cAdvisor exposes over 2700 time series on every spot instance at any given time. While trying to ingest all of those from every instance into Prometheus, I observed over 400 million(!) head time series before it crashed the biggest Prometheus server I could try.
Yes, I can drop the unnecessary metrics on the Prometheus side, and that does help to get cardinality under control. Still, the 2700 time series amount to over one megabyte of data, which Prometheus has a hard time scraping at a high pace. Given there are some 2000 endpoints to scrape, it's a lot of traffic too. There also seems to be CPU overhead for collecting the unneeded metrics.
For the above reasons, it would be helpful to have an option to collect and expose Docker-related metrics only.
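For anyone else stuck on this, the Prometheus-side workaround mentioned above amounts to a metric_relabel_configs drop rule on the id label. A minimal sketch of the same filter applied to a scraped dump; the sample lines and values are made up for illustration:

```shell
# Abridged, hypothetical sample of a cAdvisor scrape.
cat > scrape.txt <<'EOF'
container_cpu_usage_seconds_total{id="/docker/abc123"} 12.5
container_cpu_usage_seconds_total{id="/system.slice/cron.service"} 0.3
container_memory_usage_bytes{id="/docker/abc123"} 104857600
container_memory_usage_bytes{id="/system.slice/ssh.service"} 2097152
EOF

# Equivalent of dropping every series whose id starts with /system.slice:
grep -v 'id="/system.slice' scrape.txt

# How many series such a filter removes from this sample:
grep -c 'id="/system.slice' scrape.txt
# → 2
```

Note this only reduces ingested cardinality; cAdvisor still pays the CPU cost of collecting the series, and the full payload still crosses the wire on every scrape, which is exactly why filtering at the source would help.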
@dashpole thanks for taking care of this. Is pull request #72787 above going to solve the issue when cAdvisor is used outside of Kubernetes?
No, but you should be able to use the --docker_only flag if you just want metrics for Docker containers.
@dashpole at the moment, cAdvisor with the --docker_only flag collects metrics from the /system.slice cgroup, which is exactly what we are trying to avoid. Is that behavior going to change?
@viberan that sounds like a bug. The --docker_only flag should mean you only get Docker containers plus the / (root) container.
Can you paste your configuration (flags + cAdvisor version) and the Prometheus metrics you are getting here?
@dashpole sorry for the delay in responding. Below is an excerpt from our docker-compose file:
```yaml
cadvisor:
  image: google/cadvisor:v0.32.0
  network_mode: bridge
  container_name: cadvisor
  ports:
    - "9600:8080"
  volumes:
    - /:/rootfs:ro
    - /var/run:/var/run:rw
    - /sys:/sys:ro
    - /var/lib/docker/:/var/lib/docker:ro
    - /dev/disk/:/dev/disk:ro
  restart: unless-stopped
  command: ["--docker_only=true"]
```
The output of the docker top command:
root@xxxx# docker top cadvisor
UID PID PPID C STIME TTY TIME CMD
root 1495 1477 8 15:45 ? 00:01:59 /usr/bin/cadvisor -logtostderr --docker_only=true
Getting ~3000 time series for Prometheus:
curl -s localhost:9600/metrics | wc -l
3010
which is about 1 MB of data:
curl -s localhost:9600/metrics | wc -c
1182807
and about 2000 of those time series are system.slice-related:
curl -s localhost:9600/metrics | grep 'system.slice' | wc -l
2014
Metric examples:
container_tasks_state{container_label_com_docker_compose_config_hash="",container_label_com_docker_compose_container_number="",container_label_com_docker_compose_oneoff="",container_label_com_docker_compose_project="",container_label_com_docker_compose_service="",container_label_com_docker_compose_version="",container_label_name="",id="/system.slice/unattended-upgrades.service",image="",name="",state="uninterruptible"} 0
container_tasks_state{container_label_com_docker_compose_config_hash="",container_label_com_docker_compose_container_number="",container_label_com_docker_compose_oneoff="",container_label_com_docker_compose_project="",container_label_com_docker_compose_service="",container_label_com_docker_compose_version="",container_label_name="",id="/system.slice/watchdog.service",image="",name="",state="iowaiting"} 0
There are tons of system.slice stuff:
curl -s localhost:9600/metrics | grep system.slice | cut -d'{' -f1 | sort | uniq -c
64 container_cpu_load_average_10s
64 container_cpu_system_seconds_total
136 container_cpu_usage_seconds_total
64 container_cpu_user_seconds_total
38 container_fs_reads_bytes_total
38 container_fs_reads_total
38 container_fs_writes_bytes_total
38 container_fs_writes_total
64 container_last_seen
64 container_memory_cache
64 container_memory_failcnt
256 container_memory_failures_total
64 container_memory_mapped_file
64 container_memory_max_usage_bytes
64 container_memory_rss
64 container_memory_swap
64 container_memory_usage_bytes
64 container_memory_working_set_bytes
64 container_spec_cpu_period
64 container_spec_cpu_shares
64 container_spec_memory_limit_bytes
64 container_spec_memory_reservation_limit_bytes
64 container_spec_memory_swap_limit_bytes
64 container_start_time_seconds
320 container_tasks_state
curl -s localhost:9600/metrics | grep -Eo 'id="/system.slice[^"]+' | sort | uniq -c
35 id="/system.slice/accounts-daemon.service
33 id="/system.slice/acpid.service
27 id="/system.slice/apparmor.service
27 id="/system.slice/apport.service
35 id="/system.slice/apt-daily-upgrade.service
34 id="/system.slice/atd.service
27 id="/system.slice/cgroupfs-mount.service
35 id="/system.slice/cloud-config.service
35 id="/system.slice/cloud-final.service
27 id="/system.slice/cloud-init-local.service
35 id="/system.slice/cloud-init.service
27 id="/system.slice/console-setup.service
35 id="/system.slice/cron.service
35 id="/system.slice/dbus.service
35 id="/system.slice/docker.service
35 id="/system.slice/exim4.service
35 id="/system.slice/filebeat.service
27 id="/system.slice/grub-common.service
35 id="/system.slice/[email protected]
35 id="/system.slice/irqbalance.service
35 id="/system.slice/iscsid.service
27 id="/system.slice/keyboard-setup.service
27 id="/system.slice/kmod-static-nodes.service
35 id="/system.slice/lvm2-lvmetad.service
27 id="/system.slice/lvm2-monitor.service
35 id="/system.slice/lxcfs.service
27 id="/system.slice/lxd-containers.service
35 id="/system.slice/mdadm.service
27 id="/system.slice/networking.service
35 id="/system.slice/node_exporter.service
35 id="/system.slice/nscd.service
35 id="/system.slice/nslcd.service
35 id="/system.slice/ntp.service
27 id="/system.slice/ondemand.service
27 id="/system.slice/open-iscsi.service
35 id="/system.slice/polkitd.service
35 id="/system.slice/protologbeat.service
27 id="/system.slice/rc-local.service
27 id="/system.slice/resolvconf.service
35 id="/system.slice/rsyslog.service
27 id="/system.slice/setvtrgb.service
27 id="/system.slice/snapd.seeded.service
39 id="/system.slice/snapd.service
35 id="/system.slice/ssh.service
35 id="/system.slice/systemd-journald.service
27 id="/system.slice/systemd-journal-flush.service
35 id="/system.slice/systemd-logind.service
27 id="/system.slice/systemd-modules-load.service
27 id="/system.slice/systemd-random-seed.service
27 id="/system.slice/systemd-remount-fs.service
27 id="/system.slice/systemd-sysctl.service
27 id="/system.slice/systemd-tmpfiles-setup-dev.service
27 id="/system.slice/systemd-tmpfiles-setup.service
39 id="/system.slice/systemd-udevd.service
27 id="/system.slice/systemd-udev-trigger.service
27 id="/system.slice/systemd-update-utmp.service
27 id="/system.slice/systemd-user-sessions.service
34 id="/system.slice/system-getty.slice
35 id="/system.slice/system-serial\\x2dgetty.slice
27 id="/system.slice/ufw.service
27 id="/system.slice/unattended-upgrades.service
35 id="/system.slice/watchdog.service
Am I missing anything in the configuration? Thanks.
@dashpole I'm not sure whether it's a bug or a feature, but from what I can tell from the source below, accept will always be true for the default []string{"/"} value of rawPrefixWhiteList.
That would allow all raw cgroups to always be collected, hence all the /system.slice/* stuff reported by cAdvisor regardless of the --docker_only flag being set.
Oh yeah, you are correct. That is a bug. I'll open something to fix it...
@dashpole thank you for #2161.
Would you mind adding the ability to whitelist certain cgroup prefixes via a command-line flag? #2164
For anyone reading this in the current year: the --docker_only flag has been removed. Use --raw_cgroup_prefix_whitelist instead.
--raw_cgroup_prefix_whitelist=/docker/ should filter out everything correctly.
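For reference, a sketch of what that invocation could look like, mirroring the compose setup earlier in the thread. The image location, tag, and port mapping are assumptions; adjust them to your environment and check the flag against your cAdvisor version:

```shell
# Hypothetical invocation; image and tag are assumptions, not from this thread.
docker run -d --name=cadvisor -p 9600:8080 \
  -v /:/rootfs:ro \
  -v /var/run:/var/run:rw \
  -v /sys:/sys:ro \
  -v /var/lib/docker/:/var/lib/docker:ro \
  gcr.io/cadvisor/cadvisor:latest \
  --raw_cgroup_prefix_whitelist=/docker/

# If the whitelist takes effect, this count should drop to 0:
curl -s localhost:9600/metrics | grep -c 'system.slice'
```

Note the prefix /docker/ assumes the cgroupfs driver; with the systemd cgroup driver, container cgroups live under a different path, so the prefix would need to match your layout.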
@lededje, are you sure? I did not check it myself, but the flag still shows up in the source code https://github.com/google/cadvisor/blob/f17af505243c9f637b49a279f64de70a98edf3a3/container/raw/factory.go#L32