A fix was made in Elasticsearch and elasticsearch-docker to work around a cgroup mounting issue in containers. Logstash should account for this as well.
cc: @bohyun-e
@suyograo @ph could you guys look into this? Thanks.
Thanks @bohyun-e
We have two solutions to implement the same workaround:
1. A setting in logstash.yml
2. A Java system property, like Elasticsearch's cgroups.hierarchy.override
I am inclined to go the route of option 1, simply because we don't really use Java properties.
cc: @jasontedor since he's worked on the ES side of things.
@ph, that's also the route we're going to take in Kibana.
@tylersmalley @ph I recommend making two properties (instead of one as was done in Elasticsearch) and naming them something like cgroup.cpu.path.override and cgroup.cpuacct.path.override. I intend to deprecate cgroups.hierarchy.override in Elasticsearch and replace them with something like the above. Also, as we've done in Elasticsearch, I would make it clear that this a hack around an upstream problem and will eventually be removed.
I've synced with @jasontedor and will use the following scheme:
cpu.cgroup.path.override
cpuacct.cgroup.path.override
any workaround for this error?
We're seeing this too. Any update, @ph?
Was this work ever completed?
I'd like to run Logstash in a container, but I don't want to mount /sys/fs/cgroup from the host. Is this possible in recent versions of Logstash?
We're getting a lot of reports in elastic/logstash-docker#89 that I _think_ would be resolved by this feature.
...and on the forum. This feels like it's becoming more common.
So to sum up: this is a hack that substitutes path segments so we can build correct full paths to the cgroup files we need, but we don't know what the underlying problem is.
I am fixing this.
I am going to use Java system properties and NOT logstash.yml.
I plan to use the suggested properties ls.cgroup.cpu.path.override and ls.cgroup.cpuacct.path.override.
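For anyone landing here later, a minimal sketch of how these properties could be passed at startup via the LS_JAVA_OPTS environment variable. The `/` values are illustrative only: they point the reader at the controller mount root rather than the path reported by /proc/self/cgroup.

```shell
# Sketch: pass the override properties to the Logstash JVM via LS_JAVA_OPTS.
# The "/" values are illustrative; adjust them to wherever the cpu and
# cpuacct controllers are actually mounted in your container.
export LS_JAVA_OPTS="-Dls.cgroup.cpu.path.override=/ -Dls.cgroup.cpuacct.path.override=/"
echo "$LS_JAVA_OPTS"   # verify the options are set before starting bin/logstash
```

The same `-D` flags could equally go into jvm.options; the environment variable is just the least invasive place to try them.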
So we understand the problem quite well. Inside a container, reading /proc/self/cgroup returns cgroup paths that are not the actual paths:
```
23:38:21 1d [jason@totoro:~] $ sudo docker exec -it 0d6ee7a927d8 /bin/bash
[root@0d6ee7a927d8 elasticsearch]# ps aux
USER       PID %CPU %MEM     VSZ     RSS TTY   STAT START TIME COMMAND
elastic+     1 93.7  2.0 6285520 1338696 ?     Ssl  03:38 0:25 /opt/jdk-10.0.2/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -Xss1m -Djav
root       198  0.7  0.0   11832    2996 pts/0 Ss   03:38 0:00 /bin/bash
root       213  0.0  0.0   51720    3468 pts/0 R+   03:38 0:00 ps aux
[root@0d6ee7a927d8 elasticsearch]# cat /proc/1/cgroup
11:cpuset:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
10:perf_event:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
9:hugetlb:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
8:devices:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
7:blkio:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
6:net_cls,net_prio:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
5:memory:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
4:cpu,cpuacct:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
3:pids:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
2:freezer:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
1:name=systemd:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba
0::/system.slice/docker.service
```
However, these are not right: within the container, it's a lie, a terrible lie. Instead, we have to look at /sys/fs/cgroup/.
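To make the substitution concrete, here is a hedged shell sketch (not Logstash's actual implementation, which lives in Java) of how an override replaces the hierarchy segment reported by /proc/self/cgroup when building the real path under /sys/fs/cgroup. The sample line is taken from the output above; the override value is illustrative.

```shell
# Sample cpu controller line as reported inside the container (from above).
line="4:cpu,cpuacct:/docker/0d6ee7a927d8cbb7334971e1917a97d6e174702f6b16102e056397d53d4159ba"

# The reported hierarchy path is everything after the last ':'.
reported=${line##*:}

# What a property like ls.cgroup.cpu.path.override would supply; "/" means
# "the controller files live at the mount root inside the container".
override="/"

# Use the override when present, otherwise fall back to the reported path.
hierarchy=${override:-$reported}

# Build the final path to a cgroup file under the controller mount.
echo "/sys/fs/cgroup/cpu${hierarchy%/}/cpu.cfs_quota_us"
```

Without the override, the built path would contain the /docker/... segment, which does not exist inside the container; with it, the path resolves to the files Docker actually bind-mounts at the controller root.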
See elastic/elasticsearch-docker#25 for more background, especially the comment in the commit that explains the situation.
This is a Docker problem, by the way (actually it's at an even lower level). There are upstream issues for it, linked previously, but it's been a while, so I doubt they will ever do anything about it.