0.12.0
RHEL6, kernel version 2.6.32-754.30.2.el6.x86_64
I appreciate that older kernel versions may not be supported, in which case please close this. However, it may be useful for others.
I was able to run Nomad v0.11.3 on RHEL6 as long as I ran it with a newer version of libc (which I did via patchelf --set-interpreter glibc-2.19/lib/ld-linux-x86-64.so.2 --set-rpath glibc-2.19/lib). It launched commands successfully and generally worked well.
However, with Nomad v0.12.0 this no longer works. The raw_exec driver fails with:
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: panic: cannot statfs cgroup root: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: : alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: goroutine 8 [running]:: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/runc/libcontainer/cgroups.IsCgroup2UnifiedMode.func1(): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/[email protected]/libcontainer/cgroups/utils.go:45 +0xbe: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: sync.(*Once).doSlow(0x54d2310, 0x3313690): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: sync/once.go:66 +0xec: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: sync.(*Once).Do(...): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: sync/once.go:57: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/runc/libcontainer/cgroups.IsCgroup2UnifiedMode(0x5312ca0): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/[email protected]/libcontainer/cgroups/utils.go:42 +0x58: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/runc/libcontainer/cgroups.isSubsystemAvailable(0x321931a, 0x7, 0x24): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/[email protected]/libcontainer/cgroups/utils.go:106 +0x26: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/runc/libcontainer/cgroups.FindCgroupMountpointAndRoot(0x0, 0x0, 0x321931a, 0x7, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/opencontainers/[email protected]/libcontainer/cgroups/utils.go:65 +0x75: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor.getCgroupPathHelper(0x321931a, 0x7, 0xc0005304b0, 0x2b, 0x2b, 0xc000544000, 0x41, 0xc000515880): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor/executor_linux.go:716 +0x55: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor.configureBasicCgroups(0xc000515780, 0xc000544000, 0x0): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor/executor_linux.go:700 +0xb9: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor.(*UniversalExecutor).configureResourceContainer(0xc0002241a0, 0xbfd, 0x0, 0x0): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor/executor_universal_linux.go:80 +0xff: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor.(*UniversalExecutor).Launch(0xc0002241a0, 0xc0005080f0, 0x0, 0x0, 0x0): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor/executor.go:283 +0x258: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor.(*grpcExecutorServer).Launch(0xc000206750, 0x38e7100, 0xc000506000, 0xc000508000, 0xc000206750, 0xc000506000, 0xc000214b78): alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
2020-07-30T11:34:29.586+0100 [DEBUG] client.driver_mgr.raw_exec.executor.nomad: github.com/hashicorp/nomad/drivers/shared/executor/server.go:23 +0x371: alloc_id=d3cfc692-5264-7cc8-fd31-1e65072808b9 driver=raw_exec task_name=test_task
Without debug logging enabled, this appears only as a task 'Driver Failure' with the error 'failed to launch command with executor: rpc error: code = Unavailable desc = transport is closing'.
The problem seems to be line 45 of /vendor/github.com/opencontainers/runc/libcontainer/cgroups/utils.go in IsCgroup2UnifiedMode
if err := syscall.Statfs(unifiedMountpoint, &st); err != nil {
    panic("cannot statfs cgroup root")
}
where unifiedMountpoint is "/sys/fs/cgroup". It seems the code now panics if this path doesn't exist.
If this were just to return false instead (or the panic were avoided some other way), I think everything would work, since the code otherwise seems to tolerate cgroups initialization returning an error.
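To illustrate the suggestion, here is a minimal sketch of a non-panicking check. It is not the libcontainer code: the function name isCgroup2UnifiedMode, its (bool, error) signature and the cgroup2SuperMagic constant name are assumptions made for this example. It uses only syscall.Statfs from the Go standard library and compares against the kernel's CGROUP2_SUPER_MAGIC value.

```go
package main

import (
	"fmt"
	"syscall"
)

// cgroup2SuperMagic is the filesystem magic number that statfs(2) reports
// for a cgroup v2 mount (CGROUP2_SUPER_MAGIC in linux/magic.h).
const cgroup2SuperMagic = 0x63677270

// isCgroup2UnifiedMode is a hypothetical, non-panicking variant of the check
// quoted above. If mountpoint cannot be statfs'd (for example because
// /sys/fs/cgroup does not exist), it returns false plus the error instead
// of panicking, so the caller can decide how to proceed.
func isCgroup2UnifiedMode(mountpoint string) (bool, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(mountpoint, &st); err != nil {
		return false, fmt.Errorf("cannot statfs cgroup root %q: %w", mountpoint, err)
	}
	// The width of st.Type differs between platforms, so normalise it.
	return uint32(st.Type) == cgroup2SuperMagic, nil
}

func main() {
	unified, err := isCgroup2UnifiedMode("/sys/fs/cgroup")
	if err != nil {
		fmt.Println("cgroups unavailable, continuing without them:", err)
		return
	}
	fmt.Println("cgroup v2 unified mode:", unified)
}
```

On a host without /sys/fs/cgroup, this takes the error branch and carries on rather than crashing the executor process.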
If it's not intended to support certain kernel versions, it might be good to have an error on startup.
Hi @dposton80! The error you're seeing is bubbling up from the third-party libcontainer, which we've updated for a security issue in 0.12.0 (see https://github.com/hashicorp/nomad/pull/8246). It might be worth reporting this to that project to see whether they can recommend a workaround for older kernels.
If it's not intended to support certain kernel versions, it might be good to have an error on startup.
That support depends on which task drivers you have enabled, which makes it a little tricky to state a specific version. Unfortunately, it looks like we don't even document that (or at least not anywhere I would expect to see it), so I'm going to mark this as a documentation bug at least.
As a potential workaround, you can disable cgroups usage in raw_exec with the no_cgroups flag. Can you try adding the following snippet to your client config:
plugin "raw_exec" {
  config {
    no_cgroups = true
  }
}
The raw_exec driver uses cgroups to improve process tracking for metric collection and shutdown, so with this flag set you may notice some odd behavior: child processes may not be tracked, or may not be killed if they don't clean up properly.
Thanks for the responses. I already tried the no_cgroups option, but I'm afraid it doesn't seem to make any difference. It seems the call to configureResourceContainer in drivers/shared/executor/executor.go line 283 should be skipped if command.BasicProcessCgroup is false?
Well, that's unfortunate. This is a bug that we should fix - no cgroup operation should occur when no_cgroups is set!
Has there been any progress? I have the same issue in the latest Nomad, 0.12.7.
@qianglchina Thanks for your patience. I have just merged a fix to be included in the next Nomad release.