CoreCLR uses `mlock` during startup and fails if `mlock` fails with `EPERM`. Generally, that's not a problem. However, many Linux distributions are starting to use systemd-nspawn for building code. This creates a chroot where programs have restricted capabilities. Specifically, they do not have `CAP_IPC_LOCK`, which means they can't use `mlock`.
When `mlock` doesn't work, CoreCLR fails to start. This shows up in an `strace` as something like:
```
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbd542bb000
mlock(0x7fbd542bb000, 4096) = -1 EPERM (Operation not permitted)
write(2, "Failed to initialize CoreCLR, HR"..., 49) = 49
```
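For anyone who wants to check a build environment directly, here is a minimal standalone sketch (plain C++, not CoreCLR code) that reproduces the same mmap + mlock sequence and reports whether `mlock` is permitted:

```cpp
// Minimal repro sketch (not CoreCLR code): performs the same mmap + mlock
// sequence and reports whether mlock is permitted in this environment.
#include <sys/mman.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

int main()
{
    long pageSize = sysconf(_SC_PAGESIZE);

    void* page = mmap(nullptr, pageSize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
    {
        perror("mmap");
        return 1;
    }

    // Without CAP_IPC_LOCK (and with RLIMIT_MEMLOCK effectively at 0) this
    // fails with EPERM, which is the failure the strace above shows.
    if (mlock(page, pageSize) != 0)
    {
        fprintf(stderr, "mlock failed: %s\n", strerror(errno));
        munmap(page, pageSize);
        return 1;
    }

    puts("mlock succeeded");
    munlock(page, pageSize);
    munmap(page, pageSize);
    return 0;
}
```

Running this inside the systemd-nspawn chroot should show the same `EPERM` as the strace output above.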
As a result, it is basically impossible to build CoreCLR in some Linux distributions' build systems.
cc @tmds @alucryd
See https://github.com/dotnet/source-build/issues/285#issuecomment-399949984 and https://github.com/rpm-software-management/mock/issues/186 for examples of builds this is affecting.
The `mlock` is necessary for proper behavior of the FlushProcessWriteBuffers PAL function, which is crucial for ensuring reliable runtime suspension for GC. See https://github.com/dotnet/coreclr/blob/e6ebea25bea93eb4ec07cbd5003545c4805886a8/src/pal/src/thread/process.cpp#L3095-L3098 for a description of the reason.
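As I understand the linked comment, the locked page is what lets `mprotect` force an inter-processor interrupt on every core running one of the process's threads, and that IPI is what provides the process-wide memory barrier. A simplified sketch of that idea (not the actual PAL code; names and error handling are illustrative only):

```cpp
// Simplified sketch of the locked-page technique (not the actual PAL code).
// The helper page is mlock'ed so it stays resident; toggling its protection
// with mprotect then forces the kernel to interrupt every core running a
// thread of this process, which acts as a process-wide memory barrier.
#include <sys/mman.h>
#include <unistd.h>

static void* s_helperPage = nullptr;
static long  s_pageSize = 0;

// Returns false when mlock is not permitted (e.g. EPERM without CAP_IPC_LOCK),
// which is exactly the startup failure described above.
bool InitFlushProcessWriteBuffers()
{
    s_pageSize = sysconf(_SC_PAGESIZE);
    s_helperPage = mmap(nullptr, s_pageSize, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (s_helperPage == MAP_FAILED)
        return false;

    return mlock(s_helperPage, s_pageSize) == 0;
}

void FlushProcessWriteBuffers()
{
    // Make the page writable, dirty it, then revoke access again. Each
    // protection change invalidates the TLB entry on the relevant cores.
    mprotect(s_helperPage, s_pageSize, PROT_READ | PROT_WRITE);
    *static_cast<volatile int*>(s_helperPage) = 1;
    mprotect(s_helperPage, s_pageSize, PROT_NONE);
}
```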
On Linux 4.3 and higher, there is a `sys_membarrier` syscall that we could use as an alternative mechanism to implement FlushProcessWriteBuffers. Issue dotnet/runtime#4501 is tracking that. @sdmaclea tried to implement it and tested it on ARM64. He found that the performance was really bad: the running time of our ~11000 coreclr tests was about 50% longer. However, no testing was done on other hardware, so it was not clear whether the performance issue is ARM64-specific or an overall problem.
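A minimal sketch of what that alternative could look like, assuming a kernel and headers that expose the syscall (there is no libc wrapper, so it has to go through `syscall(2)`); the command names are the ones from `linux/membarrier.h`:

```cpp
// Sketch of FlushProcessWriteBuffers built on sys_membarrier (Linux 4.3+).
// MEMBARRIER_CMD_SHARED (renamed MEMBARRIER_CMD_GLOBAL in newer headers) is
// the slow variant: it waits for every running thread in the system.
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

static long membarrier_syscall(int cmd, int flags)
{
    return syscall(SYS_membarrier, cmd, flags);
}

// Returns false if the kernel does not support the command at all.
bool CanUseMembarrier()
{
    long supported = membarrier_syscall(MEMBARRIER_CMD_QUERY, 0);
    return supported >= 0 && (supported & MEMBARRIER_CMD_SHARED) != 0;
}

void FlushProcessWriteBuffersViaMembarrier()
{
    // Full memory barrier observed by all running threads before this returns.
    membarrier_syscall(MEMBARRIER_CMD_SHARED, 0);
}
```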
Interestingly enough, I've just discovered the following article describing performance issues with sys_membarrier: https://lttng.org/blog/2018/01/15/membarrier-system-call-performance-and-userspace-rcu/. The reason is that the syscall internally waits until all running threads on the system have gone through a context switch, which can take tens of milliseconds. The good news mentioned in the article is that starting with Linux 4.14, there is a new flag that can be passed to the sys_membarrier syscall that makes it use an IPI to implement the memory barrier semantics, which is much faster. So we should give it a try.
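If I read the article and the membarrier(2) man page correctly, the 4.14 path requires the process to register once, and afterwards `MEMBARRIER_CMD_PRIVATE_EXPEDITED` sends IPIs only to the CPUs currently running this process's threads. A hedged sketch (assuming Linux 4.14+ headers) of how that could slot in next to the previous one:

```cpp
// Sketch of the faster IPI-based variant available since Linux 4.14.
// The process registers once; subsequent expedited calls only interrupt the
// CPUs currently running this process's threads, so they are much cheaper
// than MEMBARRIER_CMD_SHARED.
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

static long membarrier_syscall(int cmd, int flags)
{
    return syscall(SYS_membarrier, cmd, flags);
}

// Call once at startup; returns false on kernels older than 4.14.
bool TryRegisterExpeditedMembarrier()
{
    long supported = membarrier_syscall(MEMBARRIER_CMD_QUERY, 0);
    if (supported < 0 ||
        (supported & MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED) == 0)
    {
        return false;
    }
    return membarrier_syscall(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0) == 0;
}

void FlushProcessWriteBuffersExpedited()
{
    // Full barrier on every thread of this process, implemented with IPIs.
    membarrier_syscall(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
}
```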