I'm trying to create a .NET Core console application for Linux and distribute it via the snapcraft.io platform.
I created a simple console app with the dotnet new console command and built it using dotnet build -c release --self-contained -r linux-x64. After that I made a snap package using the following snapcraft.yaml config:
name: snap-test-console
version: 1.0.0
summary: Dotnet core snap test
description: |
  Testing dotnet core in strict confinement
base: core18
architectures:
  - amd64
confinement: strict
grade: stable
parts:
  snap-test-console:
    plugin: dump
    source: bin/release/netcoreapp3.1/linux-x64/publish/
    stage-packages:
      - liblttng-ust0
      - libcurl4
      - libssl1.0.0
      - libkrb5-3
      - zlib1g
      - libicu60
apps:
  snap-test-console:
    command: snap-test-console
    plugs:
      - network
      - network-bind
      - removable-media
      - home
To build the snap package I ran the snapcraft command.
After installing the produced package with the snap install snap-test-console_*.snap --dangerous command, I ran the console app, but it failed with the message Failed to create CoreCLR, HRESULT: 0x80070008. After some digging I found the following SECCOMP denial in syslog:
] audit: type=1326 audit(1578771141.872:317): auid=1000 uid=1000 gid=1000 ses=2 pid=11025 comm="snap-test-conso" exe="/snap/snap-test-console/x1/snap-test-console" sig=0 arch=c000003e syscall=203 compat=0 ip=0x7f09cd37def7 code=0x50000
I resolved syscall=203:
$ scmp_sys_resolver 203
sched_setaffinity
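For readers who want to decode such a record without scmp_sys_resolver, here is a minimal Python sketch. It assumes an x86_64 host and carries only a two-entry excerpt of the syscall table; the helper names parse_seccomp_audit and blocked_syscall are made up for illustration and are not part of any audit tooling.

```python
import re

# Assumption: tiny excerpt of the x86_64 syscall table. A real resolver such
# as scmp_sys_resolver knows the full table for every architecture.
X86_64_SYSCALLS = {203: "sched_setaffinity", 204: "sched_getaffinity"}
AUDIT_ARCH_X86_64 = 0xC000003E  # value reported as arch=c000003e


def parse_seccomp_audit(line):
    """Return the key=value fields of a type=1326 (SECCOMP) audit record."""
    fields = dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))
    return {key: value.strip('"') for key, value in fields.items()}


def blocked_syscall(line):
    """Name the syscall that the seccomp filter rejected."""
    fields = parse_seccomp_audit(line)
    if int(fields["arch"], 16) != AUDIT_ARCH_X86_64:
        raise ValueError("this sketch only knows x86_64")
    number = int(fields["syscall"])
    return X86_64_SYSCALLS.get(number, f"unknown ({number})")


record = (
    'audit: type=1326 audit(1578771141.872:317): auid=1000 uid=1000 '
    'gid=1000 ses=2 pid=11025 comm="snap-test-conso" '
    'exe="/snap/snap-test-console/x1/snap-test-console" sig=0 '
    'arch=c000003e syscall=203 compat=0 ip=0x7f09cd37def7 code=0x50000'
)
print(blocked_syscall(record))  # prints "sched_setaffinity"
```

The arch field matters because syscall numbers differ between architectures, so 203 only means sched_setaffinity on x86_64.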
So adding the process-control interface to the snap's plugs fixes the issue. But this requires manually connecting the process-control interface after the snap is installed, which I want to avoid.
So my question is the following:
Is there any way to work around or change this behavior, so that the .NET Core runtime does not call the sched_setaffinity kernel function and does not require the process-control interface?
Here is strace output in case you need it.
I'm using Ubuntu 18.04.2 LTS.
FYI, there is an analysis of the issue for another dotnet CLI application in the snapcraft forum: https://forum.snapcraft.io/t/requesting-autoconnect-for-interfaces-in-pigmeat-process-control-home/17987/20
Heyo! Developer of the program @jdstrand mentioned. I'm facing the same issue with the same symptoms, and I, like you, do not have a workaround. This needs to be looked at further so the folks over at Snapcraft won't be forced to look at snaps on an individual basis over such a minor complication.
Hey :)
I'll share my thread as well.
https://forum.snapcraft.io/t/process-control-auto-connection-request-for-chameleon-snap/14993/10?u=soarc
Actually, I'm not sure this is a dotnet issue.
From what I understand from my logs, .NET Core uses sched_setaffinity to manipulate threads it created itself. So it would probably be OK to allow sched_setaffinity calls for those particular threads.
@jdstrand what do you think?
UPD: Some details. Here the 10152 process/thread is created, and on the next line sched_setaffinity(10152, 128, [0, 1, 2, 3]) = 0 is used to change its affinity.
The offending code might be the if statement following https://github.com/dotnet/runtime/blob/master/src/coreclr/src/pal/src/thread/thread.cpp#L760
It assumes that any error code other than ENOMEM is a failure, so an EPERM will be treated as fatal. This is incorrect. The process should continue when a sched_setaffinity syscall returns EPERM.
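The "continue on EPERM" policy described above can be sketched in a few lines. This is a Python illustration of the idea only, not the runtime's C++ code; set_affinity_best_effort and the denied stand-in are hypothetical names, and the injectable setter exists purely so the denial can be simulated outside a snap.

```python
import errno
import os


def set_affinity_best_effort(cpus, setter=None):
    """Try to restrict the calling thread to `cpus`.

    Returns True on success and False when the kernel or sandbox answers
    EPERM; any other error is still raised to the caller.
    """
    if setter is None:
        setter = os.sched_setaffinity  # available on Linux only
    try:
        setter(0, cpus)  # pid 0 means "the calling thread"
        return True
    except OSError as exc:
        if exc.errno == errno.EPERM:
            return False  # affinity is an optimisation, so keep running
        raise


# Hypothetical stand-in for a sandbox that denies the call.
def denied(pid, cpus):
    raise PermissionError(errno.EPERM, "Operation not permitted")


print(set_affinity_best_effort({0}, setter=denied))  # prints "False"
```

Whether ignoring the failure is acceptable depends on the application: as noted in the thread, it is only safe when the program still works correctly on a different CPU.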
@GRustamyan-ST - IME, @diddledan's observation is sound: dotnet should ignore (or log but continue) on EPERM unless the application is expected to not work correctly if it runs on a different CPU. Note that the code could also perhaps be adjusted to use sched_setaffinity(0, <whatever>, <you want>) (which is compliant with the default snap sandbox) instead of using pthread_attr_setaffinity_np(), which calls sched_setaffinity() in a way that requires process-control.
To snap publishers encountering this issue, it should be possible to use an LD_PRELOAD for sched_setaffinity to return 0 on EPERM until this issue is sorted out in dotnet proper.
@jdstrand Perhaps this sounds stupid, and if it does I'm sorry, but how would we go about using LD_PRELOAD in a .NET language, like C#, which is a managed language running on the .NET Core runtime? Isn't this a runtime-layer change, which would be below what app developers could actually affect?
Well, maybe I am the one misunderstanding (I only postulated that it 'should be possible to use LD_PRELOAD' with dotnet), but the file @diddledan pointed out is a C++ file that includes pthread_np.h for pthread_attr_setaffinity_np() (pthread_attr_setaffinity_np() is a glibc addon for pthread.h). That suggests dotnet is compiled with g++ for the snap, which should mean it is compiled to use dynamic libraries at the very least for glibc, which should therefore allow for LD_PRELOAD. Assuming that the dotnet runtime is not statically linked (ldd /path/to/thing would show that), it should work (someone would need to test this).
Ah, OK, so you mean we (those trying to publish .NET apps on Snapcraft) should compile a modified version of .NET Core and bundle that with our programs in some way? Apologies for the confusion on my part, I've never dealt with such low-level code before; I'm a teenage amateur.
It's OK @EmilSayahi, LD_PRELOAD is a rather esoteric concept that even a lot of very experienced developers don't really know about. As for the code in question, it's part of the native code that .NET Core uses as part of its Platform Abstraction Layer for Linux systems. The idea of the PAL is to unify the various operating systems behind a single API that .NET Core targets, on the premise that the small PAL is much simpler than embedding platform specifics all over the managed code.
What @jdstrand is thinking of, and what has worked in other (non-dotnet) snaps, is to write a small shared object library (DLL in Windows speak) that intercepts the function call to pthread_attr_setaffinity_np() and ensures that it doesn't return an EPERM error status. LD_PRELOAD is the name of an environment variable that the dynamic linker (ld.so) recognises at launch. The variable, if defined, should point to a shared object which is to be loaded first when loading an application's dependencies. Loading the library first means that it gets to implement the function before the library containing the original implementation is loaded. We can write our wrapper in a way that calls into the now-hidden version of the function and alters the return code.
Unfortunately, if an application is statically linked, then the functions you might want to replace are already in memory when the application starts, so you can't load your preload library before that happens. This might be the case with the PAL. Certainly the test that I attempted to write builds to a statically linked executable, so I couldn't replace the function using a different LD_PRELOAD library to always return EPERM to simulate the failure condition that snaps are causing.
Ah, I get it now! Thank you so much; you're as brilliant as ever, @diddledan.
Before I resign myself to silently spectating, I feel I ought to point out how ridiculous & absurd this is? It feels like adding more layers of duct tape isn't the approach Snapcraft should be taking? Perhaps I'm just too used to the more consumer-friendly world of Windows, but forcing developers to jump through all sorts of hurdles just to publish their rather simple software is highly concerning, and presents a major issue in promoting FOSS development. For the few like me who're willing to deal with such things, there are many who will be turned off from the whole ecosystem entirely.
Anyway, it's really not my place, I suppose. Again, thank you for the continued work. You lot are far more talented than I.
cc @leecow since he's tracking Snap stuff.
Before I resign myself to silently spectating, I feel I ought to point out how ridiculous & absurd this is? It feels like adding more layers of duct tape isn't the approach Snapcraft should be taking? Perhaps I'm just too used to the more consumer-friendly world of Windows, but forcing developers to jump through all sorts of hurdles just to publish their rather simple software is highly concerning, and presents a major issue in promoting FOSS development. For the few like me who're willing to deal with such things, there are many who will be turned off from the whole ecosystem entirely.
Please note that LD_PRELOAD is a workaround that, perhaps, people can use until dotnet runtime is adjusted to work without process-control (snaps shouldn't be given the ability to influence other snaps by default (which process-control and sched_setaffinity in the way the dotnet runtime is using it allows)). Adjusting software to work within app store sandboxing or other constraints is not something specific to the snap store; we just need to find a solution for this particular case (and I offered a couple of suggestions to the developers in https://github.com/dotnet/runtime/issues/1634#issuecomment-654260483).
I'm considering changing the affinity on the thread after it starts, as suggested in https://github.com/dotnet/runtime/pull/38795#issuecomment-655195755.
Is anyone blocked on this issue for .NET Core 3.1 in a way where it can't be worked around by any of the suggestions above?
@kouvel For me it is not a blocker right now, since I'm running my app with classic confinement. But that is really overkill, and being able to switch to strict confinement would be great! (It doesn't require the --classic command line argument, and not all users of my app are familiar with the command line.)
Unfortunately I wasn't able to test the workaround @jdstrand suggested, since it is way beyond my competence. :(
@GRustamyan-ST, ok will see if we can port to 3.1
It looks like sched_setaffinity is currently not allowed even when the same pid as the calling thread is passed in:
4417 clone(child_stack=0x7f6b100e2fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f6b100e39d0, tls=0x7f6b100e3700, child_tidptr=0x7f6b100e39d0) = 4443
...
4443 sched_setaffinity(4443, 128, [ff, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7f6b100e2f30, d2a9be9ff9d2ee00, 0, 0, 7f6b100e39c0, 1745110, 0, 7f6b12d4b6ba, 0, 7f6b100e3700, 7f6b100e3700, 394afa08d0bc0018, 0, 7ffdde373baf, 7f6b100e39c0, 1745110, c79cda148efc0018, c79cdfa1bc460018, 0, 0, 0, 0, 0, 0, 0, 0, 7f6b100e3700, 7f6b11fdc41d, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) = -1 EPERM (Operation not permitted)
pthread_setaffinity_np(pthread_self(), ...) appears to call sched_setaffinity(<nonzeroPid>, ...) and is failing even though it's trying to set the affinity of the same calling thread. @jdstrand, would it be possible to allow pthread_setaffinity_np(pthread_self(), ...) under Snap's strict confinement? I'm working around it by calling sched_setaffinity(0, ...) directly for now.
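To make the pid 0 form concrete, the sketch below has a worker thread set its own affinity after it has started. It is a Python illustration on Linux rather than the runtime's native code; pin_calling_thread and the report dictionary are hypothetical names, and the example assumes it runs somewhere the call is permitted.

```python
import os
import threading


def pin_calling_thread(cpus, report):
    # pid 0 targets the calling thread, so no thread id has to be passed
    os.sched_setaffinity(0, cpus)
    report["worker"] = os.sched_getaffinity(0)


allowed = os.sched_getaffinity(0)  # CPUs the main thread may run on
one_cpu = {min(allowed)}           # pick one of them for the worker
report = {}

worker = threading.Thread(target=pin_calling_thread, args=(one_cpu, report))
worker.start()
worker.join()
report["main"] = os.sched_getaffinity(0)

print(report)
```

On Linux the affinity mask is per thread, so the worker ends up restricted to one CPU while the main thread keeps its original mask.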
I'm working around it by calling sched_setaffinity(0, ...) directly for now.
This is the only way to make it work in the default sandbox without connecting the process-control interface. The reason why is that this syscall is currently mediated via Linux's seccomp syscall filter and by the time it hits the filter, the first argument is filled in and all context for pid_t is gone (ie, we can only look at the value of pid_t and say 'yes' or 'no'-- there is no other context or conditional logic available like "I'm a child", "this is my pid", etc).
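The limitation can be pictured with a toy model. The Python function below is only a stand-in for the decision described in this thread, not snapd's actual seccomp policy or syntax; filter_allows_setaffinity and its process_control_connected flag are invented for illustration.

```python
import threading


def filter_allows_setaffinity(pid_arg, process_control_connected=False):
    """Toy model: the filter sees only the numeric value of the first argument."""
    if process_control_connected:
        return True  # process-control grants full access to the syscall
    # Default template: only the literal 0 passes. A nonzero value cannot be
    # recognised as "my own thread" because no other context is available.
    return pid_arg == 0


own_tid = threading.get_native_id()  # what a pthread_setaffinity_np-style call passes

print(filter_allows_setaffinity(0))
print(filter_allows_setaffinity(own_tid))
print(filter_allows_setaffinity(own_tid, process_control_connected=True))
```

The second call is rejected even though own_tid belongs to the caller, which mirrors why pthread_setaffinity_np(pthread_self(), ...) fails under the default sandbox while sched_setaffinity(0, ...) succeeds.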
A possible kernel enhancement would be to adjust AppArmor to add a new rule type since it is in a much better position wrt context, but even if the kernel were fixed, it would be years before all distros had a new enough kernel that could leverage it.
@GRustamyan-ST, ok will see if we can port to 3.1
If this is possible, that would be great. @GRustamyan-ST mentioned classic confinement, but there are restrictions on classic confinement snap distribution in the public snap store (more than auto-connecting the process-control interface for example).
we can only look at the value of pid_t and say 'yes' or 'no'
I see, thanks for the context
there are restrictions on classic confinement snap distribution in the public snap store (more than auto-connecting the process-control interface for example).
Could you elaborate on that? Might be useful to include a bit more info about the issues involved with the workarounds. Feel free to add notes on issues with workarounds to https://github.com/dotnet/coreclr/pull/28071.
By default snaps are strictly confined and run under a default security policy template which is a collection of rules for file accesses, syscalls, IPC, etc. Additional accesses are provided to snaps through snap 'interfaces' which are collections of different rules for file accesses, syscalls, IPC, etc. Depending on the rules in the interface, it may either be manually connected by the user or automatically connected upon install. For example, the default template allows sched_setaffinity(0, ...) while the process-control interface allows full access to sched_setaffinity. Because full access to sched_setaffinity allows the snap to impact other snaps and non-snap processes on the system, use of process-control requires manual connection by default.
In addition to the default behavior, we have store processes that allow publishers to "request auto-connection" for interfaces that are manually connected for their snaps in the public snap store. This process is flexible but requires human review and justification for the access, and it is a point of friction for dotnet runtime snap publishers, since their applications don't have any intention to manipulate other snaps' or non-snap processes and otherwise don't need the accesses that process-control gives them.
As an alternative to strict confinement, publishers may choose 'classic confinement' which runs the snap outside of a security sandbox and therefore would also allow unrestricted use of sched_setaffinity. Since classic confined snaps run without any sandboxing, we carefully regulate their distribution in the public snap store via store processes.
The easiest workaround that doesn't require engineering for snap publishers is plugging process-control (we would not allow a snap in the public store to use classic confinement to work around this bug). This is suboptimal of course, for the reasons stated above. I did outline ideas for an LD_PRELOAD approach, but others weren't successful in making that work. My hope is for dotnet runtime to incorporate the changes you are making now (thanks!) in any dotnet runtime versions that snap publishers might use so their applications just work without extra interfaces like process-control. In the meantime, we can temporarily grant auto-connection for process-control for snaps that come in before the dotnet runtime changes are in place.
The fix to 5.0 in https://github.com/dotnet/runtime/pull/40205 is now available in daily SDKs from the master branch here: https://github.com/dotnet/installer/blob/master/README.md#installers-and-binaries
Will the fix be backported to .NET Core 3.1? What's the present status of this (https://github.com/dotnet/coreclr/pull/28071) pull request?
I have the same question, but for .NET 5. The PR is supposed to have been merged into master, but it does not seem to have landed in the RC1 that was released today. Is this something that will be available in RC2, or maybe later? Thanks!
@jeromelaban AFAICT, the fix is in RC1. If it's not working, something else is probably wrong.
Yes it looks like this fix is in .NET 5 RC1, some small apps appear to be working as expected. @jeromelaban if you are seeing other issues with RC1 under snap, can you share a repro or stack trace?
For 3.1 the fix is pending approval after getting some confidence in the change with RC1 and to collect any other related issues.
@kouvel @svick Thanks for the follow-up. The fix is indeed in RC1; I was misled by snapcraft's caching, which used the previous Preview 8 binaries. All good and thanks for the hard work :)
Happy to be able to use .NET 5 in Linux strict snaps now :)