While Mono has <dllmap>, there is a long-standing discussion going on in https://github.com/dotnet/coreclr/issues/930 and efforts like https://github.com/dotnet/corefx/issues/17135 to work around the limitations of P/Invoke, and even NativeLibrary was introduced. Still, I feel that we could come up with a simple solution that leverages the existing P/Invoke capabilities, without the overhead and ugly machinery that comes from variations of GetDelegateForFunctionPointer and similar hacks that have been attempted to work around those limitations.
The proposal is to add an API to inform the runtime where we want a particular file referenced in the DllImport attribute to be loaded from. Developers would then annotate their DllImport attributes with a custom name, and at startup, their own logic would determine which library to load.
For example:
//
// We declare our DllImport, and by convention we use reverse domains, to avoid clashes:
//
using System.IO;
using System.Runtime.InteropServices;

[DllImport ("github.com/migueldeicaza/gui.cs/curses")]
extern static void initscr ();

// At startup, we decide what we want to do.
static void Main ()
{
    string library = "github.com/migueldeicaza/gui.cs/curses";

    if (File.Exists ("/usr/lib/libncurses_6.so"))
        PInvoke.RegisterLibrary (library, "/usr/lib/libncurses_6.so");
    else if (File.Exists ("/usr/lib/libncurses_7.so"))
        PInvoke.RegisterLibrary (library, "/usr/lib/libncurses_7.so");
    else
        throw new DllNotFoundException ("No usable ncurses library found");
}
Working around this today requires ugly hacks, from Gui.cs having doubled definitions, to Grpc generating proxies, entire class hierarchies, and tons of delegates to achieve the desired effect. And the result produces more junk than the current P/Invoke does.
Bonus points:
we could make it so that the string passed to DllImport could take parameters, similar in spirit to, say, a ConnectionString in SQL, so we could provide defaults, or even simple inline switching that is evaluated at resolution time.
For example:
// Built-in switching capabilities, similar to Dllmap:
[DllImport ("(switch 'osMac:libSystem 'osLinux:libc 'osWindows:user32)")]
// Define a key that can be referenced by PInvoke.RegisterLibrary, but also provide a default if the API is not called
[DllImport ("(key 'github.com/migueldeicaza/gui.cs)(default 'ncurses)")]
While I can certainly add a bag of hacks to gui.cs (for forked, differently named versions) and TensorFlowSharp (for CPU vs GPU, vs various SIMD operation builds), none of those libraries are particularly affected by the transition speed. But it would be a shame if we did not implement something for every other user that needs to cope with different bits of native code, but does not want to pay the performance price of the GetDelegateFrom...
I agree that a capability like this would be good to add. It is something we have been thinking about.
There are two possible designs: a registration API (like the PInvoke.RegisterLibrary proposed above), or a resolve event that the runtime raises when it needs to locate a native library.
My observation is that the resolve event seems to be more flexible: one can implement the registration APIs on top of the resolve event, but one cannot implement the resolve event using the registration APIs. Also, the resolve event may have better performance characteristics because it is lazy.
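To make that concrete, here is a minimal sketch (all names hypothetical) of how a registration API could be layered on top of a resolve callback; the reverse is not possible, because a registration table cannot express arbitrary resolution logic:

using System;
using System.Collections.Concurrent;

// Hypothetical sketch: a RegisterLibrary-style API implemented on top of a
// resolve callback that the runtime would invoke lazily, on the first
// P/Invoke through a given library name.
static class LibraryResolution
{
    static readonly ConcurrentDictionary<string, string> map =
        new ConcurrentDictionary<string, string> ();

    // The registration API is just a table entry...
    public static void RegisterLibrary (string name, string path) => map[name] = path;

    // ...consumed by the resolve callback, which could equally run any other logic.
    public static string Resolve (string name) =>
        map.TryGetValue (name, out var path) ? path : name;
}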
@migueldeicaza Do you have an opinion about the registration API vs. the resolve event?
cc @luqunl @AaronRobinsonMSFT @jeffschwMSFT
We are pulling through a design for this feature right now.
cc @annaaniol
I am much more inclined to have an event mechanism. I have always loathed DB connection strings, and having a DSL for this kind of thing quickly gets out of control. An event API for loading avoids these issues, and users can perform _any_ logic they desire without waiting for an update to the DSL.
If an event system is used, similar to ResolveAssembly (possibly ResolveDllImport?), what is it going to return to the caller? ResolveAssembly expects a loaded and resolved assembly, but without some sort of wrapper around a library, are we limited to returning the path? If so, do we want to just hand off a path to the runtime, or would some ability to track the lifetime of what we pass back be useful?
For example, if you know that your library is only going to be needed for a couple of calls at setup, you might want to allow that library to be unloaded. I'm not aware that there's any way to do that currently, and with a path it wouldn't be possible in the future.
The event resolving system is nice, but also too cumbersome for most uses.
Perhaps we can provide a simple API on top of it?
+1 to @migueldeicaza's approach. Our current design enables eventing in the runtime, and we plan to provide a higher-level API for ease of use.
@migueldeicaza What part of an eventing mechanism is too cumbersome? Eventing is a well-known mechanism in C# and, as @Wraith2 pointed out, the AssemblyResolve event already exists; this would really be the native parallel of that mechanism.
@Wraith2 The event signature itself would need to be iterated on, but I can foresee an approach that has the handler set the path and/or the name of the function to use. Given these arguments the runtime would respect those and just go with it.
I can foresee an approach that has the handler set the path and/or the name of the function to use
I think the handler should return the native library handle (IntPtr or something that wraps it). We should have a method that wraps the default platform LoadLibrary/dlopen, takes a path, and returns a handle. The simple implementations of the handler can use this method. The more complex cases that need to use platform-specific features required by some libraries can get the handle by P/Invoking the platform-specific API (e.g. LoadLibraryEx with flags or SxS context on Windows; or dlopen flags like RTLD_GLOBAL on Unix).
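A minimal sketch of that shape (the handler hook and the NativeLoader.Load wrapper are hypothetical; dlopen and its glibc flag values are real):

using System;
using System.Runtime.InteropServices;

static class NativeResolveHandler
{
    // glibc flag values for dlopen.
    const int RTLD_NOW = 2, RTLD_GLOBAL = 0x100;

    [DllImport ("libdl")]
    static extern IntPtr dlopen (string path, int flags);

    // Hypothetical handler: return an OS library handle for the given
    // DllImport name, or IntPtr.Zero to fall back to default probing.
    public static IntPtr OnResolveNativeLibrary (string libraryName)
    {
        // A simple case would call the proposed wrapper over LoadLibrary/dlopen:
        //   return NativeLoader.Load ("/usr/lib/libncurses.so.6");

        // A complex case P/Invokes the platform API directly to pass RTLD_GLOBAL:
        if (libraryName == "plugin")
            return dlopen ("/opt/plugins/libplugin.so", RTLD_NOW | RTLD_GLOBAL);

        return IntPtr.Zero;
    }
}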
If this kind of API is going to be added, are there any thoughts on adding a slightly more flexible API that allows the resolver to return the actual address of a function instead? For example, on Windows this could be accomplished by LoadLibrary + GetProcAddress. This can be useful in situations where functions that aren't exported, or that are returned by other functions, need to be called. It is not always possible to use the marshaler for this (for example, if the function is variadic).
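A sketch of that pattern as it can be done by hand today on Windows (these are all real Win32 APIs; dlopen/dlsym is the Unix equivalent), which is exactly the machinery such a resolver could short-circuit:

using System;
using System.Runtime.InteropServices;

static class FunctionAddressExample
{
    [DllImport ("kernel32", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr LoadLibraryW (string path);

    [DllImport ("kernel32", SetLastError = true)]
    static extern IntPtr GetProcAddress (IntPtr module, string name);

    delegate uint GetTickCountDelegate ();

    static void Demo ()
    {
        IntPtr module = LoadLibraryW ("kernel32.dll");
        IntPtr address = GetProcAddress (module, "GetTickCount");

        // A non-variadic function address can be wrapped in a delegate;
        // a variadic one is exactly the case the marshaler cannot handle.
        var getTickCount = Marshal.GetDelegateForFunctionPointer<GetTickCountDelegate> (address);
        Console.WriteLine (getTickCount ());
    }
}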
@jkotas The whole native library thing just doesn't seem to be the real problem - at least how it has been explained to me. The issue is library path discovery, which is really 'find this library path using CLR lookup logic'. That issue could be solved with a Path API that offers up a path computed by the system, similar in spirit to PathCchCanonicalizeEx. In this case the caller can determine what to do with that path: either offer it back to the CLR to use, or load it themselves with a P/Invoke to LoadLibraryEx() or dlopen(). This way the contract is clear to the user that they own the lifetime of this library and the CLR will _not_ do anything with it. If we provided hooks into the PInvoke logic and had the user create a NativeLibrary, then it is entirely possible for them to 'assume' the library lifetime is theirs and that it could be unloaded, or some other operation could occur that may break CLR assumptions.
Overall I see little value in providing a native library loading API when all that users appear to be after is "find this like the CLR does". It is lower level and I appreciate that, but the path API does provide a solid v1, and IF we get a large contingent of users that want a NativeLibrary so much that someone creates a NuGet package for it, well then let's have that conversation and see if adding it to the API surface is warranted.
@migueldeicaza as @jkotas said in his first reply, there are two ways to do it, and the declarative approach can be built on the eventing approach, so I assumed it would be done that way, providing both.
My only problem with the extended declarative syntax is that I don't like magic interpreted strings. I'd rather that logic be split out in some way, perhaps something like [DllImportHint(Platform.MacOS, "libc")] decorating the same method.
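That hypothetical attribute (not a real API, purely illustrating the suggestion) might look like:

using System;

// Hypothetical: a declarative per-platform hint evaluated at resolution time,
// instead of an interpreted string DSL.
enum Platform { Windows, Linux, MacOS }

[AttributeUsage (AttributeTargets.Method, AllowMultiple = true)]
sealed class DllImportHintAttribute : Attribute
{
    public DllImportHintAttribute (Platform platform, string libraryName)
    {
        Platform = platform;
        LibraryName = libraryName;
    }

    public Platform Platform { get; }
    public string LibraryName { get; }
}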
I thought that being able to unload a native library once it is no longer needed would be a desirable possibility, which isn't needed for this issue but could be developed in the future. A NativeLibrary type which wraps a library with a default implementation in the PAL for each supported platform, and exposes the module handle as an IntPtr or similar, would be a step towards enabling this.
The whole native library thing just doesn't seem to be the real problem
There are the simple cases that are just about the path discovery, and then there are the more complex cases. I have seen the more complex cases a number of times. Here are a few examples:
RTLD_GLOBAL: https://github.com/dotnet/coreclr/pull/18628 has an example.
Overall I see little value in providing a native library loading API
We do have this API already as part of AssemblyLoadContext. We should consider how to reconcile what we have already with this design.
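For reference, the existing hook lives on AssemblyLoadContext: a custom context can override LoadUnmanagedDll to map a DllImport name to a concrete path for assemblies loaded through it (a sketch, reusing the curses name from the original example):

using System;
using System.Reflection;
using System.Runtime.Loader;

class CursesLoadContext : AssemblyLoadContext
{
    protected override Assembly Load (AssemblyName name) => null; // default assembly probing

    protected override IntPtr LoadUnmanagedDll (string unmanagedDllName)
    {
        if (unmanagedDllName == "curses")
            return LoadUnmanagedDllFromPath ("/usr/lib/libncurses.so.6");

        return IntPtr.Zero; // fall back to the default native library resolution
    }
}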
@jkotas I do remember commenting on dotnet/coreclr#18628 and that issue should definitely be addressed. The probing issue could be addressed by the path API without much issue. Fully agree with reconciling with AssemblyLoadContext - which I didn't even know existed. Thanks.
Oh please, yes. Really the only problem that needs to be solved (IMO) is that the DllImport path is constant, and I want to create that string at runtime. Sometimes I want to select between a 32-bit and 64-bit lib based on the current runtime. Other times, I want to use different library names on different OSes / distros. I don't care if you use callbacks or Miguel's suggestion, but please just let me generate that string at runtime.
While I used dllmap with Mono in the past, I must say I haven't looked back since netcoreapp/net471 SDK-based projects added support for NuGet packages with native/managed dependencies per RID (runtimes/), as those supersede the functionality dllmap ever had to offer...
What would be the motivation of using this over the current nuget/native package?
On Linux it is more appropriate and customary to use system-provided shared libraries. Resolving the correct one though often has to happen at runtime, and can't be hard-coded into a dllimport.
In Windows, I have a managed lib that selects either a 32-bit or 64-bit DLL at runtime. Currently I'm setting the PATH variable and putting them in separate folders - a hack that only works on Windows and isn't solved by nuget.
But it IS solved by (newer) nuget, all of it...
https://docs.microsoft.com/en-us/nuget/create-packages/supporting-multiple-target-frameworks#architecture-specific-folders
That is a compile-time solution which doesn't enable single-package, multiple-environment deployments. It's a solution to a different problem.
I am not sure I follow: you can use this with dotnet run and it will magically load the right dependency. Also with dotnet publish (fdd).
It even works with dotnet global tools, as they pack all RIDs and resolution happens at runtime (similar to fdd publish).
Are you referring to the ability of the end user to edit the XML to fiddle with dll mapping, in case an app is used on a new, not-yet-supported OS with a need for custom mapping?
I'm having a bit of a hard time seeing the use case for it under netcoreapp (unlike .NET Framework, where it could help quite a lot), but to each their own I guess...
@damageboy Have you even read the first comment? If a native library has a different name in different Unix/Linux distributions, and that native library is part of the OS and can't be added to the NuGet package, how would your solution even work?
@damageboy consider packaging scenarios. I create an app and compile a self-contained or native version for distribution. In this scenario the end user running the application isn't a developer and shouldn't be doing NuGet package restores.
If we package all the various flavours of native dependency (native.dll, native.so, native.dylib) and then have the program use what we've discussed above to determine the correct one to load, then there is a single binary distribution with no complex setup to go wrong.
@wanton7 yes I did read it, and I'm glad you asked, because the answer to that is a resounding yes: not only does it help, it also supersedes the <dllmap> approach.
In the case of architecture-specific folders you can essentially employ a bait-and-switch tactic, where your code is compiled against a certain assembly and you get a different managed assembly at dotnet run or dotnet publish (fdd/scd).
What <dllmap> does solve can be equally solved by generating different managed versions of the code that have different [DllImport()] attributes to deal with the .so/.dll name variations per OS / OS version.
But it doesn't just stop there...
The arch-specific managed assembly approach is by far, in my view, superior to <dllmap>, since it also covers the next pain point, which is often struct layouts / bitness.
What happens if you need a different [StructLayout()] / [FieldOffset()] for a structure shared between the native code and the managed code? With <dllmap>, you go back to square one on that front.
With the arch-specific approach (e.g. the bait and switch of managed assemblies per RID) you actually have a way to provide, for example, a different struct definition per RID.
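A hypothetical illustration of that (struct and field names invented; in a real package both variants would share one name and namespace, each compiled into its own RID-specific assembly, and the namespaces below exist only so both fit in one listing):

using System;
using System.Runtime.InteropServices;

namespace Rid.Ubuntu_18_04_x64
{
    // Variant shipped in runtimes/ubuntu.18.04-x64/lib/...
    [StructLayout (LayoutKind.Sequential)]
    struct native_options
    {
        public uint dict_size;     // matches the 64-bit Linux native layout
        public IntPtr preset_dict;
    }
}

namespace Rid.Win7_x86
{
    // Variant shipped in runtimes/win7-x86/lib/..., where the native layout differs.
    [StructLayout (LayoutKind.Explicit, Size = 8)]
    struct native_options
    {
        [FieldOffset (0)] public uint dict_size;
        [FieldOffset (4)] public IntPtr preset_dict; // IntPtr is 4 bytes on x86
    }
}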
@wraith2 unless I'm making a complete ass of myself, you are describing what dotnet publish already does.
I've just packaged one of my own tools as a dotnet global tool inside our company, which has a complex dependency exactly like the one you've just described (on liblzma.{dll,so,dylib}), and that single packaged nupkg can be installed and run on Windows/Ubuntu right now (haven't tested OSX personally, so I won't comment on it).
@wanton7 also, as a side note, I know you didn't mean it that way, but still:
This is NOT MY solution, this is a solution that Microsoft did a great job of implementing and shipping in production at least since .NET Core 2.0.
What they've done with less stellar success is documenting / advocating / talking about it: what both you and @Wraith2 think cannot be done today without <dllmap> is actually working right now on my machine.
@damageboy architecture is not OS. You don't understand this problem at all. Let's say you create a game for Steam on Linux using .NET Core. Tell me how your approach would support all those different Linux distros from one install? This can't be done properly at build time, it just can't.
It's not just me and @Wraith2 having this problem; @migueldeicaza, who published this issue, works for Microsoft and is the creator of Mono. But if you really think you know better, then please educate us.
@damageboy let me still try to break this to you a little bit. The example you are describing with liblzma.{dll,so,dylib} is a situation where you as a developer are in control of the names of those libraries. It's a very different situation; the actual problem is that you can't include these native libraries in a NuGet package, because they are part of the operating system and you can't control their names. Their names could be different in different OS distros/versions, like liblzma_6.so, liblzma_7.so, and so on.
@wanton7 so I must be a magician, since I managed to support the exact use case you described with the current tools.
My internal compression nupkg (sorry, this is not public yet) achieves exactly this; here's the unzipped directory listing of its nupkg:
.
├── daemaged.compression.0.0.4.nupkg
├── daemaged.compression.0.0.4.nupkg.sha512
├── daemaged.compression.nuspec
├── lib
│   └── netcoreapp2.0
│       └── Daemaged.Compression.dll
└── runtimes
    ├── ubuntu.14.04-x64
    │   └── lib
    │       └── netcoreapp2.0
    │           └── Daemaged.Compression.dll
    ├── ubuntu.16.04-x64
    │   └── lib
    │       └── netcoreapp2.0
    │           └── Daemaged.Compression.dll
    ├── ubuntu.18.04-x64
    │   └── lib
    │       └── netcoreapp2.0
    │           └── Daemaged.Compression.dll
    ├── win7-x64
    │   ├── lib
    │   │   └── netcoreapp2.0
    │   │       └── Daemaged.Compression.dll
    │   └── native
    │       ├── libbz2.dll
    │       ├── liblzma.dll
    │       ├── liblzo2.dll
    │       └── libz.dll
    └── win7-x86
        ├── lib
        │   └── netcoreapp2.0
        │       └── Daemaged.Compression.dll
        └── native
            ├── libbz2.dll
            ├── liblzma.dll
            ├── liblzo2.dll
            └── libz.dll
Note the different versions of Daemaged.Compression.dll inside the ubuntu/win7 folders. Each of those contains a different variation of the P/Invoke signatures into the OS-provided shared objects (liblzma_6.so / liblzma_7.so in your example, though you must have meant liblzma.so.6 / liblzma.so.7, as they would actually appear in a Linux distro), as well as, if/when needed, a different [StructLayout()] for the way the struct looks on that specific version of the OS. There are no native libraries provided by the nupkg for THOSE operating systems.
When applications consuming THIS nupkg are published with dotnet publish, it's the responsibility of the CLR to load the right managed assemblies and native libraries.
I personally use this for packaging dotnet global tools and doing SCD-style deployments (dotnet publish -r win7-x64).
@damageboy yes, you must be a magician, because having to build your own assembly for every supported OS sounds magical.
It's not just me and @Wraith2 having this problem; @migueldeicaza, who published this issue, works for Microsoft and is the creator of Mono. But if you really think you know better, then please educate us.
@wanton7 The reason I am writing all of this is that if you scroll back to my first post, I actually ended it with:
What would be the motivation of using this over the current nuget/native package?
Note the use of a question mark, which implies I understand the context of where I am and am actually trying to learn what this specific feature offers to netcore developers.
I don't feel I've heard a compelling answer yet, and I'm not trying to dissuade anyone, especially not the creator of Mono, out of anything; I am genuinely curious to find out the answer to my question.
I think I see the disconnect. You're talking about making an OS-agnostic nuget package that can be consumed by any OS-specific app and it'll work. We're talking about making an OS-agnostic app.
@jherby2k maybe that is the disconnect, but I am using this technique to publish OS-agnostic apps.
It's true that the arch-specific nuget approach doesn't work for a single assembly that is both the app and the assembly doing the p/invoke... then again, if you separate the p/invokes into their own assembly, everything does fall into place.
Is this what <dllmap> is all about for .NET Core? The ability to achieve OS-agnostic apps without dependencies?
Note that I do understand the magic that <dllmap> is for the older framework applications, and I have happily used it before in that setting; it was definitely a nice tool / utility to have back then / there.
@damageboy I explained and showed the scenario in the original issue.
But some additional detail:
gui.cs library: need to consume a system library which, depending on the system, can be "curses", "ncurses" or "ncursesVERSION", and furthermore the version can be 5 or 6. Bonus points: Linux distributions by default do not install the default symlink unless you install the -dev package, so more often than not people need to install a development package to get this capability. In the "ncursesVERSION" case it is desirable to load version 6 if available, and fall back to 5 otherwise.
TensorFlowSharp: need to dynamically choose which version of the native library to use (CPU, GPU, which CPU+optimizations and GPU+optimizations to use), right now, I just hardcode one and hope that users overwrite the native library manually.
Again, by no means exhaustive, just a harsh reminder that there are workarounds available which are painful to use (again, as mentioned in the original issue, the Grpc library achieves this).
This scenario was addressed by NativeLibrary APIs added in .NET Core 3.0.
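For example, a minimal sketch of the gui.cs fallback logic using those APIs (the exact library file names will vary by distribution):

using System;
using System.Reflection;
using System.Runtime.InteropServices;

// .NET Core 3.0: register a resolver for DllImports in this assembly and
// prefer ncurses 6, falling back to 5.
static class CursesResolver
{
    public static void Install () =>
        NativeLibrary.SetDllImportResolver (typeof (CursesResolver).Assembly, Resolve);

    static IntPtr Resolve (string libraryName, Assembly assembly, DllImportSearchPath? searchPath)
    {
        if (libraryName != "ncurses")
            return IntPtr.Zero; // defer to the default resolution logic

        if (NativeLibrary.TryLoad ("libncurses.so.6", out IntPtr handle) ||
            NativeLibrary.TryLoad ("libncurses.so.5", out handle))
            return handle;

        return IntPtr.Zero;
    }
}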