Runtime: Custom internal calls in .NET Core hosting

Created on 31 Jan 2019 · 16 Comments · Source: dotnet/runtime

P/Invoke is not as efficient as internal calls when native code is called frequently, for instance in a game scripting runtime.

Internal calls in CoreCLR are hard-coded in ecalllist.h and limited to the scope of mscorlib.dll, whereas Mono provides a mono_add_internal_call API, which became the first choice of Unity and CRYENGINE.

Will you provide an API to register custom internal calls?

area-Interop-coreclr

All 16 comments

Would the calli proposal in the csharp "Compiler Intrinsics" https://github.com/dotnet/csharplang/blob/master/proposals/intrinsics.md#calli fit this need? (Though I don't know if that's the latest variant)

/cc @jaredpar

@benaadams Unfortunately, calli instruction may not be faster than P/Invoke.

I haven't tested it myself, but someone has:
https://ybeernet.blogspot.com/2011/03/techniques-of-calling-unmanaged-code.html

Behaviour should be improved in coreclr https://github.com/dotnet/coreclr/pull/13756

(Hopefully I'm not completely wrong with this reply; this is what I've figured out from looking at the CLR source code.)

One issue I can see is that all the internal calls inside the runtime have to do a few things to remain safe. For example, look at DebugStackTrace::GetStackFramesInternal (wired up here), in particular its first few lines: https://github.com/dotnet/coreclr/blob/83fcf2e552d190892d9ed7264ebef94b379f1f11/src/vm/debugdebugger.cpp#L333-L347

It has to play nicely with the GC (GCPROTECT_BEGIN), erect a frame to make stack walking work, and a few other things. See this gist for the expanded versions of some of these macros.

I assume that all of this is taken care of, in the generated stubs, when using P/Invoke. But I guess that it would need to happen in any native code that the runtime called into, otherwise 'bad things would happen!'

For a bit more info see Calling from managed to native code, Is your code GC-safe? and PInvokes

We currently use P/Invoke to call into things like MKL and the overhead does show up in backtraces. We only pass unsafe pointers to unmanaged code objects so if there was a faster way to invoke native code that:

1) Didn't need marshalling
2) Never called back into .NET
3) Didn't manipulate any .NET objects

then we'd be extremely happy!
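To make the three conditions above concrete, here is a minimal sketch in C of the kind of native routine being described. The function name `dot_product` and its signature are hypothetical (nothing like it appears in this thread, and it is not MKL's API); it only illustrates code that takes raw pointers, needs no marshalling, never calls back into .NET, and touches no managed objects.

```c
#include <stddef.h>

/*
 * Hypothetical native kernel (illustrative name and signature).
 * - Arguments are raw pointers and a length, so nothing needs marshalling.
 * - It never calls back into the managed runtime.
 * - It reads only caller-provided unmanaged buffers.
 */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

A routine of this shape is short, does not block, and takes no locks, which is presumably why it is a natural candidate for a cheaper managed-to-native transition.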

I think the proper way to address this is to add an attribute to annotate PInvoke methods that always take less than a microsecond and that do not do other problematic things, like taking locks that can deadlock with the GC. The runtime would recognize this attribute and skip the full PInvoke transition for these methods.

More details in https://github.com/dotnet/coreclr/pull/22383#issuecomment-461429171

Thanks for the information, @benaadams. I did a simple test, and the calli instruction turned out to be the best solution for now.

Interestingly, Math.Sqrt(Double) runs as fast as Convert.ToDouble(Double).

(Line chart of the test results; image not preserved.)

Not that I recommend this for your code if you're satisfied with the performance, but you can use calli with the managed calling convention to get better performance at the expense of delaying garbage collection. I did not see that in the benchmark you posted above.

On x64 the managed calling convention is the native x64 calling convention, and on Windows x86 I believe it is __fastcall.

You can read more about this in 15.5.6.3 Fast calls to unmanaged code in Partition II: Metadata Definition and Semantics (With Added Microsoft Specific Implementation Notes)

I think the proper way to address this is to add attribute to annotate PInvoke methods ...

I honestly don't know why delegates retrieved by Marshal.GetDelegateForFunctionPointer are so much slower.

calli instruction turned out to be the best solution for now.

The difference between calli and PInvoke is noise. I would recommend choosing the one out of these two that gives you the most maintainable code. The performance gains you get by being able to maintain and refactor your code easily will be much higher than what you get from calli vs. PInvoke.

Marshal.GetDelegateForFunctionPointer will be a lot slower.

The delegate has an extra indirection in it. This indirection costs extra instructions.
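As a rough analogy only, the extra indirection can be pictured in plain C. This is not how CoreCLR lays out or invokes delegates; the `callable` struct and all function names below are hypothetical. It merely contrasts calling a function pointer directly with first loading the target out of a wrapper object.

```c
/* Function pointer type shared by both call paths. */
typedef int (*binary_fn)(int, int);

/* Hypothetical stand-in for a delegate-like wrapper object. */
typedef struct callable {
    binary_fn target;
} callable;

int add_ints(int a, int b)
{
    return a + b;
}

/* Direct path: the function address is already in hand. */
int call_direct(binary_fn fn, int a, int b)
{
    return fn(a, b);
}

/* Indirect path: one extra load to fetch the target from the object. */
int call_via_object(const callable *c, int a, int b)
{
    return c->target(a, b);
}
```

Both paths return the same result; the second simply executes more instructions per call, which matters only when the callee itself is very cheap.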

15.5.6.3 Fast calls to unmanaged code

If you consider doing something like this, make sure that the unmanaged code you are calling meets all the constraints listed in the doc.

What I actually need is to call function pointers provided by unmanaged code at runtime, which cannot be retrieved by NativeLibrary.Load and NativeLibrary.GetExport. So P/Invoke doesn't work in this scenario, Marshal.GetDelegateForFunctionPointer seems to be slow and cumbersome, and calli is unsafe... I wonder whether there is a better workaround.
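For readers unfamiliar with this setup, the following is a minimal sketch in C of what "function pointers provided by unmanaged code at runtime" might look like on the host side. Every name here (`host_api`, `host_get_api`, and the two entry points) is hypothetical and is not taken from any engine or from this thread. The assumption is that the host hands the managed side a pointer to such a table, and the managed side then invokes each entry through calli or Marshal.GetDelegateForFunctionPointer.

```c
#include <stddef.h>

/* Hypothetical table of entry points that a native host fills in. */
typedef struct host_api {
    int  (*add)(int a, int b);
    void (*set_position)(float *xyz, float x, float y, float z);
} host_api;

static int host_add(int a, int b)
{
    return a + b;
}

static void host_set_position(float *xyz, float x, float y, float z)
{
    xyz[0] = x;
    xyz[1] = y;
    xyz[2] = z;
}

/*
 * Returns the table. These functions are not exported symbols, so they
 * cannot be looked up by name; only the pointers in the table reach them.
 */
const host_api *host_get_api(void)
{
    static const host_api api = { host_add, host_set_position };
    return &api;
}
```

Because the entries are static functions with no exported names, a name-based lookup has nothing to find, which is the situation the comment above describes.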

Why can't you use calli with the unmanaged calling convention? That will solve your problem without having to worry about the constraints and GC delaying of using a managed calling convention.

Not that I recommend this for your code

You said you don't recommend this?

calli with the managed calling convention, described in 15.5.6.3 Fast calls to unmanaged code, is the one that I do not recommend without understanding the constraints.

calli with an unmanaged calling convention, like cdecl, stdcall, etc., still erects a PInvoke frame and is as safe as .... well, calling into any unmanaged code is, so that would work for the scenario you describe.

As an experiment I tested calli with managed calling convention. It's super fast...

Results still in https://github.com/dotnet/coreclr/issues/22320#issuecomment-466736048

@NextTurn I don't see any current action here so am closing this. Please file a new issue if there is some action that is being suggested or asked about. Thank you.
