I don't think it's actually a _suggestion_, so I'm filing it as a compiler issue.
As @rspeele mentions in comment to TaskBuilder.fs repo, currently typedefof looks rather inefficient. Compare the following C# and the following F# snippets:
using System.Collections.Generic;
public class C {
public void M() {
var x = typeof(List<>);
}
}
open System.Collections.Generic
type C() =
member this.M() =
let x = typedefof<List<_>>
()
C# compiles to the following CIL:
ldtoken [mscorlib]System.Collections.Generic.List`1
call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
stloc.0
While F# compiles to the following:
ldtoken class [mscorlib]System.Collections.Generic.List`1<object>
call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle(valuetype [mscorlib]System.RuntimeTypeHandle)
stloc.1
ldloc.1
callvirt instance bool [mscorlib]System.Type::get_IsGenericType()
brfalse.s IL_001b
ldloc.1
callvirt instance class [mscorlib]System.Type [mscorlib]System.Type::GetGenericTypeDefinition()
br.s IL_001c
IL_001b: ldloc.1
IL_001c: stloc.0
which is an equivalent of
Type typeFromHandle = typeof(List<object>);
Type type = (!typeFromHandle.IsGenericType) ? typeFromHandle : typeFromHandle.GetGenericTypeDefinition()
Could we improve that? Are there any issues I miss with plain ldtoken that require the current approach?
If you use let x = typeof<List<_>> instead of typedefof the IL is the same as in C#. My original question in TaskBuilder issue was about which one to pick to make sure JIT-time constant magic doesn't break. Since it is JIT work, typeof should be fine. Much bigger issue would be if that didn't work from F#.
I created a benchmark for that. If writing simple F# code like below it is 100+ time slower than C#, because the method equality uses F# GenericEquality and when I just want to generate 10 millions longs profiler reminds me about my old SO question. Also the ULong64 case in F# generates many GC0, which indicates that casting is not optimized away by JIT.
type Worker() =
[<MethodImplAttribute(MethodImplOptions.AggressiveInlining)>]
member this.Work<'T>(x: 'T) : 'T =
if typeof<'T> = typeof<uint64> then
let d : uint64 = unbox(box(x))
unbox(box(d))
else
x
main
F#
Case | MOPS | Elapsed | GC0 | GC1 | GC2 | Memory
--------- |--------:|---------:|------:|------:|------:|--------:
F# Long | 45.46 | 22 ms | 0.0 | 0.0 | 0.0 | 0.008 MB
F# Ulong | 37.04 | 27 ms | 3.0 | 0.0 | 0.0 | 4.891 MB
Main
C#
Case | MOPS | Elapsed | GC0 | GC1 | GC2 | Memory
--------- |--------:|---------:|------:|------:|------:|--------:
C# Ulong | 5000.00 | 2 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
C# Long | 5000.00 | 2 ms | 0.0 | 0.0 | 0.0 | 0.000 MB

Then I changed the code to
type Worker() =
[<MethodImplAttribute(MethodImplOptions.AggressiveInlining)>]
member this.Work<'T>(x: 'T) : 'T =
if obj.ReferenceEquals(typeof<'T>, typeof<uint64>) then
let d : uint64 = unbox(box(x))
unbox(box(d))
else
x
and found that the method is not inlined for some reason even though it has the attribute (I believe the issue on this attributed was merged long time ago)

But adding inline to the F# method makes it exactly the same as C# without allocations etc.:
type Worker() =
[<MethodImplAttribute(MethodImplOptions.AggressiveInlining)>]
member inline this.Work<'T>(x: 'T) : 'T =
if obj.ReferenceEquals(typeof<'T>, typeof<uint64>) then
let d : uint64 = unbox(box(x))
unbox(box(d + 1UL))
else
x
main
F#
Case | MOPS | Elapsed | GC0 | GC1 | GC2 | Memory
--------- |--------:|---------:|------:|------:|------:|--------:
F# Long | 2500.00 | 4 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
F# Ulong | 2500.00 | 4 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
Main
C#
Case | MOPS | Elapsed | GC0 | GC1 | GC2 | Memory
--------- |--------:|---------:|------:|------:|------:|--------:
C# Ulong | 2500.00 | 4 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
C# Long | 2500.00 | 4 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
But increasing the number of operations for more precise measurement reveals that F# does less operations per second in both cases and the difference to C# is similar to the difference between the two cases (they differ by ULong addition) so likely the if branch is not eliminated in F#. I do not want to disassembly and dig deeper for now, probably @AdamSitnik could help to show how to use disassembly with BenchmarkDotNet.
main
F#
Case | MOPS | Elapsed | GC0 | GC1 | GC2 | Memory
--------- |--------:|---------:|------:|------:|------:|--------:
F# Long | 2173.91 | 46 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
F# Ulong | 1923.08 | 52 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
Main
C#
Case | MOPS | Elapsed | GC0 | GC1 | GC2 | Memory
--------- |--------:|---------:|------:|------:|------:|--------:
C# Ulong | 2325.58 | 43 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
C# Long | 2272.73 | 44 ms | 0.0 | 0.0 | 0.0 | 0.000 MB
The C# code is as simple/idiomatic as one could write, no special tricks are needed.
class Worker
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public T Work<T>(T x)
{
if (typeof(T) == typeof(ulong))
{
var u = (ulong)(object)x;
u = u + 1;
return (T)(object)u;
}
else
{
return x;
}
}
}
Update:
I have looked into the IL and found that C# generates equality using System.Type::op_Equality where in F# I used obj.ReferenceEquals. How to make F# generate the same? When I use simple .Equals it generates a method call and the benchmark becomes many times slower.
call bool [System.Runtime]System.Type::op_Equality(class [System.Runtime]System.Type, class [System.Runtime]System.Type)
Hi, perhaps I misunderstand your meaning with this statement:
If you use let x = typeof
> instead of typedefof the IL is the same as in C#
But F#'s typeof<List<_>> is equivalent to C#'s typeof(List<object>).
It's not a replacement for C#'s typeof(List<>), which is why in F# you must use either typedefof<List<_>> or typeof<List<_>>.GetGenericTypeDefinition(). In both of these scenarios, again, the <_> is effectively <obj>, it's just idiomatic to use _.
I should mention regardless of this being changed I still probably wouldn't want to do reflection-based checks to return cached bool tasks in TaskBuilder.fs, just because it adds complexity and there would always be some overhead to it in the majority of cases where _not_ returning bools. But _this_ issue is a separate thing and I must admit there have been times when doing heavy reflection code in F# that I have (probably as an unnecessary premature optimization) chosen to save the result of typedefof in a variable.
@rspeele in the original issue my focus was on JIT compile-time constant and if branch elimination. If it worked then...
there would always be some overhead to it in the majority of cases where not returning bools.
... there would be exactly zero overhead because JIT compiler just eliminates the wrong branches entirely (details here). I promised some benchmarks in the original issue and just posted it there.
@buybackoff I guess this issue https://github.com/Microsoft/visualfsharp/issues/526 still haunt us. I guess of the performance related issues comes from unwanted heap allocations (or even stack because of the excessive struct copy due the inability to pass struct's by reference,hopefully this will change with Span<'T> and inref<'T>).
F# should have some kind of infrastructure (even better: constraints or even attributes something like the roslyn checker provides!) to check the heap / stack allocations - perferrably by the compiler. It it possible to calculate the heap / stack allocation based on the TAST and the generated IL? While for functional languages It's an active research area [1], even a basic or a Roslyn HeapAllocationAnalyzer like tool will help a lot due the quick feedback loop to fix existing F# problems like this issue (it also helps that we know it's possible to solve it, because C# already have these tricks):
https://github.com/Microsoft/RoslynClrHeapAllocationAnalyzer
Also take a look at this "constraint" for C# as "The proposal is to declare a local function as non-capturing. The keyword to indicate that would be static":
https://github.com/dotnet/csharplang/issues/1565
[1] http://www.raml.co/
[2] https://arxiv.org/abs/1801.01896
I have looked into the IL and found that C# generates equality using System.Type::op_Equality where in F# I used obj.ReferenceEquals. How to make F# generate the same?
@buybackoff What happens when you use System.Type.op_Equality explicitly, e.g.
if System.Type.op_Equality(typeof<'T>, typeof<uint64>) then
...
Note that open FSharp.Core.Operators.NonStructuralComparison may be sensible for this kind of low-allocation code in any case. In that case, I believe op_Equality will always be called directly for all uses of a = b but of course you will lose structural quality calls and won't be able to use = on some types
@dsyme yep, this removes the difference (and actually F# becomes slightly faster). Haven't looked into IL.
But what is the deal with inlining here? Is F# compiler or JIT does the work when the inline is present on the method?
But what is the deal with inlining here? Is F# compiler or JIT does the work when the inline is present on the method?
The F# compiler does the work for inline. The JIT does the work for the AggressiveInlining attribute if there are any remaining calls to the method (which there won't be in F# code).
@dsyme I formulated the question incorrectly. When F# inlines the method (I realize that this is like literally pasting IL code to a call site) then boxing disappears. Does F# compiler eliminate the unbox(box(x)) part when types are the same or JIT? And why do you thing this method is not inlined with the AggressiveInlining attribute?
@dsyme I formulated the question incorrectly. When F# inlines the method (I realize that this is like literally pasting IL code to a call site) then boxing disappears. Does F# compiler eliminate the unbox(box(x)) part when types are the same or JIT?
I think the JIT will. F# doesn't. The inlining will also type-specialise the code and that might help the JIT.
And why do you thing this method is not inlined with the AggressiveInlining attribute?
No idea I'm afraid. I'd just be guessing.
FYI:
"Also, there are a number of reasons why methods with AggressiveInlining might not get inlined. If you can drop a checked clrjit.dll into your dotnet install, you can set COMPlus_JitPrintInlinedMethods to get a log of inlining decisions with failure reasons (this is less voluminous than a full jit dump)."
https://github.com/dotnet/coreclr/issues/5996#issuecomment-228467831
There is an optimization pass to detect box + isinst: (https://github.com/dotnet/coreclr/issues/12877 + https://github.com/dotnet/coreclr/issues/14472) https://github.com/dotnet/coreclr/blob/master/src/jit/compiler.h#L3153 however I did not yet found a box + unbox optimization pass in compiler.h / compiler.cpp.
@buybackoff last time I checked the if typeof<'T> = typeof
TypeofStructDU | 342.482 ns | 3.3741 ns | 3.1561 ns |
[<MethodImpl(MethodImplOptions.AggressiveInlining)>]
let inline matchTypeofStructDU(x: 'T) (v: int) =
if typeof<StructCase1> = typeof<'T> then
Baseline.Invoke(v,1)
else if typeof<StructCase2> = typeof<'T> then
Baseline.Invoke(v,2)
else if typeof<StructCase3> = typeof<'T> then
Baseline.Invoke(v,3)
else if typeof<StructCase4> = typeof<'T> then
Baseline.Invoke(v,4)
else if typeof<StructCase5> = typeof<'T> then
Baseline.Invoke(v,5)
else if typeof<StructCase6> = typeof<'T> then
Baseline.Invoke(v,6)
else if typeof<StructCase7> = typeof<'T> then
Baseline.Invoke(v,7)
else if typeof<StructCase8> = typeof<'T> then
Baseline.Invoke(v,8)
else if typeof<StructCase9> = typeof<'T> then
Baseline.Invoke(v,9)
else if typeof<StructCase10> = typeof<'T> then
Baseline.Invoke(v,10)
else 0
let typeofStructDUInstance = (Unchecked.defaultof<StructCase10>)
[<MethodImpl(MethodImplOptions.NoInlining)>]
let typeofStructDU() = matchTypeofStructDU typeofStructDUInstance 1
Generated IL code snippet (full namespace retracted) for typeof as match:
.method assembly static int32 matchTypeofStructDU$cont@396(class [FSharp.Core]Microsoft.FSharp.Core.Unit unitVar) cil managed
{
.custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 )
// Code size 284 (0x11c)
.maxstack 4
IL_0000: ldtoken StructCase3
IL_0005: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_000a: ldtoken StructCase10
IL_000f: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0014: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_0019: brfalse.s IL_0023
IL_001b: ldc.i4.1
IL_001c: ldc.i4.3
IL_001d: call int32 Baseline::Invoke(int32,
int32)
IL_0022: ret
IL_0023: ldtoken StructCase4
IL_0028: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_002d: ldtoken StructCase10
IL_0032: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0037: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_003c: brfalse.s IL_0046
IL_003e: ldc.i4.1
IL_003f: ldc.i4.4
IL_0040: call int32 Baseline::Invoke(int32,
int32)
IL_0045: ret
IL_0046: ldtoken StructCase5
IL_004b: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0050: ldtoken StructCase10
IL_0055: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_005a: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_005f: brfalse.s IL_0069
IL_0061: ldc.i4.1
IL_0062: ldc.i4.5
IL_0063: call int32 Baseline::Invoke(int32,
int32)
IL_0068: ret
IL_0069: ldtoken StructCase6
IL_006e: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0073: ldtoken StructCase10
IL_0078: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_007d: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_0082: brfalse.s IL_008c
IL_0084: ldc.i4.1
IL_0085: ldc.i4.6
IL_0086: call int32 Baseline::Invoke(int32,
int32)
IL_008b: ret
IL_008c: ldtoken StructCase7
IL_0091: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0096: ldtoken StructCase10
IL_009b: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_00a0: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_00a5: brfalse.s IL_00af
IL_00a7: ldc.i4.1
IL_00a8: ldc.i4.7
IL_00a9: call int32 Baseline::Invoke(int32,
int32)
IL_00ae: ret
IL_00af: ldtoken StructCase8
IL_00b4: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_00b9: ldtoken StructCase10
IL_00be: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_00c3: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_00c8: brfalse.s IL_00d2
IL_00ca: ldc.i4.1
IL_00cb: ldc.i4.8
IL_00cc: call int32 Baseline::Invoke(int32,
int32)
IL_00d1: ret
IL_00d2: ldtoken StructCase9
IL_00d7: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_00dc: ldtoken StructCase10
IL_00e1: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_00e6: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_00eb: brfalse.s IL_00f6
IL_00ed: ldc.i4.1
IL_00ee: ldc.i4.s 9
IL_00f0: call int32 Baseline::Invoke(int32,
int32)
IL_00f5: ret
IL_00f6: ldtoken StructCase10
IL_00fb: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0100: ldtoken StructCase10
IL_0105: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_010a: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_010f: brfalse.s IL_011a
IL_0111: ldc.i4.1
IL_0112: ldc.i4.s 10
IL_0114: call int32 Baseline::Invoke(int32,
int32)
IL_0119: ret
IL_011a: ldc.i4.0
IL_011b: ret
} // end of method MatchBenchmark::matchTypeofStructDU$cont@396
Generated IL code snippet (full namespace retracted) for benchmark:
.method public static int32 typeofStructDU() cil managed noinlining
{
// Code size 77 (0x4d)
.maxstack 4
IL_0000: ldtoken StructCase1
IL_0005: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_000a: ldtoken StructCase10
IL_000f: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0014: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_0019: brfalse.s IL_0023
IL_001b: ldc.i4.1
IL_001c: ldc.i4.1
IL_001d: call int32 Baseline::Invoke(int32,
int32)
IL_0022: ret
IL_0023: ldtoken StructCase2
IL_0028: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_002d: ldtoken StructCase10
IL_0032: call class [System.Runtime]System.Type [System.Runtime]System.Type::GetTypeFromHandle(valuetype [System.Runtime]System.RuntimeTypeHandle)
IL_0037: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>(!!0,
!!0)
IL_003c: brfalse.s IL_0046
IL_003e: ldc.i4.1
IL_003f: ldc.i4.2
IL_0040: call int32 Baseline::Invoke(int32,
int32)
IL_0045: ret
IL_0046: ldnull
IL_0047: call int32 MatchBenchmark::matchTypeofStructDU$cont@396(class [FSharp.Core]Microsoft.FSharp.Core.Unit)
IL_004c: ret
} // end of method MatchBenchmark::typeofStructDU
@zpodlovics the point is that the same C# method is inlined by JIT and they use the same JIT... which probably means IL is different. But in such cases like this I prefer F# order, not a hint, to inline.
There are not so many reasons when a simple method is not inlined - the most common is exception throwing, like in this case:

The method is F#-intrinsic, but not JIT-intrisic, so it is a normal one and any throw 100% makes a method not a candidate for inlining (which is easy to avoid with ThrowHelper pattern).
@zpodlovics
a somewhat horrible and the slowest code
"horrible and the slowest code" is this part: typeof<StructCase2> = typeof<'T> which translates to bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<class [System.Runtime]System.Type>.
Try using what @dsyme showed above: if System.Type.op_Equality(typeof<'T>, typeof<uint64>) and JIT will happily recognize that this condition is JIT-time constant and will remove false branches completely.
@buybackoff I did not noticed that System.Type.op_Equality comment earlier, I will try it out.
It may worth to reconsider the exception handling in F# code base expecially in standard library and move out the slow path to helpers (throwhelpers). This could shrink the fastpath number of ILs, that could also improve the JIT inlining decisions, but that's a different story / issue.
Adjacent box / unbox.any is handled via a peephole optimization in the jit importer (look at importer.cpp around line 14890).
I tried looking at inlining decisions in your linked repro case but it appears that inlining of Worker.Work was done by the frontend compiler (even without the inline keyword).
The jit in .Net Core 2.1 has a number of enhancements to type equality opts and boxing opts so you might also give that a try, if you're still using 2.0.
It may worth to reconsider the exception handling in F# code base expecially in standard library and move out the slow path to helpers (throwhelpers). This could shrink the fastpath number of ILs, that could also improve the JIT inlining decisions, but that's a different story / issue.
This is a low-hanging fruit. Corefx uses this everywhere with interesting comments on why two nested method. Since @AndyAyersMS is here he could give authoritative answer if this pattern is worth it to change many LOCs
Just looked into Array.fs and found that exceptions are inlined with F# inline keyword, which is quite opposite to ThrowHelper pattern.
@buybackoff Exactly, this why I mentioned. It's assumed to be a low-hanging fruit, but I am afraid it will be probably the most critical realization for F# performance related issues. It's not accidental that CoreCLR/CoreFX/Kestrel code refactored to that style.
a) The most common execution path shrinked significantly (fastpath). The less code means lot more opportunity to inline (some slow code path could also prevent inlining due the complexity, size, etc) and the JIT will generate lot less code to execute that path.
b) The non-common path (slowpath) moved out to own method/function to avoid code bloat when the fastpath is inlinied.
c) The other non-common path (exception) moved out to own method/function to avoid code bloat when the fastpath is inlinied.
Inlining/NoInlining flags should be used wisely: https://github.com/dotnet/coreclr/issues/1270#issuecomment-180862050
The original reason why raise marked inline is ... to avoid warnings is code coverage tests (instead of fixing the code coverage tool). That was reasonably reasonable at that time, but the CoreCLR changed/improved significantly since then. Please see my late concerns about here: https://github.com/Microsoft/visualfsharp/issues/3121#issuecomment-372032278
There are some legitimate case when inlining (at IL level by the compiler) exception throwing is required for example NoDynamicInvocation case here: https://github.com/dotnet/coreclr/issues/1270#issuecomment-181767435
@zpodlovics but that is tedious work without much fun. I try to use ThrowHelper by default after I created such helper for myself, but usually it makes sense when a profiler shows that e.g. method call takes more time than some numeric calculation. But at least in F# Core many throwing functions are reused from helper modules and it's easier to change code there than everywhere.
@buybackoff I did the same in my code. Well, we can always introduce a an attribute for forced inline exception throwing (on raise) eg.: ForceInlineThrow (or something else), and the compiler do a transformation and generate helper static methods for raise code. But I am afraid this automatic tranformation may result a non-equivalent program and/or error prone (but the same could be true for inlining exception throwing).
I would say in CoreCLR/CoreFX where're kind of in an ok but funny place. We have some number of methods that throw manually converted over to use throw helpers. But not all. We'd really prefer that this transformation be done automatically, behind the scenes (via say the ILLinker). But we don't have that yet.
The pattern is:
[MethodImpl(MethodImplOptions.NoInlining)].The jit recognizes this pattern and does not inline the throw helper, and moves its call site to the "cold" section of the calling method's code (either end of method when jitting, or in a split off cold part when prejitting).
Note that because of resource lookup, boxing, etc the formatting of the exception message can be a fairly large chunk of code (and the trend is to make these messages be more detailed). So we also get overall size wins when throw helpers can be reused across a number of different methods.
Whether this is worth doing is hard to say -- if you have a number of frequently called methods that have conditional throws then it probably is helpful. You can be selective and only convert things gradually.
Not re: the original topic, but where it meandered to... #5307 affects this (sprinkling breadcrumbs...)
(Oh, and BTW, not sure if it was ever changed, but I remember looked many moons ago at AggressiveInlining in f#... From memory the MethodImplAttributes were handled as special cases, and AggressiveInlining were not passed through to the IL (probably the enum value was created after the fsharp compiler did that bit...), which might explain why it didn't work. From memory again I think I had a crack at adding it, and it wasn't particularly hard - not sure why I never posted a PR though... Hmmm. Maybe there was some issue...)
@manofstick Last time I checked AggressiveInlining was fine in F#. The decompiled IL (with ildasm) have the aggressiveinlining marker. Congratulation for your amazing work on equality!
[<MethodImpl(MethodImplOptions.AggressiveInlining)>]
static member op_Equality(this : SOption<'a>, other : SOption<'a>) =
this.IsSome = other.IsSome && this.Value = other.Value
.method public specialname static bool
op_Equality(valuetype Program/VOption`1<!a> this,
valuetype Program/VOption`1<!a> other) cil managed aggressiveinlining
{
// Code size 44 (0x2c)
.maxstack 4
.locals init (!a V_0,
!a V_1)
IL_0000: ldarga.s this
IL_0002: call instance bool valuetype Program/VOption`1<!a>::get_IsSome()
IL_0007: ldarga.s other
IL_0009: call instance bool valuetype Program/VOption`1<!a>::get_IsSome()
IL_000e: bne.un.s IL_002a
IL_0010: ldarga.s this
IL_0012: call instance !0 valuetype Program/VOption`1<!a>::get_Value()
IL_0017: stloc.0
IL_0018: ldarga.s other
IL_001a: call instance !0 valuetype Program/VOption`1<!a>::get_Value()
IL_001f: stloc.1
IL_0020: ldloc.0
IL_0021: ldloc.1
IL_0022: tail.
IL_0024: call bool [FSharp.Core]Microsoft.FSharp.Core.LanguagePrimitives/HashCompare::GenericEqualityIntrinsic<!a>(!!0,
!!0)
IL_0029: ret
IL_002a: ldc.i4.0
IL_002b: ret
} // end of method VOption`1::op_Equality
Any way to move this forward?
@dsyme what are your thoughts on loading an open generic token like C# does by repurposing _ within a typedefof call by adding an optimization in the compiler?
@dsyme what are your thoughts on loading an open generic token like C# does by repurposing _ within a typedefof call by adding an optimization in the compiler?
Couldn't we just add a specific optimization to generate the open ldtoken?
That was the suggestion indeed!
Without adding discards support it is a bit confusing you must fully specify the type yet you don't care about the arguments.
Related to this is something I ran into a while back with a complex type; as typedefof is also just a normal generic function it must also respect the constraints on the type parameters involved. This can lead to awkward signatures where you must put a lot of emphasis on the arguments (think nonsense like Foo<'T, 'U when 'T :> seq<'U>, 'U :> IEquatable<'U>>). I'm assuming it would have to become an intrinsic to loosen type checks for typedef, if it's even something you want to entertain.
Most helpful comment
Couldn't we just add a specific optimization to generate the open
ldtoken?