Version Used:
2.7.0.62715 (db02128e)
Steps to Reproduce:
Expected Behavior:
All cases resulting in the same optimal IL.
Actual Behavior:
Many cases resulting in total crap IL.
Use the following to play with:
```c#
static class C {
    // Compiling with optimization turned ON, the following lines should
    // be able to generate the same code.
    static bool fn(this int val) {
        // bool retval = (val % 2 == 0) ? true : false; return retval;
        bool retval = (val % 2 != 0) ? false : true; return retval;
        // bool retval; retval = (val % 2 == 0) ? true : false; return retval;
        // bool retval; if (val % 2 == 0) retval = true; else retval = false; return retval;
        // if (val % 2 == 0) { return true; } else return false;
        // the following three are thankfully generating the same code with /o
        // bool retval; retval = (val % 2 == 0); return retval;
        // bool retval = (val % 2 == 0); return retval;
        // return val % 2 == 0;
    }
}
```
The flag /o causes the compiler to not generate debug information, but the compiler (by design) performs only the simplest of local "optimizations". Generally, we put optimizations in our runtime compilers.
@tamlin-mike, it may also be worth noting that, while the IL could "be improved", the only thing that actually matters is the output of the JIT.
In some cases, the "more complex" IL has special recognition by the JIT or is done to meet language specification requirements.
I didn't realize I'd have to back this up with profiling data.
@gafter I thought it was an actual optimizer and therefore expected a certain amount of quality and attention to details. When you now explained it's a non-optimizer _by design_ I see my expectations are diametrically opposite to reality.
Perhaps this fact (_only the simplest of local "optimizations"_) should be added to the help output from csc, to prevent future confusion from people expecting an actual optimizer to kick in?
@tannergooding I find that argument to be like claiming "what machine code c2.dll emits is irrelevant; what matters is only the CPU-generated microcode output". Had such a claim been made by someone on the c2 compiler optimization team, I'd expect a vacancy in a very short amount of time.
To me, in general, larger = slower (unless for cache-line alignment). More complex IL would mean more work later (= slower to parse and optimize) for the JIT compiler to try to figure stuff out, and since the JIT by necessity has way less time to optimize than the off-line compiler, it follows logically that it simply can't do as good a job (= slower generated CPU machine code).
> In some cases, the "more complex" IL has special recognition by the JIT or is done to meet language specification requirements.

I fail to see how this has any relevance to the provided example. Could you elaborate?
You may want to read this: https://blogs.msdn.microsoft.com/ericlippert/2009/06/11/what-does-the-optimize-switch-do/
> Perhaps this fact (_only the simplest of local "optimizations"_) should be added to the help output from csc, to prevent future confusion from people expecting an actual optimizer to kick in?

What tangible problem would that solve (beyond the case of people reading a check box label and jumping to conclusions)?
> Had such a claim been made by someone on the c2 compiler optimization team, I'd expect a vacancy in a very short amount of time.

Perhaps. But we'll never know because you created the issue in the wrong repository. This is basically the "c1/c1xx" repository, not the c2 "repository". If you have an actual problem with the performance of the code generated by the JIT for such patterns then you should post in the coreclr repository.
> To me, in general, larger = slower (unless for cache-line alignment). More complex IL would mean more work later (= slower to parse and optimize) for the JIT compiler to try to figure stuff out, and since the JIT by necessity has way less time to optimize than the off-line compiler, it follows logically that it simply can't do as good a job (= slower generated CPU machine code).

Do you have any evidence for all this? It sounds like you are making all kinds of assumptions.
> I fail to see how this has any relevance to the provided example. Could you elaborate?

It really shouldn't be relevant, but the JIT has its own issues and it can be quite sensitive to the IL shape.
Example 1, Example 2, Example 3, and Example 5 generate equivalent assembly on the Desktop JIT for x86.
Example 4 generates slightly different assembly, but variable assignment can have side effects, so I question whether Example 4 is actually semantically equivalent to the others.
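
On the equivalence question, the five forms can at least be compared at the source level with a small harness. This is only a sketch: the names `ParityVariants`, `V1` to `V5`, `Direct`, and `AllAgree` are hypothetical and do not come from the original post, and the check covers observable return values only. It says nothing about the IL that csc emits or the machine code the JIT produces.

```C#
using System;

// Hypothetical harness: each method body is one of the forms from the original post.
static class ParityVariants
{
    // Form 1: conditional expression with literal arms.
    public static bool V1(int val) { bool retval = (val % 2 == 0) ? true : false; return retval; }

    // Form 2: inverted condition with swapped arms.
    public static bool V2(int val) { bool retval = (val % 2 != 0) ? false : true; return retval; }

    // Form 3: declaration separated from the assignment.
    public static bool V3(int val) { bool retval; retval = (val % 2 == 0) ? true : false; return retval; }

    // Form 4: if/else assigning to a local.
    public static bool V4(int val) { bool retval; if (val % 2 == 0) retval = true; else retval = false; return retval; }

    // Form 5: if/else with early returns.
    public static bool V5(int val) { if (val % 2 == 0) { return true; } else return false; }

    // Baseline: the direct comparison.
    public static bool Direct(int val) { return val % 2 == 0; }

    // Returns true when every form agrees with Direct for all values in [from, to].
    public static bool AllAgree(int from, int to)
    {
        // A long counter avoids overflow when 'to' is int.MaxValue.
        for (long i = from; i <= to; i++)
        {
            int v = (int)i;
            bool expected = Direct(v);
            if (V1(v) != expected || V2(v) != expected || V3(v) != expected ||
                V4(v) != expected || V5(v) != expected)
            {
                return false;
            }
        }
        return true;
    }
}

static class Program
{
    static void Main()
    {
        // Includes negative values, where val % 2 can be -1.
        Console.WriteLine(ParityVariants.AllAgree(-1000, 1000));
        Console.WriteLine(ParityVariants.AllAgree(int.MaxValue - 10, int.MaxValue));
    }
}
```

Agreement here only shows that the forms return the same values for the tested inputs. Whether they compile to the same IL or assembly is a separate question.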
**Example 1**

**C#:**
```C#
bool retval = (val % 2 == 0) ? true : false; return retval;
```
**IL:**
```C#
IL_0000: ldarg.0
IL_0001: ldc.i4.2
IL_0002: rem
IL_0003: brfalse.s IL_0007
IL_0005: ldc.i4.0
IL_0006: ret
IL_0007: ldc.i4.1
IL_0008: ret
```
**x86 Assembly:**
```asm
L0000: and ecx, 0x80000001
L0006: jns L000d
L0008: dec ecx
L0009: or ecx, 0xfffffffe
L000c: inc ecx
L000d: test ecx, ecx
L000f: jz L0014
L0011: xor eax, eax
L0013: ret
L0014: mov eax, 0x1
L0019: ret
```

**Example 2**

**C#:**
```C#
bool retval = (val % 2 != 0) ? false : true; return retval;
```
**IL:**
```C#
IL_0000: ldarg.0
IL_0001: ldc.i4.2
IL_0002: rem
IL_0003: brtrue.s IL_0007
IL_0005: ldc.i4.1
IL_0006: ret
IL_0007: ldc.i4.0
IL_0008: ret
```
**x86 Assembly:**
```asm
L0000: and ecx, 0x80000001
L0006: jns L000d
L0008: dec ecx
L0009: or ecx, 0xfffffffe
L000c: inc ecx
L000d: test ecx, ecx
L000f: jnz L0017
L0011: mov eax, 0x1
L0016: ret
L0017: xor eax, eax
L0019: ret
```

**Example 3**

**C#:**
```C#
bool retval; retval = (val % 2 == 0) ? true : false; return retval;
```
**IL:**
```C#
IL_0000: ldarg.0
IL_0001: ldc.i4.2
IL_0002: rem
IL_0003: brfalse.s IL_0007
IL_0005: ldc.i4.0
IL_0006: ret
IL_0007: ldc.i4.1
IL_0008: ret
```
**x86 Assembly:**
```asm
L0000: and ecx, 0x80000001
L0006: jns L000d
L0008: dec ecx
L0009: or ecx, 0xfffffffe
L000c: inc ecx
L000d: test ecx, ecx
L000f: jz L0014
L0011: xor eax, eax
L0013: ret
L0014: mov eax, 0x1
L0019: ret
```

**Example 4**

**C#:**
```C#
bool retval; if (val % 2 == 0) retval = true; else retval = false; return retval;
```
**IL:**
```C#
IL_0000: ldarg.0
IL_0001: ldc.i4.2
IL_0002: rem
IL_0003: brtrue.s IL_0009
IL_0005: ldc.i4.1
IL_0006: stloc.0
IL_0007: br.s IL_000b
IL_0009: ldc.i4.0
IL_000a: stloc.0
IL_000b: ldloc.0
IL_000c: ret
```
**x86 Assembly:**
```asm
L0000: push ebp
L0001: mov ebp, esp
L0003: and ecx, 0x80000001
L0009: jns L0010
L000b: dec ecx
L000c: or ecx, 0xfffffffe
L000f: inc ecx
L0010: test ecx, ecx
L0012: jnz L001b
L0014: mov eax, 0x1
L0019: jmp L001d
L001b: xor eax, eax
L001d: pop ebp
L001e: ret
```

**Example 5**

**C#:**
```C#
if (val % 2 == 0) { return true; } else return false;
```
**IL:**
```C#
IL_0000: ldarg.0
IL_0001: ldc.i4.2
IL_0002: rem
IL_0003: brtrue.s IL_0007
IL_0005: ldc.i4.1
IL_0006: ret
IL_0007: ldc.i4.0
IL_0008: ret
```
**x86 Assembly:**
```asm
L0000: and ecx, 0x80000001
L0006: jns L000d
L0008: dec ecx
L0009: or ecx, 0xfffffffe
L000c: inc ecx
L000d: test ecx, ecx
L000f: jnz L0017
L0011: mov eax, 0x1
L0016: ret
L0017: xor eax, eax
L0019: ret
```