Benchmarkdotnet: Strange disassembly ordering/truncation

Created on 30 Aug 2017  路  8Comments  路  Source: dotnet/BenchmarkDotNet

I'm looking at the disasm reported by the DisassemblyDiagnoser, and seeing some oddities.

Here's my benchmark class:

```c#
public class Program
{
private readonly int[] array = new int[] { 3, 5, 6, 7, 9, 8, 2 };

    [Benchmark]
    public int DoFoo() => Foo(7, array);

    static int Foo(int n, int[] a)
    {
        for (int i = 0; i < a.Length; ++i)
        {
            if (a[i] == n)
            {
                return i;
            }
        }
        return -1;
    }

    public static void Main(string[] args)
    {
        BenchmarkRunner.Run<Program>();
    }
}

and I'm running against [Microsoft.NETCore.App 2.1.0-preview2-25628-01](https://dotnet.myget.org/feed/dotnet-core/package/nuget/Microsoft.NETCore.App/2.1.0-preview2-25628-01)

If I locally build a corresponding checked CoreCLR/RyuJIT, here's the disassembly for method `Foo`:

```asm
; Assembly listing for method Program:Foo(int,ref):int
; Emitting BLENDED_CODE for X64 CPU with AVX
; optimized code
; rsp based frame
; fully interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T03] (  3,  6   )     int  ->  rcx
;  V01 arg1         [V01,T02] (  6,  9   )     ref  ->  rdx         class-hnd
;  V02 loc0         [V02,T00] (  8, 22.50)     int  ->  rax
;# V03 OutArgs      [V03    ] (  1,  1   )  lclBlk ( 0) [rsp+0x00]
;  V04 cse0         [V04,T01] (  8, 14   )     int  ->   r8
;
; Lcl frame size = 0

G_M571_IG01:

G_M571_IG02:
       33C0                 xor      eax, eax
       448B4208             mov      r8d, dword ptr [rdx+8]
       4585C0               test     r8d, r8d
       7E12                 jle      SHORT G_M571_IG05

G_M571_IG03:
       4C63C8               movsxd   r9, eax
       42394C8A10           cmp      dword ptr [rdx+4*r9+16], ecx
       740E                 je       SHORT G_M571_IG07

G_M571_IG04:
       FFC0                 inc      eax
       443BC0               cmp      r8d, eax
       7FEF                 jg       SHORT G_M571_IG03

G_M571_IG05:
       B8FFFFFFFF           mov      eax, -1

G_M571_IG06:
       C3                   ret

G_M571_IG07:
       C3                   ret

; Total bytes of code 35, prolog size 0 for method Program:Foo(int,ref):int

But what the DisassemblyDiagnoser reports is this:

00007ffc`a8a0e460 Program.Foo(Int32, Int32[])
00007ffc`a8a0e460 33c0            xor     eax,eax
00007ffc`a8a0e46b 4c63c8          movsxd  r9,eax
00007ffc`a8a0e46e 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffc`a8a0e473 740d            je      00007ffc`a8a0e482
00007ffc`a8a0e482 c3              ret
00007ffc`a8a0e475 ffc0            inc     eax
00007ffc`a8a0e462 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffc`a8a0e466 4585c0          test    r8d,r8d
00007ffc`a8a0e469 7e11            jle     00007ffc`a8a0e47c
00007ffc`a8a0e477 443bc0          cmp     r8d,eax
00007ffc`a8a0e47a 7fef            jg      00007ffc`a8a0e46b
00007ffc`a8a0e47c b8ffffffff      mov     eax,0FFFFFFFFh

Notice that:

  • The code from 00007ffc`a8a0e462 through 00007ffc`a8a0e46a has been displayed out-of-order, just after 00007ffc`a8a0e475
  • The ret at 00007ffc`a8a0e482 (which was G_M571_IG07 above) has been displayed out-of-order, immediately after 00007ffc`a8a0e473 which is the jmp that targets it.
  • The ret that was at G_M571_IG06 is just completely absent from the report.

Am I missing a setting to display all the machine instructions, in machine code order? This seems to maybe be in IL order, which will make looking at code built release difficult.

Trying again with printPrologAndEpilog set to true doesn't help, it still has the same three issues, and appends the disassembly of the next several bytes (which likely weren't code):

00007ffc`a89fe460 Program.Foo(Int32, Int32[])
00007ffc`a89fe460 33c0            xor     eax,eax
00007ffc`a89fe46b 4c63c8          movsxd  r9,eax
00007ffc`a89fe46e 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffc`a89fe473 740d            je      00007ffc`a89fe482
00007ffc`a89fe482 c3              ret
00007ffc`a89fe475 ffc0            inc     eax
00007ffc`a89fe462 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffc`a89fe466 4585c0          test    r8d,r8d
00007ffc`a89fe469 7e11            jle     00007ffc`a89fe47c
00007ffc`a89fe477 443bc0          cmp     r8d,eax
00007ffc`a89fe47a 7fef            jg      00007ffc`a89fe46b
00007ffc`a89fe47c b8ffffffff      mov     eax,0FFFFFFFFh
00007ffc`a89fe483 cc              int     3
00007ffc`a89fe484 1900            sbb     dword ptr [rax],eax
00007ffc`a89fe486 0000            add     byte ptr [rax],al
00007ffc`a89fe488 3e0000          add     byte ptr ds:[rax],al
00007ffc`a89fe48b 0000            add     byte ptr [rax],al
00007ffc`a89fe48d 0000            add     byte ptr [rax],al
00007ffc`a89fe48f 0000            add     byte ptr [rax],al
00007ffc`a89fe491 0000            add     byte ptr [rax],al
00007ffc`a89fe493 0000            add     byte ptr [rax],al
00007ffc`a89fe495 0000            add     byte ptr [rax],al
00007ffc`a89fe497 00d8            add     al,bl
00007ffc`a89fe499 2aaea8fc7f00    sub     ch,byte ptr [rsi+7FFCA8h]
00007ffc`a89fe49f 008b4148ffc0    add     byte ptr [rbx-3F00B7BFh],cl

This was with BenchmarkDotNet version 0.10.9.287

Diagnosers bug

Most helpful comment

@adamsitnik I see what you're seeing with .NET Framework 4.7, but I'm still seeing missing return with 0.10.9.290, when I use .NET Core version 2.1.0-preview2-25628-01. I've copied the repro config/program up to a gist, and here's a cmd script that will repro the issue (pulling down from the gist):

@rem Make a directory called '.dotnet' to insert a local installation of the runtime(s):
md .dotnet
powershell Invoke-WebRequest -Uri "https://raw.githubusercontent.com/dotnet/cli/master/scripts/obtain/dotnet-install.ps1" -OutFile .dotnet\dotnet-install.ps1
@rem  - Install the specific version of the runtime exhibiting the issue
powershell -ExecutionPolicy unrestricted -NoProfile .dotnet\dotnet-install.ps1 -SharedRuntime -InstallDir "$PWD\.dotnet" -Channel master -Version 2.1.0-preview2-25628-01 -Architecture x64 -NoPath
@rem  - Install the latest 2.1 SDK (and whatever version of the runtime exactly matches it will come along for the ride)
powershell -ExecutionPolicy unrestricted -NoProfile .dotnet\dotnet-install.ps1 -InstallDir "$PWD\.dotnet" -Channel master -Architecture x64 -NoPath
@rem  - Install the 2.0 SDK and runtime
powershell -ExecutionPolicy unrestricted -NoProfile .dotnet\dotnet-install.ps1 -InstallDir "$PWD\.dotnet" -Channel 2.0 -Architecture x64 -NoPath

@rem Set up NuGet.config to get preview drops of dotnet and BenchmarkDotNet
powershell Invoke-WebRequest -Uri "https://gist.githubusercontent.com/JosephTremoulet/0c23ee869a3dcfcce63fc5d151dcd001/raw/5f67356dfb52c5cc69aed39b8282d46c5e5297cd/NuGet.config" -OutFile NuGet.config

@rem Create repro
.dotnet\dotnet new console
.dotnet\dotnet add package BenchmarkDotNet
powershell Invoke-WebRequest -Uri "https://gist.githubusercontent.com/JosephTremoulet/0c23ee869a3dcfcce63fc5d151dcd001/raw/5f67356dfb52c5cc69aed39b8282d46c5e5297cd/Program.cs" -OutFile Program.cs

@rem build repro
.dotnet\dotnet restore
.dotnet\dotnet build -c release

@rem Put local dotnet.exe on path so that it's what BenchmarkDotNet toolchain invocations will find when we run
set PATH=%CD%\.dotnet;%PATH%

@rem This should show that we're running the local dotnet
dotnet --info

@rem run the repro
dotnet run -c release

The report I get looks like this for method Foo:

00007ffc`9ceae650 dropped.Program.Foo(Int32, Int32[])
00007ffc`9ceae650 33c0            xor     eax,eax
00007ffc`9ceae652 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffc`9ceae656 4585c0          test    r8d,r8d
00007ffc`9ceae659 7e11            jle     00007ffc`9ceae66c
00007ffc`9ceae65b 4c63c8          movsxd  r9,eax
00007ffc`9ceae65e 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffc`9ceae663 740d            je      00007ffc`9ceae672
00007ffc`9ceae665 ffc0            inc     eax
00007ffc`9ceae667 443bc0          cmp     r8d,eax
00007ffc`9ceae66a 7fef            jg      00007ffc`9ceae65b
00007ffc`9ceae66c b8ffffffff      mov     eax,0FFFFFFFFh
00007ffc`9ceae672 c3              ret
00007ffc`9ceae673 cc              int     3
00007ffc`9ceae674 1900            sbb     dword ptr [rax],eax
00007ffc`9ceae676 0000            add     byte ptr [rax],al
00007ffc`9ceae678 3e0000          add     byte ptr ds:[rax],al
00007ffc`9ceae67b 0000            add     byte ptr [rax],al
00007ffc`9ceae67d 0000            add     byte ptr [rax],al
00007ffc`9ceae67f 0000            add     byte ptr [rax],al
00007ffc`9ceae681 0000            add     byte ptr [rax],al
00007ffc`9ceae683 0000            add     byte ptr [rax],al
00007ffc`9ceae685 0000            add     byte ptr [rax],al
00007ffc`9ceae687 00502b          add     byte ptr [rax+2Bh],dl
00007ffc`9ceae68a f9              stc
00007ffc`9ceae68b 9c              pushfq
00007ffc`9ceae68c fc              cld
00007ffc`9ceae68d 7f00            jg      00007ffc`9ceae68f
00007ffc`9ceae68f 008b4148ffc0    add     byte ptr [rbx-3F00B7BFh],cl

There's another ret, at 00007ffc`9ceae671, which isn't being displayed.

All 8 comments

@JosephTremoulet thank you very much for the report. The DisassemblyDiagnoser is a recent feature and you are one of our first users. It's great that you are trying to use it!
@adamsitnik, could you take a look?

@JosephTremoulet big thanks for giving it a try and reporting this bug!

I have fixed the ordering, but I will need to spend some more time for the missing return

This build contains the fix

Before (.NET Framework 4.7 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2101.1):

00007ffe`16fd8df0 BenchmarkDotNet.Samples.Bug.Foo(Int32, Int32[])   
00007ffe`16fd8df0 33c0            xor     eax,eax
00007ffe`16fd8dfb 4c63c8          movsxd  r9,eax
00007ffe`16fd8dfe 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffe`16fd8e03 7501            jne     00007ffe`16fd8e06
00007ffe`16fd8e05 c3              ret
00007ffe`16fd8e06 ffc0            inc     eax
00007ffe`16fd8df2 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffe`16fd8df6 4585c0          test    r8d,r8d
00007ffe`16fd8df9 7e12            jle     00007ffe`16fd8e0d
00007ffe`16fd8e08 443bc0          cmp     r8d,eax
00007ffe`16fd8e0b 7fee            jg      00007ffe`16fd8dfb
00007ffe`16fd8e0d b8ffffffff      mov     eax,0FFFFFFFFh

After:

00007ffe`16ff8df0 BenchmarkDotNet.Samples.Bug.Foo(Int32, Int32[])   
00007ffe`16ff8df0 33c0            xor     eax,eax
00007ffe`16ff8df2 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffe`16ff8df6 4585c0          test    r8d,r8d
00007ffe`16ff8df9 7e12            jle     00007ffe`16ff8e0d
00007ffe`16ff8dfb 4c63c8          movsxd  r9,eax
00007ffe`16ff8dfe 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffe`16ff8e03 7501            jne     00007ffe`16ff8e06
00007ffe`16ff8e05 c3              ret
00007ffe`16ff8e06 ffc0            inc     eax
00007ffe`16ff8e08 443bc0          cmp     r8d,eax
00007ffe`16ff8e0b 7fee            jg      00007ffe`16ff8dfb
00007ffe`16ff8e0d b8ffffffff      mov     eax,0FFFFFFFFh

@adamsitnik sure, thanks for making it easy to get disassembly w/o custom builds!

@JosephTremoulet could you try it with our 0.10.9.289 package?

This is what I am getting now for .NET Framework 4.7 (CLR 4.0.30319.42000), 64bit RyuJIT-v4.7.2106.0:
printPrologAndEpilog = false

00007ffb`563d41a0 BenchmarkDotNet.Samples.Bug.Foo(Int32, Int32[])   
00007ffb`563d41a0 33c0            xor     eax,eax
00007ffb`563d41a2 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffb`563d41a6 4585c0          test    r8d,r8d
00007ffb`563d41a9 7e12            jle     00007ffb`563d41bd
00007ffb`563d41ab 4c63c8          movsxd  r9,eax
00007ffb`563d41ae 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffb`563d41b3 7501            jne     00007ffb`563d41b6
00007ffb`563d41b5 c3              ret
00007ffb`563d41b6 ffc0            inc     eax
00007ffb`563d41b8 443bc0          cmp     r8d,eax
00007ffb`563d41bb 7fee            jg      00007ffb`563d41ab
00007ffb`563d41bd b8ffffffff      mov     eax,0FFFFFFFFh

printPrologAndEpilog = true

00007ffb`563d41a0 BenchmarkDotNet.Samples.Bug.Foo(Int32, Int32[])   
00007ffb`563d41a0 33c0            xor     eax,eax
00007ffb`563d41a2 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffb`563d41a6 4585c0          test    r8d,r8d
00007ffb`563d41a9 7e12            jle     00007ffb`563d41bd
00007ffb`563d41ab 4c63c8          movsxd  r9,eax
00007ffb`563d41ae 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffb`563d41b3 7501            jne     00007ffb`563d41b6
00007ffb`563d41b5 c3              ret
00007ffb`563d41b6 ffc0            inc     eax
00007ffb`563d41b8 443bc0          cmp     r8d,eax
00007ffb`563d41bb 7fee            jg      00007ffb`563d41ab
00007ffb`563d41bd b8ffffffff      mov     eax,0FFFFFFFFh
00007ffb`563d41c2 c3              ret
00007ffb`563d41c3 cc              int     3
00007ffb`563d41c4 1900            sbb     dword ptr [rax],eax
00007ffb`563d41c6 0000            add     byte ptr [rax],al
00007ffb`563d41c8 3e0000          add     byte ptr ds:[rax],al
00007ffb`563d41cb 0000            add     byte ptr [rax],al
00007ffb`563d41cd 0000            add     byte ptr [rax],al
00007ffb`563d41cf 0000            add     byte ptr [rax],al
00007ffb`563d41d1 0000            add     byte ptr [rax],al
00007ffb`563d41d3 0000            add     byte ptr [rax],al
00007ffb`563d41d5 0000            add     byte ptr [rax],al
00007ffb`563d41d7 007888          add     byte ptr [rax-78h],bh
00007ffb`563d41da 4556            push    r14
00007ffb`563d41dc fb              sti
00007ffb`563d41dd 7f00            jg      00007ffb`563d41df

@adamsitnik I see what you're seeing with .NET Framework 4.7, but I'm still seeing missing return with 0.10.9.290, when I use .NET Core version 2.1.0-preview2-25628-01. I've copied the repro config/program up to a gist, and here's a cmd script that will repro the issue (pulling down from the gist):

@rem Make a directory called '.dotnet' to insert a local installation of the runtime(s):
md .dotnet
powershell Invoke-WebRequest -Uri "https://raw.githubusercontent.com/dotnet/cli/master/scripts/obtain/dotnet-install.ps1" -OutFile .dotnet\dotnet-install.ps1
@rem  - Install the specific version of the runtime exhibiting the issue
powershell -ExecutionPolicy unrestricted -NoProfile .dotnet\dotnet-install.ps1 -SharedRuntime -InstallDir "$PWD\.dotnet" -Channel master -Version 2.1.0-preview2-25628-01 -Architecture x64 -NoPath
@rem  - Install the latest 2.1 SDK (and whatever version of the runtime exactly matches it will come along for the ride)
powershell -ExecutionPolicy unrestricted -NoProfile .dotnet\dotnet-install.ps1 -InstallDir "$PWD\.dotnet" -Channel master -Architecture x64 -NoPath
@rem  - Install the 2.0 SDK and runtime
powershell -ExecutionPolicy unrestricted -NoProfile .dotnet\dotnet-install.ps1 -InstallDir "$PWD\.dotnet" -Channel 2.0 -Architecture x64 -NoPath

@rem Set up NuGet.config to get preview drops of dotnet and BenchmarkDotNet
powershell Invoke-WebRequest -Uri "https://gist.githubusercontent.com/JosephTremoulet/0c23ee869a3dcfcce63fc5d151dcd001/raw/5f67356dfb52c5cc69aed39b8282d46c5e5297cd/NuGet.config" -OutFile NuGet.config

@rem Create repro
.dotnet\dotnet new console
.dotnet\dotnet add package BenchmarkDotNet
powershell Invoke-WebRequest -Uri "https://gist.githubusercontent.com/JosephTremoulet/0c23ee869a3dcfcce63fc5d151dcd001/raw/5f67356dfb52c5cc69aed39b8282d46c5e5297cd/Program.cs" -OutFile Program.cs

@rem build repro
.dotnet\dotnet restore
.dotnet\dotnet build -c release

@rem Put local dotnet.exe on path so that it's what BenchmarkDotNet toolchain invocations will find when we run
set PATH=%CD%\.dotnet;%PATH%

@rem This should show that we're running the local dotnet
dotnet --info

@rem run the repro
dotnet run -c release

The report I get looks like this for method Foo:

00007ffc`9ceae650 dropped.Program.Foo(Int32, Int32[])
00007ffc`9ceae650 33c0            xor     eax,eax
00007ffc`9ceae652 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffc`9ceae656 4585c0          test    r8d,r8d
00007ffc`9ceae659 7e11            jle     00007ffc`9ceae66c
00007ffc`9ceae65b 4c63c8          movsxd  r9,eax
00007ffc`9ceae65e 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffc`9ceae663 740d            je      00007ffc`9ceae672
00007ffc`9ceae665 ffc0            inc     eax
00007ffc`9ceae667 443bc0          cmp     r8d,eax
00007ffc`9ceae66a 7fef            jg      00007ffc`9ceae65b
00007ffc`9ceae66c b8ffffffff      mov     eax,0FFFFFFFFh
00007ffc`9ceae672 c3              ret
00007ffc`9ceae673 cc              int     3
00007ffc`9ceae674 1900            sbb     dword ptr [rax],eax
00007ffc`9ceae676 0000            add     byte ptr [rax],al
00007ffc`9ceae678 3e0000          add     byte ptr ds:[rax],al
00007ffc`9ceae67b 0000            add     byte ptr [rax],al
00007ffc`9ceae67d 0000            add     byte ptr [rax],al
00007ffc`9ceae67f 0000            add     byte ptr [rax],al
00007ffc`9ceae681 0000            add     byte ptr [rax],al
00007ffc`9ceae683 0000            add     byte ptr [rax],al
00007ffc`9ceae685 0000            add     byte ptr [rax],al
00007ffc`9ceae687 00502b          add     byte ptr [rax+2Bh],dl
00007ffc`9ceae68a f9              stc
00007ffc`9ceae68b 9c              pushfq
00007ffc`9ceae68c fc              cld
00007ffc`9ceae68d 7f00            jg      00007ffc`9ceae68f
00007ffc`9ceae68f 008b4148ffc0    add     byte ptr [rbx-3F00B7BFh],cl

There's another ret, at 00007ffc`9ceae671, which isn't being displayed.

@JosephTremoulet this is the best repro case I have ever seen! And the script itself is very useful.

I have tracked down and fixed this bug.

When using ClrMD I have assumed that every map without ILOffset was part of the prolog or epilog.

I was wrong, it turns out that there can be sth in the middle of the code (like the missing ret) without corresponding IL.

image

0.10.9.291 has the fix.

The output is now:

00007ffd`e57711a0 33c0            xor     eax,eax
00007ffd`e57711a2 448b4208        mov     r8d,dword ptr [rdx+8]
00007ffd`e57711a6 4585c0          test    r8d,r8d
00007ffd`e57711a9 7e11            jle     00007ffd`e57711bc
00007ffd`e57711ab 4c63c8          movsxd  r9,eax
00007ffd`e57711ae 42394c8a10      cmp     dword ptr [rdx+r9*4+10h],ecx
00007ffd`e57711b3 740d            je      00007ffd`e57711c2
00007ffd`e57711b5 ffc0            inc     eax
00007ffd`e57711b7 443bc0          cmp     r8d,eax
00007ffd`e57711ba 7fef            jg      00007ffd`e57711ab
00007ffd`e57711bc b8ffffffff      mov     eax,0FFFFFFFFh
00007ffd`e57711c1 c3              ret
00007ffd`e57711c2 c3              ret

@JosephTremoulet I am sorry, previous build has failed. Package 0.10.9.292 contains the fix and is available at our CI feed.

@adamsitnik yep, looks perfect with 0.10.9.292, thanks!

Was this page helpful?
0 / 5 - 0 ratings