Runtime: JIT reads elements from the wrong ImmutableArray when enumerating ImmutableArray multiple times

Created on 20 Aug 2020  路  23Comments  路  Source: dotnet/runtime

Tagging @DavidWrighton @jcouv

Roslyn has encountered a crash dotnet/roslyn#46575 which we can only reproduce when building the runtime in fairly specific circumstances. You can see stack trace info, etc. in the linked bug in Roslyn. It requires building the runtime using a release mode compiler from later than Aug 4th or so, and building using msbuild, not calling the compiler directly.

Analysis

Looking at the crash dump, @davidwrighton identified that the JIT produced incorrect assembly. In the body of ExplicitInterfaceImplementation (source), we have two locals of the same type (ImmutableArray<MethodSymbol>) and we loop through one of them twice. In the second loop, the assembly uses the wrong local to index into the array. As a result, we get a crash/NRE with method being null on this line.

Repro

To reproduce this crash in dotnet/runtime:

  1. Checkout the triage-jit-crash branch and clean the artifacts out of your local repo.

    • There are currently several compiler breaking changes in flight at the same time as this crash. This repro branch is based on release/5.0, but hacking around some unrelated compiler issues in order to get us to the point where we can repro the bug.

  2. Run .\Build.cmd -restore, then .\Build.cmd libs -bl (having a binlog with -bl will probably be handy).

    • You should see crashes in the projects for ref\System.Diagnostics.Process.csproj and ref\System.Data.Common.csproj.

Details

We have found differences in the native code produced for the PEMethodSymbol.get_ExplicitInterfaceImplementations method when running from msbuild and when running csc using the args produced by msbuild. I have attached the output from the Disassembly window for these methods. The interesting bit seems to be around line 210 where in the broken version, we read the length from the explicitlyOverriddenMethods array, but seem to read elements from the explicitInterfaceImplementations array. (I'm not skilled in reading disassembly, so I could have mangled the description of this, sorry.)

It could be illustrative to pop these two disassembled methods into a diff tool and compare what they are doing.

PEMethodSymbol.ExplicitInterfaceImplementations-broken.txt
PEMethodSymbol.ExplicitInterfaceImplementations-working.txt

area-CodeGen-coreclr

Most helpful comment

Have verified the prototype fix leads to successful libraries build.

C:\repos\runtime1\.dotnet\shared\Microsoft.NETCore.App\5.0.0-preview.8.20361.2 >
copy \repos\runtime4\artifacts\bin\coreclr\Windows_NT.x64.Checked\clrjit.dll .

C:\repos\runtime1 (triage-jit-crash) > .\Build.cmd libs -bl
... 
Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:15:58.89

All 23 comments

Tagging subscribers to this area: @eiriktsarpalis
See info in area-owners.md if you want to be subscribed.

@dotnet/jit-contrib

Some notes on making this repro reliably.

  1. You must not have the 5.0.0-preview.8.20361.2 version of .NET installed on the machine globally.
  2. You must not have a 3.1 version of dotnet in the .dotnet subdirectory of your runtime repo.

With those I found it possible to run csc.dll directly to look at the problem more easily instead of having to run through msbuild.

Psychic debugging would say this is an issue with CSE, but that's just a guess.

I can repro the failure via msbuild, but can't yet get the extracted command line to repro the issue. @davidwrighton any tips on how you did the latter?

Can repro bad codegen via PMI of .nuget\packages\microsoft.net.compilers.toolset\3.8.0-3.20416.3\tasks\netcoreapp3.1\bincore\Microsoft.CodeAnalysis.CSharp.dll with recent master, so will start looking there. Method of interest is Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEMethodSymbol:get_ExplicitInterfaceImplementations.

And yes, there is a CSE into R14... so psychic debugging is still a contender.

;  V59 cse1         [V59,T03] (  7, 18.50)     int  ->  r14         "CSE - aggressive"

Concern is this array access

G_M34632_IG20:
       cmp      r12d, r14d
       jae      G_M34632_IG28
       movsxd   rcx, r12d
       mov      r15, gword ptr [r13+8*rcx+16]

here r12d is the IV, r14d is array length from much earlier

IG05:
       ...
       mov      r14d, dword ptr [rdi+8]
       mov      r15, rdi

and r13 from a call after that point:

G_M34632_IG19:
       ...
       call     Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[__Canon][System.__Canon]:ToImmutableAndFree():System.Collections.Immutable.ImmutableArray`1[__Canon]:this
       mov      r13, rax

other "valid" array accesses earlier in the loop

G_M34632_IG06:
       cmp      r12d, r14d
       jae      G_M34632_IG28
       movsxd   rcx, r12d
       mov      r13, gword ptr [r15+8*rcx+16]
...

G_M34632_IG12:
       cmp      r12d, r14d
       jae      G_M34632_IG28
       movsxd   rcx, r12d
       mov      rdi, gword ptr [r15+8*rcx+16]

So seems like (as noted above) we're not using the right array length in a bounds check.

Here's a simpler repro. Things go bad in F and we end up trying to access elements in aa and get an exception. Still chasing down where things go wrong.
```c#
using System;
using System.Collections.Immutable;
using System.Runtime.CompilerServices;

class X
{
[MethodImpl(MethodImplOptions.NoInlining)]
public static void E(ImmutableArray a) {}

[MethodImpl(MethodImplOptions.NoInlining)]
public static ImmutableArray<string> G() => ImmutableArray<string>.Empty;

[MethodImpl(MethodImplOptions.NoInlining)]
public static ImmutableArray<string> H()
{
    string[] a = new string[100];

    for (int i = 0; i < a.Length; i++)
    {
        a[i] = "hello";
    }

    return ImmutableArray.Create<string>(a);
}

[MethodImpl(MethodImplOptions.NoInlining)]    
public static int F()
{
    var a = H();
    int r = 0;

    foreach (var s in a)
    {
        if (s.Equals("hello")) r++;
    }

    var aa = a;

    if (r > 0)

    {
        foreach (var s in a)
        {
            if (s.Equals("hello")) r--;
        }

        aa = G();

        foreach (var s in a)
        {
            if (s.Equals("hello")) r++;
        }
    }

    E(aa);

    return r;
}

public static int Main() => F();

}
```

I note that there is the following assignment in the source code

explicitInterfaceImplementations = explicitlyOverriddenMethods;

I think assertion prop might be getting confused here (or is missing a kill).

The key is the reassignment aa = G(). Here aa is a struct with one field; that field gets promoted.

aa = G() is modelled as

N003 ( 18,  8) [000086] -ACXG---R---              *  ASG       struct (copy)
N002 (  3,  2) [000084] D------N----              +--*  LCL_VAR   struct<System.Collections.Immutable.ImmutableArray`1, 8>(P) V02 loc2         d:4
                                                  +--*    ref    V02.array (offs=0x00) -> V18 tmp14        
N001 ( 14,  5) [000083] --CXG-------              \--*  CALL      struct X.G

and this redef of V18 does not seem to invalidate assertions involving V18, leading the compiler to believe incorrectly that V19 and V18 have the same value in the final loop.

Still not sure if the issue is in SSA, Value Numbering, assertion prop, or copy prop.

cc @sandreenko -- I suspect this is related to your recent work on "no retyping" of this kind of struct.

cc @sandreenko -- I suspect this is related to your recent work on "no retyping" of this kind of struct.

It could be, I have added special logic for these assignments in order to allow their VN-s, will take a look.

Suspect there are issues in value numbering. The CanBeReplacedWithItsField updates need to happen in fgValueNumberBlockAssignment and not in fgValueNumberTree.

Still not sure if -- or how -- this leads to the bug.

Believe the actual bug is in optBlockCopyProp where it does not properly handle cases like the [000086] assignment above. The redefinition of aa is ignored and copy prop subsequently replaces uses of a with uses of aa in the bottom loop.

@sandreenko I linked to my prospective fix, but I thought you were going to own this issue (or we both can)?

@sandreenko I linked to my prospective fix, but I thought you were going to own this issue (or we both can)?

Sure, I am checking your fix now (your repro + frameworks + spmi collections) and the other suspicious place in fgValueNumberBlockAssignment, I am planning to push a PR today.

The prospective fix works on the small repro above; verifying it fixes the actual bug now.

Here are the diffs on the version of Microsoft.CodeAnalysis.CSharp used in this bug, 6 methods in all impacted. Change looks fairly surgical so far, and the diff in get_ExplicitInterfaceImplementations looks to be correct.

Total bytes of diff: 56 (0.00% of base)
    diff is a regression.

Top file regressions (bytes):
          56 : Microsoft.CodeAnalysis.CSharp.dasm (0.00% of base)

1 total files with Code Size differences (0 improved, 1 regressed), 0 unchanged.

Top method regressions (bytes):
          44 ( 5.56% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEMethodSymbol:get_ExplicitInterfaceImplementations():System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
          24 ( 1.53% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Binder:ProcessParameterlessCrefMemberLookupResults(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],int,Microsoft.CodeAnalysis.CSharp.Syntax.MemberCrefSyntax,Microsoft.CodeAnalysis.CSharp.Syntax.TypeArgumentListSyntax,byref,Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this

Top method improvements (bytes):
          -3 (-0.06% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.MethodBodySynthesizer:MakeSubmissionInitialization(Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[Microsoft.CodeAnalysis.CSharp.BoundStatement, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.SyntaxNode,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.SynthesizedSubmissionFields,Microsoft.CodeAnalysis.CSharp.CSharpCompilation)
          -3 (-0.91% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.LocalRewriter:GetEffectiveArgumentRefKinds(System.Collections.Immutable.ImmutableArray`1[RefKind],System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):System.Collections.Immutable.ImmutableArray`1[RefKind]
          -3 (-0.27% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:MergeConstraintsForPartialDeclarations(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=5.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]],System.Collections.Generic.IReadOnlyDictionary`2[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[System.Boolean, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]],Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
          -3 (-0.37% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SynthesizedParameterSymbol:DeriveParameters(Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]

Top method regressions (percentages):
          44 ( 5.56% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEMethodSymbol:get_ExplicitInterfaceImplementations():System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
          24 ( 1.53% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Binder:ProcessParameterlessCrefMemberLookupResults(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],int,Microsoft.CodeAnalysis.CSharp.Syntax.MemberCrefSyntax,Microsoft.CodeAnalysis.CSharp.Syntax.TypeArgumentListSyntax,byref,Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this

Top method improvements (percentages):
          -3 (-0.91% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.LocalRewriter:GetEffectiveArgumentRefKinds(System.Collections.Immutable.ImmutableArray`1[RefKind],System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):System.Collections.Immutable.ImmutableArray`1[RefKind]
          -3 (-0.37% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SynthesizedParameterSymbol:DeriveParameters(Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]
          -3 (-0.27% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:MergeConstraintsForPartialDeclarations(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=5.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]],System.Collections.Generic.IReadOnlyDictionary`2[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[System.Boolean, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]],Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
          -3 (-0.06% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.MethodBodySynthesizer:MakeSubmissionInitialization(Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[Microsoft.CodeAnalysis.CSharp.BoundStatement, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.SyntaxNode,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.SynthesizedSubmissionFields,Microsoft.CodeAnalysis.CSharp.CSharpCompilation)

6 total methods with Code Size differences (4 improved, 2 regressed), 42826 unchanged.

Hard to be 100% sure, but the other 5 methods don't look like they had bad codegen in the base version. So perhaps just this one buggy method in this assembly.

Preview8 has a different JIT GUID than master, so will take me a bit longer to build a p8 compatible jit with above fix and drop it into the toolset to verify the fix "in situ."

FWIW psychic debugging failed here. I should keep a scorecard.

@RikkiGibson looks like you all have a workaround for P8, is that right?

We have a workaround but it's in our 16.8-preview3 release (corresponding to .NET 5 RC1). Our window to put things into preview8 has pretty much closed since 16.8-preview2 is not taking any more changes on the VS side.

Basically, we haven't worked around it in preview8, but we have worked around it in RC1, and we're OK with that. (tagging @jcouv in case he wishes to correct/clarify things)

Have verified the prototype fix leads to successful libraries build.

C:\repos\runtime1\.dotnet\shared\Microsoft.NETCore.App\5.0.0-preview.8.20361.2 >
copy \repos\runtime4\artifacts\bin\coreclr\Windows_NT.x64.Checked\clrjit.dll .

C:\repos\runtime1 (triage-jit-crash) > .\Build.cmd libs -bl
... 
Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:15:58.89

Keep open for RC1 port.

Fixed in 5.0 by #41243.

Thanks for the prompt investigation and fix! :-)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

btecu picture btecu  路  3Comments

noahfalk picture noahfalk  路  3Comments

jchannon picture jchannon  路  3Comments

Timovzl picture Timovzl  路  3Comments

GitAntoinee picture GitAntoinee  路  3Comments