Tagging @DavidWrighton @jcouv
Roslyn has encountered a crash dotnet/roslyn#46575 which we can only reproduce when building the runtime in fairly specific circumstances. You can see stack trace info, etc. in the linked bug in Roslyn. It requires building the runtime using a release mode compiler from later than Aug 4th or so, and building using msbuild, not calling the compiler directly.
Looking at the crash dump, @davidwrighton identified that the JIT produced incorrect assembly. In the body of ExplicitInterfaceImplementation (source), we have two locals of the same type (ImmutableArray<MethodSymbol>) and we loop through one of them twice. In the second loop, the assembly uses the wrong local to index into the array. As a result, we get a crash/NRE with method being null on this line.
To reproduce this crash in dotnet/runtime:
.\Build.cmd -restore, then .\Build.cmd libs -bl (having a binlog with -bl will probably be handy).ref\System.Diagnostics.Process.csproj and ref\System.Data.Common.csproj.We have found differences in the native code produced for the PEMethodSymbol.get_ExplicitInterfaceImplementations method when running from msbuild and when running csc using the args produced by msbuild. I have attached the output from the Disassembly window for these methods. The interesting bit seems to be around line 210 where in the broken version, we read the length from the explicitlyOverriddenMethods array, but seem to read elements from the explicitInterfaceImplementations array. (I'm not skilled in reading disassembly, so I could have mangled the description of this, sorry.)
It could be illustrative to pop these two disassembled methods into a diff tool and compare what they are doing.
PEMethodSymbol.ExplicitInterfaceImplementations-broken.txt
PEMethodSymbol.ExplicitInterfaceImplementations-working.txt
Tagging subscribers to this area: @eiriktsarpalis
See info in area-owners.md if you want to be subscribed.
@dotnet/jit-contrib
Some notes on making this repro reliably.
With those I found it possible to run csc.dll directly to look at the problem more easily instead of having to run through msbuild.
Psychic debugging would say this is an issue with CSE, but that's just a guess.
I can repro the failure via msbuild, but can't yet get the extracted command line to repro the issue. @davidwrighton any tips on how you did the latter?
Can repro bad codegen via PMI of .nuget\packages\microsoft.net.compilers.toolset\3.8.0-3.20416.3\tasks\netcoreapp3.1\bincore\Microsoft.CodeAnalysis.CSharp.dll with recent master, so will start looking there. Method of interest is Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEMethodSymbol:get_ExplicitInterfaceImplementations.
And yes, there is a CSE into R14... so psychic debugging is still a contender.
; V59 cse1 [V59,T03] ( 7, 18.50) int -> r14 "CSE - aggressive"
Concern is this array access
G_M34632_IG20:
cmp r12d, r14d
jae G_M34632_IG28
movsxd rcx, r12d
mov r15, gword ptr [r13+8*rcx+16]
here r12d is the IV, r14d is array length from much earlier
IG05:
...
mov r14d, dword ptr [rdi+8]
mov r15, rdi
and r13 from a call after that point:
G_M34632_IG19:
...
call Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[__Canon][System.__Canon]:ToImmutableAndFree():System.Collections.Immutable.ImmutableArray`1[__Canon]:this
mov r13, rax
other "valid" array accesses earlier in the loop
G_M34632_IG06:
cmp r12d, r14d
jae G_M34632_IG28
movsxd rcx, r12d
mov r13, gword ptr [r15+8*rcx+16]
...
G_M34632_IG12:
cmp r12d, r14d
jae G_M34632_IG28
movsxd rcx, r12d
mov rdi, gword ptr [r15+8*rcx+16]
So seems like (as noted above) we're not using the right array length in a bounds check.
Here's a simpler repro. Things go bad in F and we end up trying to access elements in aa and get an exception. Still chasing down where things go wrong.
```c#
using System;
using System.Collections.Immutable;
using System.Runtime.CompilerServices;
class X
{
[MethodImpl(MethodImplOptions.NoInlining)]
public static void E(ImmutableArray
[MethodImpl(MethodImplOptions.NoInlining)]
public static ImmutableArray<string> G() => ImmutableArray<string>.Empty;
[MethodImpl(MethodImplOptions.NoInlining)]
public static ImmutableArray<string> H()
{
string[] a = new string[100];
for (int i = 0; i < a.Length; i++)
{
a[i] = "hello";
}
return ImmutableArray.Create<string>(a);
}
[MethodImpl(MethodImplOptions.NoInlining)]
public static int F()
{
var a = H();
int r = 0;
foreach (var s in a)
{
if (s.Equals("hello")) r++;
}
var aa = a;
if (r > 0)
{
foreach (var s in a)
{
if (s.Equals("hello")) r--;
}
aa = G();
foreach (var s in a)
{
if (s.Equals("hello")) r++;
}
}
E(aa);
return r;
}
public static int Main() => F();
}
```
I note that there is the following assignment in the source code
explicitInterfaceImplementations = explicitlyOverriddenMethods;
I think assertion prop might be getting confused here (or is missing a kill).
The key is the reassignment aa = G(). Here aa is a struct with one field; that field gets promoted.
aa = G() is modelled as
N003 ( 18, 8) [000086] -ACXG---R--- * ASG struct (copy)
N002 ( 3, 2) [000084] D------N---- +--* LCL_VAR struct<System.Collections.Immutable.ImmutableArray`1, 8>(P) V02 loc2 d:4
+--* ref V02.array (offs=0x00) -> V18 tmp14
N001 ( 14, 5) [000083] --CXG------- \--* CALL struct X.G
and this redef of V18 does not seem to invalidate assertions involving V18, leading the compiler to believe incorrectly that V19 and V18 have the same value in the final loop.
Still not sure if the issue is in SSA, Value Numbering, assertion prop, or copy prop.
cc @sandreenko -- I suspect this is related to your recent work on "no retyping" of this kind of struct.
cc @sandreenko -- I suspect this is related to your recent work on "no retyping" of this kind of struct.
It could be, I have added special logic for these assignments in order to allow their VN-s, will take a look.
Suspect there are issues in value numbering. The CanBeReplacedWithItsField updates need to happen in fgValueNumberBlockAssignment and not in fgValueNumberTree.
Still not sure if -- or how -- this leads to the bug.
Believe the actual bug is in optBlockCopyProp where it does not properly handle cases like the [000086] assignment above. The redefinition of aa is ignored and copy prop subsequently replaces uses of a with uses of aa in the bottom loop.
@sandreenko I linked to my prospective fix, but I thought you were going to own this issue (or we both can)?
@sandreenko I linked to my prospective fix, but I thought you were going to own this issue (or we both can)?
Sure, I am checking your fix now (your repro + frameworks + spmi collections) and the other suspicious place in fgValueNumberBlockAssignment, I am planning to push a PR today.
The prospective fix works on the small repro above; verifying it fixes the actual bug now.
Here are the diffs on the version of Microsoft.CodeAnalysis.CSharp used in this bug, 6 methods in all impacted. Change looks fairly surgical so far, and the diff in get_ExplicitInterfaceImplementations looks to be correct.
Total bytes of diff: 56 (0.00% of base)
diff is a regression.
Top file regressions (bytes):
56 : Microsoft.CodeAnalysis.CSharp.dasm (0.00% of base)
1 total files with Code Size differences (0 improved, 1 regressed), 0 unchanged.
Top method regressions (bytes):
44 ( 5.56% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEMethodSymbol:get_ExplicitInterfaceImplementations():System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
24 ( 1.53% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Binder:ProcessParameterlessCrefMemberLookupResults(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],int,Microsoft.CodeAnalysis.CSharp.Syntax.MemberCrefSyntax,Microsoft.CodeAnalysis.CSharp.Syntax.TypeArgumentListSyntax,byref,Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
Top method improvements (bytes):
-3 (-0.06% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.MethodBodySynthesizer:MakeSubmissionInitialization(Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[Microsoft.CodeAnalysis.CSharp.BoundStatement, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.SyntaxNode,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.SynthesizedSubmissionFields,Microsoft.CodeAnalysis.CSharp.CSharpCompilation)
-3 (-0.91% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.LocalRewriter:GetEffectiveArgumentRefKinds(System.Collections.Immutable.ImmutableArray`1[RefKind],System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):System.Collections.Immutable.ImmutableArray`1[RefKind]
-3 (-0.27% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:MergeConstraintsForPartialDeclarations(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=5.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]],System.Collections.Generic.IReadOnlyDictionary`2[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[System.Boolean, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]],Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
-3 (-0.37% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SynthesizedParameterSymbol:DeriveParameters(Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]
Top method regressions (percentages):
44 ( 5.56% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.Metadata.PE.PEMethodSymbol:get_ExplicitInterfaceImplementations():System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
24 ( 1.53% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Binder:ProcessParameterlessCrefMemberLookupResults(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],int,Microsoft.CodeAnalysis.CSharp.Syntax.MemberCrefSyntax,Microsoft.CodeAnalysis.CSharp.Syntax.TypeArgumentListSyntax,byref,Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
Top method improvements (percentages):
-3 (-0.91% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.LocalRewriter:GetEffectiveArgumentRefKinds(System.Collections.Immutable.ImmutableArray`1[RefKind],System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]):System.Collections.Immutable.ImmutableArray`1[RefKind]
-3 (-0.37% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SynthesizedParameterSymbol:DeriveParameters(Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.ParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]
-3 (-0.27% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.Symbols.SourceNamedTypeSymbol:MergeConstraintsForPartialDeclarations(System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]], System.Collections.Immutable, Version=5.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a]],System.Collections.Generic.IReadOnlyDictionary`2[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterSymbol, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35],[System.Boolean, System.Private.CoreLib, Version=5.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]],Microsoft.CodeAnalysis.DiagnosticBag):System.Collections.Immutable.ImmutableArray`1[[Microsoft.CodeAnalysis.CSharp.Symbols.TypeParameterConstraintClause, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]]:this
-3 (-0.06% of base) : Microsoft.CodeAnalysis.CSharp.dasm - Microsoft.CodeAnalysis.CSharp.MethodBodySynthesizer:MakeSubmissionInitialization(Microsoft.CodeAnalysis.PooledObjects.ArrayBuilder`1[[Microsoft.CodeAnalysis.CSharp.BoundStatement, Microsoft.CodeAnalysis.CSharp, Version=3.8.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]],Microsoft.CodeAnalysis.SyntaxNode,Microsoft.CodeAnalysis.CSharp.Symbols.MethodSymbol,Microsoft.CodeAnalysis.CSharp.SynthesizedSubmissionFields,Microsoft.CodeAnalysis.CSharp.CSharpCompilation)
6 total methods with Code Size differences (4 improved, 2 regressed), 42826 unchanged.
Hard to be 100% sure, but the other 5 methods don't look like they had bad codegen in the base version. So perhaps just this one buggy method in this assembly.
Preview8 has a different JIT GUID than master, so will take me a bit longer to build a p8 compatible jit with above fix and drop it into the toolset to verify the fix "in situ."
FWIW psychic debugging failed here. I should keep a scorecard.
@RikkiGibson looks like you all have a workaround for P8, is that right?
We have a workaround but it's in our 16.8-preview3 release (corresponding to .NET 5 RC1). Our window to put things into preview8 has pretty much closed since 16.8-preview2 is not taking any more changes on the VS side.
Basically, we haven't worked around it in preview8, but we have worked around it in RC1, and we're OK with that. (tagging @jcouv in case he wishes to correct/clarify things)
Have verified the prototype fix leads to successful libraries build.
C:\repos\runtime1\.dotnet\shared\Microsoft.NETCore.App\5.0.0-preview.8.20361.2 >
copy \repos\runtime4\artifacts\bin\coreclr\Windows_NT.x64.Checked\clrjit.dll .
C:\repos\runtime1 (triage-jit-crash) > .\Build.cmd libs -bl
...
Build succeeded.
0 Warning(s)
0 Error(s)
Time Elapsed 00:15:58.89
Keep open for RC1 port.
Fixed in 5.0 by #41243.
Thanks for the prompt investigation and fix! :-)
Most helpful comment
Have verified the prototype fix leads to successful libraries build.