For .NET core 2.1, the following program outputs 1023 in debug, but 255 in release.
using System;
class C0
{
public sbyte F;
}
public class Program
{
public static void Main()
{
C0 var0 = new C0 { F = -1 };
ulong var1 = (ulong)(1000 | (byte)var0.F);
Console.WriteLine(var1);
}
}
This issue repros on .NET framework 4.6.1 as well with 64-bit JIT (it does not repro with 32-bit JIT). The compiler used is csc.exe 2.8.3.62923 (7aafab56).
This only happens if C0 is a class. Other expressions than (byte)var0.F
will not trigger the bug either, for example
using System;
public class Program
{
public static void Main()
{
sbyte f = -1;
ulong var1 = (ulong)(1000 | unchecked((byte)f));
Console.WriteLine(var1);
}
}
does not trigger the bug, or making C0
a struct does not trigger it as well.
This seems to be caused by incorrect value numbering of the (byte)var0.F
indir:
***** BB01, stmt 2
( 7, 7) [000014] ------------ * STMT void (IL 0x005... ???)
N005 ( 1, 1) [000011] ------------ | /--* CNS_INT int -1 $80
N006 ( 7, 7) [000013] -A-XG------- \--* ASG byte $VN.Void
N004 ( 5, 5) [000012] ---XG--N---- \--* IND byte $80
N002 ( 1, 1) [000035] ------------ | /--* CNS_INT long 8 field offset Fseq[F] $240
N003 ( 2, 2) [000036] -------N---- \--* ADD byref $280
N001 ( 1, 1) [000010] ------------ \--* LCL_VAR ref V02 tmp1 u:3 $200
***** BB01, stmt 3
( 22, 18) [000026] ------------ * STMT void (IL 0x00D...0x020)
N011 ( 22, 18) [000024] --CXG------- \--* CALL void System.Console.WriteLine $VN.Void
N009 ( 8, 12) [000023] ---XG------- arg0 in rcx \--* CAST long <- int <l:$241, c:$3c0>
N007 ( 1, 4) [000018] ------------ | /--* CNS_INT int 0x3E8 $82
N008 ( 7, 10) [000022] ---XG------- \--* OR int <l:$80, c:$380>
N006 ( 5, 5) [000020] ---XG------- \--* IND ubyte <l:$80, c:$340>
N004 ( 1, 1) [000037] ------------ | /--* CNS_INT long 8 field offset Fseq[F] $240
N005 ( 2, 2) [000038] -------N---- \--* ADD byref $280
N003 ( 1, 1) [000019] ------------ \--* LCL_VAR ref V02 tmp1 u:3 (last use) $200
-1 has VN $80 and the indir also gets VN $80 even though it produces 255, not -1. This then results in the OR being dropped since x | -1 = -1
.
@mikedn The same bug is also present with F = -2 (254 in release, 1022 in debug), though I assume it is still kind of the same optimization?
Yes, it's the same problem because -2 | 1000 = -2
. It's worth explaining that the reason why the OR is dropped is not constant folding, then the result would likely be -2, not 254. What happens is that because OR and IND get the same VN, CSE has this funny idea of CSEing the indir tree:
N013 ( 21, 14) [000024] -ACXG------- * CALL void System.Console.WriteLine $VN.Void
N011 ( 7, 8) [000023] -A-XG------- arg0 in rcx \--* CAST long <- int <l:$241, c:$3c0>
N009 ( 1, 1) [000048] ------------ | /--* LCL_VAR int V03 cse0 <l:$81, c:$340>
N010 ( 6, 6) [000049] -A-XG------- \--* COMMA int <l:$81, c:$340>
N006 ( 5, 5) [000020] ---XG------- | /--* IND ubyte <l:$81, c:$340>
N004 ( 1, 1) [000037] ------------ | | | /--* CNS_INT long 8 field offset Fseq[F] $240
N005 ( 2, 2) [000038] -------N---- | | \--* ADD byref $280
N003 ( 1, 1) [000019] ------------ | | \--* LCL_VAR ref V02 tmp1 u:3 (last use) $200
N008 ( 5, 5) [000045] -A-XG---R--- \--* ASG int $VN.Void
N007 ( 1, 1) [000044] D------N---- \--* LCL_VAR int V03 cse0 <l:$81, c:$340>
So what you get is really the value produced by (byte)var0.F
.
I see. Here is a similar case with a similar problem, which I assume is the same then. It is a bit simpler:
public class Program
{
static short s_2 = -1000;
public static void Main()
{
ulong var1 = (ushort)(1U ^ s_2);
Console.WriteLine(var1);
}
}
Turns out that the new case is a completely different issue - fgMorphCast
and/or optNarrowTree
manage to screw up and dropped the cast to ushort
:
[000015] --CXG------- * CALL void System.Console.WriteLine
[000014] --CXG----U-- arg0 \--* CAST long <- ulong <- uint
[000013] --CXG------- \--* CAST int <- ushort <- long
[000011] --CXG------- | /--* CAST long <- int
[000003] ----G------- | | | /--* FIELD short s_2
[000010] --CXG------- | | \--* COMMA short
[000009] H-CXG------- | | \--* CALL help long HELPER.CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
[000005] ------------ arg0 | | +--* CNS_INT long 0x7ffe4b784598
[000006] ------------ arg1 | | \--* CNS_INT int 1
[000012] --CXG------- \--* XOR long
[000002] ------------ \--* CAST long <- int
[000001] ------------ \--* CNS_INT int 1
and after morph we have
[000015] --CXG+------ * CALL void System.Console.WriteLine
[000014] --CXG+---U-- arg0 in rcx \--* CAST long <- ulong <- uint
[000002] -----+------ | /--* CNS_INT int 1
[000012] --CXG+------ \--* XOR int
[000011] --CXG+------ \--* NOP ushort
[000003] ----G+------ | /--* CLS_VAR short Hnd=0x4b784770 Fseq[s_2]
[000010] --CXG+------ \--* COMMA short
[000009] H-CXG+------ \--* CALL help long HELPER.CORINFO_HELP_GETSHARED_NONGCSTATIC_BASE
[000005] -----+------ arg0 in rcx +--* CNS_INT long 0x7ffe4b784598
[000006] -----+------ arg1 in rdx \--* CNS_INT int 1
The XOR got narrowed down from long
to int
(correct) and at the same time the cast to ushort
has disappeared.
I suggest opening a separate issue for this. And out of curiosity - how did you come up with this example?
I opened dotnet/coreclr#18238.
And out of curiosity - how did you come up with this example?
As a project for a university course we wrote a fuzzer that generates C# programs, https://github.com/jakobbotsch/Fuzzlyn. It generates a program, then compares the results of all static and local variables when running in debug and release mode. I have many more examples, but we have no automatic reducer yet, so I have been reducing them to minimal repros by hand. I am not too familiar with the JIT internals, so it is a little hard for me to know which problems are new and which aren't.
EDIT: It is of course heavily inspired by Csmith, which I'm sure you're familiar with.
I am not too familiar with the JIT internals, so it is a little hard for me to know which problems are new and which aren't.
Well, one thing is for sure - unfortunately it has a problem with casts, small int indirs & co. I lost count of the issues (both bugs and code quality issues) I've seen in this area.
Well, one thing is for sure - unfortunately it has a problem with casts, small int indirs & co. I lost count of the issues (both bugs and code quality issues) I've seen in this area.
What is the state of RyuJit being shared with .NET Framework? I thought it was shared, but strangely none of the 4 issues I have found today (which includes one of yours, dotnet/runtime#7141) have reproed on the full .NET framework.
A 5th 'bug' I have found is the following, but I don't know if this is even a bug, rather than an oddity of the idiv
instruction:
int value = int.MinValue % -1; // constant-folded to 0 by Roslyn
value = int.MinValue;
value %= -1; // OverflowException thrown at runtime
At least the / -1
case is probably implementation-defined behavior, but it feels like % -1
should have a well-defined result.
cc @dotnet/jit-contrib
As a project for a university course we wrote a fuzzer that generates C# programs, https://github.com/jakobbotsch/Fuzzlyn.
@jakobbotsch Very good work - does it test all possible syntax variations or just a subset of C#/IL?
@jakobbotsch thanks for the reports. Sounds like an interesting tool.
If you have other examples where you suspect bugs, feel free to share them earlier rather than later. We can help with the work of reducing things to simpler cases, or sorting out duplicates.
As for reproing on desktop: it would help if you described exactly what steps you are taking. Desktop has a number of different versions and jit implementations, and input IL can also vary depending on which CSC is being used.
@jakobbotsch Very good work - does it test all possible syntax variations or just a subset of C#/IL?
It is a very limited subset for now. We were working on more features, but since we already have results we switched to posting some of them.
If you have other examples where you suspect bugs, feel free to share them earlier rather than later. We can help with the work of reducing things to simpler cases, or sorting out duplicates.
Will do, thanks.
As for reproing on desktop: it would help if you described exactly what steps you are taking. Desktop has a number of different versions and jit implementations, and input IL can also vary depending on which CSC is being used.
Thanks, a little rubber-duck debugging helped. 馃槃
My problem was that I had 'Prefer 32-bit' checked. The issues indeed repro with standard .NET framework for 64-bit apps.
int value = int.MinValue % -1; // constant-folded to 0 by Roslyn
Hmm, this appears to be a bug in the C# compiler. The C# spec says:
If the left operand is the smallest int or long value and the right operand is -1, a System.OverflowException is thrown. In no case does x % y throw an exception where x / y would not throw an exception.
The ECMA spec says something similar (though it mentions a different exception!?):
Integral operations can throw System.ArithmeticException if value1 is the smallest representable integer value and value2 is -1.
Hmm, this appears to be a bug in the C# compiler. The C# spec says:
I see. I'll ask in the Roslyn repo.
@mikedn
though it mentions a different exception!?
OverflowException
derives from ArithmeticException
, so if OverflowException
is thrown, both specs should be satisfied.
Here is a case which looks different, although from some experiments I am guessing it is also related to CSE:
using System;
public class Program
{
public static void Main()
{
ushort[] var2 = new ushort[]{ushort.MaxValue};
var2[0] = var2[0];
int[] var1 = new int[]{-1};
int val = -1 / var1[0];
Console.WriteLine(val); // 1 in debug, 0 in release
}
}
Here is a case which looks different, although from some experiments I am guessing it is also related to CSE:
Yep, it's the same incorrect value numbering of indirs. var2[0]
and var1[0]
have the same value number (even if one produces 0xffff and the other 0xffffffff) and CSE replaces var1[0]
with the value produced earlier by var2[0]
.
And incorrect value numbering aside, CSE is dubious anyway. It doesn't make sense to introduce a new temporary variable to hold the result of var2[0]
when this is actually a constant. And even if it's not a constant it would make more sense to re-use whatever value was used to initialize var1[0]
so that the temporary has a shorter life.
I spent the weekend writing an automatic reducer and running it on all our examples. The programs can be seen here. I have looked through most of them and I believe they are all instances of the bugs I have already reported. though it's hard for me to tell for sure. Once these issues are fixed I will rerun and filter away the non-interesting programs.
I took a closer look at this and from what I can see the problem occurs because VNApplySelectorsTypeCheck
does not take casts into account when values are constants:
https://github.com/dotnet/coreclr/blob/eb31c358b27182dae487406a3247dfe995beb40f/src/jit/valuenum.cpp#L2574
Here is a simpler repro:
struct S0
{
public uint F0;
public byte F2;
public int F3;
public sbyte F5;
public long F8;
}
public class Program
{
static uint s_0;
public static void Main()
{
S0 vr3 = new S0();
vr3.F0 = 0x10001;
uint vr4 = (ushort)vr3.F0;
s_0 = vr4;
System.Console.WriteLine(vr4); // 1 in debug, 0x10001 in release
}
}
After value-numbering we get:
***** BB01, stmt 1
( 5, 4) [000005] ------------ * STMT void (IL 0x000...0x003)
N001 ( 1, 1) [000003] ------------ | /--* CNS_INT int 0 $80
N003 ( 5, 4) [000004] IA------R--- \--* ASG struct (init) $VN.Void
N002 ( 3, 2) [000001] D------N---- \--* LCL_VAR struct V00 loc0 d:3
***** BB01, stmt 2
( 5, 9) [000011] ------------ * STMT void (IL 0x008...0x00F)
N001 ( 1, 4) [000008] ------------ | /--* CNS_INT int 0x10001 $82
N003 ( 5, 9) [000010] -A------R--- \--* ASG int $82
N002 ( 3, 4) [000006] U------N---- \--* LCL_FLD int V00 loc0 ud:3->4[+0] Fseq[F0] $1c0
***** BB01, stmt 3
( 4, 5) [000018] ------------ * STMT void (IL 0x014...0x01C)
N001 ( 4, 5) [000012] C----------- | /--* LCL_FLD ushort V00 loc0 u:4[+0] Fseq[F0] (last use) $82
N003 ( 4, 5) [000017] -A------R--- \--* ASG int $82
N002 ( 1, 1) [000016] D------N---- \--* LCL_VAR int V02 tmp1 d:3 $82
***** BB01, stmt 4
( 5, 6) [000023] ------------ * STMT void (IL ???... ???)
N002 ( 1, 1) [000020] ------------ | /--* LCL_VAR int V02 tmp1 u:3 $82
N003 ( 5, 6) [000022] -A--G------- \--* ASG int $82
N001 ( 3, 4) [000021] ----G--N---- \--* CLS_VAR int Hnd=0xffa45288 Fseq[s_0] $1c1
***** BB01, stmt 5
( 15, 7) [000026] ------------ * STMT void (IL ???...0x026)
N005 ( 15, 7) [000024] --CXG------- \--* CALL void System.Console.WriteLine $VN.Void
N003 ( 1, 1) [000019] ------------ arg0 in rcx \--* LCL_VAR int V02 tmp1 u:3 (last use) $82
***** BB01, stmt 6
( 0, 0) [000028] ------------ * STMT void (IL 0x026... ???)
N001 ( 0, 0) [000027] ------------ \--* RETURN void $2c0
Node 12 gets the same VN as 0x10001, which does not seem correct (since it is a smaller type). This is propagated to node 19 which causes the issue.
s_0 = vr4;
is needed for repro because otherwise the call just uses a LCL_FLD
temp directly. This actually leads to worse (but correct) code-gen since CSE does not optimize this:
C744242801000100 mov dword ptr [rsp+28H], 0x10001
0FB74C2428 movzx rcx, word ptr [rsp+28H]
E813FCFFFF call System.Console:WriteLine(int)
vs
B901000100 mov ecx, 0x10001
E811FCFFFF call System.Console:WriteLine(int)
It also does not repro if S0
is made smaller, because then the struct is promoted and morph correctly handles the constant propagation in this case.
Does my analysis seem correct? I would like to try to submit a fix for this.
Most helpful comment
I opened dotnet/coreclr#18238.
As a project for a university course we wrote a fuzzer that generates C# programs, https://github.com/jakobbotsch/Fuzzlyn. It generates a program, then compares the results of all static and local variables when running in debug and release mode. I have many more examples, but we have no automatic reducer yet, so I have been reducing them to minimal repros by hand. I am not too familiar with the JIT internals, so it is a little hard for me to know which problems are new and which aren't.
EDIT: It is of course heavily inspired by Csmith, which I'm sure you're familiar with.