Similar to dotnet/runtime#10873 but about GT_ADDR nodes. Like GT_ASG nodes, GT_ADDR nodes have one fundamental issue: they change the semantics of their operand nodes.
A GT_LCL_VAR that is used by GT_ADDR is no longer a node that produces a value, it's a node that represents a location. It also affects GT_FIELD so you can have a chain like GT_FIELD(GT_ADDR(GT_FIELD(GT_ADDR(GT_FIELD(…))))) where the outer GT_FIELD is an indirection that produces a value while the inner ones just represent locations.
This approach results in various code complications, inefficiencies and bugs.
category:implementation
theme:ir
skill-level:expert
cost:extra-large
The JIT already uses GT_LCL_VAR_ADDR and GT_LCL_FLD_ADDR nodes after rationalization, the importer should produce these directly from IL ldloca & co. Hopefully that's not too complicated.
The question is what to do with GT_FIELD. Add GT_FIELD_ADDR? Make GT_FIELD always take and return addresses and have the importer add an extra GT_IND instead of waiting until morph?
@RussKeldorph I'm not quite sure what up-for-grabs really means but if it implies that the issue is perhaps more approachable then it might be misleading in this case.
Removing GT_ADDR completely tends to run into the same issues as GT_ASG removal. In particular, we need to be able to have struct typed GT_LCL_FLD nodes in the IR. And then, getting that strays into 1st class structs territory so it's double the fun and not exactly trivial.
BTW, you can also assign it to me since I have every intention of getting rid of GT_ASG/GT_ADDR/GT_LIST and whatever other legacy IR crap I run into. Though at the speed reviews progress that will probably take a year, if not more :grin:
@mikedn How about now?
Maybe I'm being idealistic but I'd like to think we can offer problems of all difficulty levels under up-for-grabs, but if that isn't consistent with the accepted meaning of that label, I can stop.
Maybe I'm being idealistic but I'd like to think we can offer problems of all difficulty levels under up-for-grabs, but if that isn't consistent with the accepted meaning of that label, I can stop.
I don't know, as I already said, I'm not sure what the meaning of the label truly is. I guess that some kind of "hard" is perfectly fine for up-for-grabs. But when the work spans more than half of the JIT, interferes with other issues, requires multiple PRs to get it done then it's probably a bit too much.
Sometimes I wonder how I ended up doing this. I guess that after a couple of years of looking at the JIT codebase, pieces finally fell into place and this looks feasible. Otherwise I probably wouldn't have opened all these issues to begin with.
@mikedn I wish you the best of luck in cleaning up these disgusting design mistakes! Really need to get rid of stuff like this to move the JIT forward.
@GSPP Thanks!
In particular, we need to be able to have struct typed GT_LCL_FLD nodes in the IR.
I should mention that preliminary work to support this is on going in dotnet/coreclr#21705. Just so it doesn't look as if this issue (or the GT_ASG one) is stagnating :)
A (hopefully complete) list of ADDR uses and some thoughts about how they can be eliminated:
ADDR(LCL_VAR) (or ADDR(LCL_FLD) for fields of local variables). The obvious thing to do is to replace these with LCL_VAR_ADDR and LCL_FLD_ADDR. Potential problems in doing this:IsLocalAddrExpr & co. should be converted to LCL_FLDs.LCL_VAR nodes that reference address exposed variables have GTF_GLOB_REF set on them, even if they're used by an ADDR node.IND(ADDR(LCL_VAR)) tree, normal side effect propagation pushes GTF_GLOB_REF up the tree to the indir, which is what we want. But if there's no indir then the same side effect propagation poisons the tree with GTF_GLOB_REF even if the address of a variable can't really be aliased. This does result in CQ issues - one case I ran into is that call args that are address of locals can't be late args due the bogus side effect they carry.GTF_GLOB_REF but then we need to ensure that indirections of such address do somehow get GTF_GLOB_REF if the variable is address exposed. Or as proposed above - such indirect local access trees should always be converted to LCL_FLDs.ldlflda & co. are imported as ADDR(FIELD). The only thing that can be done is to introduce FIELD_ADDR. Given the short lifespan of FIELD nodes this shouldn't be a big problem, though some care may be needed around fgUnwrapProxy (is this still needed in .NET Core?) and R2R affected fields.ldelema is imported as ADDR(INDEX). And of course, the solution is similar - introduce INDEX_ADDR. Except that one already exists (which should not be a big problem) and that array access is actually a bit more complicated, as we'll see below.ADDR. INDEX gets expanded to IND(element address tree) which may seem rather innocuous. It's not because:IND node has an ArrayInfo out-of-band annotation attached to it. This makes changing/eliminating this node problematic as the annotation has to be preserved somehow.IND, not an OBJ. Likely because of difficulties caused by the above mentioned annotation, JIT code that needs an OBJ does not change the IND but instead piles an OBJ on top of it by means of ADDR - OBJ(ADDR(IND(element address)). To get rid of ADDR there's no choice but to change the IND but then we need to ensure that the annotation is preserved properly. Or just get rid of it, best I can tell this annotation can be replaced by a special field sequence.IND node is actually wrapped in a COMMA. That may be less of an issue as far as ADDR is concerned but it is a problem for ASG elimination. So we need to keep that in mind when making changes around array element access. It's probably best to sink the COMMA below the IND so we get IND(COMMA(ARR_BOUNDS_CHK, element address)).ADDR use is to attach struct type information to nodes that lack it, by wrapping them in OBJ(ADDR(node)), even if the node doesn't even represent an addressable location. This happens with SIMD typed nodes that are used as call arguments - call morphing needs to know the exact struct type for ABI decisions and SIMD node do not carry that information. To get rid of ADDR we have the following options:OBJ(ADDR(x)) - to carry struct type info. Might work but I have reservations about extra nodes that act like wrappers for annotation purposes, they require special care to avoid them becoming detached from the child node they're supposed to wrap (e.g. GTF_NO_CSE).GenTreeCall::Use so it does not need to be recovered from the argument node. This makes the most sense to me, its information that's provided by the call signature and as such it should be stored with the call node and not its arguments. The only drawback would be that it would make GenTreeCall::Use bigger by adding a pointer to it.:mips-interest
Hi @mikedn
Testcase: JIT/HardwareIntrinsics/X86/Sse41/ConvertToVector128_r/ConvertToVector128_r.exe
Assertion failed to work for MIPS:
noway_assert(((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) ||
((tree->gtFlags & GTF_GLOB_REF) != 0));
Workaround:
diff --git a/src/jit/morph.cpp b/src/jit/morph.cpp
index 1fc966a..dddaa6a 100644
--- a/src/jit/morph.cpp
+++ b/src/jit/morph.cpp
@@ -6244,7 +6244,7 @@ GenTree* Compiler::fgMorphField(GenTree* tree, MorphAddrContext* mac)
}
noway_assert(((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) ||
- ((tree->gtFlags & GTF_GLOB_REF) != 0));
+ ((tree->gtFlags & GTF_GLOB_REF) != 0) MIPS64_ONLY(|| objRef->gtOper == GT_LCL_VAR));
if (tree->gtField.gtFldMayOverlap)
{
Request for comments.
Thanks,
Leslie Zhai
@xiangzhai If adding the LCL_VAR makes the assert pass it means that you have something like FIELD(LCL_VAR) and then FIELD should really have GTF_GLOB_REF set because there's generally no way to tell where the pointer stored in the local variable points to. So the real problem here is likely the missing GTF_GLOB_REF and changing the assert won't fix that.
You should check Compiler::gtNewFieldRef, that one sets GTF_GLOB_REF in most cases. The only exception involves trees like FIELD(ADDR(LCL_VAR)) so it's not clear why GTF_GLOB_REF would be missing. Can you provide a JIT dump?
Since it's MIPS specific I would guess that it's caused by other MIPS change you have done, that's possibly related to struct call parameters handling (does MIPS pass structs by reference or by value?)
Or, since this is a x86 intrinsics test, maybe something is not handled correctly in the intrinsic import code.
And this isn't probably directly related to this issue but yes, the removal of ADDR should make things simpler and safer. But it will take a while until this is done.
Hi @mikedn
Thanks for your teaching!
So the real problem here is likely the missing GTF_GLOB_REF
Thanks for pointing out the root cause!
Can you provide a JIT dump?
Sorry! There are lots of debugging output messages in JitFunctionTrace.log.
struct call parameters handling (does MIPS pass structs by reference or by value?)
By value:

Or, since this is a x86 intrinsics test ...
Yes, Sse2.IsSupported, for example, should directly return false for MIPS without trigging assertion failure.
\cc @QiaoVanke
Thanks,
Leslie Zhai
Sorry! There are lots of debugging output messages in JitDisasm.log.
I was referring to a JIT dump obtained by setting the environment variable COMPlus_JitDump to the method name that asserts. That would show the IR that is generating the assert. But the log also helps a bit - we can see that the method that is being compiled is <>c:<JoinInternal>b__37_0(struct,struct):this. This is a lambda expression used in one of the Path.JoinInternal methods - https://github.com/dotnet/coreclr/blob/a9f3fc16483eecfc47fb79c362811d870be02249/src/System.Private.CoreLib/shared/System/IO/Path.cs#L659
This means that:
Without the dump, I can only guess that the local variable is perhaps one of the struct parameters and that FIELD(LCL_VAR) should have been FIELD(ADDR(LCL_VAR)). But I don't know how ADDR could have disappeared along the way (or not being generated in the first place).
It might be useful to highlight a discrepancy between IL instructions and JIT's IR: in IL you have
IL_0000: ldarg.2
IL_0001: ldfld !0 valuetype System.ValueTuple`5<native int, int32, native int, int32, bool>::Item1
But in the JIT IR the FIELD node expects an address so the above IL is imported as FIELD(ADDR(LCL_VAR)).
I was referring to a JIT dump obtained by setting the environment variable COMPlus_JitDump to the method name that asserts.
Sorry! I uploaded wrong log ...
Here JitDump.log is.
Thanks,
Leslie Zhai
Thanks, the dump says:
Morphing args for 7.CALL:
Replacing address of implicit by ref struct parameter with byref:
[000004] ------------ * LCL_VAR byref V02 arg2
Assert failure(PID 13339 [0x0000341b], Thread: 13339 [0x341b]): Assertion failed '((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) || ((tree->gtFlags & GTF_GLOB_REF) != 0)' in '<>c:<JoinInternal>b__37_0(struct,struct):this' (IL size 120)
It seems that you have "implicit by ref" parameter morphing enabled, even though you say that on MIPS struct args are passed by value. Perhaps you need to adjust the ifdef in fgMorphImplicitByRefArgs?
bool Compiler::fgMorphImplicitByRefArgs(GenTree* tree)
{
#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_)
return false;
#else // (_TARGET_AMD64_ && !UNIX_AMD64_ABI) || _TARGET_ARM64_
bool changed = false;
Though as it is it will already reject a non x64/arm64 target so presumably you already changed this in some way.
Though as it is it will already reject a non x64/arm64 target so presumably you already changed this in some way.
Thanks for pointing out my fault:
diff --git a/src/jit/morph.cpp b/src/jit/morph.cpp
index 1fc966a..2b8a8ee 100644
--- a/src/jit/morph.cpp
+++ b/src/jit/morph.cpp
@@ -17911,7 +17911,7 @@ void Compiler::fgMarkDemotedImplicitByRefArgs()
*/
bool Compiler::fgMorphImplicitByRefArgs(GenTree* tree)
{
-#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_) && !defined(_TARGET_MIPS64_)
+#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_)
return false;
Thanks,
Leslie Zhai
struct call parameters handling (does MIPS pass structs by reference or by value?)
By value:
Sorry! @QiaoVanke pointed out my fault: It is a special case for O32, but not mips64r2 N64. Short word: By Reference!
Thanks,
Leslie Zhai
You should check Compiler::gtNewFieldRef, that one sets GTF_GLOB_REF in most cases. The only exception involves trees like FIELD(ADDR(LCL_VAR)) so it's not clear why GTF_GLOB_REF would be missing.
@QiaoVanke fixed it:
diff --git a/src/jit/compiler.hpp b/src/jit/compiler.hpp
index 7c8660f..99176db 100644
--- a/src/jit/compiler.hpp
+++ b/src/jit/compiler.hpp
@@ -1192,7 +1192,7 @@ inline GenTree* Compiler::gtNewFieldRef(var_types typ, CORINFO_FIELD_HANDLE fldH
{
unsigned lclNum = obj->gtOp.gtOp1->gtLclVarCommon.gtLclNum;
lvaTable[lclNum].lvFieldAccessed = 1;
-#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
// These structs are passed by reference; we should probably be able to treat these
// as non-global refs, but downstream logic expects these to be marked this way.
if (lvaTable[lclNum].lvIsParam)
@@ -1780,10 +1780,10 @@ inline void LclVarDsc::incRefCnts(BasicBlock::weight_t weight, Compiler* comp, R
bool doubleWeight = lvIsTemp;
-#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
// and, for the time being, implict byref params
doubleWeight |= lvIsImplicitByRef;
-#endif // defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#endif // defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
if (doubleWeight && (weight * 2 > weight))
{
Thanks,
Leslie Zhai
Most helpful comment
I don't know, as I already said, I'm not sure what the meaning of the label truly is. I guess that some kind of "hard" is perfectly fine for up-for-grabs. But when the work spans more than half of the JIT, interferes with other issues, requires multiple PRs to get it done then it's probably a bit too much.
Sometimes I wonder how I ended up doing this. I guess that after a couple of years of looking at the JIT codebase, pieces finally fell into place and this looks feasible. Otherwise I probably wouldn't have opened all these issues to begin with.