Runtime: Remove GT_ADDR nodes

Created on 8 Sep 2018  Â·  18Comments  Â·  Source: dotnet/runtime

Similar to dotnet/runtime#10873 but about GT_ADDR nodes. Like GT_ASG nodes, GT_ADDR nodes have one fundamental issue: they change the semantics of their operand nodes.

A GT_LCL_VAR that is used by GT_ADDR is no longer a node that produces a value, it's a node that represents a location. It also affects GT_FIELD so you can have a chain like GT_FIELD(GT_ADDR(GT_FIELD(GT_ADDR(GT_FIELD(…))))) where the outer GT_FIELD is an indirection that produces a value while the inner ones just represent locations.

This approach results in various code complications, inefficiencies and bugs.

category:implementation
theme:ir
skill-level:expert
cost:extra-large

JitUntriaged area-CodeGen-coreclr enhancement hard problem

Most helpful comment

Maybe I'm being idealistic but I'd like to think we can offer problems of all difficulty levels under up-for-grabs, but if that isn't consistent with the accepted meaning of that label, I can stop.

I don't know, as I already said, I'm not sure what the meaning of the label truly is. I guess that some kind of "hard" is perfectly fine for up-for-grabs. But when the work spans more than half of the JIT, interferes with other issues, requires multiple PRs to get it done then it's probably a bit too much.

Sometimes I wonder how I ended up doing this. I guess that after a couple of years of looking at the JIT codebase, pieces finally fell into place and this looks feasible. Otherwise I probably wouldn't have opened all these issues to begin with.

All 18 comments

The JIT already uses GT_LCL_VAR_ADDR and GT_LCL_FLD_ADDR nodes after rationalization, the importer should produce these directly from IL ldloca & co. Hopefully that's not too complicated.

The question is what to do with GT_FIELD. Add GT_FIELD_ADDR? Make GT_FIELD always take and return addresses and have the importer add an extra GT_IND instead of waiting until morph?

@RussKeldorph I'm not quite sure what up-for-grabs really means but if it implies that the issue is perhaps more approachable then it might be misleading in this case.

Removing GT_ADDR completely tends to run into the same issues as GT_ASG removal. In particular, we need to be able to have struct typed GT_LCL_FLD nodes in the IR. And then, getting that strays into 1st class structs territory so it's double the fun and not exactly trivial.

BTW, you can also assign it to me since I have every intention of getting rid of GT_ASG/GT_ADDR/GT_LIST and whatever other legacy IR crap I run into. Though at the speed reviews progress that will probably take a year, if not more :grin:

@mikedn How about now?

Maybe I'm being idealistic but I'd like to think we can offer problems of all difficulty levels under up-for-grabs, but if that isn't consistent with the accepted meaning of that label, I can stop.

Maybe I'm being idealistic but I'd like to think we can offer problems of all difficulty levels under up-for-grabs, but if that isn't consistent with the accepted meaning of that label, I can stop.

I don't know, as I already said, I'm not sure what the meaning of the label truly is. I guess that some kind of "hard" is perfectly fine for up-for-grabs. But when the work spans more than half of the JIT, interferes with other issues, requires multiple PRs to get it done then it's probably a bit too much.

Sometimes I wonder how I ended up doing this. I guess that after a couple of years of looking at the JIT codebase, pieces finally fell into place and this looks feasible. Otherwise I probably wouldn't have opened all these issues to begin with.

@mikedn I wish you the best of luck in cleaning up these disgusting design mistakes! Really need to get rid of stuff like this to move the JIT forward.

@GSPP Thanks!

In particular, we need to be able to have struct typed GT_LCL_FLD nodes in the IR.

I should mention that preliminary work to support this is on going in dotnet/coreclr#21705. Just so it doesn't look as if this issue (or the GT_ASG one) is stagnating :)

A (hopefully complete) list of ADDR uses and some thoughts about how they can be eliminated:

  • Taking the address of a local variable - ADDR(LCL_VAR) (or ADDR(LCL_FLD) for fields of local variables). The obvious thing to do is to replace these with LCL_VAR_ADDR and LCL_FLD_ADDR. Potential problems in doing this:

    • Do we still want to have address taken variables that are not address exposed and thus tracked variables? That would require setting SSA numbers on these local address nodes but that's rather strange since the value they produce has nothing to do with SSA. IMO all address taken variables should be address exposed and all indirect local accesses recognized by IsLocalAddrExpr & co. should be converted to LCL_FLDs.

    • LCL_VAR nodes that reference address exposed variables have GTF_GLOB_REF set on them, even if they're used by an ADDR node.

      For a IND(ADDR(LCL_VAR)) tree, normal side effect propagation pushes GTF_GLOB_REF up the tree to the indir, which is what we want. But if there's no indir then the same side effect propagation poisons the tree with GTF_GLOB_REF even if the address of a variable can't really be aliased. This does result in CQ issues - one case I ran into is that call args that are address of locals can't be late args due the bogus side effect they carry.

      So local address nodes should never have GTF_GLOB_REF but then we need to ensure that indirections of such address do somehow get GTF_GLOB_REF if the variable is address exposed. Or as proposed above - such indirect local access trees should always be converted to LCL_FLDs.

  • Field address access - unfortunately ldlflda & co. are imported as ADDR(FIELD). The only thing that can be done is to introduce FIELD_ADDR. Given the short lifespan of FIELD nodes this shouldn't be a big problem, though some care may be needed around fgUnwrapProxy (is this still needed in .NET Core?) and R2R affected fields.
  • Array element address access - similar to fields, ldelema is imported as ADDR(INDEX). And of course, the solution is similar - introduce INDEX_ADDR. Except that one already exists (which should not be a big problem) and that array access is actually a bit more complicated, as we'll see below.
  • Array value access, especially when combined with struct types, is quite a problem, perhaps the worst problem associated with ADDR. INDEX gets expanded to IND(element address tree) which may seem rather innocuous. It's not because:

    • The IND node has an ArrayInfo out-of-band annotation attached to it. This makes changing/eliminating this node problematic as the annotation has to be preserved somehow.

    • Even if the element type is struct you still get an IND, not an OBJ. Likely because of difficulties caused by the above mentioned annotation, JIT code that needs an OBJ does not change the IND but instead piles an OBJ on top of it by means of ADDR - OBJ(ADDR(IND(element address)). To get rid of ADDR there's no choice but to change the IND but then we need to ensure that the annotation is preserved properly. Or just get rid of it, best I can tell this annotation can be replaced by a special field sequence.

    • And just to complicate things even more, the IND node is actually wrapped in a COMMA. That may be less of an issue as far as ADDR is concerned but it is a problem for ASG elimination. So we need to keep that in mind when making changes around array element access. It's probably best to sink the COMMA below the IND so we get IND(COMMA(ARR_BOUNDS_CHK, element address)).

  • (Hopefully) the last and most exotic ADDR use is to attach struct type information to nodes that lack it, by wrapping them in OBJ(ADDR(node)), even if the node doesn't even represent an addressable location. This happens with SIMD typed nodes that are used as call arguments - call morphing needs to know the exact struct type for ABI decisions and SIMD node do not carry that information. To get rid of ADDR we have the following options:

    • Add struct type information to SIMD nodes - probably overkill since the this information is only required when these nodes are used as call args

    • Add a new node that serves the same purpose as OBJ(ADDR(x)) - to carry struct type info. Might work but I have reservations about extra nodes that act like wrappers for annotation purposes, they require special care to avoid them becoming detached from the child node they're supposed to wrap (e.g. GTF_NO_CSE).

    • Include the necessary struct type info in GenTreeCall::Use so it does not need to be recovered from the argument node. This makes the most sense to me, its information that's provided by the call signature and as such it should be stored with the call node and not its arguments. The only drawback would be that it would make GenTreeCall::Use bigger by adding a pointer to it.

:mips-interest

Hi @mikedn

Testcase: JIT/HardwareIntrinsics/X86/Sse41/ConvertToVector128_r/ConvertToVector128_r.exe

Assertion failed to work for MIPS:

noway_assert(((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) ||
                 ((tree->gtFlags & GTF_GLOB_REF) != 0));

Workaround:

diff --git a/src/jit/morph.cpp b/src/jit/morph.cpp
index 1fc966a..dddaa6a 100644
--- a/src/jit/morph.cpp
+++ b/src/jit/morph.cpp
@@ -6244,7 +6244,7 @@ GenTree* Compiler::fgMorphField(GenTree* tree, MorphAddrContext* mac)
     }

     noway_assert(((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) ||
-                 ((tree->gtFlags & GTF_GLOB_REF) != 0));
+                 ((tree->gtFlags & GTF_GLOB_REF) != 0) MIPS64_ONLY(|| objRef->gtOper == GT_LCL_VAR));

     if (tree->gtField.gtFldMayOverlap)
     {

Request for comments.

Thanks,
Leslie Zhai

@xiangzhai If adding the LCL_VAR makes the assert pass it means that you have something like FIELD(LCL_VAR) and then FIELD should really have GTF_GLOB_REF set because there's generally no way to tell where the pointer stored in the local variable points to. So the real problem here is likely the missing GTF_GLOB_REF and changing the assert won't fix that.

You should check Compiler::gtNewFieldRef, that one sets GTF_GLOB_REF in most cases. The only exception involves trees like FIELD(ADDR(LCL_VAR)) so it's not clear why GTF_GLOB_REF would be missing. Can you provide a JIT dump?

Since it's MIPS specific I would guess that it's caused by other MIPS change you have done, that's possibly related to struct call parameters handling (does MIPS pass structs by reference or by value?)
Or, since this is a x86 intrinsics test, maybe something is not handled correctly in the intrinsic import code.

And this isn't probably directly related to this issue but yes, the removal of ADDR should make things simpler and safer. But it will take a while until this is done.

Hi @mikedn

Thanks for your teaching!

So the real problem here is likely the missing GTF_GLOB_REF

Thanks for pointing out the root cause!

Can you provide a JIT dump?

Sorry! There are lots of debugging output messages in JitFunctionTrace.log.

struct call parameters handling (does MIPS pass structs by reference or by value?)

By value:
pass-structs

Or, since this is a x86 intrinsics test ...

Yes, Sse2.IsSupported, for example, should directly return false for MIPS without trigging assertion failure.

\cc @QiaoVanke

Thanks,
Leslie Zhai

Sorry! There are lots of debugging output messages in JitDisasm.log.

I was referring to a JIT dump obtained by setting the environment variable COMPlus_JitDump to the method name that asserts. That would show the IR that is generating the assert. But the log also helps a bit - we can see that the method that is being compiled is <>c:<JoinInternal>b__37_0(struct,struct):this. This is a lambda expression used in one of the Path.JoinInternal methods - https://github.com/dotnet/coreclr/blob/a9f3fc16483eecfc47fb79c362811d870be02249/src/System.Private.CoreLib/shared/System/IO/Path.cs#L659

This means that:

  • the problem is unlike to be related to intrinsics as this method does not use intrinsics. At least not directly.
  • the method has struct parameters so it is very likely that the problem has to do with the handling of method parameters

Without the dump, I can only guess that the local variable is perhaps one of the struct parameters and that FIELD(LCL_VAR) should have been FIELD(ADDR(LCL_VAR)). But I don't know how ADDR could have disappeared along the way (or not being generated in the first place).

It might be useful to highlight a discrepancy between IL instructions and JIT's IR: in IL you have

IL_0000: ldarg.2
IL_0001: ldfld !0 valuetype System.ValueTuple`5<native int, int32, native int, int32, bool>::Item1

But in the JIT IR the FIELD node expects an address so the above IL is imported as FIELD(ADDR(LCL_VAR)).

I was referring to a JIT dump obtained by setting the environment variable COMPlus_JitDump to the method name that asserts.

Sorry! I uploaded wrong log ...

Here JitDump.log is.

Thanks,
Leslie Zhai

Thanks, the dump says:

Morphing args for 7.CALL:
Replacing address of implicit by ref struct parameter with byref:
               [000004] ------------              *  LCL_VAR   byref  V02 arg2         

Assert failure(PID 13339 [0x0000341b], Thread: 13339 [0x341b]): Assertion failed '((objRef != nullptr) && (objRef->IsLocalAddrExpr() != nullptr)) || ((tree->gtFlags & GTF_GLOB_REF) != 0)' in '<>c:<JoinInternal>b__37_0(struct,struct):this' (IL size 120)

It seems that you have "implicit by ref" parameter morphing enabled, even though you say that on MIPS struct args are passed by value. Perhaps you need to adjust the ifdef in fgMorphImplicitByRefArgs?

bool Compiler::fgMorphImplicitByRefArgs(GenTree* tree)
{
#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_)

    return false;

#else  // (_TARGET_AMD64_ && !UNIX_AMD64_ABI) || _TARGET_ARM64_

    bool changed = false;

Though as it is it will already reject a non x64/arm64 target so presumably you already changed this in some way.

Though as it is it will already reject a non x64/arm64 target so presumably you already changed this in some way.

Thanks for pointing out my fault:

diff --git a/src/jit/morph.cpp b/src/jit/morph.cpp
index 1fc966a..2b8a8ee 100644
--- a/src/jit/morph.cpp
+++ b/src/jit/morph.cpp
@@ -17911,7 +17911,7 @@ void Compiler::fgMarkDemotedImplicitByRefArgs()
  */
 bool Compiler::fgMorphImplicitByRefArgs(GenTree* tree)
 {
-#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_) && !defined(_TARGET_MIPS64_)
+#if (!defined(_TARGET_AMD64_) || defined(UNIX_AMD64_ABI)) && !defined(_TARGET_ARM64_)

     return false;

Thanks,
Leslie Zhai

struct call parameters handling (does MIPS pass structs by reference or by value?)

By value:
pass-structs

Sorry! @QiaoVanke pointed out my fault: It is a special case for O32, but not mips64r2 N64. Short word: By Reference!

Thanks,
Leslie Zhai

You should check Compiler::gtNewFieldRef, that one sets GTF_GLOB_REF in most cases. The only exception involves trees like FIELD(ADDR(LCL_VAR)) so it's not clear why GTF_GLOB_REF would be missing.

@QiaoVanke fixed it:

diff --git a/src/jit/compiler.hpp b/src/jit/compiler.hpp
index 7c8660f..99176db 100644
--- a/src/jit/compiler.hpp
+++ b/src/jit/compiler.hpp
@@ -1192,7 +1192,7 @@ inline GenTree* Compiler::gtNewFieldRef(var_types typ, CORINFO_FIELD_HANDLE fldH
     {
         unsigned lclNum                  = obj->gtOp.gtOp1->gtLclVarCommon.gtLclNum;
         lvaTable[lclNum].lvFieldAccessed = 1;
-#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
         // These structs are passed by reference; we should probably be able to treat these
         // as non-global refs, but downstream logic expects these to be marked this way.
         if (lvaTable[lclNum].lvIsParam)
@@ -1780,10 +1780,10 @@ inline void LclVarDsc::incRefCnts(BasicBlock::weight_t weight, Compiler* comp, R

             bool doubleWeight = lvIsTemp;

-#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#if defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)
             // and, for the time being, implict byref params
             doubleWeight |= lvIsImplicitByRef;
-#endif // defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_)
+#endif // defined(_TARGET_AMD64_) || defined(_TARGET_ARM64_) || defined(_TARGET_MIPS64_)

             if (doubleWeight && (weight * 2 > weight))
             {

Thanks,
Leslie Zhai

Was this page helpful?
0 / 5 - 0 ratings