One: [luci] Problem with postorder_traversal() with dangling virtual nodes

Created on 30 Dec 2020  路  16Comments  路  Source: Samsung/ONE

From PR #5507

              +-- CircleWhileOut(ofm1) -- CircleOutput
CircleWhile --+
              +-- CircleWhileOut(ofm2)

may have same problem with current type inference logic which uses below code;

bool TypeInferenceSession::to(Graph *g) const
{
  bool changed = false;

  for (auto node : postorder_traversal(output_nodes(g)))
  {
    if (_rule->recognize(node->dialect()))
    {
      DataType dtype = DataType::Unknown;

      if (!dtype_known(node) && inputs_dtype_ready(node))
      {
        if (_rule->infer(node, dtype))
        {
          node->annot(stdex::make_unique<DataTypeAnnotation>(dtype));
          changed = true;
        }
      }
    }
  }

  return changed;
}

We may need to check this too.

Most helpful comment

For dtype, https://github.com/Samsung/ONE/issues/5510#issuecomment-752319413 explains why.
But for shape, little difference is found.
shape works well for now because of following steps.

  1. When CircleWhileOut is created, shape is copied, but ShapeStatus is still UNDEFINED.
    (circle node shape : known / shape status : UNDEFINED / loco shape : nullptr)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/import/src/Nodes/CircleWhile.cpp#L110-L117

  2. During ShapeInferencePass, the CircleWhileOut is not checked.
    (circle node shape : known / shape status : UNDEFINED / loco shape : nullptr)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/loco/src/Service/ShapeInference.cpp#L75

  3. During MigrateLegacyShapeDtypePass, there is nothing to copy.
    (circle node shape : known / shape status : UNDEFINED / loco shape : nullptr)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/pass/src/MigrateLegacyShapeDtypePass.cpp#L66

  4. When CircleWhileOut is exported, shape description is not created because shape status is UNDEFINED.
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/export/src/CircleTensorExporter.cpp#L116-L117

  5. Even shape description is not created, default offset will be returned because of following codes.
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/export/src/CircleTensorExporter.cpp#L440-L442

In conclusion, exported dtype is correct even thought there may be some problems, but exported shape is really invalid.
(But errors are not occurred)

All 16 comments

@llFreetimell , can you check this too?

In the above point of view, that code also cannot check While_002 bug.
However, it works well for now because of following steps.

  1. When CircleWhileOut is created, shape and dtype are copied. (circle node dtype : known / loco dtype : unknown)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/import/src/Nodes/CircleWhile.cpp#L110-L115

  2. During TypeInferencePass, the CircleWhileOut is not checked. (circle node dtype : known / loco dtype : unknown)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/loco/src/Service/TypeInference.cpp#L66

  3. During MigrateLegacyShapeDtypePass, there is nothing to copy. (circle node dtype : known / loco dtype : unknown)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/pass/src/MigrateLegacyShapeDtypePass.cpp#L99-L105

  4. When CircleWhileOut is exported, dtype of circle node is copied.
    (FYI, before last change, node_dtype() in TypeBridge.h was used but it also used dtype of circle node)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/export/src/CircleTensorExporter.cpp#L114

@llFreetimell , thanks for the investigation!

  • circle node dtype : known --> node dtype of circle node itself
  • loco dtype : unknown --> annotation dtype from type inference
  • I see that problem itself is "type is unknown" in new CircleTypeInferencePass for While Op
  • Issue exist when we use postorder_traversal(output_nodes(g)) for virtual outputs that are not used as graph outputs

I'll rename this issue to know that we may have some problem in similar cases.

When I checked again, shape and dtype behave differently... I will investigate about it!

For dtype, https://github.com/Samsung/ONE/issues/5510#issuecomment-752319413 explains why.
But for shape, little difference is found.
shape works well for now because of following steps.

  1. When CircleWhileOut is created, shape is copied, but ShapeStatus is still UNDEFINED.
    (circle node shape : known / shape status : UNDEFINED / loco shape : nullptr)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/import/src/Nodes/CircleWhile.cpp#L110-L117

  2. During ShapeInferencePass, the CircleWhileOut is not checked.
    (circle node shape : known / shape status : UNDEFINED / loco shape : nullptr)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/loco/src/Service/ShapeInference.cpp#L75

  3. During MigrateLegacyShapeDtypePass, there is nothing to copy.
    (circle node shape : known / shape status : UNDEFINED / loco shape : nullptr)
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/pass/src/MigrateLegacyShapeDtypePass.cpp#L66

  4. When CircleWhileOut is exported, shape description is not created because shape status is UNDEFINED.
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/export/src/CircleTensorExporter.cpp#L116-L117

  5. Even shape description is not created, default offset will be returned because of following codes.
    https://github.com/Samsung/ONE/blob/2c71236a2982874fff2b501f4c8aebbb3d08a45a/compiler/luci/export/src/CircleTensorExporter.cpp#L440-L442

In conclusion, exported dtype is correct even thought there may be some problems, but exported shape is really invalid.
(But errors are not occurred)

I think step 5 if (info.shape_status() == ShapeStatus::VALID) is a hack (I think I added this) to avoid not visited virtual output nodes with postorder_traversal() :)

@llFreetimell As @seanshpark said,

From my past experience all_nodes may make visit to unwanted nodes.

And, it can't distinguish dead virtual node from alive one. How about adding some logic to loco::active_nodes finding nodes that should be added like virtual nodes. You can use VirtualOutputDetector.

How about adding some logic to loco::active_nodes finding nodes that should be added like virtual nodes

I agree to use functions like loco::active_nodes.
But we can add some logic to luci, not to loco. Virtual nodes are only in luci :(

it can't distinguish dead virtual node from alive one.
You can use VirtualOutputDetector

AFAIK, current VirtualOutputDetector only checks the type of nodes. It means that we cannot determine whether a virtual node is dead or alive with using VirtualOutputDetector
We may be able to use bool DeadNodeQueryServiceImpl::isDeadNode(loco::Node *node) by moving DeadNodeQueryService.h to luci/lang/include..!

Inspired by @mhs4670go's comment, how about following codes?

// For below codes,
// We should move DeadNodeQueryService.h to luci/lang/include/

for (auto node : loco::all_nodes(g))
{
  if (!isDeadNode(node))
  {
    loco::DataType dtype;
    auto circle_node = loco::must_cast<luci::CircleNode *>(node);

    if (type_infer_rule.infer(circle_node, dtype) && circle_node->dtype() != dtype)
    {
      circle_node->dtype(dtype);
      changed = true;
    }
  }
}

@llFreetimell Well, the thing is all_nodes and isDeadNode can't cover below network and neither can RemoveDeadNodeWithQueryPass.

              +-- CircleWhileOut(ofm1) -- CircleOutput
CircleWhile --+
              +-- CircleWhileOut(ofm2)

                  CircleWhileOut(dead node) -> this node won't be removed

So, here is one of those solutions.

  1. Get active_nodes
  2. for loop. If there is virtual node's parent, add its kids who's not inserted yet.

Or, there could be better solutions.

@llFreetimell , if you have much work to do depending on PR 5507,

  • we could go with simple changes first, as this code block is not used yet.
  • later when you activate this Pass and find some problems, that it would be more specific to the problem and easy to fix.

@mhs4670go
If i understand correctly, all_nodes + isDeadNode can cover following case.

              +-- CircleWhileOut(ofm1) -- CircleOutput
CircleWhile --+
              +-- CircleWhileOut(ofm2)

                  CircleWhileOut(dead node) -> this node won't be removed
  1. CircleWhile, CircleWhileOut(ofm1), CircleOutput are alive by following codes.
    https://github.com/Samsung/ONE/blob/6d34185c0c4bf43d3cde46de36d7ad985e1e5a96/compiler/luci/lang/src/DeadNodeQueryService.cpp#L50-L54

  2. CircleWhileOut(ofm1) is alive by following codes, because input of CircleWhileOut(ofm1) is in active nodes.
    https://github.com/Samsung/ONE/blob/6d34185c0c4bf43d3cde46de36d7ad985e1e5a96/compiler/luci/lang/src/DeadNodeQueryService.cpp#L57-L68

As a result, only CircleWhileOut(dead node) is determined as dead node.

@seanshpark
If loco::all_nodes + isDeadNode approach is okay, there is only one thing for #5507. That is just moving DeadNodeQueryService.h to luci/lang/include/ and use it! :)

That is just moving DeadNodeQueryService.h to luci/lang/include/ and use it! :)

OK, Let's go with your suggestion :)

@llFreetimell Oh, I thought virtual nodes always return false in isDeadNode but it checks if it is active nodes as well. That approach do cover the case:)

This issue will be resolved in new CircleTypeInferencePass and CircleShapeInferencePass.

However, it will not be resolved in original TypeInferencePass and ShapeInferencePass.
Of course, we can apply this solution to original TypeInferencePass and ShapeInferencePass in technically, but I want to leave it as it is because some users are still using the original passes and thus I am not sure about the side effect :)

Maybe we can close it :)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

binarman picture binarman  路  3Comments

seanshpark picture seanshpark  路  3Comments

YongseopKim picture YongseopKim  路  3Comments

wateret picture wateret  路  4Comments

hasw7569 picture hasw7569  路  4Comments