Glow: How to check if an op is supported by a Backend before lowering the op

Created on 14 May 2019  ·  1 Comment  ·  Source: pytorch/glow

For heterogeneous partitioning, we need to assign an op a BackendKind before the op is lowered. The current "isOpSupported" function checks the op after lowering. The current approach is to first check whether an op can be lowered and then check "isOpSupported". However, according to @jfix71, this is not correct:
```
"Seeing this in action, I don't think it makes sense actually. Backends will return true for Nodes they do not know about. E.g. if a backend does not support SparseLengthsWeightedSum, it will return false for isOpSupported(), but true for shouldLower() even though SparseLengthsWeightedSum isn't lowered to anything.

One alternative here could be that when we encounter an unsupported node, clone it, and then try to lower it. If it's not lowered then it's unsupported. Else if it is lowered then recursively do the same check on all of the nodes added during lowering. This is going to require a non-trivial refactoring though, in lowering logic/figuring out how best to clone the Node temporarily and tracking the nodes created from lowering."
```
We need to improve this check.
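
The alternative described in the quote (try to lower an unsupported node, then recursively apply the same check to whatever lowering produced) can be sketched on a toy model. This is only a sketch under stated assumptions: `ToyBackend`, `LoweringTable`, and `isSupportedAfterLowering` are hypothetical names, not Glow APIs, the op names are just labels in a toy table, and the table lookup stands in for cloning a node and running the real lowering on it.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Glow types.
using OpKind = std::string;

struct ToyBackend {
  std::set<OpKind> nativeOps; // ops the backend can run without lowering

  bool isOpSupported(const OpKind &kind) const {
    return nativeOps.count(kind) != 0;
  }
};

// Toy lowering table: maps an op to the ops it is lowered into.
// An op with no entry is not lowered to anything.
using LoweringTable = std::map<OpKind, std::vector<OpKind>>;

// Returns true if `op` is supported directly, or if it lowers (possibly in
// several steps) to ops that are all supported. `remainingDepth` bounds the
// recursion so that a cyclic lowering table cannot loop forever.
bool isSupportedAfterLowering(const ToyBackend &backend,
                              const LoweringTable &lowering, const OpKind &op,
                              int remainingDepth = 16) {
  assert(remainingDepth >= 0 && "depth budget must not be negative");
  if (backend.isOpSupported(op)) {
    return true;
  }
  auto it = lowering.find(op);
  // Not lowered to anything (or budget exhausted): treat as unsupported.
  if (it == lowering.end() || it->second.empty() || remainingDepth == 0) {
    return false;
  }
  // Lowered: every node produced by lowering must pass the same check.
  for (const OpKind &produced : it->second) {
    if (!isSupportedAfterLowering(backend, lowering, produced,
                                  remainingDepth - 1)) {
      return false;
    }
  }
  return true;
}
```

With this shape, an op that has no lowering entry and is not native comes back as unsupported, which is the case the quote says the "can it be lowered, then isOpSupported" ordering gets wrong.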

2687

Most helpful comment

IMO, the code of isOpSupported is already too complicated (looking at the CPU backend, for instance). I'm concerned about making it even more complicated, because the answer to the question 'is this node supported?' is not necessarily a clear 'yes' or 'no'. It can fall somewhere in between, depending on the context of the node.

A backend may indeed not know the answer before it actually tries to compile the sub-graph around the node. For instance:

  • a node may be supported only when preceded by another node of a certain type, so an isOpSupported function that asks about a single node is already too limited.
  • even when the backend doesn't natively support a node with a certain type, it may be able to requantize the node to fit its native constraints, and do so efficiently if the surrounding nodes can also be requantized. For instance, even if s8 is not supported by the backend, it may under certain conditions be able to requantize the node to use s16.
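
To make the two cases above concrete, here is a minimal sketch of a query that can answer "it depends" for a single node and only settles the answer once the producer is visible. Everything in it is an assumption made for illustration: the `Support` enum, `ToyNode`, and both functions are hypothetical, and the rules (Relu only after Conv, s8 only when the producer is absent or also s8) are invented to mirror the bullets, not taken from any real backend.

```cpp
#include <cassert>
#include <string>

// Hypothetical three-valued answer, for illustration only.
enum class Support { Yes, No, DependsOnContext };

struct ToyNode {
  std::string kind;     // e.g. "Conv", "Relu", "Add"
  std::string elemType; // e.g. "s8", "s16", "float"
  const ToyNode *input; // single producer or nullptr; enough for the sketch
};

// Context-free query: all a per-node check can say for this toy backend,
// which natively handles only s16.
Support querySingleNode(const ToyNode &node) {
  if (node.kind == "Relu") {
    return Support::DependsOnContext; // only supported when fused after Conv
  }
  if (node.elemType == "s16") {
    return Support::Yes;
  }
  if (node.elemType == "s8") {
    return Support::DependsOnContext; // might be requantized to s16
  }
  return Support::No;
}

// Context-aware query: resolves the undecided cases by looking at the
// producer of the node.
bool queryInContext(const ToyNode &node) {
  switch (querySingleNode(node)) {
  case Support::Yes:
    return true;
  case Support::No:
    return false;
  case Support::DependsOnContext:
    if (node.kind == "Relu") {
      return node.input != nullptr && node.input->kind == "Conv";
    }
    // s8 node: assume requantization is only worthwhile when the producer
    // is absent or is s8 too, so the neighbours can be requantized together.
    return node.input == nullptr || node.input->elemType == "s8";
  }
  assert(false && "unhandled Support value");
  return false;
}
```

The point of the sketch is only that a per-node boolean has to collapse `DependsOnContext` into true or false too early, whichever way the backend chooses to round it.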

Last but not least, answering yes or no doesn't give any performance information. The backend may support all nodes, but with inefficient 'fallback' implementations for those it doesn't support natively.

Overall, I think this graph partitioning subject is tricky, and I would prefer to make sure that what we are trying to achieve is clear before complicating the API and the framework. There are IMO 2 orthogonal problems to solve:

  • prevent Glow compilation failures: Glow should not give a sub-graph to a backend if that backend will fail to compile it, or Glow should handle backend failures by retrying (in this case, the backend can be optimistic about what it supports)
  • optimize performance: Glow wants to partition a graph across various devices to optimize performance. For this, I guess we need to know whether the backend supports a graph, but also at what performance. We also need to know the cost of moving data from one device to another (it's not a local optimization problem).
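
Why the second problem is not a local one can be shown with a tiny cost model. This is a sketch under assumptions, not how Glow's partitioner works: the graph is a linear chain, `execCost[i][d]` is an assumed cost of running node i on device d, every device switch between neighbours costs a flat `transferCost`, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// execCost[i][d]: assumed cost of running node i of a linear chain on
// device d. Every row is assumed non-empty and of the same length.
using CostTable = std::vector<std::vector<double>>;

// Total cost of an assignment: execution cost of every node plus a flat
// transfer cost each time two neighbouring nodes sit on different devices.
double chainCost(const CostTable &execCost, double transferCost,
                 const std::vector<int> &assignment) {
  assert(assignment.size() == execCost.size());
  double total = 0.0;
  for (std::size_t i = 0; i < assignment.size(); ++i) {
    total += execCost[i][assignment[i]];
    if (i > 0 && assignment[i] != assignment[i - 1]) {
      total += transferCost;
    }
  }
  return total;
}

// Local choice: fastest device for each node, ignoring transfers.
std::vector<int> greedyAssignment(const CostTable &execCost) {
  std::vector<int> assignment;
  for (const std::vector<double> &row : execCost) {
    int best = 0;
    for (std::size_t d = 1; d < row.size(); ++d) {
      if (row[d] < row[best]) {
        best = static_cast<int>(d);
      }
    }
    assignment.push_back(best);
  }
  return assignment;
}

// Global choice: exhaustive search over all assignments. Only viable for
// tiny chains; it is here to provide a reference answer.
std::vector<int> bestAssignment(const CostTable &execCost,
                                double transferCost) {
  const std::size_t n = execCost.size();
  const std::size_t numDevices = n ? execCost[0].size() : 0;
  if (numDevices == 0) {
    return {};
  }
  std::vector<int> current(n, 0), best(n, 0);
  double bestCost = std::numeric_limits<double>::infinity();
  while (true) {
    const double cost = chainCost(execCost, transferCost, current);
    if (cost < bestCost) {
      bestCost = cost;
      best = current;
    }
    // Advance `current` like an odometer over device indices.
    std::size_t i = 0;
    while (i < n && ++current[i] == static_cast<int>(numDevices)) {
      current[i] = 0;
      ++i;
    }
    if (i == n) {
      break;
    }
  }
  return best;
}
```

For a three-node chain with costs {1, 2}, {2, 1}, {1, 2} and a transfer cost of 5, the per-node choice bounces between the two devices and pays for two transfers, while keeping all three nodes on the first device is cheaper overall even though the middle node runs slower there.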
