For Heterogeneous partitioning, we need to assign a op with a BackendKind before the op is lowered. Current "isOpSupported" function checks the op after lowering. Now the way is we first check if a op can be lowered and then check "isOpSuppored" . However, according to @jfix71, it is not correct: ```
"Seeing this in action, I don't think it makes sense actually. Backends will return true for Nodes they do not know about. E.g. if a backend does not support SparseLengthsWeightedSum, it will return false for isOpSupported(), but true for shouldLower() even though SparseLengthsWeightedSum isn't lowered to anything.
One alternative here could be that when we encounter an unsupported node, clone it, and then try to lower it. If it's not lowered then it's unsupported. Else if it is lowered then recursively do the same check on all of the nodes added during lowering. This is going to require a non-trivial refactoring though, in lowering logic/figuring out how best to clone the Node temporarily and tracking the nodes created from lowering."
```
We need to improve this check.
IMO, the code of isOpSupported is already too complicated (when I look at the CPU backend for instance). I'm concerned about making this even more complicated, because the answer to the question 'is this node supported ?' is not necessarily a clear 'yes' or 'no'. It can be at the middle depending on the context of the node.
A backend may indeed not know the answer before it actually tries to compile the sub-graph around the node. For instance:
isOpSupported function asking for a single node is already too limited.Last but not least, answering yes or no doesn't give any performance information. The backend can support all nodes but with a non efficient 'fallback' implementations when it doesn't supports nativelly.
Globally, I think that this graph partitioning subject is tricky, and I would prefer to make sure that what we try to achieve is clear before to complicate the API and the framework. There are IMO 2 orthogonal problems to solve:
Most helpful comment
IMO, the code of
isOpSupportedis already too complicated (when I look at the CPU backend for instance). I'm concerned about making this even more complicated, because the answer to the question 'is this node supported ?' is not necessarily a clear 'yes' or 'no'. It can be at the middle depending on the context of the node.A backend may indeed not know the answer before it actually tries to compile the sub-graph around the node. For instance:
isOpSupportedfunction asking for a single node is already too limited.Last but not least, answering yes or no doesn't give any performance information. The backend can support all nodes but with a non efficient 'fallback' implementations when it doesn't supports nativelly.
Globally, I think that this graph partitioning subject is tricky, and I would prefer to make sure that what we try to achieve is clear before to complicate the API and the framework. There are IMO 2 orthogonal problems to solve: