Currently, a backend implementation has to provide one-to-one target-specific Nodes and Insts,
and the conversion from Node to Inst is fully under Glow's control. Is it possible to design and create an interface that lets a backend convert specific Nodes to Insts itself?
This would help DLAs, which benefit from (or require) emitting a sequence of Insts for a single Node.
Hi @champyen, yes it is definitely possible. We have discussed it before but just haven't gotten around to designing/implementing it. Would you be interested in working on this?
Yes, I'm interested in this feature.
Recently I've been thinking about another issue: Node/Graph manipulation. (Currently a backend can only replace a Glow node with a specific node by calling replaceAllUsesOfWith.)
Most NPU/DLA hardware engines have restrictions on some operators, so it would be better to be able to split one node into multiple nodes. (E.g., if the hardware's SRAM is too small for a convolution operator, the convolution has to be partitioned into several convolutions, each with an offset.) Of course, this could be resolved by "Customized Node to Instruction Conversion", but it may be more intuitive and simpler to do it at the Graph level, because for low-level IR and target binary generation the translation is one-to-one.
I will try to write another issue when the idea is clear.
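To make the "partition one convolution into several convolutions with offset" idea concrete, here is a minimal sketch of the index arithmetic only. It assumes the split is done along output rows and that `pad < kernel`; `RowTile` and `splitConvRows` are hypothetical names for illustration and are not part of Glow.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One horizontal band of the original convolution (hypothetical helper type).
struct RowTile {
  int outStart;  // first output row produced by this tile
  int outRows;   // number of output rows produced by this tile
  int inStart;   // first input row read, clamped to the image
  int inRows;    // number of input rows read
  int padTop;    // zero rows this tile still needs above its input slice
  int padBottom; // zero rows this tile still needs below its input slice
};

// Split a convolution along its output height so that each tile produces at
// most maxOutRows rows. Assumes pad < kernel so every tile reads real rows.
std::vector<RowTile> splitConvRows(int inH, int kernel, int stride, int pad,
                                   int maxOutRows) {
  assert(inH > 0 && kernel > 0 && stride > 0 && maxOutRows > 0);
  assert(pad >= 0 && pad < kernel && inH + 2 * pad >= kernel);
  const int outH = (inH + 2 * pad - kernel) / stride + 1;
  std::vector<RowTile> tiles;
  for (int o0 = 0; o0 < outH; o0 += maxOutRows) {
    const int o1 = std::min(o0 + maxOutRows, outH); // exclusive end
    const int lo = o0 * stride - pad;                // may be negative
    const int hi = (o1 - 1) * stride - pad + kernel; // exclusive, may exceed inH
    RowTile t;
    t.outStart = o0;
    t.outRows = o1 - o0;
    t.inStart = std::max(lo, 0);
    t.inRows = std::min(hi, inH) - t.inStart;
    t.padTop = std::max(-lo, 0);
    t.padBottom = std::max(hi - inH, 0);
    tiles.push_back(t);
  }
  return tiles;
}
```

Each tile would then become its own smaller convolution over the input slice `[inStart, inStart + inRows)`, with the per-tile results concatenated along the height dimension. How `maxOutRows` is derived from the SRAM budget is hardware-specific and left out here.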
Ah ok, yeah I think that would make more sense at the Graph level. It seems that what you need to do is already supported via replaceAllUsesOfWith() -- you can use it to replace one Node with multiple Nodes.
For example, backends can decide whether to lower the FullyConnectedNode into a MatMulNode followed by a BatchedAddNode. When we do this with replaceAllUsesOfWith(), we replace the FullyConnectedNode by the BatchedAddNode (see here).
We also lower group Convolutions here, where we replace a ConvolutionNode with group > 1, with multiple ConvolutionNodes with group = 1, which are all concatenated together with a ConcatNode. The original ConvolutionNode with group > 1 is replaced by this ConcatNode using replaceAllUsesOfWith().
We also merge multiple MatMulNodes with the same RHS into a single MatMulNode that is then sliced into each output (see here). We also do Common Subexpression Elimination (CSE), where we remove duplicate subgraphs (see here). All of these examples use replaceAllUsesOfWith().
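The one-to-many pattern described above can be illustrated with a toy graph. This is a sketch under stated assumptions, not Glow's real classes: `ToyNode`, `ToyGraph`, and `lowerFullyConnected` are hypothetical, and each node is assumed to have a single result. It only shows why replacing all uses of one node with the *last* node of a new subgraph is enough to swap in several nodes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a graph node: a kind plus pointers to its inputs.
struct ToyNode {
  std::string kind;
  std::vector<ToyNode *> inputs;
};

// Toy stand-in for a graph that owns its nodes.
struct ToyGraph {
  std::vector<std::unique_ptr<ToyNode>> nodes;

  ToyNode *create(std::string kind, std::vector<ToyNode *> inputs = {}) {
    nodes.push_back(std::make_unique<ToyNode>(
        ToyNode{std::move(kind), std::move(inputs)}));
    return nodes.back().get();
  }

  // Rewire every input edge that points at oldNode so it points at newNode.
  // Returns the number of edges that were rewired.
  int replaceAllUsesOfWith(ToyNode *oldNode, ToyNode *newNode) {
    int rewired = 0;
    for (auto &n : nodes) {
      for (ToyNode *&in : n->inputs) {
        if (in == oldNode) {
          in = newNode;
          ++rewired;
        }
      }
    }
    return rewired;
  }
};

// Lower FullyConnected(x, w, b) into BatchedAdd(MatMul(x, w), b). Users of the
// FullyConnected node are redirected to the BatchedAdd node only; the MatMul
// node comes along because BatchedAdd consumes it.
ToyNode *lowerFullyConnected(ToyGraph &g, ToyNode *fc) {
  assert(fc->kind == "FullyConnected" && fc->inputs.size() == 3);
  ToyNode *mm = g.create("MatMul", {fc->inputs[0], fc->inputs[1]});
  ToyNode *add = g.create("BatchedAdd", {mm, fc->inputs[2]});
  g.replaceAllUsesOfWith(fc, add);
  return add;
}
```

After the call, the original `FullyConnected` node is still owned by the toy graph but has no users; a real compiler would typically remove such a node in a later dead-code-elimination pass.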
Thanks for your fast response!
I underestimated the power of replaceAllUsesOfWith(). (I thought it only replaced one node with another node.) The group Convolution example is exactly what I meant.
The information you provided is very helpful to me.
I will study Glow more before proposing any other issues,
since I'm a newcomer to NN compilers and have only been studying Glow for about 2 weeks.
I will try to implement an NVDLA backend with Glow later.
I'm going to close this issue for now, since it seems your problem was resolved. If you find you need backend-specific IRGen in the future feel free to reopen or open a new issue.
@jfix71 I was going to open an issue, but then found this one.
I have a potential implementation: https://github.com/Cadence-TIP-AI/glow/commit/f200b104b4b296d19533b9f758eaa7dbef491226
Should I open a separate issue?
@ayermolo Looks great! I'll reopen, feel free to open a PR with that and we can re-close this issue once it lands 😄