The current nnvm deployment module guide uses dynamic library mode. As @qingyuanxingsi suggested, we might want to add a system module deployment guide.
// old
// tvm::runtime::Module mod_syslib = tvm::runtime::Module::LoadFromFile("deploy.so");
// system module mode
tvm::runtime::Module mod_syslib = (*tvm::runtime::Registry::Get("module._GetSystemLib"))();
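To make the difference from dynamic library mode concrete, here is a minimal toy sketch of the general idea behind a system module: the compiled functions are linked into the executable and registered in a process-wide table at startup, so nothing is loaded from a file at run time. This is not TVM's implementation. ToySystemLib, ToyRegistrar and AddOne are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy stand-in for a "system library": a process-wide symbol table for
// functions that are linked into the executable. Hypothetical, not TVM API.
class ToySystemLib {
 public:
  using Fn = int (*)(int);

  static ToySystemLib& Global() {
    static ToySystemLib inst;  // constructed on first use
    return inst;
  }

  void Register(const std::string& name, Fn fn) {
    assert(fn != nullptr);
    table_[name] = fn;
  }

  // Returns nullptr when the symbol was never registered.
  Fn Get(const std::string& name) const {
    auto it = table_.find(name);
    return it == table_.end() ? nullptr : it->second;
  }

 private:
  std::unordered_map<std::string, Fn> table_;
};

// The constructor runs during static initialization and registers one symbol.
struct ToyRegistrar {
  ToyRegistrar(const char* name, ToySystemLib::Fn fn) {
    ToySystemLib::Global().Register(name, fn);
  }
};

// A "compiled" function that lives inside the binary itself.
static int AddOne(int x) { return x + 1; }
static ToyRegistrar reg_add_one("add_one", AddOne);
```

A caller then looks the function up by name with ToySystemLib::Global().Get("add_one"), much like the Registry::Get call above returns a handle without touching the file system.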
Contributions are welcome.
@dmlc/tvm-team I will take this up.
@srkreddy1238 great. Note, however, that unless it is a design discussion, we don't have to loop in all the current tvm team members :)
OK, maybe we need to add a bit more detail about the device-related build to the doc. See the code here https://github.com/dmlc/tvm/blob/master/python/tvm/module.py#L93
For CUDA-like modules we need to generate two files: the host .o file and devc.cc.
What happens in x.export_library("xx.so") is that we call _PackImportsToC to create devc.cc, and then combine it with the .o file.
See also the implementation of PackImportsToC https://github.com/dmlc/tvm/blob/master/src/codegen/codegen.cc#L33
I know it is a bit twisted, but this is a good hack to actually implement the device linking strategy and embed the binary blob into the final object file. It might make sense to provide an example for CUDA build and add a specific document to explain tvm runtime module export and the relation of the files.
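As a rough illustration of the blob-embedding trick described above, here is a minimal sketch that turns a binary blob into C source defining a byte array, which can then be compiled and linked with the host object file. It is not the actual PackImportsToC code. The function name PackBlobToC, the caller-chosen symbol name and the `_len` companion variable are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Emit C source that defines `symbol` as a byte array holding `blob`,
// plus `<symbol>_len` with the byte count. Hypothetical helper, not TVM API.
std::string PackBlobToC(const std::string& blob, const std::string& symbol) {
  assert(!symbol.empty());
  std::string out = "const unsigned char " + symbol + "[] = {";
  if (blob.empty()) {
    // C does not allow an empty initializer list, so emit one padding byte.
    out += "0x00";
  }
  char buf[8];
  for (std::size_t i = 0; i < blob.size(); ++i) {
    if (i != 0) out += ",";
    // Cast through unsigned char so bytes >= 0x80 are not sign-extended.
    std::snprintf(buf, sizeof(buf), "0x%02x",
                  static_cast<unsigned char>(blob[i]));
    out += buf;
  }
  out += "};\n";
  out += "const unsigned long " + symbol + "_len = " +
         std::to_string(blob.size()) + ";\n";
  return out;
}
```

Writing the returned string to a file such as devc.cc and compiling it alongside the host .o gives a single library that carries the device binary with it, which is the relation between the files that the proposed document would explain.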
Looking into these implementations and documenting them would be very helpful for the community.
@srkreddy1238 @qingyuanxingsi @merrymercy