Tvm: [DOCS] Neural network Deployment Guide with System Module Mode

Created on 1 Aug 2018 · 3 Comments · Source: apache/tvm

The current nnvm deployment module guide uses dynamic library mode. As @qingyuanxingsi suggested, we might want to add a system module deployment guide.

Actionable Items

    // old
    // tvm::runtime::Module mod_syslib = tvm::runtime::Module::LoadFromFile("deploy.so");
    // system module mode
    tvm::runtime::Module mod_syslib = (*tvm::runtime::Registry::Get("module._GetSystemLib"))();
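
To make the difference between the two modes concrete, here is a toy Python model of the idea behind system module mode. It is not TVM code. It assumes that compiled functions are placed in a process-wide table when the program starts, and that the "system library" is only a view over that table, so nothing is loaded from a file at run time. The names `register_system_symbol`, `SystemLib`, and `get_system_lib` are hypothetical.

```python
# Toy model only: a process-wide symbol table standing in for functions
# that were statically linked into the executable.
_SYSTEM_SYMBOLS = {}


def register_system_symbol(name):
    """Decorator that records a function in the global table at import time."""
    def decorator(fn):
        _SYSTEM_SYMBOLS[name] = fn
        return fn
    return decorator


@register_system_symbol("myadd")
def myadd(a, b):
    # Stand-in for a compiled kernel linked into the program.
    return a + b


class SystemLib:
    """View over the global table; no file path is involved."""

    def get_function(self, name):
        # Raises KeyError if the symbol was never registered.
        return _SYSTEM_SYMBOLS[name]


def get_system_lib():
    # Hypothetical counterpart of looking up "module._GetSystemLib".
    return SystemLib()


if __name__ == "__main__":
    lib = get_system_lib()
    print(lib.get_function("myadd")(2, 3))
```

In dynamic library mode, the lookup table would instead be built by opening `deploy.so` at run time, which is why that mode needs a file path and system module mode does not.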

Contributions are welcome.

beginner-friendly help wanted

Most helpful comment

OK, maybe we need to add more detail about the device-related build to the doc. See the code here: https://github.com/dmlc/tvm/blob/master/python/tvm/module.py#L93

For CUDA-like modules, we need to generate two files:

  • myfunc.o (the host module you can obtain by mod.save)
  • device_blob.cc (the device part binary file that contains the device side binary embedded in C++)

When you call x.export_library("xx.so"), we call _PackImportsToC to create devc.cc (the device blob source file listed above), and then combine it with the .o file.

See also the implementation of PackImportsToC: https://github.com/dmlc/tvm/blob/master/src/codegen/codegen.cc#L33

I know it is a bit twisted, but this is a good hack for implementing the device linking strategy and embedding the binary blob into the final object file. It might make sense to provide an example of a CUDA build and add a dedicated document that explains TVM runtime module export and how the files relate to each other.
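
The blob-embedding step described above can be sketched in a few lines of Python. This is a minimal illustration of the general technique (turning binary data into a C source file that defines a byte array), not the actual PackImportsToC implementation. The function name `pack_blob_to_c`, the default symbol name `device_blob`, and the 8-byte little-endian length header are all assumptions made for the example.

```python
import struct


def pack_blob_to_c(blob: bytes, symbol: str = "device_blob") -> str:
    """Return C source text that defines `symbol` as a byte array.

    Assumed layout for this sketch: an 8-byte little-endian length
    header followed by the raw payload bytes.
    """
    payload = struct.pack("<Q", len(blob)) + blob
    rows = []
    # Emit 12 bytes per source line to keep the generated file readable.
    for start in range(0, len(payload), 12):
        chunk = payload[start:start + 12]
        rows.append("  " + ", ".join(f"0x{byte:02x}" for byte in chunk))
    body = ",\n".join(rows)
    return (
        f"/* generated: {len(blob)} payload bytes */\n"
        f"const unsigned char {symbol}[{len(payload)}] = {{\n"
        f"{body}\n"
        f"}};\n"
    )


if __name__ == "__main__":
    # Placeholder bytes standing in for a compiled device binary.
    print(pack_blob_to_c(b"fake device code"))
```

The generated source can then be compiled by an ordinary C or C++ compiler and linked with the host object file, so the device binary travels inside the final library instead of as a separate file.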

Looking into these implementations and documenting them would be very helpful for the community.

@srkreddy1238 @qingyuanxingsi @merrymercy

All 3 comments

@dmlc/tvm-team I will take up this.

@srkreddy1238 Great. Note, however, that unless it is a design discussion, we don't need to loop in all the current TVM team members :)
