Tvm: [DEV] TVM v0.7 Roadmap

Created on 8 Feb 2020 · 17 Comments · Source: apache/tvm

This is the roadmap for TVM v0.7. TVM is a community-driven project, and we love your feedback and proposals on where we should be heading. Please open discussions in the discussion forum and bring RFCs.

  • Feel free to volunteer if you are interested in trying out some items (they do not have to be on the list).
  • Please also check the help wanted list in the GitHub issues for things that need help.

In the 0.7 cycle, we are going to focus on the following four areas, which are summarized from the forum discussion.

We also welcome contributions in all other areas, including more operator and model coverage, and they will be added to this list gradually. Please reply to this thread with what you would like to volunteer to work on.

We aim to cut a release in about three months (around April or May).

Core Infra

  • [x] Unified IR refactor
  • [x] Unified runtime for heterogeneous execution
  • [ ] Enhanced support for high-level graph rewriting for accelerators
  • [x] Improving test and benchmark infrastructure.

    • [ ] Testing and benchmarking on remote targets.

  • [ ] Better packaging of the library besides installing from source

Usability

  • [x] Better documentation for developers
  • [ ] Command line utilities to use TVM as an ahead-of-time compiler
  • [ ] Visualization of Relay Graphs

Backend and runtime

  • [ ] End to end uTVM
  • [ ] More dynamic model support

    • [ ] Complete VM functionality

    • [ ] Improve VM performance

    • [ ] Add tutorial for VM

  • [x] External code generator
  • [ ] End to end inference with Chisel VTA
  • [ ] CUDA half2 data type support
  • [x] bfloat16 support
  • [ ] 4-bit model support

Automation

  • [ ] More auto scheduling
  • [ ] Better loop partitioning
  • [ ] Reduce AutoTVM tuning time
  • [ ] Auto tensorization
  • [ ] Auto quantization

All 17 comments

We will have good support for PyTorch, including dynamic and quantized models. I have a series of PRs coming.

Could you also add support for Intel graphics? It seems it is not supported at the moment.

External codegen integration with uTVM is highly interesting for us at CEVA.
We are looking at compiling networks with TVM, generating code with our own codegen implementation, and then executing it on CEVA hardware via uTVM.
The goal (besides getting it done) is to share it via the TVM GitHub repo.

@Coderx7 As far as I know, Intel graphics is already supported in a limited setting by @Laurawly. @etom42 It would be great if you could start an RFC thread in the discuss forum to keep the community aware of what you are working on -- it also opens up chances for collaboration and suggestions :)

Is there any plan to integrate TVM as a dialect into MLIR?
That way, other components based on MLIR could leverage TVM's capabilities, such as high-performance codegen and fusion.

@yangjunpro Perhaps it is worth starting a new thread in the discuss forum for MLIR-related topics. We would certainly love some proposals about interoperation between MLIR and TVM.

I wonder whether there is any active effort on the "Auto tensorization" feature. I'm interested in contributing.

@liangfu Auto tensorization depends on the new IR update. We will send out specific guidelines after we finish the IR refactoring.

I am currently working on some end-to-end model work, and Relay's optimization passes are too slow. Are there any plans to improve the compilation speed?

Can you identify why they are slow? For example, switching some passes to the new iterative graph visitor might improve speed quite a bit. In previous profiling attempts I found that a significant amount of time is spent allocating; a pooling allocator can often improve speed quite a bit.
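To illustrate the idea behind an iterative graph visitor, here is a minimal sketch in plain Python. It is not TVM's visitor: the function name `post_order_visit`, the dictionary-based graph, and the node names are all assumptions made for this example. It replaces recursion with an explicit stack, so deep graphs do not hit the interpreter's recursion limit, and it visits each shared node once.

```python
from typing import Callable, Dict, Hashable, List, Sequence


def post_order_visit(
    root: Hashable,
    children: Callable[[Hashable], Sequence[Hashable]],
    visit: Callable[[Hashable], None],
) -> None:
    """Visit every node reachable from `root` once, inputs before users.

    Hypothetical helper for illustration; uses an explicit stack
    instead of recursion.
    """
    seen = set()
    # Each stack entry is (node, expanded). `expanded` is True once the
    # node's children have been scheduled, so the node can be visited.
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            visit(node)
            continue
        if node in seen:
            continue  # shared node already handled via another path
        seen.add(node)
        stack.append((node, True))
        # Push children in reverse so they are processed left to right.
        for child in reversed(children(node)):
            if child not in seen:
                stack.append((child, False))


# A small diamond-shaped graph: "a" uses "b" and "c", both use "d".
graph: Dict[str, List[str]] = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
order: List[str] = []
post_order_visit("a", lambda n: graph[n], order.append)
print(order)  # ['d', 'b', 'c', 'a']
```

The same function handles a chain of many thousands of nodes, where a naive recursive visitor in Python would raise `RecursionError`.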

I'd like to propose creating a Relay c_api.h for further extensions from frameworks currently using NNVM, and/or for bindings in the future.

Re the Relay c_api: there is no need to do so, as all Relay functions are exposed through the PackedFunc FFI. All Relay features can therefore already be accessed through TVM's C API.
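The reasoning above can be sketched with a toy model in plain Python. This is not TVM's implementation or API: `register_global`, `lookup_global`, `call_global`, and the function name `"relay.example.add_one"` are all hypothetical. The point is only that when every function is registered under a string name and called through one generic entry point, a binding needs just that entry point rather than one header declaration per function.

```python
from typing import Any, Callable, Dict

# Toy global registry: name -> type-erased callable (hypothetical).
_GLOBAL_FUNCS: Dict[str, Callable[..., Any]] = {}


def register_global(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that registers a function under a string name."""
    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        _GLOBAL_FUNCS[name] = fn
        return fn
    return decorator


def lookup_global(name: str) -> Callable[..., Any]:
    """Look up a registered function by name; raise if it is missing."""
    try:
        return _GLOBAL_FUNCS[name]
    except KeyError:
        raise KeyError(f"no global function named {name!r}") from None


def call_global(name: str, *args: Any) -> Any:
    """Single generic entry point: find a function by name and call it."""
    return lookup_global(name)(*args)


# A compiler feature is exposed by registering it, not by adding a new
# declaration to a dedicated header.
@register_global("relay.example.add_one")
def _add_one(x: int) -> int:
    return x + 1


result = call_global("relay.example.add_one", 41)
print(result)  # 42
```

In this model, a new language binding only has to wrap `lookup_global` and `call_global`; every function registered later becomes reachable without further binding work.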

What is the expected release date for this version? What are the chances of it happening in May?

Due to the recent situation and current progress, we might expect a slight delay in the release, to June/July -- we expect the unified IR refactor to land by then. We will do our best to keep the timeframe.

Shouldn't broader operator coverage be scheduled for at least one frontend (ONNX, TensorFlow, MXNet, etc.)? It would be great if 'ONE' frontend were widely supported!

We will have general support for TensorFlow control flow and tensor arrays, which allows parsing TensorFlow object detection models.
