In #2401, the community agreed that it is a good idea to move toward Apache. On behalf of the community, the TVM PMC drafted an incubation proposal: TVM Incubation Proposal
Please share your thoughts on the proposal. Everyone can comment on the Google doc; the PMC members have edit permission and will do their best to resolve the comments raised.
Besides reviewing the proposal, this is also a good chance to think about how we can successfully graduate TVM as a top-level project, by reviewing the Apache Incubator Graduation Guide. Please share your thoughts with the community.
This RFC will last for one week. We will hold a formal community vote on the finalized version of the proposal before we ask our mentors to send the proposal to Apache.
cc @dmlc/tvm-comitter @markusweimer, @sscdotopen, @bgchun
Besides reviewing the proposal, this is also a good chance to think about how we can successfully graduate TVM as a top-level project, by reviewing the Apache Incubator Graduation Guide. Please share your thoughts with the community.
I think the key is the ecosystem. To build a larger and more mature ecosystem, we need to attract more developers into our community. Without them, we cannot be successful. No doubt TVM is becoming better and more powerful; however, that alone is not enough to attract more developers. We should do more PRs to let more developers know about us, and be friendlier to new developers.
TVM is powerful, but many people complain that the code is difficult to read, and some don't even know how to integrate TVM into their projects. I think both complaints point to the same cause: we lack sufficient, good documentation. The project is developing rapidly, and I fully understand that we move at a fast pace, but we should also care about new users and developers. How do we help them follow our approach and thinking, and contribute their effort? No doubt the answer is documentation, step-by-step documentation. Compared with LLVM, we must admit that LLVM's documentation is so nice and abundant. LLVM is a good teacher for us; I think we could take it as our example.
LLVM's documentation is so nice and abundant
Is it really? I'd say _yes, superficially_ but if you dig into it, you end up in Doxygen which is mostly unhelpful. Here's a quick example: how long does it take you to figure out what alternatives there are to reloc::PIC_ and what reloc::ROPI_RWPI does? And maybe how about pinning down why exactly wasm libraries are packable using llvm-ar but _not_ regular ar :)
Jokes aside, I (quite uncontentiously) agree that helping people figure out how to integrate TVM is a very worthy goal. Part of the challenge is that there are many ways to do that: first pick a device, then a host+deployment target; probably AutoTVM the output--what's this about RPC now?
Perhaps a start would be to trawl the discussion board and issue tracker for the most common flows and focus on those. For instance, (nvidia gpu, deploy on host, cuda, no auto) or (aarch cpu, host != target, autotvm using device fleet) or (VTA on a supported FPGA, host != target, autotvm using own rpc). In all honesty, probably half the reason why people _keep making tensor compilers_ is because they don't immediately see that TVM can support their use case ;)
Is it really? I'd say _yes, superficially_ but if you dig into it, you end up in Doxygen which is mostly unhelpful. Here's a quick example: how long does it take you to figure out what alternatives there are to reloc::PIC_ and what reloc::ROPI_RWPI does? And maybe how about pinning down why exactly wasm libraries are packable using llvm-ar but _not_ regular ar :)
However, without even this superficially helpful documentation, developers would have no place to start digging at all. :-( As for the reloc::PIC_ issue, by the time you run into it you are already well down the road of applying or developing LLVM. LLVM's documentation shouldn't be responsible for this kind of issue; the distinction between relocation modes should be explained elsewhere.
Jokes aside, I (quite uncontentiously) agree that helping people figure out how to integrate TVM is a very worthy goal. Part of the challenge is that there are many ways to do that: first pick a device, then a host+deployment target; probably AutoTVM the output--what's this about RPC now?
Yes. Another useful part, in my opinion, would be an FAQ of common error messages. For example: why auto-tuning always reports 0.00 GFLOPS / ValueError: Direct host side access to device memory is detected... / can not open shared object... and so on. This would be very useful for practical usage of TVM. Many users want to try TVM, but when they see this kind of error, they will probably give up because they don't know at first glance how to resolve these strange error messages.
IMHO, excellent documentation is essential for the knowledge transfer demanded for an Apache top level project. Documentation that is 'use case oriented' in my mind is the most productive. With general technology stacks like TVM, it is difficult to 'capture' the intent of the source code when you are coming to it from a very specific use case such as having to integrate inference in an Android app. Personally, I am most interested in custom hardware accelerators for AI workloads, and thus all the know-how in scheduling optimization for ARM and GPUs is totally uninteresting, but I am likely in a minority.
I am still missing a solid articulation of the software architecture in any of the TVM documentation, which makes it very difficult to marry TVM with other application software. If that software architecture discussion were organized in terms of the use cases, then readers would be able to focus on their particular problem instead of having to filter out all the extraneous information that has no bearing on their particular task.
I totally agree that documentation is key to our future success. Enabling others to use and develop TVM can be more important than contributing code itself.
I would propose docs as a focus area in our next release cycle. Please also comment in https://github.com/dmlc/tvm/issues/2469
+1 for a documentation focus in the next release cycle. We will be more than happy to help on this end.
Love to help on the documentation side as well.
@tqchen The proposal looks great. My main question is how VTA fits in an Apache incubator project. To the best of my knowledge, I haven't seen any Apache project hosting open source hardware.
@bgchun I would think that if we managed it as a Hardware Abstraction Layer with a proper functional simulator behind it then it is as naturally supported as pure software. The run-time and the tvm integration would all be part of the flow, and then different hardware experiments can come in with open source, different simulators, different run-times, or fixed hardware and integrate seamlessly.
Otherwise stated, the VTA angle is the definition of a binary interface representing the ISA of the hardware accelerator and an associated software emulation of the interface. I do see some difficulties in unifying the scheduling information across vastly different hardware architectures (Nervana, Graphcore, PIM, DDF, etc.), but approaching this as an abstract hardware interface that supports all the operations on the graph would be a reasonable way forward IMHO.
@Ravenwater That makes sense to me. It'd be great to articulate the VTA angle in the proposal.
Once the proposal is submitted to Apache, Apache IPMC members will chime in and give feedback on the details of the proposal.
My post isn't directly related to the Incubation Proposal. I agree with @FrozenGene and others that good documentation is indeed important for graduation. Still, I think that attracting developers is a different goal, because developers will effectively do an in-depth architecture review, which may not be the case for Apache graduation officials. Good documentation could impress newcomers, but they will leave shortly after their first few coding attempts if they don't understand or don't like the design. I agree with @Ravenwater that the current documentation doesn't adequately address architecture questions, and I think that is because some large-to-medium-scale questions are still unanswered.
For example: it is still unclear to me what to think about TVM's C++ DSL. Should it be considered stable and integration-ready or not? The API looks good, but C++ examples were recently removed from the tutorials. Meanwhile, the git log shows efforts toward rewriting Python code in C++, yet there is still a lot of Python/C++ duplication in the code.
I would also mention assertions used in place of error messages, the situation with simplifiers (this problem is not on the surface), the top-level reduce limitation, which looks artificial, and the consistency of the policy regarding -1 in shapes and other corner cases.
Those problems don't show signs of immediate failure, but may have a negative effect on newcomers. I think that in order to increase popularity, we eventually need to improve our methods of detecting and solving such problems.
@grwlf These are great points. I think we can start a discussion on some of these in RFC threads. On C++ vs Python, the current project's philosophy is to be agile when possible while being rigorous, so having most of the test cases covered in Python helps toward that. To move toward a more stable C++ API, we can start from the runtime :)
The bottom line is that while we might not make the project perfect right now (perhaps we never will), we know the direction in which we should be heading, and the community collectively makes things better.
C++ vs python
false dichotomy. the correct answer is to rewrite everything in rust ;)
Overall the proposal looks good. I just added comments to the Google doc on wording and content.
Thanks to the comments from @bgchun and @Ravenwater, I have added a section with a deeper discussion of open source hardware.
Thanks to everyone for participating in creating this proposal. The formal voting thread is here: https://github.com/dmlc/tvm/issues/2543