Tensorize has been used in several places throughout the project but is not clearly documented. We need one or more tutorials demonstrating what it is and how to use it to run micro kernels, etc.
cc @cowanmeg @merrymercy @tmoreau89
I think it'd be a good idea to incorporate an ARM example with a bit-serial tensor kernel.
We can also use a VTA tensorization example, but the tutorial will be less straightforward. @cowanmeg - do you think we can put one together with your ARM bit-serial kernel, including the definition of the tensor intrinsic?
The only issue I see at the moment is that this tutorial will have to execute on ARM, so it's not quite straightforward to just deploy on any test setup.
I feel a tutorial on CPU is also good enough for demonstration. I can help with that.
Did we agree on who would get started on the CPU tensorization tutorial @yzhliu @cowanmeg ?
Yes, sounds good!
@cowanmeg would you like to contribute the tutorial? Otherwise, I can start working on it this weekend.
I have some time this week too. My example is for adding an ARM microkernel, but it would be great if you have an x86 example so that people can run it.
@cowanmeg @yzhliu any updates?
not yet from my side ...
I would like to revive this thread. @cowanmeg @yzhliu, please follow up on this.
I'm writing and investigating the problem I mentioned in int8 PR #1680 - I feel it is a bug.