Tvm: [RFC] More PackedFunc metadata

Created on 8 Apr 2019 · 13 Comments · Source: apache/tvm

(Moved from #2981)

Proposal: add type signature metadata to PackedFunc

Each PackedFunc would have an optional field describing what arguments it takes and what type it returns. This field would be automatically populated during conversion from a TypedPackedFunc.
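The idea of an optional signature field can be illustrated with a minimal Python sketch. It is only a model of the proposal, not TVM's implementation: the `PackedFunc` class, the `signature` dictionary layout, and the `from_typed` helper are all assumptions made for illustration.

```python
import inspect
from typing import Any, Callable, Optional


class PackedFunc:
    """Type-erased callable: positional arguments only, optional metadata."""

    def __init__(self, body: Callable[..., Any], signature: Optional[dict] = None):
        self.body = body
        # None when no type information is known (hypothetical layout otherwise).
        self.signature = signature

    def __call__(self, *args: Any) -> Any:
        return self.body(*args)


def from_typed(func: Callable[..., Any]) -> PackedFunc:
    """Wrap an annotated function and record its argument and return types."""
    sig = inspect.signature(func)
    metadata = {
        "args": [(name, p.annotation.__name__) for name, p in sig.parameters.items()],
        "ret": sig.return_annotation.__name__,
    }
    return PackedFunc(func, metadata)


def add_one(x: int) -> int:
    return x + 1


typed = from_typed(add_one)                     # metadata filled in automatically
untyped = PackedFunc(lambda *args: sum(args))   # metadata stays empty
```

The untyped path stays as cheap as before, which is why the field is optional in this sketch: only functions that start out typed pay for carrying a signature.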

Once we have these type signatures, we could use them to generate bindings in new languages (e.g., Rust) and get coverage of a good chunk of TVM's API for free.
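As a rough sketch of what such binding generation could look like, the Python script below renders wrapper source text from signature metadata. Everything here is assumed for illustration: the registry contents, the type mapping, and the `call_packed` helper and `NodeHandle` type that the emitted Rust text refers to are hypothetical, and the emitted text is not guaranteed to compile against any real crate.

```python
# Hypothetical registry: global function name -> signature metadata.
SIGNATURES = {
    "example.add": {"args": [("lhs", "int"), ("rhs", "int")], "ret": "int"},
    "example.name_of": {"args": [("node", "handle")], "ret": "str"},
}

# Assumed mapping from metadata type names to Rust types.
RUST_TYPES = {"int": "i64", "float": "f64", "str": "String", "handle": "NodeHandle"}


def rust_binding(name: str, sig: dict) -> str:
    """Render one wrapper that forwards to an assumed `call_packed` helper."""
    fn_name = name.replace(".", "_")
    params = ", ".join(f"{arg}: {RUST_TYPES[ty]}" for arg, ty in sig["args"])
    forwarded = ", ".join(f"{arg}.into()" for arg, _ in sig["args"])
    ret = RUST_TYPES[sig["ret"]]
    return (
        f"pub fn {fn_name}({params}) -> {ret} {{\n"
        f'    call_packed("{name}", &[{forwarded}]).try_into().unwrap()\n'
        f"}}"
    )


# One wrapper per registered function, in a stable order.
bindings = "\n\n".join(rust_binding(n, s) for n, s in sorted(SIGNATURES.items()))
print(bindings)
```

The generator itself is small; most of the maintenance cost mentioned in this proposal would sit in the type mapping and in the per-language conventions it has to encode.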

This would result in much easier-to-maintain language bindings (not only for the runtime but for the compiler stack too) and might allow us to eventually rip out a lot of the manually written code in, e.g., the Python bindings.

The downside of this idea is that it would result in some fairly hairy codegen, which can be difficult to maintain. It would also add a small amount of overhead to the TVM runtime system; we could mitigate that by adding a #define to disable the type metadata. Also, as @tqchen pointed out, the generated code might be lower-quality than handwritten bindings.

A few extensions of this idea:

- Add more compile-time metadata to the Node hierarchy, allowing codegen to access node methods and attributes.
- Add docstrings to PackedFuncs to allow auto-generation of documentation.
- Allow std::variant or an equivalent in TypedPackedFunc signatures. Lots of PackedFuncs have arguments that can be one of a few types (e.g. Int or Expr); a simple extension to the PackedFunc system + runtime type system would allow these to be described automatically.
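The variant extension can be modelled in Python with `typing.Union`. This is a sketch under stated assumptions, not a description of the C++ design: the `Expr` class, the `checked` decorator, and the `describe` function are hypothetical stand-ins for an IR node, the runtime type check, and a registered function.

```python
import inspect
from typing import Union, get_args, get_origin


class Expr:
    """Stand-in for an IR expression node (hypothetical)."""

    def __init__(self, text: str):
        self.text = text


def allowed_types(annotation) -> tuple:
    """Expand Union[A, B] into (A, B); a plain type becomes a 1-tuple."""
    if get_origin(annotation) is Union:
        return get_args(annotation)
    return (annotation,)


def checked(func):
    """Validate positional arguments against the function's annotations."""
    params = list(inspect.signature(func).parameters.values())

    def wrapper(*args):
        for param, value in zip(params, args):
            if not isinstance(value, allowed_types(param.annotation)):
                raise TypeError(f"{param.name}: unexpected {type(value).__name__}")
        return func(*args)

    return wrapper


@checked
def describe(value: Union[int, Expr]) -> str:
    # The argument may be either alternative of the variant.
    return value.text if isinstance(value, Expr) else f"const({value})"
```

Because the alternatives are part of the annotation, the same metadata that drives the runtime check could also be exported to a binding generator.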

kazimuth

Most helpful comment

One thing I would like to mention is that manual wrapping has its merit :) Take the following wrapper code for scan as an example. By manually wrapping it in python, we can benefit from:

- A clear signature that users can look up
- Keyword arguments and default values
- Python-specific documentation:
  - Python type signatures
  - useful Python code examples

def scan(init, update, state_placeholder, inputs=None, name="scan", tag="", attrs=None):
    """Construct new tensors by scanning over axis.

    Parameters
    ----------
    init: Tensor or list of Tensor
        The initial condition of first init.shape[0] timestamps
    update: Tensor or list of Tensor
        The update rule of the scan given by symbolic tensor.
    state_placeholder: Tensor or list of Tensor
        The placeholder variables used by update.
    inputs: Tensor or list of Tensor, optional
        The list of inputs to the scan. This is not required, but can
        be useful for the compiler to detect scan body faster.
    name: str, optional
        The name hint of the tensor
    tag: str, optional
        Additional tag information about the compute.
    attrs: dict, optional
        The additional auxiliary attributes about the compute.

    Returns
    -------
    tensor: Tensor or list of Tensors
        The created tensor or tuple of tensors if it contains multiple outputs.

    Example
    -------
    .. code-block:: python

      # The following code is equivalent to numpy.cumsum
      m = tvm.var("m")
      n = tvm.var("n")
      X = tvm.placeholder((m, n), name="X")
      s_state = tvm.placeholder((m, n))
      s_init = tvm.compute((1, n), lambda _, i: X[0, i])
      s_update = tvm.compute((m, n), lambda t, i: s_state[t-1, i] + X[t, i])
      res = tvm.scan(s_init, s_update, s_state, X)
    """

While it is possible to have a system that codegens these components, that might mean we have to write the Python example documentation somewhere else, which is less natural.

Of course, one drawback of manual wrapping is that a wrapper has to be created for each language. This may not be a bad thing, especially if we want to think clearly about language-specific features and write good language-specific docs.

I do think there is some merit in having good metadata for the node system and automatically generating some of the accessors.

tqchen on 8 Apr 2019

All 13 comments

From other thread:

@tqchen:

I will summarize some of my take here. I like the idea of Node hierarchy compile time generation. This is something I have thought about and discussed with @jroesch for a while and might help #2523 (comment)

It is always tempting to automate more parts of wrapper generation. However, our past experience suggests that automatic wrapper generation is never perfect. Think about how we can support keyword arguments, good Pythonic docstrings, and so on. It is also harder for developers to find the actual implementation of the "generated API", since some of it is generated at runtime. Eventually, we found that it is simpler to just do manual wrapping, which gives us all the good native features and docs, and keeps PackedFunc simple (by only supporting positional arguments without any metadata).

@nhynes:

This idea actually comes up a lot :P #2328 (comment)

I know, for sure, that we could get good docs with _really good_ codegen like that offered by Rust macros, but I also know for sure that we're not about to rewrite TVM in Rust :)

I think that the boilerplate really does bother new (advanced) users who want to use TVM as a tool. I wonder if there's a way forward here that satisfies all desiderata?

kazimuth on 8 Apr 2019

@tqchen, yeah, those are good points. One possibility would be to use hand-written wrappers for first-tier-supported languages like Python and offer the auto-generated wrappers for other languages.

If you did end up switching fully to auto-generated wrappers you could take steps like committing generated code to the git repo to make it easier to browse, with doc comments leading back to the original source code. I also think adding typed kwargs to PackedFunc would go a long way towards making the API usable. Ultimately, though, it'll always be less natural than hand-written wrappers, that's definitely true. I definitely understand if the project prefers to stick with polish over automation :)
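To make the kwargs point concrete, here is a minimal Python sketch of how name-and-default metadata could let a positional-only function accept keyword arguments. The `with_kwargs` helper, the `_REQUIRED` sentinel, and the `packed_scan` stand-in are hypothetical; nothing here reflects an existing PackedFunc feature.

```python
from typing import Any, Callable, List, Tuple

_REQUIRED = object()  # sentinel: the argument has no default value


def with_kwargs(packed: Callable[..., Any], spec: List[Tuple[str, Any]]) -> Callable[..., Any]:
    """Wrap a positional-only callable using (name, default) metadata."""
    names = [name for name, _ in spec]

    def call(*args: Any, **kwargs: Any) -> Any:
        if len(args) > len(spec):
            raise TypeError("too many positional arguments")
        unknown = set(kwargs) - set(names)
        if unknown:
            raise TypeError(f"unexpected keyword arguments: {sorted(unknown)}")
        for name in names[: len(args)]:
            if name in kwargs:
                raise TypeError(f"multiple values for argument: {name}")
        values = list(args)
        # Fill the remaining slots from keywords, then from declared defaults.
        for name, default in spec[len(args):]:
            value = kwargs.get(name, default)
            if value is _REQUIRED:
                raise TypeError(f"missing argument: {name}")
            values.append(value)
        return packed(*values)

    return call


def packed_scan(init, update, name, tag):
    # Stand-in for a positional-only packed function.
    return (init, update, name, tag)


scan = with_kwargs(
    packed_scan,
    [("init", _REQUIRED), ("update", _REQUIRED), ("name", "scan"), ("tag", "")],
)
```

With this wrapper, `scan("i", update="u", tag="t")` reaches the packed function as four positional values, so the underlying calling convention stays positional.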

RE: the Node hierarchy, it seems like doing more with that is an unambiguous win. I could imagine setting up a system like the TVM_REGISTER_GLOBAL macro for node subclasses + maybe macros for declaring attributes so that you don't have to hand-code VisitAttrs. Did you have other thoughts on that design?
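A Python analogue of that idea, offered only as a sketch: a registration decorator plays the role of the macro, and the attribute visitor is derived from the declared fields instead of being written by hand. The names `register_node`, `NODE_REGISTRY`, `visit_attrs`, and the `example.Add` type key are assumptions for illustration.

```python
from dataclasses import dataclass, fields

NODE_REGISTRY = {}  # type key -> node class


def register_node(type_key):
    """Class decorator, loosely analogous to a registration macro."""

    def decorator(cls):
        cls = dataclass(cls)       # turns the annotations into declared fields
        cls.type_key = type_key
        NODE_REGISTRY[type_key] = cls
        return cls

    return decorator


class Node:
    def visit_attrs(self, visitor):
        """Derived from the declared fields, so subclasses never hand-write it."""
        for f in fields(self):
            visitor(f.name, getattr(self, f.name))


@register_node("example.Add")
class Add(Node):
    lhs: int
    rhs: int


seen = []
Add(1, 2).visit_attrs(lambda name, value: seen.append((name, value)))
```

The field declarations are the single source of truth here; the same list could feed reflection, serialization, or a binding generator.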

kazimuth on 8 Apr 2019

RE: Node hierarchy. Based on previous discussions with @jroesch, one possible direction is to design a DSL (for example, we could just use Python with typing info) for declaring the node relations and fields, and to create a generator utility that generates the classes. The main challenge is to keep most of the data structures purely in C (so we get ABI compatibility across languages). We also want to keep some of the node base classes and the ability to declare nodes inside C++ for private temporary nodes, which do not enjoy the cross-language features but can easily use any language-specific data structure.
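A minimal sketch of the "Python with typing info" direction follows. The declaration classes, the `C_TYPES` mapping, and the emitted struct layout are all assumptions; a real design would presumably also emit a common header (type index, reference counter) that this sketch leaves out.

```python
from typing import get_type_hints

# Assumed mapping from Python annotations to C field types.
C_TYPES = {int: "int64_t", float: "double", str: "const char*"}


class NodeDecl:
    """Base class for declarations; subclasses list fields as annotations."""


class Variable(NodeDecl):
    name: str
    bits: int


class Add(NodeDecl):
    lhs: Variable
    rhs: Variable


def c_field_type(py_type) -> str:
    # References to other declared nodes become pointers to their structs.
    if isinstance(py_type, type) and issubclass(py_type, NodeDecl):
        return f"struct {py_type.__name__}*"
    return C_TYPES[py_type]


def emit_c_struct(decl) -> str:
    """Render a plain C struct for one declaration."""
    lines = [f"struct {decl.__name__} {{"]
    for field, py_type in get_type_hints(decl).items():
        lines.append(f"  {c_field_type(py_type)} {field};")
    lines.append("};")
    return "\n".join(lines)


print(emit_c_struct(Variable))
print(emit_c_struct(Add))
```

Keeping the output as plain C structs is what gives the cross-language ABI compatibility mentioned above; the Python declarations exist only at generation time.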

tqchen on 8 Apr 2019

Oh hm, that's an interesting idea. How would we handle inheritance / methods on nodes? Just copy-pasting stuff in the generator, or some sort of runtime lookup system implemented in C?

Another route to ABI compatibility is to just use opaque pointers + accessors everywhere; that might be a less invasive change. It does impose more overhead, though.

kazimuth on 8 Apr 2019

We will likely have to give up methods in most cases and use functions/vtables instead. This could be fine for common IR structures. This also seems to be the case for certain languages. Methods can still be used if we manually declare the node in C++.
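The functions-plus-dispatch-table style can be sketched in Python as follows. This is an illustration under assumptions, not TVM's functor machinery: nodes are plain records carrying a type key, and the `Functor` class and the `Const` and `Add` keys are hypothetical.

```python
from typing import Any, Callable, Dict


def make_node(type_key: str, **fields: Any) -> Dict[str, Any]:
    """Plain data record: a type key plus fields, no methods."""
    return {"type_key": type_key, **fields}


class Functor:
    """Dispatch table from type key to a free function (a manual vtable)."""

    def __init__(self) -> None:
        self._table: Dict[str, Callable[..., Any]] = {}

    def register(self, type_key: str):
        def decorator(func):
            self._table[type_key] = func
            return func

        return decorator

    def __call__(self, node, *args):
        # Look up the implementation by the node's runtime type key.
        return self._table[node["type_key"]](node, *args)


evaluate = Functor()


@evaluate.register("Const")
def _eval_const(node):
    return node["value"]


@evaluate.register("Add")
def _eval_add(node):
    return evaluate(node["lhs"]) + evaluate(node["rhs"])


tree = make_node("Add", lhs=make_node("Const", value=2), rhs=make_node("Const", value=3))
```

Because behaviour lives in tables rather than in the node layout, any language that can read the type key and the fields can add its own operations without touching the generated structures.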

tqchen on 8 Apr 2019

How does the refcounting for Nodes currently work across the ABI boundary? Is everything owned by the TVM runtime or something?

kazimuth on 9 Apr 2019

Re ref-counting: Things should work out of the box, as long as each side has the same way to access the ref counter. Each language increments/decrements the ref counter, and that counter is shared across all the languages. Because a node has a deleter that is populated during creation, you can safely run the deletion according to the language-specific behavior set by the creator of the node.
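The scheme described here can be modelled in a few lines of Python. It is a single-threaded sketch with hypothetical names (`NodeHeader`, `inc_ref`, `dec_ref`); a real runtime would use an atomic counter and a C-compatible function pointer for the deleter.

```python
class NodeHeader:
    """Shared header: every language sees the same counter and deleter."""

    def __init__(self, payload, deleter):
        self.payload = payload
        self.ref_count = 1       # the creator holds the first reference
        self.deleter = deleter   # populated at creation, by the creating side


def inc_ref(node: NodeHeader) -> None:
    node.ref_count += 1


def dec_ref(node: NodeHeader) -> None:
    node.ref_count -= 1
    if node.ref_count == 0:
        # Whoever drops the last reference runs the creator's deleter.
        node.deleter(node)


freed = []
node = NodeHeader("tensor-data", lambda n: freed.append(n.payload))
inc_ref(node)   # e.g. a second language takes a reference
dec_ref(node)   # the creator releases its reference; the node stays alive
still_alive = list(freed)
dec_ref(node)   # the last holder releases; the creator's deleter runs
```

The side that drops the last reference does not need to know how the node was allocated, since it only calls the deleter stored in the header.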

tqchen on 9 Apr 2019

I could set it up so that most of the node generation logic + code is in HalideIR, and then there's a generator script for each set of nodes needed (one here, one there). I might open a PR over there and start work.

kazimuth on 11 Apr 2019

I feel like Node, NodeRef and NodePtr are pretty much general-purpose refcounted objects, rather than specific nodes in the IR. They could be shared across the ABI boundary using PackedFunc out of the box. It will be interesting to see if we could inherit every object from Node so that everything automatically gets the ability to be shared across different languages.

junrushao1994 on 11 Apr 2019

NodeRef is just a type-erased version of NodePtr<T>, right?

I think the main question is if we want to allow nodes to be implemented in languages besides C++. If we require nodes to be implemented as Node subclasses, then they have to have C++ impls for inheritance to work. If we design a Node C API that doesn't have this requirement, then we can mix-and-match Node implementations between languages. That's really only useful for adding temporary node types for e.g. passes implemented in python though. (Of course, we can still implement all the C++ nodes as Node subclasses.)

kazimuth on 11 Apr 2019

I'm starting work in https://github.com/dmlc/HalideIR/pull/56

kazimuth on 12 Apr 2019

https://github.com/dmlc/tvm/issues/3501

tqchen on 6 Jul 2019
