Tvm: [RFC][Relay][HalideIR] Automatically generate the AST

Created on 6 Jul 2019 · 5 Comments · Source: apache/tvm

I have begun experimenting with a new library called astgen to replace the large amount of boilerplate the AST requires today, and to let us evolve the node system and its APIs more flexibly.

The first version of this tool will take a Python file like this:

import astgen
import tvm

class Expr:
    pass

@astgen.astgen
class Constant(Expr):
    """
    \\brief Constant tensor, backed by an NDArray on the cpu(0) device.
    \\note Scalar constants are represented by rank-0 constant tensors,
           enabling uniform constant folding over scalars and tensors.
    """

    """The data of the tensor."""
    data: tvm.ndarray.NDArray

astgen.generate_all("expr.h", "tvm::relay")

and produce this C++ file:

namespace tvm {
namespace relay {

/*!
 * \brief Constant tensor, backed by an NDArray on the cpu(0) device.
 * \note Scalar constants are represented by rank-0 constant tensors,
 * enabling uniform constant folding over scalars and tensors.
 *
 */
class Constant;

/*!
 * \brief Constant container.
 *
 */
class ConstantNode : public ExprNode {
 public:
  /*! \brief The data of the tensor. */
  runtime::NDArray data;

  void VisitAttrs(tvm::AttrVisitor* v) final {
    v->Visit("data", &data);
  }
  TVM_DLL static Constant make(runtime::NDArray data);

  static constexpr const char* _type_key = "relay.Constant";
  TVM_DECLARE_NODE_TYPE_INFO(ConstantNode, ExprNode);
};

RELAY_DEFINE_NODE_REF(Constant, ConstantNode, Expr);

}  // namespace relay
}  // namespace tvm

This complements Tianqi's recent proposal to evolve the low-level IR; see #3474.

Specifically, by not hand-writing all of the AST code, we should be able to change the representation flexibly without extensive refactors, and unifying TVM's IRs should take less effort over time.
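To make the transformation above concrete, here is a minimal sketch of how a decorator could collect annotated classes and emit the C++ node boilerplate. It is not the code from the astgen branch: the `NDArray` stand-in, the `_CPP_TYPES` mapping, and the `emit_node` helper are all assumptions made for illustration, and the output omits the namespaces and doc comments.

```python
class NDArray:
    """Stand-in for tvm's NDArray so the sketch has no tvm dependency."""


_REGISTRY = []  # classes collected by the decorator, in definition order


def astgen(cls):
    """Record the class so that a header can be generated for it later."""
    _REGISTRY.append(cls)
    return cls


# Hypothetical mapping from Python annotation names to C++ field types.
_CPP_TYPES = {"NDArray": "runtime::NDArray", "str": "std::string", "int": "int64_t"}


def _cpp_type(annotation):
    name = getattr(annotation, "__name__", str(annotation))
    return _CPP_TYPES.get(name, name)


def emit_node(cls):
    """Emit the C++ node class for one annotated Python class."""
    name, base = cls.__name__, cls.__bases__[0].__name__
    fields = getattr(cls, "__annotations__", {})
    lines = [f"class {name}Node : public {base}Node {{", " public:"]
    lines += [f"  {_cpp_type(t)} {f};" for f, t in fields.items()]
    lines.append("  void VisitAttrs(tvm::AttrVisitor* v) final {")
    lines += [f'    v->Visit("{f}", &{f});' for f in fields]
    lines.append("  }")
    params = ", ".join(f"{_cpp_type(t)} {f}" for f, t in fields.items())
    lines.append(f"  TVM_DLL static {name} make({params});")
    lines.append(f'  static constexpr const char* _type_key = "relay.{name}";')
    lines.append(f"  TVM_DECLARE_NODE_TYPE_INFO({name}Node, {base}Node);")
    lines.append("};")
    lines.append(f"RELAY_DEFINE_NODE_REF({name}, {name}Node, {base});")
    return "\n".join(lines)


def generate_all():
    """Emit every registered class; a real tool would write this to a file."""
    return "\n\n".join(emit_node(cls) for cls in _REGISTRY)


class Expr:
    pass


@astgen
class Constant(Expr):
    data: NDArray


header = generate_all()
print(header)
```

Because the field list lives in one place, adding or renaming a field changes the declaration, the `VisitAttrs` body, and the `make` signature together.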

A secondary goal of mine is to allow any language with a C-ABI-compatible FFI to construct and manipulate TVM ASTs.

By supporting this, we could let users build tools in the language of their choice without having to change how we develop the core of TVM.

Furthermore, this will improve Python interop, as we will no longer have to deal with hidden C++ fields as is the case today.

Unfortunately, we have relied heavily on C++ objects and C++ datatypes such as std::string, and resolving this is essential to providing an FFI-friendly AST.

I hope the community can help come up with a design for Relay's AST using a code generation based approach.

My goal is to first replace the AST today with little to no changes, and then incrementally evolve it over time.

I will follow up with more details on my proposed solutions over the next few days.

See this branch for more details: https://github.com/jroesch/tvm/tree/astgen.

jroesch

👍6 🚀2

All 5 comments

cc @jermainewang @kazimuth @junrushao1994 @icemelon9 @ajtulloch @yzhliu @merrymercy who might be interested in this. Some initial thoughts:

- Naming convention for the class and file hierarchy of the schema
  - e.g. tvm.schema.expr.py -> include/IR/expr.h
  - Alternatively, allow declarations within each file.
- Decouple schema reading (frontend) from emission, with an IR of class hierarchy schemas in between, so that we can have different emitters
  - Think about a Python emitter and a C++ emitter
- We might also want to use it for general objects, including runtime::Object in the VM.
- We still want to allow some user-written boilerplate, given that C++ datatypes can still be used in many of those, and we would love to keep them for certain internal data types.
- How to handle docstrings
  - At the moment the Python docstrings are written separately, by manual wrapping. The benefit of manual wrapping is clear docstrings (as they might differ from those in C++).
  - Should we do the same for now and only generate C++ code?

tqchen on 6 Jul 2019

👍2
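One way to picture the suggested split between schema reading and emission is a small, language-neutral schema IR with one emitter per target. The sketch below is only an illustration of that idea: `NodeSchema`, `FieldSchema`, `CPP_TYPES`, `emit_cpp`, and `emit_python` are hypothetical names, not part of TVM or of the astgen branch, and the schema is built by hand where a real frontend would parse the schema files.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldSchema:
    name: str
    type_name: str  # language-neutral type name, e.g. "NDArray"
    doc: str = ""


@dataclass
class NodeSchema:
    name: str
    base: str
    doc: str = ""
    fields: List[FieldSchema] = field(default_factory=list)


# Hypothetical mapping from schema type names to C++ types.
CPP_TYPES = {"NDArray": "runtime::NDArray"}


def emit_cpp(node):
    """C++ emitter: field declarations only, to keep the sketch short."""
    lines = [f"class {node.name}Node : public {node.base}Node {{", " public:"]
    for f in node.fields:
        lines.append(f"  /*! \\brief {f.doc} */")
        lines.append(f"  {CPP_TYPES.get(f.type_name, f.type_name)} {f.name};")
    lines.append("};")
    return "\n".join(lines)


def emit_python(node):
    """Python emitter: a plain wrapper class carrying the node docstring."""
    args = "".join(f", {f.name}" for f in node.fields)
    lines = [
        f"class {node.name}({node.base}):",
        f'    """{node.doc}"""',
        f"    def __init__(self{args}):",
    ]
    lines += [f"        self.{f.name} = {f.name}" for f in node.fields] or ["        pass"]
    return "\n".join(lines)


constant = NodeSchema(
    name="Constant",
    base="Expr",
    doc="Constant tensor.",
    fields=[FieldSchema("data", "NDArray", "The data of the tensor.")],
)

cpp_source = emit_cpp(constant)
python_source = emit_python(constant)
print(cpp_source)
print(python_source)
```

Both emitters read the same `NodeSchema`, so a language-specific docstring could later be added as an extra schema field without touching the frontend.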

Hey Jared,

Nice proposal!

I am mostly interested in using the node system across the C ABI.

First, I would love to understand:
1) how member methods could be generated, and
2) their usability across C ABIs.

If we wrap the data fields of generated nodes in pure C, and if the global registry of packed functions can be used across the C ABI (which is not possible today), we would have a systematic way to wrap the methods:
1) For virtual methods, we may leave a function-pointer field in the pure C struct, as DLManagedTensor does for its destructor (the deleter field).
2) For non-virtual methods, we should somehow register them as packed functions. We can design our own name-mangling mechanism.
3) For each instance with a vtable, we probably need a type key to indicate its type. We could then implement the methods in a thin C++ wrapper.

Second, the basic data structures are still in C++, for example tvm::Array, tvm::Map, and std::string. Maybe this would be an opportunity to rewrite them in C.

junrushao1994 on 6 Jul 2019

BTW, packed functions can work across the ABI if we need them to, simply by adding several C APIs. We don't have to make std::function itself cross the ABI; we can just register pointers to the functions together with their execution context, like what DLPack does with deleter and manager_ctx.

junrushao1994 on 13 Jul 2019
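The "function pointer plus execution context" pattern mentioned here can be tried out from Python with ctypes, which lays out C-compatible structs. This is a rough sketch only: `CClosure`, `CallFn`, and `make_adder` are hypothetical names, and the struct merely mirrors the shape of DLPack's `deleter` and `manager_ctx` pair rather than any real TVM API.

```python
import ctypes

# C signature: int call(void* ctx, int x)
CallFn = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_void_p, ctypes.c_int)


class CClosure(ctypes.Structure):
    """A plain C struct: a function pointer and an opaque context pointer."""
    _fields_ = [("call", CallFn), ("ctx", ctypes.c_void_p)]


def make_adder(offset):
    """Build a closure that adds `offset`, with its state stored behind ctx."""
    state = ctypes.c_int(offset)

    def _call(ctx, x):
        # Recover the captured state from the opaque context pointer.
        captured = ctypes.cast(ctx, ctypes.POINTER(ctypes.c_int)).contents.value
        return captured + x

    fn = CallFn(_call)
    closure = CClosure(fn, ctypes.addressof(state))
    # The caller must keep `state` and `fn` alive for as long as the closure
    # is used; a C API would pair this with a deleter for the same reason.
    return closure, (state, fn)


closure, keep_alive = make_adder(10)

# Any C-ABI caller would do the same thing: pass ctx back to the pointer.
result = closure.call(closure.ctx, 5)
print(result)
```

The caller never sees the captured state directly, which is what lets the callee keep using C++ closures internally while exposing only C types.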

Yeah, I strongly agree that we need to decouple schema reading from code generation.

This is somewhat like LLVM's TableGen, which keeps repetitive, regular code in a centralized description file to minimize the changes needed to add new IR nodes.