Taichi: [Refactor] IR standardization

Created on 31 Mar 2020  ·  9Comments  ·  Source: taichi-dev/taichi

At some point, we should make the Taichi IR more standard. The standardized IR will be codenamed CHI (气), recursive acronym for CHI Hierarchical Instructions. This will not only make the codebase more maintainable, but also allow others to reuse CHI to build new languages, for sparse/differentiable/megakernel computation, just like a higher-level LLVM IR for visual computing applications.

  • [ ] Make Stmt field naming consistent (e.g. AtomicOpStmt::dest/val are inconsistent with GlobalStoreStmt::ptr/data)
  • [ ] OffloadedStmt::loop_vars and OffloadedStmt::step are not used (thanks to @xumingkuan). Consider removal
  • [x] Add namespace analysis (thanks to @xumingkuan)
  • [ ] Split frontend/backend IR. Everything in the frontend is unique to taichi, and the backend statements (CHI) are reusable to build a different language.
  • [ ] Think about add_operand - is there a better way to implement it?
  • [ ] Slimmer taichi/ir/ir.h and taichi/ir/statements.h
  • [ ] IRBuilder for backend statements
  • [ ] Rename VecStatement to StatementPack or merge it with Block?
  • [ ] Function def/calls.
  • [ ] (De)Serialization (#690)
  • [ ] Source code line number tracking (thanks to @archibate)
  • [ ] Rename has_global_side_effect to has_side_effect
  • [ ] Remove ArgStoreStmt
  • [x] Decouple Stmt from LLVM (#686)
  • [x] Decouple SNode from LLVM (#817)
  • [ ] Document IR

Screen Shot 2020-03-31 at 4 54 51 PM

ir

Most helpful comment

Go CHI lang..!

All 9 comments

Go CHI lang..!

I wonder if we should separate OffloadedStmt out from the IR layer, and rename it to OffloadedTask or something. We will also have a special Root node representing the root of the AST, which contains a std::vector<OffloadedTask>.

By doing so, we can embed more structural knowledge of a Taichi kernel into the types. For example, a Taichi kernel is always composed of a series of OffloadedTasks (e.g. {range_for, serial, range_for}, {clear, listgen, clear, listgen, struct_for, gc}, ...). This is actually a characteristic that should be described by the types.

Right now the root is just a Block, and all its offloaded tasks are just inside Block::statements. This is no different from the blocks at the internal levels (e.g. IfStmt's two blocks). So in our codegens, we have to keep track of whether we are handling root or not, with flags like is_top_level_.

This also makes it hard to look up info other than then current OffloadedStmt. E.g. when processing struct_for, I may want to look at the immediate previous listgen and clear. While we can fit this logic inside visit(Block*), the method will contain two sets of logics, one for the root and the other for the internal IR blocks:

void Visitor::visit(Block* b) {
  if (is_top_level_) {
    // handling root, we know that each stmt in b->statements is an offloaded task
  } else {
    // internal IR nodes.
  }
}

Last but not least, this also enforces the invariants that offloaded tasks can only be at the top level -- Since it's not part of IR, it cannot be nested by other IR nodes accidentally.


Giving more thoughts on this, maybe Taichi/CHI is too incomplete without offloaded tasks being part of its IR. Then at a minimum, I think we can have the following:

  • Stop using Block to represent root and give it its own type.
  • The type for the offloaded tasks can only be included by the top-level root node.

WDYT?

WDYT?

Can't agree more, also note that there are not nesting OffloadStmt, so let's just say OffloadStmt is root stmt, and a kernel contains multiple roots. We can do this by modifying:
https://github.com/taichi-dev/taichi/blob/62cdc60f2e3d981b4d5c1c9f60b7590c9e8031a1/taichi/program/kernel.h#L14
to something like:

std::vector<OffloadStmt *> offloads;

I believe having a Root is still worth the efforts. It is a proper encapsulation of the AST. As one example, think about the AST visitor interface -- it could be just visit(Root&) instead of visit(std::vector<OffloadedTasks*>&). Having Root makes it a lot easier to extend Root without breaking the visitor's interface.

See also https://testing.googleblog.com/2017/11/obsessed-with-primitives.html. Builder's pattern is another common pattern to solve such problems.

visit(std::vector<OffloadedTasks*>&)

You mean, class Root : Stmt? Or class Root { vector<OffloadedTasks> os; }?

Also note that we want to implement real functions at some point #602, we want to leave a way for it when implementing Root or whatever.

The later one.

Also note that we want to implement real functions at some point #602, we want to leave a way for it when implementing Root or whatever.

Yeah, then it's even more reasonable that we have a Root to wrap up things like offloaded tasks, function defs, etc.

Shall we put all functions in analysis/ into namespace analysis?

Good idea. All passes that do not change IR should go to irpass::analysis.

Is there gonna be serialization/pretty printer/parser so ppl can build external tool without interfacing C++?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

xumingkuan picture xumingkuan  ·  3Comments

KLozes picture KLozes  ·  4Comments

jackalcooper picture jackalcooper  ·  4Comments

yuanming-hu picture yuanming-hu  ·  3Comments

yuanming-hu picture yuanming-hu  ·  4Comments