Hi, dear Taichi community,
Sorry, this issue is not about a feature or a bug, but I don't know where else to ask this question. If the maintainers say I should not have opened this issue, I will delete it later.
I watched the first online conference on bilibili yesterday. I know that Taichi converts the Python AST to Taichi's IR. I am interested in that, and also in how Taichi's IR is converted to LLVM IR. Does Taichi currently support dumping its IR? If yes, how do I do it? If not, what does Taichi's IR look like?
From the code I have read, I know that Taichi uses Python astor to parse Python code and modifies the provided template to get the final executable code. But I cannot find the transformation from Python AST to Taichi IR.
Any reply is OK, thanks.
> Sorry, this issue is not about a feature or a bug, but I don't know where else to ask this question. If the maintainers say I should not have opened this issue, I will delete it later.
Thanks for asking! All kinds of issues are welcome :)
> I watched the first online conference on bilibili yesterday. I know that Taichi converts the Python AST to Taichi's IR. I am interested in that, and also in how Taichi's IR is converted to LLVM IR. Does Taichi currently support dumping its IR? If yes, how do I do it? If not, what does Taichi's IR look like?
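Yes, each stage can be dumped:
To print the preprocessed Python code, use ti.init(print_preprocessed=True).
To print the Taichi IR, use ti.init(print_ir=True).
To print the LLVM IR of each kernel, set ti.cfg.print_kernel_llvm_ir = True.
e.g.: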
import taichi as ti
ti.init(print_ir=True)

@ti.kernel
def p():
    print(42)

p()
> From the code I have read, I know that Taichi uses Python astor to parse Python code and modifies the provided template to get the final executable code. But I cannot find the transformation from Python AST to Taichi IR.
The transformation from Python AST to Taichi IR takes two steps.
First, we transform the Python AST of the kernel:
import taichi as ti
ti.init(print_preprocessed=True)

@ti.kernel
def func() -> ti.f32:
    x = 233
    return x

print(func())
into Taichi API invocations:
def func():
    import taichi as ti
    ti.decl_scalar_ret(ti.f32)
    x = ti.expr_init(233)
    ti.core.create_kernel_return(ti.cast(ti.Expr(x), ti.f32).ptr)
Second, when this generated function runs, the C++ APIs in ti.core create the corresponding statements in Taichi IR, e.g.:
https://github.com/taichi-dev/taichi/blob/2544d1f3a6740de6f9ff4e56be0f7d73142173eb/taichi/python/export_lang.cpp#L315-L325
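To make the first step more concrete, here is a minimal sketch of this kind of AST rewrite using only the standard library `ast` module. It is not Taichi's actual preprocessor: the `WrapConstantAssign` class and the local `expr_init` function are hypothetical stand-ins I made up for illustration, and the real `ti.expr_init` builds a Taichi IR expression rather than a tuple. It assumes Python 3.9+ for `ast.unparse`.

```python
import ast


class WrapConstantAssign(ast.NodeTransformer):
    """Rewrite `name = <constant>` into `name = expr_init(<constant>)`."""

    def visit_Assign(self, node):
        self.generic_visit(node)
        if isinstance(node.value, ast.Constant):
            # Wrap the constant in a call to the (hypothetical) expr_init.
            node.value = ast.Call(
                func=ast.Name(id="expr_init", ctx=ast.Load()),
                args=[node.value],
                keywords=[],
            )
        return node


def expr_init(value):
    # Hypothetical stand-in: records the value instead of building IR.
    return ("expr", value)


source = "x = 233"
tree = WrapConstantAssign().visit(ast.parse(source))
tree = ast.fix_missing_locations(tree)  # new nodes need line/column info

rewritten = ast.unparse(tree)  # source text of the transformed AST
namespace = {"expr_init": expr_init}
exec(compile(tree, "<rewritten>", "exec"), namespace)

print(rewritten)
print(namespace["x"])
```

The point of the sketch is only the shape of the pipeline: parse the source, rewrite AST nodes into API calls, then execute the rewritten code so that those calls do the real work.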
> import taichi as ti
> ti.init(print_ir=True)
> @ti.kernel
> def p():
>     print(42)
> p()
Thanks for your reply.
[Taichi] mode=development
[Taichi] preparing sandbox at /tmp/taichi-8dibpesi
[Taichi] <dev mode>, llvm 8.0.1, commit fe17ca1f, python 3.7.3
[Taichi] Starting on arch=x64
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Initial IR:
kernel {
$0 = eval 42
print %0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Lowered:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Typechecked:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Loop Vectorized:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Loop Split:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Simplified I:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Dense struct-for demoted:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Optimized by CFG I:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Access flagged I:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Simplified II:
kernel {
<i32 x1> $0 = const [42]
print $0, "\n"
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Offloaded:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Optimized by CFG II:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Access flagged II:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Make thread local:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Access lowered:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] DIE:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Access flagged III:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Atomics demoted:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Optimized by CFG III:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
[I 07/10/20 19:47:44.751] [compile_to_offloads.cpp:operator()@23] Simplified III:
kernel {
$0 = offloaded
body {
<i32 x1> $1 = const [42]
print $1, "\n"
}
}
So, the code wrapped in kernel { ... } is Taichi IR?
Besides, I ran examples/waterwave.py with print_ir=True set in ti.init.
I found that "Initial IR" occurs 8 times in total, while @ti.kernel and @ti.func occur only 7 times in total in waterwave.py.
So why 8 versus 7? Why are they not equal?
And what do Lowered, DIE, Simplified I (II/III), Atomics demoted, Access flagged III, Access lowered, and Make thread local stand for?
Are they optimization passes in LLVM or in Taichi?
If yes, what does each of these optimizations do?
And why do some @ti.kernel or @ti.func functions have just one of them, while others have several? What determines that?
> So, the code wrapped in kernel { ... } is Taichi IR?
> What do Lowered, DIE, Simplified I (II/III), Atomics demoted, Access flagged III, Access lowered, and Make thread local stand for?
> Are they optimization passes in LLVM or in Taichi?
Yes and yes :)
> If yes, what does each of these optimizations do?
I'm not very familiar with the IR optimizations. Here is what I can tell you:
Atomics demoted
Sometimes an operation on a variable does not need to be atomic. This pass detects that and demotes x[i] += 1 into x[i] = x[i] + 1.
DIE (dead instruction elimination)
This pass removes unused variables, e.g.:
x = 233
x = 4
will be optimized into:
x = 4
Simplified III
This pass combines constant operations, e.g.:
x = 233
y = 4
z = x + y
will be optimized into:
z = 237
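If it helps, the two ideas above (combining constants and dropping unused results) can be sketched on a toy IR in plain Python. This is only an illustration under my own assumptions: the tuple format `(dest, op, args)` and the functions `fold_constants` and `eliminate_dead` are made up here and are not how Taichi's passes are implemented.

```python
def fold_constants(program):
    """Replace an `add` of two known constants with a single `const`."""
    known = {}  # name -> constant value seen so far
    out = []
    for dest, op, args in program:
        if op == "const":
            known[dest] = args[0]
            out.append((dest, op, args))
        elif op == "add" and all(a in known for a in args):
            value = known[args[0]] + known[args[1]]
            known[dest] = value
            out.append((dest, "const", (value,)))
        else:
            out.append((dest, op, args))
    return out


def eliminate_dead(program, live_outputs):
    """Walk backwards and keep only instructions whose result is used."""
    live = set(live_outputs)
    kept = []
    for dest, op, args in reversed(program):
        if dest in live:
            kept.append((dest, op, args))
            if op != "const":
                live.update(args)  # operands of a kept instruction are live
    kept.reverse()
    return kept


# Toy version of: x = 233; y = 4; z = x + y
program = [
    ("x", "const", (233,)),
    ("y", "const", (4,)),
    ("z", "add", ("x", "y")),
]

folded = fold_constants(program)
optimized = eliminate_dead(folded, live_outputs={"z"})
print(optimized)  # [('z', 'const', (237,))]
```

Once the addition has been folded into a constant, x and y are no longer used by anything, so the dead-code step can drop them and only z = 237 remains.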
Let me show you an example:
import taichi as ti
ti.init(print_ir=True)
# ti.core.toggle_advanced_optimization(False)

@ti.kernel
def calc_pi() -> ti.f32:
    term = 1.0
    sum = 0.0
    divisor = 1
    for i in ti.static(range(10)):
        sum += term / divisor
        term *= -1 / 3
        divisor += 2
    return sum * ti.sqrt(12.0)

print(calc_pi())
In the last optimization pass, Simplified III, all statements are eliminated except for a single const [3.1415]; that is, this pass does all of the computation at compile time!
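For reference, the same arithmetic can be checked in plain Python without Taichi. This is just a sketch to show the value the folded constant should be close to; the name `calc_pi_reference` is mine, and it uses Python floats rather than ti.f32, so the last digits may differ from what the kernel returns.

```python
import math


def calc_pi_reference(n_terms=10):
    """Same series as calc_pi above: sqrt(12) * sum((-1/3)**k / (2k + 1))."""
    term = 1.0
    total = 0.0
    divisor = 1
    for _ in range(n_terms):
        total += term / divisor
        term *= -1 / 3
        divisor += 2
    return total * math.sqrt(12.0)


approx = calc_pi_reference()
print(approx, abs(approx - math.pi))
```

Because the loop is unrolled with ti.static and every operand is a compile-time constant, the compiler can evaluate the whole expression itself, which is why only one const statement is left in the IR.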
Thanks for your clear explanation.
Besides, I want to apologize, because I found that what I asked about is actually covered in the documentation :-)