I'm not sure if I'm using taichi incorrectly, but the following program crashes:
import taichi as ti

ti.init(arch=ti.cuda)  # Run on GPU by default

n = 100
pixels = ti.var(dt=ti.i32, shape=(n, n))

@ti.kernel
def paint(t: ti.i32):
    if t < 100:
        for i, j in pixels:
            pixels[i, j] = 1

gui = ti.GUI("Automata", (n, n))
for t in range(n):
    paint(t)
    gui.set_image(pixels)
    gui.show()
[Release mode]
[Taichi version 0.5.2, cuda 10.1, commit 4d56959a]
[E 02/23/20 18:21:46.543] [statements.inc.h:visit@19] Not supported.
[E 02/23/20 18:21:46.543] Received signal 6 (Aborted)
***********************************
* Taichi Compiler Stack Traceback *
***********************************
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::signal_handler(int)
/lib64/libc.so.6(+0x361e0) [0x7f468dc031e0]
/lib64/libpthread.so.0: raise
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::IRVisitor::visit(taichi::Tlang::StructForStmt*)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::CodeGenLLVM::visit(taichi::Tlang::Block*)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::CodeGenLLVM::visit(taichi::Tlang::IfStmt*)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::CodeGenLLVM::visit(taichi::Tlang::Block*)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::CodeGenLLVMGPU::visit(taichi::Tlang::OffloadedStmt*)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::CodeGenLLVM::visit(taichi::Tlang::Block*)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::GPUCodeGen::codegen_llvm()
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::KernelCodeGen::compile(taichi::Tlang::Program&, taichi::Tlang::Kernel&)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::Program::compile(taichi::Tlang::Kernel&)
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::Kernel::compile()
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so: taichi::Tlang::Kernel::operator()()
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so(+0xd78034) [0x7f466558c034]
/home/sci/karthik/.virtualenvs/deep/lib/python3.6/site-packages/taichi/core/../lib/taichi_core.so(+0xb25494) [0x7f4665339494]
/usr/lib64/libpython3.6m.so.1.0: _PyCFunction_FastCallDict
/usr/lib64/libpython3.6m.so.1.0: _PyObject_FastCallDict
/usr/lib64/libpython3.6m.so.1.0: _PyObject_Call_Prepend
/usr/lib64/libpython3.6m.so.1.0: PyObject_Call
/usr/lib64/libpython3.6m.so.1.0(+0x147c40) [0x7f468e2ecc40]
/usr/lib64/libpython3.6m.so.1.0: _PyObject_FastCallDict
/usr/lib64/libpython3.6m.so.1.0(+0x18884d) [0x7f468e32d84d]
/usr/lib64/libpython3.6m.so.1.0: _PyEval_EvalFrameDefault
/usr/lib64/libpython3.6m.so.1.0: PyEval_EvalCodeEx
/usr/lib64/libpython3.6m.so.1.0(+0x1103c3) [0x7f468e2b53c3]
/usr/lib64/libpython3.6m.so.1.0: PyObject_Call
/usr/lib64/libpython3.6m.so.1.0: _PyEval_EvalFrameDefault
/usr/lib64/libpython3.6m.so.1.0: _PyFunction_FastCallDict
/usr/lib64/libpython3.6m.so.1.0: _PyObject_FastCallDict
/usr/lib64/libpython3.6m.so.1.0: _PyObject_Call_Prepend
/usr/lib64/libpython3.6m.so.1.0: PyObject_Call
/usr/lib64/libpython3.6m.so.1.0(+0x147c40) [0x7f468e2ecc40]
/usr/lib64/libpython3.6m.so.1.0: _PyObject_FastCallDict
/usr/lib64/libpython3.6m.so.1.0(+0x18884d) [0x7f468e32d84d]
/usr/lib64/libpython3.6m.so.1.0: _PyEval_EvalFrameDefault
/usr/lib64/libpython3.6m.so.1.0: PyEval_EvalCodeEx
/usr/lib64/libpython3.6m.so.1.0: PyEval_EvalCode
/usr/lib64/libpython3.6m.so.1.0(+0x208d62) [0x7f468e3add62]
/usr/lib64/libpython3.6m.so.1.0: PyRun_FileExFlags
/usr/lib64/libpython3.6m.so.1.0: PyRun_SimpleFileExFlags
/usr/lib64/libpython3.6m.so.1.0: Py_Main
python(main+0x1e8) [0x564d106f5d18]
/lib64/libc.so.6: __libc_start_main
python(_start+0x2a) [0x564d106f5e8a]
Removing the condition from the kernel lets the program run:
@ti.kernel
def paint(t: ti.i32):
    for i, j in pixels:
        pixels[i, j] = 1
Additionally, the condition in this code never evaluates to false, even when the t passed in is less than 50:
@ti.kernel
def paint(t: ti.i32):
    if t > 50:
        return
    for i, j in pixels:
        pixels[i, j] = 1
Is this a bug, or am I misunderstanding how conditions work in taichi?
Hi,
For your first case,
    if t < 100:
        for i, j in pixels:
            pixels[i, j] = 1
The for i, j in pixels: loop is what Taichi calls a struct-for (see https://taichi.readthedocs.io/en/latest/hello.html#parallel-for-loops). There are two things that Taichi doesn't support here:
- for loops nested inside an if are not parallelized, so the entire kernel will map to a 1x1x1 CUDA kernel.
- Taichi only supports struct-for loops at the out-most scope. Internally, Taichi does a series of transformations to figure out how to correctly parallelize this kind of for loop. That's why removing the if made Taichi work.
TL;DR: Use struct-for only at the out-most scope.
As for the third one
@ti.kernel
def paint(t: ti.i32):
    if t > 50:
        return
    for i, j in pixels:
        pixels[i, j] = 1
I printed out the IR, and it looks something like this:
[T 02/24/20 11:19:00.434] [codegen_x86.cpp:lower_llvm@22] Initial IR:
==========
kernel {
  if (arg[0] > 100) {
  }
}
# That's it, statements following the return are gone...
I guess Taichi can't handle return properly yet, perhaps for the same reason that ti.func cannot handle multiple return values (https://taichi.readthedocs.io/en/latest/syntax.html#functions). Will need @yuanming-hu to confirm.
FYI, you can print out Taichi IR by doing something like
ti.init(..., print_ir=True)
ti.set_logging_level(ti.TRACE) # finest grained logging level
@yuanming-hu, do you think we should clarify in the docs that it is the loop at the out-most scope that can be parallelized, not the out-most loop?
As @k-ye said, return is not yet supported in Taichi kernels. However, you can write the result to a global scalar and read it back in Python-scope.
Also,
@ti.kernel
def paint(t: ti.i32):
    if t < 100:
        for i, j in pixels:
            pixels[i, j] = 1

paint(t)
could be transformed into a working version:
@ti.kernel
def paint(t: ti.i32):
    for i, j in pixels:
        pixels[i, j] = 1

if t < 100:
    paint(t)
> @yuanming-hu, do you think we should clarify in the docs that it is the loop at the out-most scope that can be parallelized, not the out-most loop?
Good idea! I'm rushing to remove the unified memory dependency right now and will work on this later. If you get a chance, please go ahead.
OK, SG :-)
Got it! Thank you for the detailed responses!
I'm trying to split some of my code into a ti.func, but the generated IR seems incorrect (compare the if statements in the original code with those in the IR):
@ti.func
def rule_30(a: ti.i32, b: ti.i32, c: ti.i32) -> ti.i32:
    if a == 1:
        if b == 0 and c == 0:
            return 1
        else:
            return 0
    else:
        if b == 0 and c == 0:
            return 0
        else:
            return 1

@ti.kernel
def paint(t: ti.i32):
    for i in range(1, n-1):
        a = cells[t-1, i-1]
        b = cells[t-1, i]
        c = cells[t-1, i+1]
        cells[t, i] = rule_30(a, b, c)
IR: (there is no IR for the ti.func "rule_30". Presumably it was optimized out and inlined into the kernel)
[T 02/23/20 21:09:59.335] [/home/karthik/.virtualenvs/automata/lib/python3.6/site-packages/taichi/lang/kernel.py:__call__@347] Compiling kernel paint_c8_0_...
[T 02/23/20 21:09:59.343] [codegen_x86.cpp:lower_llvm@620] Initial IR:
==========
kernel {
  for @tmp7 in range((cast<int32> 1), (cast<int32> 199)) {
    $1 = alloca @tmp9
    @tmp9 = gbl load #@tmp0[(arg[0] - 1), (@tmp7 - 1)]
    $3 = alloca @tmp10
    @tmp10 = gbl load #@tmp0[(arg[0] - 1), @tmp7]
    $5 = alloca @tmp11
    @tmp11 = gbl load #@tmp0[(arg[0] - 1), (@tmp7 + 1)]
    $7 = alloca @tmp12
    @tmp12 = @tmp9
    $9 = alloca @tmp13
    @tmp13 = @tmp10
    $11 = alloca @tmp14
    @tmp14 = @tmp11
    if (@tmp12 == 1) {
      if ((@tmp13 == 0) & (@tmp14 == 0)) {
        #@tmp0[arg[0], @tmp7] = 1
      }
    }
  }
}
If I copy paste the function's body into paint(), I get the correct IR:
@ti.kernel
def paint(t: ti.i32):
    for i in range(1, n-1):
        a = cells[t-1, i-1]
        b = cells[t-1, i]
        c = cells[t-1, i+1]
        if a == 1:
            if b == 0 and c == 0:
                cells[t, i] = 1
            else:
                cells[t, i] = 0
        else:
            if b == 0 and c == 0:
                cells[t, i] = 0
            else:
                cells[t, i] = 1
IR:
[T 02/23/20 21:14:34.735] [/home/karthik/.virtualenvs/automata/lib/python3.6/site-packages/taichi/lang/kernel.py:__call__@347] Compiling kernel paint_c8_0_...
[T 02/23/20 21:14:34.744] [codegen_x86.cpp:lower_llvm@620] Initial IR:
==========
kernel {
  for @tmp7 in range((cast<int32> 1), (cast<int32> 199)) {
    $1 = alloca @tmp9
    @tmp9 = gbl load #@tmp0[(arg[0] - 1), (@tmp7 - 1)]
    $3 = alloca @tmp10
    @tmp10 = gbl load #@tmp0[(arg[0] - 1), @tmp7]
    $5 = alloca @tmp11
    @tmp11 = gbl load #@tmp0[(arg[0] - 1), (@tmp7 + 1)]
    if (@tmp9 == 1) {
      if ((@tmp10 == 0) & (@tmp11 == 0)) {
        #@tmp0[arg[0], @tmp7] = 1
      } else {
        #@tmp0[arg[0], @tmp7] = 0
      }
    } else {
      if ((@tmp10 == 0) & (@tmp11 == 0)) {
        #@tmp0[arg[0], @tmp7] = 0
      } else {
        #@tmp0[arg[0], @tmp7] = 1
      }
    }
  }
}
Here is the complete program:
import taichi as ti

ti.init(arch=ti.cuda, print_ir=True)  # Run on GPU by default
ti.set_logging_level(ti.TRACE)

n = 200
cells = ti.var(dt=ti.i32, shape=(n, n))
img_n = 700
pixels = ti.var(dt=ti.f32, shape=(img_n, img_n))

@ti.kernel
def gen_image():
    for i, j in pixels:
        x = (n * i) // img_n
        y = (n * j) // img_n
        pixels[i, j] = cells[x, y]

@ti.kernel
def init_cells():
    for i, j in cells:
        cells[i, j] = 0
    cells[0, n//2] = 1

@ti.func
def rule_30(a: ti.i32, b: ti.i32, c: ti.i32) -> ti.i32:
    if a == 1:
        if b == 0 and c == 0:
            return 1
        else:
            return 0
    else:
        if b == 0 and c == 0:
            return 0
        else:
            return 1

@ti.kernel
def paint(t: ti.i32):
    for i in range(1, n-1):
        a = cells[t-1, i-1]
        b = cells[t-1, i]
        c = cells[t-1, i+1]
        cells[t, i] = rule_30(a, b, c)

gui = ti.GUI("Automata", (img_n, img_n))
init_cells()
for i in range(1, n):
    paint(i)
    gen_image()
    gui.set_image(pixels)
    gui.show()
gui.set_image(pixels)
# gui.wait_key()
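Multi-return value in ti.func (that is, more than one return statement in the function body) is not supported yet: https://taichi.readthedocs.io/en/latest/syntax.html#functions. Could you try using a local variable to hold the return value, so that the function ends with a single return?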
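For illustration, here is a minimal sketch of the local-variable approach. It is written as plain Python so that it runs without Taichi; the assumption (not verified against this Taichi version) is that the same body with @ti.func on top and ti.i32 annotations would compile as intended. The names result and table are made up for this example.

```python
def rule_30(a: int, b: int, c: int) -> int:
    """Rule 30 update written with a single return statement."""
    result = 0  # local variable holding the value to return
    if a == 1:
        if b == 0 and c == 0:
            result = 1
        else:
            result = 0
    else:
        if b == 0 and c == 0:
            result = 0
        else:
            result = 1
    return result  # the only return in the function


# Evaluate the rule over all eight (a, b, c) neighbourhoods.
table = {(a, b, c): rule_30(a, b, c)
         for a in (0, 1) for b in (0, 1) for c in (0, 1)}
```

The branches are the same as in the inlined version of paint(); only the returns are replaced by assignments to result. For 0/1 inputs this is equivalent to a ^ (b | c), which might be a simpler way to write the rule without any branching.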
Oops, I misunderstood what was meant by "multi-return value" in functions. Thank you!
Warning: The issue has been out-of-update for 50 days, marking stale.