Describe the bug
I found that continue is not working well in struct-for on bitmasked structure.
To Reproduce
import taichi as ti
ti.init()
ti.core.toggle_advanced_optimization(False)
x = ti.var(ti.f32)
ti.root.bitmasked(ti.i, 32).place(x)
@ti.kernel
def error():
for i in x:
if i == 0:
continue
print(i)
@ti.kernel
def good():
for i in x:
if i == 0:
pass
else:
print(i)
x[0] = 3
x[1] = 3
x[2] = 3
x[5] = 3
x[10] = 3
print('error:')
error()
print('good:')
good()
Log/Screenshots
[Taichi] mode=development
[Taichi] preparing sandbox at /tmp/taichi-nd61fh7e
[Taichi] <dev mode>, llvm 10.0.0, commit 67c6ea62, python 3.8.3
[Taichi] Starting on arch=x64
error:
good:
1
2
5
10
If you have local commits (e.g. compile fixes before you reproduce the bug), please make sure you first make a PR to fix the build errors and then report the bug.
I found this is caused by ContinueStmt::as_return().
@ti.kernel
def error():
for i in x:
if i == 0:
continue
print(i)
is compiled as the same as
@ti.kernel
def error():
for i in x:
if i == 0:
return
print(i)
which should be OK if the for is parallel. But it seems that the loop is serial. Why?
The IR looks pretty good:
[I 07/02/20 22:58:34.340] [compile_to_offloads.cpp:taichi::lang::irpass::compile_to_offloads::<lambda_a4464fe7c75e1f42a3a490ee54c7ec3e>::operator ()@23] Simplified III:
kernel {
$0 = offloaded clear_list S1bitmasked
$1 = offloaded listgen S0root->S1bitmasked
$2 = offloaded struct_for(S1bitmasked) block_dim=adaptive
body {
<i32 x1> $3 = loop $2 index 0
<i32 x1> $4 = const [0]
<i32 x1> $5 = const [1]
<i32 x1> $6 = cmp_eq $3 $4
<i32 x1> $7 = bit_and $5 $6
$8 : if $7 {
$9 continue (scope=$2)
}
print $3, "
"
}
}
For a better feeling of return, try
@ti.kernel
def error():
for i in x:
if i == 2:
continue
print(i)
This is a problem that is subject to backend implementation.
We translated the top-level continue as return because in most cases, each kernel function handles one index group. However, the struct-for function on LLVM has a loop that can handle more than one indices, see https://github.com/taichi-dev/taichi/blob/ad3fe31c0d5e0762efe1da61859629420b61a884/taichi/codegen/codegen_llvm.cpp#L1314-L1319
So in this case, we have two options:
continue in LLVM struct-for as a special case.I prefer the second option: it makes sense to me if we treat continue in LLVM struct-for as a real continue.
I prefer the second option: it makes sense to me if we treat continue in LLVM struct-for as a real continue.
Talked more on this later yesterday. The problem was that range_for does follow this way to create a function wrapper: https://github.com/taichi-dev/taichi/blob/4e37b9a3d93ac85297b9db2a1b15e16d165d4fa3/taichi/backends/cpu/codegen_cpu.cpp#L41-L51
so we cannot use continue in that case.
I think @yuanming-hu will do some refactoring to the LLVM backend to make this more consistent. Maybe we should wait for that before this fix. Meanwhile, maybe I can rename as_return() to from_kernel() or something, so that each backend impl can handle the top-level continue in their most suited ways.
Thank you guys for all the discussions here! I'll take these into consideration when I refactor the LLVM backend/runtime...
Most helpful comment
Thank you guys for all the discussions here! I'll take these into consideration when I refactor the LLVM backend/runtime...