Taichi: [Bug] [IR] random() is not an idempotent stmt and cannot be assigned multiple times.

Created on 25 Jun 2020 · 13Comments · Source: taichi-dev/taichi

Describe the bug

I found this when debugging https://github.com/taichi-dev/taichi/pull/1188#issuecomment-649510574

It basically boils down to this:

import taichi as ti

ti.init(arch=ti.x64, print_ir=True)

@ti.kernel
def foo():
  a = ti.Vector([ti.random(), 2.3]).normalized()
  print(a)

foo()

Here's the generated IR:

    <f32 x1> $1 = const [1.0]
    <f32 x1> $2 = rand()  <-- !
    <f32 x1> $3 = mul $2 $2
    <f32 x1> $4 = const [2.3]
    <f32 x1> $5 = const [5.29]
    <f32 x1> $6 = add $3 $5
    <f32 x1> $7 = sqrt $6
    <f32 x1> $8 = div $1 $7
    <f32 x1> $9 = rand()  <-- !
    <f32 x1> $10 = mul $8 $9
    <f32 x1> $11 = rand()  <-- !
    <f32 x1> $12 = mul $11 $11
    <f32 x1> $13 = add $12 $5
    <f32 x1> $14 = sqrt $13
    <f32 x1> $15 = div $1 $14
    <f32 x1> $16 = mul $15 $4
    print "[", $10, ", ", $16, "]"

As we can see, rand() appeared three times in the IR. The correct way will be to assign $2 to $9 and $11.

bug discussion ir python

Source

k-ye

👍1

Most helpful comment

The initial IR looks off already:

[I 06/26/20 00:07:34.666] [compile_to_offloads.cpp:operator()@23] Initial IR:
kernel {
  $0 = alloca @tmp0
  @tmp0 = ((1  ((sqrt (((rand<float32>() pow 2) + (2.3 pow 2)) + 0)) + 0)) * rand<float32>())
  $2 = alloca @tmp1
  @tmp1 = ((1  ((sqrt (((rand<float32>() pow 2) + (2.3 pow 2)) + 0)) + 0)) * 2.3)
  $4 = eval @tmp0
  $5 = eval @tmp1
  print "[", %4, ", ", %5, "]"
}

k-ye on 25 Jun 2020

👍2

All 13 comments

The initial IR looks off already:

[I 06/26/20 00:07:34.666] [compile_to_offloads.cpp:operator()@23] Initial IR:
kernel {
  $0 = alloca @tmp0
  @tmp0 = ((1  ((sqrt (((rand<float32>() pow 2) + (2.3 pow 2)) + 0)) + 0)) * rand<float32>())
  $2 = alloca @tmp1
  @tmp1 = ((1  ((sqrt (((rand<float32>() pow 2) + (2.3 pow 2)) + 0)) + 0)) * 2.3)
  $4 = eval @tmp0
  $5 = eval @tmp1
  print "[", %4, ", ", %5, "]"
}

k-ye on 25 Jun 2020

👍2

Fun fact, simply use a temp var:

import taichi as ti

ti.init(print_ir=True)

@ti.kernel
def foo():
    a = ti.random()
    c = ti.Vector([a, 2.3]).normalized()
    print(c)


foo()

solve the problem:

kernel {
  $0 = offloaded  
  body {
    <f32 x1> $1 = rand()
    <f32 x1> $2 = const [1.0]
    <f32 x1> $3 = mul $1 $1
    <f32 x1> $4 = const [2.3]
    <f32 x1> $5 = const [5.29]
    <f32 x1> $6 = add $3 $5
    <f32 x1> $7 = sqrt $6
    <f32 x1> $8 = div $2 $7
    <f32 x1> $9 = mul $8 $1
    <f32 x1> $10 = mul $8 $4
    print "[", $9, ", ", $10, "]
"
  }
}

(Btw, ir_printer for PrintStmt seems broken after introducing end='\n'...)

archibate on 25 Jun 2020

Shall we just translate

a = ti.Vector([ti.random(), 2.3]).normalized()

into

tmp = ti.random()
a = ti.Vector([tmp, 2.3]).normalized()

at the frontend?

xumingkuan on 25 Jun 2020

BTW why https://github.com/taichi-dev/taichi/pull/1188#issuecomment-649510574 did work? This behavior should have existed in the codebase for months...

xumingkuan on 25 Jun 2020

Shall we just translate

a = ti.Vector([ti.random(), 2.3]).normalized()

into

tmp = ti.random()
a = ti.Vector([tmp, 2.3]).normalized()

at the frontend?

But this will change a.fill(ti.random())'s behavior. I think we should fill different values in this case. Maybe we should just tell the users to use a temp var when they want a deterministic random value?

xumingkuan on 25 Jun 2020

Do you know how to distinguish [ti.random(), 2.3] (which needs to be copied to a temp var) and a.fill(ti.random()) (which needs to be not copied) in python?

xumingkuan on 25 Jun 2020

Do you know how to distinguish [ti.random(), 2.3] (which needs to be copied to a temp var) and a.fill(ti.random()) (which needs to be not copied) in python?

I guess copying is needed in both cases?

yuanming-hu on 26 Jun 2020

So a.fill(ti.random()) should fill the same value to all elements of a?

xumingkuan on 26 Jun 2020

# Before preprocessing:
@ti.kernel
def foo():
    a = ti.Vector([ti.random(), 2.3]).normalized()
    print(a)

# After preprocessing:
def foo():
  a = ti.expr_init(ti.Vector([ti.random(), 2.3]).normalized())
  ti.ti_print(a)

The code after preprocessing looks good.

xumingkuan on 26 Jun 2020

1337 contains a temporary fix. I think we need a systematic solution here.

The code after preprocessing looks good.

Right, but normalized and __init__ of Vector leads to the duplicated ti.random(). For example, normalized returns an Expr instead of an evaluated value. This Expr is then duplicated when multiplying with the two vector components.

I think we should use @ti.func for Matrix.normalized instead of @taichi_scope. The later is great but we need to enforce preprocessing there. Currently, in many functions in Matrix, we are doing the preprocessing by hand, which is very error-prone. In the @ti.func preprocessor, we should enforce a ti.expr_init (which creates a local variable that holds the evaluated values of Expr) on return.

Also note that ti.expr_init may be creating too many allocas and local stores. IIRC FrontendEvalStmt may be a better choice.

yuanming-hu on 26 Jun 2020

👍1

So should I add ti.expr_eval now?

archibate on 26 Jun 2020

So should I add ti.expr_eval now?

Actually another issue is that FrontendEvalStmt cannot be referred to later... We may need something like ReferenceExpr(FromtendEvalStmt *) and FrontendRefStmt...

Maybe we should stick to expr_init for now.

yuanming-hu on 26 Jun 2020

I'm really confused now... It feels like I found two independent things in https://github.com/taichi-dev/taichi/pull/1188#issuecomment-649510574.

Before df62174854af8f5d964b39f0239e0f909810110a, which is 4e6ed8a7c4f5527c0c50625b5094ab6e43df9413, I don't see this problem.
At df62174854af8f5d964b39f0239e0f909810110a (#1188 ), if I comment out https://github.com/taichi-dev/taichi/blob/13462ff09560de75fc780de3ee6b0519086ee3b9/python/taichi/lang/ops.py#L213-L215, I don't see the problem.
At df62174854af8f5d964b39f0239e0f909810110a, even if I move ti.random() out and assign it to a tmp var a, then construct the vector with a. I still see that problem.
At master (i.e. with #1330 ), I still see the problem...

k-ye on 26 Jun 2020

😕1 👍1

Was this page helpful?

0 / 5 - 0 ratings

Related issues

[CUDA] [Perf] The performance of cornell_box.py on CUDA is still poor than OpenGL

archibate · 3Comments

[Doc] Improve documentation

yuanming-hu · 3Comments

NAN check and throw exception on debug mode

g1n0st · 3Comments

[Bug] [Sparse] Particles disappear in `example/particle_renderer.py`

yuanming-hu · 3Comments

[lang] Add constants like `ti.pi`, `ti.ln2`.

archibate · 4Comments