Tvm: [Simplifier] [Regression] Bounds checks for variable-extent loops not eliminated by simplifier

Created on 14 Aug 2019 · 10Comments · Source: apache/tvm

Reproduction script:

```.py
import tvm

n = tvm.var('n')
X = tvm.placeholder(shape=(n,), name="x")
W = tvm.placeholder(shape=(n + 1,), dtype="int32", name="w")

def f(i):
start = W[i]
extent = W[i+1] - W[i]
rv = tvm.reduce_axis((0, extent))
return tvm.sum(X[rv + start], axis=rv)
Y = tvm.compute(X.shape, f, name="y")
s = tvm.create_schedule([Y.op])
with tvm.build_config(add_lower_pass=[(2, lambda stmt: tvm.ir_pass.LoopPartition(stmt, False))]):
print(tvm.lower(s, [X, W, Y], simple_mode=True))


At 60607eff494ab0b40ec21d6faf72c2f3dcef502f, the output is:

produce y {
for (i, 0, n) {
y[i] = 0f
for (rv, 0, (w[(i + 1)] - w[i])) {
if ((rv < (w[(i + 1)] - w[i]))) {
y[i] = (y[i] + x[(rv + w[i])])
}
}
}
}


Where we have an unnecessary bounds check on the inner loop.

at 52179338f3ed7cba17ebe2ac68b5eab6ee3f5179, the output is:

produce y {
for (i, 0, n) {
y[i] = 0.000000f
for (rv, 0, (w[(i + 1)] - w[i])) {
y[i] = (y[i] + x[(w[i] + rv)])
}
}
}
```

That is, we avoid any bounds checks in the inner loop. The loop partition pass makes no difference here.

This seems to be a bug in the new simplifier infrastructure? cc @tqchen.

https://github.com/ajtulloch/tvm/tree/fix-cond-in-variable-extent-loo is a fix, but it's quite ugly and there should be a better fix in the simplifier.

help wanted

Source

ajtulloch

All 10 comments

Git bisect blames to 75c29c6a1e16683c2660e362c769c311d54a22d6, https://github.com/dmlc/tvm/pull/3463.

I identified the issue in the PR, https://github.com/dmlc/tvm/commit/75c29c6a1e16683c2660e362c769c311d54a22d6#r34683949.

@tqchen do you have any ideas on how to cleanly get the simplifier to recognize this pattern? I would propose:

```.diff
diff --git a/src/arithmetic/int_set.cc b/src/arithmetic/int_set.cc
index 0dae8198..e1785125 100644
--- a/src/arithmetic/int_set.cc
+++ b/src/arithmetic/int_set.cc
@@ -375,7 +375,9 @@ class IntervalSetEvaluator :
IntervalSet min_set = this->Eval(val->min_value);
IntervalSet max_set = this->Eval(val->max_value);
--recur_depth_;
- return IntervalSet(min_set->min_value, max_set->max_value);
+ return IntervalSet(
+ min_set->IsEverything() && !val->IsEverything() ? val->min_value : min_set->min_value,
+ max_set->IsEverything() && !val->IsEverything() ? val->max_value : max_set->max_value);
}

IntervalSet VisitExpr_(const IntImm* op) final {
```

which avoids further pessimization of the bounds to IsEverything unnecessarily. But this doesn't solve more complex problems like bsr_mm, which has these problems due to the compute_at(..) declaration which seems to break the simplifier.

ajtulloch on 14 Aug 2019

👍1

Great observation! I think the main reason here is that when comparing the bounds, we can try different relaxations.

While in our normal cases it is fine to just relax i, in this particular case, the best way is to not do so and makes the bound of r dependent on i.

We might want to think about whether or not if we can try both, if we can list the other failure case you mentioned, we can also dig a bit deeper

tqchen on 14 Aug 2019

specifically to that PR, I think one possible change is not to eagerly resolve the bound that is context depend, but marks it to be so, and then try to recursively simplify min and max after we get a bound that we know is context dependent(so we will do simplification that involves i dependent first, then relax i if necessary)

tqchen on 14 Aug 2019

@tqchen I don't fully understand your idea. In general, how robust do you think that approach is? One thing I was thinking of adding would be a trivial step in Simplifier which, when visiting a Range(min, extent) node, adds the constraints loop_var >= min and loop_var < min + extent, and then detects expressions which exactly match one of these and replace them with const(1). This is a simple thing and should remove these bugs going forward.

ajtulloch on 15 Aug 2019

👍1

To elaborate, in the example below, we have the following range bound to to the loop vars

r: 0 to w[i+1] - w[i]
i: 0 to n

Our previous approach eagerly expand the bound of each variable, so we use bind to bind the bound of r, it get relaxed to Union_{i \in bound of i} {0 to w[i+1] - w[i]}, before it starts simplification. Sor - w[i+1] - w[i] can no longer get simplified.

The proposed idea is to make such expansion lazy. i.e. when we evaluate r, we would return the bound 0 to w[i+1] - w[i]. So if we call evalset for r+1, it would be 1 to w[i+1] - w[i]+1. However, we find that the new bound is still dependent on some of the relaxed variables, in this case i, and we will continue to relax the min and max value after we get the resulting set.

tqchen on 16 Aug 2019

@ajtulloch after thinking a bit more about the issue. I think your idea of adding bound constraints may be the best way to go, can you send in a PR?

tqchen on 21 Aug 2019

ping @ajtulloch can you follow up on this thread:)

tqchen on 16 Sep 2019

@tqchen sure - are you saying you want something like https://github.com/ajtulloch/tvm/commit/ca1088c63c51e14a9d3c02df7eaa44fd60c23ce4?

ajtulloch on 26 Sep 2019

looks great! Although this is a solution that only works for the exact condition, it fits our purpose and is a pragmatic solution

tqchen on 26 Sep 2019

@tqchen followed up in https://github.com/dmlc/tvm/pull/4076.

ajtulloch on 7 Oct 2019

Was this page helpful?

0 / 5 - 0 ratings