Tvm: [vulkan] Assertion in tir/transforms/lower_thread_allreduce.cc", line 157 TVMError: Check failed: v:

Created on 28 May 2020  路  13Comments  路  Source: apache/tvm

Thanks for participating in the TVM community! We use https://discuss.tvm.ai for any general usage questions and discussions. The issue tracker is used for actionable items such as feature proposals discussion, roadmaps, and bug tracking. You are always welcomed to post on the forum first :)

Issues that are inactive for a period of time may get closed. We adopt this policy so that we won't lose track of actionable issues that may fall at the bottom of the pile. Feel free to reopen a new one if you feel there is an additional problem that needs attention when an old one gets closed.

For bug reports, to help the developer act on the issues, please include a description of your environment, preferably a minimum script to reproduce the problem.

For feature proposals, list clear, small actionable items so we can track the progress of the change.

Hit assertion in vulkan target. Other targets might have the same issue as well.
File "/home/mye/main/incubator-tvm/src/tir/transforms/lower_thread_allreduce.cc", line 157
TVMError: Check failed: v:
repro.zip

Most helpful comment

I'm new to TVM, but I will have a try :)
Thanks! @tqchen

All 13 comments

I have encounted the same problem.
It's triggered by 7b74a8672e1e40e7541c0007d8628586c62277e8, as vulkan use some cuda codes in topi. Forcing the sched_warp_softmax (in topi/python/topi/cuda/softmax.py) to return false could bypass the problem.
The root cause of the problem seems that MakeAllreduce mistakenly assume the fourth arg should be a var node, as the var node might be replaced by a constant node in the simplify optimization.

Thanks @majiang31312 please let me know if you would like to attempt a fix. cc @kazum @wpan11nv @yongfeng-nv @roastduck

I'm new to TVM, but I will have a try :)
Thanks! @tqchen

This sounds similar to me to the symptoms discussed in the recent posts in #5600.
So quite likely, making the cuda softmax schedule specific to cuda would fix this (did this in #5726). Of course, implementing shuffle reductions where they exist is the alternative...
I'm not entirely sure how to improve the error message (because the root cause is that warp shuffle reductions are not supported on the target).

The fix seems quite simple, but I'm not sure whether it's complete.
Please take a look at the Discussion section. Thanks! @tqchen @wpan11nv

Problem:
when num_thread = 1 (that's the case for vulkan as CreateTarget in target.cc set thread_warp_size to 1),
'
ko, ki = s[B].split(B.op.reduce_axis[0], factor=num_thread)
s[B].bind(ki, te.thread_axis("threadIdx.x"))
'
will triger "TVMError: Check failed: v:" in MakeAllreduce.
when factor=1, simplify optimization replace the IterVar with a constant node, but MakeAllreduce want a var node.

Reproduce:

import tvm
from tvm import te

n, m = 32,32
num_thread = 1
A = te.placeholder((n, m), name='A' ,dtype = 'int8')
k = te.reduce_axis((0, m), "k")
B = te.compute((n, ), lambda i: te.sum(A[i, k], axis=[k]), name="B")

s = te.create_schedule(B.op)
ko, ki = s[B].split(B.op.reduce_axis[0], factor=num_thread)
s[B].bind(ki, te.thread_axis("threadIdx.z"))

#target = tvm.target.create("vulkan")
target = tvm.target.create("cuda")
s = tvm.lower(s, [A, B])
s = tvm.tir.transform.Apply(lambda f: f.with_attr("target", target))(s)
s = tvm.tir.transform.Simplify()(s)
print(s)
s = tvm.tir.transform.LowerThreadAllreduce()(s)

Fix:

--- a/src/tir/transforms/lower_thread_allreduce.cc
+++ b/src/tir/transforms/lower_thread_allreduce.cc
@@ -154,9 +154,17 @@ class ThreadAllreduceBuilder final : public StmtExprMutator {
     std::unordered_set<const VarNode*> reduce_set;
     for (size_t i = 2 + 2 * size; i < call->args.size(); ++i) {
       const VarNode* v = call->args[i].as<VarNode>();
-      CHECK(v);
-      reduce_set.insert(v);
+      // The simply optimization replace a iteration variable with a constant
+      // when extent of the iteration is 1. As threaded IterVar always started from 0, 
+      // we can just ignore this variable in this case.
+      if (v) {
+        reduce_set.insert(v);
+      } else {
+        CHECK(call->args[i].as<IntImmNode>() && call->args[i].as<IntImmNode>()->value == 0) 
+          << "arg" << i << "should be a VarNode or IntImmNode";
+      }
     }
+      
     size_t nmatch = 0;
     std::vector<ThreadEntry> vred, vpar;
     for (const AttrStmtNode* attr : thread_extents_) {
@@ -170,6 +178,11 @@ class ThreadAllreduceBuilder final : public StmtExprMutator {
         const auto* ptr = attr->value.as<IntImmNode>();
         CHECK(ptr) << "Need constant extent for reduce set " << iv;
         e.extent = static_cast<int>(ptr->value);
+        // ignore variables equal to 0
+        if (e.extent == 1) {
+          continue;
+        }
+
         if (reduce_set.count(iv->var.get())) {
           vred.push_back(e);
           ++nmatch;

Discussion:
At this moment threaded IterVar always started from 0, so we can safely ignore the const var node.
Maybe we could keep a record somewhere after we replace a VarNode with a IntImmNode? I thinks that would help to deal with such kind of cases more clearly.
By the way, the 'analyzer_.Simplify' in BufIndex can not work as expected. It looks like that the analyzer have not been initilized properly. I can provide test cases if someone want to take a look.

@t-vi Thanks for the advice.
In my opinion this is not a backend problem, we can triger it in the cuda backend (my test case above is using cuda).

@mei-ye Cool! Yes. I ran into trouble when the target info erroneously specified 1 thread per warp for ROCm, which would look similar but not for the same reason. I'm glad you found the complementing fix.

I do think there is a bug still to be fixed even if it is not currently triggered by the schedules in topi, i.e.
@majiang31312 's analysis looks very plausible and I can reproduce the assertion with the demo.

@majiang31312 can you please followup with another pr :)?

@majiang31312 can you please followup with another pr :)?

Hi @tqchen, do you mean #5726?
I'm strongly agree with @t-vi that this problem(5686) should be fixed separately. This problem is not related to any specific backend. In my opinion, it's just a misunderstanding in MakeAllreduce that impose a overstrict assert.

@majiang31312 that sounds good, can you send a PR with your proposed fix?

@majiang31312 that sounds good, can you send a PR with your proposed fix?

Thanks, I have create a PR for the fix.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

leandron picture leandron  路  61Comments

joshpoll picture joshpoll  路  28Comments

tqchen picture tqchen  路  27Comments

kevinthesun picture kevinthesun  路  46Comments

srkreddy1238 picture srkreddy1238  路  29Comments