Concisely describe the proposed feature
Currently, in Taichi-scope we operate on individual tensor elements, not on the tensor itself.
But I think it would be convenient to operate on a tensor as a whole in Taichi-scope.
Consider the following code:
@ti.kernel
def blur(y: ti.ext_arr(), x: ti.ext_arr(), ker: ti.ext_arr()):
    for i, j in x:
        y[i, j] = sum(x[i-1:i+1, j-1:j+1] * ker) / sum(ker)  # slicing sub-tensor

blur(img_out, img_in, [[0, 1, 0], [1, 2, 1], [0, 1, 0]])
With this syntactic sugar, a Gaussian blur kernel (or any other filter, depending on ker) becomes simple to write and easy to read. The same sugar would also simplify convolution kernels.
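To make the intended semantics concrete, here is a minimal plain-Python sketch of what the proposed kernel would compute. It is not Taichi code: `blur_reference` is a hypothetical name, the window is assumed to be centered on (i, j) and as large as ker, and border pixels are simply copied, which the proposal does not specify. Note that under Python's half-open slice convention a 3x3 window would be written x[i-1:i+2, j-1:j+2].

```python
def blur_reference(img, ker):
    """Plain-Python reference for the proposed blur (hypothetical helper).

    Assumptions: the window is centered on (i, j) and has the same shape
    as `ker`; border pixels are copied unchanged.
    """
    h, w = len(img), len(img[0])
    kh, kw = len(ker), len(ker[0])
    ker_sum = sum(sum(row) for row in ker)
    out = [row[:] for row in img]  # start from a copy so borders keep their values
    for i in range(kh // 2, h - kh // 2):
        for j in range(kw // 2, w - kw // 2):
            acc = 0.0
            # Element-wise product of the window with ker, then a sum:
            # the expansion of sum(x[window] * ker).
            for di in range(kh):
                for dj in range(kw):
                    acc += img[i - kh // 2 + di][j - kw // 2 + dj] * ker[di][dj]
            out[i][j] = acc / ker_sum
    return out
```

Calling `blur_reference(img, [[0, 1, 0], [1, 2, 1], [0, 1, 0]])` on a nested list of numbers returns a new nested list; the input is left untouched.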
Describe the solution you'd like (if any)
We could use Matrix as the type of the temporary variable holding x[i-1:i+1, j-1:j+1].
My concerns about using Matrix:
What if x is already a tensor of matrices? Would x[1:3, 2:4] then give us a matrix of matrices?
We may want a return type other than Matrix, say: x[1:3, 2:4] is of type ti.TempTensor?
Additional Comments
We may also want stepping, like y[i-3:i+3:2], where 2 is the step length.
Ultimately, I think this only requires a change to the Python ASTTransformer.
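As a rough illustration of why an AST-level change could be enough: Python's standard `ast` module already exposes the lower bound, upper bound, and step of every slice, so a transformer can see them before any Taichi IR is generated. The sketch below only inspects slices; `find_slices` is a hypothetical helper, not part of Taichi's ASTTransformer, and it assumes Python 3.9+ for `ast.unparse`.

```python
import ast


def find_slices(source):
    """Return (lower, upper, step) source strings for every slice in `source`.

    Hypothetical helper for illustration; missing parts are reported as None.
    """
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Slice):
            parts = tuple(
                ast.unparse(part) if part is not None else None
                for part in (node.lower, node.upper, node.step)
            )
            found.append(parts)
    return found


if __name__ == "__main__":
    print(find_slices("y[i-3:i+3:2]"))
    print(find_slices("x[i-1:i+1, j-1:j+1]"))
```

A real transformer would rewrite each such subscript into loops or a temporary-tensor construction instead of just listing it.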
Speaking of ti.TempTensor...
I'm also thinking about ti.wince (tensor contraction):
C = ti.wince(A, ti.ij, B, ti.jk) # `j` follows Einstein's sum rule
represents: C_{ik} = \sum_j A_{ij} B_{jk}
Or simply sugar it into: C = A.ij * B.jk?
ti.wince would eventually be translated into range-for loops, so also consider a ti.static_wince for small, statically optimized range-for loops.
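For reference, the range-for loops that such a contraction would expand to can be sketched in plain Python. `contract_ij_jk` is a hypothetical name used only for illustration; it hard-codes the index pattern ij, jk from the example above rather than parsing arbitrary index labels.

```python
def contract_ij_jk(A, B):
    """Compute C[i][k] = sum over j of A[i][j] * B[j][k] with explicit loops.

    Hypothetical illustration of the loops a contraction over the shared
    index j would lower to; A and B are nested lists.
    """
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "shared index j must have equal extent"
    C = [[0] * p for _ in range(n)]
    for i in range(n):          # free index i
        for k in range(p):      # free index k
            for j in range(m):  # repeated index j is summed over
                C[i][k] += A[i][j] * B[j][k]
    return C
```

For small, compile-time-known extents, the three loops are the part a static variant could unroll.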
First of all, your writing in this issue is really clear and concise. Great job.
My input: to implement sum(x[i-1:i+1, j-1:j+1] * ker) we might need to introduce a set of IR that allows allocation of local arrays, something like ArrayAllocStmt, which extends the current AllocStmt.
Then this can be translated into something like
%1 = ArrayAllocStmt
for i, j in ...
    ... multiply ker[i, j] with x[...] and write the result into %1
...compute the sum of %1 using a for loop...
ArrayAllocStmt(size=4) can be translated to int a[4] in C-like backends.
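The two-phase lowering described above can be mimicked in plain Python to show the data flow. This is only a sketch: `window_sum_lowered` is a hypothetical name, the fixed-size list stands in for the proposed ArrayAllocStmt(size=4), and the 2x2 window corresponds to the slice x[i-1:i+1, j-1:j+1] under Python's half-open convention.

```python
def window_sum_lowered(x, ker, i, j):
    """Mimic the lowered form of sum(x[i-1:i+1, j-1:j+1] * ker) for a 2x2 ker."""
    tmp = [0.0] * 4  # stands in for ArrayAllocStmt(size=4), i.e. a local array
    # Phase 1: multiply each window element by the matching ker element
    # and store the product in the local array.
    for di in range(2):
        for dj in range(2):
            tmp[di * 2 + dj] = x[i - 1 + di][j - 1 + dj] * ker[di][dj]
    # Phase 2: reduce the local array with a plain for loop.
    total = 0.0
    for t in range(4):
        total += tmp[t]
    return total
```

Because the array size is known at compile time, a C-like backend could place `tmp` on the stack, as the comment above suggests.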