One: Modifying IR for transpose, and its dependency on acl-cl, acl-neon

Created on 4 Jun 2020  路  8Comments  路  Source: Samsung/ONE

I have implemented shape inferencing rules for Transpose op under pull request #1461. One of the features under this pull request modifies the perm parameter of Transpose to be an input tensor, as it can have non-constant values. This feature calls for modifications to front end base_loader.h, as well as backend files such as KernelGenerator, and TransposeLayer.

Currently, perm is implemented as an input tensor, and has been successfully validated on cpu backend. However, the remaining backends, i.e., acl-cl and acl-neon do not support this change yet. In this context, I wanted to know which of the following approaches is feasible:

  1. Modify KernelGenerators for acl-cl and acl-neon to handle perm as input tensor, rather than Params.

  2. Revert to the original IR, where perm was a Param. However, we will not be covering non-constant perm values.

Please let me have your opinions on these alternatives. Thank you in advance.

help wanted

All 8 comments

The acl backends don't support dynamic tensor yet. To support on acl backends is more complex than cpu backend. And the acl backends probably are much slower than cpu backend in current version of acl, because acl backends must generate their kernels whenever changing shape of input/output of the kernels before executing the kernels.
So, I think it would be better to throw error when using acl backends with dynamic tensor now.

@ragmani , Thanks for your quick reply. I will keep in mind to throw error when handling dynamic tensors on acl backend. The issue I am facing at present is a build error such as the one shown below:

15:09:53  /opt/jenkins_slave/workspace/nnfw/master/pr-cross-aarch64-runtime-test-pipeline-release@2/runtime/onert/backend/acl_cl/KernelGenerator.cc: In member function 'virtual void onert::backend::acl_cl::KernelGenerator::visit(const onert::ir::operation::Transpose&)':
15:09:53  /opt/jenkins_slave/workspace/nnfw/master/pr-cross-aarch64-runtime-test-pipeline-release@2/runtime/onert/backend/acl_cl/KernelGenerator.cc:706:25: error: 'const class onert::ir::operation::Transpose' has no member named 'param'
15:09:53     const auto &perm{node.param().perm};
15:09:53                           ^~~~~
15:09:53  /opt/jenkins_slave/workspace/nnfw/master/pr-cross-aarch64-runtime-test-pipeline-release@2/runtime/onert/backend/acl_cl/KernelGenerator.cc:706:37: error: unable to deduce 'const auto&' from '<expression error>'
15:09:53     const auto &perm{node.param().perm};
15:09:53                                       ^
15:09:53  /opt/jenkins_slave/workspace/nnfw/master/pr-cross-aarch64-runtime-test-pipeline-release@2/runtime/onert/backend/acl_cl/KernelGenerator.cc:708:26: error: 'const class onert::ir::operation::Transpose' has no member named 'param'
15:09:53     const auto rank = node.param().rank;

Which makes sense, as I have modified the IR of Transpose. So, to fix this, I should handle perm as a tensor under void KernelGenerator::visit(const ir::operation::Transpose &node). If I understood the logic correctly in acl_cl/KernelGenerator.cc, this function reads the perm values and passes them on to ::onert::backend::acl_common::getARMComputePermutationVector function.

I am not sure whether I can implement the same when perm is a tensor, i.e pass its values at this stage when we are configuring the backend (as opposed to when we evaluate the op). We would still need a working implementation with input tensors without dynamic inferencing.

@dr-venkman
I understand your intention and the problem you have now.
You can change the param of Transpose to input in IR in order to make it as a tensor on backend if the param is not TransposeOptions in schema of circle and tflite.
Each backend can use input of IR as tensor or just value in their KernelGenerator.
Now acl backends don't support dynamic tensor. So you can make it check dynamic tensor and throw error in their KernelGenerator.

  • How to check dynamic and throw error
    If Input::PERM is non-constant, it makes output tensor dynamic tensor.
  const auto ofm_idx{node.getOutputs().at(0)};
  const auto perm_index{node.getInputs().at(ir::operation::Transpose::Input::PERM)};
  if (!_ctx.at(perm_index).isConstant())
  // if (_tensor_builder->at(ofm_idx)->is_dynamic()) // I'm not sure if it works well now.
  {
    throw std::runtime_error{"error message"}
  }
  • How to convert the constant input(PERM) to PermutationVector
  // After checking dynamic

  const auto perm = _ctx.at(perm_index);
  // Reversed
  auto backend_pv = ::onert::backend::acl_common::getARMComputePermutationVector(
      rank, perm.asVector<int32_t>(), frontend_layout, backend_layout);

@ragmani, thanks a lot for your inputs. I tried the modifications as above in pr #1461. However, I got build errors for both acl_cl and acl_neon. I am pasting an example error below:

20:18:13  /opt/jenkins_slave/workspace/nnfw/master/pr-cross-runtime-test-pipeline-release@2/runtime/onert/backend/acl_cl/KernelGenerator.cc:1140:20: error: '_tensor_builder' was not declared in this scope
20:18:13     auto ofm_alloc = _tensor_builder->at(ofm_idx).get();
20:18:13                      ^~~~~~~~~~~~~~~
20:18:13  /opt/jenkins_slave/workspace/nnfw/master/pr-cross-runtime-test-pipeline-release@2/runtime/onert/backend/acl_cl/KernelGenerator.cc:1142:21: error: '_ctx' was not declared in this scope
20:18:13     const auto rank = _ctx.at(ofm_idx).shape().rank();
20:18:13                       ^~~~
20:18:13  /opt/jenkins_slave/workspace/nnfw/master/pr-cross-runtime-test-pipeline-release@2/runtime/onert/backend/acl_cl/KernelGenerator.cc:1178:46: error: 'asAclClFunction' was not declared in this scope
20:18:13     auto acl_fn = asAclClFunction(std::move(fn));

A couple of things about these errors are not clear, such as:

  1. The line numbers do not relate to my modifications (the line number above relate to void KernelGenerator::visit(const ir::operation::Permute &node), which I had not modified.

  2. The variables _ctx and _tensor_builder are declared in the same namespace where they are accessed (onert::backend::acl_cl::KernelGenerator), so we shouldn't be seeing these errors in the first place.

Did I miss any details, or better still, is there a way I could build for acl-cl on my working copy? Please let me know.

@ragmani , I fixed the build errors based on your suggestions, thanks for spotting the mistake.

On running the testcases under acl-cl and acl-neon backends, I still see errors under Transpose ops. Further debugging of asVector() and tensor->buffer() return values (under cpu backend) show a value of 0, when the operation is being configured. This is essentially a compilation stage, which means I am not able to access the operand values. I am still trying to understand the internals of onert, and at present, it appears to me that any attempt to access either operand value (using _ctx.at(idx) and dereferencing _data, or using tensor->buffer() method) does not work during graph compilation stage.

This limitation, if valid, can be circumvented on a cpu backend, by deferring processing of data until the evaluation. However, on acl-cl and acl-neon backends, it is not immediately clear whether a similar solution would work. From what I observe, the permutation vector is created from parameter values at the compilation stage itself (please refer acl_cl/KernelGenerator.cc).

For the reasons mentioned above, I am inclined to retain perm as a parameter instead of a tensor. This does impose a limitation that we cannot validate perm with a non-constant value.

Please correct me wherever I am mistaken in my reasoning above. At present, I will revoke any changes to Transpose IR and restore pr #1461 to validate Transpose with only dynamic input tensor.

The errors are caused by modifying perm parameter of unittests from constant to non-constant. After reverting, it would work well.

@ragmani, thanks a lot for the discussion. I have implemented Transpose with perm as parameter, and validated the testcases. I will close this issue, unless there is something that needs further discussion.

@dr-venkman
I have no further discussion anymore.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

lucenticus picture lucenticus  路  3Comments

mhs4670go picture mhs4670go  路  4Comments

YongseopKim picture YongseopKim  路  3Comments

underflow101 picture underflow101  路  4Comments

seanshpark picture seanshpark  路  3Comments