Incubator-mxnet: Implementing new operators in an external module

Created on 24 Jan 2018 · 5 comments · Source: apache/incubator-mxnet

Currently it is possible to implement new operators in C++/CUDA, but doing so requires you to maintain your own copy of the source tree and build your operators into the standard mxnet library.
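
For context, the Python front end does already have an escape hatch that avoids rebuilding: the `mx.operator.CustomOp` API lets you register a new operator in pure Python. It cannot package compiled C++/CUDA kernels, which is what this request is about, but a minimal sketch (the operator name `scale2` is illustrative) shows the shape of it:

```python
import mxnet as mx

class ScaleOp(mx.operator.CustomOp):
    """Pure-Python operator that multiplies its input by 2."""
    def forward(self, is_train, req, in_data, out_data, aux):
        self.assign(out_data[0], req[0], in_data[0] * 2.0)

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        self.assign(in_grad[0], req[0], out_grad[0] * 2.0)

@mx.operator.register("scale2")
class ScaleProp(mx.operator.CustomOpProp):
    def __init__(self):
        super(ScaleProp, self).__init__(need_top_grad=True)

    def list_arguments(self):
        return ['data']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        return in_shape, [in_shape[0]], []

    def create_operator(self, ctx, in_shapes, in_dtypes):
        return ScaleOp()

# Usage: out = mx.nd.Custom(x, op_type='scale2')
```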

The only way to share such operators with others is to either share your fork of the MXNet repo or to contribute the operator back to the mainline MXNet project.

This is not scalable. It makes it very difficult to use operators from more than one fork, it requires developers to build the entire source tree rather than just the source for their operators, and it forces developers of custom operators to manually merge from the parent MXNet project every time they want to pick up features and bug fixes unrelated to their own operators.

Ideally, there should be a way to define new operators in a separate project that is built against the header files and shared library of a released version of MXNet, and then to load them from the various MXNet front-end languages without having to replace all of MXNet with a custom version.

This may already be possible if you are coding directly in C++, but if you are using Python, the mxnet shared library is installed in the Python library directory and is loaded in local mode, so other libraries cannot resolve MXNet symbols against it.
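
Concretely, the difference is the `dlopen` mode that `ctypes` passes when loading the library. A minimal sketch, assuming a `libmxnet.so` reachable by the dynamic loader:

```python
import ctypes

# Local mode (the ctypes default on Linux): symbols stay private to
# this handle, so shared objects loaded afterwards cannot resolve
# MXNet symbols against it.
lib_local = ctypes.CDLL("libmxnet.so")

# Global mode: exported symbols enter the process-wide namespace, so
# an external operator module loaded later could link against them.
lib_global = ctypes.CDLL("libmxnet.so", mode=ctypes.RTLD_GLOBAL)
```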

Instead, the mxnet shared libraries could be installed in a global location, independent of the front end being used, and loaded in global mode so that external modules could link against them. Each front-end language should then provide a way to load an external module and verify that it is compatible with the currently installed copy of mxnet.
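
A sketch of what such a loader could look like from Python. `MXGetVersion` is an existing call in MXNet's C API (`include/mxnet/c_api.h`); the loader function itself and the registration-on-load behavior are assumptions about how the feature might work:

```python
import ctypes

def load_op_module(path, min_version=10100):
    """Hypothetical loader for an external operator module (.so).

    min_version follows MXNet's MAJOR*10000 + MINOR*100 + PATCH
    scheme, e.g. 10100 for release 1.1.0.
    """
    # Load mxnet in global mode so the operator module can resolve
    # MXNet symbols against it.
    libmxnet = ctypes.CDLL("libmxnet.so", mode=ctypes.RTLD_GLOBAL)

    # Compatibility check via the existing C API.
    version = ctypes.c_int()
    libmxnet.MXGetVersion(ctypes.byref(version))
    if version.value < min_version:
        raise RuntimeError("mxnet %d is older than required %d"
                           % (version.value, min_version))

    # Loading the module triggers its static operator registrations
    # (or it could export an explicit init entry point instead).
    return ctypes.CDLL(path)
```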

Labels: Feature request, Operator

All 5 comments

Agree. A feature similar to PyTorch FFI is desired.

I'm writing a similar project named MobulaOP.
It makes it possible to implement new operators in Python/C++/C/CUDA without rebuilding the source of the deep learning framework.

It aims to let you write an operator's code once and use it on different deep learning frameworks and different devices.

For example, I wrote a ROIAlign operator. The project generates the related CUDA code automatically, and the ROIAlign operator supports CPU/GPU and works with MXNet, NumPy, PyTorch, etc.
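
To make the "write the kernel once" idea concrete, here is an illustrative sketch (invented names, not MobulaOP's documented API): a single scalar kernel body from which a generator could emit an equivalent C++ loop and CUDA thread body, shown here with only a plain NumPy backend:

```python
import numpy as np

def scale_kernel(i, x, y, alpha):
    # Single-source kernel body: a code generator could translate this
    # per-element assignment into a C++ loop and a CUDA kernel.
    y[i] = alpha * x[i]

def run_numpy(kernel, x, alpha):
    # Reference "backend": invoke the kernel once per element.
    y = np.empty_like(x)
    for i in range(x.size):
        kernel(i, x.ravel(), y.ravel(), alpha)
    return y

print(run_numpy(scale_kernel, np.arange(4.0), 2.0))  # [0. 2. 4. 6.]
```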

Bump. Extensibility is very important.
