Addons: Implement GPU kernels

Created on 30 Mar 2019  路  13Comments  路  Source: tensorflow/addons

Currently we've omitted the gpu kernels from addons while we got everything setup. Now that CI testing is setup for GPU we should begin adding these to addons. There's a lot of things required so I wanted to start this issue so we can track / discuss.

Some of the TODO:

  • [x] Add gpu kernels and compile them as shown in the custom_op template which now includes a GPU op : PR #294
  • Configure BUILD scripts to use cuda tests

    • [x] image/image_projective_transform_op_gpu.cu.cc #294

    • [x] distance_transform_ops

    • [x] image/adjust_hsv_in_yiq_op_gpu.cu.cc

    • [x] seq2seq/beam_search_ops_gpu.cu.cc

  • [x] Update CONTRIBUTING documentation and add tensorflow/tensorflow:custom-op-gpu as the docker image
  • [x] Specify the TF_NEED_CUDA variable appropriately in our scripts
build

Most helpful comment

Closing this as all kernels are now implemented. I'll create a new issue for discussing how we will package GPU kernels as there is some discussion needed.

For record keeping... here are the notes from SIG Build meeitng regarding the 2 tf-core packages:

* ETA for tensorflow and tensorflow-gpu package to consolidate into a single pip package with dynamic kernels?
    * Ubuntu pip package available as tf-nightly-gpu, fixing windows gpu pip package now. If not 2.0, will be in 2.1 (will update this after confirming).
    * Dynamic kernels may take some more time to land
        * Packages are already available as tf-nightly-gpu but Windows is being fixed
        * Uncertain if this will happen in 2.0, because we鈥檙e not sure about breaking it
    * Jonathan: Will dynamic kernel support MKL/DNN?
        * Gunhan: Seems like no, haven鈥檛 figured this out yet

All 13 comments

@gunan @martinwicke Is there anyway we can see how https://github.com/tensorflow/community/pull/2 is progressing?

We've been discussing how we would like to package an Addons that includes GPU kernels and I believe we'll be depending on this in order to have a single pip package that loads either gpu or non gpu kernels. Is that correct? Publishing an addons-gpu does not seem like a good idea if this is on the horizon

@yifeif and I are working towards this
We are hoping to deliver this in Q2

Small update: dynamic kernels are now implemented on the tensorflow master branch so we can begin work on this.

@seanpmorgan Finally got myself an nvidia GPU. I can work on this.

@gunan Hey, do you have some sample code on how this is supposed to be used?

In the RFC, you mention a new API: tf.load_kernel_library, but I can't seem to find in the master branch of TensorFlow, I did find tf.load_library though.

During implementation, we decided to make it more general and name it "load_library" instead.
Please use that function instead.

@av8ramit Hi, Amit. Can you check the GPU test failures in #294? Which docker image do we use for gpu test? Is it tensorflow/tensorflow:custom-op-gpu image required by https://github.com/tensorflow/custom-op? Thank you.

Hi @facaiy we use an internal GCP image, but it's similar to the Docker image you listed.

@yifeif @gunan We're in the process of setting up our GPU env, but it looks as though tf-nightly-2.0-preview returns False when running tf.test.is_gpu_available(). If we run with tf-nightly-gpu-2.0-preview it correctly finds the device. As I understood it dynamic kernels have been merged into master so is this just a bug in device_lib or is the standard nightly package not able to run GPU kernels?

More information I'm running custom-op-gpu container and running:

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

I can file an issue on tensorflow/tensorflow if it's just something that needs to be updated in that library.

@karmel @yifeif @gunan Friendly bump on the above question.

We're currently dealing with a failing test case that is only occuring on python2 tf-nightly-2.0-gpu. python3 gpu is passing because there hasn't been a released pip package for py34 gpu in 10 days. This makes is very difficult for us to work around these new errors when tf core nightly builds are failing frequently and we're not sure how to build our tests around them. See #373

Why is there still a tf-nightly-2.0-gpu package and is it going to stay around.

We have dropped support for Python3.4 packages as it is fairly old. @karmel could this issue be solved with upgrading the python3 version to Python3.5?

Good to know thanks. So we will upgrade our CI scripts to install py36, but for local testing the custom-op docker image we use only has py34 installed (still waiting on a new custom-op which looks like its soon)

Is there a reason for the 2.0-nightly-gpu pip package though? We were under the impression there would be a unified pip package now that dynamic kernels are in. We need to know this because we have a fail on GPU pip but not CPU pip package. It's difficult to troubleshoot without knowing the difference between these 2.

Closing this as all kernels are now implemented. I'll create a new issue for discussing how we will package GPU kernels as there is some discussion needed.

For record keeping... here are the notes from SIG Build meeitng regarding the 2 tf-core packages:

* ETA for tensorflow and tensorflow-gpu package to consolidate into a single pip package with dynamic kernels?
    * Ubuntu pip package available as tf-nightly-gpu, fixing windows gpu pip package now. If not 2.0, will be in 2.1 (will update this after confirming).
    * Dynamic kernels may take some more time to land
        * Packages are already available as tf-nightly-gpu but Windows is being fixed
        * Uncertain if this will happen in 2.0, because we鈥檙e not sure about breaking it
    * Jonathan: Will dynamic kernel support MKL/DNN?
        * Gunhan: Seems like no, haven鈥檛 figured this out yet

Was this page helpful?
0 / 5 - 0 ratings