Tfjs: Plans for N-D Convolution?

Created on 26 Jun 2018  ·  13 comments  ·  Source: tensorflow/tfjs

TFJS already has conv1d and conv2d, which are great, but I'm really missing Python's N-D convolution (tf.nn.convolution).

Are there plans for implementing this? And if not, would the team be open to PRs?

core feature


All 13 comments

We don't have plans for implementing it, but happy to take PRs!

@nsthorat
I've implemented convolution here. Essentially, my implementation determines which convolution to use (conv1d, conv2d) based on the rank of the filter.
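The dispatch described above might look something like this minimal JavaScript sketch. The names `conv1dImpl`, `conv2dImpl`, and `convolution` are illustrative stand-ins, not tfjs-core's actual internals:

```javascript
// Stubs standing in for tf.conv1d / tf.conv2d so the sketch is self-contained.
function conv1dImpl(input, filter, stride, pad) {
  return {op: 'conv1d', stride, pad};
}

function conv2dImpl(input, filter, strides, pad) {
  return {op: 'conv2d', strides, pad};
}

// A conv1d filter is rank 3 ([filterWidth, inChannels, outChannels]) and a
// conv2d filter is rank 4 ([filterHeight, filterWidth, inChannels,
// outChannels]), so the filter's rank tells us which kernel to call.
function convolution(input, filter, filterRank, strides, pad) {
  switch (filterRank) {
    case 3: return conv1dImpl(input, filter, strides, pad);
    case 4: return conv2dImpl(input, filter, strides, pad);
    default: throw new Error(`unsupported filter rank: ${filterRank}`);
  }
}
```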

At this point, I'd like your guidance on how to proceed with the tests.
To me, there are two viable approaches:

  • Test that the correct convolution was called with the correct arguments (I don't know how to do this), or
  • Reuse the tests from conv1d and conv2d. We could either copy and paste those tests, or factor them out and call them from both conv1d/conv2d and convolution.

What do you think?

Hi @julianoks

Having an ND wrapper around conv1d and conv2d is not very useful at this point, but if conv3d support gets in, I can see the value. Are you interested in implementing conv3d?

@dsmilkov
Unfortunately, I don't think I'll have time to implement conv3d anytime soon.

No worries. In that case, we can postpone the convolution wrapper.

@julianoks @dsmilkov I'm taking a stab at conv3d and will update with progress within a few days.

Just a quick update: this is a little more complex than I initially thought, but I'm giving it some free time over the next few weeks and think I can have it finished in July.

@dsmilkov For conv3d input formats, it looks like the conv2d implementation in tfjs-core strays from the TensorFlow Python documentation. I'm guessing this dates from the deeplearn.js days, but I'm wondering whether I should match the paradigm currently used in tfjs-core or stick exactly to the Python documentation.

P.S. No rush in answering, changing this should be easy at any point.

Python Conv2d documentation: https://www.tensorflow.org/api_docs/python/tf/nn/conv2d
tfjs-core implementation: https://github.com/tensorflow/tfjs-core/blob/master/src/ops/conv.ts

Python Conv3d documentation: https://www.tensorflow.org/api_docs/python/tf/nn/conv3d
my stubbed implementation of conv3d for tfjs-core: https://github.com/zboldyga/tfjs-core/commit/9b98586bffe3ef0ae2d30e703938cdcf62c76169

Currently I'm following your existing paradigm. It seems like making a change to conv3d would warrant changing conv2d and possibly conv1d, which would be a breaking change.

Just curious: what specifically diverges between tfjs-core and the Python documentation? Is it just the naming? The implementation should be identical, since we test this against real TensorFlow.

If naming is the only difference, I would stick with what we did for conv2d in tfjs-core.

@nsthorat

The Python documentation for tf.nn.conv2d shows the following function description:
tf.nn.conv2d( input, filter, strides, padding, use_cudnn_on_gpu=True, data_format='NHWC', dilations=[1, 1, 1, 1], name=None )

Comparing this python documentation to the tfjs-core implementation of conv2d, I see the following differences:

  1. input is strictly a 4D tensor [batch, in_height, in_width, in_channels] in the Python implementation, whereas the tfjs-core implementation accepts a 4D tensor [batch, in_height, in_width, in_channels] OR a 3D tensor [in_height, in_width, in_channels] (in which case tfjs assumes a batch size of 1).
  2. In the Python implementation, strides is a 1D tensor of length 4 [batch, height, width, channels], where it’s recommended that batch == channels == 1. In the tfjs-core implementation, strides is a 1D tensor of length 2 [height, width]; batch and channels are automatically assumed to be 1.
  3. The Python documentation says padding must be ‘SAME’ or ‘VALID’, while the tfjs-core implementation allows ‘SAME’, ‘VALID’, or an integer.
  4. dilations has the same mismatch as strides in point 2: the Python documentation calls for a 1D tensor of length 4, while the tfjs-core implementation calls for a 1D tensor of length 2 (and automatically assumes the first and last entries are 1). The Python documentation states that dilations in the batch and depth dimensions must be 1, yet the argument is still length 4 in the Python API.
  5. dimRoundingMode is a parameter in the tfjs-core implementation, but it does not appear in the Python documentation.
  6. name is not a parameter in the tfjs-core implementation (it's an optional parameter in the Python implementation; I'm assuming it has some sort of debugging or logging use).

I didn't check the Python documentation against the Python implementation; I'm assuming the documentation is accurate.
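For points 3 and 4 above, TensorFlow's 'SAME'/'VALID' output-size rules (including the effect of dilation) can be sketched in plain JavaScript. `outputSize` is an illustrative helper, not code from either library:

```javascript
// Per-dimension output size for a convolution, following TensorFlow's
// documented padding semantics.
function outputSize(inSize, filterSize, stride, dilation, pad) {
  // A dilated filter covers filterSize + (filterSize - 1) * (dilation - 1)
  // input positions.
  const effectiveFilter = filterSize + (filterSize - 1) * (dilation - 1);
  if (pad === 'VALID') {
    // Only fully-overlapping filter placements are used.
    return Math.ceil((inSize - effectiveFilter + 1) / stride);
  }
  if (pad === 'SAME') {
    // Input is zero-padded so the output size depends only on the stride.
    return Math.ceil(inSize / stride);
  }
  throw new Error(`unknown padding mode: ${pad}`);
}
```

For example, a size-5 input with a size-3 filter gives 3 outputs under 'VALID' but 5 under 'SAME' (at stride 1, dilation 1), and dilating that filter to rate 2 (effective size 5) leaves a single 'VALID' output.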

Hi @zboldyga ,

Nice comparison of the differences. When adding conv3d, I would follow what we did with conv2d in tfjs-core. For conv3d, this means: take a 4D or 5D tensor as input, strides is length 3, padding is 'same' or 'valid' (no need to support the integer input yet), dilations is length 3, skip dimRoundingMode for now, and skip name.
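That proposed signature implies shape handling along these lines. This is a hypothetical sketch (`conv3dOutputShape` is not a tfjs-core API), assuming a filter shape of [filterDepth, filterHeight, filterWidth, inChannels, outChannels] by analogy with conv2d:

```javascript
// A 4D input [depth, height, width, channels] is treated as batch size 1;
// a 5D input is [batch, depth, height, width, channels]. strides has
// length 3, one entry per spatial dimension.
function conv3dOutputShape(inputShape, filterShape, strides, pad) {
  const [batch, shape] = inputShape.length === 4
      ? [1, inputShape]
      : [inputShape[0], inputShape.slice(1)];
  const outChannels = filterShape[4];
  // Compute each spatial output dimension (depth, height, width).
  const spatial = shape.slice(0, 3).map((inSize, i) => {
    const f = filterShape[i];
    const s = strides[i];
    return pad === 'SAME' ? Math.ceil(inSize / s)
                          : Math.ceil((inSize - f + 1) / s);
  });
  return [batch, ...spatial, outChannels];
}
```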

@dsmilkov Rad, thanks!

I believe the CPU kernel implementation is complete; I'm adding some more unit tests this weekend and will open a PR soon.

I also implemented the WebGL kernel and ran into a few bumps, as the language is a bit foreign to me. I'm going to try a few more things, but I might add some comments in the PR to seek guidance from someone who has experience with WebGL.

Will follow up in a few days!

Building on the great work of @zboldyga, I submitted a PR for Conv3D in tfjs-layers: https://github.com/tensorflow/tfjs-layers/pull/495
