TensorRT: Issue using slice layer to get only the CLS embedding from BERT

Created on 10 Jan 2020 · 10 Comments · Source: NVIDIA/TensorRT

Description

The BERT output has the following dims: (batchsize, maxseqlen, hiddensize, 1, 1). We want to map it to (batchsize, 1, hiddensize, 1, 1), so that only index 0 of the maxseqlen dimension (i.e. the CLS embedding output) is used in the softmax layer.

The problem is that bert_out.shape[0] is -1, i.e. the batch size is not static, and ISliceLayer does not accept a negative value in the shape parameter. It complains as follows:

[TensorRT] INTERNAL ERROR: Assertion failed: dims.d[i] >= 1
../builder/cudnnBuilderGraph.cpp:605
Aborting...

[TensorRT] ERROR: ../builder/cudnnBuilderGraph.cpp (605) - Assertion Error in checkDimsSanity: 0 (dims.d[i] >= 1)

Here is the code:

# bert_out is the BERT embedding output, with shape (-1, maxseqlen, hiddensize, 1, 1)
bert_out = bert_model(config, init_dict, network, embeddings, mask_idx)
cls_embed = network.add_slice(bert_out, start=(0,0,0,0,0), shape=(-1,1,hiddensize,1,1), stride=(1,1,1,1,1))

So, is there a way to do the slicing gracefully by honoring the batch size?

Thanks!

Environment

TensorRT Version: 19.10
GPU Type: NVIDIA T4
Nvidia Driver Version: 418.116.00
CUDA Version: 10.1
CUDNN Version: 7.6.5.32-1
Operating System + Version: RHEL7.6
Python Version (if applicable): 3.6.8
TensorFlow Version (if applicable): 1.14.0
PyTorch Version (if applicable): N/A
Baremetal or Container (if container which image + tag): Container

Source

churinga

All 10 comments

You can use the set_input API to provide a shape tensor to the slice layer:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/infer/Graph/Layers.html?highlight=islicelayer#tensorrt.ISliceLayer.set_input

pranavm-nvidia on 13 Jan 2020

@pranavm-nvidia I have been playing with set_input(), but couldn't make it work.

How can I make a dynamic tensor input for shape param, and set its value to (batch, 1, hiddensize, 1, 1)?

churinga on 13 Jan 2020

I think you'd want something like this:

shape = network.add_shape(bert_out).get_output(0)

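# 1 where the runtime dimension is kept (batch and the trailing 1s), 0 where it is overridden.
mask = network.add_constant(shape=(5, ), weights=np.array([1, 0, 0, 1, 1], dtype=np.int32)).get_output(0)
# Fixed sizes for the overridden dimensions. The shape tensor is int32, so this constant
# is int32 as well (not float32). Note that this rebinds hiddensize from an int to a tensor.
hiddensize = network.add_constant(shape=(5, ), weights=np.array([0, 1, hiddensize, 0, 0], dtype=np.int32)).get_output(0)

slice_size = network.add_select(mask, shape, hiddensize).get_output(0)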

cls_embed.set_input(2, slice_size)

Essentially, you'd get the shape at runtime using the IShapeLayer, and then perform operations on the shape to generate a new shape.

pranavm-nvidia on 13 Jan 2020

@pranavm-nvidia I see. Let me try. Thanks!

churinga on 13 Jan 2020

@pranavm-nvidia It seems Select layer is only added in TensorRT 7.0 release, while I'm still stuck with 6.0. Is there a similar workaround for 6.0?

churinga on 16 Jan 2020

I think you can implement similar behavior using elementwise layers.
You could set:

mask = [1, 0, 0, 1, 1]
inv_mask = [0, 1, 1, 0, 0]

and then compute:

slice_size = shape * mask + hiddensize * inv_mask
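
As a quick sanity check of this arithmetic (a plain-Python sketch, not part of the original reply), the snippet below evaluates the formula on an example runtime shape. Here `fixed` stands for the constant `[0, 1, hiddensize, 0, 0]` from the earlier snippet; the function name and the example dimensions are assumptions chosen for illustration.

```python
def slice_size(shape, hidden_size):
    """Evaluate shape * mask + fixed * inv_mask element-wise."""
    mask = [1, 0, 0, 1, 1]              # keep the runtime value (batch, trailing 1s)
    inv_mask = [0, 1, 1, 0, 0]          # override seq length and hidden size
    fixed = [0, 1, hidden_size, 0, 0]   # values used where inv_mask is 1
    return [s * m + f * i for s, m, f, i in zip(shape, mask, fixed, inv_mask)]


# Example: batch of 8, max sequence length 128, hidden size 768.
print(slice_size([8, 128, 768, 1, 1], 768))  # [8, 1, 768, 1, 1]
```

Only the batch dimension (and the trailing 1s) comes from the runtime shape; the sequence dimension collapses to 1 regardless of its original length.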

pranavm-nvidia on 16 Jan 2020

👍2

It works! thanks!

churinga on 21 Jan 2020

🎉2

@churinga, i have encountered the same problem in TensorRT 6.0. Can you share your solution? thanks a lot.

zaczou on 26 Mar 2020

@zaczou It's basically what @pranavm-nvidia suggested: create two masks and do the maths, i.e. slice_size = shape * mask + hiddensize * inv_mask.
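
For readers looking for something more concrete, here is a rough sketch of how the two-mask approach might be wired up with elementwise layers. It is not the code churinga actually used. The function name `add_cls_slice`, the placeholder static size passed to `add_slice`, and the choice to return the NumPy arrays so they stay alive until the engine is built are all assumptions; it has not been verified against a specific TensorRT release.

```python
def add_cls_slice(network, bert_out, hidden_size):
    """Slice out position 0 of the sequence axis with a runtime batch size.

    Hypothetical helper: returns the slice layer plus the NumPy arrays that
    back the constants (kept alive by the caller as a precaution).
    """
    # Imported lazily so the sketch can be read without TensorRT installed.
    import numpy as np
    import tensorrt as trt

    weights = []

    def const(values):
        # 1-D int32 constant, matching the int32 output of the shape layer.
        arr = np.array(values, dtype=np.int32)
        weights.append(arr)
        return network.add_constant((len(values),), arr).get_output(0)

    prod = trt.ElementWiseOperation.PROD
    add = trt.ElementWiseOperation.SUM

    shape = network.add_shape(bert_out).get_output(0)  # runtime dims of bert_out
    mask = const([1, 0, 0, 1, 1])
    inv_mask = const([0, 1, 1, 0, 0])
    fixed = const([0, 1, hidden_size, 0, 0])

    # slice_size = shape * mask + fixed * inv_mask
    kept = network.add_elementwise(shape, mask, prod).get_output(0)
    forced = network.add_elementwise(fixed, inv_mask, prod).get_output(0)
    slice_size = network.add_elementwise(kept, forced, add).get_output(0)

    # Assumption: the static size is only a placeholder (all entries >= 1);
    # the tensor passed to set_input(2, ...) supplies the real size.
    cls_layer = network.add_slice(bert_out, start=(0, 0, 0, 0, 0),
                                  shape=(1, 1, hidden_size, 1, 1),
                                  stride=(1, 1, 1, 1, 1))
    cls_layer.set_input(2, slice_size)
    return cls_layer, weights
```

It would be called as `cls_embed, keep_alive = add_cls_slice(network, bert_out, hiddensize)`, with `cls_embed.get_output(0)` feeding the next layer.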

churinga on 26 Mar 2020

I am getting the same error, but in a different scenario: it occurs at the BatchedNMS_TRT plugin.
My model's output shapes are:
Pred_box = [1, 10647, 1, 4]
Pred_score = [1, 10647, 1, 1]
I reshape them as follows:
Pred_box = tf.reshape(Pred_box, (-1, 10647, 1, 4))
Pred_score = tf.reshape(Pred_score, (-1, 10647, 1))

When I then swap in the BatchedNMS_TRT plugin, I get exactly the same error:

[TensorRT] INTERNAL ERROR: Assertion failed: dims.d[i] >= 1
../builder/cudnnBuilderGraph.cpp:605
Aborting...

[TensorRT] ERROR: ../builder/cudnnBuilderGraph.cpp

Environment
TensorRT Version: 6.0.1.10
GPU Type: jetson nano Tegra x1
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 1.14.0
PyTorch Version (if applicable): N/A
Baremetal or Container (if container which image + tag): Container

And on Ubuntu:
Environment
TensorRT Version: 7.0.0.11
GPU Type: GTX 1650Ti
NVIDIA Driver: 440.64.00
CUDA VERSION: 10.0
CUDNN VERSION: 7.6.5
Python Version (if applicable): 3.5.2
TensorFlow Version (if applicable): 1.14.0
PyTorch Version (if applicable): N/A
Baremetal or Container (if container which image + tag): Container
The error is as follows:

Python3: batchedNMSPlugin.cpp:59: virtual nvinfer1::Dims nvinfer1::plugin::BatchedNMSPlugin::getOutputDimensions(int, const nvinfer1::Dims*, int): Assertion 'inputs[1].nbDims == 2' failed.
Aborted (core dumped)

So, is there a way to slice or reshape that TensorRT supports and that also produces the input shapes the BatchedNMS_TRT plugin expects?
Thanks.