TensorRT: Can I set the input tensor of an INetworkDefinition with a batch dim?

Created on 7 Aug 2019 · 2 Comments · Source: NVIDIA/TensorRT

Can I add an input with NCHW tensor format to an INetworkDefinition? For example:

INetworkDefinition* net = builder->createNetwork();
auto tensor = net->addInput("input", DimsNCHW{32, 32, 32, 32});
...

When I build the ICudaEngine, the max batch size is already part of the tensor dims. So which batch size should I set on the IBuilder?

builder->setMaxBatchSize(1);

or

builder->setMaxBatchSize(32);

xiaocenxiaocen

Most helpful comment

In the current release, TensorRT treats batch size separately: it assumes you will provide the input tensor sizes without a batch dimension, and then configure the batch size separately.

At build time you do this using the maxBatchSize attribute, which tells TensorRT the maximum batch size you will use and is also the batch size it optimizes for. In your case this would be 32.

Your call to addInput would therefore be

  // addInput also takes a DataType argument; kFLOAT is assumed here
  auto tensor = net->addInput("input", DataType::kFLOAT, DimsCHW{32, 32, 32});

At runtime, you would then provide the batch size corresponding to your input tensor as a parameter to enqueue().
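To make the arithmetic concrete, here is a minimal, TensorRT-free sketch of what implicit batch means for buffer sizing: the network only knows the per-sample CHW shape, and the batch size is a separate multiplier. `ChwDims`, `sampleVolume`, `bindingBytes` and `isValidRuntimeBatch` are hypothetical names made up for illustration; they are not part of the TensorRT API.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a CHW shape (NOT the real nvinfer1::DimsCHW).
struct ChwDims {
    int c, h, w;
};

// Number of elements in ONE sample; the batch dimension is not part of the shape.
inline std::size_t sampleVolume(const ChwDims& d) {
    return static_cast<std::size_t>(d.c) * static_cast<std::size_t>(d.h) *
           static_cast<std::size_t>(d.w);
}

// Bytes a device buffer needs in order to hold up to maxBatchSize samples.
// In TensorRT 5.x terms, maxBatchSize is the value passed to
// builder->setMaxBatchSize(...).
inline std::size_t bindingBytes(const ChwDims& d, int maxBatchSize,
                                std::size_t elemSize) {
    return sampleVolume(d) * static_cast<std::size_t>(maxBatchSize) * elemSize;
}

// The batch size passed to enqueue() at runtime may be anything from 1 up to
// the maximum the engine was built with.
inline bool isValidRuntimeBatch(int batchSize, int maxBatchSize) {
    return batchSize >= 1 && batchSize <= maxBatchSize;
}
```

For the shapes in this thread, one sample holds 32 * 32 * 32 = 32768 floats, so a buffer sized for a max batch of 32 needs 32768 * 32 * 4 bytes, and any runtime batch size from 1 to 32 fits in it.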

Future versions of TensorRT will provide additional flexibility in defining the network.

DilipSequeira on 5 Sep 2019

All 2 comments

Hi @xiaocenxiaocen,

The above is correct for TensorRT 5.x, where networks of this kind are called "implicit batch networks".

With TensorRT 6.x, "explicit batch networks" are now supported, so you can explicitly provide a batch dimension (-1 for dynamic batch size), and build your engine with the explicit batch flag. See this section from the docs for more details: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#work_dynamic_shapes
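As a rough illustration of the explicit-batch idea, the sketch below models a 4-D input whose batch dimension is -1 and a min/opt/max range that a runtime shape must fall within. It deliberately avoids the TensorRT headers: `Shape4`, `ProfileRange`, `resolveBatch` and `fitsProfile` are hypothetical names for illustration only. To my understanding, the corresponding real calls in TensorRT 6.x are `createNetworkV2` with the `kEXPLICIT_BATCH` flag, `IOptimizationProfile::setDimensions` with `kMIN`/`kOPT`/`kMAX`, and `IExecutionContext::setBindingDimensions`; check the linked docs for exact usage.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an NCHW shape (NOT the real nvinfer1::Dims4).
using Shape4 = std::array<int, 4>;

// Hypothetical mirror of an optimization profile's min/opt/max shapes.
struct ProfileRange {
    Shape4 min;
    Shape4 opt;
    Shape4 max;
};

// Substitute the runtime batch size for a dynamic (-1) batch dimension.
// A fixed batch dimension is left unchanged.
inline Shape4 resolveBatch(const Shape4& networkShape, int batch) {
    Shape4 out = networkShape;
    if (out[0] == -1) {
        out[0] = batch;
    }
    return out;
}

// A concrete runtime shape is usable only if every dimension lies within
// the profile's [min, max] range.
inline bool fitsProfile(const Shape4& shape, const ProfileRange& profile) {
    for (std::size_t i = 0; i < shape.size(); ++i) {
        if (shape[i] < profile.min[i] || shape[i] > profile.max[i]) {
            return false;
        }
    }
    return true;
}
```

With a network shape of {-1, 32, 32, 32} and a range from batch 1 to batch 32, a runtime batch of 8 resolves to {8, 32, 32, 32} and fits, while a batch of 64 falls outside the range.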