Can I add an Input with NCHW tensor format to an INetworkDefinition?
For example:
INetworkDefinition* net = builder->createNetwork();
auto tensor = net->addInput("input", DimsNCHW{32, 32, 32, 32});
...
When I build the ICudaEngine, the max batch size is already encoded in the dims of the tensors. So which batch size should I set on the IBuilder?
builder->setMaxBatchSize(1);
or
builder->setMaxBatchSize(32);
In the current release, TensorRT treats batch size separately: it expects you to provide the input tensor sizes without a batch dimension and to configure the batch size on its own.
At build time you do this using the maxBatchSize attribute, which tells TensorRT the maximum batch size you will use; this is also the batch size it optimizes for. In your case this would be 32, i.e. builder->setMaxBatchSize(32).
Your call to addInput would therefore be:
auto tensor = net->addInput("input", DataType::kFLOAT, DimsCHW{32, 32, 32}); // DataType::kFLOAT is assumed here; addInput also takes the tensor data type
At runtime, you would then provide the batch size corresponding to your input tensor as a parameter to enqueue().
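To make the split between per-sample dimensions and batch size concrete, here is a minimal sketch in plain C++ with no TensorRT headers. The names ChwDims, sampleVolume, and bindingBytes are hypothetical helpers invented for illustration, and float elements are assumed. It shows how an input buffer for an implicit batch network could be sized: the dims describe one sample, and the batch size is a separate multiplier that must not exceed the max batch size set at build time.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for per-sample CHW dimensions (no batch dimension).
struct ChwDims {
    int c;
    int h;
    int w;
};

// Number of elements in a single sample.
std::size_t sampleVolume(const ChwDims& d) {
    assert(d.c > 0 && d.h > 0 && d.w > 0);
    return static_cast<std::size_t>(d.c) * static_cast<std::size_t>(d.h) *
           static_cast<std::size_t>(d.w);
}

// Bytes needed to hold `batchSize` samples of float data (float is an assumption).
// Throws std::out_of_range if batchSize is outside [1, maxBatchSize].
std::size_t bindingBytes(const ChwDims& d, int batchSize, int maxBatchSize) {
    if (batchSize < 1 || batchSize > maxBatchSize) {
        throw std::out_of_range("batch size must be in [1, maxBatchSize]");
    }
    return static_cast<std::size_t>(batchSize) * sampleVolume(d) * sizeof(float);
}
```

With the values from this thread, bindingBytes(ChwDims{32, 32, 32}, 32, 32) gives the size to allocate once for the largest batch, and any smaller batch size passed at runtime fits in the same buffer.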
Future versions of TensorRT will provide additional flexibility in defining the network.
Hi @xiaocenxiaocen,
The above is correct for TensorRT 5.x; it describes "implicit batch networks".
With TensorRT 6.x, "explicit batch networks" are now supported, so you can explicitly provide a batch dimension (-1 for dynamic batch size), and build your engine with the explicit batch flag. See this section from the docs for more details: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#work_dynamic_shapes
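As a rough illustration of what a -1 batch dimension means, here is a small model in plain C++. It does not use the TensorRT API: Shape, kDynamic, matchesNetworkDims, and withinRange are hypothetical names chosen for this sketch. In TensorRT itself, the allowed range for a dynamic dimension is declared through an optimization profile (min, opt, and max shapes), as described in the linked docs; the sketch only mimics the idea that -1 is a wildcard at network definition time and that the concrete runtime shape must fall inside a declared range.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical NCHW shape with the batch dimension stored explicitly.
using Shape = std::array<int, 4>;

// Marker for a dimension that is only known at runtime.
constexpr int kDynamic = -1;

// True if a fully specified runtime shape is compatible with the network dims,
// treating kDynamic in the network dims as a wildcard.
bool matchesNetworkDims(const Shape& network, const Shape& runtime) {
    for (std::size_t i = 0; i < network.size(); ++i) {
        if (runtime[i] < 1) {
            return false;  // runtime shapes must not contain wildcards
        }
        if (network[i] != kDynamic && network[i] != runtime[i]) {
            return false;  // fixed dimensions must match exactly
        }
    }
    return true;
}

// True if every runtime dimension lies within [minShape, maxShape].
bool withinRange(const Shape& minShape, const Shape& maxShape, const Shape& runtime) {
    for (std::size_t i = 0; i < runtime.size(); ++i) {
        if (runtime[i] < minShape[i] || runtime[i] > maxShape[i]) {
            return false;
        }
    }
    return true;
}
```

For the input in this thread, the network dims would be {kDynamic, 32, 32, 32}, and a range of {1, 32, 32, 32} to {32, 32, 32, 32} plays the role that setMaxBatchSize(32) played for implicit batch networks.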