Hello, I have a YOLOv3 TensorRT model and a lot of test images. How can I feed multiple images at the same time instead of a single image? What should I do?
If you are using the C++ API:
Call IBuilder::setMaxBatchSize(maxBatchSize), where your inference batch size does not exceed maxBatchSize, then call IExecutionContext::enqueue(...) to push your input data to the GPU. Please make sure the bindings are allocated for the right batch size, and that the batch_size argument matches the bindings you allocated. For detailed usage of TensorRT, please see
https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html
You can start with studying the mnist sample:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-sample-support-guide/index.html#README-sampleMNIST
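To make the point about binding sizes concrete, here is a minimal sketch in plain Python (no TensorRT calls) of how the byte size of a binding scales with the batch size. The shapes are assumptions for illustration only: a 3x416x416 float32 input and a hypothetical 255x13x13 output. Use the dimensions your own engine reports.

```python
from math import prod


def binding_nbytes(batch_size, per_sample_shape, itemsize=4):
    """Bytes needed for one binding holding `batch_size` samples.

    per_sample_shape excludes the batch dimension (for example C, H, W).
    itemsize=4 assumes float32 data.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return batch_size * prod(per_sample_shape) * itemsize


# Assumed shapes, for illustration only.
INPUT_SHAPE = (3, 416, 416)
OUTPUT_SHAPE = (255, 13, 13)

max_batch_size = 8
# Sizing the buffers for max_batch_size lets any smaller batch reuse them.
input_bytes = binding_nbytes(max_batch_size, INPUT_SHAPE)
output_bytes = binding_nbytes(max_batch_size, OUTPUT_SHAPE)
print(input_bytes, output_bytes)
```

If a buffer is sized for a smaller batch than the batch_size passed at execution time, the engine would read or write past the end of it, which is why the two must agree.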
@litaotju
Hi, I'm using the Python API and testing the sample onnx_resnet50.py, which is located in xx/samples/python/introductory_parser_samples. Everything works well, but I also want to feed multiple images when testing the TensorRT model, and I want to test different batch sizes. What should I do? (I keep confusing maximum batch size, maximum workspace size, and batch_size.) Thank you so much!
Could you please read the documentation? All these concepts are explained either in the docs or in the code comments of the TensorRT include .h files.
Hi @sanmudaxia,
max_batch_size is the maximum batch size that your TensorRT engine will accept; you can execute a batch of any size from 1, 2, ..., up to max_batch_size. For an implicit batch network, the TensorRT engine will also be optimized for max_batch_size. For an explicit batch network, you can create several optimization profiles to optimize for various batch sizes.
Please see the docs for these terms: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/infer/Core/Builder.html
max_batch_size – int The maximum batch size which can be used at execution time,
and also the batch size for which the ICudaEngine will be optimized.
max_workspace_size – int The maximum GPU temporary memory which the
ICudaEngine can use at execution time.
And for batch_size, which is specified when calling execute*() on the execution context: https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/python_api/infer/Core/ExecutionContext.html#tensorrt.IExecutionContext.execute
batch_size – The batch size. This is at most the value supplied when the ICudaEngine was built.
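As a rough illustration of how batch_size relates to max_batch_size, the sketch below splits a list of images into consecutive batches that never exceed the maximum. It is plain Python with no TensorRT calls; the file names and the value of max_batch_size are hypothetical.

```python
def make_batches(items, max_batch_size):
    """Split `items` into consecutive lists of at most `max_batch_size` items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [items[i:i + max_batch_size]
            for i in range(0, len(items), max_batch_size)]


image_paths = [f"img_{i:03d}.jpg" for i in range(10)]  # hypothetical file names
max_batch_size = 4  # assumed value the engine was built with

for batch in make_batches(image_paths, max_batch_size):
    # len(batch) is the batch_size that would be passed to execute*();
    # the last batch may be smaller than max_batch_size.
    batch_size = len(batch)
    print(batch_size, batch[0], batch[-1])
```

With 10 images and a maximum of 4, the final batch holds only 2 images, which is allowed because batch_size only has to be at most max_batch_size.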
@rmccorm4 I'd like to ask: what if different images produce different numbers of output boxes? With a multi-image batch, how do I know which part of the output belongs to which image?
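One way to think about this, sketched below in plain Python, assumes that the engine writes a fixed-size block of raw values per image into one flat output buffer, ordered by position in the batch. Under that assumption, image i owns the slice starting at i * per_image_size, and the variable number of boxes only appears after per-image post-processing. The score values and the threshold filter are made up for illustration and stand in for real YOLOv3 decoding and NMS.

```python
def split_output_per_image(flat_output, batch_size, per_image_size):
    """Slice a flat, batch-major output buffer into one chunk per image."""
    if len(flat_output) < batch_size * per_image_size:
        raise ValueError("output buffer is smaller than batch_size * per_image_size")
    return [flat_output[i * per_image_size:(i + 1) * per_image_size]
            for i in range(batch_size)]


def keep_confident(scores, threshold=0.5):
    """Toy stand-in for post-processing: keep scores above the threshold."""
    return [s for s in scores if s > threshold]


# Made-up raw output for a batch of 2 images, 3 values per image.
flat_output = [0.9, 0.1, 0.8, 0.2, 0.3, 0.1]
per_image = split_output_per_image(flat_output, batch_size=2, per_image_size=3)

# Each image is post-processed separately, so the counts can differ.
detections = [keep_confident(chunk) for chunk in per_image]
for index, found in enumerate(detections):
    print(index, len(found))
```

Because the slicing is by batch index, the results stay associated with the order in which the images were copied into the input buffer.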
May I check, for max_workspace_size, what does (1 << 20) mean? What do the "1", "20", and "<<" mean? How do I decide how much is required for my network?
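On the arithmetic itself: `<<` is the bitwise left-shift operator, so 1 << 20 shifts the value 1 left by 20 bits, which equals 2 to the power of 20. Since max_workspace_size is a memory size, that value is 1,048,576 bytes, or 1 MiB. The short sketch below shows the equivalence; the helper names mib and gib are hypothetical conveniences, not part of the TensorRT API. How much workspace a given network needs is not something this arithmetic can answer; as a rule of thumb rather than an official recommendation, people often start with a generous value and reduce it if GPU memory is tight.

```python
workspace = 1 << 20          # shift 1 left by 20 bits
print(workspace)             # 1048576
print(workspace == 2 ** 20)  # True


def mib(n):
    """n mebibytes expressed in bytes (hypothetical helper)."""
    return n << 20


def gib(n):
    """n gibibytes expressed in bytes (hypothetical helper)."""
    return n << 30


print(mib(256))  # 268435456
print(gib(1))    # 1073741824
```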