The main changes that give high accuracy:
We use a calibration set of 1000 images, randomly sampled from our training set.
Only 6 of the 80 MS COCO classes are used, a custom network size of 608x352, and custom anchors.
Only layers supported by TensorRT are used: ReLU and a custom upsample layer, to avoid FP32->INT8->FP32 conversions.
Or just use ReLU instead of Leaky-ReLU
Or use Leaky_ReLU = 2 scale-layers + ReLU + shortcut-layer:
if (x >= 0) out = x*a + x*(1-a) = x
if (x < 0)  out = x*a + x*0 = x*a
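As a sanity check, here is a minimal numpy sketch of that decomposition (the slope a = 0.1 is an assumption matching darknet's default leaky slope): scale the input by a, scale the ReLU output by (1 - a), and add them with a shortcut.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def leaky_relu(x, a=0.1):
    return np.where(x >= 0, x, a * x)

# Leaky_ReLU as 2 scale-layers + ReLU + shortcut:
#   out = a*x + (1 - a)*relu(x)
#   x >= 0: a*x + (1 - a)*x = x
#   x <  0: a*x + 0         = a*x
def leaky_via_scale_relu_shortcut(x, a=0.1):
    return a * x + (1.0 - a) * relu(x)

x = np.linspace(-3.0, 3.0, 13)
assert np.allclose(leaky_relu(x), leaky_via_scale_relu_shortcut(x))
```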

@laclouis5
You can change the https://github.com/AlexeyAB/darknet/blob/63396082d7e77f4b460bdb2540469f5f1a3c7c48/cfg/yolov3-spp.cfg model:
set width=608 height=352 https://github.com/AlexeyAB/darknet/blob/63396082d7e77f4b460bdb2540469f5f1a3c7c48/cfg/yolov3-spp.cfg#L8-L9
replace all activation=leaky with activation=relu in the cfg-file (see the sketch below)
extract only 6 classes from the MS COCO dataset: person, car, bicycle, motorbike, bus, truck
train this cfg-file using this repository: https://github.com/AlexeyAB/darknet
quantize and run this model on TensorRT: https://news.developer.nvidia.com/deepstream-sdk-4-now-available/
You should get approximately the same result.
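For the cfg edits above (width=608, height=352, leaky -> relu), a minimal Python sketch, assuming a local clone of the repo; the output filename is hypothetical, and filtering the COCO label files down to the 6 classes is a separate step not shown here:

```python
import re
from pathlib import Path

# Paths are placeholders - adjust to your local clone of AlexeyAB/darknet.
src = Path("cfg/yolov3-spp.cfg")
dst = Path("cfg/yolov3-spp-relu-608x352.cfg")

cfg = src.read_text()

# Custom network size used in this thread: width=608, height=352.
cfg = re.sub(r"(?m)^width=\d+", "width=608", cfg)
cfg = re.sub(r"(?m)^height=\d+", "height=352", cfg)

# Replace every leaky activation with plain ReLU (TensorRT/INT8-friendly).
cfg = cfg.replace("activation=leaky", "activation=relu")

dst.write_text(cfg)
print(f"wrote {dst}")
```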
@AlexeyAB Ok thanks, so changing to ReLU, training, and then quantizing with TensorRT should improve network latency for a small accuracy drop?
@laclouis5 Yes.
You will get the same relative improvement over the default yolov3.cfg (Leaky & FP32).
To get the absolute speed/accuracy as in the paper:
set width=608 height=352 for training - it will be 2x faster than 608x608.

@AlexeyAB
What do you mean by "quantize and run this model on TensorRT: https://news.developer.nvidia.com/deepstream-sdk-4-now-available/"?
@AlexeyAB
set width=608 height=352 for training - will it affect the accuracy?
Hi, I'm currently running yolov3-tiny on Xavier. My input size is 576x352 and I have already converted YOLO to TensorRT. As JetNet's paper mentioned, we can achieve a 60% speed-up if we change LeakyReLU to ReLU. However, I don't see any difference between them in my tests. In my case the speed for yolov3-tiny at FP16 is about 13ms and at INT8 is about 10ms. As I understand it, yolov3-tiny is several times faster than yolov3, which means yolov3-tiny should take about 3-6ms in TensorRT at INT8. Is anyone else seeing the same problem? Any help in speeding up yolov3-tiny in TensorRT is welcome. Thanks!
Hi, how did you convert YOLO to TensorRT, please?
@Kmarconi Hi, I referred to this repo: https://github.com/lewes6369/TensorRT-Yolov3. Basically it converts the darknet model to Caffe and uses TensorRT to parse the Caffe model.
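For reference, a rough sketch of the TensorRT half of that pipeline using the pre-TensorRT-8 Python API (trt.CaffeParser plus builder mode flags); the prototxt/caffemodel names and the output blob names are assumptions, not taken from that repo, and the INT8 path additionally needs an IInt8EntropyCalibrator2 fed with calibration images (e.g. the ~1000 mentioned above):

```python
import tensorrt as trt  # pre-8.x API: CaffeParser and builder mode flags

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Hypothetical file/blob names produced by a darknet -> Caffe conversion step.
DEPLOY = "yolov3-tiny.prototxt"
MODEL = "yolov3-tiny.caffemodel"
OUTPUT_BLOBS = ["yolo-out-1", "yolo-out-2"]  # assumed output blob names

with trt.Builder(TRT_LOGGER) as builder, \
     builder.create_network() as network, \
     trt.CaffeParser() as parser:
    # Parse the Caffe model into a TensorRT network definition.
    tensors = parser.parse(deploy=DEPLOY, model=MODEL,
                           network=network, dtype=trt.float32)
    for name in OUTPUT_BLOBS:
        network.mark_output(tensors.find(name))

    builder.max_batch_size = 1
    builder.max_workspace_size = 1 << 30
    builder.fp16_mode = True
    # For INT8 instead: builder.int8_mode = True and
    # builder.int8_calibrator = <IInt8EntropyCalibrator2 over calibration images>
    engine = builder.build_cuda_engine(network)

    with open("yolov3-tiny-fp16.engine", "wb") as f:
        f.write(engine.serialize())
```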