I trained a tiny-yolo model for face detection using yolov3-tiny.cfg as provided. The network input size is 416x416. Everything is good enough except its speed. I read in the paper that tiny-yolov3 can reach >100 fps, but my model only reaches 20 fps, with an execution time of 0.04 s per frame. What did I miss? And under what conditions can tiny-yolov3 reach that impressive speed?
Are you measuring only the inference time, without data transfer and decoding? What hardware are you using? Are you using tensor cores? Try checking the GPU load. Execution time depends on many factors.
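To make the "inference time only" point concrete, here is a minimal sketch of timing just the forward pass, with a few warm-up runs so one-time initialization is not counted. The helper names `measure_fps` and `fps_from_latency` are made up for illustration, and the OpenCV usage in the trailing comment assumes `net`, `blob` and `out_names` were already created elsewhere.

```python
import time


def measure_fps(run_once, warmup=5, iters=50):
    """Time only the callable passed in; return (seconds_per_call, fps)."""
    # Warm-up calls are excluded so lazy initialization does not skew the result.
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    per_call = elapsed / iters
    return per_call, 1.0 / per_call


def fps_from_latency(seconds_per_frame):
    """Convert a per-frame latency in seconds to frames per second."""
    return 1.0 / seconds_per_frame


# Hypothetical usage with OpenCV (net, blob and out_names are assumed to exist):
#   net.setInput(blob)
#   per_frame, fps = measure_fps(lambda: net.forward(out_names))
```

Note that 0.04 s per frame corresponds to 25 fps for the forward pass alone, so a measured 20 fps would suggest some extra time is going into frame decoding, preprocessing, or postprocessing.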
Try running detection on a video file, and use a modern GPU.
@Aleksei91 I used a GTX 1080 to test my model, and the elapsed time I referred to covers only the forward pass of a frame through the net. GPU usage is typically around 50%. To use the GPU for detection, is the only change needed switching cv2.dnn.DNN_TARGET_CPU to cv2.dnn.DNN_TARGET_OPENCL?
OpenCV's dnn module does not use CUDA.
@DoriHp How are you using the GPU for detecting objects in OpenCV? The OpenCV dnn module doesn't support GPUs for detection yet.