Darknet: Titan V vs. GTX 1080Ti for real-time inference on HD video at 50fps

Created on 7 Sep 2018 · 6 Comments · Source: AlexeyAB/darknet

We intend to run real-time inference on a single HD-resolution video stream at around 50 fps. I am trying to spec out GPUs for the system, deployed onsite, that will run the inference and re-render the video with the detected objects.

Would a GTX 1080Ti be able to keep up, or would I need to pick a Titan V? Degradation of the output frame rate to 30 fps or so might be acceptable, as this is a prototype to be used for demonstrations.

On another thread, I noticed a comparison of these 2 GPUs, but for multiple live streams, so some parallelism was also coming into play in that case.

Any recommendations/pointers are appreciated.

Best,
Vineet

P.S: Copying @kmsravindra as he is collaborating w/ me on this project.

endo123

All 6 comments

I think GTX 1080Ti should be enough to process FullHD (1920x1080) 50 FPS by using yolov3.cfg (416x416).

Because it can process 2 x 1920x1080 streams at 25 FPS each: https://github.com/AlexeyAB/darknet/issues/1232#issuecomment-405565193

On another thread, I noticed a comparison of these 2 GPUs, but for multiple live streams, so some parallelism was also coming into play in that case.

A single Yolo model can occupy ~95% of the GPU if you use this repository, OPENCV=1 CUDNN=1, and a modern CPU.
A bottleneck can appear only on the CPU side (video decompressing, resizing, saving) if you use another repo or a slow CPU.

Titan V is required if you want to achieve ~90 FPS on 1920x1080 video with a 416x416 network size.

AlexeyAB on 8 Sep 2018

Thanks, @AlexeyAB !

How do the performance requirements vary w/ increase in network size?
We may need a bigger network size to allow for better smaller object detection in our dataset.

P.S: @kmsravindra

endo123 on 9 Sep 2018

How do the performance requirements vary w/ increase in network size?

Performance requirements are linearly proportional to the product network_width x network_height.

AlexeyAB on 9 Sep 2018

@AlexeyAB, just to confirm: we plan to use an 832 x 480 network size, whose product is 2.3 times bigger than 416x416. So can we approximately assume 50/2.3 = 21.7 FPS for this network size on the 1080Ti?

Also, from your other thread I am assuming yolov2-light - yolov3 would be 1.3 times faster at a 1% mAP trade-off. So using this lighter yolov3 should pump it up to 21.7 * 1.3 = approx 28 FPS?

kmsravindra on 9 Sep 2018

@kmsravindra In general yes.

So can we approx assume 50/2.3 = 21.7 FPS for this network size for 1080Ti?

Yes.
But this rests only on the assumption that the GTX 1080Ti will reach about 50 FPS on yolov3 416x416, since I didn't test it on a GTX 1080Ti.
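
The estimate discussed above can be written out as a small Python sketch. It is only an illustration of the arithmetic in this thread: the function name estimate_fps and its parameters are made up for the example, the 50 FPS baseline is the untested assumption mentioned above, and the 1.3x factor is the figure quoted in the question, not a measurement.

```python
def estimate_fps(base_fps, base_size, new_size, speedup=1.0):
    """Scale a baseline FPS by the ratio of network input areas.

    Assumes cost grows linearly with network_width x network_height,
    as stated earlier in the thread. `speedup` is an optional extra
    factor (e.g. the 1.3x quoted for the lighter variant).
    """
    base_w, base_h = base_size
    new_w, new_h = new_size
    area_ratio = (new_w * new_h) / (base_w * base_h)
    return base_fps / area_ratio * speedup


BASE_FPS = 50.0  # assumed 1080Ti figure at 416x416, not a benchmark

plain = estimate_fps(BASE_FPS, (416, 416), (832, 480))
light = estimate_fps(BASE_FPS, (416, 416), (832, 480), speedup=1.3)

print(f"832x480: ~{plain:.1f} FPS, with 1.3x speedup: ~{light:.1f} FPS")
```

The exact area ratio is about 2.31 rather than 2.3, so the results land within rounding of the 21.7 and 28 FPS figures above. Real throughput can still differ if the CPU side or GPU utilization becomes the bottleneck.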

On the other hand, I got only 32 FPS on Tesla V100 (~Titan V) without Tensor Cores, and 90 FPS on Tesla V100 (~Titan V) with Tensor Cores, so there may be a bottleneck somewhere on the GPU, and GPU usage can be less than 90% without Tensor Cores: https://github.com/AlexeyAB/darknet/issues/407

Also, from your other thread I am assuming yolov2-light - yolov3 would be 1.3 times faster @ 1% mAP trade-off. So, hence using this lighter yolov3 should pump it up to 21.7 * 1.3 = approx 28 FPS?

To do this, you should use the -quantized flag at the end of the command, and you should use the input_callibration= param in your cfg-file: https://github.com/AlexeyAB/yolo2_light/blob/29905072f194ee86fdeed6ff2d12fed818712411/bin/yolov3.cfg#L25

AlexeyAB on 9 Sep 2018

Thanks for the info @AlexeyAB

kmsravindra on 10 Sep 2018
