YOLOv5: Has it been sped up with TensorRT?

Created on 12 Jun 2020  ·  16 Comments  ·  Source: ultralytics/yolov5

v5 is so fast! I can only imagine how much faster it would be with TensorRT. Do you have any work on this?

Labels: Stale, enhancement

All 16 comments

Hello @wu-ruijie, thank you for your interest in our work! Please visit our Custom Training Tutorial to get started, and see our Jupyter Notebook, Docker Image, and Google Cloud Quickstart Guide for example environments.

If this is a bug report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.

If this is a custom model or data training question, please note that Ultralytics does not provide free personal support. As a leader in vision ML and AI, we do offer professional consulting, from simple expert advice up to delivery of fully customized, end-to-end production solutions for our clients, such as:

  • Cloud-based AI systems operating on hundreds of HD video streams in realtime.
  • Edge AI integrated into custom iOS and Android apps for realtime 30 FPS video inference.
  • Custom data training, hyperparameter evolution, and model exportation to any destination.

For more information please visit https://www.ultralytics.com.

@wang-xinyu did a great TensorRT implementation of our https://github.com/ultralytics/yolov3 repo here (which supports both YOLOv3 and YOLOv4), he might best answer this question.
https://github.com/wang-xinyu/tensorrtx/tree/master/yolov3-spp

Tensor core support would be amazing!

Hi! I tested YOLOv5-s on CPU by directly running detect.py, and the inference speed is only 3 FPS. Could you please give me some advice? I want to reach at least 30 FPS.

@sljlp you might want to see 'Running yolov5 on CPU' #37

The default --img-size for detect.py is 640, which you can reduce significantly to get the FPS you are looking for.

@sljlp one caveat is --img-size must be a multiple of the largest stride, 32. So acceptable sizes are 320, 288, 256, etc.

Update: I've pushed more robust error-checking on --img-size now in 099e6f5ebd31416f33d047249382624ad5489550, so if a user accidentally requests an invalid size (which is not divisible by 32), the code will warn and automatically correct the value to the nearest valid --img-size.
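
The divisibility check described above can be sketched in a few lines. This is only an illustration: `check_img_size` here is a hypothetical helper, and the actual code in detect.py may differ in name and rounding behavior.

```python
def check_img_size(img_size: int, stride: int = 32) -> int:
    """Round img_size to the nearest multiple of stride (minimum one stride)."""
    new_size = max(stride, round(img_size / stride) * stride)
    if new_size != img_size:
        # Warn and auto-correct instead of failing, as described above
        print(f"WARNING: --img-size {img_size} must be a multiple of {stride}, "
              f"using {new_size} instead")
    return new_size

check_img_size(640)  # already valid: returns 640 unchanged
check_img_size(300)  # invalid: warns and returns 288
```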

@glenn-jocher Can you provide a yolov5.weights file? I've found that to convert YOLO to TensorRT, we need the weights file for use with https://github.com/wang-xinyu/tensorrtx/.

@thancaocuong there is no such file.

I have a Python implementation here, with NMS: https://github.com/TrojanXu/yolov5-tensorrt
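
For context, the reason NMS matters here is that a raw TensorRT engine outputs overlapping candidate boxes that must be filtered in post-processing. The sketch below is a plain-Python illustration of greedy NMS, not code from the repo above.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_thres=0.5):
    """Greedy NMS: keep highest-scoring boxes, drop overlaps above iou_thres."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thres for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping boxes and one distant box: the lower-scoring
# overlap is suppressed, the distant box survives.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # indices of surviving boxes
```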

Hi @glenn-jocher

I just implemented yolov5-s in my repo https://github.com/wang-xinyu/tensorrtx/tree/master/yolov5 and tested it on my machine. yolov5-m, yolov5-l, etc. will come out soon.

| Models | Device | BatchSize | Mode | Input Shape(HxW) | FPS |
|-|-|:-:|:-:|:-:|:-:|
| YOLOv3-spp(darknet53) | Xeon E5-2620/GTX1080 | 1 | FP16 | 608x608 | 38.5 |
| YOLOv4(CSPDarknet53) | Xeon E5-2620/GTX1080 | 1 | FP16 | 608x608 | 35.7 |
| YOLOv5-s | Xeon E5-2620/GTX1080 | 1 | FP16 | 608x608 | 167 |
| YOLOv5-s | Xeon E5-2620/GTX1080 | 4 | FP16 | 608x608 | 182 |
| YOLOv5-s | Xeon E5-2620/GTX1080 | 8 | FP16 | 608x608 | 186 |
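
A note on how FPS figures like the table's are typically measured: run a few warm-up iterations to exclude one-time engine and allocation costs, then time a batch of inferences and report images per second. The sketch below uses a stand-in `run_inference` callable rather than a real TensorRT execution call.

```python
import time

def measure_fps(run_inference, batch_size=1, iters=50, warmup=5):
    """Time `iters` batched inference calls and return images per second."""
    for _ in range(warmup):           # warm-up: exclude lazy-init cost
        run_inference(batch_size)
    start = time.perf_counter()
    for _ in range(iters):
        run_inference(batch_size)
    elapsed = time.perf_counter() - start
    return iters * batch_size / elapsed

# Example with a fake 5 ms "inference" in place of the real engine call:
fps = measure_fps(lambda b: time.sleep(0.005), batch_size=1, iters=20, warmup=2)
```

Larger batch sizes raise throughput (as in the table) because per-call overhead is amortized over more images.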

Update! My TensorRT implementation has been updated to match this commit: https://github.com/ultralytics/yolov5/commit/364fcfd7dba53f46edd4f04c037a039c0a287972

The PANet head has been updated.

Please see my repo https://github.com/wang-xinyu/tensorrtx

Thanks for sharing! Do you have plans to implement other yolov5 versions as well?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

We have updated the yolov5 TensorRT implementation to match the v2.0 release of this repo, and ran speed tests on my machine.

| Models | Device | BatchSize | Mode | Input Shape(HxW) | FPS |
|-|-|:-:|:-:|:-:|:-:|
| YOLOv5-s | Xeon E5-2620/GTX1080 | 1 | FP16 | 608x608 | 142 |
| YOLOv5-s | Xeon E5-2620/GTX1080 | 4 | FP16 | 608x608 | 173 |
| YOLOv5-s | Xeon E5-2620/GTX1080 | 8 | FP16 | 608x608 | 190 |
| YOLOv5-m | Xeon E5-2620/GTX1080 | 1 | FP16 | 608x608 | 71 |
| YOLOv5-l | Xeon E5-2620/GTX1080 | 1 | FP16 | 608x608 | 40 |
| YOLOv5-x | Xeon E5-2620/GTX1080 | 1 | FP16 | 608x608 | 27 |

Please see https://github.com/wang-xinyu/tensorrtx.

@glenn-jocher could you also add a link to https://github.com/wang-xinyu/tensorrtx in your Tutorials section?

@wang-xinyu thanks, yes this is a good idea. Can you submit a PR for the README please?

EDIT: I'll add a link to the export tutorial also.
