Detectron2: Any workaround to deploy these models using TensorRT?

Created on 29 Oct 2019 · 14 Comments · Source: facebookresearch/detectron2

Since the models are really nice, it would be great to add a feature for deploying them on TensorRT. Does anyone in the community have a workaround for this?

jinfagang

👍11 ❤1

Most helpful comment

Just sharing some test results. I converted the backbone to a TensorRT engine with torch2trt. Measuring only the backbone forward time, the TensorRT engine is 30% faster, but the whole GeneralizedRCNN forward pass is only 15% faster.

config-file: faster_rcnn_R_101_FPN_3x.yaml
GPU: 1080
TensorRT: 6.0.1.5
pytorch: 1.3.0

Sanster on 23 Jan 2020

👍5 ❤1
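
A rough sanity check on those two numbers, as a sketch rather than a measurement. The comment does not say how "faster" was computed, so both common readings are shown. If the figures are reductions in forward time and only the backbone was accelerated, the backbone accounts for about half of the total forward time. If they are speedup ratios (1.30x and 1.15x), the share comes out near 57%. The function names are hypothetical.

```python
def backbone_share_from_time_reduction(backbone_cut, total_cut):
    """Fraction of total time spent in the backbone, assuming both figures
    are fractional reductions in time and only the backbone got faster."""
    # total_cut = share * backbone_cut
    return total_cut / backbone_cut


def backbone_share_from_speedup(backbone_speedup, total_speedup):
    """Same question, reading the figures as speedup ratios (Amdahl's law)."""
    # 1 / total_speedup = (1 - share) + share / backbone_speedup
    return (1 - 1 / total_speedup) / (1 - 1 / backbone_speedup)


share_a = backbone_share_from_time_reduction(0.30, 0.15)
share_b = backbone_share_from_speedup(1.30, 1.15)
print(f"time-reduction reading: {share_a:.2f}")
print(f"speedup-ratio reading:  {share_b:.2f}")
```

Either way, roughly half of the forward pass lies outside the backbone, which would explain why accelerating the backbone alone gives a limited end-to-end gain.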

All 14 comments

I think the best way to deploy a model at the moment is to take the backbone and heads out of it and make a separate nn.Module with them. The built-in backbones don't seem to convert to ONNX nicely out of the box though -- I'd recommend using torchvision backbones instead.

jinyeom on 4 Nov 2019

@jinyeom You mean using TensorRT on the backbone only? That actually doesn't accelerate much, and it even wastes a lot of time copying data.

jinfagang on 5 Nov 2019

Sorry, I meant you should take the backbone + heads out of the original model and build a separate module. All the other stuff, like NMS and anchor generation, doesn't convert.
Check this repo for reference: https://github.com/NVIDIA/retinanet-examples

jinyeom on 5 Nov 2019
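
To illustrate why a step like NMS is awkward to convert, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is not detectron2's or TensorRT's implementation; the function names, the box format, and the 0.5 threshold are illustrative assumptions. The point is that both the loop and the number of boxes returned depend on the input values, which is one plausible reason such steps are usually kept outside a traced backbone + heads module.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)
```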

@jinyeom Good advice. Have you tried this approach?

jinfagang on 6 Nov 2019

@jinyeom I have dug into this for a while. It is hard to do, since the layers inside detectron2 take ImageList objects as input rather than plain Python lists or tuples, and changing them all would be a massive amount of work.

The roi_heads use the output of the ProposalGenerator, and the proposal generator takes an ImageList as input, so it's impossible to trace only the backbone and roi_heads, since you still need the ProposalGenerator output. And that part relies heavily on ImageList to provide each image's height and width.

jinfagang on 6 Nov 2019
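
As a conceptual sketch of the bookkeeping described above (not detectron2's actual ImageList code), batching images of different sizes means padding them to a common shape while remembering each original height and width. The helper name and the divisibility value of 32 are assumptions for illustration.

```python
import math


def padded_batch_shape(image_sizes, size_divisibility=32):
    """Given per-image (height, width) pairs, return the common padded
    (height, width), rounded up to a multiple of size_divisibility."""
    max_h = max(h for h, _ in image_sizes)
    max_w = max(w for _, w in image_sizes)
    if size_divisibility > 1:
        max_h = math.ceil(max_h / size_divisibility) * size_divisibility
        max_w = math.ceil(max_w / size_divisibility) * size_divisibility
    return max_h, max_w


# The original sizes are kept alongside the padded batch, because later
# stages (for example, clipping proposals) need the true image extents.
image_sizes = [(480, 640), (500, 375)]
batch_hw = padded_batch_shape(image_sizes)
print(batch_hw, image_sizes)
```

A module that takes only a padded tensor loses this per-image size information unless it is passed in separately, which is the difficulty the comment points at.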

I'm still exploring this method at the moment. Yeah, you're probably going to have to reimplement that preprocessing part manually.

jinyeom on 6 Nov 2019

@jinyeom I have found that it is possible to trace with ImageList. People exported ONNX from the earlier maskrcnn-benchmark this way.

However, I'm getting a constant error for now that needs fixing.

jinfagang on 7 Nov 2019

👍2

We will not have a GPU deployment pipeline since it has low value to Facebook. Closing, but you are welcome to comment about any working community efforts.

ppwwyyxx on 30 Nov 2019

@Sanster that's great. Are you willing to share the steps to convert the backbone to TensorRT? Are you planning to convert the whole Mask R-CNN to TRT? Thanks!

automata on 23 Jan 2020

Has anyone tried to convert the Deformable Convolution layer(s) to TensorRT or ONNX?

adizhol on 1 Mar 2020

I have tried using TensorRT to boost the inference performance of Cascade R-CNN with Caffe2/tensorrt. Caffe2/tensorrt is like TF-TRT, but it is not mature enough. After fixing some bugs, I got it to run. https://zhuanlan.zhihu.com/p/122399743

mpjlu on 26 Apr 2020

@mpjlu Can you share some code to reproduce running Cascade R-CNN on TRT?

jinfagang on 27 Apr 2020

👍2

Just sharing some test results. I converted the backbone to a TensorRT engine with torch2trt. Measuring only the backbone forward time, the TensorRT engine is 30% faster, but the whole GeneralizedRCNN forward pass is only 15% faster.

config-file: faster_rcnn_R_101_FPN_3x.yaml
GPU: 1080
TensorRT: 6.0.1.5
pytorch: 1.3.0

Can you share a code snippet? I can't make your example work with torch2trt.