Detectron2: Any workaround for deploying these models using TensorRT?

Created on 29 Oct 2019 · 14 Comments · Source: facebookresearch/detectron2

Since the model is really nice, it would be very good to add a feature for deploying it on TensorRT. Does anyone in the community have a workaround for this?

All 14 comments

I think the best way to deploy a model at the moment is to take the backbone and heads out of it and make a separate nn.Module with them. The built-in backbones don't seem to convert to ONNX nicely out of the box though -- I'd recommend using torchvision backbones instead.

@jinyeom You mean using TensorRT on the backbone only? That actually doesn't accelerate much, and it even wastes a lot of time copying data.

Sorry, I meant you should get the backbone+heads out of the original model and build a separate module. All the other stuff like NMS and anchor generation don't convert.
Check this repo for reference: https://github.com/NVIDIA/retinanet-examples
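Since NMS is named above as one of the pieces that does not convert, it would have to run as post-processing outside the exported backbone+heads module. A minimal plain-Python sketch of greedy NMS follows; it is not detectron2's or the linked repo's implementation, and the function names, the `(x1, y1, x2, y2)` box format and the 0.5 threshold are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the second box overlaps the first heavily.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]
```

In a real deployment this step would run on the raw box and score tensors returned by the exported engine, most likely with a vectorized or GPU implementation rather than Python loops.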

@jinyeom Good advice. Have you tried this approach?

@jinyeom I have dug into this for a while. It is hard to do, since every layer inside detectron2 takes an ImageList object as input rather than a plain Python list or tuple, and changing that is a massive amount of work.

The roi_heads use the output of the ProposalGenerator, and the proposal generator takes an ImageList as input, so it's impossible to trace only the backbone and roi_heads, since you are going to need the ProposalGenerator output. That part relies heavily on ImageList, which provides the height and width of every image.
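To make the ImageList dependency concrete: an ImageList pairs one padded batch tensor with the original (height, width) of each image. The plain-Python sketch below only illustrates that bookkeeping and does not call detectron2; the function name and the divisibility value of 32 are assumptions.

```python
def padded_batch_shape(image_sizes, size_divisibility=32):
    """Shape (N, H, W) of a batch that fits every image, padded to a multiple.

    image_sizes: list of (height, width) tuples, one per image.
    """
    max_h = max(h for h, _ in image_sizes)
    max_w = max(w for _, w in image_sizes)
    if size_divisibility > 1:
        # Round up to the next multiple of size_divisibility.
        max_h = -(-max_h // size_divisibility) * size_divisibility
        max_w = -(-max_w // size_divisibility) * size_divisibility
    return (len(image_sizes), max_h, max_w)


# Two hypothetical images of different sizes.
image_sizes = [(480, 640), (600, 500)]
shape = padded_batch_shape(image_sizes)
print(shape)  # (2, 608, 640)
```

A traced module only sees the padded tensor, so the per-image sizes have to be passed in as an extra input or fixed at export time if the proposal step is to clip boxes to the real image extent.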

I'm still exploring this method at the moment. Yeah, you're probably going to have to reimplement that preprocessing part manually.

@jinyeom I have found that tracing is possible with ImageList. People previously exported maskrcnn-benchmark to ONNX this way.

However, I'm hitting some constant errors for now, which still need fixing.

We will not have a GPU deployment pipeline since it has low value to Facebook. Closing, but you are welcome to comment about any working community efforts.

Just sharing some test results. I converted the backbone to a TensorRT engine with torch2trt. Measuring only the backbone forward time, the TensorRT engine is 30% faster, but the whole GeneralizedRCNN forward pass is only 15% faster.

  • config-file: faster_rcnn_R_101_FPN_3x.yaml
  • GPU: 1080
  • TensorRT: 6.0.1.5
  • pytorch: 1.3.0
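These two numbers allow a rough Amdahl-style estimate of how much of the forward pass the backbone accounts for. The sketch below assumes that "30% faster" and "15% faster" both mean a reduction in wall-clock time and that the rest of the model runs at unchanged speed; neither assumption is confirmed in the thread, and the function name is made up for illustration.

```python
def backbone_time_fraction(backbone_reduction, overall_reduction):
    """Estimate the share of total forward time spent in the backbone.

    If the backbone takes fraction f of the original time and its time drops
    by backbone_reduction, the total drops by f * backbone_reduction.
    Solving overall_reduction = f * backbone_reduction for f gives the result.
    """
    return overall_reduction / backbone_reduction


f = backbone_time_fraction(0.30, 0.15)
print(f)  # 0.5

# Upper bound on the overall time reduction from accelerating the backbone
# alone, reached only if the backbone cost nothing at all.
max_overall_reduction = f * 1.0
print(max_overall_reduction)  # 0.5
```

Under these assumptions roughly half of the forward time lies outside the backbone (proposal generation, ROI heads, NMS), which is consistent with the earlier remark that accelerating the backbone alone does not help much.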

@Sanster that's great. Are you willing to share the steps to convert backbone to TensorRT? Are you planning to convert whole Mask RCNN to TRT? Thanks!

Has anyone tried to convert the Deformable Convolution layer(s) to TensorRT or ONNX?

I have tried to use TensorRT to boost the inference performance of Cascade R-CNN with Caffe2/tensorrt. Caffe2/tensorrt is like TF-TRT, but it is not mature enough; after fixing some bugs, I got it to run. https://zhuanlan.zhihu.com/p/122399743

@mpjlu Can you share some code to reproduce running Cascade R-CNN on TensorRT?

Regarding the torch2trt backbone result above: can you share a code snippet? I can't make your example work with torch2trt.
