Py-faster-rcnn: batch size and speed while classifying

Created on 16 Feb 2016 · 12 Comments · Source: rbgirshick/py-faster-rcnn

I trained a py-faster-rcnn + VGG16 model some time ago. I ran into one issue, though, and I'd like to know if someone can help here.

I got speeds of roughly 300-400 ms/image on an AWS g2.2x GPU. With non-Faster R-CNN models, I have usually used the batch size feature to bunch up to 32 images and forward them through the network together, which gave me up to a 4x-5x improvement in timing, i.e. it would take about 10 secs to do 100 images.

However, when I try the same here, I keep getting errors about the input size, the blob shape, etc.

I use Python. My aim is to reach 50-60 ms/image on an AWS g2.2x when running in batch mode.

Has anyone tried batching on Faster R-CNN before, and are there any tips on how to get it done?
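One likely source of such input-size errors is that the images in a batch differ in size, while a Caffe input blob needs a single N x C x H x W shape. The sketch below shows one way to zero-pad images to a common size and stack them, along with one `im_info` row per image. The function names `images_to_blob` and `build_im_info` are hypothetical, and a per-image `im_info` layout is an assumption: the stock proposal layer accepts only single-item batches, so this alone does not make batching work.

```python
import numpy as np

def images_to_blob(images):
    """Stack H x W x 3 images of different sizes into one
    N x 3 x Hmax x Wmax blob, zero-padding at the bottom and right."""
    max_h = max(im.shape[0] for im in images)
    max_w = max(im.shape[1] for im in images)
    blob = np.zeros((len(images), max_h, max_w, 3), dtype=np.float32)
    for i, im in enumerate(images):
        h, w = im.shape[:2]
        blob[i, :h, :w, :] = im
    # Caffe expects channels first: (N, H, W, C) -> (N, C, H, W)
    return blob.transpose(0, 3, 1, 2)

def build_im_info(images, scales):
    """One (height, width, scale) row per image. Assumption: a batched
    proposal layer would read one row per batch item."""
    rows = [[im.shape[0], im.shape[1], s] for im, s in zip(images, scales)]
    return np.array(rows, dtype=np.float32)

# Two dummy "images" with different sizes (values are placeholders).
ims = [np.ones((60, 80, 3), dtype=np.float32),
       np.ones((48, 100, 3), dtype=np.float32)]
blob = images_to_blob(ims)
im_info = build_im_info(ims, [1.0, 1.0])
print(blob.shape)     # (2, 3, 60, 100)
print(im_info.shape)  # (2, 3)
```

With pycaffe, the input blob would then typically be reshaped to match before the forward pass, e.g. `net.blobs['data'].reshape(*blob.shape)`.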

kshalini

Most helpful comment

@JohnnyY8 Could you try compiling caffe with CUDA-7.5? For some reason I haven't identified, py-faster-rcnn runs slower for me with CUDA-8.0.
Additionally, you can try using only the C++ implementation of the proposal layer and of your test tool, i.e. removing Python entirely. You can do this based on the pvanet sources. It requires adding batch support to proposal_layer.cpp and proposal_layer.cu. You can use my version:
proposal_layer.cu.txt
proposal_layer.cpp.txt
It also requires changes to the model prototxt files, see the patch
prototxt.patch.
As a basis for a C++ version of the test tool, you can try cpp_classification. Hope it helps you improve image processing performance.
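A rough sketch of one piece that batch support in a proposal layer involves, written in Python/NumPy for brevity and not taken from the attached files: each ROI passed on to ROI pooling is a row of (batch_index, x1, y1, x2, y2), so proposals from several images need the index of their image in the first column. The function name `proposals_to_rois` is hypothetical.

```python
import numpy as np

def proposals_to_rois(proposals_per_image):
    """Concatenate per-image proposal boxes (each K_i x 4 as
    x1, y1, x2, y2) into an R x 5 array whose first column is the
    index of the image within the batch.
    Assumes at least one image is given."""
    rois = []
    for batch_idx, boxes in enumerate(proposals_per_image):
        boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
        inds = np.full((boxes.shape[0], 1), batch_idx, dtype=np.float32)
        rois.append(np.hstack((inds, boxes)))
    return np.vstack(rois)

# Placeholder boxes for a batch of two images.
boxes_img0 = [[0, 0, 10, 10], [5, 5, 20, 20]]
boxes_img1 = [[1, 2, 3, 4]]
rois = proposals_to_rois([boxes_img0, boxes_img1])
print(rois.shape)   # (3, 5)
print(rois[:, 0])   # [0. 0. 1.]
```

In a single-image layer this column is all zeros, which is why it is easy to overlook when extending the layer to batches.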

DmitryKhlus on 17 Dec 2016

👍4

All 12 comments

Hi, kshalini
I tried to implement batch image processing by changing the following files:
1) lib/fast_rcnn/test.py (test.py.txt)
2) lib/rpn/proposal_layer.py (proposal_layer.py.txt)
3) tools/demo.py (demo.py.txt)
But it looks like you have a better result. My testing result is around 140 ms/image.
Maybe you can share your changes; it could help us improve each other's systems.

Thanks

DmitryKhlus on 21 Feb 2016

👍2

@DmitryKhlus, thanks for the query, but as mentioned in my earlier post, I only got about 300-400 ms.

Only on a regular CNN (not Faster R-CNN) could I get faster results through the batch size feature.

With regard to Faster R-CNN, your results are in fact better than mine :-) (140 ms vs 400 ms).

BTW, which network did you use: AlexNet, VGG or ZF?

Anyway, I will also share my changes here shortly. Regards

kshalini on 22 Feb 2016

@kshalini, I use the ZF network, and my test environment is also AWS.

DmitryKhlus on 22 Feb 2016

Hi @DmitryKhlus @kshalini,
We also tried to use the ZF network to process images in batches. The results are as follows:
without batch: [image not preserved]

with batch: [image not preserved]

Processing without batching is faster than with batching, and we do not understand why. Can you help us?
Thanks a lot!
BTW, we compiled caffe with CUDA-8.0 and cudnn-8.0-linux-x64-v5.1.

JohnnyY8 on 29 Oct 2016

I just used the code @DmitryKhlus provided for batching and experienced a 1.5-2% increase in speed (on the ZF-trained network).

mheyman on 16 Dec 2016

DmitryKhlus on 17 Dec 2016

👍4

@DmitryKhlus Thanks for your time. It is a little complicated for us, but we will try.

JohnnyY8 on 20 Dec 2016

@DmitryKhlus
Great job!
BTW, I can't access the file behind the "see 'Proposal' section here" link. Could you please send it to me? Thanks a lot!

mydear33000 on 28 Apr 2017

@DmitryKhlus I just used the code that you provided for batching.
But when I set __C.TEST.BBOX_REG = False in config.py, I get this error:

lib/fast_rcnn/test.py", line 202, in im_detect_array
    pred_boxes = np.tile(boxes, (1, scores.shape[1]))
AttributeError: 'list' object has no attribute 'shape'

Do you know what happened?

315386775 on 26 Jul 2017

@315386775
I haven't tested it with your changes. I think the reason for this problem is that scores is not a numpy array. You can try converting the list to a numpy array, for example: np.asarray(scores).shape[1]
Or maybe you can initialize 'scores' as a numpy array before the line shown in the error.
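A minimal sketch of that conversion, assuming `scores` is a plain Python list of equal-length per-box score rows. The values below are placeholders, not output from the actual network.

```python
import numpy as np

# Hypothetical stand-ins for the variables in test.py.
scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]          # plain list
boxes = np.array([[0, 0, 10, 10],
                  [5, 5, 20, 20],
                  [1, 2, 3, 4]], dtype=np.float32)

# At this point scores.shape would raise:
# AttributeError: 'list' object has no attribute 'shape'
scores = np.asarray(scores, dtype=np.float32)

# With bounding-box regression off, each box is repeated once per class.
pred_boxes = np.tile(boxes, (1, scores.shape[1]))
print(pred_boxes.shape)  # (3, 8)
```

If the list instead holds one array per image with differing numbers of rows, np.asarray will not produce a regular 2-D array; stacking with np.vstack or handling each image separately may be needed.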

DmitryKhlus on 26 Jul 2017

@DmitryKhlus Thanks for your proposal_layer.cu.txt, proposal_layer.cpp.txt and prototxt.patch. However, proposal_layer.cu still contains the check "Only single item batches are supported". Could multi-item batches be supported?