Py-faster-rcnn: Training py-faster r-cnn using alt-opt algorithm

Created on 6 Jan 2016 · 31 Comments · Source: rbgirshick/py-faster-rcnn

I am trying to train on the VOC2007 dataset using the alternating optimization (alt-opt) algorithm. I trained on the dataset [21 classes] using the command given on the project page [./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF]. Training ran without any issues and I have a caffemodel after 10000 iterations. I tried to verify the trained model using demo.py (with the caffemodel location changed to the newly trained model). When I run demo.py, the model does not detect anything (no boxes are drawn) in the images.

my threshold values in demo.py are:
CONF_THRESH=0.5
NMS_THRESH=0.4

I looked into the dets that are generated. Surprisingly, all the scores in the dets passed to the vis_detections() function have the same value (0.0476, i.e. approximately 1/21). I'm not sure where I made the mistake. Also, is there a minimum number of iterations that has to be run for alt-opt?

hoticevijay
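
A score of 0.0476 for every class is what a softmax over 21 classes produces when all of its inputs are equal, which suggests the classifier output carries no information yet. The following is a minimal sketch of that arithmetic, not code from demo.py; the equal logits are a hypothetical input chosen for illustration, while the class count and threshold are the ones quoted in the post.

```python
import math

NUM_CLASSES = 21    # 20 VOC classes + background, as stated in the post
CONF_THRESH = 0.5   # threshold quoted in the post


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical input: a classifier that has learned nothing emits equal logits.
scores = softmax([0.0] * NUM_CLASSES)

print(round(scores[0], 4))                      # 0.0476, i.e. 1/21
print(any(s >= CONF_THRESH for s in scores))    # False: nothing passes the threshold
```

With every score at 1/21, no detection can reach CONF_THRESH=0.5, so no boxes are drawn regardless of the NMS setting.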

Most helpful comment

@hoticevijay : 10000 iterations is very few; you can try increasing the number of iterations. Removing dropout may also help in this case, as VOC 2007 has a relatively small amount of training data. I was facing the same issue with end2end training on a 10-class subset of VOC (only 1000 images); after removing dropout I got better results.

SaiAdityaG on 17 May 2016

👍3

All 31 comments

I ran into a similar problem after compressing the net. Contact me if you solve it.

JoeLoveFannie on 25 Feb 2016

@JoeLoveFannie : I couldn't figure out the reason. But I was able to train the VOC dataset using end2end algorithm.

hoticevijay on 25 Feb 2016

I am facing the same issue i.e. no boxes are being generated using the model that is generated after 10000 iterations. (The only difference in my case is that I have increased the classes in my data). Did you get any solution for this? (I am planning to try the end2end method and will update if that works)

Anubhav7 on 27 Feb 2016

@Anubhav7 : I am able to generate bounding boxes using end2end algorithm

hoticevijay on 21 Mar 2016

Hi, I used the command ./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF but got this error:

+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ GPU_ID=0
+ NET=ZF
+ NET_lc=zf
+ DATASET=
+ array=($@)
+ len=2
+ EXTRA_ARGS=
+ EXTRA_ARGS_SLUG=
+ case $DATASET in
+ echo 'No dataset given'
No dataset given
+ exit

How can I solve this?

yinggo on 24 Mar 2016

@yinggo Download the PASCAL VOC dataset as explained in https://github.com/andrewliao11/py-faster-rcnn/blob/master/original_README.md. Then add pascal_voc to the end of the command: ./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc

DiegoPortoJaccottet on 13 Apr 2016

I have the same problem.
I downloaded VOCdevkit2007 and trained to stage2, 80000 iterations. I can't use the demo; p is always 0.0476.

DiegoPortoJaccottet on 13 Apr 2016

I found the problem: you need to train the network completely, to the end. If you don't, detection doesn't work. It takes 15 hours on my Titan X.

DiegoPortoJaccottet on 18 Apr 2016

@DiegoPortoJaccottet : Hi, I am also facing this error. Did you use the initial weights or not?
@hoticevijay : Did you find a solution for this issue?
@SaiAdityaG : I also trained end2end for 120,000 iterations but still face this issue (same number of classes as VOC, just with the initial weights removed). Could you send me your modifications?

tiepnh on 1 Jul 2016

@tiepnh
Initial weights are in VGG16.v2.caffemodel. There are several stages of training, some take 80 000 iterations, other stages take 40 000. You must complete all stages, training to the end, to have a working network. Faster-RCNN is not like pure Caffe where you can train a little and have a little result, you must go through all stages to have a working RCNN.

And if you want to detect a class that's not in VGG16.v2.caffemodel, then you must pre-train a network for classification using Caffe before you can train using Faster-RCNN for detection, otherwise it will not work. The number of images for pre-training might be much larger than the number of images used for training on Faster-RCNN. I hope this helps you in achieving good results.

DiegoPortoJaccottet on 1 Jul 2016
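
To make the "complete all stages" point concrete, here is a small sketch of the alt-opt schedule as a list of stages. Only the 80000 and 40000 figures come from the comment above; the stage names, their order, and the count of four training stages are assumptions about the default configuration and should be checked against tools/train_faster_rcnn_alt_opt.py in your checkout.

```python
# Assumed default alt-opt schedule: (stage name, solver iterations).
# Names and ordering are illustrative assumptions, not read from the repo.
ALT_OPT_STAGES = [
    ("stage 1 RPN", 80000),
    ("stage 1 Fast R-CNN", 40000),
    ("stage 2 RPN", 80000),
    ("stage 2 Fast R-CNN", 40000),
]


def total_iterations(stages=ALT_OPT_STAGES):
    """Sum of solver iterations over all training stages."""
    return sum(iters for _, iters in stages)


def remaining_stages(completed, stages=ALT_OPT_STAGES):
    """Names of the stages still to run after `completed` stages have finished."""
    return [name for name, _ in stages[completed:]]


print(total_iterations())       # 240000 under the assumed schedule
print(remaining_stages(1))      # stages left after only the first one has finished
```

Under these assumptions, a snapshot taken 10000 iterations into the first stage is an intermediate RPN checkpoint rather than a finished detector, which would be consistent with the uniform scores reported earlier in the thread.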

@DiegoPortoJaccottet : Thank you very much. I read the paper again and found the information about the pre-trained network. It seems that Faster R-CNN cannot be trained without a pre-trained network.
But I am still confused about how to create a pre-trained network for a totally new network.

tiepnh on 1 Jul 2016

@tiepnh
It's been some time since I did this, but if I remember correctly I just deleted some lines from faster_rcnn_test.pt. Added the usual DATA layers found in Caffe Imagenet examples, and deleted all RPN and RoI Proposal layers. Also changed the last few layers and added accuracy and loss layers. After training, the weights in conv1_1, relu1_1, etc will be transferred, but the new RPN, RoI, etc layers will be re-trained for detection. I did not test this approach fully, so I cannot guarantee that it will work.
faster_rcnn_test.txt

DiegoPortoJaccottet on 1 Jul 2016

@DiegoPortoJaccottet : Thank you for your support.
I will try to train my network as you suggested and will share the results later.

tiepnh on 4 Jul 2016

👍1

@SaiAdityaG : Could you share your prototxt where you had removed the dropout layers?

vj-1988 on 9 Jul 2016

@vj-1988 I don't have access to those prototxt files now. To remove the dropout layers, I just set the dropout ratio to 0 in the prototxt files.

SaiAdityaG on 10 Jul 2016

👍2
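
For anyone who wants to apply the same change to several prototxt files, the sketch below rewrites every dropout_ratio value to 0 using a regular expression. It is a text-level helper written for illustration, not part of py-faster-rcnn; the function name zero_dropout and the sample layer are hypothetical, and the layer text only follows the usual Caffe dropout_param layout.

```python
import re


def zero_dropout(prototxt_text):
    """Return the prototxt text with every dropout_ratio value replaced by 0."""
    # \g<1> keeps the 'dropout_ratio: ' prefix; only the number is replaced.
    return re.sub(r"(dropout_ratio\s*:\s*)[0-9.eE+-]+", r"\g<1>0", prototxt_text)


# Hypothetical sample layer in the usual Caffe layout.
example = """
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
"""

print(zero_dropout(example))
```

Setting the ratio to 0 keeps the layer names and wiring intact, so the rest of the network definition does not need to be edited.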

@SaiAdityaG : Thank you. I was able to start the training after setting the dropout ratio to 0.

hoticevijay on 11 Jul 2016

@DiegoPortoJaccottet: So, for custom datasets, it is mandatory to pre-train the network on the custom dataset and generate a new caffemodel, which is then used as the initial weights for py-faster-rcnn training.

If I want to add a few more classes (say 50 new classes) alongside the ImageNet classes for pre-training, is it sufficient to fine-tune VGG16.v2.caffemodel (after modifying train.prototxt to output 1050 classes), or should I train the network from scratch? (Training on ImageNet from scratch is time-consuming.)

vj-1988 on 12 Jul 2016

Hi @DiegoPortoJaccottet : Thanks for your support. I can now train the new network without a pre-trained network. What needs to be done is to add a weight_filler to every convolution and inner_product layer, and also to set lr_mult != 0.

tiepnh on 21 Jul 2016

👍1
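
The two conditions mentioned above (a weight_filler on each convolution and inner-product layer, and a non-zero lr_mult) can be checked mechanically before starting a from-scratch run. The sketch below is a rough text scanner, not a real prototxt parser; the function names and the sample network are hypothetical, and it assumes the newer `layer { ... }` syntax with the layer type appearing before any filler type.

```python
import re


def split_layers(prototxt_text):
    """Yield the text of each top-level `layer { ... }` block by counting braces."""
    for match in re.finditer(r"\blayer\s*\{", prototxt_text):
        depth, i = 1, match.end()
        while i < len(prototxt_text) and depth > 0:
            if prototxt_text[i] == "{":
                depth += 1
            elif prototxt_text[i] == "}":
                depth -= 1
            i += 1
        yield prototxt_text[match.start():i]


def find_untrainable(prototxt_text):
    """List (layer name, problem) pairs for Convolution/InnerProduct layers."""
    problems = []
    for block in split_layers(prototxt_text):
        type_match = re.search(r'type\s*:\s*"(\w+)"', block)
        if not type_match or type_match.group(1) not in ("Convolution", "InnerProduct"):
            continue
        name_match = re.search(r'name\s*:\s*"([^"]+)"', block)
        name = name_match.group(1) if name_match else "<unnamed>"
        if "weight_filler" not in block:
            problems.append((name, "no weight_filler"))
        lr_mults = re.findall(r"lr_mult\s*:\s*([0-9.]+)", block)
        if lr_mults and all(float(v) == 0 for v in lr_mults):
            problems.append((name, "lr_mult is 0"))
    return problems


# Hypothetical two-layer network used only to exercise the checker.
sample = """
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  convolution_param { num_output: 96 kernel_size: 7 }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  inner_product_param {
    num_output: 4096
    weight_filler { type: "gaussian" std: 0.01 }
  }
}
"""

print(find_untrainable(sample))
```

A zero lr_mult is a legitimate way to freeze layers that start from pre-trained weights, so a flagged layer is only a problem when training without such weights, as in the comment above.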

@tiepnh Yes my model was missing the lr_mult, sorry for that. I revisited the problem and successfully pre-trained a network. Good luck.
train_val.txt

DiegoPortoJaccottet on 24 Jul 2016

is there any solution for this error?
user01@digits-1:~py-faster-rcnn$ ./experiments/scripts/faster_rcnn_end2end.sh gpu 1 VGG16 pascal_voc

set -e
export PYTHONUNBUFFERED=True
PYTHONUNBUFFERED=True
GPU_ID=gpu
NET=1
NET_lc=1
DATASET=VGG16
array=($@)
len=4
EXTRA_ARGS=pascal_voc
EXTRA_ARGS_SLUG=pascal_voc
case $DATASET in
echo 'No dataset given'
No dataset given
exit

cervantes-loves-ai on 28 Jul 2016

@sarkeribrahim
There is no need for "gpu", only the ID (1 in your case).
Therefore the command should be: ./experiments/scripts/faster_rcnn_end2end.sh 1 VGG16 pascal_voc
Not: ./experiments/scripts/faster_rcnn_end2end.sh gpu 1 VGG16 pascal_voc

DiegoPortoJaccottet on 28 Jul 2016
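
Both "No dataset given" traces in this thread follow from the script reading its arguments by position: the first is the GPU id, the second the net, the third the dataset, and anything after that is passed on as extra arguments. The Python sketch below mimics that behaviour for illustration only; parse_script_args is a hypothetical name, and the set of accepted dataset names is an assumption apart from pascal_voc, which appears in the thread.

```python
# Dataset names the sketch accepts; "coco" is an assumption, "pascal_voc" is from the thread.
KNOWN_DATASETS = ("pascal_voc", "coco")


def parse_script_args(argv):
    """Mimic the positional parsing shown in the shell traces above."""
    gpu_id = argv[0] if len(argv) > 0 else ""
    net = argv[1] if len(argv) > 1 else ""
    dataset = argv[2] if len(argv) > 2 else ""
    extra_args = argv[3:]
    if dataset not in KNOWN_DATASETS:
        raise ValueError("No dataset given")
    return {"GPU_ID": gpu_id, "NET": net, "DATASET": dataset, "EXTRA_ARGS": extra_args}


for args in (["0", "ZF"],
             ["gpu", "1", "VGG16", "pascal_voc"],
             ["1", "VGG16", "pascal_voc"]):
    try:
        print(args, "->", parse_script_args(args))
    except ValueError as err:
        print(args, "->", err)
```

In the second call the extra word "gpu" shifts every argument by one position, so VGG16 lands in the dataset slot, which matches the DATASET=VGG16 line in the trace.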

When I try to run this:
./experiments/scripts/faster_rcnn_alt_opt.sh 0 VGG16 pascal_voc

I get this error, even though I checked the path.

'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000,
'SCALES': [600],
'SNAPSHOT_INFIX': 'stage1',
'SNAPSHOT_ITERS': 10000,
'USE_FLIPPED': True,
'USE_PREFETCH': False},
'USE_GPU_NMS': True}
Process Process-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "./tools/train_faster_rcnn_alt_opt.py", line 122, in train_rpn
roidb, imdb = get_roidb(imdb_name)
File "./tools/train_faster_rcnn_alt_opt.py", line 61, in get_roidb
imdb = get_imdb(imdb_name)
File "/home/user01/Music/anotherfast/py-faster-rcnn/tools/../lib/datasets/factory.py", line 38, in get_imdb
return __sets[name]()
File "/home/user01/Music/anotherfast/py-faster-rcnn/tools/../lib/datasets/factory.py", line 20, in <lambda>
__sets[name] = (lambda split=split, year=year: pascal_voc(split, year))
File "/home/user01/Music/anotherfast/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 38, in __init__
self._image_index = self._load_image_set_index()
File "/home/user01/Music/anotherfast/py-faster-rcnn/tools/../lib/datasets/pascal_voc.py", line 82, in _load_image_set_index
'Path does not exist: {}'.format(image_set_file)
AssertionError: Path does not exist: /home/user01/Music/anotherfast/py-faster-rcnn/data/VOCdevkit2007/VOC2007/ImageSets/Main/trainval.txt

With Regards,
싸커모하매드이브라힘
Mohammad Ibrahim Sarker

cervantes-loves-ai on 4 Aug 2016
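
The assertion above fires because the image-set file is not where the loader looks for it. A quick way to see what is missing is to check the expected layout before launching training. The sketch below is a standalone helper, not part of py-faster-rcnn; the function name is hypothetical, the trainval.txt location is taken from the traceback, and the JPEGImages and Annotations folders are the standard VOC devkit directories.

```python
import os


def missing_voc_paths(repo_root, year="2007", image_set="trainval"):
    """Return the expected VOC paths under repo_root that do not exist."""
    voc = os.path.join(repo_root, "data", "VOCdevkit" + year, "VOC" + year)
    expected = [
        os.path.join(voc, "ImageSets", "Main", image_set + ".txt"),  # from the traceback
        os.path.join(voc, "JPEGImages"),    # standard VOC image folder
        os.path.join(voc, "Annotations"),   # standard VOC annotation folder
    ]
    return [path for path in expected if not os.path.exists(path)]


# Run from the py-faster-rcnn checkout; an empty list means the layout looks complete.
print(missing_voc_paths("."))
```

If the devkit was extracted somewhere else, a symlink named VOCdevkit2007 inside the data directory pointing at it is one common way to satisfy this layout.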

Hi,
When I run Fast R-CNN training, I get the following error:

File "/home/alpha/fast-rcnn/py-faster-rcnn/tools/../lib/datasets/fishclassify.py", line 118, in rpn_roidb
if int(self._year) == 2007 or self._image_set != 'test':
AttributeError: 'fishclassify' object has no attribute '_year'

I checked basketball.py in the basketball project. There, too, _year is not declared in def __init__.
Please advise me on how to overcome this error.

indsak on 14 Jun 2017

Can anybody please help with the problem above?

indsak on 15 Jun 2017

@psy770 look at it https://github.com/rbgirshick/py-faster-rcnn/issues/373

ujsyehao on 21 Nov 2017
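
The AttributeError above arises because rpn_roidb reads self._year, which a dataset class copied from pascal_voc.py may never set. The sketch below shows one possible way around it on a hypothetical stand-in class; it is not the real fishclassify.py and not necessarily what the linked issue recommends. It defines the attribute in __init__ and makes the condition tolerate a dataset that has no year.

```python
class FishClassifySketch(object):
    """Hypothetical stand-in for a custom dataset class; not the real fishclassify.py."""

    def __init__(self, image_set, year=None):
        self._image_set = image_set
        # Defining the attribute up front avoids the AttributeError,
        # even for datasets that have no notion of a year.
        self._year = year

    def uses_gt_roidb(self):
        """Mirror the condition from the traceback, tolerating a missing year."""
        year_is_2007 = self._year is not None and int(self._year) == 2007
        return year_is_2007 or self._image_set != 'test'


print(FishClassifySketch('train').uses_gt_roidb())
print(FishClassifySketch('test').uses_gt_roidb())
```

For a custom dataset, dropping the year check entirely and keeping only the image-set test would be an equally reasonable choice.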

I think it should be like this
./experiments/scripts/faster_rcnn_end2end.sh gpu 0 VGG16 pascal_voc

gengkai258 on 30 Mar 2018

Hi guys,
What is the solution for the error below?
Let me know if anyone knows about it.

'USE_GPU_NMS': True}
Loaded dataset train for training
Set proposal method: gt
Appending horizontally-flipped training examples...
Traceback (most recent call last):
File "./tools/train_net.py", line 108, in <module>
imdb, roidb = combined_roidb(args.imdb_name)
File "./tools/train_net.py", line 73, in combined_roidb
roidbs = [get_roidb(s) for s in imdb_names.split('+')]
File "./tools/train_net.py", line 70, in get_roidb
roidb = get_training_roidb(imdb)
File "/home/ubuntu/FRCN_ROOT/tools/../lib/fast_rcnn/train.py", line 119, in get_training_roidb
imdb.append_flipped_images()
File "/home/ubuntu/FRCN_ROOT/tools/../lib/datasets/imdb.py", line 106, in append_flipped_images
boxes = self.roidb[i]['boxes'].copy()
File "/home/ubuntu/FRCN_ROOT/tools/../lib/datasets/imdb.py", line 67, in roidb
self._roidb = self.roidb_handler()
File "/home/ubuntu/FRCN_ROOT/tools/../lib/datasets/radar.py", line 83, in gt_roidb
for index in self.image_index]
File "/home/ubuntu/FRCN_ROOT/tools/../lib/datasets/radar.py", line 198, in _load_radar_annotation
overlaps[ix, cls] = 1.0
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

Ram-Godavarthi on 7 Jun 2018

@Ram124 Hi,
have you solved the problem?
I hit the same error in OHEM.

foralliance on 25 Jun 2018

@foralliance
Check your XML file once.

Also edit the lines specified below.
It should work.

In $Faster-RCNN-Root/lib/rpn/proposal_target_layer.py:
line 124 should be "cls = int(clss[ind])"
line 166 should be "...size=int(fg_rois_per_this_image),..."
line 177 should be "...size=int(bg_rois_per_this_image),..."
line 184 should be "labels[int(fg_rois_per_this_image):] = 0"

Ram-Godavarthi on 25 Jun 2018
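
The int() casts suggested above address the IndexError because recent NumPy versions refuse floating-point values as array indices. The following is a minimal reproduction under that assumption; the array shape and the helper name mark_class are made up for illustration and do not come from radar.py or proposal_target_layer.py.

```python
import numpy as np


def mark_class(overlaps, ix, cls):
    """Set overlaps[ix, cls] to 1.0, casting both indices to int first."""
    overlaps[int(ix), int(cls)] = 1.0
    return overlaps


overlaps = np.zeros((2, 3), dtype=np.float32)

try:
    # A float index (for example one computed by division) is rejected.
    overlaps[0, np.float64(1)] = 1.0
except IndexError as err:
    print("float index rejected:", err)

mark_class(overlaps, 0, np.float64(1))
print(overlaps[0, 1])
```

If the class index comes from a lookup of names read from the XML annotations, it is also worth confirming that the lookup returns an integer, which is presumably why checking the XML file is suggested as well.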

@yinggo @rbgirshick
Hello guys,
I have a question about detection.
I have 2 classes (3 including background).
Suppose I feed the network training data with only 1 object per image:
100 images contain only class A and another 100 images contain only class B,
so in total there are 200 images with 2 classes.

Suppose I train on them.

During testing, if I give it images that contain both classes (A and B),
would the network detect both objects simultaneously?

Or should I feed the network training images that contain both objects?

Please clarify this,
because I am getting only 1 object detected per image during testing (after training on the dataset with 1 object per image).

Any help is really appreciated.

Ram-Godavarthi on 25 Jun 2018
