Py-faster-rcnn: Training FasterRCNN without pre-trained network?

Created on 1 Jul 2016 · 56 Comments · Source: rbgirshick/py-faster-rcnn

Hi all,
I got the error:
"BB = BB[sorted_ind, :]
IndexError: too many indices for array"
It seems that the trained network has learned nothing.

I followed the original steps in https://github.com/rbgirshick/py-faster-rcnn
and only modified the script ./experiments/scripts/faster_rcnn_end2end.sh to remove the line " --weights data/imagenet_models/${NET}.v2.caffemodel \".
Training finishes and the caffemodel file is produced.

Has anyone faced this error? Could you please give me a solution?
Thank you,

Maybe your model didn't learn anything (without being initialized from the pre-trained model parameters).
The predictions of the model (on all images in the test set) don't contain any boxes labeled with some class C, so BB becomes an empty array.

@manipopopo : Thank you for your answer. I understand that my trained model contains nothing. I am just confused about why the model cannot learn anything without a pre-trained model.
If you know anything about this, please share it with me.
Thank you

Hi @tiepnh ,

Which ${NET} did you try? The lr_mult of the layers before conv2 in VGG_CNN_M_1024 and of the layers before conv3_1 in VGG16 is set to 0; that is, the bottom layers of these networks won't learn anything during training.

Even if all the lr_mult values are non-zero (as in ZF net), the hyperparameters (e.g. weight decay, training max_iteration and learning rates) need to be tuned. The provided hyperparameters are tuned for training models initialized from pre-trained models.

Hi @manipopopo : Thank you for your answer. I will check those hyperparameters again.
Thanks,

Hi @manipopopo : I tried training ZF without a pre-trained model, using the solver below:

base_lr: 0.001
lr_policy: "step"
gamma: 0.8
stepsize: 7000
display: 20
average_loss: 100
momentum: 0.9
weight_decay: 0.0005
max_iteration is 70000

But it still cannot learn anything. Can you give me some advice?
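
For reference, Caffe's "step" policy computes the rate as base_lr * gamma ^ floor(iter / stepsize). A minimal sketch of what the solver settings quoted above do to the learning rate over the run; the function name `step_lr` is only illustrative and is not part of Caffe or py-faster-rcnn:

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Learning rate under Caffe's "step" policy at a given iteration."""
    return base_lr * gamma ** (iteration // stepsize)


if __name__ == "__main__":
    # Values copied from the solver settings quoted above.
    for it in (0, 7000, 35000, 69999):
        print(it, step_lr(0.001, 0.8, 7000, it))
```

With gamma 0.8 the rate has only dropped to roughly 13% of its starting value by the last iteration, so the schedule itself is unlikely to be what stops the network from learning.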

  • You need to specify weight_filler for all Convolution and InnerProduct layers in the proto, otherwise all parameters will be initialized with zero.
  • You may want to check lr_policy exp, inv and poly (see caffe.proto for further information). Besides, hyperparameters include __C.TRAIN.{SOMETHING} in config.py.
  • Even if the model initialized randomly learns _something_, it may still perform worse than one initialized with pre-trained weights.
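
The first point can be checked mechanically. Below is a rough sketch, not a real prototxt parser, that scans the text of a train.prototxt and lists Convolution and InnerProduct layers with no weight_filler. It assumes the `layer { ... }` syntax shown in this thread, that each layer's own `type:` appears before any nested filler `type:`, and that '#' only starts comments. The helper names are made up for this example.

```python
import re


def iter_layer_blocks(prototxt_text):
    """Yield the text of each top-level `layer { ... }` block."""
    # Drop '#' comments so braces inside comments cannot confuse the matcher.
    text = re.sub(r"#[^\n]*", "", prototxt_text)
    for match in re.finditer(r"\blayer\s*\{", text):
        depth, pos = 1, match.end()
        # Walk forward until the brace opened by `layer {` is closed again.
        while depth and pos < len(text):
            if text[pos] == "{":
                depth += 1
            elif text[pos] == "}":
                depth -= 1
            pos += 1
        yield text[match.start():pos]


def layers_missing_weight_filler(prototxt_text):
    """Return names of Convolution/InnerProduct layers without a weight_filler."""
    missing = []
    for block in iter_layer_blocks(prototxt_text):
        type_match = re.search(r"\btype\s*:\s*[\"'](\w+)[\"']", block)
        name_match = re.search(r"\bname\s*:\s*[\"']([^\"']+)[\"']", block)
        if not type_match or type_match.group(1) not in ("Convolution", "InnerProduct"):
            continue
        if "weight_filler" not in block:
            missing.append(name_match.group(1) if name_match else "<unnamed>")
    return missing


if __name__ == "__main__":
    sample = '''
    layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1"
            convolution_param { num_output: 96 kernel_size: 7 } }
    layer { name: "fc6" type: "InnerProduct" bottom: "pool5" top: "fc6"
            inner_product_param { num_output: 4096
              weight_filler { type: "gaussian" std: 0.01 } } }
    '''
    print(layers_missing_weight_filler(sample))
```

Running it on your own file would be a matter of reading the prototxt into a string and passing it to `layers_missing_weight_filler`.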

Hi @manipopopo. Thanks for your support.

> You need to specify weight_filler for all Convolution and InnerProduct layers in the proto, otherwise all parameters will be initialized with zero.

So does that mean every Convolution or InnerProduct layer in the proto needs something like "weight_filler { type: "gaussian" std: 0.01 }"?
My problem is that the network doesn't learn anything without pre-training, so my first target is to make it learn something, even if the performance is worse than with pre-trained weights. That would still be a big step for me.
I will try again following your advice. Thanks again.

Before starting training a model on the whole data set by 70000 iterations, you may want to experiment with a tiny training data set (2-20 images), and make sure that the model can _overfit_ the training data set.
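
One way to set up such an overfitting check is to cut a tiny image-set list out of the full one. The sketch below assumes a VOC-style list file with one image id per line (for example an ImageSets/Main/trainval.txt); the function names and any paths you pass in are hypothetical, and registering the new split with the imdb code is not covered here.

```python
import random


def sample_image_ids(image_ids, count, seed=0):
    """Pick `count` ids reproducibly from an iterable of image-id lines."""
    ids = [line.strip() for line in image_ids if line.strip()]
    rng = random.Random(seed)
    return sorted(rng.sample(ids, min(count, len(ids))))


def write_tiny_split(src_path, dst_path, count=10, seed=0):
    """Write a small image-set file built from the full list at src_path."""
    with open(src_path) as src:
        subset = sample_image_ids(src.readlines(), count, seed)
    with open(dst_path, "w") as dst:
        dst.write("\n".join(subset) + "\n")
    return subset
```

If the training loss does not approach zero on a handful of images, the problem is more likely in initialization or frozen layers than in the amount of data.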

Hi @manipopopo: Following your guide, I can train the network without a pre-trained network.
But, as you said, the mAP is lower than with one initialized from a pre-trained network. It is still good enough for me.
Thanks a lot

@tiepnh Hi. I have the same problem as you. I would really appreciate it if you could let me know how to modify the train.prototxt for faster_rcnn_end2end. I don't want to use a pre-trained model, so could you give me more detail?

@tolry418 : You should follow the earlier comment from @manipopopo.
Or you can try this prototxt (from #56): https://github.com/rbgirshick/py-faster-rcnn/files/380443/train_val.txt
Based on that, you no longer need a pre-trained network.

@tiepnh Thanks.
I opened the file you gave me, but it has no RPN or RCNN parts. It is just the VGG model, isn't it?
What should I refer to in order to train Faster R-CNN?
Also, I don't understand @manipopopo's comments.
Do I have to run 70000 iterations, and do I have to put weight_filler in every Convolution or InnerProduct layer?
I'm wondering how you modified your end2end train.prototxt.
In my case I used VGG16/faster_rcnn_end2end/train.prototxt.
Thanks.

@tolry418 : I'm sorry, I sent you the wrong prototxt file. That file is only for pre-training the network.
To solve your issue now, you just need to add weight_filler to all _Convolution_ and _InnerProduct_ layers in the proto (put the line "_weight_filler { type: "gaussian" std: 0.01 }_" in every _convolution_param_ and _inner_product_param_). You can look at the prototxt in the previous comment to see how weight_filler is added.

These changes should let your network learn something; I am not sure whether the accuracy of the final model will be good or bad. So you don't need to change the other hyperparameters (such as iterations, learning rate, ...) for now. Just keep the defaults and check whether the network can learn anything.
I hope this solves your issue.

@tiepnh Thanks for your reply. As you mentioned, I put the line "weight_filler { type: "gaussian" std: 0.01 }" in all Convolution and InnerProduct layers in train.prototxt, which is in the py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_end2end folder.
I changed nothing else except adding the weight_filler,

like below:

layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }

}

}

BUT I can't train it. I still get the same error ...
What did I do wrong?

Thanks.

@tolry418 : Can you upload your full proto file and also your faster_rcnn_end2end.sh file?
Please also upload the config you are using. Based on that, maybe we can resolve the current issue.

OK. You might have misunderstood my situation.
I can train it. The problem appears at test time, when I use the model trained without a pre-trained model.
The same error occurs at the end of testing:
"BB = BB[sorted_ind, :]
IndexError: too many indices for array"

This is the train.prototxt I modified.


name: "VGG_ILSVRC_16_layers"
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 21"
}
}

layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu1_2"
type: "ReLU"
bottom: "conv1_2"
top: "conv1_2"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_2"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "pool1"
top: "conv2_1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu2_1"
type: "ReLU"
bottom: "conv2_1"
top: "conv2_1"
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "conv2_1"
top: "conv2_2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu2_2"
type: "ReLU"
bottom: "conv2_2"
top: "conv2_2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2_2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "pool2"
top: "conv3_1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu3_1"
type: "ReLU"
bottom: "conv3_1"
top: "conv3_1"
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "conv3_1"
top: "conv3_2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu3_2"
type: "ReLU"
bottom: "conv3_2"
top: "conv3_2"
}
layer {
name: "conv3_3"
type: "Convolution"
bottom: "conv3_2"
top: "conv3_3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu3_3"
type: "ReLU"
bottom: "conv3_3"
top: "conv3_3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3_3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "pool3"
top: "conv4_1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu4_1"
type: "ReLU"
bottom: "conv4_1"
top: "conv4_1"
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "conv4_1"
top: "conv4_2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu4_2"
type: "ReLU"
bottom: "conv4_2"
top: "conv4_2"
}
layer {
name: "conv4_3"
type: "Convolution"
bottom: "conv4_2"
top: "conv4_3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu4_3"
type: "ReLU"
bottom: "conv4_3"
top: "conv4_3"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv4_3"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv5_1"
type: "Convolution"
bottom: "pool4"
top: "conv5_1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu5_1"
type: "ReLU"
bottom: "conv5_1"
top: "conv5_1"
}
layer {
name: "conv5_2"
type: "Convolution"
bottom: "conv5_1"
top: "conv5_2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0.1 }
}
}
layer {
name: "relu5_2"
type: "ReLU"
bottom: "conv5_2"
top: "conv5_2"
}
layer {
name: "conv5_3"
type: "Convolution"
bottom: "conv5_2"
top: "conv5_3"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "relu5_3"
type: "ReLU"
bottom: "conv5_3"
top: "conv5_3"
}

#========= RPN ============

layer {
name: "rpn_conv/3x3"
type: "Convolution"
bottom: "conv5_3"
top: "rpn/output"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 512
kernel_size: 3 pad: 1 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
layer {
name: "rpn_relu/3x3"
type: "ReLU"
bottom: "rpn/output"
top: "rpn/output"
}

layer {
name: "rpn_cls_score"
type: "Convolution"
bottom: "rpn/output"
top: "rpn_cls_score"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 18 # 2(bg/fg) * 9(anchors)
kernel_size: 1 pad: 0 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}

layer {
name: "rpn_bbox_pred"
type: "Convolution"
bottom: "rpn/output"
top: "rpn_bbox_pred"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 36 # 4 * 9(anchors)
kernel_size: 1 pad: 0 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}

layer {
bottom: "rpn_cls_score"
top: "rpn_cls_score_reshape"
name: "rpn_cls_score_reshape"
type: "Reshape"
reshape_param { shape { dim: 0 dim: 2 dim: -1 dim: 0 } }
}

layer {
name: 'rpn-data'
type: 'Python'
bottom: 'rpn_cls_score'
bottom: 'gt_boxes'
bottom: 'im_info'
bottom: 'data'
top: 'rpn_labels'
top: 'rpn_bbox_targets'
top: 'rpn_bbox_inside_weights'
top: 'rpn_bbox_outside_weights'
python_param {
module: 'rpn.anchor_target_layer'
layer: 'AnchorTargetLayer'
param_str: "'feat_stride': 16"
}
}

layer {
name: "rpn_loss_cls"
type: "SoftmaxWithLoss"
bottom: "rpn_cls_score_reshape"
bottom: "rpn_labels"
propagate_down: 1
propagate_down: 0
top: "rpn_cls_loss"
loss_weight: 1
loss_param {
ignore_label: -1
normalize: true
}
}

layer {
name: "rpn_loss_bbox"
type: "SmoothL1Loss"
bottom: "rpn_bbox_pred"
bottom: "rpn_bbox_targets"
bottom: 'rpn_bbox_inside_weights'
bottom: 'rpn_bbox_outside_weights'
top: "rpn_loss_bbox"
loss_weight: 1
smooth_l1_loss_param { sigma: 3.0 }
}

#========= RoI Proposal ============

layer {
name: "rpn_cls_prob"
type: "Softmax"
bottom: "rpn_cls_score_reshape"
top: "rpn_cls_prob"
}

layer {
name: 'rpn_cls_prob_reshape'
type: 'Reshape'
bottom: 'rpn_cls_prob'
top: 'rpn_cls_prob_reshape'
reshape_param { shape { dim: 0 dim: 18 dim: -1 dim: 0 } }
}

layer {
name: 'proposal'
type: 'Python'
bottom: 'rpn_cls_prob_reshape'
bottom: 'rpn_bbox_pred'
bottom: 'im_info'
top: 'rpn_rois'
python_param {
module: 'rpn.proposal_layer'
layer: 'ProposalLayer'
param_str: "'feat_stride': 16"
}
}
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 21"
}
}

#========= RCNN ============

layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "conv5_3"
bottom: "rois"
top: "pool5"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4096
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 4096
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 1 }
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "cls_score"
type: "InnerProduct"
bottom: "fc7"
top: "cls_score"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 21
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "fc7"
top: "bbox_pred"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 84
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels"
propagate_down: 1
propagate_down: 0
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets"
bottom: "bbox_inside_weights"
bottom: "bbox_outside_weights"
top: "loss_bbox"
loss_weight: 1

}

This is faster_rcnn_end2end.sh

set -x
set -e

export PYTHONUNBUFFERED="True"

GPU_ID=$1
NET=$2
NET_lc=${NET,,}
DATASET=$3

array=( $@ )
len=${#array[@]}
EXTRA_ARGS=${array[@]:3:$len}
EXTRA_ARGS_SLUG=${EXTRA_ARGS// /_}

case $DATASET in
pascal_voc)
TRAIN_IMDB="voc_2007_trainval"
TEST_IMDB="voc_2007_test"
PT_DIR="pascal_voc"
ITERS=70000
;;
coco)
# This is a very long and slow training schedule
# You can probably use fewer iterations and reduce the
# time to the LR drop (set in the solver to 350,000 iterations).
TRAIN_IMDB="coco_2014_train"
TEST_IMDB="coco_2014_minival"
PT_DIR="coco"
ITERS=490000
;;
*)
echo "No dataset given"
exit
;;
esac

LOG="experiments/logs/faster_rcnn_end2end_${NET}_${EXTRA_ARGS_SLUG}.txt.`date +'%Y-%m-%d_%H-%M-%S'`"
exec &> >(tee -a "$LOG")
echo Logging output to "$LOG"

time ./tools/train_net.py --gpu ${GPU_ID} \
--solver models/${PT_DIR}/${NET}/faster_rcnn_end2end/solver.prototxt \
--weights data/imagenet_models/${NET}.v2.caffemodel \
--imdb ${TRAIN_IMDB} \
--iters ${ITERS} \
--cfg experiments/cfgs/faster_rcnn_end2end.yml \
${EXTRA_ARGS}

set +x
NET_FINAL=`grep -B 1 "done solving" ${LOG} | grep "Wrote snapshot" | awk '{print $4}'`
set -x

time ./tools/test_net.py --gpu ${GPU_ID} \
--def models/${PT_DIR}/${NET}/faster_rcnn_end2end/test.prototxt \
--net ${NET_FINAL} \
--imdb ${TEST_IMDB} \
--cfg experiments/cfgs/faster_rcnn_end2end.yml \
${EXTRA_ARGS}


This is the extra config file that I added on top of the original config file.

EXP_DIR: faster_rcnn_end2end
TRAIN:
  HAS_RPN: True
  IMS_PER_BATCH: 1
  BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
  RPN_POSITIVE_OVERLAP: 0.7
  RPN_BATCHSIZE: 256
  PROPOSAL_METHOD: gt
  BG_THRESH_LO: 0.0
TEST:
  HAS_RPN: True


Thanks for your help.

@tolry418 : In faster_rcnn_end2end.sh, please remove the line "--weights data/imagenet_models/${NET}.v2.caffemodel \" to avoid using the pre-trained network.
I am not sure whether it will help, but please try it.
Another tip: test with fewer iterations first (e.g. change 70000 in faster_rcnn_end2end.sh to 10000).

@tiepnh
The faster_rcnn_end2end.sh above is the original file as given; I did not touch anything.
I have already trained without the pre-trained model.

I run the test directly on the command line like this:
./tools/test_net.py --def models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt --net output/faster_rcnn_end2end/voc_2007_train/vgg16_faster_rcnn_iter_20000.caffemodel --cfg experiments/cfgs/faster_rcnn_end2end.yml

As I wrote above,
--net output/faster_rcnn_end2end/voc_2007_train/vgg16_faster_rcnn_iter_20000.caffemodel
is the model I trained without a pre-trained model.

But when I run the test,
I run into this problem:
"BB = BB[sorted_ind, :]
IndexError: too many indices for array"

Thanks for your help.

@tolry418

BB contains all boxes of some specific class. If the model doesn't find any boxes for some class from the whole test data set, BB will be an empty (1-d) array. So calling BB[sorted_ind, :] will lead to IndexError.
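
The failure described here can be reproduced outside the evaluation code. The sketch below is not the repository's evaluation function; `sort_boxes_by_confidence` is a hypothetical helper that shows the same indexing on an empty array, and one way to keep the array two-dimensional so the indexing still works when a class has no detections. It assumes NumPy is installed.

```python
import numpy as np


def sort_boxes_by_confidence(rows):
    """rows: list of [confidence, x1, y1, x2, y2] for one class."""
    # reshape(-1, 5) keeps the array 2-D even when there are no rows at all.
    data = np.array(rows, dtype=float).reshape(-1, 5)
    confidence, boxes = data[:, 0], data[:, 1:]
    order = np.argsort(-confidence)
    return boxes[order, :], confidence[order]


if __name__ == "__main__":
    empty = np.array([])  # what an empty detection list turns into
    try:
        empty[np.argsort(-empty), :]
    except IndexError as err:
        print("IndexError:", err)

    boxes, conf = sort_boxes_by_confidence([])
    print(boxes.shape)  # no error, just zero rows
```

A guard like this only hides the symptom. An empty BB for every class still means the model produced no detections.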

Maybe you should remove all lr_mult: 0 from conv1 and conv2 in the prototxt.

@tolry418, if you still haven't figured out your issue, the problem might be your first four convolution layers. In the prototxt you provided, the learning rate multiplier for these layers is 0, so the network cannot learn anything in these layers. Since these layers are important for filtering images, even after 100000 iterations the network still could not learn anything.
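
To see which layers are frozen without reading the whole file, a rough text scan is enough. This is a sketch under the assumption that the prototxt uses the `layer { ... }` syntax shown above; `frozen_layers` is a made-up helper name, not part of Caffe.

```python
import re


def frozen_layers(prototxt_text):
    """Return names of layers that have at least one `lr_mult: 0` entry."""
    text = re.sub(r"#[^\n]*", "", prototxt_text)  # ignore comments
    frozen = []
    # Split on the `layer {` keyword; each chunk then holds one layer body.
    for chunk in re.split(r"\blayer\s*\{", text)[1:]:
        name = re.search(r"\bname\s*:\s*[\"']([^\"']+)[\"']", chunk)
        mults = [float(v) for v in re.findall(r"\blr_mult\s*:\s*([-+.\deE]+)", chunk)]
        if any(m == 0.0 for m in mults):
            frozen.append(name.group(1) if name else "<unnamed>")
    return frozen


if __name__ == "__main__":
    sample = '''
    layer { name: "conv1_1" type: "Convolution"
            param { lr_mult: 0 decay_mult: 0 } param { lr_mult: 0 decay_mult: 0 } }
    layer { name: "conv3_1" type: "Convolution"
            param { lr_mult: 1 } param { lr_mult: 2 } }
    '''
    print(frozen_layers(sample))
```

For the VGG16 train.prototxt posted above, this kind of scan should flag conv1_1 through conv2_2, which are the layers that stay at their random initial values when no pre-trained weights are loaded.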

@tiepnh Hello, I have the problem of

BB = BB[sorted_ind, :]
IndexError: too many indices for array

I followed the comments above and modified train.prototxt by adding weight_filler and bias_filler, but I still got the IndexError. Then I stopped using the pretrained model VGG16.v2.caffemodel to initialize Faster R-CNN, and now I have a new error:

AssertionError: Selective search data not found at: 
/py-faster-rcnn/data/selective_search_data/voc_2007_trainval.mat

Can you help me to solve it? Thank you very much.

Hi @CassieMai:
Did you use the config from ./experiments/cfgs/faster_rcnn_end2end.yml?
Make sure PROPOSAL_METHOD is set to "gt". (By default, PROPOSAL_METHOD is set to selective_search, and I think you have no selective search data.)

@tiepnh Yes, I used faster_rcnn_end2end.yml and kept the default setting PROPOSAL_METHOD = gt. It seems the config file didn't make a difference.

EXP_DIR: faster_rcnn_end2end
TRAIN:
  HAS_RPN: True
  IMS_PER_BATCH: 1
  BBOX_NORMALIZE_TARGETS_PRECOMPUTED: True
  RPN_POSITIVE_OVERLAP: 0.7
  RPN_BATCHSIZE: 256
  PROPOSAL_METHOD: gt
  BG_THRESH_LO: 0.0
TEST:
  HAS_RPN: True

@tolry418
Hi, did you figure out the problem of training VGG16 Faster R-CNN without a pre-trained model?
I ran into the same problem. Thank you!

@CassieMai I met the same error, AssertionError: Selective search data not found at:
/py-faster-rcnn/data/selective_search_data/voc_2007_trainval.mat
How did you handle it?
Thank you very much!!

@CassieMai @whmin I met the same error: AssertionError: Selective search data not found
Did you solve it? Thanks for your answer.

@whmin @RichardMrLu Hello, it was a long time ago. I don't remember whether I solved it, because I switched to a TensorFlow version of the code instead. You may follow the posts above or try other ways. Sorry about that.

I have a question: why do we need selective search data? We use RPN instead of selective search.

@tiepnh @manipopopo Hi, since the lr_mult of the layers before conv3_1 in VGG16 is set to 0, I am not sure whether I need to add weight_filler { type: "gaussian" std: 0.01 } bias_filler { type: "constant" value: 1 } to all layers in VGG16, or just to the layers before conv3_1? Thanks for your help!

If you don't load pretrained models, you'll need to make sure that the weights of all convolution layers are initialized with some random tensors, and all convolution layers are learnable. That is, all convolution layers in prototxt should have something like
```
param {
  lr_mult: {GREATER_THAN_ZERO}
  decay_mult: ...
}
# if bias_term is true
param {
  lr_mult: {GREATER_THAN_ZERO}
  decay_mult: ...
}
convolution_param {
  ...
  weight_filler {
    # initialize weights with random values
    type: {gaussian, ...}
    ...
  }
}
```

@manipopopo OK, thanks, I get it! I have another question now: if I modify the train.prototxt for training, what about the test.prototxt? Do I need to modify it before testing and add the weights? Thanks for your help!

If you only change lr_mult and *_filler, you can use the corresponding deploy.prototxt directly.

That is to say, whether or not the method is set to 'gt', the network will use RPN to get proposals. Thanks @CassieMai @ujsyehao

Hi guys, I am getting this error when I try to run the script below.
What could be the problem?

........~/FRCN_ROOT$ ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16

+ set -e
+ export PYTHONUNBUFFERED=True
+ PYTHONUNBUFFERED=True
+ GPU_ID=1
+ NET=VGG16
+ NET_lc=vgg16
+ DATASET=
+ array=($@)
+ len=2
+ EXTRA_ARGS=
+ EXTRA_ARGS_SLUG=
+ case $DATASET in
+ echo 'No dataset given'
No dataset given
+ exit

Please let me know if someone knows about it.
Thank you

@Ram124 Please try running the following command:
$ ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16 pascal_voc

@CassieMai Thanks for your reply.
I have my own dataset.
I have it in
$ ./data/RAM_dataset/data
and under this I have
/Annotation files,
/Images,
/ImageSets.

How should I specify the input in the run command?

Hi @CassieMai,
I have solved the problem mentioned above,
but I am getting a new problem
while training the network on my own data.

I got this error.

What is the solution for this?

I0607 11:45:46.728519 2386 net.cpp:283] Network initialization done.
I0607 11:45:46.728889 2386 solver.cpp:60] Solver scaffolding done.
Loading pretrained model weights from ./data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
I0607 11:45:48.512673 2386 net.cpp:816] Ignoring source layer data
F0607 11:45:48.516479 2386 net.cpp:829] Cannot copy param 0 weights from layer 'conv3'; shape mismatch. Source param shape is 384 256 3 3 (884736); target param shape is 512 256 3 3 (1179648). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
Aborted (core dumped)

I have 2 classes.
My images are 512 * 512.

Please help if someone knows about this.

@Ram124 It looks like you did not use the parameters of ZF net to initialize your network. Have you downloaded a pre-trained ZF net model?

@CassieMai
Yes.
I had used VGG parameters,
but now I have solved it.

But I am getting a different error now.

Cannot copy param 0 weights from layer 'fc6'; shape mismatch. Source param shape is 4096 25088 (102760448); target param shape is 4096 18432 (75497472). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
Aborted (core dumped)

What is this?
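
The two numbers in this message factor neatly, which hints at the cause. One plausible reading (an assumption, not confirmed in the thread) is that the saved VGG16 weights expect RoI pooling to a 7x7 grid over 512 channels, as in the VGG16 train.prototxt posted earlier, while the prototxt being trained pools to 6x6. A minimal arithmetic sketch, with `fc6_input_size` as an illustrative helper name:

```python
def fc6_input_size(channels, pooled_h, pooled_w):
    """Number of inputs fc6 sees after RoI pooling (channels * h * w)."""
    return channels * pooled_h * pooled_w


if __name__ == "__main__":
    print(fc6_input_size(512, 7, 7))  # 25088, the source (saved) shape
    print(fc6_input_size(512, 6, 6))  # 18432, the target (prototxt) shape
```

If that reading is right, the fix is to use a prototxt and a caffemodel that belong to the same network rather than to rename the layer.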

Which backbone network are you using, ZF or VGG? Did you run the right .sh?

@CassieMai
VGG

Maybe you can check your number of classes?

@CassieMai
I have 2 classes.
In cls_score: num_output: 3
In bbox_pred: num_output: 12

Should I change the image size in dim: below?
My images are 512 * 512 grayscale.

name: "VGG_CNN_M_1024"
input: "data"
input_shape {
dim: 1
dim: 3
dim: 224
dim: 224
}
input: "im_info"
input_shape {
dim: 1
dim: 3
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}

@CassieMai
I am able to train the network.

It is running.

I0607 12:33:20.665174 2694 solver.cpp:229] Iteration 2900, loss = 0.411904
I0607 12:33:20.665235 2694 solver.cpp:245] Train net output #0: loss_bbox = 0.0577442 (* 1 = 0.0577442 loss)
I0607 12:33:20.665252 2694 solver.cpp:245] Train net output #1: loss_cls = 0.176116 (* 1 = 0.176116 loss)
I0607 12:33:20.665264 2694 solver.cpp:245] Train net output #2: rpn_cls_loss = 0.160761 (* 1 = 0.160761 loss)
I0607 12:33:20.665277 2694 solver.cpp:245] Train net output #3: rpn_loss_bbox = 0.0172822 (* 1 = 0.0172822 loss)
I0607 12:33:20.665290 2694 sgd_solver.cpp:106] Iteration 2900, lr = 0.001
I0607 12:33:23.360194 2694 solver.cpp:229] Iteration 2920, loss = 0.618303
I0607 12:33:23.360254 2694 solver.cpp:245] Train net output #0: loss_bbox = 0.188877 (* 1 = 0.188877 loss)
I0607 12:33:23.360271 2694 solver.cpp:245] Train net output #1: loss_cls = 0.336819 (* 1 = 0.336819 loss)
I0607 12:33:23.360285 2694 solver.cpp:245] Train net output #2: rpn_cls_loss = 0.0918402 (* 1 = 0.0918402 loss)
I0607 12:33:23.360297 2694 solver.cpp:245] Train net output #3: rpn_loss_bbox = 0.000766937 (* 1 = 0.000766937 loss)
I0607 12:33:23.360309 2694 sgd_solver.cpp:106] Iteration 2920, lr = 0.001

Where can I see the output?

How do I test it on another dataset?

How long does this training run?
This is the solver.prototxt:

train_net: "models/VGG16/faster_rcnn_end2end/train.prototxt"
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
stepsize: 50000
display: 20
average_loss: 100
momentum: 0.9
weight_decay: 0.0005

# We disable standard caffe solver snapshotting and implement our own snapshot
# function
snapshot: 0
# We still use the snapshot prefix, though
snapshot_prefix: "vgg_cnn_m_1024_faster_rcnn"

Please help.

@manipopopo @tiepnh Should I use a pretrained model whose classes are totally different from the classes of the model I will train?

"Do Better ImageNet Models Transfer Better?" carries out experiments on classification tasks. They show:

> ImageNet pretraining accelerates convergence and improves performance on many datasets, but its value diminishes with greater training time, more training data, and greater divergence from ImageNet labels. For some fine-grained classification datasets, a few thousand labeled examples, or a few dozen per class, are all that are needed to make training from scratch perform competitively with fine-tuning.

The effectiveness of transfer learning varies between datasets. Maybe you could try both (training from scratch and with pretrained models) and compare their performance on your validation dataset.

If you don't have enough resources, initializing the weights from pretrained models seems to be a good choice.

@manipopopo
I have done training on 2 classes.
I am getting some good output as well.

But I am not able to display 2 objects in a single frame.
When I run demo.py, I get only 1 object per image even though there are 2 objects in the image. What is the problem?

Any help is really appreciated.

Thank you

@tiepnh @manipopopo @CassieMai @karaspd
Hello guys,
I have a question about detection.
I have 2 classes (3 including background).
Suppose I feed the network training data with only 1 object per image:
100 images containing only class A and another 100 images containing only class B,
so 200 images in total with 2 classes.

I train on them,
and training finishes.

During testing,
if I give it images containing both classes (A & B),
would the network detect both objects simultaneously?

Or should I feed the network images that contain both objects?

Please clarify this,
because I am getting only 1 object detected per image when I test (after training on the 1-object-per-image dataset).

Any help is really appreciated.

> When I run demo.py, I get only 1 object per image even though there are 2 objects in the image. What is the problem?

> If I give it images containing both classes (A & B),

> I am getting only 1 object detected per image

See https://github.com/rbgirshick/py-faster-rcnn/blob/master/tools/demo.py#L90-L98
The loop visualizes one class at a time.

@manipopopo But how do I get 2 detections in the same image?
I have some hundreds of images with 2 objects in them.
I want to detect all of them and save them somewhere. What should I change to make that happen?
Do you have any idea?

You can save dets from all iterations.
It seems to me that the question is a little bit off-topic.
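
A rough sketch of what "save dets from all iterations" could look like. It assumes the layout used by the demo loop: a scores array of shape (N, C) and a boxes array of shape (N, 4*C), with class 0 as background. `collect_detections` is a hypothetical helper, not part of the repository, and NMS is deliberately left out to keep the example self-contained (the demo applies it per class before thresholding).

```python
import numpy as np


def collect_detections(scores, boxes, class_names, conf_thresh=0.8):
    """Gather detections of every class into one list.

    scores: (N, C) array, boxes: (N, 4*C) array, class_names[0] is background.
    Returns (class_name, score, [x1, y1, x2, y2]) tuples for all classes.
    """
    results = []
    for cls_ind in range(1, len(class_names)):  # skip background
        cls_scores = scores[:, cls_ind]
        cls_boxes = boxes[:, 4 * cls_ind:4 * (cls_ind + 1)]
        for i in np.where(cls_scores >= conf_thresh)[0]:
            results.append((class_names[cls_ind], float(cls_scores[i]),
                            cls_boxes[i].tolist()))
    return results
```

The returned list holds detections of all classes for one image, so it can be drawn on a single figure or written to a file in one go instead of once per class.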

How do I actually do it?
I tried all kinds of solutions
but couldn't get it to work.
Can you share the part of the code that is required for this?

@tiepnh @manipopopo @CassieMai @karaspd I have a question regarding batch size. Can we use a batch size of more than 1 in mxnet-rcnn training?
I have a large dataset of 15000 images.
When I train on them, the speed is 2.35 samples/sec.
It takes almost 4 hours per epoch.
Is there any other way I could increase the speed?

Any help is really appreciated.

Hi @Ram124, I have a similar problem:

Cannot copy param 0 weights from layer 'cls_score'; shape mismatch. Source param shape is 2 4096 (8192); target param shape is 21 4096 (86016). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
*** Check failure stack trace: ***
Aborted (core dumped)

How did you solve it?

I have solved it.
I encountered it when running ./tools/demo.py, so I changed the num_output in py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt and it works.
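
The num_output values that have to agree between the prototxt and the saved model follow a simple pattern visible in this thread: cls_score has one output per class including background, and bbox_pred has four per class. A small sketch, with `head_num_outputs` as an illustrative name:

```python
def head_num_outputs(num_object_classes):
    """num_output for cls_score and bbox_pred, counting background as a class."""
    num_classes = num_object_classes + 1
    return num_classes, 4 * num_classes


if __name__ == "__main__":
    print(head_num_outputs(20))  # PASCAL VOC: matches 21 and 84 in the prototxt above
    print(head_num_outputs(2))   # two custom classes: 3 and 12
    print(head_num_outputs(1))   # one custom class: cls_score of 2, as in the error above
```

The "2 4096" source shape in the error message therefore suggests a model trained on a single object class being loaded into a 21-class test definition.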

> @tolry418 : You should follow the earlier comment from @manipopopo.
> Or you can try this prototxt (from #56): https://github.com/rbgirshick/py-faster-rcnn/files/380443/train_val.txt
> Based on that, you no longer need a pre-trained network.

It is a good way to train the net.
