Py-faster-rcnn: Fine tuning issue

Created on 4 May 2016 · 12 Comments · Source: rbgirshick/py-faster-rcnn

I've fine-tuned the ImageNet pre-trained model (VGG16.v2.caffemodel) on my custom dataset, which has 5 categories, and it works fine.
But when I use the pascal_voc model or the coco model (VGG16_faster_rcnn_final.caffemodel or coco_vgg16_faster_rcnn_final.caffemodel) instead of the ImageNet pre-trained model, fine-tuning does not work, even though I changed num_output, num_classes, and some layer names as well.
Can anybody tell me why? Thanks.
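
For context on why layer names and num_output interact: as far as I understand Caffe's weight loading, parameters are copied from a .caffemodel by layer name. A layer whose name is absent from the file keeps its fresh initialization, and a layer whose name matches but whose blob shapes differ makes loading fail. The sketch below only simulates that rule in plain Python and does not call Caffe. The function name plan_weight_copy and all shapes are assumptions for illustration: 21 classes for the pascal_voc model, and 6 (5 categories plus background) for the custom net.

```python
def plan_weight_copy(pretrained_shapes, target_shapes):
    """Classify each target layer by what name-based copying would do.

    Both arguments map layer name -> weight shape (a tuple).
    Returns (copied, reinitialized, conflicts) as sorted lists of names.
    """
    copied, reinitialized, conflicts = [], [], []
    for name, shape in target_shapes.items():
        if name not in pretrained_shapes:
            # No layer with this name in the pretrained file: fresh weights.
            reinitialized.append(name)
        elif pretrained_shapes[name] == shape:
            # Same name and same shape: weights can be copied.
            copied.append(name)
        else:
            # Same name but different shape: loading would fail here.
            conflicts.append(name)
    return sorted(copied), sorted(reinitialized), sorted(conflicts)


# Hypothetical shapes, for illustration only.
pretrained = {"fc7": (4096, 4096), "cls_score": (21, 4096), "bbox_pred": (84, 4096)}

# Only num_output changed, names kept: the output layers conflict.
same_names = {"fc7": (4096, 4096), "cls_score": (6, 4096), "bbox_pred": (24, 4096)}

# num_output changed and output layers renamed: they start from scratch.
renamed = {"fc7": (4096, 4096), "cls_score_custom": (6, 4096), "bbox_pred_custom": (24, 4096)}

print(plan_weight_copy(pretrained, same_names))
print(plan_weight_copy(pretrained, renamed))
```

Under these assumptions, renaming the resized layers is what lets the remaining layers load cleanly, which is why any code that still refers to the old name has to be updated too.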

yeongrok

Most helpful comment

I got it! The problem comes from the snapshot wrapper in train.py. This wrapper only works if your bbox_pred layer is named 'bbox_pred'. You can easily correct this by changing the snapshot method of the SolverWrapper class. This is how I did it:

def snapshot(self):
    """Take a snapshot of the network after unnormalizing the learned
    bounding-box regression weights. This enables easy use at test-time.
    """

    # **********************************************************************************
    # CORRECTION (Hugues THOMAS 17/06/2016)

    # Put here the new name of your bbox_pred layer (originally it was 'bbox_pred')
    my_bbox_pred = 'bbox_pred_KTP'

    # **********************************************************************************

    net = self.solver.net
    scale_bbox_params = (cfg.TRAIN.BBOX_REG and
                         cfg.TRAIN.BBOX_NORMALIZE_TARGETS and
                         my_bbox_pred in net.params)

    if scale_bbox_params:
        # save original values
        orig_0 = net.params[my_bbox_pred][0].data.copy()
        orig_1 = net.params[my_bbox_pred][1].data.copy()

        # scale and shift with bbox reg unnormalization; then save snapshot
        net.params[my_bbox_pred][0].data[...] = \
                (net.params[my_bbox_pred][0].data *
                 self.bbox_stds[:, np.newaxis])
        net.params[my_bbox_pred][1].data[...] = \
                (net.params[my_bbox_pred][1].data *
                 self.bbox_stds + self.bbox_means)

    infix = ('_' + cfg.TRAIN.SNAPSHOT_INFIX
             if cfg.TRAIN.SNAPSHOT_INFIX != '' else '')

    filename = (self.solver_param.snapshot_prefix + infix +
                '_iter_{:d}'.format(self.solver.iter) + '.caffemodel')

    filename = os.path.join(self.output_dir, filename)

    net.save(str(filename))
    print('Wrote snapshot to: {:s}'.format(filename))

    if scale_bbox_params:
        # restore net to original state
        net.params[my_bbox_pred][0].data[...] = orig_0
        net.params[my_bbox_pred][1].data[...] = orig_1
    return filename

HuguesTHOMAS on 17 Jun 2016
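
To see why the snapshot step matters, here is a small NumPy check of the arithmetic it relies on. It is a sketch with made-up sizes and random values, not code from the repository. It shows that scaling the weights by the stds and mapping the bias to bias * stds + means gives the same outputs as running the trained layer and unnormalizing its predictions afterwards. If the layer name does not match, this folding is skipped and the saved model predicts normalized deltas at test time.

```python
import numpy as np

rng = np.random.RandomState(0)
num_outputs, num_features = 8, 16  # hypothetical sizes, for illustration only

# Stand-ins for the trained bbox_pred parameters and the target statistics.
weights = rng.randn(num_outputs, num_features)
bias = rng.randn(num_outputs)
bbox_stds = rng.uniform(0.1, 0.2, size=num_outputs)
bbox_means = rng.randn(num_outputs) * 0.01
features = rng.randn(5, num_features)

# Option 1: run the layer as trained, then unnormalize its outputs.
normalized = features.dot(weights.T) + bias
unnormalized_after = normalized * bbox_stds + bbox_means

# Option 2: fold the statistics into the parameters, as snapshot() does.
folded_weights = weights * bbox_stds[:, np.newaxis]
folded_bias = bias * bbox_stds + bbox_means
unnormalized_folded = features.dot(folded_weights.T) + folded_bias

outputs_match = np.allclose(unnormalized_after, unnormalized_folded)
print(outputs_match)
```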

All 12 comments

I used the coco model to fine-tune on the pascal voc dataset; however, the AP I got for each class is very low (<= 0.1).

wait1988 on 4 May 2016

I am currently trying to fine-tune the ImageNet model on a custom dataset (KTP, for people detection) that has only 2 categories, background and people. If I succeed, I will try the pascal or coco model, so I am very interested to know whether you found a solution to this problem.

HuguesTHOMAS on 9 Jun 2016

@wait1988 Same here: I get very poor results when fine-tuning from a model other than the ImageNet pre-trained one. I investigated, and for me the cause is that you have to change layer names.
In fact, when training from the original ImageNet model, the layers are different, so you don't have to change any layer name in the train and test prototxt; I did it anyway just to see what happens, and I get very low AP too.

I have not found a solution to this problem yet, but I continue to investigate. If anyone here has a clue, I would be glad to hear it.

HuguesTHOMAS on 17 Jun 2016

I'm facing the same issue: I am trying to fine-tune the ZF faster rcnn final model on my own dataset, where I have only about 100 samples per class. I've changed the train snapshot accordingly, but it still gives me very low mAP. What's strange is that the training loss converged to 0, but when I test on the training set itself, I get less than 0.1 mAP. I looked at the RPN output, and it looks like it's not producing very good proposals. Any help is much appreciated!

sssruhan1 on 30 Sep 2016
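
One thing worth ruling out in a case like this is a silent name mismatch. In the snapshot code above, if the configured name is not a key of net.params, scale_bbox_params is simply False and the weights are saved without unnormalization, with no warning. The helper below is a hypothetical sketch (resolve_bbox_pred_name is not part of py-faster-rcnn) that fails loudly instead and lists the layer names that look like candidates.

```python
def resolve_bbox_pred_name(param_names, configured_name):
    """Return configured_name if it is an existing layer, else raise KeyError.

    param_names: iterable of layer names that have parameters.
    configured_name: the name the snapshot code is set up to rescale.
    """
    names = list(param_names)
    if configured_name in names:
        return configured_name
    # Help the user spot a typo or a forgotten rename.
    candidates = [n for n in names if n.startswith("bbox_pred")]
    raise KeyError(
        "no layer named {!r}; layers starting with 'bbox_pred': {}".format(
            configured_name, candidates))


# Example with made-up layer names.
layer_names = ["conv1_1", "cls_score_KTP", "bbox_pred_KTP"]
print(resolve_bbox_pred_name(layer_names, "bbox_pred_KTP"))
```

Inside snapshot(), the same check could be fed with the keys of net.params, so that a wrong name stops the run instead of producing an unusable snapshot.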

@HuguesTHOMAS Hello, after modifying the Python script as you described, did you get good results when fine-tuning from the MS COCO model? Thank you.

yanxp on 22 Nov 2016

Yes I did

HuguesTHOMAS on 23 Nov 2016

@HuguesTHOMAS Hello, can you tell me some details? I changed the Python script as you said and renamed layers in train.prototxt, for example rpn_cls_score to rpn_cls_score_layer, but I still get low mAP when fine-tuning from the coco model. Thank you very much.

yanxp on 27 Nov 2016

@yanxp It's hard for me to answer, as I stopped working on this project some months ago. From what I remember, a low AP can be caused by several different errors, and you have to modify many of the Python scripts. In my case I was using the "fast rcnn" version of the network, so you might run into different problems. I cannot list everything I modified, but there was a lot. Try to follow the execution of the code and use print statements to check that everything is okay at each step. Doing that, I managed to get everything right. What I explained in my previous message was the last error that was blocking me.

HuguesTHOMAS on 28 Nov 2016
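
As one concrete form of the "print and check at each step" advice, the sketch below summarizes every parameterized layer: its name, weight shape, and mean absolute weight. It assumes an object shaped like pycaffe's net.params, that is, a mapping from layer name to a list of blobs whose .data attribute is a NumPy array. The function name summarize_params and the stand-in data are hypothetical, and the stand-ins use Python 3's SimpleNamespace so the example runs without Caffe.

```python
from types import SimpleNamespace

import numpy as np


def summarize_params(params):
    """Return (name, weight_shape, mean_abs_weight) for each layer.

    params maps layer name -> list of blobs; blobs[0].data holds the weights.
    """
    rows = []
    for name, blobs in params.items():
        weights = np.asarray(blobs[0].data)
        rows.append((name, weights.shape, float(np.abs(weights).mean())))
    return rows


# Stand-in for net.params, for illustration only.
fake_params = {
    "fc7": [SimpleNamespace(data=np.ones((4, 4)))],
    "bbox_pred_KTP": [SimpleNamespace(data=np.zeros((8, 4)))],
}

for name, shape, mean_abs in summarize_params(fake_params):
    print("{:<16s} {!s:<10} mean|w| = {:.4f}".format(name, shape, mean_abs))
```

A layer that reports an unexpected shape, or weights that are all zero after loading a model, is a hint that it was renamed or not copied.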

Hi! After training on my own dataset, the evaluation reports a background AP of 0 and an AP of 0.905 for my class of interest. Why is the background AP 0, and does it matter?

Leerw on 15 Oct 2017

Here is the text:

AP for __backgroud__ = 0.0000
AP for myclass = 0.9025
Mean AP = 0.4513
~~
Results:
0.000
0.903
0.451
~~

Leerw on 15 Oct 2017
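
The reported Mean AP is consistent with a plain average over both listed entries, the background row included: (0.0000 + 0.9025) / 2 ≈ 0.4513. The sketch below just redoes that arithmetic with the numbers from the output above. The function mean_ap and its skip parameter are hypothetical names, not part of py-faster-rcnn.

```python
def mean_ap(aps, skip=()):
    """Average the AP values, leaving out any class names listed in skip."""
    kept = [ap for name, ap in aps.items() if name not in skip]
    return sum(kept) / len(kept)


# Values copied from the output above (class names as printed there).
reported = {"__backgroud__": 0.0, "myclass": 0.9025}

mean_all = mean_ap(reported)
mean_foreground = mean_ap(reported, skip=("__backgroud__",))
print(mean_all, mean_foreground)
```

So with these numbers the zero background row halves the reported mean, while the AP of the actual class is unchanged.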

I managed to make my net work, but not the way @HuguesTHOMAS did. I tried your suggestion, yet judging from the output of demo.py the net still doesn't work: it still draws a lot of bboxes with a confidence of about 1.0, which makes no sense.
But I'm sure this issue is triggered by the last "bbox_pred_.." layer, since when I give that layer a name in test.prototxt that differs from the one in train.prototxt, it somehow gives correct output! In my case, the bbox_pred layer in my train.prototxt is "bbox_pred_robot", and the net only works if I rename that layer in test.prototxt to something like "bbox_pred_robot_abc". I know this is very strange, because doing so would discard the trained weights for that layer, yet it works. Does anybody have any ideas about this?