Mask_RCNN: Fails to continue training from the trained weights

Created on 3 Jul 2018 · 2 Comments · Source: matterport/Mask_RCNN

Following the Python script balloon.py, I created a similar script, card.py, and trained the network on my own dataset for 5 epochs. The trained weights are stored in mask_rcnn_card_0005.h5.

I then wanted to continue training the network from these weights, so I ran the following (in Spyder):
run card.py train --dataset=/path/to/datasets/card/ --weights=/path/to/mask_rcnn_card_0005.h5

It prints:

 ......
Loading weights  /path/to/mask_rcnn_card_0005.h5
Re-starting from epoch 5
Training network heads

Starting at epoch 5. LR=0.001

Checkpoint Path:  /path/to/mask_rcnn_card_{epoch:04d}.h5
Selecting layers to train
fpn_c5p5               (Conv2D)
...
...
mrcnn_mask          (TimeDistributed)

Then the program just stops here and does not produce any results.
Is there something I am missing? Thanks.

All 2 comments

@bennyphtam Could you share your solution here, please? I have the same issue.

@thhung

If you pass the argument --weights=/path/to/logs/mask_rcnn_card_0005.h5 with
epochs=5 set in the card.py script, it will not continue training, because
the network has already been trained for 5 epochs (the ..._0005.h5 tag indicates 5 completed epochs). Here epochs is the total number of epochs to reach, not the number of additional epochs.

Two ways to train for 5 more epochs:

  1. Move the weights file mask_rcnn_card_0005.h5 out of the logs/ directory and keep epochs=5 in card.py. Training then starts from these weights with the epoch counter reset and runs for 5 epochs. The NEW weights are saved as mask_rcnn_card_0001.h5, ..., mask_rcnn_card_0005.h5.
  2. Set epochs=10 in card.py and leave mask_rcnn_card_0005.h5 inside the logs/ directory. You should then see the output "Re-starting from epoch 5". The NEW weights are saved as mask_rcnn_card_0006.h5, ..., mask_rcnn_card_0010.h5.
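
To make the epoch arithmetic concrete, here is a minimal Python sketch. It assumes that epochs is the final epoch number rather than an increment, and that checkpoints are numbered starting from the epoch after the starting one. planned_checkpoints is a hypothetical helper written for illustration only; it is not part of Mask_RCNN.

```python
# Checkpoint name pattern, taken from the "Checkpoint Path" line in the log.
TEMPLATE = "mask_rcnn_card_{epoch:04d}.h5"


def planned_checkpoints(start_epoch, total_epochs, template=TEMPLATE):
    """Hypothetical helper: list the checkpoint files a run would write.

    Assumes `total_epochs` is the final epoch number to reach, so a run
    that starts at `start_epoch` covers epochs start_epoch+1 .. total_epochs.
    """
    return [template.format(epoch=e)
            for e in range(start_epoch + 1, total_epochs + 1)]


# Resuming at epoch 5 with epochs=5: nothing left to train.
print(planned_checkpoints(5, 5))
# Option 1: epoch counter reset to 0, epochs=5.
print(planned_checkpoints(0, 5))
# Option 2: resuming at epoch 5, epochs=10.
print(planned_checkpoints(5, 10))
```

The first call returns an empty list, which matches the run that stops right after "Selecting layers to train". The other two calls list the file names given in options 1 and 2.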

Hope this helps :)
