Keras-retinanet: Pretrained models for other backbone models

Created on 24 Feb 2018 · 25 Comments · Source: fizyr/keras-retinanet

Hi,

Thank you for the great work!
Is there any chance you may release the pretrained models for other backbone models, e.g. resnet101, resnet152 or mobilenet128_1.0, mobilenet128_0.75, mobilenet160_1.0? Currently we only have pretrained models for resnet50.

That would be super helpful for transfer learning. Otherwise, I might need to train on COCO from scratch.

Thanks a lot!

help wanted

All 25 comments

Considering our resources for this project are limited, we don't provide pretrained models for the other architectures. If in the future we happen to have trained these architectures on COCO, we will probably make them publicly available. For now, your best bet is to start with ImageNet-trained weights and then fine-tune on COCO or your own dataset.

Cool. Thanks for letting me know!

I assigned the label help wanted. We'd be happy to add pretrained (COCO/Pascal) networks to this repository if they are provided to us, but there is a risk that the architecture changes which causes those models to become obsolete. If that is the case, we likely won't update the pretrained models (except for ResNet50 on COCO).

For training on COCO, what are the parameters? batch_size=1, flip_x augmentation? (if that matters)

Yes, models might change a bit. It might be a good idea to use the official keras repository models (from applications), the ones from https://github.com/keras-team/keras-contrib or copy them directly in this repository (but we still need to link to the imagenet weights).

> For training on COCO, what are the parameters? batch_size=1, flip_x augmentation? (if that matters)

Yeah, only those.
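For readers unfamiliar with the flip_x option mentioned above, here is a minimal sketch of what a horizontal flip does to box annotations. It is an illustration only, not the repository's code; the function name and the (x1, y1, x2, y2) pixel box layout are assumptions made for this example.

```python
def flip_boxes_x(image_width, boxes):
    """Mirror boxes horizontally; each box is (x1, y1, x2, y2) in pixels."""
    flipped = []
    for x1, y1, x2, y2 in boxes:
        # The old right edge becomes the new left edge, and vice versa,
        # so x1 <= x2 still holds after the flip. y coordinates are unchanged.
        flipped.append((image_width - x2, y1, image_width - x1, y2))
    return flipped


boxes = [(10, 20, 110, 220)]
print(flip_boxes_x(640, boxes))  # [(530, 20, 630, 220)]
```

Flipping twice returns the original boxes, which is a handy sanity check when writing a custom augmentation.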

Started training on COCO (train2017 + val2017) using mobilenet224_1.0 + batch_size=1 + flip_x + image_min_side=800, image_max_side=1333 + NMS performed per class + FPN correction.
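As a rough illustration of the image_min_side / image_max_side settings, the sketch below assumes the usual interpretation: scale the image so its shorter side reaches image_min_side, unless that would push the longer side past image_max_side, in which case the longer side is capped instead. The function name is chosen for this example and the behaviour is an assumption, not a quote of the project's code.

```python
def compute_resize_scale(height, width, min_side=800, max_side=1333):
    """Return the factor by which an image of the given size would be scaled."""
    smallest = min(height, width)
    largest = max(height, width)

    # First try to bring the shorter side up (or down) to min_side.
    scale = min_side / smallest

    # If the longer side would then exceed max_side, cap it instead.
    if largest * scale > max_side:
        scale = max_side / largest
    return scale


scale = compute_resize_scale(480, 640)
print(round(480 * scale), round(640 * scale))  # 800 1067
```

For a very elongated image such as 500x1000, the cap applies and the scale becomes 1333 / 1000, so the shorter side ends up below 800.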

I'll keep updating this post with training results.

Epoch 2:
10000/10000 [==============================] - 2939s 294ms/step - loss: 3.7631 - regression_loss: 2.8676 - classification_loss: 0.8955

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.004
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.009
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.002
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.002
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.005
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.005
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.049
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.097
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.102
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.076
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.169

Epoch 4:
10000/10000 [==============================] - 2810s 281ms/step - loss: 3.3364 - regression_loss: 2.5441 - classification_loss: 0.7923

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.009
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.020
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.007
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.006
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.014
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.011
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.088
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.197
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.231
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.096
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.239
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.348

Epoch 6:
10000/10000 [==============================] - 2757s 276ms/step - loss: 3.1225 - regression_loss: 2.3957 - classification_loss: 0.7268

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.019
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.040
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.016
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.013
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.029
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.023
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.111
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.227
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.259
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.103
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.271
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.384

Epoch 8:
10000/10000 [==============================] - 4962s 496ms/step - loss: 2.9642 - regression_loss: 2.2937 - classification_loss: 0.6706

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.030
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.060
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.025
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.017
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.040
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.037
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.126
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.255
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.301
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.137
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.329
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.424

Epoch 11:
10000/10000 [==============================] - 3330s 333ms/step - loss: 2.7947 - regression_loss: 2.1787 - classification_loss: 0.6160

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.046
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.090
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.041
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.024
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.062
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.057
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.140
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.278
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.332
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.169
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.369
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.448

Epoch 13:
10000/10000 [==============================] - 28566s 3s/step - loss: 2.7168 - regression_loss: 2.1301 - classification_loss: 0.5867

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.057
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.108
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.053
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.032
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.076
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.070
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.152
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.296
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.351
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.181
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.390
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.468

Epoch 16:
10000/10000 [==============================] - 8350s 835ms/step - loss: 2.6045 - regression_loss: 2.0449 - classification_loss: 0.5596

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.067
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.128
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.063
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.034
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.084
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.084
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.160
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.300
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.347
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.186
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.381
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.455

Epoch 18:
10000/10000 [==============================] - 21270s 2s/step - loss: 2.5862 - regression_loss: 2.0303 - classification_loss: 0.5559

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.070
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.133
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.066
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.036
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.089
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.086
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.157
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.295
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.334
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.177
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.365
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.443

Epoch 20:
10000/10000 [==============================] - 2790s 279ms/step - loss: 2.4398 - regression_loss: 1.9232 - classification_loss: 0.5166

Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.085
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.158
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.083
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.043
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.103
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.104
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.170
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.320
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.369
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.211
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.401
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.477
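The COCO summaries above follow a fixed line layout, so the headline number (AP at IoU=0.50:0.95, all areas, maxDets=100) can be pulled out of a saved log mechanically to follow the progression per epoch. The following is a small sketch under that assumption; the regular expression and the helper names are made up for this example, and the sample text is copied from the epoch 20 summary.

```python
import re

# Matches one line of the summary printed after each evaluation.
SUMMARY_LINE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?[\d.]+)"
)


def parse_summary(text):
    """Return one dict per summary line found in the text."""
    rows = []
    for metric, iou, area, max_dets, value in SUMMARY_LINE.findall(text):
        rows.append({
            "metric": metric,
            "iou": iou,
            "area": area,
            "max_dets": int(max_dets),
            "value": float(value),
        })
    return rows


def primary_ap(text):
    """Return AP @ IoU=0.50:0.95, area=all, maxDets=100, or None if absent."""
    for row in parse_summary(text):
        if (row["metric"], row["iou"], row["area"], row["max_dets"]) == (
            "AP", "0.50:0.95", "all", 100
        ):
            return row["value"]
    return None


sample = """\
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.085
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.158
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.170
"""
print(primary_ap(sample))  # 0.085
```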

@lvaleriu How is your MobileNet training going? My training on COCO with densenet169 as the backbone gives an mAP of only 0.028 at epoch 24.

@panda9095 Very bad. So I'll start again using the FPN correction.

Actually it got merged into master.

Could I use the MobileNet initial weights from here?
https://github.com/experiencor/basic-yolo-keras
Would that work?

@panda9095 Started training mobilenet on coco again. I'll update the previous comment with the results after each epoch.

@panda9095 It seems better now.
@hgaiser Can you take a look at the learning progression? I've never trained resnet50 from scratch on COCO, so I don't have a reference for the learning curve.

Here are the results after training mobilenet224_1.0 for 140+ epochs (keras-retinanet 0.2, batch_size=1).
Every epoch takes 60 min on my single 1080 Ti. The GPU utilization is 90%+.
The red line is mobilenet224_1.0 and the orange line is res50_retinanet. It seems that the loss decreases very slowly.
The learning rate changes because I resumed training from epoch 100 using the --weights option.
[Images: three screenshots of the training curves, 2018-03-10]

@lvaleriu, can you please explain why we get 6 values each for precision and recall?

As defined in http://cocodataset.org/#detection-eval, here are the 12 metrics:

[Image: table of the 12 COCO detection metrics]
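In short, the 6 AP values are AP averaged over IoU thresholds 0.50 to 0.95, AP at IoU=0.50, AP at IoU=0.75, and AP for small, medium and large objects. The 6 AR values are AR with at most 1, 10 and 100 detections per image, and AR for small, medium and large objects. The sketch below illustrates the two ingredients behind those rows, the IoU thresholds and the object-size buckets. It is an illustration with names chosen for this example, not the evaluation code itself, and the handling of areas exactly on a bucket boundary is a simplification.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# "IoU=0.50:0.95" means: average over these ten thresholds.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def size_bucket(area):
    """Classify an object by its area in pixels (COCO size ranges)."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
# Count at how many of the ten thresholds this pair would count as a match.
print(sum(overlap >= t for t in IOU_THRESHOLDS))  # 0, since IoU is 1/3
print(size_bucket(50 * 50))  # medium
```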

@lvaleriu Thanks.

Actually, I need support for more powerful backbones, such as ResNeXt or SE-ResNeXt. Of course I tried it myself, but the performance dropped a little when I expected it to be higher. Maybe that's because I used a custom dataset that contains about 10K images. I will train on COCO. If anyone has ideas for higher performance, I would be grateful.

PRs for those backbones would be very welcome.

Pretraining on COCO sounds like the right thing to do, it also gives you a better measure of how well the backbone works.

Hey all!
I just tried to train a net with mobilenet160_0.75 as the backbone. I just added "--backbone mobilenet160_0.75" to the command provided in the README.md for training on CSV datasets. It throws an error while creating the MobileNet in the keras-applications site-package. Did I forget an argument?

That's better suited for a separate issue (also, mention the error, it helps to find the cause).

@lvaleriu could you please share the pretrained models for the mobilenet128_1.0 backbone with me?

For MobileNet, I saw that keras-retinanet is used for vehicle detection:
https://github.com/yangliupku/retinanet_detection
Can someone merge it?

> Actually, I need support for more powerful backbones, such as ResNeXt or SE-ResNeXt. Of course I tried it myself, but the performance dropped a little when I expected it to be higher. Maybe that's because I used a custom dataset that contains about 10K images. I will train on COCO. If anyone has ideas for higher performance, I would be grateful.

Where did you get the pretrained weights for ResNeXt? Which implementation of ResNeXt did you follow?

> Actually, I need support for more powerful backbones, such as ResNeXt or SE-ResNeXt. Of course I tried it myself, but the performance dropped a little when I expected it to be higher. Maybe that's because I used a custom dataset that contains about 10K images. I will train on COCO. If anyone has ideas for higher performance, I would be grateful.

Did you train ResNeXt on COCO? If yes, can you please provide me with the pretrained model?
