Yolov3: Can't reproduce your results

Created on 26 May 2020 · 8 Comments · Source: ultralytics/yolov3

Hi, guys!

I can't reproduce your results.

I looked at only 11 classes (person, bicycle, car, motorcycle, bus, truck, cat, dog, cow, bird, bear).
The metrics of my trained model are lower than those of your model on these classes.

I used this command to start training:
python train.py --data coco2017.data --weights '' --batch-size 32 --cfg yolov3-spp.cfg --multi

After training (300 epochs), I tested the model and got the following metrics:
mAP@0.5 = 0.739, precision = 0.578, recall = 0.788

But when I tested your yolov3-spp-ultralytics (608) model, I got different results:
mAP@0.5 = 0.809, precision = 0.623, recall = 0.836

Could you tell me what I did wrong, and why I got these results after training?

Commit: b2fcfc573e5418c0b2ef0c0357bf51bc5cb027b6

Thank you!

Alexey-Miliutin

Most helpful comment

Btw, we had a feature request for an mp spawn implementation, which may help speed up multi-GPU training, but this has not been done yet, so multi-GPU currently runs on distributed data parallel. You don't need to take any special action; if no device is specified, it will use all of your devices.

glenn-jocher on 27 May 2020

👍2

All 8 comments

@Alexey-Miliutin there is a section in the README devoted exactly to reproducing our training results. It has code you can literally copy and paste.
https://github.com/ultralytics/yolov3#reproduce-our-results

glenn-jocher on 26 May 2020

@glenn-jocher OK, I'll try it. Could you tell me what effective batch size I should use if I have 4 GPUs (V100, 16 GB)?

Alexey-Miliutin on 27 May 2020

It shouldn’t matter all that much. If you can do batch size 64 I would use that.

glenn-jocher on 27 May 2020

👍1

glenn-jocher on 27 May 2020

👍2

Hi,
I trained a model with the flags from your guide:
python train.py --data coco2014.data --weights '' --batch-size 64 --cfg yolov3-spp.cfg

But I got bad results.

For example, my model gives these metrics on the coco2017 validation set:
Precision: 0.44, Recall: 0.537, mAP@0.5: 0.481, F1: 0.478
More details: yolov3-spp-ultralytics.txt

Your model (yolov3-spp-ultralytics.pt), in contrast, gives these metrics:
Precision: 0.536, Recall: 0.786, mAP@0.5: 0.729, F1: 0.633
More details: yolov3-spp-coco2014.txt

Both models were tested with the following command:
python test.py --cfg yolov3_spp.cfg --weights model.pt --img-size 608 --data coco2017.data

Could you tell me why my model can't achieve results as good as those in your guide?
I have run several experiments and still haven't reached the metrics of your model, so I don't understand what I'm doing wrong.

Thank you

Alexey-Miliutin on 1 Jun 2020
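When reading the numbers in the comment above, note that the reported F1 values do not equal the harmonic mean of the reported precision and recall. One plausible explanation, which is an assumption and not something confirmed in this thread, is that F1 is computed per class and then averaged. The sketch below only does the arithmetic on the quoted values; the function name `f1_score` is illustrative and not taken from the repository.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Mean precision/recall quoted above for the coco2017 validation set.
own_f1 = f1_score(0.44, 0.537)        # thread reports F1 = 0.478
reference_f1 = f1_score(0.536, 0.786)  # thread reports F1 = 0.633

print(f"own model:       {own_f1:.3f}")
print(f"reference model: {reference_f1:.3f}")
```

Both recomputed values come out slightly higher than the reported ones, so the reported F1 should be compared only against other F1 values produced by the same test script.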

@Alexey-Miliutin all pretrained checkpoints in this repo are trained on coco2014, so you cannot evaluate them on the coco2017 dataset; you will get misleading results. To reproduce our training, see https://github.com/ultralytics/yolov3#reproduce-our-results. There is nothing more to do beyond following the directions there.

glenn-jocher on 1 Jun 2020

@glenn-jocher, as I said, I trained the model on coco2014.
Last time I tested on coco2017; now I have tested on coco2014 and got the following results:

My model:
Precision: 0.356, Recall: 0.609, mAP@0.5: 0.507, F1: 0.435
More details: test coco2014.txt

Yolov3-spp-ultralytics model:
Precision: 0.43, Recall: 0.834, mAP@0.5: 0.754, F1: 0.562
More details: test coco2014.ultralytics.txt

Both models were tested with the following command:
python test.py --cfg yolov3_spp.cfg --weights model.pt --img-size 608 --data coco2014.data --augment

As you can see, the metrics differ between the two models even though they were trained with the same approach. The difference shows up both on coco2017 (in my previous comment) and on coco2014.

So my question is: how can this be?

Thank you for your answer!

Alexey-Miliutin on 2 Jun 2020
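The size of the gap on coco2014 can be tabulated with a few lines of Python. This is a minimal sketch that uses only the eight numbers quoted in the comment above; the names `mine`, `reference`, and `metric_gaps` are illustrative.

```python
# Metrics quoted in the comment above (coco2014, --img-size 608, --augment).
mine = {"precision": 0.356, "recall": 0.609, "mAP@0.5": 0.507, "F1": 0.435}
reference = {"precision": 0.43, "recall": 0.834, "mAP@0.5": 0.754, "F1": 0.562}

def metric_gaps(own, ref):
    """Return the reference-minus-own difference for every metric."""
    return {name: round(ref[name] - own[name], 3) for name in own}

gaps = metric_gaps(mine, reference)
for name, gap in gaps.items():
    print(f"{name:>9}: {mine[name]:.3f} vs {reference[name]:.3f} (gap {gap:+.3f})")
```

The largest differences are in recall and mAP@0.5, both above 0.2 in absolute terms.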

@Alexey-Miliutin the commands to reproduce training are shown in the README. There is nothing more to add beyond these. They are the same commands we use to train our models for this repo.
https://github.com/ultralytics/yolov3#reproduce-our-results

If you see different results it is because you are not reproducing our methods correctly.

glenn-jocher on 2 Jun 2020
