Hello AlexeyAB, I'm a student in mathematics, and as part of my internship I am working on object detection. I use your darknet version and I'm very satisfied with it, but I have some questions to ask you. I have to train my detector with 4 objects (facade_7705, card1, card2, Fan_module) with 1848.
My main concern is the risk of overfitting. I don't quite understand how you check for this in your modeling. I wanted to start by partitioning the data into three sets:
Train: For training
Valid: For validation
Test: For the final test, to check that the model is not overfitting
The configuration you suggest is as follows:
classes= 4
train = data/train.txt
valid = data/valid.txt
names = data/obj.names
backup = backup/
When I run:
darknet.exe detector map data/obj.data yolo-obj.cfg backup\yolo-obj_7000.weights
I get the following table, which corresponds to the results of my model on the Train and Valid data. Now I would like to know whether it is possible to produce such a table with the Test data, to make sure that we do not have an overfitting problem. Is it really necessary to do this?
@HerbertGourout Hi,
Yes, just set valid = data/test.txt in obj.data and run the same command:
darknet.exe detector map data/obj.data yolo-obj.cfg backup\yolo-obj_7000.weights
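The change to obj.data described above can be made by hand. As a minimal sketch, the following Python helper does the same edit on the text of the file; the function name `point_valid_at` is hypothetical and is not part of darknet, and the example text is the configuration quoted in the question.

```python
def point_valid_at(data_text: str, new_valid: str) -> str:
    """Return obj.data text with the `valid` entry replaced by new_valid.

    Hypothetical helper for illustration; darknet itself only reads the file.
    """
    out = []
    replaced = False
    for line in data_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key == "valid":
            # Swap the validation list for the held-out test list.
            out.append(f"valid = {new_valid}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        # No `valid` entry was present, so add one.
        out.append(f"valid = {new_valid}")
    return "\n".join(out) + "\n"


# The configuration from the question, with `valid` redirected to the test list.
example = """classes= 4
train = data/train.txt
valid = data/valid.txt
names = data/obj.names
backup = backup/
"""
print(point_valid_at(example, "data/test.txt"))
```

Saving the result under a separate name (for example a copy of obj.data used only for the test run) keeps the original training configuration untouched.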
It is necessary to divide the dataset into 3 parts (train, val, test) only in 2 cases:
(train ~5%, val ~80%, test ~15%) - if you want to train on a small Train-set, then fine-tune on the Val-set with most of the layers frozen, and then check mAP on the Test-set
(train ~80%, val ~10%, test ~10%) - if you want to use double-blind checking: you send the Train and Val sets to another person, who trains on the Train-set and checks mAP on the Val-set; then, to check whether he cheated you, you receive the model from this person and check it on the Test-set, which he didn't have
In other cases you should divide the dataset into only 2 parts (train, val)
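A split such as the (train ~80%, val ~10%, test ~10%) case above could be produced along these lines. This is only a sketch: the function name `split_dataset`, the fixed seed, and the image paths in the example are assumptions for illustration, and each resulting list would still need to be written, one image path per line, to files such as data/train.txt, data/valid.txt and data/test.txt.

```python
import random


def split_dataset(image_paths, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle image paths and cut them into (train, val, test) lists.

    Hypothetical helper; `fractions` gives the train and val shares,
    and the test list receives whatever remains.
    """
    paths = sorted(image_paths)          # stable starting order
    random.Random(seed).shuffle(paths)   # reproducible shuffle
    n = len(paths)
    n_train = int(n * fractions[0])
    n_val = int(n * fractions[1])
    train = paths[:n_train]
    val = paths[n_train:n_train + n_val]
    test = paths[n_train + n_val:]       # remainder, so no image is lost
    return train, val, test


# Example with 100 made-up image paths (not the real dataset).
images = [f"data/obj/img_{i:04d}.jpg" for i in range(100)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))
```

Passing `fractions=(0.05, 0.8, 0.15)` would give the first case instead. The split here is purely random; it does not check how the classes are distributed across the three lists.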
Thank you for your helpful reply.
1) I would still like to know whether every object class needs to be represented in each partition?
2) What exactly is a small train-set for object detection ?
5-15% of the dataset, as I specified in the previous answer
Thanks