Darknet: No detection on custom trained model.

Created on 13 Jun 2018 · 8 Comments · Source: pjreddie/darknet

Using the latest darknet repo, trained my network using custom labels and images in this format:

chars.data:

classes=6
train=/mnt/c/Users/hello/Documents/github/darknet/chardata/train.txt
valid=/mnt/c/Users/hello/Documents/github/darknet/chardata/val.txt
names=/mnt/c/Users/hello/Documents/github/darknet/chardata/custom.names
backup=backup

train.txt and val.txt each have newline-separated paths to the images used.
The label txt files are formatted as:

<index of label e.g. 3> <center of bbox x> <center of bbox y> <width of bbox> <height of bbox>
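
As a rough illustration of that label format: Darknet expects the four box values as fractions of the image width and height, not as pixels. The sketch below converts a pixel-space corner box into one label line. The function name, the example box, and the reading of "512x1024" as width 512 and height 1024 are assumptions made for illustration only.

```python
def to_yolo_label(class_index, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box into one Darknet label line.

    All four box values are normalized by the image size, so they
    should end up in the range 0..1.
    """
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_index} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box on a 512 (wide) x 1024 (high) image, class index 3.
line = to_yolo_label(3, 128, 256, 384, 768, img_w=512, img_h=1024)
print(line)  # 3 0.500000 0.500000 0.500000 0.500000
```

If the label files hold pixel values instead of fractions, training can still run while the boxes it learns are meaningless, so this is worth ruling out early.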

custom.names:

Archer
Dragon
Wizard
Hydra
Soldier
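
One thing that can be checked mechanically: the listing above shows five names, while chars.data declares classes=6. Whether a name was simply lost when the issue was written up is not known. A minimal sketch of such a consistency check follows; the helper names are hypothetical and the file contents are inlined as strings so that the snippet runs on its own.

```python
def read_data_file(text):
    """Parse the key=value lines of a Darknet .data file into a dict."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def count_names(text):
    """Count the non-empty lines of a .names file."""
    return sum(1 for line in text.splitlines() if line.strip())


# Contents copied from the post (paths shortened for the example).
data_text = "classes=6\nnames=custom.names\nbackup=backup\n"
names_text = "Archer\nDragon\nWizard\nHydra\nSoldier\n"

declared = int(read_data_file(data_text)["classes"])
listed = count_names(names_text)
print(declared, listed, declared == listed)  # 6 5 False
```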

For the training config, I used the Yolo v2 config, setting classes=6 and filters=55.
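
For reference, filters=55 is consistent with the usual YOLO v2 rule for the last convolutional layer before the region layer: each anchor predicts 4 box coordinates, 1 objectness score, and one score per class. A small sketch of that arithmetic, assuming the default of 5 anchors from the stock YOLO v2 config (the function name is made up for this example):

```python
def yolov2_region_filters(num_classes, num_anchors=5, coords=4):
    """Filters needed in the last conv layer before a YOLO v2 region layer."""
    # Per anchor: box coordinates + 1 objectness score + class scores.
    return num_anchors * (coords + 1 + num_classes)


print(yolov2_region_filters(6))  # 55
print(yolov2_region_filters(5))  # 50
```

So 55 matches classes=6; with five classes the value would be 50.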

The training went fine, did not get any NaN during training and loss decreased. I wanted to test detection so I ran the detect command like so:

./darknet detect cfg/yolov2-chars.cfg backup/yolov2-chars_1000.weights -thresh 0

However any image I feed in (they're all 512x1024, same size as training input) comes out with no predictions, like this:

Loading weights from yolov2-chars_1000.weights...Done!
Enter Image Path: in.png
in.png: Predicted in 14.750000 seconds.

The predictions.png file also has no bboxes. No matter what I do, I can't get it to ever detect (even doing a fresh install and following the steps exactly from the website to just get normal detection on data/dog.jpg). Same issue with the detector test and yolo test commands from previous versions. Also tried yolov3, the tiny versions from 3 and 2, and the voc versions from 3 and 2. Also tried downgrading to last July, still no luck.

pshah123

All 8 comments

I think the problem is mainly your training time or your dataset.
If you search for 'no detection' here, you will find some similar issues,
for example #839 and #795. You might need a larger dataset, or to train for longer, say 10,000 iterations?
Hope this helps.

rxqy on 16 Jun 2018

I tried running for longer but still no luck... Also noticed that Darknet takes much longer to converge than other implementations of YOLO. Darkflow (TF port of Darknet) converges in 300 iterations on the exact same dataset but Darknet still has loss > 50 after 5000 iterations. Furthermore the threshold parameter does not seem to indicate that the detection is working (confidence can only be 0-1 so setting thresh to 0 should return everything?).

Using the trained .weights file in Darkflow works and returns steady detections so I assume the issue is with Darknet? Perhaps an uncaught or ignored error.

Also noticed that detect command only uses the coco.data configuration so the labels provided are always coco labels (e.g. "person" "aeroplane"), whereas detector test allows us to pass in our own .data with label set. I changed the coco.names file to hold my classes instead, but still did not fix the issue.

pshah123 on 16 Jun 2018

Can you post your training results (the log)?

corentin87 on 4 Jul 2018

Already deleted darknet but I took this snippet a few weeks ago:

No matter which config I chose ([tiny-]yolo v1/2/3 [-voc]) and modified accordingly, I couldn't get testing to work, although training definitely works (like I said, I loaded it into Darkflow and got results). The only aberrant things I noted in training were the messages mask_scale: Using default '1.000000' and Resizing.

pshah123 on 4 Jul 2018

And do you have the output of the last iterations, with the Obj and NoObj values?

corentin87 on 5 Jul 2018

Unfortunately I do not but it definitely trained the model correctly as it worked perfectly in Darkflow.

pshah123 on 17 Jul 2018

opencv=1
cuda=1
cudnn=1

I had the same error. Please modify cfg/yolov3.cfg so that the testing settings are active:

# Testing
batch=1
subdivisions=1
# Training
#batch=256
#subdivisions=64

After that, the detector displays results correctly.
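
The same switch can be scripted instead of edited by hand. The sketch below rewrites the active (uncommented) batch and subdivisions lines to 1 and leaves everything else alone, including commented lines and batch_normalize entries. The function name is hypothetical and the sample cfg text is a shortened stand-in, not the full yolov3.cfg.

```python
import re


def to_test_cfg(cfg_text):
    """Return cfg_text with the active batch/subdivisions lines set to 1."""
    out = []
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if re.match(r"batch\s*=", stripped):
            out.append("batch=1")
        elif re.match(r"subdivisions\s*=", stripped):
            out.append("subdivisions=1")
        else:
            # Comments, batch_normalize=..., and all other lines pass through.
            out.append(line)
    return "\n".join(out) + "\n"


# Shortened example cfg with the training values quoted in the comment above.
train_cfg = (
    "[net]\n"
    "#batch=1\n"
    "batch=256\n"
    "subdivisions=64\n"
    "width=416\n"
    "[convolutional]\n"
    "batch_normalize=1\n"
)
test_cfg = to_test_cfg(train_cfg)
print(test_cfg)
```

Writing the result to a separate file keeps the training config untouched, which avoids accidentally resuming training with batch=1.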

zkailinzhang on 28 Dec 2018

Hi @zkl99999, where are your label files located? According to the instructions for the VOC training procedure, the only requirement is that each label file has the same name as its image file. The cfg files, however, only give the paths to the image files, so I am confused about how and where the labels are fed into the training procedure.

For now, I can only suppose that the label files should be in the same folder as the images, but in the VOC dataset the label files sit in a separate label folder?

Do you have any comments? Thanks a lot.
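
On the question of where labels are read from: Darknet appears to derive each label path from the image path by string substitution (in src/data.c), swapping an "images" or "JPEGImages" path component for "labels" and the image extension for ".txt". The sketch below mimics that rule in simplified form. It is an assumption about Darknet's behavior, not a copy of its code, and should be checked against the source of the version in use; the function name and example paths are made up.

```python
def guess_label_path(image_path):
    """Approximate how Darknet maps an image path to its label path.

    Simplified assumption: replace the first "images" or "JPEGImages"
    with "labels", then swap the image extension for ".txt".
    """
    label_path = image_path
    for old in ("images", "JPEGImages"):
        label_path = label_path.replace(old, "labels", 1)
    for ext in (".jpg", ".png", ".JPG", ".JPEG"):
        label_path = label_path.replace(ext, ".txt", 1)
    return label_path


# VOC-style layout: labels end up in a sibling "labels" folder.
print(guess_label_path("/data/VOCdevkit/VOC2007/JPEGImages/000001.jpg"))
# No "images" component: the label sits next to the image.
print(guess_label_path("chardata/train/a.png"))
```

Under this rule both layouts mentioned above would work: labels next to the images when the path has no "images" component, and a separate labels folder for VOC-style paths.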