Darknet: Classifier always predicts one class

Created on 12 Nov 2019 · 35 Comments · Source: AlexeyAB/darknet

I'm trying to train a classifier for traffic light colour. I have a full object detector that finds and crops the traffic lights and will pass the cropped traffic light to the classifier - there are 3 classes: red, green, off

I started with tiny.cfg and

  • Set width=32, height=64 as most of the cropped images are about that large, some smaller, some larger.
  • Set max_crop=16 as it is such a small image
  • Set hue=0.01 as the hue is obviously an important component of the class of the objects so I want minimal variation.
  • Set filters=3 in the final [convolutional] as there are 3 classes

Training set consists of 80% of each class: 8657 "red", 4771 "green" and 389 "off"

After not very many iterations - the Top1 converges to 0.572989 and just stays there. My validation set has 1731 red traffic light images out of a total 3021 and 1731/3021 = 0.572989 so it has clearly just learned to always predict "red".
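The arithmetic above can be checked directly from the list file. A minimal Python sketch, assuming (as the paths in this post suggest) that the class is the name of the image's parent directory; `majority_baseline` is a hypothetical helper name, not part of darknet:

```python
from collections import Counter
from pathlib import PurePosixPath


def majority_baseline(paths):
    """Return (label, fraction) for the most common class in a list of image paths.

    Assumption: the class label is the name of the image's parent directory,
    e.g. ".../tf_data/red/red_60669red.jpg" -> "red".
    """
    counts = Counter(
        PurePosixPath(p.strip()).parent.name for p in paths if p.strip()
    )
    label, n = counts.most_common(1)[0]
    return label, n / sum(counts.values())
```

Calling it on the lines of val.list, for example `majority_baseline(open("val.list"))`, gives the Top1 that a constant predictor would reach. A Top1 that sits exactly on that value suggests the network has collapsed to a single class.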

Training with:
./darknet/darknet classifier train tf.data tiny.cfg darknet/tiny.conv.15 -dont_show -mjpeg_port 8090 -topk

There is no bad.list or any error messages

$ cat tf.data 
classes=3
train=train.list
valid=val.list
backup=backup
labels=labels.list
names=names.list
top=1

$ cat labels.list 
green
red
off

$ cat names.list 
green
red
off

$ head val.list 
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_592479e3-924b-459f-ab51-99b09503fe2dred.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_14-33-52-344140green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_60669red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_fe70f927-cb53-477a-9466-861bbb6a5c02red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_13-29-19-199408green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_17-10-17-703999green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_14-52-22-034667red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_14-39-59-765028red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_30112cf9-78ef-4f26-8b2a-db155920b749green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_11-05-30-765166red.jpg

$ head train.list 
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_1562926583_red.png
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_14-32-13-546227red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_13-36-11-040180red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_12-28-13-166136green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_25527b5d3a9b6427_2red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_12-07-50-074281red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_14-55-30-297755green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_17-52-10-205638green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_15-44-31-485642red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_c03e2602-5370-4491-af54-ea134a666e5dgreen.jpg
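Since the labels here are short words (`red`, `off`), it can be worth checking that every path maps to exactly one label. The sketch below assumes darknet assigns the class by looking for each label string as a substring of the full image path (my understanding of the classifier data loader, worth verifying against your darknet version); `ambiguous_paths` is a hypothetical helper:

```python
def ambiguous_paths(paths, labels):
    """Return (path, matched_labels) for paths that do not match exactly one label.

    Assumption: a label "matches" when it occurs anywhere in the path string,
    including directory names, mimicking a plain substring search.
    """
    problems = []
    for raw in paths:
        path = raw.strip()
        if not path:
            continue
        matched = [label for label in labels if label in path]
        if len(matched) != 1:
            problems.append((path, matched))
    return problems
```

Each of the sample paths shown above contains exactly one of `green`, `red`, `off`, so they would pass this check. A path under a directory such as `/data/offline/red/` would be flagged, because it contains both `off` and `red`.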

tiny.cfg.txt

Please could you help me work out where I'm going wrong? Also.. is it possible to have the graph give me top1 instead of top5?

Bug fixed

All 35 comments

What Top1 and Top5 do you get?


Please could you help me work out where I'm going wrong? Also.. is it possible to have the graph give me top1 instead of top5?

Set top=1 in the tf.data


Set filters=3 in the final [convolutional] as there are 3 classes

Yes, use

[convolutional]
filters=3
size=1
stride=1
pad=0
activation=linear

What Top1 and Top5 do you get?

Top1 starts at 0.394907 and soon rises up to 0.572989 where it stays
Top5 is always 100% because there are only 3 classes.
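Why Top5 is uninformative here follows from how top-k accuracy is usually defined: a prediction counts as correct if the true class is among the k highest-scoring classes, so any k at or above the number of classes always scores 100%. A small illustrative Python sketch (not darknet's own code; the scores are made-up numbers):

```python
def top_k_accuracy(scores, truths, k):
    """Fraction of samples whose true class index is among the k highest scores."""
    hits = 0
    for row, truth in zip(scores, truths):
        # Indices of the classes sorted by score, best first.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(truths)


# Made-up scores for 3 classes (green, red, off); the model always favours "red".
scores = [[0.2, 0.7, 0.1], [0.3, 0.6, 0.1], [0.1, 0.8, 0.1]]
truths = [1, 0, 2]  # red, green, off

print(top_k_accuracy(scores, truths, k=1))  # only the "red" sample is a hit
print(top_k_accuracy(scores, truths, k=5))  # k >= number of classes: every sample is a hit
```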

Set top=1 in the tf.data

I had already done so.

[convolutional]
filters=3
size=1
stride=1
pad=0
activation=linear

I have modified to use pad=0 but it hasn't made a difference. Maybe my size is just too small?

I'm not really sure if tiny.cfg is an appropriate cfg for trying to classify an image that is best distinguished by colour composition, to be honest. If you have any suggestions it'd be great to hear. Maybe the network is just converging to the majority class and getting stuck in a local minimum there.

I've tried copying some of the greens so that the training set is red/green balanced. The top1 now just varies wildly between around 0.03 and about 0.6
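Balancing by duplication can also be done on the list file rather than on disk. A minimal Python sketch, assuming the parent directory names the class; `oversample` is a hypothetical helper and the fixed seed is only there to make the result repeatable:

```python
import random
from collections import defaultdict
from pathlib import PurePosixPath


def oversample(paths, seed=0):
    """Repeat entries of smaller classes until every class has as many as the largest.

    Assumption: the class is the name of the image's parent directory.
    """
    by_class = defaultdict(list)
    for raw in paths:
        path = raw.strip()
        if path:
            by_class[PurePosixPath(path).parent.name].append(path)

    rng = random.Random(seed)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for items in by_class.values():
        balanced.extend(items)
        # Draw the missing entries at random (with replacement) from the same class.
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced
```

Duplication only changes how often each class is seen during training; it adds no new information about the minority classes.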

Set hue=0.01

Also may be saturation=1.1

I fixed the code, so TopK is now taken from the obj.data file; you can set top=1 in obj.data: https://github.com/AlexeyAB/darknet/blob/c516b6cb0a08f82023067a649d10238ff18cf1e1/src/classifier.c#L163

chart

What is the size of images?
What is the network size?
What Top1 can you get on valid=train.txt?

network size is width=32, height=64
the images are various sizes, mostly similar, some smaller, some larger.
top1= 0.518997 with valid=train.txt

I might have messed up something in labelling or whatever, but if you think this is some darknet bug to do with an edge case, I'd be happy to send you my working directory with all the data to help reproduce it. or is there anything else I can do to help diagnose it? The dataset is thousands of images but it's small because they're all so small and the network trains in like 30 minutes.

chart
tried training with width=64, height=128 which made no real difference

update: have tried increasing learning rate and training for many more batches, it now just predicts everything as "green".

chart

Send me your dataset and cfg-file; I will check it when I have time.

chart

I tried training again without any pretrained weights, it now alternates back and forth between always predicting one and the other. I no longer believe that I have made some mistake, perhaps tiny-darknet just isn't an appropriate algorithm for this task? or maybe if it trains for long enough it will converge? but the loss isn't going down so idk...

  • What Top1 can you get for Training dataset on valid=train.txt?
  • Show examples of your training and detection images.

With the training run corresponding to the chart above, I get top 1: 0.486340 with valid=train.txt, which is the exact proportion of the training set that is red (or green).

is there some way to see the images as they appear during training with augmentation etc. ?

Using OpenCV, on the line before https://github.com/AlexeyAB/darknet/blob/master/src/image_opencv.cpp#L1237 add something like:

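// Debug only: show each augmented image (assumes `sized` is the augmented cv::Mat in RGB order).
cv::Mat tmp;
cv::cvtColor(sized, tmp, cv::COLOR_RGB2BGR);  // imshow expects BGR channel order
cv::imshow("image", tmp);
cv::waitKey(0);  // block until a key is pressed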

Or maybe just uncomment here if you don't have blur: https://github.com/AlexeyAB/darknet/blob/master/src/image_opencv.cpp#L1194 but I guess you still need to convert RGB to BGR.

Reds: (4 example images)

Greens: (3 example images)

I get a top1=top 1: 0.486340 with valid=train.txt

Training goes wrong.

Attach your cfg-file and labels and shortnames files.

have emailed you the whole project!

I just tested your model on the ImageNet validation dataset, and it works.
width=96
height=96
...
filters=1000

Now I will test it on your dataset.

chart

On your dataset I get 93% Top1 on the val-set after training for 1 min.
width=96
height=96
learning_rate=0.1
...
filters=3

chart

On your dataset
width=64
height=64
learning_rate=0.1
...
filters=3

chart

On your dataset
width=32
height=32
learning_rate=0.1
...
filters=3

chart

oh that's a good result..... so where could I be going wrong?

ok I changed to width=32 height=32 and it works.

It seems that darknet classifiers need input to be square?

The reason is the non-square network size.
There is a bug somewhere.

If I train 64x64 and validate 64x64 or 64x128 it works well.
If I train 64x128 and validate 64x64 or 64x128 it works poorly. So the problem is in the training.

I fixed it: https://github.com/AlexeyAB/darknet/commit/11142d00bedbafb015991fb20a05a5eb048200d6

With your original cfg-file, only setting max_batches=1000:

height=64
width=32

max_batches=1000

after ~5 minutes

chart

It's interesting how 32 x 64 (TOP1 98%) gives higher Top1 than 64 x 64 (96%)
I initially went for 32 x 64 because it is roughly how large the images are anyway, but I wouldn't have thought that distorting aspect ratio more would matter if it is done in a consistent way and in a way that actually boosts the size of the network.

mildly interesting, if instead of training from scratch, I train from tiny.conv.15 coco pretrained weights, the network takes longer to converge and spends a while stuck on predicting just one color before suddenly breaking through and becoming accurate.
chart

yolov3-tiny.cfg (and yolov3-tiny.conv.15) is based on darknet.cfg classifier rather than tiny.cfg. This is why it interferes with learning.

Also, only the first 13 layers (not 15) are the same in yolov3-tiny.cfg and darknet.cfg.

oh.... I'll try switching to darknet.cfg - so what is tiny.cfg ?

so what is tiny.cfg ?

Classifier )
