Darknet: Classifier always predicts one class

Created on 12 Nov 2019 · 35 Comments · Source: AlexeyAB/darknet

I'm trying to train a classifier for traffic light colour. I have a full object detector that finds and crops the traffic lights and passes each cropped traffic light to the classifier. There are 3 classes: red, green, off.

I started with tiny.cfg and made these changes:

Set width=32, height=64 as most of the cropped images are about that large, some smaller, some larger.
Set max_crop=16 as it is such a small image
Set hue=0.01 as the hue is obviously an important component of the class of the objects so I want minimal variation.
Set filters=3 in the final [convolutional] as there are 3 classes

Training set consists of 80% of each class: 8657 "red", 4771 "green" and 389 "off"

After relatively few iterations, Top1 converges to 0.572989 and stays there. My validation set has 1731 red traffic light images out of a total of 3021, and 1731/3021 = 0.572989, so the network has clearly just learned to always predict "red".
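The arithmetic above can be checked with a short sketch. The helper name `majority_baseline` is made up for illustration, and because the post does not give the green/off split of the validation set, the non-red images are lumped into one bucket.

```python
def majority_baseline(class_counts):
    """Top-1 accuracy of a model that always predicts the most common class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total


# Validation set from the post: 1731 red images out of 3021 in total.
# The green/off split is not stated, so the remainder is kept as one bucket.
val_counts = {"red": 1731, "green_or_off": 3021 - 1731}
baseline = majority_baseline(val_counts)
print(f"{baseline:.6f}")  # prints 0.572989
```

If the reported Top1 matches this number to six decimals, the network is almost certainly emitting a single class for every input.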

Training with:
./darknet/darknet classifier train tf.data tiny.cfg darknet/tiny.conv.15 -dont_show -mjpeg_port 8090 -topk

There is no bad.list and there are no error messages.

$ cat tf.data 
classes=3
train=train.list
valid=val.list
backup=backup
labels=labels.list
names=names.list
top=1

$ cat labels.list 
green
red
off

$ cat names.list 
green
red
off

$ head val.list 
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_592479e3-924b-459f-ab51-99b09503fe2dred.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_14-33-52-344140green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_60669red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_fe70f927-cb53-477a-9466-861bbb6a5c02red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_13-29-19-199408green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_17-10-17-703999green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_14-52-22-034667red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_14-39-59-765028red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_30112cf9-78ef-4f26-8b2a-db155920b749green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_11-05-30-765166red.jpg

$ head train.list 
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_1562926583_red.png
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_14-32-13-546227red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_13-36-11-040180red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_12-28-13-166136green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_25527b5d3a9b6427_2red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_12-07-50-074281red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_14-55-30-297755green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_17-52-10-205638green.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_15-44-31-485642red.jpg
/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_c03e2602-5370-4491-af54-ea134a666e5dgreen.jpg
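One quick sanity check on list files like these is to confirm that every path matches exactly one label. The sketch below assumes that darknet's classifier picks the label by looking for each label string as a substring of the image path; that is my understanding of its behaviour, not something stated in this thread. The function names and the third path are hypothetical.

```python
def matching_labels(path, labels):
    """Return every label that occurs as a substring of the path."""
    return [label for label in labels if label in path]


def check_paths(paths, labels):
    """Count images per label; collect paths matching zero or several labels."""
    counts = {label: 0 for label in labels}
    ambiguous = []
    for path in paths:
        hits = matching_labels(path, labels)
        if len(hits) == 1:
            counts[hits[0]] += 1
        else:
            ambiguous.append(path)
    return counts, ambiguous


labels = ["green", "red", "off"]  # contents of labels.list from the post
paths = [
    # Two entries copied from the val.list excerpt in the post.
    "/srv/perception/traffic_night_classifier/TF_darknet/tf_data/red/red_60669red.jpg",
    "/srv/perception/traffic_night_classifier/TF_darknet/tf_data/green/green_14-33-52-344140green.jpg",
    # Hypothetical path: "offset" contains "off", so two labels match.
    "/data/red/offset_red.jpg",
]
counts, ambiguous = check_paths(paths, labels)
print(counts)
print(ambiguous)
```

Running this over the full train.list and val.list would also give the per-class counts, which makes it easy to compare a stuck Top1 value against the class proportions.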

tiny.cfg.txt

Please could you help me work out where I'm going wrong? Also.. is it possible to have the graph give me top1 instead of top5?

Bug fixed

LukeAI

All 35 comments

What Top1 and Top5 do you get?

Please could you help me work out where I'm going wrong? Also.. is it possible to have the graph give me top1 instead of top5?

Set top=1 in the tf.data

Set filters=3 in the final [convolutional] as there are 3 classes

Yes, use

[convolutional]
filters=3
size=1
stride=1
pad=0
activation=linear

AlexeyAB on 12 Nov 2019

What Top1 and Top5 do you get?

Top1 starts at 0.394907 and soon rises up to 0.572989 where it stays
Top5 is always 100% because there are only 3 classes.

Set top=1 in the tf.data

I had already done so.

[convolutional]
filters=3
size=1
stride=1
pad=0
activation=linear

I have changed the cfg to use pad=0, but it hasn't made a difference. Maybe my size is just too small?

LukeAI on 12 Nov 2019

I'm not really sure whether tiny.cfg is an appropriate cfg for classifying an image that is best distinguished by colour composition, to be honest. If you have any suggestions it'd be great to hear them. Maybe the network is just converging to the majority class and getting stuck in a local minimum there.

LukeAI on 12 Nov 2019

I've tried copying some of the greens so that the training set is red/green balanced. The top1 now just varies wildly between around 0.03 and about 0.6
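For reference, duplicating minority-class entries in a list file can be sketched as below. This is a minimal illustration, not the script used in the thread: it assumes the class is the name of each image's parent directory (as in the paths shown in the post), the function name `oversample` is made up, and the example paths are hypothetical. Note that in this thread balancing the classes did not cure the problem on its own.

```python
import random
from collections import Counter, defaultdict
from pathlib import PurePosixPath


def oversample(paths, seed=0):
    """Duplicate entries of smaller classes until every class matches the largest."""
    by_class = defaultdict(list)
    for path in paths:
        # Assumption: the parent directory name is the class (e.g. .../red/x.jpg).
        by_class[PurePosixPath(path).parent.name].append(path)

    target = max(len(items) for items in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for items in by_class.values():
        balanced.extend(items)
        # Draw the missing entries with replacement from the same class.
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical miniature list: 4 red, 2 green, 1 off.
example = (
    [f"/data/red/r{i}.jpg" for i in range(4)]
    + [f"/data/green/g{i}.jpg" for i in range(2)]
    + ["/data/off/o0.jpg"]
)
balanced = oversample(example)
print(Counter(PurePosixPath(p).parent.name for p in balanced))
```

Only the training list should be oversampled; the validation list should keep its natural class proportions so that Top1 remains comparable between runs.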

LukeAI on 12 Nov 2019

Set hue=0.01

Also maybe set saturation=1.1

I fixed the code, so TopK is now taken from the obj.data file; you can now set top=1 in obj.data: https://github.com/AlexeyAB/darknet/blob/c516b6cb0a08f82023067a649d10238ff18cf1e1/src/classifier.c#L163

AlexeyAB on 12 Nov 2019

chart

LukeAI on 12 Nov 2019

What is the size of images?
What is the network size?
What Top1 can you get on valid=train.txt?

AlexeyAB on 12 Nov 2019

network size is width=32, height=64
the images are various sizes, mostly similar, some smaller, some larger.
top1= 0.518997 with valid=train.txt

LukeAI on 12 Nov 2019

I might have messed up something in the labelling or elsewhere, but if you think this is a darknet bug to do with an edge case, I'd be happy to send you my working directory with all the data to help reproduce it. Or is there anything else I can do to help diagnose it? The dataset is thousands of images, but it's small on disk because the images are all so small, and the network trains in about 30 minutes.

LukeAI on 12 Nov 2019

chart
tried training with width=64, height=128 which made no real difference

LukeAI on 12 Nov 2019

update: have tried increasing learning rate and training for many more batches, it now just predicts everything as "green".

chart

LukeAI on 12 Nov 2019

Send me your dataset and cfg-file; when I have time I will check it.
[email protected]

AlexeyAB on 13 Nov 2019

chart

I tried training again without any pretrained weights, it now alternates back and forth between always predicting one and the other. I no longer believe that I have made some mistake, perhaps tiny-darknet just isn't an appropriate algorithm for this task? or maybe if it trains for long enough it will converge? but the loss isn't going down so idk...

LukeAI on 13 Nov 2019

What Top1 can you get for Training dataset on valid=train.txt?
Show examples of your training and detection images.

AlexeyAB on 13 Nov 2019

With the training run corresponding to the chart above, I get Top1 = 0.486340 with valid=train.txt, which is the exact proportion of the training set that is red (or green).

LukeAI on 13 Nov 2019

is there some way to see the images as they appear during training with augmentation etc. ?

LukeAI on 13 Nov 2019

Using OpenCV, on the line before https://github.com/AlexeyAB/darknet/blob/master/src/image_opencv.cpp#L1237 add something like:

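// Convert RGB to BGR, the channel order cv::imshow displays.
cv::Mat tmp;
cv::cvtColor(sized, tmp, cv::COLOR_RGB2BGR);
cv::imshow("image", tmp);
cv::waitKey(0);  // wait for a key press before continuing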

Or maybe just uncomment here if you don't have blur: https://github.com/AlexeyAB/darknet/blob/master/src/image_opencv.cpp#L1194 but I guess you still need to convert RGB to BGR.

kossolax on 13 Nov 2019

Reds:
1569239612___
fbc_1568440552__
fbl# _1568440602_
frame0356

LukeAI on 13 Nov 2019

Greens:
1562662673_
1569238465_cam_front_bottom_centre____
1569239445

LukeAI on 13 Nov 2019

I get Top1 = 0.486340 with valid=train.txt

Training goes wrong.

Attach your cfg-file and labels and shortnames files.

AlexeyAB on 13 Nov 2019

names.list.txt
tf.data.txt
tiny.cfg.txt
labels.list.txt

LukeAI on 13 Nov 2019

have emailed you the whole project!

LukeAI on 13 Nov 2019

I just tested your model on the ImageNet validation dataset, and it works.
width=96
height=96
...
filters=1000

Now I will test it on your dataset.

chart

AlexeyAB on 13 Nov 2019

On your dataset I get 93% Top1 on the val-set after training for 1 min.
width=96
height=96
learning_rate=0.1
...
filters=3

chart

AlexeyAB on 13 Nov 2019

On your dataset
width=64
height=64
learning_rate=0.1
...
filters=3

chart

AlexeyAB on 13 Nov 2019

On your dataset
width=32
height=32
learning_rate=0.1
...
filters=3

chart

AlexeyAB on 13 Nov 2019

oh that's a good result..... so where could I be going wrong?

LukeAI on 13 Nov 2019

ok I changed to width=32 height=32 and it works.

It seems that darknet classifiers need input to be square?

LukeAI on 13 Nov 2019

The reason is the non-square network size.
There is a bug somewhere.

If I train at 64x64 and validate at 64x64 or 64x128, it works well.
If I train at 64x128 and validate at 64x64 or 64x128, it works poorly. So the problem is in the training.

AlexeyAB on 13 Nov 2019

I fixed it: https://github.com/AlexeyAB/darknet/commit/11142d00bedbafb015991fb20a05a5eb048200d6

With your original cfg-file, changing only max_batches=1000:

height=64
width=32

max_batches=1000

after ~5 minutes

chart

AlexeyAB on 13 Nov 2019

It's interesting how 32 x 64 (TOP1 98%) gives higher Top1 than 64 x 64 (96%)
I initially went for 32 x 64 because it is roughly how large the images are anyway, but I wouldn't have thought that distorting aspect ratio more would matter if it is done in a consistent way and in a way that actually boosts the size of the network.
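To put a number on "distorting aspect ratio", the sketch below computes how much more an image is stretched horizontally than vertically for each network size mentioned in the thread. It assumes a plain resize that ignores aspect ratio and a typical crop of 32 x 64 pixels (the post says most crops are about that size); the function name `stretch_factor` is made up, and darknet's actual augmentation pipeline may crop and resize differently.

```python
def stretch_factor(src_w, src_h, net_w, net_h):
    """Ratio of horizontal to vertical scaling when resizing without keeping aspect.

    1.0 means the aspect ratio is preserved; 2.0 means the image ends up
    twice as wide, relative to its height, as it was originally.
    """
    return (net_w / src_w) / (net_h / src_h)


crop_w, crop_h = 32, 64  # assumed typical crop size, per the original post
for net_w, net_h in [(32, 64), (64, 64), (32, 32)]:
    factor = stretch_factor(crop_w, crop_h, net_w, net_h)
    print(f"{net_w}x{net_h}: {factor}")
```

Under these assumptions only the 32 x 64 network leaves a typical crop undistorted, while both square sizes stretch it by a factor of 2, which may be one reason the non-square network scores slightly higher.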

LukeAI on 13 Nov 2019

Mildly interesting: if, instead of training from scratch, I train from the tiny.conv.15 coco pretrained weights, the network takes longer to converge and spends a while stuck on predicting just one colour before suddenly breaking through and becoming accurate.
chart