Darknet: How to train custom objects based on yolo-resnet152?

Created on 8 Mar 2018 · 52 comments · Source: AlexeyAB/darknet

Hi @AlexeyAB, I have trained YOLO v2 with custom objects, but it doesn't seem very accurate. This is possibly because my images for a single object can look completely different.
For example, these 2 images belong to the same category; we took photos of the two sides of the snack:
https://imgur.com/a/huw7x
https://imgur.com/a/CiprO

And we have many similar objects whose images on the two sides are totally different; however, we need to detect them as one object.

I currently use Yolo Mark, which was also developed by you, and there is a training tutorial for YOLO v2; however, I could not find any documentation on how to train with yolo-resnet152. I found this command:
darknet.exe partial resnet152.cfg resnet152.weights resnet152.201 201
However, how can I feed in my custom images? Could I just modify it to this?
darknet.exe partial data/img resnet152.cfg resnet152.weights resnet152.201 201

Thank you.

Most helpful comment

@anguoyang Even though you need to detect objects at fixed scales and against the same background, you need more variety in your training data. If you're training with just 8 images per class, your network will easily overfit to your training images and not learn the features of the objects present in the training data.

You can improve the performance of your NN by providing the same objects in different backgrounds, scales, lighting, etc. If you cannot afford to collect such data, you can try augmentation.

All 52 comments

@anguoyang Hi, resnet152 for detection is here /build/darknet/x64/resnet152_yolo.cfg

  1. Download this file: https://pjreddie.com/media/files/resnet152.weights

  2. Do darknet.exe partial cfg/resnet152.cfg resnet152.weights resnet152.201 201

  3. Change these lines: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1-L23

To these lines: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/yolo-voc.2.0.cfg#L1-L18

  4. Change the number of classes: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1463
    And the number of filters as usual, filters=(classes+5)*num_of_anchors: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1456

  5. Remove this line (to do transfer-learning instead of fine-tuning): https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1439

  6. Run training: darknet.exe detector train data/obj.data resnet152_yolo.cfg resnet152.201


If it leads to NaN, then you can try leaving this line in, to do fine-tuning instead: https://github.com/AlexeyAB/darknet/blob/100d6f78011f0a773442411e2882a0203d390585/build/darknet/x64/resnet152_yolo.cfg#L1439
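As a sanity check for the filters=(classes+5)*num_of_anchors rule mentioned in the steps above, here is a tiny sketch (Python, purely illustrative; the helper name is mine):

```python
def yolo_filters(classes, num_anchors):
    # Each anchor predicts 4 box coordinates + 1 objectness score,
    # plus one confidence per class, hence (classes + 5) per anchor.
    return (classes + 5) * num_anchors

# VOC (20 classes) with 5 anchors gives the familiar filters=125
print(yolo_filters(20, 5))
```

So for a single-class detector with 5 anchors you would set filters=30 in the layer before the detection layer.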

Thank you so much for your great help! I suppose densenet-yolo uses a similar procedure?

@anguoyang Yes,

  1. Just download: https://pjreddie.com/media/files/densenet201.weights

  2. Do darknet.exe partial cfg/densenet201.cfg densenet201.weights densenet201.300 300

...

Ok, thank you, it is running now. One more question: once training is finished, can I use the same interface for detection?

Yes, you can use the same command for detection:

darknet.exe detector test obj.data resnet152_yolo.cfg resnet152_yolo_2000.weights -thresh 0.1

@AlexeyAB Thanks for your great work; I learned a lot from your code. I have a question about how to train resnet50-yolo. From your comment above, I can see that the first 2 steps should be:

  1. wget https://pjreddie.com/media/files/resnet50.weights

  2. darknet partial cfg/resnet50_yolo.cfg resnet50.weights resnet50.xx xx
    I am stuck at this step: how do I choose the last parameter (i.e. xx) passed to the partial command?

    How should the last parameter (i.e. xx) in the partial command be set for resnet50-yolo?

@lixiangchun Hi,

You can see it here: https://github.com/AlexeyAB/darknet/blob/a6c51e3b758aee7fd3a6f1d37daa8dcad4891e52/build/darknet/x64/partial.cmd#L25

Why 65?

Just run resnet50.cfg as classifier: https://github.com/AlexeyAB/darknet/blob/a6c51e3b758aee7fd3a6f1d37daa8dcad4891e52/build/darknet/x64/classifier_resnet50.cmd#L1

and see that the penultimate convolutional layer is number 64. So you should extract layers [0 - 64], i.e. 65 layers.

This way, when training the detector, 65 layers will be loaded from the pre-trained resnet50.65, and the last convolutional layer will be initialized with random values: https://github.com/AlexeyAB/darknet/blob/a6c51e3b758aee7fd3a6f1d37daa8dcad4891e52/src/convolutional_layer.c#L246
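To make the cutoff arithmetic concrete, here is a simplified illustration (not the real weight-copying code) of what `partial ... resnet50.65 65` keeps:

```python
# Simplified illustration of "partial ... resnet50.65 65": the weights of
# layers 0..64 (65 layers in total) are written to the output file, and the
# detector later re-initializes everything after the cutoff with random values.
pretrained_layers = list(range(70))   # stand-in for resnet50's 70 layers (0..69)
cutoff = 65
kept = pretrained_layers[:cutoff]     # layers 0..64 are transferred
print(len(kept), kept[0], kept[-1])   # 65 layers, from 0 through 64
```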

image

@AlexeyAB Thanks for your prompt reply, I got it right now.

@AlexeyAB When I train resnet50_yolo with random=1 in the cfg file, the following error occurs:

Cannot resize this type of layer: File exists
darknet: ./src/utils.c:199: error: Assertion '0' failed.

If random=0, resnet50_yolo works.

@lixiangchun Yes, [shortcut] layer doesn't support resize yet. I will fix it.

Hi @AlexeyAB, I have tried testing YOLO, DenseNet and ResNet, and I found they are all insensitive to color differences. That is, if two objects are similar in shape but different in color, it is difficult to distinguish them from each other. Could you please give me some advice on how to improve this? Thank you.

@anguoyang Probably due to data augmentation.

Set these params and train: https://github.com/AlexeyAB/darknet/blob/15c89e7a714e7e37c13618eace9325a06f0642fc/cfg/yolo-voc.2.0.cfg#L10-L12

saturation = 1.01
exposure = 1.5
hue=.01

Can you show examples of colors that the network can not distinguish?

Hi @AlexeyAB, thank you a lot for your quick response.
I have uploaded 2 images which are similar in shape but different in color:
https://imgur.com/a/02Ngo
https://imgur.com/a/dZN8g

I have modified the cfg file and re-trained the YOLO v2 network, but there seems to be no improvement. Maybe it is because I have only trained for 2000 iterations? Are there any other factors that could lead to loss of color information? Thank you.

I have uploaded the whole image data directory:
https://github.com/anguoyang/fpdw4win/raw/master/data.zip
you could download it and train/test with YOLO or other nets

Oh, your colors are very close to each other. So you should train with:

saturation = 1.0
exposure = 1.0
hue=0.0

Thus, the colors will not change during training. And try to train for more than 2000 iterations.

Hi @AlexeyAB, I have modified the cfg according to your advice, and it is better than the original one; however, it works even better (at least for my case) with:
saturation = 0.1
exposure = 0.1
hue=0.1

I am now trying to train with:
saturation = 0.0
exposure = 0.0
hue=0.0

Once it's finished, I will come back here with the result :)

Hi @AlexeyAB, is it okay to do center-cropping before any data augmentation? If yes, how do I specify it in the cfg file? I haven't found this parameter yet.

@anguoyang Do these params (saturation = 0.1, exposure = 0.1, hue = 0.1) give you better results than saturation = 1.0, exposure = 1.0, hue = 0.0?

This is strange, because the latter params do not change colors at all during data augmentation, so the network should distinguish colors better: https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/src/image.c#L1246-L1267


Functions: rand_scale() and rand_uniform_strong()

https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/src/utils.c#L615-L620
https://github.com/AlexeyAB/darknet/blob/df076653e00db69c6fb57869981f5196a8f55e70/src/utils.c#L654-L662
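For readers without the C source at hand, the two linked helpers behave roughly like this Python port (my own approximation of the linked code, not the exact implementation). It shows why saturation = exposure = 1.0 leaves colors untouched:

```python
import random

def rand_uniform_strong(a, b):
    # Mirrors Darknet's rand_uniform_strong(): swaps the bounds if reversed,
    # then draws uniformly from [a, b].
    if a > b:
        a, b = b, a
    return a + random.random() * (b - a)

def rand_scale(s):
    # Mirrors Darknet's rand_scale(): a factor drawn from [1, s], returned
    # either as-is or inverted. With s = 1.0 the factor is always exactly 1.0,
    # i.e. no color change during augmentation.
    scale = rand_uniform_strong(1.0, s)
    return scale if random.random() < 0.5 else 1.0 / scale

print(rand_scale(1.0))  # always 1.0: augmentation is a no-op
```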

yes, it's true, it can distinguish the 2 images after changing everything to 0.1. The number of iterations is almost the same; I don't know why.

you could try my dataset for training and testing; as I have already marked all the labels, it should not cost much time. You could just train it and see the result.

@AlexeyAB When I train resnet50_yolo with random=1 in the cfg file, the following error occurred:

Cannot resize this type of layer: File exists
darknet: ./src/utils.c:199: error: Assertion '0' failed.

If random=0, resnet50_yolo works.

@lixiangchun I added resize_shortcut_layer(). So now you can use random=1 for the resnet50_yolo.cfg and resnet152_yolo.cfg.

Hi @AlexeyAB, Thanks for your great work. I will try it soon.

Hi @AlexeyAB, when training resnet50_yolo with random = 1, an error occurs:

darknet: ./src/shortcut_layer.c:41: resize_shortcut_layer: Assertion `l->w == l->out_w' failed.

@lixiangchun Hi, try to do

make clean
make -j8
  • After what number of iterations does this error occur?
  • Resizing to what resolution leads to this error?
  • Can you provide the resnet50_yolo.cfg file that causes the error?

I trained this model for about 300 iterations, and it did not lead to an error: resnet50_yolo.zip

darknet.exe detector train data/voc_air.data resnet50_yolo.cfg resnet50.65

@AlexeyAB I cloned your latest commit and I also encountered the same error using your resnet50_yolo.cfg, right at the beginning after the model loads. Could you provide a link to your resnet50.65?

@lixiangchun You can get file resnet50.65 using this command:
./darknet partial cfg/resnet50.cfg resnet50.weights resnet50.65 65
Before that, you should download: https://pjreddie.com/media/files/resnet50.weights


How to get pre-trained files for other models: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/partial.cmd

@anguoyang @AlexeyAB @lixiangchun
Could you release a stable Linux version of the code and trained weights for detection and classification, for training and testing resnet50, resnet101, resnet152 and densenet201? That way you could save more time for coding new features.

@TaihuLight

Could you release a stable Linux version of the code

What do you mean? The current commit is almost stable, and there is a stable release: https://github.com/AlexeyAB/darknet/releases


trained weights regarding detection and classification, for training and testing resnet50, resnet101, resnet152 and densenet201.

There are trained weights for classification: https://pjreddie.com/darknet/imagenet/#pretrained

Then, for training your own weights, you can use pre-trained weights that you can get by launching this file: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/partial.cmd

Also, these cfg-files aren't well tested as detectors and are slow:

So we are waiting for Yolo v3: https://github.com/pjreddie/darknet/tree/yolov3

Hi @AlexeyAB, based on my testing and experience, the source code is stable enough, thank you.
What troubles me is the accuracy and also the maximum number of objects.
I have also tested DenseNet; it is better than YOLO in accuracy, but not good enough. For example:
I trained and tested with 5 objects, about 8 images per object. The testing on these objects is not bad; however, if I add one more object for testing (which differs from the trained objects in shape and color), it sometimes still gets 40% confidence, which is very depressing.

I have the same problem as @anguoyang. A stable version means the code can be used to train weights and get good accuracy on a specific dataset, not just run successfully. That way you can help more learners.
Besides, I need resnet101.cfg for classification. Where can I get it, or do I need to edit it myself?

@TaihuLight You need to cut off several layers from resnet152.cfg by yourself.

@anguoyang

I have trained and tested with 5 objects, each object with about 8 images

Do you mean that your training dataset contains only 8 images per class? That is very few.

What mAP and IoU can you get?

Also you can read: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
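For anyone unsure how the IoU metric asked about above is computed, here is a minimal sketch (my own helper, not Darknet code; boxes are given as corner coordinates):

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2). Intersection-over-union: the overlap area
    # divided by the total area covered by either box.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))  # identical boxes -> 1.0
```

mAP then averages, over classes, the precision of detections whose IoU with a ground-truth box exceeds a threshold (commonly 0.5).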

hi @AlexeyAB, but we can only take photos from 8 directions. Do you mean we should take more photos in each direction? The background is the same.

Hi @AlexeyAB, I have also tried modifying the cfg file according to https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
including random=1, adding negative samples, etc., but it didn't work. Maybe the only thing I need to do is add more images for each object.

Yes, more images.
And this: it is desirable that your training dataset include images of the objects at different scales, rotations and lightings, and from different sides.

The training dataset should contain images at all the scales, rotations, lightings and sides at which you want to detect.

@AlexeyAB, we actually need to detect objects at fixed scales, lightings and rotations, and even the background is the same, so we took photos of both sides and from 4 directions. So we actually only need 8 images so far :)

@anguoyang Even though you need to detect objects at fixed scales and against the same background, you need more variety in your training data. If you're training with just 8 images per class, your network will easily overfit to your training images and not learn the features of the objects present in the training data.

You can improve the performance of your NN by providing the same objects in different backgrounds, scales, lighting, etc. If you cannot afford to collect such data, you can try augmentation.
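If collecting more photos is not immediately possible, even a simple offline augmentation like a horizontal flip doubles the dataset, as long as the box coordinates are mirrored too. A minimal sketch (my own helper names; boxes as pixel (x, y, w, h)):

```python
def hflip(image, boxes, width):
    # image: list of pixel rows; boxes: (x, y, w, h) in pixels.
    # Mirror each row, and move each box's left edge so that its
    # distance from the right border becomes its distance from the left.
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - x - w, y, w, h) for (x, y, w, h) in boxes]
    return flipped, new_boxes

img, boxes = hflip([[1, 2, 3], [4, 5, 6]], [(0, 0, 1, 2)], 3)
print(img, boxes)
```

Note that Darknet already flips images online during training (the flip option), so offline flipping mainly helps when you want to inspect or rebalance the augmented set yourself.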

@sivagnanamn , thank you, I will try more images.

@lixiangchun @AlexeyAB
I also encountered the same error using the resnet50_yolo.cfg provided in this issue, right at the beginning after loading the model, with OPENCV=0, even though I generated resnet50.65 with the correct command. Did you solve it?

But if random=0, resnet50_yolo works.

$./darknet partial cfg/resnet50.cfg resnet50.weights resnet50.65 65
layer filters size input output
0 conv 64 7 x 7 / 2 256 x 256 x 3 -> 128 x 128 x 64
1 max 2 x 2 / 2 128 x 128 x 64 -> 64 x 64 x 64
2 conv 64 1 x 1 / 1 64 x 64 x 64 -> 64 x 64 x 64
3 conv 64 3 x 3 / 1 64 x 64 x 64 -> 64 x 64 x 64
4 conv 256 1 x 1 / 1 64 x 64 x 64 -> 64 x 64 x 256
5 Shortcut Layer: 1
6 conv 64 1 x 1 / 1 64 x 64 x 256 -> 64 x 64 x 64
7 conv 64 3 x 3 / 1 64 x 64 x 64 -> 64 x 64 x 64
8 conv 256 1 x 1 / 1 64 x 64 x 64 -> 64 x 64 x 256
9 Shortcut Layer: 5
10 conv 64 1 x 1 / 1 64 x 64 x 256 -> 64 x 64 x 64
11 conv 64 3 x 3 / 1 64 x 64 x 64 -> 64 x 64 x 64
12 conv 256 1 x 1 / 1 64 x 64 x 64 -> 64 x 64 x 256
13 Shortcut Layer: 9
14 conv 128 1 x 1 / 1 64 x 64 x 256 -> 64 x 64 x 128
15 conv 128 3 x 3 / 2 64 x 64 x 128 -> 32 x 32 x 128
16 conv 512 1 x 1 / 1 32 x 32 x 128 -> 32 x 32 x 512
17 Shortcut Layer: 13
18 conv 128 1 x 1 / 1 32 x 32 x 512 -> 32 x 32 x 128
19 conv 128 3 x 3 / 1 32 x 32 x 128 -> 32 x 32 x 128
20 conv 512 1 x 1 / 1 32 x 32 x 128 -> 32 x 32 x 512
21 Shortcut Layer: 17
22 conv 128 1 x 1 / 1 32 x 32 x 512 -> 32 x 32 x 128
23 conv 128 3 x 3 / 1 32 x 32 x 128 -> 32 x 32 x 128
24 conv 512 1 x 1 / 1 32 x 32 x 128 -> 32 x 32 x 512
25 Shortcut Layer: 21
26 conv 128 1 x 1 / 1 32 x 32 x 512 -> 32 x 32 x 128
27 conv 128 3 x 3 / 1 32 x 32 x 128 -> 32 x 32 x 128
28 conv 512 1 x 1 / 1 32 x 32 x 128 -> 32 x 32 x 512
29 Shortcut Layer: 25
30 conv 256 1 x 1 / 1 32 x 32 x 512 -> 32 x 32 x 256
31 conv 256 3 x 3 / 2 32 x 32 x 256 -> 16 x 16 x 256
32 conv 1024 1 x 1 / 1 16 x 16 x 256 -> 16 x 16 x1024
33 Shortcut Layer: 29
34 conv 256 1 x 1 / 1 16 x 16 x1024 -> 16 x 16 x 256
35 conv 256 3 x 3 / 1 16 x 16 x 256 -> 16 x 16 x 256
36 conv 1024 1 x 1 / 1 16 x 16 x 256 -> 16 x 16 x1024
37 Shortcut Layer: 33
38 conv 256 1 x 1 / 1 16 x 16 x1024 -> 16 x 16 x 256
39 conv 256 3 x 3 / 1 16 x 16 x 256 -> 16 x 16 x 256
40 conv 1024 1 x 1 / 1 16 x 16 x 256 -> 16 x 16 x1024
41 Shortcut Layer: 37
42 conv 256 1 x 1 / 1 16 x 16 x1024 -> 16 x 16 x 256
43 conv 256 3 x 3 / 1 16 x 16 x 256 -> 16 x 16 x 256
44 conv 1024 1 x 1 / 1 16 x 16 x 256 -> 16 x 16 x1024
45 Shortcut Layer: 41
46 conv 256 1 x 1 / 1 16 x 16 x1024 -> 16 x 16 x 256
47 conv 256 3 x 3 / 1 16 x 16 x 256 -> 16 x 16 x 256
48 conv 1024 1 x 1 / 1 16 x 16 x 256 -> 16 x 16 x1024
49 Shortcut Layer: 45
50 conv 256 1 x 1 / 1 16 x 16 x1024 -> 16 x 16 x 256
51 conv 256 3 x 3 / 1 16 x 16 x 256 -> 16 x 16 x 256
52 conv 1024 1 x 1 / 1 16 x 16 x 256 -> 16 x 16 x1024
53 Shortcut Layer: 49
54 conv 512 1 x 1 / 1 16 x 16 x1024 -> 16 x 16 x 512
55 conv 512 3 x 3 / 2 16 x 16 x 512 -> 8 x 8 x 512
56 conv 2048 1 x 1 / 1 8 x 8 x 512 -> 8 x 8 x2048
57 Shortcut Layer: 53
58 conv 512 1 x 1 / 1 8 x 8 x2048 -> 8 x 8 x 512
59 conv 512 3 x 3 / 1 8 x 8 x 512 -> 8 x 8 x 512
60 conv 2048 1 x 1 / 1 8 x 8 x 512 -> 8 x 8 x2048
61 Shortcut Layer: 57
62 conv 512 1 x 1 / 1 8 x 8 x2048 -> 8 x 8 x 512
63 conv 512 3 x 3 / 1 8 x 8 x 512 -> 8 x 8 x 512
64 conv 2048 1 x 1 / 1 8 x 8 x 512 -> 8 x 8 x2048
65 Shortcut Layer: 61
66 conv 1000 1 x 1 / 1 8 x 8 x2048 -> 8 x 8 x1000
67 avg 8 x 8 x1000 -> 1000
68 softmax 1000
69 cost 1000
Loading weights from resnet50.weights...
seen 64
Done!
Saving weights to resnet50.65
$ ./darknet detector train data/voc.data cfg/resnet50_yolo.cfg resnet50.65
resnet50_yolo
layer filters size input output
0 conv 64 7 x 7 / 2 416 x 416 x 3 -> 208 x 208 x 64
1 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64
2 conv 64 1 x 1 / 1 104 x 104 x 64 -> 104 x 104 x 64
3 conv 64 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 64
4 conv 256 1 x 1 / 1 104 x 104 x 64 -> 104 x 104 x 256
5 Shortcut Layer: 1
6 conv 64 1 x 1 / 1 104 x 104 x 256 -> 104 x 104 x 64
7 conv 64 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 64
8 conv 256 1 x 1 / 1 104 x 104 x 64 -> 104 x 104 x 256
9 Shortcut Layer: 5
10 conv 64 1 x 1 / 1 104 x 104 x 256 -> 104 x 104 x 64
11 conv 64 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 64
12 conv 256 1 x 1 / 1 104 x 104 x 64 -> 104 x 104 x 256
13 Shortcut Layer: 9
14 conv 128 1 x 1 / 1 104 x 104 x 256 -> 104 x 104 x 128
15 conv 128 3 x 3 / 2 104 x 104 x 128 -> 52 x 52 x 128
16 conv 512 1 x 1 / 1 52 x 52 x 128 -> 52 x 52 x 512
17 Shortcut Layer: 13
18 conv 128 1 x 1 / 1 52 x 52 x 512 -> 52 x 52 x 128
19 conv 128 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 128
20 conv 512 1 x 1 / 1 52 x 52 x 128 -> 52 x 52 x 512
21 Shortcut Layer: 17
22 conv 128 1 x 1 / 1 52 x 52 x 512 -> 52 x 52 x 128
23 conv 128 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 128
24 conv 512 1 x 1 / 1 52 x 52 x 128 -> 52 x 52 x 512
25 Shortcut Layer: 21
26 conv 128 1 x 1 / 1 52 x 52 x 512 -> 52 x 52 x 128
27 conv 128 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 128
28 conv 512 1 x 1 / 1 52 x 52 x 128 -> 52 x 52 x 512
29 Shortcut Layer: 25
30 conv 256 1 x 1 / 1 52 x 52 x 512 -> 52 x 52 x 256
31 conv 256 3 x 3 / 2 52 x 52 x 256 -> 26 x 26 x 256
32 conv 1024 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x1024
33 Shortcut Layer: 29
34 conv 256 1 x 1 / 1 26 x 26 x1024 -> 26 x 26 x 256
35 conv 256 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 256
36 conv 1024 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x1024
37 Shortcut Layer: 33
38 conv 256 1 x 1 / 1 26 x 26 x1024 -> 26 x 26 x 256
39 conv 256 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 256
40 conv 1024 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x1024
41 Shortcut Layer: 37
42 conv 256 1 x 1 / 1 26 x 26 x1024 -> 26 x 26 x 256
43 conv 256 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 256
44 conv 1024 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x1024
45 Shortcut Layer: 41
46 conv 256 1 x 1 / 1 26 x 26 x1024 -> 26 x 26 x 256
47 conv 256 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 256
48 conv 1024 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x1024
49 Shortcut Layer: 45
50 conv 256 1 x 1 / 1 26 x 26 x1024 -> 26 x 26 x 256
51 conv 256 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 256
52 conv 1024 1 x 1 / 1 26 x 26 x 256 -> 26 x 26 x1024
53 Shortcut Layer: 49
54 conv 512 1 x 1 / 1 26 x 26 x1024 -> 26 x 26 x 512
55 conv 512 3 x 3 / 2 26 x 26 x 512 -> 13 x 13 x 512
56 conv 2048 1 x 1 / 1 13 x 13 x 512 -> 13 x 13 x2048
57 Shortcut Layer: 53
58 conv 512 1 x 1 / 1 13 x 13 x2048 -> 13 x 13 x 512
59 conv 512 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x 512
60 conv 2048 1 x 1 / 1 13 x 13 x 512 -> 13 x 13 x2048
61 Shortcut Layer: 57
62 conv 512 1 x 1 / 1 13 x 13 x2048 -> 13 x 13 x 512
63 conv 512 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x 512
64 conv 2048 1 x 1 / 1 13 x 13 x 512 -> 13 x 13 x2048
65 Shortcut Layer: 61
66 conv 1024 1 x 1 / 1 13 x 13 x2048 -> 13 x 13 x1024
67 conv 125 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 125
68 detection
mask_scale: Using default '1.000000'
Loading weights from resnet50.65...
seen 32
Done!
Learning Rate: 0.0001, Momentum: 0.9, Decay: 0.0005
Resizing network
416
darknet: ./src/shortcut_layer.c:41: resize_shortcut_layer: Assertion `l->w == l->out_w' failed.
Aborted

@TaihuLight Yes, this is solved. Try training with the latest version of build/darknet/x64/resnet50_yolo.cfg:
https://raw.githubusercontent.com/AlexeyAB/darknet/2a9f7e44ce1b73d3d56ef83f83e94f074ecac3f9/build/darknet/x64/resnet50_yolo.cfg

I can train it successfully:

image

image

Hello @AlexeyAB
I've read the post above carefully and did as you suggested, but I still get the error darknet: ./src/shortcut_layer.c:41: resize_shortcut_layer: Assertion `l->w == l->out_w' failed.
image

Here's what I did:

  1. clone the latest code.
  2. modify the resnet50_yolo.cfg file. looks like: https://github.com/yanhn/testFile/blob/master/resnet50_yolo.cfg
  3. ./darknet partial cfg/resnet50.cfg cfg/resnet50.weights self_cfg/resnet50.65 65
  4. ./darknet detector train self_cfg/video.data self_cfg/resnet50_yolo.cfg self_cfg/resnet50.65 -gpus 0 -dont_show

And I also tried resnet152 with weights trained with pjreddie's darknet; both give the same error.
Anything will be helpful, thanks.

@yanhn
Try to remove old version of Darknet and download it again.

I just ran this command and I can train your https://github.com/yanhn/testFile/blob/master/resnet50_yolo.cfg file successfully:
darknet.exe detector train self_cfg/video.data self_cfg/resnet50_yolo.cfg self_cfg/resnet50.65 -gpus 0 -dont_show

image

image

Thank you. I managed to train the model on the Windows platform, which proves that my data is correct and the partial resnet50 model is correct. But I still get the error on Ubuntu.

I use the latest code with commit id 0fe1c6bcc86edc649624d655643627e20d02eba9
And I changed the OpenCV version among 3.3, 3.4 and 3.4.1, but still get the same error.

As for my Ubuntu environment, I managed to train YOLOv3 and tiny YOLO models using my own data, so I think the environment is OK.
I printed some logs that may be helpful. I added printf("input: %d, output: %d\ninwidth: %d, inheight: %d, outwidth: %d, outheight: %d\n", l->inputs, l->outputs, l->w, l->h, l->out_w, l->out_h); at line 41 of shortcut_layer.c. Here's the output:
image

I solved it by commenting out the assert statement. I can train the model for now, but I need to check it later.

@yanhn Did you completely remove Darknet from Ubuntu, and did you remove your old cfg-file from every place on Ubuntu?
Please check it twice.

@AlexeyAB No, I just pulled the latest code. I'll try as you suggest.
And by the way, does the old cfg-file mean resnet50_yolo.cfg?

@yanhn Yes, you should just comment out these lines with assert(). I fixed it: https://github.com/AlexeyAB/darknet/commit/16cfff811f8a5898899cdd0b7139d216466371d2

https://github.com/AlexeyAB/darknet/blob/16cfff811f8a5898899cdd0b7139d216466371d2/src/shortcut_layer.c#L39-L42


On Windows, asserts are disabled in Release mode, so I didn't see these errors. Now I have checked it on Linux, and the asserts should be removed.

Ok, thank you.

@AlexeyAB @sivagnanamn
Just as we discussed before, 8 images for each item is too few, so we tried our best to take about 500 images for each item (currently we can only afford to get images for 8 items).
The problem is, when I followed the instructions for YOLO v3 with custom objects, my 1080 Ti machine always throws a CUDA out-of-memory error. My question is: how do I calculate the memory requirement?
total image size x batch?

@anguoyang Use batch=64 subdivision=64

https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

if the error Out of memory occurs, then in the .cfg-file you should increase subdivisions to 16, 32 or 64:
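The reason this helps: Darknet only holds batch/subdivisions images on the GPU at a time, accumulating gradients until the full batch has been processed, so raising subdivisions shrinks the per-step memory footprint without changing the effective batch size. A sketch of that arithmetic (helper name is mine):

```python
def images_on_gpu(batch, subdivisions):
    # Darknet processes batch/subdivisions images per forward pass and
    # accumulates gradients until all `batch` images are done.
    assert batch % subdivisions == 0
    return batch // subdivisions

print(images_on_gpu(64, 64))  # one image per pass: minimal GPU memory
print(images_on_gpu(64, 16))  # four images per pass: 4x the activations
```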

@AlexeyAB ok, thank you
