Darknet: input size vs. anchors size

Created on 3 Jan 2018 · 9 Comments · Source: pjreddie/darknet

Input image size vs. anchors: I don't understand the relationship between these two parameters.

Comparing the VOC models: the input sizes are the same, but the anchors are different.
@yolo-voc.cfg:
[net]
...
height=416
width=416

[region]
anchors = 1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071
...

@tiny-yolo-voc.cfg:
[net]
...
height=416
width=416

[region]
anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52

Comparing the COCO models: the one with the smaller input size has the bigger anchors.
@yolo.cfg:
[net]
...
width=608
height=608

[region]
anchors = 0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828
...

@yolo.2.0.cfg
[net]
...
height=416
width=416

[region]
anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741
...

OseongKwon

Most helpful comment

The anchors are generated from the dataset you train on, using the k-means clustering algorithm. After k-means you have, for example, 5 clusters, each with a centroid. The coordinates of a centroid (a width and a height) form one anchor. If you want to know more about it and why it is done, read the YOLO9000 paper; it is explained in Section 2, "Better".

So for different datasets you will have/need different anchors.
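
To make the clustering step concrete, here is a minimal sketch of k-means over box (width, height) pairs, using 1 - IoU as the distance as described in the YOLO9000 paper. The function names, the mean-based centroid update, the `seed` parameter, and the sample boxes are all assumptions made for illustration; this is not the script used to produce the anchors in the cfg files.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes given only as (width, height), aligned at a shared corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, seed=0, max_iters=100):
    """Cluster (width, height) pairs with k-means, using 1 - IoU as the distance."""
    rng = random.Random(seed)
    # Random initialization: a different seed can lead to different anchors.
    centroids = rng.sample(boxes, k)
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest (1 - IoU) distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Sort by area so the output order does not depend on initialization.
    return sorted(centroids, key=lambda c: c[0] * c[1])

# Made-up box sizes, for illustration only.
boxes = [(1.0, 1.2), (1.1, 1.0), (3.0, 4.0), (3.2, 4.1), (9.0, 5.0), (9.5, 4.8)]
anchors = kmeans_anchors(boxes, k=3, seed=0)
print(anchors)
```

Running it with several values of `seed` and keeping the result with the highest average IoU is one common way to deal with the random initialization.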

Nopileos2 on 5 Jan 2018

👍2

All 9 comments

@Nopileos2 I think you misunderstood my question. Of course I have read it.
As I said in the issue, even though the models use the same training data (VOC, COCO), their anchors are different.
My question is: why are they different?

OseongKwon on 6 Jan 2018

I indeed didn't understand your question, thanks for the clarification.
I would assume that different subsets of the data were used to generate the anchors for each model, or that there was some fine-tuning by hand. The anchors are not that different.

More importantly, the anchors are found with k-means, and k-means is not necessarily deterministic. If the initialization is random, the results differ from run to run. I think this could be the main reason, since most k-means implementations I have seen use random initialization. That way you can run it multiple times and pick the best result.

So maybe k-means was run independently for every model to find the anchors. But this is of course only speculation; I don't know why the anchors are different.

Nopileos2 on 6 Jan 2018

@Nopileos2 I noticed something odd.
For the COCO models,
yolo.cfg's input is 608, but yolo.2.0.cfg's input is 416, so yolo.cfg's input is bigger.
Yet yolo.cfg's anchors are proportionally smaller, about 0.77 times, as shown below.

yolo.cfg anchors | 0.57273 | 0.677385 | 1.87446 | 2.06253 | 3.33843 | 5.47434 | 7.88282 | 3.52778 | 9.77052 | 9.16828
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
yolo.2.0.cfg anchors | 0.738768 | 0.874946 | 2.42204 | 2.65704 | 4.30971 | 7.04493 | 10.246 | 4.59428 | 12.6868 | 11.8741
yolo.cfg / yolo.2.0.cfg ratio | 0.77525 | 0.774202 | 0.773918 | 0.776251 | 0.77463 | 0.777061 | 0.769356 | 0.767864 | 0.770133 | 0.772124

The input is bigger, but the anchors are smaller.
What do you think?
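
The ratios in the table can be reproduced with a few lines of Python. The anchor values are copied from the two cfg excerpts above; the variable names are only illustrative. The last line also prints 416 / 608 for comparison, to check whether the ratio is simply the ratio of the input sizes.

```python
# Anchors copied from the [region] sections quoted in this thread.
yolo_cfg = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843,
            5.47434, 7.88282, 3.52778, 9.77052, 9.16828]
yolo_2_0_cfg = [0.738768, 0.874946, 2.42204, 2.65704, 4.30971,
                7.04493, 10.246, 4.59428, 12.6868, 11.8741]

# Element-wise ratio: yolo.cfg anchor divided by yolo.2.0.cfg anchor.
ratios = [a / b for a, b in zip(yolo_cfg, yolo_2_0_cfg)]
mean_ratio = sum(ratios) / len(ratios)

print([round(r, 4) for r in ratios])
print("mean ratio:", round(mean_ratio, 4))
print("416 / 608 :", round(416 / 608, 4))
```

All ten ratios fall between 0.76 and 0.78, while 416 / 608 is about 0.68, so the difference does not look like a plain rescaling by the input size.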

OseongKwon on 8 Jan 2018

I really wouldn't think too much about this one file. If you look at its history, it was changed many times. At one point the input was 416, then it was 608, then it was changed back, and so on. In this commit the old anchors from yolo.cfg were moved into the new yolo2.cfg, and yolo.cfg got new anchors that have not been changed since.

So to me it looks like yolo.cfg is a test cfg or something like that, and I wouldn't worry about it too much.

Nopileos2 on 8 Jan 2018

@Nopileos2 Thanks for your comment. It is the best answer for me. Regards!
Still, the logic behind it is not transparent to me.

The author of YOLO didn't document the training procedure well enough.
So I'll keep exploring darknet on my own.

You have been a great help to me. Thanks.

OseongKwon on 8 Jan 2018

And what about multi-scale training? Must the anchors be scaled according to the input image size during training? I don't see anything about it in the source code.

Thanks!

Pezaun on 11 Jan 2018

@Pezaun Did you find out how the anchor sizes change when the input size changes, i.e. for multi-resolution training?
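
One possible reading, not confirmed anywhere in this thread: if the `[region]` anchors are expressed in units of final feature-map cells and the network downsamples by a factor of 32 (so 416 gives a 13x13 grid), then the anchors would not need rescaling, because the grid itself grows with the input. The sketch below only does that arithmetic; `STRIDE` and `anchor_sizes` are hypothetical names, and both assumptions should be checked against the darknet source.

```python
STRIDE = 32  # assumed downsampling factor from network input to final feature map

def anchor_sizes(anchor_w, anchor_h, input_size, stride=STRIDE):
    """Express one anchor (assumed to be in grid cells) in pixels and as an image fraction."""
    grid = input_size // stride
    return {
        "grid": grid,
        # Size in pixels stays constant, because one cell always covers `stride` pixels.
        "pixels": (anchor_w * stride, anchor_h * stride),
        # Size relative to the whole image shrinks as the input (and the grid) grows.
        "fraction": (anchor_w / grid, anchor_h / grid),
    }

# Largest yolo.cfg anchor from this thread, at three multi-scale input sizes.
for size in (320, 416, 608):
    print(size, anchor_sizes(9.77052, 9.16828, size))
```

Under these assumptions an anchor covers the same number of pixels at every training resolution, and only its size relative to the image changes.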

mmderakhshani on 10 Feb 2018

The anchors are clustered on the COCO dataset. I want to know what image size is used when clustering: the original pictures, or resized pictures such as 416*416, 320*320, or 640*640? Thank you!
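
A hedged sketch of why the original picture size may not matter: darknet label files store box width and height relative to the image, so a box can be clustered in relative units and only then scaled to the grid of a chosen network input size. The helper names, the stride of 32, and the sample box are assumptions for illustration; the thread does not say which size was actually used for the published anchors.

```python
def pixels_to_relative(box_w_px, box_h_px, img_w_px, img_h_px):
    """Box size as a fraction of the image, as stored in darknet label files."""
    return box_w_px / img_w_px, box_h_px / img_h_px

def relative_to_grid_units(rel_w, rel_h, input_size, stride=32):
    """Scale a relative box size to grid cells for a given (assumed square) network input."""
    grid = input_size // stride  # assumed downsampling factor of 32
    return rel_w * grid, rel_h * grid

# A made-up 160x120 pixel box in a 640x480 pixel image.
rel = pixels_to_relative(160, 120, 640, 480)
print("relative:", rel)

# The same box in grid units for the input sizes mentioned in the question.
for size in (320, 416, 640):
    print(size, relative_to_grid_units(rel[0], rel[1], size))
```

With this convention the clustering result in grid units depends on the network input size chosen for the conversion, not on the resolution of the original pictures.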