Darknet: kmeans yolov3 anchors?

Created on 11 Jun 2019 · 9 Comments · Source: pjreddie/darknet

I encountered an issue:
I tried to calculate anchors using my own kmeans script. I normalized the width and height of my input objects to the range [0, 1] and got 9 pairs of floats, also in the range [0, 1]. Which value should I multiply these nine pairs by to get the pixel values to write in my yolov3.cfg file?

  • I tried multiplying by 416 and training the model, and I got quite poor results.

  • Do I need to multiply by 8, 16, and 32 respectively?

Hope to get your reply, thanks.

All 9 comments

"YOLO's anchors are specific to dataset that is trained on (default set is based on PASCAL VOC). They ran a k-means clustering on the normalized width and height of the ground truth bounding boxes and obtained 5 values.

The final values are based on not coordinates but grid values. YOLO default set:
anchors = 1.3221, 1.73145, 3.19275, 4.00944, 5.05587, 8.09892, 9.47112, 4.84053, 11.2364, 10.0071
this means the height and width of first anchor is slightly over one grid cell [1.3221, 1.73145] and the last anchor almost covers the whole image [11.2364, 10.0071] considering the image is 13x13 grid."

Therefore, it depends on the number of grid cells you have. If you have a 13x13 grid, multiply by 13.
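To make the grid-unit convention described in this comment concrete, here is a minimal sketch. The function name `to_grid_units` and the sample values are hypothetical; the grid size of 13 comes from the quoted text. It only illustrates the arithmetic and is not taken from the darknet source.

```python
def to_grid_units(normalized_anchors, grid_size=13):
    """Scale (w, h) pairs from [0, 1] image fractions to grid-cell units."""
    return [(w * grid_size, h * grid_size) for w, h in normalized_anchors]


# Hypothetical normalized anchors, as a kmeans script might produce them.
example = [(0.1, 0.2), (0.5, 0.5)]
grid_anchors = to_grid_units(example)
print(grid_anchors)
```

Read in the other direction, the first quoted default anchor (1.3221 grid cells wide on a 13x13 grid) corresponds to roughly 10% of the image width.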

@ameeiyn Is your answer based on YOLOv2? YOLOv3 has three grid sizes: 13x13, 26x26, and 52x52. I want to know the exact number I should multiply my kmeans output (9 pairs of floats in (0, 1)) by.
Thanks.

The default grids in YOLOv2 and v3 are the same in the config file; I am not sure whether that is intentional or a bug. In general, though, anchors are just predefined sizes that get resized to become localized bounding boxes. The default v3 file shows only one set of anchors, which is based on 13x13 (same as v2).

Try multiplying them by 13 (which should work fine for all three grids, provided your objects are not as big as the whole image; you can also try 26 and 52 if you want) and let me know the results.

Once you have your model training, you can also refer to this thread, which has good comments explaining anchors: https://github.com/pjreddie/darknet/issues/568

Refer to https://github.com/pjreddie/darknet/issues/555, where the author said, "In YOLOv3 anchor sizes are actual pixel values. this simplifies a lot of stuff and was only a little bit harder to implement".

Thus, we just run kmeans on the [0, 1] boxes to get the anchors, then multiply the anchors by input_size (320, 416 or 608).
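The scaling suggested in this comment can be sketched as follows. The helper names `to_pixel_anchors` and `format_for_cfg` and the sample values are hypothetical, and sorting by area is an assumption made here so the smallest anchors come first. This only follows the suggestion in this comment; whether input_size is the right multiplier is questioned later in the thread.

```python
def to_pixel_anchors(normalized_anchors, input_size=416):
    """Scale (w, h) pairs in [0, 1] to integer pixel sizes, smallest area first."""
    scaled = [(round(w * input_size), round(h * input_size))
              for w, h in normalized_anchors]
    # Assumption: order anchors from smallest to largest area.
    scaled.sort(key=lambda wh: wh[0] * wh[1])
    return scaled


def format_for_cfg(anchors):
    """Join pixel anchors into a comma-separated string of w,h pairs."""
    return ", ".join(f"{w},{h}" for w, h in anchors)


# Hypothetical kmeans output (only two pairs shown for brevity).
normalized = [(0.5, 0.5), (0.25, 0.125)]
pixel_anchors = to_pixel_anchors(normalized, input_size=416)
print("anchors = " + format_for_cfg(pixel_anchors))
```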

@soldier828, Just checked Issue 555, and what you say seems correct. Will need an update from @3epochs now.

@ameeiyn @soldier828 Thanks for your replies.
Check out darknet/cfg/yolov3.cfg and darknet/cfg/yolov3-voc.cfg.
Although they have different input sizes (608 and 416), their anchors are identical.
So I don't think this is the right answer.

@3epochs
Any updates?

Hello, I have the same question as you. I guess the anchor size is relative to the size of the original image.

Could you please share your kmeans script with me? Thanks.
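The thread does not include the script itself. As a rough illustration of the kind of clustering discussed here, the following is a minimal sketch that clusters normalized (width, height) pairs using 1 - IoU as the distance. The names `iou_wh` and `kmeans_anchors`, the use of the cluster mean as the centroid, and the random sample boxes are all assumptions for illustration, not the original poster's code. It expects at least k boxes with positive width and height.

```python
import random


def iou_wh(a, b):
    """IoU of two (w, h) boxes placed at the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (w, h) pairs with distance 1 - IoU; return k centroids sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # pick k distinct boxes as starting centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda j: iou_wh(box, centroids[j]))
            clusters[best].append(box)
        updated = []
        for j, members in enumerate(clusters):
            if members:
                updated.append((sum(w for w, _ in members) / len(members),
                                sum(h for _, h in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # assignments no longer change the centroids
        centroids = updated
    return sorted(centroids, key=lambda wh: wh[0] * wh[1])


# Hypothetical data: 200 random normalized box sizes.
data_rng = random.Random(1)
sample_boxes = [(data_rng.uniform(0.05, 1.0), data_rng.uniform(0.05, 1.0))
                for _ in range(200)]
anchors = kmeans_anchors(sample_boxes, k=9)
for w, h in anchors:
    print(f"{w:.4f}, {h:.4f}")
```

The output is still in the [0, 1] range, so the scaling question raised in this thread applies to it unchanged.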
