Darknet: How to update anchors when changing the input size of the network

Created on 11 Oct 2017 · 4 comments · Source: AlexeyAB/darknet

When changing the input size of the network, how should the anchors be updated?

In https://github.com/Jumabek/darknet_scripts/blob/master/gen_anchors.py#L52, the calculated anchors are finally multiplied by the network size and divided by 32.
For example, following this logic, we should multiply each anchor value by 2 when using 832 as the input size instead of 416. However, this doesn't give me good results.
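For reference, here is a minimal sketch of that scaling step (the relative anchor values are made up for illustration; only the net_size/32 conversion mirrors the script):

    # Sketch: anchors from k-means are relative to the image (0..1) and are
    # converted to final-feature-map cells by multiplying by net_size / 32
    # (the total stride of YOLOv2). Doubling the input size doubles every anchor.
    def anchors_in_cells(relative_anchors, net_size, stride=32):
        grid = net_size / stride                    # 416 -> 13, 832 -> 26
        return [(w * grid, h * grid) for w, h in relative_anchors]

    relative = [(0.10, 0.13), (0.25, 0.31), (0.39, 0.62)]    # made-up values
    print(anchors_in_cells(relative, 416))   # anchors for a 416x416 network
    print(anchors_in_cells(relative, 832))   # same anchors, exactly 2x larger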

I should clarify that I want to increase the network size so that I can detect smaller objects, but I still need to be able to detect "normal sized" objects too.
Should I then add new anchor values in addition to the old ones?

In #199, you (@AlexeyAB) wrote:

Try to use 1088x1088 for detection and multiply each anchor value by 1.6 (but if you trained with random=1, then multiply by 2.4)

I don't understand where the 1.6 factor comes from.
And how should we determine the factor when using _random=1_ (which is my case)?

Thank you for your help

Most helpful comment

  • If you train using yolo-voc.cfg with width=416 height=416 random=0 and then detect on 1088x1088, you should multiply the anchors by 2.6 = 1088/416

  • If you train using yolo-voc.cfg with random=1 and then detect on 1088x1088, you should multiply the anchors by ~2.4 = 1088/464 = 1088/((320+608)/2)
    (with random=1, the network size is randomly resized every 10 iterations during training, between 320x320 and 608x608)

All 4 comments

The best way is to use 1088x1088 and 10 anchors (20 values) for training: 5 anchors copied unchanged from yolo-voc.cfg for small objects, and 5 anchors scaled by 2.4 for large objects. Then train on images (larger than 1000x1000) that contain both small and large objects.

For models already trained on 416x416, you can use 3 scaled anchors and 2 unchanged anchors for detection on 1088x1088.
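If it helps, here is a sketch of how the 10-anchor line suggested above could be built (the base values are, as far as I know, the anchors from yolo-voc.cfg; recompute them for your own data if in doubt):

    # 5 anchors kept unchanged for small objects + the same 5 scaled by 2.4
    # for large objects, printed as a cfg-style "anchors = ..." line.
    base = [(1.3221, 1.73145), (3.19275, 4.00944), (5.05587, 8.09892),
            (9.47112, 4.84053), (11.2364, 10.0071)]   # yolo-voc.cfg anchors
    scale = 2.4
    anchors = base + [(w * scale, h * scale) for w, h in base]
    print("anchors = " + ", ".join(f"{v:g}" for wh in anchors for v in wh))

Note that with 10 anchor pairs, num= in the [region] layer must be set to 10 as well.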

Also you can try to train densenet201_yolo2.cfg with initial weights densenet201.300: https://github.com/AlexeyAB/darknet/issues/179#issuecomment-330047738
darknet.exe detector train data/obj.data densenet201_yolo.cfg densenet201.300

If I'm training a bigger network, I imagine it will take a lot more time to train?

I still don't get where the 2.4 factor comes from (I want to understand it so that I can adapt it to other network sizes).

I will look into using DenseNet with YOLO!

Thank you for your answers

  • If you train using yolo-voc.cfg with width=416 height=416 random=0 and then detect on 1088x1088, you should multiply the anchors by 2.6 = 1088/416

  • If you train using yolo-voc.cfg with random=1 and then detect on 1088x1088, you should multiply the anchors by ~2.4 = 1088/464 = 1088/((320+608)/2)
    (with random=1, the network size is randomly resized every 10 iterations during training, between 320x320 and 608x608)
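Written out, the arithmetic behind those two factors is simply the detection size divided by the (average) training size (a quick check, assuming the 320 to 608 range quoted above):

    detect       = 1088
    train_fixed  = 416                   # random=0: fixed 416x416
    train_random = (320 + 608) / 2       # random=1: average size seen -> 464
    print(detect / train_fixed)          # 2.615...  -> the "2.6" factor
    print(detect / train_random)         # 2.344...  -> the "~2.4" factor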

Maybe it is actually (288+576)/2, and the average is different: 288 = int(416/1.4/32)*32, 576 = int(416*1.4/32)*32.
Please let me know if I am wrong.
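A quick check of both candidate resize ranges and the multipliers they would imply for detection at 1088 (which range darknet actually uses may depend on the version, so treat the 288..576 variant as this comment's hypothesis):

    lo1, hi1 = 320, 608                  # range quoted earlier in the thread
    lo2 = int(416 / 1.4 / 32) * 32       # 288
    hi2 = int(416 * 1.4 / 32) * 32       # 576
    print(lo2, hi2)                      # 288 576
    print(1088 / ((lo1 + hi1) / 2))      # ~2.34  (average 464)
    print(1088 / ((lo2 + hi2) / 2))      # ~2.52  (average 432)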
