Hi
I want to maximize mAP in tiny YOLOv3 for a single-class detection model. In my first try I used 5 anchors per yolo layer, different for each layer, and got 64% mAP. In my second try I used the same 10 anchors in both yolo layers and got 56% mAP. In the second try I noticed that there is more than one bounding box over each detected object, and I suppose that is why the recall is low. What do you suggest? How many anchors? Different for each yolo layer, or the same?
Thank you
@nsantavas Hi,
I used 10 same anchors in both yolo layers and got 56% mAP.
What anchors= did you use in both layers?
What num= did you use in both layers?
What masks= did you use in both layers?
What do you suggest? How many anchors?
It depends on the IoU that you see when using calc_anchors.
Different for each yolo layer or the same?
You must set the same anchors= in both yolo-layers anyway.
The 2nd yolo layer should simply have mask= indices for the anchors that are smaller than 64x64.
I used these settings for both yolo layers:
anchors= 16,35, 30,42, 31,90, 55,60, 62,124, 110,96, 109,209, 244,151, 224,319, 439,388
num=10
masks=0,1,2,3,4,5,6,7,8,9
And for the first try I used:
for first yolo layer
mask = 9,10,11,12,13,14,15,16,17
anchors = 65, 98, 119, 87, 91,145, 110,247, 200,147, 197,313, 383,242, 312,410, 505,444
num=9
and for the second yolo layer
mask = 0,1,2,3,4,5,6,7,8
anchors = 13, 26, 14, 50, 23, 33, 40, 36, 29, 50, 25, 93, 43, 72, 72, 50, 43,145
num=9
Can you rename your cfg file to .txt and attach it?
1st yolo layer is for big objects.
2nd yolo layer is for small objects.
The 2nd yolo layer should use anchors that are smaller than 64 on one of the axes, for example, 59,119 or 37,58.
So rolfer2.txt is better than rolfer.txt.
If you use the same anchors (same masks) for both yolo layers, then they will be used for both small and big objects, which isn't good.
You will get many double detections and many false positives.
Also, you should use either
[yolo]
mask = 0,1,2,3,4,5,6,7,8
anchors = 65, 98, 119, 87, 91,145, 110,247, 200,147, 197,313, 383,242, 312,410, 505,444
classes=1
num=9
...
[yolo]
mask = 0,1,2,3,4,5,6,7,8
anchors = 13, 26, 14, 50, 23, 33, 40, 36, 29, 50, 25, 93, 43, 72, 72, 50, 43,145
classes=1
num=9
or
[yolo]
mask = 9,10,11,12,13,14,15,16,17
anchors = 13, 26, 14, 50, 23, 33, 40, 36, 29, 50, 25, 93, 43, 72, 72, 50, 43,145, 65, 98, 119, 87, 91,145, 110,247, 200,147, 197,313, 383,242, 312,410, 505,444
classes=1
num=18
...
[yolo]
mask = 0,1,2,3,4,5,6,7,8
anchors = 13, 26, 14, 50, 23, 33, 40, 36, 29, 50, 25, 93, 43, 72, 72, 50, 43,145, 65, 98, 119, 87, 91,145, 110,247, 200,147, 197,313, 383,242, 312,410, 505,444
classes=1
num=18
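The two variants above should select the same anchors per layer, because mask is a list of indices into anchors. A minimal sketch to check this, assuming that indexing behaviour; the helper names parse_anchors and select are made up for illustration and are not part of darknet:

```python
def parse_anchors(text):
    """Turn 'w,h, w,h, ...' from a cfg line into a list of (w, h) tuples."""
    values = [int(v) for v in text.split(",")]
    return list(zip(values[0::2], values[1::2]))


def select(anchors, mask):
    """Return the anchors picked by the mask indices, in mask order."""
    return [anchors[i] for i in mask]


BIG = "65, 98, 119, 87, 91,145, 110,247, 200,147, 197,313, 383,242, 312,410, 505,444"
SMALL = "13, 26, 14, 50, 23, 33, 40, 36, 29, 50, 25, 93, 43, 72, 72, 50, 43,145"

# Variant 1: each layer lists only its own 9 anchors, mask = 0..8, num=9
first_v1 = select(parse_anchors(BIG), range(9))
second_v1 = select(parse_anchors(SMALL), range(9))

# Variant 2: both layers list all 18 anchors, num=18
shared = parse_anchors(SMALL + ", " + BIG)
first_v2 = select(shared, range(9, 18))   # mask = 9..17
second_v2 = select(shared, range(9))      # mask = 0..8

print(first_v1 == first_v2, second_v1 == second_v2)
```

Either way, the 1st layer ends up with the big anchors and the 2nd layer with the small ones.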
Thank you
Hello @AlexeyAB
We read this issue and several questions came up.
The 2nd yolo layer should use anchors that are smaller than 64 on one of the axes, for example, 59,119 or 37,58.
Why is that? Does this limit apply only to yolo3 tiny, or also to yolo3? Does the size of the anchors depend on the resolution of the network?
If we have a yolo3 network with 320 x 320 resolution, should we consider these limits on anchor size (i.e. 64) per yolo output layer?
Previously, we calculated the anchors with Darknet and they didn't work better than the default anchors. For this reason, we are still trying to figure out which anchors, and how many, best fit our 1-class dataset.
Thank you so much
@tpereztorres Hi,
It is just an observation, for the network resolution width=608 height=608, that:
the 1st [yolo] layer, with subsampling 32x, can best detect objects larger than 64x64: https://github.com/AlexeyAB/darknet/blob/81f7fc2c7bbddb79b5c87b5ab88fedf98f2b9963/cfg/yolov3.cfg#L607
the 2nd [yolo] layer, with subsampling 16x, can best detect objects larger than 32x32: https://github.com/AlexeyAB/darknet/blob/81f7fc2c7bbddb79b5c87b5ab88fedf98f2b9963/cfg/yolov3.cfg#L693
the 3rd [yolo] layer, with subsampling 8x, can best detect objects larger than 16x16: https://github.com/AlexeyAB/darknet/blob/81f7fc2c7bbddb79b5c87b5ab88fedf98f2b9963/cfg/yolov3.cfg#L780
The same holds for yolov3-tiny.cfg, except that there are only 2 [yolo] layers instead of 3.
So the 1st [yolo] layer will detect objects larger than ~64x64, and the 2nd objects smaller than ~64x64.
(If you used the suggestion layers = -1, 11 and stride=4 for yolov3.cfg, https://github.com/AlexeyAB/darknet/blob/81f7fc2c7bbddb79b5c87b5ab88fedf98f2b9963/cfg/yolov3.cfg#L716-L720, then the 3rd layer, with subsampling 4x, can best detect objects larger than 8x8.)
Here a size such as 4x4 means the size of the object after the image is resized to the network size 608x608.
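To make these numbers easier to compare, here is a small sketch that tabulates them for width=608 height=608. In every row the object size is twice the subsampling factor; that pattern is only read off the numbers above, not a documented rule. The helper size_in_network is hypothetical and assumes a plain resize without letterboxing:

```python
NET_SIZE = 608  # network width = height from the comment above

# Subsampling factor of a [yolo] layer -> object size it detects best,
# exactly as listed in the comment above.
OBSERVED = {32: 64, 16: 32, 8: 16, 4: 8}


def size_in_network(obj_px, img_px, net_px=NET_SIZE):
    """Object size in pixels after an image side of img_px is resized to net_px.

    Assumption: plain resize, no letterboxing.
    """
    return obj_px * net_px / img_px


for stride, min_size in OBSERVED.items():
    cells = NET_SIZE // stride  # grid cells per side at this layer
    print(f"subsampling {stride}x: {cells}x{cells} grid, "
          f"best for objects larger than {min_size}x{min_size}")

# Hypothetical example: an object 100 px wide in a 1920 px wide image
print(size_in_network(100, 1920))
```

Anchors and object sizes should be compared with these limits only after scaling to the network size, which is what the helper does.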
So if you have anchors
12.9961,20.4277, 115.6670,108.2655, 101.7315,150.9405, 165.9326,145.0915, 152.0592,206.8517, 235.7279,213.8195
then it is better to use the following in yolov3-tiny:
[yolo]
mask = 1,2,3,4,5
anchors=12.9961,20.4277, 115.6670,108.2655, 101.7315,150.9405, 165.9326,145.0915, 152.0592,206.8517, 235.7279,213.8195
...
[yolo]
mask = 0
anchors=12.9961,20.4277, 115.6670,108.2655, 101.7315,150.9405, 165.9326,145.0915, 152.0592,206.8517, 235.7279,213.8195
Or manually change the 2nd and 3rd anchors and use:
[yolo]
mask = 3,4,5
anchors=12.9961,20.4277, 30,30, 50,50, 165.9326,145.0915, 152.0592,206.8517, 235.7279,213.8195
...
[yolo]
mask = 0,1,2
anchors=12.9961,20.4277, 30,30, 50,50, 165.9326,145.0915, 152.0592,206.8517, 235.7279,213.8195
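The split used in both configs above can be reproduced with a few lines. This is only a sketch of the rule of thumb from this thread (an anchor smaller than 64 on one of the axes goes to the 2nd layer); split_masks is a made-up helper, not a darknet function, and the 64 limit is the observation for 608x608:

```python
def split_masks(anchors, limit=64):
    """Return (mask_first_layer, mask_second_layer) for yolov3-tiny.

    An anchor goes to the 2nd (16x) layer when it is smaller than `limit`
    on at least one axis; all other anchors go to the 1st (32x) layer.
    """
    small = [i for i, (w, h) in enumerate(anchors) if min(w, h) < limit]
    big = [i for i, (w, h) in enumerate(anchors) if min(w, h) >= limit]
    return big, small


calculated = [(12.9961, 20.4277), (115.6670, 108.2655), (101.7315, 150.9405),
              (165.9326, 145.0915), (152.0592, 206.8517), (235.7279, 213.8195)]

# Same list with the 2nd and 3rd anchors replaced by hand, as suggested above
adjusted = [calculated[0], (30, 30), (50, 50)] + calculated[3:]

for name, anchors in (("calculated", calculated), ("adjusted", adjusted)):
    first, second = split_masks(anchors)
    print(name)
    print("  1st [yolo] mask =", ",".join(map(str, first)))
    print("  2nd [yolo] mask =", ",".join(map(str, second)))
```

With the calculated anchors only one anchor lands in the 2nd layer, which is why replacing two of them by hand gives a more even split.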
Hi @AlexeyAB,
I really don't get how we should set the number of anchors. For 9 anchors I get 80.86% average IoU, and the more I increase the number, the more the IoU increases. How should I choose the number?
What do you suggest? How many anchors?
It depends on the IoU that you see when using calc_anchors.
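For intuition on why the average IoU keeps rising with more anchors, here is a minimal sketch. It assumes the metric is the mean, over all boxes, of the best width/height IoU against any anchor; what calc_anchors actually reports may differ in detail. The boxes and anchors below are hypothetical:

```python
def wh_iou(box, anchor):
    """IoU of two boxes that share the same centre, each given as (w, h)."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def avg_best_iou(boxes, anchors):
    """Mean over boxes of the IoU with the best-matching anchor."""
    return sum(max(wh_iou(b, a) for a in anchors) for b in boxes) / len(boxes)


# Hypothetical ground-truth box sizes, already scaled to the network size
boxes = [(14, 22), (30, 41), (60, 118), (110, 100), (230, 310)]

few = [(30, 40), (110, 100)]
more = few + [(14, 22), (230, 310)]  # a superset of `few`

print(f"{len(few)} anchors:  {avg_best_iou(boxes, few):.3f}")
print(f"{len(more)} anchors: {avg_best_iou(boxes, more):.3f}")
```

Adding anchors to an existing set can never lower this number, so a rising average IoU on its own does not say when to stop. One possible reading of the advice above is to compare the gain per extra anchor rather than the absolute value.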