Darknet: How many objects are detectable in Yolo?

Created on 10 Sep 2019 · 4 Comments · Source: AlexeyAB/darknet

Hello,
I read a lot of comments about the process of detecting objects with Yolo, but I still have some questions concerning the detection process:
YOLOv3 divides the input image into 13x13 cells and checks each cell against the anchor boxes, right?
So when you have 9 anchors in your .cfg file, you can detect up to 13x13x9 objects?

Thanks for your help!

Knust

Knuust

Most helpful comment

I assume you're asking about YOLOv3. Then this diagram might be helpful.

source: https://www.cyberailab.com/home/a-closer-look-at-yolov3

In short, predictions are made at 3 scales: 1/8, 1/16 and 1/32 of the input dimensions. For an input size of 416x416, the grids are 52x52, 26x26 and 13x13. For every grid cell at each scale, 3 boxes are predicted, one per anchor. There are 9 anchor boxes given because there are 3 for every scale.

So the total number of predictions is (52x52 + 26x26 + 13x13)*3 = 10647. AlexeyAB's formula will give you the same number.
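As a quick check of that arithmetic, here is a minimal Python sketch. The function name `yolov3_prediction_count` and its parameters are made up for illustration and are not part of darknet; it assumes the default YOLOv3 layout of three output scales (strides 8, 16, 32) with 3 anchors each, and an input size divisible by 32. The result is the number of candidate boxes before confidence thresholding and non-maximum suppression, so it is an upper bound rather than the number of objects you will see in practice.

```python
def yolov3_prediction_count(width, height, strides=(8, 16, 32), anchors_per_scale=3):
    """Count candidate boxes for a YOLOv3-style head (hypothetical helper)."""
    total = 0
    for stride in strides:
        # Each scale has a (width / stride) x (height / stride) grid of cells.
        cells = (width // stride) * (height // stride)
        # Every cell predicts one box per anchor assigned to this scale.
        total += cells * anchors_per_scale
    return total


print(yolov3_prediction_count(416, 416))  # 10647
print(0.0615234375 * 416 * 416)           # 10647.0
```

Changing the network input size changes the bound: for 608x608 the same function gives 22743.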

gnefihs on 12 Sep 2019

👍2

All 4 comments

How many objects are detectable in Yolo?

https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

The global maximum number of objects that can be detected by YOLOv3 is 0.0615234375*(width*height).
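The constant is not explained in this comment. One way to recover it, assuming the default YOLOv3 layout of 3 predicted boxes per grid cell at output strides 8, 16 and 32, is:

```latex
\[
3\left(\frac{1}{8^{2}} + \frac{1}{16^{2}} + \frac{1}{32^{2}}\right)
  = 3 \cdot \frac{16 + 4 + 1}{1024}
  = \frac{63}{1024}
  = 0.0615234375
\]
```

Multiplying by width*height then gives the number of predicted boxes, for example 0.0615234375 * 416 * 416 = 10647.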

AlexeyAB on 10 Sep 2019

👍1

Ok, thank you very much!
But it seems I didn't understand the whole process at all.
Can you give me a hint on where to find a good explanation of the YOLOv3 network? I only find questions about specific topics.

Knuust on 10 Sep 2019

Thank you very much. That helped a lot.

Knuust on 19 Sep 2019
