Darknet: How small can I detect with YOLO?

Created on 12 Dec 2018 · 8 Comments · Source: AlexeyAB/darknet

I want to detect vehicle logos in road scenes, where the logo is small relative to the vehicle.
Is that feasible?
How small an object can I detect with YOLO?

richardminh

👍2

Most helpful comment

@anavc94

"And if it does, how did you get that criterion?"

Yes. If you want to know the best minimal size for recognition, use the criterion obj_size = 32*(img_size/416), i.e. your object should be 32x32 pixels or larger after the image is resized to the network size 416x416.

Why 32x32? It is the subsampling multiplier of the first [yolo] layer, the one with the most generalizing ability: 32 = pow(2,5), i.e. there are 5 subsampling layers with stride=2 between the input and the first [yolo] layer:

the first [yolo] layer: https://github.com/AlexeyAB/darknet/blob/6b4dca27d3dd3c4c7b41076596a32e7171638412/cfg/yolov3.cfg#L607

"Does it mean that to be able to recognize the object it must occupy a minimum area of 295x166 pixels in the image or frame?"

An object can be recognized if you can recognize it with your own eyes after the image is resized to the network size 416x416.
An object can be recognized with ~77.2% accuracy over 1000 classes (the backbone network is Darknet53, which has 77.2% top-1 accuracy on 1000-class ImageNet: https://pjreddie.com/darknet/imagenet/ ) if the object is 32x32 pixels or larger after the image is resized to the network size 416x416.

These are approximate theoretical criteria.
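
To make the rule of thumb concrete, here is a minimal Python sketch of the arithmetic. The function name and its parameters are hypothetical, and it assumes the image is stretched to the network size independently along each axis (no letterboxing), which is how the 295x166 figure in this thread was obtained.

```python
def min_object_size(img_w, img_h, net_w=416, net_h=416, stride=32):
    """Size in source-image pixels that shrinks to stride x stride pixels
    once the image is resized to net_w x net_h.

    Assumption: each axis is scaled independently (no letterboxing).
    """
    return stride * img_w / net_w, stride * img_h / net_h


# The 3840x2160 example discussed in the thread.
w, h = min_object_size(3840, 2160)
print(f"{w:.0f}x{h:.0f}")  # 295x166
```

An object smaller than this in the original frame ends up below 32x32 pixels at network resolution, so by the approximate criterion above it is less likely to be recognized reliably.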

AlexeyAB on 12 Dec 2018

👍5

All 8 comments

Hello @richardminh ,

I am interested in the same question. I hope someone can help us.
What is the minimum number of pixels an object must occupy to be detected with YOLO?

On the other hand, I have been able to detect some car logos with YOLOv2, even when training on a more generic dataset (not only logos on cars but also logos in advertising, etc.). However, I could not detect the logos when the car was far from the camera. I don't know whether training on images more specific to this application would succeed. Hope it helps.

Ana

anavc94 on 12 Dec 2018

@richardminh @anavc94

Resize your images to the network size (width=416 height=416 in your cfg-file). If you can recognize the objects with your own eyes at that size, then YOLO can do it too.

AlexeyAB on 12 Dec 2018

👍3 😄2

@AlexeyAB I have been reading in other issues (https://github.com/AlexeyAB/darknet/issues/1475) that there is a criterion for recognition:

obj_size = 32*(img_size/416) (in case we are using width=416 or height=416)

If I have a resolution of, for example, 3840x2160, my obj_size would be about (32*3840)/416 = 295. Doing the same for the vertical resolution gives (32*2160)/416 = 166, so obj_size = 295x166. Does it mean that to be able to recognize the object it must occupy a minimum area of 295x166 pixels in the image or frame? And if it does, how did you get that criterion?

Thanks for the reply,
Ana

anavc94 on 12 Dec 2018

@anavc94

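"And if it does, how did you get that criterion?"

Yes. If you want to know the best minimal size for recognition, use the criterion obj_size = 32*(img_size/416), i.e. your object should be 32x32 pixels or larger after the image is resized to the network size 416x416.

Why 32x32? It is the subsampling multiplier of the first [yolo] layer, the one with the most generalizing ability: 32 = pow(2,5), i.e. there are 5 subsampling layers with stride=2 between the input and the first [yolo] layer:

the first [yolo] layer: https://github.com/AlexeyAB/darknet/blob/6b4dca27d3dd3c4c7b41076596a32e7171638412/cfg/yolov3.cfg#L607

"Does it mean that to be able to recognize the object it must occupy a minimum area of 295x166 pixels in the image or frame?"

An object can be recognized if you can recognize it with your own eyes after the image is resized to the network size 416x416.
An object can be recognized with ~77.2% accuracy over 1000 classes (the backbone network is Darknet53, which has 77.2% top-1 accuracy on 1000-class ImageNet: https://pjreddie.com/darknet/imagenet/ ) if the object is 32x32 pixels or larger after the image is resized to the network size 416x416.

These are approximate theoretical criteria.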

AlexeyAB on 12 Dec 2018

👍5

Interesting! Thanks for the reply @AlexeyAB , I really appreciate it.

Ana

anavc94 on 13 Dec 2018

@AlexeyAB If one splits a large image into 4, 9, or 16 equal pieces and uses a network size of 416 (width=416 height=416 in the cfg-file), will darknet pick up more of the small details?

c2h2 on 24 Feb 2019

@c2h2

Yes.

But it is better to split the image into overlapping pieces, so that an object lying on the edge of one piece is fully visible in another piece and can be detected properly.
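
One way to picture the overlapping split is the sketch below. It is only an illustration, not part of darknet: the helper names `tile_starts` and `tiles` are hypothetical, it uses fixed-size 416x416 tiles rather than 4, 9 or 16 equal pieces, and the 64-pixel overlap is an assumed value that would normally be chosen at least as large as the biggest object you expect.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles of size `tile` that cover [0, length),
    with neighbouring tiles sharing at least `overlap` pixels."""
    if tile >= length:
        return [0]  # a single tile already covers this axis
    step = tile - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last tile sits flush with the far edge
    return starts


def tiles(img_w, img_h, tile=416, overlap=64):
    """Return (x, y, w, h) crop rectangles covering the whole image."""
    return [(x, y, min(tile, img_w), min(tile, img_h))
            for y in tile_starts(img_h, tile, overlap)
            for x in tile_starts(img_w, tile, overlap)]


crops = tiles(1000, 1000)
print(len(crops))  # 9
print(crops[0])    # (0, 0, 416, 416)
```

Each rectangle can then be cropped and passed to the detector separately; the `x` and `y` offsets are what you would add back to the detected boxes to place them in the full image.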

AlexeyAB on 24 Feb 2019

👍1

Thanks! I might need some algorithm to de-duplicate the boxes detected in the overlapping regions.
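
A common choice for this kind of de-duplication is greedy non-maximum suppression (NMS) over all boxes once they are expressed in full-image coordinates. The sketch below is a plain-Python illustration under assumed conventions: boxes are `(x1, y1, x2, y2)` tuples, detections are `(score, class_name, box)` tuples, and the names `to_global`, `iou` and `merge_detections` as well as the 0.5 threshold are hypothetical.

```python
def to_global(box, tile_x, tile_y):
    """Shift a tile-local (x1, y1, x2, y2) box into full-image coordinates."""
    x1, y1, x2, y2 = box
    return (x1 + tile_x, y1 + tile_y, x2 + tile_x, y2 + tile_y)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def merge_detections(dets, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box and drop same-class boxes
    that overlap an already kept box by iou_thresh or more."""
    kept = []
    for det in sorted(dets, key=lambda d: d[0], reverse=True):
        if all(det[1] != k[1] or iou(det[2], k[2]) < iou_thresh for k in kept):
            kept.append(det)
    return kept


dets = [
    (0.9, "logo", (100, 100, 140, 140)),
    (0.8, "logo", to_global((2, 1, 41, 39), 100, 100)),  # same logo seen in another tile
    (0.7, "logo", (300, 300, 340, 340)),
]
print(len(merge_detections(dets)))  # 2
```

A box that was clipped at a tile edge may overlap the full box from the neighbouring tile only weakly, so plain IoU-based NMS might not remove it; discarding detections that touch an interior tile border is one possible extra filter.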

c2h2 on 25 Feb 2019
