Hi @AlexeyAB
My testing and training image size is 320 x 240 px. Because of the limited computing power of the processor (Atom E3845 - quad core - 1.91 GHz), I have to reduce the network size to 160 x 160 to reduce the detection time. I use the tiny-yolo configuration for my network; would this affect the accuracy of the trained model?
Thank you so much!!!
I am new to YOLO. If you need more information about this question, please leave a comment.
Thank you so much
Hi @AlexeyAB !!!
Could you please show me how to reduce the detection time while maintaining the accuracy of the model?
Thank you so much!!!
@trannhutle Hi,
You can use width=320 height=224 in yolov3-tiny.cfg to achieve high speed without an accuracy drop.
If you use random=1 in the cfg-file, then you should use only this repository for training; any repository can be used for detection.
If you use width=160 height=160, it will lead to a slight loss of accuracy.
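For reference, a minimal sketch of where these parameters live in the cfg-file (the surrounding values are illustrative, not a complete config):

```
[net]
# network resolution; both values must be multiples of 32
width=320
height=224

...

[yolo]
# random=1 enables multi-scale training: the network is randomly
# resized every few iterations (this is why training then requires
# this repository)
random=1
```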
@AlexeyAB Hi Alexey,
Thank you so much for your answer. It increased the accuracy of the model so much!!!
Because of the limitations of the processor, the largest network resolution I could set for detection is width=192 and height=192. Could you give me some advice on configuring training and detection without a drop in detection accuracy?
Does the image resolution for training have to be bigger than, or the same as, the network resolution?
Do we have to use the same image resolution for both training and testing?
Thank you so much for your help!!!
@AlexeyAB Hi Alexey,
To speed up recognition, when I build libdarknet.so I set AVX=1 in the Makefile. It does not work; do you know how to fix it?
Thank you so much !
@AlexeyAB Hi Alexey,
I used your modified cfg file (tiny model with 3 yolo layers: https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-tiny_3l.cfg). The result is amazing, but it takes over 5 seconds to detect objects. Could you please show me how to change the cfg file to reduce the computation time without losing accuracy? Thank you so much!!!! Alexey!!!
@trannhutle Hi,
To speed up detection on the CPU, set OPENMP=1, or OPENMP=1 AVX=1, in the Makefile.
Try to train with width=192 and height=160.
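For example, the relevant flags at the top of the Makefile look roughly like this (the exact list of flags may differ between versions of the repository):

```
GPU=0
CUDNN=0
OPENCV=1
AVX=1      # AVX vector instructions; only on CPUs that support AVX
OPENMP=1   # parallelize across CPU cores
LIBSO=1    # build the shared library
```

After changing the flags, rebuild with make clean && make.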
To speed up recognition, when I build libdarknet.so I set AVX=1 in the Makefile. It does not work; do you know how to fix it?
Can you show a screenshot?
What CPU do you use?
@AlexeyAB Hi Alexey,
I use width=224 and height=192 for detection.
There is no error when I build with OPENMP=1. With OPENMP=1 AVX=1 the .so lib also builds without errors; however, when I initialize the network, it shows:
Try to load cfg: ./config/cfg/test_so.cfg, weights: ./config/weights/test_so.weights, clear = 0
Illegal instruction
This is my CPU: Atom E3845 - quad core - 1.91 GHz.
Thank you so much!!!
Hi @AlexeyAB ,
I do not actually understand the meaning of network resolution; could you please point me to some documentation about it? Thank you so much!!!
@trannhutle Hi,
width= and height= in the cfg-file are the network resolution.
The Atom E3845 doesn't support AVX/AVX2, since it is an old CPU: https://ark.intel.com/content/www/ru/ru/ark/products/78475/intel-atom-processor-e3845-2m-cache-1-91-ghz.html
So you should compile with OPENMP=1 AVX=0.
Hi @AlexeyAB ,
I have changed the configuration and it works really well.
I have another question: does the background of the training images (everything outside the bounding boxes) affect the learning of YOLO? Or is learning affected only by the region inside the bounding box?
Regarding overexposed and underexposed detection images: how can we train the model (including how we capture the images) to deal with overexposure and underexposure?
What if the network just learns objects by color, and several classes have the same color (apple, cucumber, avocado, green capsicum, ...)? How can we deal with that kind of problem?
Thank you so much for your strong support!!!
Hi @AlexeyAB ,
Why is it that when I train Tiny Yolo for 4 objects, with around 160 images per object, the accuracy is very low, while training with the same configuration for 14 objects works better?
What factors affect the trained model?
Hi @AlexeyAB,
Regarding your comment in https://github.com/AlexeyAB/darknet/issues/3001#issuecomment-485773915: does the background affect training even though it does not include the objects?
I have another question: does the background of the training images (everything outside the bounding boxes) affect the learning of YOLO? Or is learning affected only by the region inside the bounding box?
The background of the images does affect the learning of Yolo.
Regarding overexposed and underexposed detection images: how can we train the model (including how we capture the images) to deal with overexposure and underexposure?
Use data augmentation: set exposure=3.0 in the cfg: https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-%5Bnet%5D-section
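For example, the color-augmentation parameters in the [net] section could look like this (exposure=3.0 is the value suggested above; the other values are only illustrative defaults):

```
[net]
# each parameter defines a random range applied to training images
saturation = 1.5   # random saturation change up to 1.5x
exposure = 3.0     # random brightness change up to 3x, in both directions
hue = .1           # random hue shift
```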
What if the network just learns objects by color, and several classes have the same color (apple, cucumber, avocado, green capsicum, ...)? How can we deal with that kind of problem?
What is the problem?
Regarding your comment in #3001 (comment): does the background affect training even though it does not include the objects?
Yes.
Hi @AlexeyAB,
What I would like to do next is capture background images, and crop the objects at different angles and locations. Next I would paste those cropped objects onto the different backgrounds. Does that help improve the accuracy of the trained network?
Thanks for your quick response!!!
@AlexeyAB,
Although I know that the color reflected from the background affects the object, does pasting in the cropped objects improve training and detection?
Someone says it would not help the network learn more features of the object. Could you please give me your thoughts on that?
Thank you so much Alexey!
Although I know that the color reflected from the background affects the object, does pasting in the cropped objects improve training and detection?
No (in this case).
Next I would paste those cropped objects onto the different backgrounds. Does that help improve the accuracy of the trained network?
It can improve accuracy.
Although I know that the color reflected from the background affects the object, does pasting in the cropped objects improve training and detection?
No (in this case).
By "in this case", do you mean that it increases the training and detection time, or something else? I don't quite understand.
Thank you so much Alexey!!!
Hi @AlexeyAB ,
About this function in image.c:
void draw_detections(image im, int num, float thresh, box *boxes, float **probs, char **names, image **alphabet, int classes)
How can I use it in Python? Right now, when I get the detection result, the bounding boxes I draw look bad.
If it can be used from Python, how do I call it? What parameters do I have to pass?
Thank you so much @AlexeyAB
@trannhutle
Use this in Python: https://github.com/AlexeyAB/darknet/blob/c9129c207823a96f0a1b3a840883a6c510073347/darknet_video.py#L18-L33
Or this: https://github.com/AlexeyAB/darknet/blob/c9129c207823a96f0a1b3a840883a6c510073347/darknet.py#L413-L424
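If it helps, here is a minimal Python sketch of drawing boxes with OpenCV, assuming detections in the (label, confidence, (center_x, center_y, width, height)) pixel format that detect() in darknet.py returns; the function name here is illustrative, not from the repo:

```python
import cv2

def draw_boxes(img, detections):
    # detections: list of (label, confidence, (cx, cy, w, h)) in pixels,
    # with (cx, cy) the box center
    for label, confidence, (cx, cy, w, h) in detections:
        if isinstance(label, bytes):   # darknet.py may return bytes
            label = label.decode()
        left, top = int(cx - w / 2), int(cy - h / 2)
        right, bottom = int(cx + w / 2), int(cy + h / 2)
        cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0), 2)
        cv2.putText(img, "{}: {:.2f}".format(label, float(confidence)),
                    (left, top - 5), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, (0, 255, 0), 1)
    return img
```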
Next I would paste those cropped objects onto the different backgrounds. Does that help improve the accuracy of the trained network?
It can improve accuracy.
@AlexeyAB
So cropping the positive rectangle and putting it randomly on a different background does not hurt accuracy?
There will be strong borders, and the region inside the box will be totally different from the outside. It would allow us to reduce labeling errors, but I am not sure whether it is beneficial.
What if there are many annotations? Or what if I leave some padding inside the box before moving it to a new background?
For example, we use a pseudo-labeler to detect objects and put them on a random background, or on their own cleaned background, and there are claims in the team that this hurts accuracy.
@isgursoy
Can you show examples?
Cropped objects that are inserted into another image increase accuracy - this is known as CutMix: https://arxiv.org/pdf/1905.04899v2.pdf
Also read: https://github.com/AlexeyAB/darknet/issues/4264
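As an illustration only (not code from this repository), a minimal Python sketch of this kind of copy-paste augmentation, pasting a cropped object onto a background and producing the corresponding YOLO-format label; all names are hypothetical:

```python
import random

def paste_object(background, obj_crop):
    """Paste an object crop onto a background at a random position.

    background, obj_crop: HxWx3 uint8 numpy arrays (e.g. from cv2.imread).
    Returns the composited image and a YOLO-style (cx, cy, w, h) label
    normalized to [0, 1].
    """
    bh, bw = background.shape[:2]
    oh, ow = obj_crop.shape[:2]
    x = random.randint(0, bw - ow)  # assumes the crop fits the background
    y = random.randint(0, bh - oh)
    out = background.copy()
    out[y:y + oh, x:x + ow] = obj_crop
    # YOLO labels: box center and size relative to the image size
    label = ((x + ow / 2) / bw, (y + oh / 2) / bh, ow / bw, oh / bh)
    return out, label
```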

We will be back with examples from our case in a few hours. Thanks for your time.
In addition to isgursoy's post:
Does putting a cropped object onto a different background improve the model? By cropping an object we mean taking the object out of its original background by its bounding box; we don't mean a technique like CutMix. In the case of a human-detection problem, we mean cropping the entire human and putting it onto a different background. My question is about three cases:

1) Does it improve the model to put the cropped human onto a completely different background?

2) We automatically detected humans and labelled them in a pseudo way. Then we cropped them and put the detected boxes back onto a specified generic background that is slightly different from the original one. Does this affect accuracy?

3) Original image sizes may differ from the network size. For example, the image size can be 512x512 (square) while the network size is 416x416 (square), so they are proportional. What if the image size is rectangular and the network size is square, or vice versa? Does that affect accuracy?
@ekarabulut
If we believe the results of the article https://arxiv.org/pdf/1905.04899v2.pdf , yes, it increases accuracy.
Yes, it increases accuracy.
If the network size is 416x416 and the image size was 640x480 during both training and detection, then this is normal.
It's bad when objects have different aspect ratios during training and detection after the image is resized to the network size; for example, a 1000x100 training image but a 100x1000 detection image.
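To make the resize question concrete, a small Python sketch of the two common strategies (stretching to the network size, which distorts the aspect ratio, versus letterboxing, which preserves it by padding); this is an illustration, not the repository's code:

```python
import cv2
import numpy as np

def stretch_resize(img, net_w, net_h):
    # stretches to the network size; distorts the aspect ratio
    return cv2.resize(img, (net_w, net_h))

def letterbox_resize(img, net_w, net_h, pad_value=127):
    # scales to fit, then pads; preserves the aspect ratio
    h, w = img.shape[:2]
    scale = min(net_w / float(w), net_h / float(h))
    new_w, new_h = int(w * scale), int(h * scale)
    resized = cv2.resize(img, (new_w, new_h))
    canvas = np.full((net_h, net_w, 3), pad_value, dtype=np.uint8)
    top, left = (net_h - new_h) // 2, (net_w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas
```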
In addition to @ekarabulut 's post:
1-2) @AlexeyAB The strong gradient makes me think: might the model learn the borders and use that trick? In my opinion, a small padding around the positive box would make me feel better.
3) Our images come in varying sizes and many aspect ratios.
@AlexeyAB First off, thanks for the quick reply.
In CutMix, a part of a bounding box (e.g. a human's leg) is inserted into another bounding box. In example (1) above, the whole bounding box is put onto another background (i.e. the context around the box is removed or replaced). Is your comment still valid for this situation?
@ekarabulut
It depends on your task.
In general it improves accuracy, since any added variety improves accuracy.
But to be more precise, for your training dataset:
https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
For each object that you want to detect, there must be at least 1 similar object in the training dataset with about the same shape, side of the object, relative size, angle of rotation, tilt, and illumination. So it is desirable that your training dataset include images with objects at different scales, rotations, lightings, from different sides, and on different backgrounds. You should preferably have 2000 different images per class or more, and you should train for 2000*classes iterations or more.
@isgursoy @ekarabulut
1-2) @AlexeyAB The strong gradient makes me think: might the model learn the borders and use that trick? In my opinion, a small padding around the positive box would make me feel better.
3) Our images come in varying sizes and many aspect ratios.
Yes, a model can simply be overfitted to the boundaries (the strong gradient); in the end it will just look for sharp boundaries instead of the objects themselves, which will degrade accuracy.
Maybe later I will add something like this with blending using pyramids (if OPENCV=1): https://docs.opencv.org/master/dc/dff/tutorial_py_pyramids.html
I added this issue: #4378
Regarding different aspect ratios, there are pros and cons to the different resize approaches: #232 (comment)
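For readers unfamiliar with the linked tutorial, a rough Python sketch of Laplacian-pyramid blending, which would soften the pasted object's borders; this only illustrates the idea and is not the repository's implementation:

```python
import cv2
import numpy as np

def pyramid_blend(obj_img, bg_img, mask, levels=4):
    """Blend obj_img into bg_img using Laplacian pyramids.

    obj_img, bg_img: same-size HxWx3 float32 images.
    mask: HxWx3 float32 in [0, 1]; 1 where the object should show.
    """
    # Gaussian pyramids of both images and the mask
    gp_o, gp_b, gp_m = [obj_img], [bg_img], [mask]
    for _ in range(levels):
        gp_o.append(cv2.pyrDown(gp_o[-1]))
        gp_b.append(cv2.pyrDown(gp_b[-1]))
        gp_m.append(cv2.pyrDown(gp_m[-1]))
    out = None
    for i in range(levels, -1, -1):
        if i == levels:
            # coarsest level: blend the Gaussian tops directly
            out = gp_m[i] * gp_o[i] + (1 - gp_m[i]) * gp_b[i]
        else:
            # blend the Laplacian (detail) levels, then reconstruct upward
            size = gp_o[i].shape[1::-1]  # (width, height)
            lap_o = gp_o[i] - cv2.pyrUp(gp_o[i + 1], dstsize=size)
            lap_b = gp_b[i] - cv2.pyrUp(gp_b[i + 1], dstsize=size)
            out = cv2.pyrUp(out, dstsize=size) + \
                  gp_m[i] * lap_o + (1 - gp_m[i]) * lap_b
    return np.clip(out, 0, 255).astype(np.uint8)
```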
What do you think about leaving some padding around the positive box when cropping and moving it? What would change in that case, in your opinion?
Thanks.