Q1. I see that there are many hyper-parameters related to data augmentation in the config file, for example saturation=1.5, exposure=1.5, hue=.1, jitter=0.3, scales=.1,.1, and so on. But I don't know exactly what these values (1.5, .1, 0.3, .1,.1) mean. Could you explain them, or point me to a file/description that explains them?
Q2. I don't know whether the training dataset is doubled, tripled, or increased by 20% through data augmentation. Also, how do I set the ratio of crop, flip, and the other methods? For example, if I have 6 training images (img1, img2, img3, img4, img5, img6), decide to double the training dataset through data augmentation, and set crop=0.5 and flip=0.5, then I would get 12 images (img1, img2, img3, img4, img5, img6, crop_img1, crop_img2, crop_img3, flip_img4, flip_img5, flip_img6). In this case, I just want to know where I set the ratio of crop and flip (crop=0.5, flip=0.5).
If you don't understand my question, please let me know! Thanks!
- If `saturation=1.5`, then saturation will be changed: `saturation = init_value * rand(1/1.5, 1.5)`
- If `exposure=1.5`, then exposure will be changed: `exposure = init_value * rand(1/1.5, 1.5)`
- If `hue=0.1`, then hue will be changed: `hue = init_value + rand(-0.1, 0.1)`

Also about cfg-parameters: https://github.com/AlexeyAB/darknet/issues/279#issuecomment-347002399

- `jitter` is used instead of `crop`; data augmentation will generate an infinite number of augmented (changed) images
- `flip` can be 0 or 1: if `flip=1` (the default), horizontal flipping will be applied randomly; if `flip=0`, it won't be used
- `jitter` can be in [0.0, 1.0] - how it works: https://github.com/AlexeyAB/darknet/blob/5a2efd5e5327c56a362442dce70bb3e46201cb89/src/data.c#L679-L697
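To illustrate what those formulas do, here is a minimal standalone C sketch of this style of HSV randomization. The helper names `rand_uniform` and `rand_scale` mirror darknet's utilities, but the code below is a simplified illustration, not the exact implementation in `src/image.c`/`src/utils.c`:

```c
#include <stdlib.h>

/* Uniform random float in [lo, hi]. */
static float rand_uniform(float lo, float hi)
{
    return lo + (hi - lo) * ((float)rand() / (float)RAND_MAX);
}

/* For saturation/exposure: pick a multiplier in [1/s, s], so the value
   is scaled up or down with equal probability. */
static float rand_scale(float s)
{
    float scale = rand_uniform(1.0f, s);
    return (rand() % 2) ? scale : 1.0f / scale;
}

/* Example with the cfg values saturation=1.5, exposure=1.5, hue=0.1:
   the S and V channels of the HSV image are multiplied by dsat/dexp,
   and dhue is added to the H channel. */
void random_distort_params(float saturation, float exposure, float hue,
                           float *dsat, float *dexp, float *dhue)
{
    *dsat = rand_scale(saturation);   /* in [1/1.5, 1.5] */
    *dexp = rand_scale(exposure);     /* in [1/1.5, 1.5] */
    *dhue = rand_uniform(-hue, hue);  /* in [-0.1, 0.1]  */
}
```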
Thanks for your answer! I have two other questions.
Q1. Is there a file that explains (or describes) the training hyper-parameters or the data augmentation hyper-parameters?
Q2. The TensorFlow object detection API supports setting the probability of data augmentation; for example, "the probability of flipping the image is 50%."
Can I set the probability of flipping or of jitter?
There is no document with an explanation of the hyper-parameters.
If flip=0, the probability of flipping is 0%; if flip=1, the probability of flipping is 50%.
The probability of jitter is 100%; you can only change the maximum amount of random coordinate change used for cropping the image, according to this code: https://github.com/AlexeyAB/darknet/blob/5a2efd5e5327c56a362442dce70bb3e46201cb89/src/data.c#L679-L697
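As a rough sketch of what `jitter` controls, following the pattern in the linked `data.c` (variable names and border handling are simplified here, so this is an illustration rather than the exact darknet code):

```c
#include <stdlib.h>

static float rand_uniform(float lo, float hi)
{
    return lo + (hi - lo) * ((float)rand() / (float)RAND_MAX);
}

/* jitter in [0.0, 1.0]: each crop border may move by up to
   jitter * (original width or height) in either direction. */
void random_crop_region(int ow, int oh, float jitter,
                        int *pleft, int *ptop, int *swidth, int *sheight)
{
    int dw = (int)(ow * jitter);
    int dh = (int)(oh * jitter);

    *pleft     = (int)rand_uniform((float)-dw, (float)dw);
    int pright = (int)rand_uniform((float)-dw, (float)dw);
    *ptop      = (int)rand_uniform((float)-dh, (float)dh);
    int pbot   = (int)rand_uniform((float)-dh, (float)dh);

    /* The region (pleft, ptop, swidth, sheight) is cropped/padded out of
       the original image and then resized to the network input size. */
    *swidth  = ow - *pleft - pright;
    *sheight = oh - *ptop  - pbot;
}
```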
Q1. I can see "flip" in data.c, but not in the config file (yolov3.cfg). Even though there is no flip hyper-parameter in yolov3.cfg, is flip data augmentation still applied through the data.c code above?
Q2. I don't know what scales=.1,.1 means. Is it a scale range of 0.1~0.1?
Hi @AlexeyAB ,
Can I know how many times each image is augmented for a given set of augmentation parameters?
For example, I would like to understand approximately: if my training data is, say, 1000 images, how many times do these 1000 images get augmented when the batch size is set to 64 and the number of iterations I run is 1000?
Also, does one image in a batch get augmented with only one type at a time, or can it be augmented with a combination of types at once (the types being color, flip, hue, crop, saturation, etc.)?
And what sampling strategy is followed to pick the batch of 64 images from the 1000 images in the above example - is it sampling with replacement or sampling without replacement? I am assuming the latter, because that would guarantee a walk through all images?
@kmsravindra Hi,
1. `batch*iterations = 64*1000 = 64 000` random images will be loaded in total. So each of your 1000 images will be loaded approximately 64 times, and therefore each of your images will be randomly augmented approximately 64 times.
2. Each image is always augmented using all augmentations together: color, flip, hue, crop, saturation, etc.
3. Sampling is just random; there is no guarantee of a walk through all images.
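In other words, batch selection behaves like uniform sampling with replacement. A minimal sketch of that idea (an illustration only; darknet's real loader in `data.c` does its own random selection):

```c
#include <stdio.h>
#include <stdlib.h>

/* Pick `batch` image indices uniformly at random, with replacement.
   Over many iterations each image is expected to appear about
   batch * iterations / N times, but nothing guarantees that every
   image is seen. */
static void pick_batch(int N, int batch, int *indices)
{
    for (int i = 0; i < batch; ++i)
        indices[i] = rand() % N;   /* the same image may be picked twice */
}

int main(void)
{
    int indices[64];
    pick_batch(1000, 64, indices);   /* e.g. 1000 images, batch=64 */
    printf("first picked index: %d\n", indices[0]);
    return 0;
}
```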
@89douner
flip=1 by default, even if it is absent from the cfg file.
scales=.1,.1 - the multipliers by which the learning_rate will be multiplied when the iteration number reaches the values in steps= (a parameter in the cfg file).
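For example, this is how the `policy=steps` schedule behaves with `steps=` and `scales=` (a simplified sketch of the rule darknet applies internally, not its exact code):

```c
/* Learning rate under policy=steps: each time the current iteration
   passes one of the step values, the base learning_rate is multiplied
   by the corresponding scale. */
float steps_learning_rate(float base_lr, int iteration,
                          const int *steps, const float *scales, int n)
{
    float lr = base_lr;
    for (int i = 0; i < n; ++i) {
        if (iteration >= steps[i]) lr *= scales[i];
    }
    return lr;
}

/* With the usual yolov3.cfg values learning_rate=0.001,
   steps=400000,450000 and scales=.1,.1:
     iteration 100000 -> 0.001
     iteration 420000 -> 0.0001
     iteration 460000 -> 0.00001 */
```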
Hi @AlexeyAB
From your answer to the first question, can I say that one image is augmented 64 times (according to the batch size)?
To be clear, with a batch size of 64 and, say, subdivisions of 8:
1) 64 images from the training set are loaded and 8 images are passed to the GPU at a time - where does the data augmentation happen?
2) Are these 8 images in a subdivision augmented and then sent, which would mean the network is trained on 8*64 augmented images once the next batch is sent?
I really need your reply, please reply ASAP.
@prateekgupta891
1. 64 images are loaded, then all 64 images are augmented, then in a loop 8 batches of 8 images each are sent to the GPU for processing.
2. The model is trained on 64 images per iteration.
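Schematically, one iteration looks like the sketch below. The helper functions here (`load_random_images`, `augment_images`, `forward_backward_on_gpu`, `update_weights`) are hypothetical stand-ins for darknet's loader and trainer in `data.c`/`network.c`; the point is only the ordering of the steps:

```c
#include <stdio.h>

/* Hypothetical stand-ins for darknet's loader/augmenter/trainer. */
static void load_random_images(int n)      { printf("load %d random images\n", n); }
static void augment_images(int n)          { printf("augment all %d images\n", n); }
static void forward_backward_on_gpu(int n) { printf("process %d images on GPU\n", n); }
static void update_weights(void)           { printf("update weights once\n"); }

/* One iteration with batch=64 and subdivisions=8: every image in the
   batch is loaded and augmented once, the GPU processes 8 images at a
   time, and the weights are updated once per iteration (per 64 images). */
int main(void)
{
    const int batch = 64, subdivisions = 8;
    const int mini_batch = batch / subdivisions;   /* 8 images per GPU pass */

    load_random_images(batch);
    augment_images(batch);
    for (int s = 0; s < subdivisions; ++s)
        forward_backward_on_gpu(mini_batch);
    update_weights();
    return 0;
}
```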
So, after being loaded, how many times does 1 image (from the training set) get augmented? Only 1 time, or a number equal to the batch size?
1 time only
I am a little bit confused here. You previously said that "each of your images will be randomly augmented approximately 64 times."
However, you recently said "1 time only."
So which one is correct?
@Ujang24 Hi,
What he was trying to say is that a batch of images is loaded according to your batch size (which here is 64, and all the images are different); they are then augmented and passed to the GPU in sub-batches of 8 (subdivisions is 8, so the GPU is loaded 8 times per batch).
So, within one iteration, each image is augmented only once (no copies are created); the earlier answer just meant that 64 images were loaded in the batch and all of them were augmented. Over many iterations the same image is loaded again and again, receiving a different random augmentation each time, which is where the "approximately 64 times" comes from.
Hope that clears it up!
Yes, I can understand clearly what you've said. Thank you.
By the way, could you please also clarify what he meant by "data augmentation will generate an infinite number of augmented (changed) images"? That was the answer to @89douner's Q2: "I don't know whether the training dataset is doubled, tripled, or increased by 20% through data augmentation. Also, how do I set the ratio of crop, flip, and other methods?"
https://github.com/AlexeyAB/darknet/issues/1842#issuecomment-433918329
Thanks
Look up "online data augmentation", because that is what it is doing!
In every epoch, due to the augmentations, the whole set looks like a new one (hence "infinite").
There is a cfg file for the network definition; that is where you need to set the values.
The AlexeyAB repo helps a lot:
https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
Also refer to the last 10 lines of this .cfg file:
https://github.com/AlexeyAB/darknet/blob/3d2d0a7c98dbc8923d9ff705b81ff4f7940ea6ff/cfg/yolov3.cfg#L17