Py-faster-rcnn: Shuffled data

Created on 23 Jun 2016 · 4 Comments · Source: rbgirshick/py-faster-rcnn

Hey guys, while working with the framework, a question came up: is the data shuffled somewhere before it is presented to the training process?

If so, I could skip my own shuffling step.

Thanks.

ednarb29

Most helpful comment

@ednarb29

It would follow that for 10k images and 100k iterations we train for 10 epochs (where the database is shuffled after each epoch)?

roi_data_layer takes cfg.TRAIN.IMS_PER_BATCH images every iteration. For _RPN training within 4-stage alternating training_ and for _end-to-end training_, cfg.TRAIN.IMS_PER_BATCH must be 1. For _training the rcnn classification sub-network_, cfg.TRAIN.IMS_PER_BATCH can be greater than 1.
With 10k images, 100k iterations correspond to 10 epochs when cfg.TRAIN.IMS_PER_BATCH == 1 and to 20 epochs when cfg.TRAIN.IMS_PER_BATCH == 2. _shuffle_roidb_inds is called every epoch.
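The epoch arithmetic above can be written out as a small helper. This is only a sketch: `epochs_trained` is a made-up name, not something from the repository, and it assumes that every image in the dataset counts as a valid roidb entry.

```python
def epochs_trained(num_iterations, ims_per_batch, num_images):
    """Number of passes over the dataset.

    Assumes each iteration consumes exactly `ims_per_batch` images and that
    every one of the `num_images` images is a valid roidb entry.
    """
    return num_iterations * ims_per_batch / num_images


# 10k images, 100k iterations, one image per iteration
print(epochs_trained(100000, 1, 10000))  # 10.0
# same setup with two images per iteration
print(epochs_trained(100000, 2, 10000))  # 20.0
```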

Although each iteration uses only cfg.TRAIN.IMS_PER_BATCH images, the number of anchor training examples and the number of rcnn sub-network training proposals per iteration can be greater than 100; they are controlled by cfg.TRAIN.RPN_BATCHSIZE and cfg.TRAIN.BATCH_SIZE, respectively.

Each image with at least one foreground or background RoI corresponds to one _valid_ roidb entry. The foreground RoIs are the proposals (ground truth bounding boxes, Selective Search proposals, RPN proposals, etc.) that overlap some ground truth object bounding box by at least cfg.TRAIN.FG_THRESH (_Intersection-over-Union_ >= cfg.TRAIN.FG_THRESH). See is_valid for further information.
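To make the foreground criterion concrete, here is a minimal sketch of the IoU test. It is not the repository's is_valid: the helper names are hypothetical, boxes are plain `(x1, y1, x2, y2)` tuples with continuous coordinates, and the default threshold of 0.5 is an assumed value standing in for cfg.TRAIN.FG_THRESH.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_foreground(proposal, gt_boxes, fg_thresh=0.5):
    """True if the proposal overlaps any ground truth box by >= fg_thresh.

    fg_thresh plays the role of cfg.TRAIN.FG_THRESH; 0.5 is an assumption.
    """
    return any(iou(proposal, gt) >= fg_thresh for gt in gt_boxes)


gt = [(0, 0, 10, 8)]
print(is_foreground((0, 0, 10, 10), gt))   # IoU = 80 / 100 = 0.8
print(is_foreground((5, 0, 15, 10), gt))   # IoU = 40 / 140, below 0.5
```

A ground truth box used as its own proposal has IoU 1.0, so it always passes this test.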

During the training phase, roi_data_layer takes cfg.TRAIN.IMS_PER_BATCH images (i.e. cfg.TRAIN.IMS_PER_BATCH valid roidb entries) every iteration (see _get_next_minibatch_inds, which is called by _get_next_minibatch).
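The sketch below illustrates shuffling once per epoch and then handing out a fixed number of indices per iteration. It is a simplified stand-in, not the code in layer.py: the class name, the seed argument, and the use of Python's `random` module are all assumptions made for illustration.

```python
import random


class MinibatchIndexSampler:
    """Hands out `ims_per_batch` roidb indices per call, reshuffling every epoch."""

    def __init__(self, num_entries, ims_per_batch, seed=0):
        if ims_per_batch > num_entries:
            raise ValueError("ims_per_batch must not exceed num_entries")
        self._num_entries = num_entries
        self._ims_per_batch = ims_per_batch
        self._rng = random.Random(seed)
        self._shuffle()

    def _shuffle(self):
        # Draw a fresh random order of all indices and restart the cursor.
        self._perm = list(range(self._num_entries))
        self._rng.shuffle(self._perm)
        self._cur = 0

    def next_inds(self):
        # Start a new epoch when the remaining indices cannot fill a batch;
        # any leftover indices of the old epoch are skipped.
        if self._cur + self._ims_per_batch > self._num_entries:
            self._shuffle()
        inds = self._perm[self._cur:self._cur + self._ims_per_batch]
        self._cur += self._ims_per_batch
        return inds


sampler = MinibatchIndexSampler(num_entries=6, ims_per_batch=2)
first_epoch = [sampler.next_inds() for _ in range(3)]
print(first_epoch)  # three batches that together cover indices 0..5 once
```

Because the order is already random, a separate shuffling step before training is not needed in a scheme like this.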

For _RPN training_, cfg.TRAIN.IMS_PER_BATCH can only be 1 (a limitation of the implementations of anchor_target_layer and proposal_layer). In each iteration, anchor_target_layer generates at most cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE foreground anchor examples and at most cfg.TRAIN.RPN_BATCHSIZE - #(foreground examples) background anchor examples, according to the proposals in the roidb. See anchor_target_layer.py for further information.
For _training the rcnn classification sub-network_, roi_data_layer (in 4-stage alternating training; proposal_target_layer in end-to-end training) samples at most cfg.TRAIN.BATCH_SIZE proposals as training examples from cfg.TRAIN.IMS_PER_BATCH roidb entries every iteration. In 4-stage alternating training, cfg.TRAIN.IMS_PER_BATCH is 2 and cfg.TRAIN.BATCH_SIZE is 128, so roi_data_layer samples at most 64 (128 / 2) proposals per _roidb_ entry in an iteration.
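The two sampling budgets described above reduce to simple arithmetic, sketched here. The function names are hypothetical, and the defaults of 256 and 0.5 for the RPN batch size and foreground fraction are assumed values (only 128 and 2 are stated in this thread); check your own config before relying on them.

```python
def rpn_anchor_budget(num_fg_available, num_bg_available,
                      rpn_batchsize=256, rpn_fg_fraction=0.5):
    """Upper bounds on sampled foreground/background anchors for one image.

    rpn_batchsize and rpn_fg_fraction stand in for cfg.TRAIN.RPN_BATCHSIZE
    and cfg.TRAIN.RPN_FG_FRACTION; the default values are assumptions.
    """
    max_fg = int(rpn_fg_fraction * rpn_batchsize)
    num_fg = min(max_fg, num_fg_available)
    # Background anchors fill whatever the foreground anchors left over.
    num_bg = min(rpn_batchsize - num_fg, num_bg_available)
    return num_fg, num_bg


def rois_per_image(batch_size=128, ims_per_batch=2):
    """Maximum proposals sampled per image for the rcnn sub-network."""
    return batch_size // ims_per_batch


print(rpn_anchor_budget(20, 10000))   # (20, 236)
print(rpn_anchor_budget(500, 10000))  # (128, 128)
print(rois_per_image())               # 64
```

When an image has few foreground anchors, the background share grows, so the total can stay at the full batch size.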

manipopopo on 12 Jul 2016

👍8

All 4 comments

Do you mean https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/roi_data_layer/layer.py#L23 ?
_shuffle_roidb_inds is called every epoch.

manipopopo on 8 Jul 2016

Yes, looks like this code shuffles the ROIs because it is the first (data) layer. Thanks!

My follow-up question is now: what is a mini-batch? Normally a mini-batch consists of a number of images, depending on the batch size defined in the solver; in this framework it is defined in the config ymls instead. There the mini-batch size is set to 1, so I assume that a mini-batch consists of 1 image and the corresponding ground truth bounding boxes?

So each iteration would process a single image, and one epoch is the number of iterations in which the whole dataset is presented to the net once. It would follow that for 10k images and 100k iterations we train for 10 epochs (where the database is shuffled after each epoch)? Am I right?

Thanks a lot!

ednarb29 on 12 Jul 2016
