Hey guys, while working with the framework I wondered whether the data is shuffled anywhere before it is presented to the training process.
If so, I could skip my own shuffling step.
Thanks.
Do you mean https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/roi_data_layer/layer.py#L23 ?
_shuffle_roidb_inds is called every epoch.
Yes, looks like this code shuffles the ROIs because it is the first (data) layer. Thanks!
My follow-up question is: what exactly is a mini-batch? Normally a mini-batch consists of several images, with the batch size defined in the solver; in this framework it is set in the config ymls instead. There the mini-batch size is set to 1, so I assume that a mini-batch consists of 1 image and the corresponding ground-truth bounding boxes?
So each iteration would process a single image, and one epoch is the number of iterations needed to present the whole dataset to the net once. It would follow that with 10k images and 100k iterations we train for 10 epochs (with the database shuffled after each epoch). Am I right?
Thanks a lot!
@ednarb29
> This would follow that for 10k images and 100k iterations we will train 10 epochs (where after each epoch the database is shuffled)?
roi_data_layer will take cfg.TRAIN.IMS_PER_BATCH images every iteration. For _training in RPN within 4-stage alternating training_ and _end-to-end training_, cfg.TRAIN.IMS_PER_BATCH must be 1. For _training in rcnn classification sub-network_, cfg.TRAIN.IMS_PER_BATCH can be greater than 1.
With 10k images and cfg.TRAIN.IMS_PER_BATCH == 1, 100k iterations are indeed 10 epochs. When cfg.TRAIN.IMS_PER_BATCH == 2, 100k iterations correspond to 20 epochs. _shuffle_roidb_inds is called every epoch.
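The arithmetic can be checked with a few lines of Python. This is only a sketch: `epochs_seen` is a hypothetical helper, not something from the repository, and `num_images` is assumed to mean the number of valid roidb entries, which may differ from the raw image count.

```python
def epochs_seen(num_iterations, ims_per_batch, num_images):
    """Passes over the dataset after num_iterations (hypothetical helper)."""
    # Each iteration consumes ims_per_batch roidb entries.
    return num_iterations * ims_per_batch / float(num_images)


print(epochs_seen(100000, 1, 10000))  # 10.0
print(epochs_seen(100000, 2, 10000))  # 20.0
```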
Although there are only cfg.TRAIN.IMS_PER_BATCH images in an iteration, the number of anchor training examples and the number of rcnn sub-network training proposals can each be greater than 100; they are controlled by cfg.TRAIN.RPN_BATCHSIZE and cfg.TRAIN.BATCH_SIZE, respectively.
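To make these budgets concrete, here is a rough sketch. The numeric values are assumptions chosen for illustration (check your own config for the real ones), and both helper functions are hypothetical, not the repository's code. The foreground cap uses cfg.TRAIN.RPN_FG_FRACTION, the fraction of the anchor batch that may be foreground.

```python
# Assumed values for illustration only; read the real ones from your config.
RPN_BATCHSIZE = 256    # anchor examples per iteration for the RPN loss
RPN_FG_FRACTION = 0.5  # max fraction of those that may be foreground
BATCH_SIZE = 128       # proposals per iteration for the rcnn sub-network
IMS_PER_BATCH = 2      # images per iteration for the rcnn sub-network


def rpn_anchor_budget(rpn_batchsize, rpn_fg_fraction, num_fg_available):
    """Return (foreground count, max background count) for one iteration."""
    max_fg = int(rpn_fg_fraction * rpn_batchsize)
    num_fg = min(max_fg, num_fg_available)
    # Background anchors may fill whatever the foreground did not use.
    max_bg = rpn_batchsize - num_fg
    return num_fg, max_bg


def rois_per_image(batch_size, ims_per_batch):
    """Upper bound on proposals sampled from each image."""
    return batch_size // ims_per_batch


# Suppose only 30 anchors qualify as foreground in this image.
print(rpn_anchor_budget(RPN_BATCHSIZE, RPN_FG_FRACTION, 30))
print(rois_per_image(BATCH_SIZE, IMS_PER_BATCH))
```

So even with one or two images per iteration, each loss is computed over far more than one training example.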
Each image with at least one foreground or background RoI corresponds to one _valid_ roidb entry. Foreground RoIs are proposals (ground-truth bounding boxes, Selective Search proposals, RPN proposals, etc.) that overlap some ground-truth object bounding box with _Intersection-over-Union_ >= cfg.TRAIN.FG_THRESH. See is_valid for further information.
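The foreground test can be sketched as follows. This is a simplified illustration, not the repository's overlap code: boxes are assumed to be `(x1, y1, x2, y2)` tuples with continuous coordinates (no +1 pixel convention), the threshold of 0.5 is an assumed value, and `iou` and `is_foreground` are hypothetical names.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two (x1, y1, x2, y2) boxes."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)


def is_foreground(proposal, gt_boxes, fg_thresh=0.5):
    """True if the proposal overlaps any ground-truth box by >= fg_thresh."""
    return any(iou(proposal, gt) >= fg_thresh for gt in gt_boxes)


gt = [(0, 0, 10, 10)]
print(is_foreground((2, 0, 12, 10), gt))  # IoU = 80 / 120, above 0.5
print(is_foreground((5, 0, 15, 10), gt))  # IoU = 50 / 150, below 0.5
```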
During the training phase, roi_data_layer takes cfg.TRAIN.IMS_PER_BATCH images (i.e. cfg.TRAIN.IMS_PER_BATCH valid roidb entries) every iteration (see _get_next_minibatch_inds, which is called by _get_next_minibatch).
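The idea of walking through a shuffled index list and reshuffling once it is exhausted can be sketched like this. It is a standalone illustration using only the standard library, not the code in layer.py; `RoidbCursor` and its method names are hypothetical, and the exact boundary condition in the repository may differ.

```python
import random


class RoidbCursor(object):
    """Hands out roidb indices in shuffled order, reshuffling each epoch."""

    def __init__(self, num_entries, ims_per_batch, seed=0):
        self._num_entries = num_entries
        self._ims_per_batch = ims_per_batch
        self._rng = random.Random(seed)
        self._shuffle()

    def _shuffle(self):
        # New random permutation of all indices; restart at the beginning.
        self._perm = list(range(self._num_entries))
        self._rng.shuffle(self._perm)
        self._cur = 0

    def next_inds(self):
        # Not enough indices left for a full mini-batch: start a new epoch.
        if self._cur + self._ims_per_batch > self._num_entries:
            self._shuffle()
        inds = self._perm[self._cur:self._cur + self._ims_per_batch]
        self._cur += self._ims_per_batch
        return inds


cursor = RoidbCursor(num_entries=4, ims_per_batch=2)
for _ in range(4):  # two epochs of two iterations each
    print(cursor.next_inds())
```

Every index shows up exactly once per epoch, so separate shuffling on the user side adds nothing in this scheme.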
For RPN training, cfg.TRAIN.IMS_PER_BATCH can only be 1 (a limitation of the implementation of the anchor_target_layer and the proposal_layer). In each iteration the anchor_target_layer will generate at most cfg.TRAIN.RPN_FG_FRACTION * cfg.TRAIN.RPN_BATCHSIZE foreground anchor examples, and at most cfg.TRAIN.RPN_BATCHSIZE - #(foreground examples) background anchor examples, according to the proposals in the roidb. See anchor_target_layer.py for further information.
For training the rcnn sub-network, roi_data_layer samples at most cfg.TRAIN.BATCH_SIZE proposals as training examples from cfg.TRAIN.IMS_PER_BATCH roidbs every iteration. In 4-stage alternating training, cfg.TRAIN.IMS_PER_BATCH is 2 and cfg.TRAIN.BATCH_SIZE is 128, so roi_data_layer will sample at most 64 (128 / 2) proposals per _roidb_ in an iteration.
Thanks a lot !! =)