Hi,
I'm trying to train Mask R-CNN on COCO 2014 using a single GTX 1080ti.
I've modified e2e_mask_rcnn_R-101-FPN_1x.yaml to use a single GPU and set BATCH_SIZE_PER_IM: 128.
The training ETA is ~12 hours. Does that make sense?
Also, non-GPU RAM usage is almost 100% of my 16GB. How can I reduce it?
Thanks.
BATCH_SIZE_PER_IM is the number of RoIs per image. It is not the actual minibatch size, which is given by IMS_PER_BATCH. To train on 1 GPU, you don't need to change either of them unless you don't have enough memory. However, since the config was written for 8 GPUs, you need to multiply the number of iterations (and the schedule steps) by 8 and the learning rate by 1/8. See the note in https://github.com/facebookresearch/Detectron/blob/master/GETTING_STARTED.md#2-multi-gpu-training and our paper "Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour".
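As a rough illustration of the scaling described above, here is a minimal Python sketch. The helper name `scale_schedule` is hypothetical and not part of Detectron. The example inputs (BASE_LR 0.02, MAX_ITER 90000, STEPS [0, 60000, 80000]) are assumed to be the 8-GPU values in the config, so check them against your own YAML before using the result.

```python
def scale_schedule(base_lr, max_iter, steps, ref_gpus=8, target_gpus=1):
    """Apply the linear scaling rule when going from ref_gpus to target_gpus.

    With IMS_PER_BATCH unchanged, the effective minibatch shrinks by
    ref_gpus / target_gpus, so the learning rate shrinks by that factor
    and the iteration count and step boundaries grow by it.
    """
    return {
        "NUM_GPUS": target_gpus,
        "BASE_LR": base_lr * target_gpus / ref_gpus,
        "MAX_ITER": max_iter * ref_gpus // target_gpus,
        "STEPS": [s * ref_gpus // target_gpus for s in steps],
    }


# Assumed 8-GPU values; verify against the YAML you are actually using.
scaled = scale_schedule(base_lr=0.02, max_iter=90000, steps=[0, 60000, 80000])
for key, value in scaled.items():
    print(f"{key}: {value}")
```

The resulting values would go under SOLVER (BASE_LR, MAX_ITER, STEPS) and the top-level NUM_GPUS in the config, leaving BATCH_SIZE_PER_IM and IMS_PER_BATCH untouched.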
Isn't it possible to speed up inference by using a batch size of more than 1 image? I just want to extract bounding boxes during inference. Why is there no support for batch inference?