Hi,
I want to ask how TRAIN.SCALES and TRAIN.MAX_SIZE actually work during the training phase.
In config.py, these values are defined as __C.TRAIN.SCALES = (600, ) and __C.TRAIN.MAX_SIZE = 1000.
I read the comments in config.py, but I cannot figure out what effect they have during training.
Does the network resize every input image to the value in TRAIN.SCALES (e.g. resize an 800x800 image to 600x600)?
Also, does the network skip images whose size exceeds TRAIN.MAX_SIZE (e.g. ignore a 1024x1024 image, since 1024 > TRAIN.MAX_SIZE = 1000)?
Also, when I train RetinaNet on the COCO dataset as in the tutorial, the network configuration is printed as follows:
INFO net.py: 210: Printing model: retinanet
INFO net.py: 240: data : (2, 3, 896, 1280) => conv1 : (2, 64, 448, 640) ------- (op: Conv)
INFO net.py: 240: conv1 : (2, 64, 448, 640) => conv1 : (2, 64, 448, 640) ------- (op: AffineChannel)
INFO net.py: 240: conv1 : (2, 64, 448, 640) => conv1 : (2, 64, 448, 640) ------- (op: Relu)
INFO net.py: 240: conv1 : (2, 64, 448, 640) => pool1 : (2, 64, 224, 320) ------- (op: MaxPool)
INFO net.py: 240: pool1 : (2, 64, 224, 320) => res2_0_branch2a : (2, 64, 224, 320) ------- (op: Conv)
.....
Does anybody know why the data layer has shape (2, 3, 896, 1280)? I don't think any image in the COCO dataset has size (896, 1280).
TRAIN.SCALES is the target length of the shorter side of an image, and images are always resized with aspect ratio preserved. TRAIN.SCALES is expected to be the main factor controlling the size. TRAIN.MAX_SIZE caps the longer side of an image; it has no effect unless resizing the image according to TRAIN.SCALES would make the longer side bigger than this number. TRAIN.MAX_SIZE is mainly there to avoid running out of memory.
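For anyone who wants to check the numbers, here is a minimal sketch of the rule described above. The function names (compute_scale, resized_shape, pad_to_multiple) are made up for illustration and are not Detectron's API, and the rounding may differ by a pixel from what the actual resize call produces.

```python
import math


def compute_scale(height, width, target_size=600, max_size=1000):
    """Return the resize factor for an image (hypothetical helper)."""
    size_min = min(height, width)
    size_max = max(height, width)
    # Scale so that the shorter side matches target_size (TRAIN.SCALES).
    scale = float(target_size) / size_min
    # If the longer side would then exceed max_size (TRAIN.MAX_SIZE),
    # scale by the longer side instead. The image is never skipped.
    if round(scale * size_max) > max_size:
        scale = float(max_size) / size_max
    return scale


def resized_shape(height, width, target_size=600, max_size=1000):
    """Return the approximate (height, width) after resizing."""
    scale = compute_scale(height, width, target_size, max_size)
    return int(round(height * scale)), int(round(width * scale))


def pad_to_multiple(size, stride):
    """Round size up to the next multiple of stride (assumed padding rule)."""
    return int(math.ceil(size / float(stride)) * stride)


print(resized_shape(800, 800))                # (600, 600)
print(resized_shape(1024, 1024))              # (600, 600)
print(resized_shape(480, 640))                # (600, 800), smaller images are upscaled
print(resized_shape(2000, 2000, 1000, 1200))  # (1000, 1000)
```

Under this sketch, a 1024x1024 image is not ignored but resized to 600x600, and a 2000x2000 image with TRAIN.SCALES = 1000 and TRAIN.MAX_SIZE = 1200 comes out as 1000x1000. As for the (896, 1280) shape in the log, one plausible explanation (an assumption worth checking against the tutorial's YAML config) is that the RetinaNet config overrides the defaults to a scale of 800 and a max size of 1333, and that each batch is padded to a multiple of 128 for FPN. A 427x640 image would then be resized to about 800x1199, and pad_to_multiple(800, 128) and pad_to_multiple(1199, 128) give 896 and 1280.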
But does that mean it also scales the segmentation masks and bounding boxes?
Bounding boxes and masks are scaled accordingly.
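To illustrate what "scaled accordingly" could look like for boxes, here is a small sketch. It assumes boxes are stored as [x1, y1, x2, y2] in original-image pixels and are simply multiplied by the same factor used to resize the image; scale_boxes is a hypothetical helper, not a Detectron function, and masks are not covered here.

```python
def scale_boxes(boxes, scale):
    """Multiply every [x1, y1, x2, y2] coordinate by the image resize factor."""
    return [[coord * scale for coord in box] for box in boxes]


# An 800x800 image resized to 600x600 has a resize factor of 0.75.
boxes = [[100, 200, 300, 400]]
print(scale_boxes(boxes, 0.75))  # [[75.0, 150.0, 225.0, 300.0]]
```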
@KaimingHe If the image's smallest dimension is below the value in TRAIN.SCALES, will it be upscaled to that value, or is the image used at its original size?
@nonstop1962 As you asked above:
Also, does the network skip images whose size exceeds TRAIN.MAX_SIZE (e.g. ignore a 1024x1024 image, since 1024 > TRAIN.MAX_SIZE = 1000)?
Did you get the answer to that?
For example, if the image is 2000x2000 and I have set TRAIN.MAX_SIZE = 1200 and TRAIN.SCALES = 1000, will the image be resized to 1000x1000 or 1200x1200, or will it be ignored altogether? @KaimingHe
Please comment