When normalizing input images, the default INPUT.PIXEL_STD in defaults.py is [1.0, 1.0, 1.0]. The only configs that set the stds to values above 1 are the fbnet configs. What is the reasoning behind this?
I'm asking because I'm training on my own dataset in COCO format. As with the default settings, training works well when I set INPUT.PIXEL_MEAN to my own custom mean. However, when I set INPUT.PIXEL_STD above 1, the loss often becomes NaN quickly.
Any help is greatly appreciated!
Hi,
As far as I understand, INPUT.PIXEL_MEAN and INPUT.PIXEL_STD are set to match how the pre-trained model was trained for the ImageNet classification task, so they are not customized to COCO.
If you are training your network from scratch, you can adjust these parameters to better fit your dataset; otherwise, you do not need to. If the color distribution of your dataset is similar to ImageNet's, I do not think you need to adjust them. If it is not, this could be a domain adaptation problem, which I think is still an open research problem.
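To make the role of the two settings concrete, here is a minimal NumPy sketch of per-channel normalization, (pixel - mean) / std. It is not the library's own code: the helper names `normalize` and `channel_stats` are hypothetical, and the images are random placeholders rather than real data.

```python
import numpy as np

def normalize(image, pixel_mean, pixel_std):
    """Per-channel normalization of an H x W x C image: (pixel - mean) / std."""
    mean = np.asarray(pixel_mean, dtype=np.float32)
    std = np.asarray(pixel_std, dtype=np.float32)
    # mean and std have shape (C,), so they broadcast over height and width.
    return (image.astype(np.float32) - mean) / std

def channel_stats(images):
    """Per-channel mean and std over a list of H x W x C images (hypothetical helper)."""
    pixels = np.concatenate([im.reshape(-1, im.shape[-1]) for im in images])
    pixels = pixels.astype(np.float64)
    return pixels.mean(axis=0), pixels.std(axis=0)

# Random placeholder images standing in for a custom dataset (assumption).
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8) for _ in range(3)]

mean, std = channel_stats(images)

# A std of [1, 1, 1] only subtracts the mean and keeps the 0-255 scale.
unit_std = normalize(images[0], mean, [1.0, 1.0, 1.0])

# Dividing by the dataset std additionally shrinks the values to roughly unit scale.
scaled = normalize(images[0], mean, std)
```

With a std well above 1, the inputs are much smaller in magnitude than with a std of 1, so a pre-trained backbone would see data on a different scale from the one it was trained with. That is consistent with the advice above to keep the pre-training values when fine-tuning.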
Closing following the comments from @chengyangfu. Thanks!