I was wondering whether models pretrained on the Visual Genome dataset will be released. Right now, all the models in the MODEL_ZOO are trained and evaluated on MS-COCO.
Many published papers use Faster R-CNN features extracted from a model trained on Visual Genome. To compare against these publications, it would be good to use similar pretraining. This would let many researchers easily extract comparable features from their own data, so it would be good to add it to detectron2.
Plus it would save a lot of time/energy if not everybody had to train this.
Thanks in advance.
+1.
Many vision BERT-based models use VG-pretrained features.
+1
Or official code for training on the VG dataset, if any exists. I have run into some problems with this recently. Thanks!
+1.
It would be great to have Detectron2 pre-trained on the VG dataset. It would be useful for various NLP-related tasks, such as image captioning.
The VG dataset does not have mask annotations, so it cannot be used to train a model for instance segmentation. "Learning to Segment Every Thing" shows how to train with COCO and VG.
(just saw this issue) If it helps, we have released a VG pre-training codebase based on detectron2 at https://github.com/facebookresearch/grid-feats-vqa. While the focus is on grid features, we also included a faithful reimplementation of the bottom-up features proposed in the UpDn model (https://arxiv.org/abs/1707.07998).