Currently, torchvision offers state-of-the-art architectures and pretrained networks, mainly for classification.
I was wondering if there is any interest in adding segmentation models such as:
which can actually reuse the available architectures (ResNet, VGG, etc.)
Maybe there is something interesting to share from this repo
@vfdev-5 Yep, there are quite a lot of repos around:
So basically, I am wondering whether it would be worth upstreaming the state-of-the-art networks to torchvision. Since the encoder is already available, we can take model.features and plug it into the decoder of each segmentation net.
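To make the "reuse model.features as the encoder" idea concrete, here is a minimal sketch. `TinyClassifier` is a hypothetical stand-in for a classification network that exposes a `.features` module (as torchvision's VGG does), and `SimpleSegNet` is a hypothetical FCN-style head invented for illustration; neither is an existing torchvision class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for a classifier with a `.features` encoder."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.mean(dim=(2, 3))  # global average pooling over H and W
        return self.classifier(x)


class SimpleSegNet(nn.Module):
    """Hypothetical decoder: 1x1 conv for class scores, then bilinear upsampling."""

    def __init__(self, encoder, encoder_channels, num_classes):
        super().__init__()
        self.encoder = encoder
        self.score = nn.Conv2d(encoder_channels, num_classes, kernel_size=1)

    def forward(self, x):
        size = x.shape[-2:]            # remember the input resolution
        feats = self.encoder(x)        # downsampled feature map
        logits = self.score(feats)     # per-pixel class scores at low resolution
        return F.interpolate(logits, size=size, mode="bilinear", align_corners=False)


backbone = TinyClassifier()
# Only the convolutional encoder is reused; the classifier head is dropped.
seg_model = SimpleSegNet(backbone.features, encoder_channels=32, num_classes=21)
out = seg_model(torch.randn(2, 3, 64, 64))
print(out.shape)
```

With a real backbone, the encoder would presumably be loaded with pretrained weights first, and `encoder_channels` would have to match the channel count of its last convolutional layer.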
Yes, that's a good idea!
Some U-Net-like networks use bridges; any idea how to get these from model.features?
Edit: actually this can be done with register_forward_hook. ref
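A rough sketch of the hook approach, for reference. The small `features` stack below is a hypothetical stand-in for `model.features`, and the layer indices and the names `stage1`/`stage2` are chosen only for illustration; `register_forward_hook` itself is the standard `nn.Module` method.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for `model.features` of a classification network.
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),   # index 0
    nn.ReLU(),                                   # index 1
    nn.MaxPool2d(2),                             # index 2
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # index 3
    nn.ReLU(),                                   # index 4
    nn.MaxPool2d(2),                             # index 5
)

skips = {}


def save_output(name):
    """Build a forward hook that stores the module's output under `name`."""
    def hook(module, inputs, output):
        skips[name] = output
    return hook


# Capture the activations just before each pooling layer; these are the
# feature maps a U-Net-like decoder would receive through its bridges.
handles = [
    features[1].register_forward_hook(save_output("stage1")),
    features[4].register_forward_hook(save_output("stage2")),
]

bottleneck = features(torch.randn(1, 3, 32, 32))

# Hooks stay attached until removed, so detach them when no longer needed.
for handle in handles:
    handle.remove()

for name, tensor in skips.items():
    print(name, tuple(tensor.shape))
```

One caveat: a layer using `inplace=True` after the hooked module could overwrite the stored tensor, so it may be safer to hook the last op of each stage or to clone the output.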
Yes, I agree that it would be good to extend torchvision to cover other domains (not only classification on ImageNet). But I'm unsure whether we should host the models here, because that would probably require training code for the models as well.
So here are my thoughts:
What are your thoughts on that?
There are indeed many parts in detection models. I'm working with the Single Shot MultiBox Detector (SSD), and I'm planning to add it to torchvision. (see #440)
I'm thinking about adding this folder to torchvision/models:
ssd/
    ssd.py              model, layers
    box_coder.py        bounding box encoding/decoding using prior boxes
    multibox_loss.py    criterion
    utils.py            (some parts may later be integrated in the library)
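To illustrate what "bounding box encoding/decoding using prior boxes" could look like, here is a minimal sketch. It assumes the variance-scaled center-offset parameterization commonly seen in SSD implementations, with boxes and priors both in (cx, cy, w, h) form; the function names, the `variances` default, and the sample values are illustrative assumptions, not the contents of the proposed box_coder.py.

```python
import torch


def encode(boxes, priors, variances=(0.1, 0.2)):
    """Turn ground-truth boxes into regression targets relative to priors.

    boxes, priors: (N, 4) tensors in (cx, cy, w, h) form, matched row by row.
    """
    d_cxcy = (boxes[:, :2] - priors[:, :2]) / (variances[0] * priors[:, 2:])
    d_wh = torch.log(boxes[:, 2:] / priors[:, 2:]) / variances[1]
    return torch.cat([d_cxcy, d_wh], dim=1)


def decode(offsets, priors, variances=(0.1, 0.2)):
    """Invert `encode`: turn predicted offsets back into (cx, cy, w, h) boxes."""
    cxcy = priors[:, :2] + offsets[:, :2] * variances[0] * priors[:, 2:]
    wh = priors[:, 2:] * torch.exp(offsets[:, 2:] * variances[1])
    return torch.cat([cxcy, wh], dim=1)


# Made-up priors and ground-truth boxes in normalized image coordinates.
priors = torch.tensor([[0.5, 0.5, 0.2, 0.2],
                       [0.3, 0.7, 0.4, 0.1]])
boxes = torch.tensor([[0.52, 0.48, 0.25, 0.18],
                      [0.35, 0.72, 0.30, 0.12]])

offsets = encode(boxes, priors)       # what the network would be trained to predict
recovered = decode(offsets, priors)   # should match `boxes` up to float error
print(offsets)
print(recovered)
```

During training the criterion would compare predicted offsets against `encode(...)` targets; at inference `decode(...)` maps predictions back to boxes before non-maximum suppression.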
Then, we could add the data augmentation code in torchvision/transforms and the dataset management in torchvision/datasets. In my opinion, the training/evaluation code would become compact enough to have it in a single script, in pytorch/examples.
@lemairecarl I think this looks reasonable.
We might want to reorganize the models by task though, because we would be mixing classification models with detection / segmentation models, so a hierarchy might be clearer.
Also, will the standard be to have the models trained on COCO instead of Pascal? I think this is reasonable, and should be enforced.
I agree.
I've discussed with Max deGroot (https://github.com/amdegroot), maintainer of the ssd.pytorch repo. They already had plans to integrate their implementation in torchvision. Thus I will work with them to accelerate and facilitate the process.
I'll keep you posted.
We have added detection models (Faster R-CNN, Mask R-CNN, Keypoint R-CNN), segmentation models (FCN and DeepLabV3), and reference training / evaluation scripts in TorchVision 0.3; see https://github.com/pytorch/vision/pull/898 and https://github.com/pytorch/vision/pull/820