I am very new to CNTK. I want to train a model on my own set of images (to detect objects such as glasses and bottles) using CNTK with ResNet / Fast R-CNN.
I am trying to follow the documentation below from GitHub; however, the procedure does not appear to be straightforward. https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN
I cannot find proper documentation on how to generate ROIs for images of different sizes and shapes, or on how to create object labels based on the trained models. Can someone point me to proper documentation or a training link that I can use to work on the CNTK model? Please see the attached image, in which I was able to load a sample image with the default ROIs from the script. How do I properly set the size and label the object in the image? Thanks in advance!
Please use the following tutorial:
https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN
If you need more detailed information about how the method works, you should read the paper:
https://arxiv.org/pdf/1504.08083.pdf
regards
The tutorial from Patrick Buehler has more details on the parameters: https://github.com/Azure/ObjectDetectionUsingCntk.
Thank you @pkranen and @Vedaevolution, I am already following that documentation.
To train on our own images (in my case, to detect alcohol glasses and bottles), which project-specific parameters in the PARAMETERS.py file do I need to edit to assign image annotations?
I assume imdbs = dict() should provide the entire database of pre-trained images, correct?
Creating something like this (see below) does not automatically detect objects and assign annotations.
I specified classes = ('__background__', 'Glass', 'Bottle'); do they have to match the existing database? If so, where can I find it?
Alcohol is my data source name, where I keep all the images (positive/negative/testImages).
if datasetName.startswith("Alcohol"):
    classes = ('__background__', 'Glass', 'Bottle')

    # roi generation
    roi_minDimRel = 0.04
    roi_maxDimRel = 0.4
    roi_minNrPixelsRel = 2 * roi_minDimRel * roi_minDimRel
    roi_maxNrPixelsRel = 0.33 * roi_maxDimRel * roi_maxDimRel

    # model training / scoring
    classifier = 'nn'
    cntk_num_train_images = 25
    cntk_num_test_images = 5
    cntk_mb_size = 5
    cntk_max_epochs = 20
    cntk_momentum_time_constant = 10

    # postprocessing
    nmsThreshold = 0.01

    # database
    imdbs = dict()  # database provider of images and image annotations
    for image_set in ["train", "test"]:
        imdbs[image_set] = imdb_data(image_set, classes, cntk_nrRois, imgDir, roiDir, cntkFilesDir, boAddGroundTruthRois = (image_set!='test'))
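For reference, the roi_* values above are expressed relative to the image size rather than in pixels. Below is a minimal sketch of how thresholds like these could be used to discard candidate ROIs that are too small or too large. The helper `filter_rois` is hypothetical and is not the tutorial's implementation; it assumes boxes are given as (x1, y1, x2, y2) in pixels and that the relative values are scaled by the longer image side.

```python
def filter_rois(rois, img_width, img_height,
                min_dim_rel=0.04, max_dim_rel=0.4):
    """Keep only candidate boxes whose size lies within the relative limits.

    Hypothetical helper for illustration; not part of the CNTK scripts.
    Assumption: relative values are scaled by the longer image side.
    """
    max_side = max(img_width, img_height)

    # Absolute limits on width/height, in pixels.
    min_dim = min_dim_rel * max_side
    max_dim = max_dim_rel * max_side

    # Absolute limits on box area, mirroring the formulas in PARAMETERS.py.
    min_pixels = 2 * min_dim_rel * min_dim_rel * max_side * max_side
    max_pixels = 0.33 * max_dim_rel * max_dim_rel * max_side * max_side

    kept = []
    for x1, y1, x2, y2 in rois:
        w, h = x2 - x1, y2 - y1
        if not (min_dim <= w <= max_dim and min_dim <= h <= max_dim):
            continue  # a side is too short or too long
        if not (min_pixels <= w * h <= max_pixels):
            continue  # area is outside the allowed range
        kept.append((x1, y1, x2, y2))
    return kept


candidates = [(0, 0, 10, 10), (100, 100, 300, 250), (0, 0, 900, 900)]
print(filter_rois(candidates, img_width=1000, img_height=800))
```

With a 1000x800 image, the tiny 10x10 box and the 900x900 box fall outside the limits, so only the middle box is kept. Lowering roi_minDimRel or raising roi_maxDimRel would widen the range of object sizes that can be proposed.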
You need to manually annotate your images using the C1_... and C2_... scripts as described at https://github.com/Microsoft/CNTK/wiki/Object-Detection-using-Fast-R-CNN#train-on-your-own-data. The imdb database will be filled with your provided ground truth annotations when you follow the steps described in that section.
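As a rough illustration of what the annotation step produces: to my understanding, the C1 script writes one .bboxes.tsv file per image (one box per line, as tab-separated pixel coordinates) and the C2 script writes a matching .bboxes.labels.tsv file (one class name per line). That file layout and the helper `load_annotations` below are assumptions, so please check them against the files in your own image folder.

```python
def load_annotations(bboxes_text, labels_text, classes):
    """Pair each annotated box with its label.

    Hypothetical helper; the assumed layout is one box per line as
    tab-separated x1, y1, x2, y2 and one class name per line.
    """
    boxes = [
        tuple(int(float(value)) for value in line.split("\t"))
        for line in bboxes_text.splitlines()
        if line.strip()
    ]
    labels = [line.strip() for line in labels_text.splitlines() if line.strip()]

    if len(boxes) != len(labels):
        raise ValueError("number of boxes and labels differ")

    # Every label must be one of the classes declared in PARAMETERS.py.
    unknown = sorted(set(labels) - set(classes))
    if unknown:
        raise ValueError("labels not in classes: %s" % unknown)

    return list(zip(boxes, labels))


classes = ('__background__', 'Glass', 'Bottle')
bboxes_text = "10\t20\t110\t220\n30\t40\t90\t300\n"
labels_text = "Glass\nBottle\n"
for box, label in load_annotations(bboxes_text, labels_text, classes):
    print(label, box)
```

A check like this can catch a label that does not match the classes tuple before training starts.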
Thanks much for your replies @pkranen
For some reason, the annotation tool does not show up when I run "C1_DrawBboxesOnImages.py".
The command just returns without any exit code. If I understand correctly, running the C1_... script should let me annotate the images manually with the mouse cursor, right? I wonder whether this is related to my environment setup. Any suggestions? Thanks!
Finally, I was able to run and train my images. Something went wrong with my environment; I re-did it and am now able to get the annotation tool loaded.
But I still need to understand one thing: given a new image to test against the model I trained on my own dataset (using the example above), is it possible to get the predicted label of an object, if the object is present in the image at all, without actually generating ROIs on the new test image?
Thanks!