Hi,
I'm trying to train the example network from https://github.com/facebookresearch/Detectron/blob/master/GETTING_STARTED.md#2-coco-dataset
The instructions refer to "coco_2014_minival (which must be properly installed)".
I wasn't sure what that meant, so I extracted https://s3-us-west-2.amazonaws.com/detectron/coco/coco_annotations_minival.tgz and renamed the files as follows:
coco
|_ annotations
   |_ instances_minival2014.json -> instances_train2014.json
   |_ instances_valminusminival2014.json -> instances_val2014.json
   |_ person_keypoints_valminusminival2014.json -> person_keypoints_train2014.json
   |_ person_keypoints_minival2014.json -> person_keypoints_val2014.json
I have compiled the current GitHub master of Detectron and the current master of Caffe2 with CUDA 9 and cuDNN 7.
When I execute this:
python2 tools/train_net.py \
--cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml \
OUTPUT_DIR /home/detectron/detectron-output
I receive the following error:
INFO net.py: 125: res2_0_branch2a_b preserved in workspace (unused)
INFO net.py: 125: res4_3_branch2c_b preserved in workspace (unused)
I0128 19:06:50.143659 96176 net_dag_utils.cc:118] Operator graph pruning prior to chain compute took: 0.000545186 secs
I0128 19:06:50.144330 96176 net_dag.cc:61] Number of parallel execution chains 340 Number of operators = 632
INFO train_net.py: 318: Outputs saved to: /home/detectron/detectron-output/train/coco_2014_train/generalized_rcnn
Traceback (most recent call last):
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/coordinator.py", line 50, in stop_on_exception
Traceback (most recent call last):
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/coordinator.py", line 50, in stop_on_exception
yield
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 101, in minibatch_loader_thread
yield
Traceback (most recent call last):
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 101, in minibatch_loader_thread
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/coordinator.py", line 50, in stop_on_exception
yield
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 101, in minibatch_loader_thread
Traceback (most recent call last):
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/coordinator.py", line 50, in stop_on_exception
INFO loader.py: 227: Pre-filling mini-batch queue...
yield
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 101, in minibatch_loader_thread
blobs = self.get_next_minibatch()
INFO loader.py: 232: [0/64]
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 134, in get_next_minibatch
blobs = self.get_next_minibatch()
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 134, in get_next_minibatch
blobs, valid = get_minibatch(minibatch_db)
blobs = self.get_next_minibatch()
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 70, in get_minibatch
blobs, valid = get_minibatch(minibatch_db)
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 134, in get_next_minibatch
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 70, in get_minibatch
blobs = self.get_next_minibatch()
blobs, valid = get_minibatch(minibatch_db)
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/loader.py", line 134, in get_next_minibatch
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 70, in get_minibatch
blobs, valid = get_minibatch(minibatch_db)
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 70, in get_minibatch
im_blob, im_scales = _get_image_blob(roidb)
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 106, in _get_image_blob
im_blob, im_scales = _get_image_blob(roidb)
im_blob, im_scales = _get_image_blob(roidb)
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 106, in _get_image_blob
im, cfg.PIXEL_MEANS, [target_size], cfg.TRAIN.MAX_SIZE
im, cfg.PIXEL_MEANS, [target_size], cfg.TRAIN.MAX_SIZE
im_blob, im_scales = _get_image_blob(roidb)
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 106, in _get_image_blob
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/blob.py", line 78, in prep_im_for_blob
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/blob.py", line 78, in prep_im_for_blob
File "/home/detectron/Downloads/detectron/Detectron/lib/roi_data/minibatch.py", line 106, in _get_image_blob
im, cfg.PIXEL_MEANS, [target_size], cfg.TRAIN.MAX_SIZE
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/blob.py", line 78, in prep_im_for_blob
im, cfg.PIXEL_MEANS, [target_size], cfg.TRAIN.MAX_SIZE
File "/home/detectron/Downloads/detectron/Detectron/lib/utils/blob.py", line 78, in prep_im_for_blob
im = im.astype(np.float32, copy=False)
AttributeError: 'NoneType' object has no attribute 'astype'
im = im.astype(np.float32, copy=False)
im = im.astype(np.float32, copy=False)
AttributeError: 'NoneType' object has no attribute 'astype'
AttributeError: 'NoneType' object has no attribute 'astype'
INFO loader.py: 113: Stopping mini-batch loading thread
INFO loader.py: 113: Stopping mini-batch loading thread
im = im.astype(np.float32, copy=False)
AttributeError: 'NoneType' object has no attribute 'astype'
INFO loader.py: 113: Stopping mini-batch loading thread
INFO loader.py: 113: Stopping mini-batch loading thread
INFO detector.py: 434: Changing learning rate 0.000000 -> 0.000833 at iter 0
E0128 19:06:51.167364 96979 net_dag.cc:212] Operator chain failed: input: "gpu_0/roi_blobs_queue_fbf08ad2-d3b1-4e77-b2a3-dd1cba4ff1c8" output: "gpu_0/data" output: "gpu_0/im_info" output: "gpu_0/roidb" output: "gpu_0/rpn_labels_int32_wide_fpn2" output: "gpu_0/rpn_bbox_targets_wide_fpn2" output: "gpu_0/rpn_bbox_inside_weights_wide_fpn2" output: "gpu_0/rpn_bbox_outside_weights_wide_fpn2" output: "gpu_0/rpn_labels_int32_wide_fpn3" output: "gpu_0/rpn_bbox_targets_wide_fpn3" output: "gpu_0/rpn_bbox_inside_weights_wide_fpn3" output: "gpu_0/rpn_bbox_outside_weights_wide_fpn3" output: "gpu_0/rpn_labels_int32_wide_fpn4" output: "gpu_0/rpn_bbox_targets_wide_fpn4" output: "gpu_0/rpn_bbox_inside_weights_wide_fpn4" output: "gpu_0/rpn_bbox_outside_weights_wide_fpn4" output: "gpu_0/rpn_labels_int32_wide_fpn5" output: "gpu_0/rpn_bbox_targets_wide_fpn5" output: "gpu_0/rpn_bbox_inside_weights_wide_fpn5" output: "gpu_0/rpn_bbox_outside_weights_wide_fpn5" output: "gpu_0/rpn_labels_int32_wide_fpn6" output: "gpu_0/rpn_bbox_targets_wide_fpn6" output: "gpu_0/rpn_bbox_inside_weights_wide_fpn6" output: "gpu_0/rpn_bbox_outside_weights_wide_fpn6" name: "" type:
E0128 19:06:51.167557 96176 net.h:70] Failed to execute async run
Traceback for operator 0 in network generalized_rcnn
/home/detectron/caffe2_build/caffe2/python/helpers/conv.py:149
/home/detectron/caffe2_build/caffe2/python/helpers/conv.py:196
/home/detectron/caffe2_build/caffe2/python/brew.py:121
/home/detectron/caffe2_build/caffe2/python/cnn.py:112
/home/detectron/Downloads/detectron/Detectron/lib/modeling/ResNet.py:94
/home/detectron/Downloads/detectron/Detectron/lib/modeling/ResNet.py:38
/home/detectron/Downloads/detectron/Detectron/lib/modeling/FPN.py:103
/home/detectron/Downloads/detectron/Detectron/lib/modeling/FPN.py:47
/home/detectron/Downloads/detectron/Detectron/lib/modeling/model_builder.py:162
/home/detectron/Downloads/detectron/Detectron/lib/modeling/optimizer.py:60
/home/detectron/Downloads/detectron/Detectron/lib/modeling/optimizer.py:38
/home/detectron/Downloads/detectron/Detectron/lib/modeling/model_builder.py:222
/home/detectron/Downloads/detectron/Detectron/lib/modeling/model_builder.py:89
/home/detectron/Downloads/detectron/Detectron/lib/modeling/model_builder.py:117
tools/train_net.py:283
tools/train_net.py:205
tools/train_net.py:196
tools/train_net.py:358
Traceback (most recent call last):
File "tools/train_net.py", line 358, in <module>
main()
File "tools/train_net.py", line 196, in main
checkpoints = train_model()
File "tools/train_net.py", line 217, in train_model
workspace.RunNet(model.net.Proto().name)
File "/home/detectron/caffe2_build/caffe2/python/workspace.py", line 224, in RunNet
StringifyNetName(name), num_iter, allow_fail,
File "/home/detectron/caffe2_build/caffe2/python/workspace.py", line 189, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at pybind_state.cc:867] success. Error running net generalized_rcnn
However, if I use the full dataset annotations, training starts without errors. I downloaded these annotations from:
http://msvocds.blob.core.windows.net/annotations-1-0-3/instances_train-val2014.zip
http://msvocds.blob.core.windows.net/annotations-1-0-3/person_keypoints_trainval2014.zip
http://msvocds.blob.core.windows.net/annotations-1-0-3/captions_train-val2014.zip
How should I install coco_annotations_minival.tgz?
You do not need to rename the json files. Just put them into the annotations folder and leave the file names unchanged.
Hi @virilo, as @XupingZHENG pointed out, there is no need to rename minival annotations.
Please read the COCO dataset setup instructions here.
Relevant extract:
To complete installation of the COCO dataset, you will need to copy the minival and valminusminival json annotation files to the coco/annotations directory referenced above.
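A quick way to confirm the layout before training is to check that the annotation files are all present under their original names. The sketch below is only an illustration: the file names are the ones mentioned in this thread, the helper name `missing_annotations` is made up for this example, and the directory you pass in depends on where your COCO data lives.

```python
import os

# Annotation file names taken from this thread; adjust the list if your setup differs.
EXPECTED_ANNOTATIONS = [
    "instances_train2014.json",
    "instances_val2014.json",
    "instances_minival2014.json",
    "instances_valminusminival2014.json",
]


def missing_annotations(annotations_dir, expected=EXPECTED_ANNOTATIONS):
    """Return the expected annotation file names not found in annotations_dir."""
    return [
        name
        for name in expected
        if not os.path.isfile(os.path.join(annotations_dir, name))
    ]
```

Calling `missing_annotations("/path/to/coco/annotations")` (path hypothetical) should return an empty list once the minival and valminusminival files sit next to the full train/val annotations.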
Thanks a lot @XupingZHENG, @ir413.
I didn't realize that the example trains on the full COCO 2014 training set but runs inference on COCO 2014 minival. So I had to put all the .json files in the same folder, as the instructions and the yaml file say. :$
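For anyone else who hits the same `AttributeError: 'NoneType' object has no attribute 'astype'`: it means the image passed to `prep_im_for_blob` was `None`. My guess (an assumption, not something confirmed in this thread) is that the renamed annotation files listed images that are not in the image folder the training dataset points to, so the image read failed silently. A minimal sketch to check this, relying only on the standard COCO annotation layout (an `images` list whose entries have a `file_name`); the function name and both paths are hypothetical:

```python
import json
import os


def find_missing_images(annotation_file, image_dir):
    """List file names referenced by a COCO-style annotation file but absent from image_dir."""
    with open(annotation_file) as f:
        dataset = json.load(f)
    return [
        img["file_name"]
        for img in dataset.get("images", [])
        if not os.path.isfile(os.path.join(image_dir, img["file_name"]))
    ]
```

If this returns a non-empty list for the annotation file and image folder used for training, the annotations and images do not match, which would explain the `None` image.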
Can you please explain how to create the instances_minival2017 and instances_valminusminival2017 json files?
@ir413 the page you linked isn't working.
Updated the link, it should work now. Sorry for the inconvenience.
Can you please explain how to create the instances_minival2017 and instances_valminusminival2017 json files?
There are no minival and valminusminival 2017 splits. Please read the instructions on the page linked above carefully (this section describes how the COCO 2014 splits relate to the COCO 2017 splits).