Models: Multi-GPU can't be set when using model fasterRcnn_inception_resnet_v2

Created on 18 Sep 2017 · 3 comments · Source: tensorflow/models

When I train model fasterRcnn_inception_resnet_v2 on my own data, I set --num_clones=2 to use my 2 GPUs, but I get the error below:

File "/home/zha/Documents/models-master/object_detection/trainer.py", line 117, in _create_losses
    ) = _get_inputs(input_queue, detection_model.num_classes)
ValueError: need more than 0 values to unpack

When I test with the ssd model everything is fine. I am using Python 2.7 on Ubuntu 16.04.
Could anyone tell me why I get this error? (I searched on Stack Overflow but got no answer.) Thanks a lot!

Most helpful comment

@chenyuZha I had a similar issue. The reason was that I tried to run multi-GPU while the batch size in my config file was still set to 1. If you run with --num_clones=2, the batch size in the config file must be at least 2, or a multiple of 2.

All 3 comments

This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!

@chenyuZha I had a similar issue. The reason was that I tried to run multi-GPU while the batch size in my config file was still set to 1. If you run with --num_clones=2, the batch size in the config file must be at least 2, or a multiple of 2.
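
For reference, a minimal sketch of the matching train_config block in the pipeline .config file. The checkpoint path and the other fields are placeholders, not taken from this thread; the only point is that batch_size matches the number of clones:

    # Sketch of the relevant piece of the object_detection pipeline config
    # (protobuf text format). Only batch_size matters here: it must be at
    # least num_clones (ideally a multiple of it) for multi-GPU training.
    train_config: {
      batch_size: 2  # 2 GPUs with --num_clones=2 -> at least 2
      fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"  # placeholder path
      # other train_config fields (optimizer, data augmentation, num_steps, ...)
      # stay as they were for single-GPU training
    }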

darraghdog is correct regarding batch size.

Also, if you are still having issues, try adding --ps_tasks=1 to your list of arguments for train.py (putting it right after the num_clones argument should work). This works for me when I run ssd_inception_v2_coco using TF runtime 1.6 with Python 2.7 on Ubuntu 16.04. I haven't tried the particular model you are using. An illustrative invocation is sketched below.
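
As an illustration only (the config and train_dir paths are placeholders, and the script location depends on your object_detection checkout), the invocation would look something like:

    # Launch multi-GPU training with 2 clones and a single parameter-server task.
    python object_detection/train.py \
        --logtostderr \
        --pipeline_config_path=PATH/TO/faster_rcnn_inception_resnet_v2.config \
        --train_dir=PATH/TO/train_dir \
        --num_clones=2 \
        --ps_tasks=1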
