Models: Multi-GPU can't be set when using model fasterRcnn_inception_resnet_v2

Created on 18 Sep 2017 · 3 comments · Source: tensorflow/models

When I train model fasterRcnn_inception_resnet_v2 on my own data, I set --num_clones=2 to use my 2 GPUs, but I get the error below:

File "/home/zha/Documents/models-master/object_detection/trainer.py", line 117, in _create_losses
    ) = _get_inputs(input_queue, detection_model.num_classes)
ValueError: need more than 0 values to unpack

When I test with the ssd model everything is fine. I am using Python 2.7 on Ubuntu 16.04.
Could anyone tell me why I get this error? (I searched on Stack Overflow but got no answer.) Thanks a lot!

Most helpful comment

@chenyuZha I had a similar issue. The reason was that I tried to run multi-GPU while the batch size in my config file was still set to 1. If you run with --num_clones=2, the batch size in the config file must be at least 2, or a multiple of 2.

All 3 comments

This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!

@chenyuZha I had a similar issue. The reason was that I tried to run multi-GPU while the batch size in my config file was still set to 1. If you run with --num_clones=2, the batch size in the config file must be at least 2, or a multiple of 2.
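
For reference, a minimal sketch of the matching train_config block in the pipeline .config file. The checkpoint path and the other fields are placeholders, not taken from this thread; the only point is that batch_size matches the number of clones:

    # Sketch of the relevant piece of the object_detection pipeline config
    # (protobuf text format). Only batch_size matters here: it must be at
    # least num_clones (ideally a multiple of it) for multi-GPU training.
    train_config: {
      batch_size: 2  # 2 GPUs with --num_clones=2 -> at least 2
      fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"  # placeholder path
      # other train_config fields (optimizer, data augmentation, num_steps, ...)
      # stay as they were for single-GPU training
    }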

darraghdog is correct regarding batch size.

Also, if you are still having issues, try adding --ps_tasks=1 to your list of arguments for train.py (putting it right after the num_clones argument should work). This works for me when I run ssd_inception_v2_coco using TF runtime 1.6 with Python 2.7 on Ubuntu 16.04. I haven't tried the particular model you are using. An illustrative invocation is sketched below.
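
As an illustration only (the config and train_dir paths are placeholders, and the script location depends on your object_detection checkout), the invocation would look something like:

    # Launch multi-GPU training with 2 clones and a single parameter-server task.
    python object_detection/train.py \
        --logtostderr \
        --pipeline_config_path=PATH/TO/faster_rcnn_inception_resnet_v2.config \
        --train_dir=PATH/TO/train_dir \
        --num_clones=2 \
        --ps_tasks=1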
