I am getting the following log. Please check the last line. Running nvidia-smi shows that my GPU is not being used during training:
```
python train.py --logtostderr --pipeline_config_path=/home/gabbar/ML/tf/models/object_detection/faster_rcnn_resnet101_voc07.config --train_dir=/home/gabbar/ML/tf/models/object_detection/train_dir
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients_impl.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2017-06-20 13:00:11.537942: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-20 13:00:11.537976: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-20 13:00:11.537983: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-20 13:00:11.537990: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-20 13:00:11.537999: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
INFO:tensorflow:Restoring parameters from /home/gabbar/ML/tf/models/object_detection/train_dir/model.ckpt-2840
2017-06-20 13:00:12.470248: I tensorflow/core/common_runtime/simple_placer.cc:675] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
```
There seems to be no error message in this log. The queue op can be placed on the CPU; that does not mean the GPU is ignored entirely.
It isn't an error. The training is running entirely on the CPU. I updated my comment.
This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!
@abhiML
Make sure your GPU is recognized by tensorflow by running this command:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
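The check above can be turned into a small script that reports only the GPU entries. This is a minimal sketch, not part of the original comment: the helper names `gpu_device_names` and `report` are hypothetical, and it assumes each entry returned by `device_lib.list_local_devices()` exposes `name` and `device_type` attributes. Run it with the same interpreter that runs train.py, since the result can differ between Python versions.

```python
def gpu_device_names(devices):
    """Return the names of all entries whose device_type is 'GPU'.

    `devices` is any iterable of objects with `name` and `device_type`
    attributes, such as the list from device_lib.list_local_devices().
    """
    return [d.name for d in devices if d.device_type == "GPU"]


def report():
    """Build a one-line summary of the GPUs TensorFlow can see (hypothetical helper)."""
    try:
        # Imported lazily so a missing install is reported instead of crashing.
        from tensorflow.python.client import device_lib
    except ImportError:
        return "TensorFlow is not installed for this interpreter."
    gpus = gpu_device_names(device_lib.list_local_devices())
    if gpus:
        return "TensorFlow sees GPU(s): " + ", ".join(gpus)
    return "No GPU is visible to TensorFlow in this interpreter."


if __name__ == "__main__":
    print(report())
```

If the script reports no GPU under one interpreter but finds one under another, the two interpreters most likely have different TensorFlow builds installed.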
For me, my GPU is recognized when I run in Python3.5, but not in Python2.7 (which is required to run object_detection/train.py)
Looking into why this is the case at the moment.
EDIT:
I tried reinstalling Tensorflow - my VM came with Tensorflow pre-installed.
sudo pip2 install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.0-cp27-none-linux_x86_64.whl
and this seemed to fix the issue for Python2.7
Went from 6-8 seconds per step to 0.7 seconds per step.
Thanks. However, my GPU now seems to hit a memory error:
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: ExpandDims_5/_5601 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_299_ExpandDims_5", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Hey @mjohnst
I'm getting 4 seconds per step, and I want this to be less than 1.0. I'm using Windows 10 with an NVIDIA GeForce GTX 1050 Ti.
I tried this:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
I was using TensorFlow 1.15, but TensorFlow did not recognize my GPU.
I then tried 'pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.2.0-cp27-none-linux_x86_64.whl' and got this error:
ERROR: tensorflow_gpu-1.2.0-cp27-none-linux_x86_64.whl is not a supported wheel on this platform.
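The error above is what pip prints when the tags in a wheel's filename do not match the running interpreter or operating system. Here the filename names CPython 2.7 (cp27) and Linux (linux_x86_64), while the machine in question runs Windows 10. The sketch below only illustrates how to read those tags; `wheel_tags` and `describe_mismatch` are hypothetical helpers, and the comparison is deliberately rough (pip's real compatibility check is more involved).

```python
import platform
import sys


def wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three dash-separated fields are the compatibility tags.
    return tuple(parts[-3:])


def describe_mismatch(filename):
    """Return rough, human-readable reasons the wheel may not fit this machine."""
    python_tag, _abi_tag, platform_tag = wheel_tags(filename)
    this_python = "cp%d%d" % sys.version_info[:2]  # e.g. cp27, cp37
    this_os = platform.system()  # e.g. Linux, Windows, Darwin
    problems = []
    if python_tag.startswith("cp") and python_tag != this_python:
        problems.append("built for %s, running %s" % (python_tag, this_python))
    if platform_tag.startswith("linux") and this_os != "Linux":
        problems.append("built for %s, running on %s" % (platform_tag, this_os))
    return problems


if __name__ == "__main__":
    name = "tensorflow_gpu-1.2.0-cp27-none-linux_x86_64.whl"
    print(wheel_tags(name))
    for problem in describe_mismatch(name):
        print(problem)
```

On a Windows machine with Python 3, both checks flag a mismatch, so that particular wheel cannot be installed there; a build matching the local interpreter and platform is needed instead.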