I recently downloaded the master branch of this repo. Every time I run evaluate.py, the first batch takes about 30 minutes. I trained on the VOC dataset with two GPUs and evaluate with one. Debugging evaluate.py showed that it gets stuck on "predict_on_batch". I also watched the GPU memory with nvidia-smi: the usage is 713MiB before the prediction really starts.
Is the evaluation during training slow, or is evaluate.py slow?
Which versions of TensorFlow / Keras are you using? What backbone? Can you check in nvidia-smi if the GPU is being utilized? Can you post the output of evaluate.py?
Also, what do you mean by "first batch"? Does it take 30 minutes for one image? Otherwise, what is your batch size?
tensorflow-gpu 1.5.0 and Keras 2.2.2 with a resnet101 backbone. When I run evaluate.py, the GPU memory usage is 713 MiB at first. The output is below:
/home/wen/anaconda3/bin/python /home/wen/pycharm-2017.2.4/helpers/pydev/pydevd.py --multiproc --qt-support=auto --client 127.0.0.1 --port 41388 --file /home/wen/net_project/keras-retinanet-master/keras_retinanet/bin/evaluate.py pascal /home/wen/data/test922/voc ./snapshots/resnet101_pascal_120.h5 --gpu=3
pydev debugger: process 24228 is connecting
Connected to pydev debugger (build 172.4343.24)
Using TensorFlow backend.
2018-09-30 14:49:49.053546: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-30 14:49:49.531490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:09:00.0
totalMemory: 10.91GiB freeMemory: 10.75GiB
2018-09-30 14:49:49.531563: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
Loading model, this may take a second...
QXcbConnection: Failed to initialize XRandr
Qt: XKEYBOARD extension not present on the X server.
Backend Qt5Agg is interactive backend. Turning interactive mode on.
/home/wen/anaconda3/lib/python3.6/site-packages/keras/engine/saving.py:268: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
----------------------------------------------------------------------------------------------------------------------------------
The program is stuck on this line:
# run network
boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))[:3]
---------------------------------------------------------------------------------------------------------------------------------
The first predict takes a long time.
It should use way more than 713MB for resnet101. Also, it seems you are running under a Python debugger, which may slow it down further. Could you check in nvidia-smi whether the GPU is utilized, i.e. whether it receives any workload? (I forgot the name of the column in nvidia-smi, but it shows a percentage of workload.)
TensorFlow 1.5 is also quite old; consider updating to at least 1.8.
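For anyone who wants to script the utilization check instead of reading the nvidia-smi table by eye, here is a rough sketch. It assumes nvidia-smi is on the PATH and that every GPU reports numeric values for the queried fields (some cards report "[Not Supported]", which this does not handle). The helper names `parse_gpu_stats` and `query_gpu_stats` are made up for this example and are not part of keras-retinanet.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi; order matters for parsing below.
QUERY_FIELDS = ["index", "utilization.gpu", "memory.used", "memory.total"]


def parse_gpu_stats(csv_text):
    """Parse the text printed by
    nvidia-smi --query-gpu=<QUERY_FIELDS> --format=csv,noheader,nounits
    into a list of dicts, one per GPU."""
    stats = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:  # skip blank lines
            continue
        index, util, mem_used, mem_total = (field.strip() for field in row)
        stats.append({
            "index": int(index),
            "util_percent": int(util),
            "memory_used_mib": int(mem_used),
            "memory_total_mib": int(mem_total),
        })
    return stats


def query_gpu_stats():
    """Run nvidia-smi once and return the parsed per-GPU stats.
    Assumption: nvidia-smi is installed and on the PATH."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    output = subprocess.check_output(cmd, universal_newlines=True)
    return parse_gpu_stats(output)


# Example with hand-written sample text, so no GPU is needed to try the parser.
sample = "3, 0, 5694, 11172\n4, 97, 10613, 11172\n"
for gpu in parse_gpu_stats(sample):
    print("GPU %(index)d: %(util_percent)d%% util, "
          "%(memory_used_mib)d / %(memory_total_mib)d MiB" % gpu)
```

Calling `query_gpu_stats()` in a loop while evaluate.py is running would show whether the utilization percentage ever rises above 0 during the slow first prediction.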
When I run evaluate.py directly, I get this output:
/home/wen/anaconda3/bin/python /home/wen/net_project/keras-retinanet-master/keras_retinanet/bin/evaluate.py pascal /home/wen/data/test922/voc ./snapshots/resnet101_pascal_120.h5 --gpu=3
Using TensorFlow backend.
2018-09-30 15:03:32.741131: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-09-30 15:03:33.211905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:09:00.0
totalMemory: 10.91GiB freeMemory: 10.06GiB
2018-09-30 15:03:33.211954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
Loading model, this may take a second...
/home/wen/anaconda3/lib/python3.6/site-packages/keras/engine/saving.py:268: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
Sun Sep 30 15:06:46 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130 Driver Version: 384.130 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:04:00.0 Off | N/A |
| 25% 33C P8 16W / 250W | 0MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 108... Off | 00000000:05:00.0 Off | N/A |
| 25% 36C P8 16W / 250W | 0MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce GTX 108... Off | 00000000:08:00.0 Off | N/A |
| 25% 32C P8 16W / 250W | 0MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GeForce GTX 108... Off | 00000000:09:00.0 Off | N/A |
| 25% 36C P2 54W / 250W | 5694MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 4 GeForce GTX 108... Off | 00000000:85:00.0 Off | N/A |
| 50% 83C P2 211W / 250W | 10613MiB / 11172MiB | 97% Default |
+-------------------------------+----------------------+----------------------+
| 5 GeForce GTX 108... Off | 00000000:86:00.0 Off | N/A |
| 25% 35C P8 16W / 250W | 0MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 6 GeForce GTX 108... Off | 00000000:89:00.0 Off | N/A |
| 25% 30C P8 15W / 250W | 0MiB / 11172MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 7 GeForce GTX 108... Off | 00000000:8A:00.0 Off | N/A |
| 50% 83C P2 115W / 250W | 9599MiB / 11172MiB | 84% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 3 24228 C /home/wen/anaconda3/bin/python 4971MiB |
| 3 24825 C /home/wen/anaconda3/bin/python 713MiB |
--------------------------------------------------------------------------------------------------------------------
Doesn't that show nearly 5 GB of memory usage though? That's more in line with what I expect it to be.
Still, update TensorFlow to 1.10 if possible; I've heard people say it makes a lot of difference.
Thank you and your team so much! I solved this problem by upgrading TensorFlow from 1.5 to 1.10.
Very helpful to know that older versions of TensorFlow slow the program down.
I had tensorflow-gpu==1.5, and the initial predict was taking 300s.
After installing tensorflow-gpu==1.10, it went down to 6s.
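To compare the first predict against later ones on your own setup, a small timing wrapper along these lines may help. This is only a sketch: `time_calls` and `fake_predict` are hypothetical names, and `fake_predict` is a stand-in so the snippet runs without a model. In practice you would pass something like `lambda: model.predict_on_batch(np.expand_dims(image, axis=0))` instead.

```python
import time


def time_calls(fn, n_calls=3):
    """Call fn() n_calls times and return each call's duration in seconds."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_predict():
    # Stand-in for the real prediction call; it just sleeps briefly.
    time.sleep(0.01)


durations = time_calls(fake_predict, n_calls=3)
print("first call: %.3fs" % durations[0])
print("later calls: " + ", ".join("%.3fs" % d for d in durations[1:]))
```

If only the first duration is large and the rest are small, the cost is in one-time setup on the first call rather than in per-image inference.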
I have the same problem, but I am using tensorflow-gpu 1.14.