Serving: Is there a way to verify Tensorflow Serving is using GPUs on a GPU instance?

Created on 7 Mar 2017 · 5 Comments · Source: tensorflow/serving

While running TensorFlow Serving, how can I verify that it uses GPUs for serving? Under serving/tensorflow, I configured TensorFlow for CUDA during ./configure.

I tried monitoring nvidia-smi while running a serving client query, but it reports no running processes.

Below is a subset of the ./configure output:

Do you wish to build TensorFlow with CUDA support? [y/N] y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5
Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.

sskgit

Most helpful comment

@sskgit, if you install TF Serving via bazel build tensorflow_serving/... as suggested in the official documentation, then you are using the CPU version of TF Serving. You need to use bazel build -c opt --config=cuda tensorflow_serving/... to compile the GPU version.

Also, when compiling, you might hit some errors about crosstool and nccl. Solutions are discussed here:
https://github.com/tensorflow/serving/issues/186#issuecomment-251152755
https://github.com/tensorflow/serving/issues/327#issuecomment-305771708

sugartom on 14 Jun 2017

👍14 🎉3

All 5 comments

I checked it in 2 ways:

1. nvidia-smi shows high memory usage (TensorFlow grabs the GPU memory when it is launched, even if no model has been created yet, so TensorFlow Serving does the same as soon as it starts).
2. Pin the device in the model code. If you use with tf.device('/gpu:0'): then TensorFlow Serving shows error messages when the model is loaded if it cannot use that device.

The current way of compiling TensorFlow Serving doesn't guarantee that it is compiled with CUDA support. I had to create my own script; other people have solved it with fewer issues than I had (check the other issues for solutions).
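
The first check above can be scripted. The following is a minimal sketch, not an official TensorFlow Serving tool: it asks nvidia-smi for the list of compute processes and looks for the model server among them. It assumes nvidia-smi is on the PATH and that the server binary is named tensorflow_model_server; the helper names parse_compute_apps and serving_processes are made up for this example.

```python
import subprocess

# Assumption: the server binary is named tensorflow_model_server.
SERVER_NAME = "tensorflow_model_server"

APPS_QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse nvidia-smi CSV rows into (pid, process_name, used_memory_mib) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if line.count(",") < 2:
            continue  # blank or unexpected line
        # Split off the first and last fields so a comma in the name is harmless.
        pid, rest = line.split(",", 1)
        name, mem = rest.rsplit(",", 1)
        pid, name, mem = pid.strip(), name.strip(), mem.strip()
        if not pid.isdigit():
            continue
        # Memory is reported in MiB; keep None when it is not a plain number.
        rows.append((int(pid), name, int(mem) if mem.isdigit() else None))
    return rows


def serving_processes(rows, name_fragment=SERVER_NAME):
    """Return the rows whose process name contains name_fragment."""
    return [row for row in rows if name_fragment in row[1]]


if __name__ == "__main__":
    try:
        result = subprocess.run(
            APPS_QUERY, capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print("Could not run nvidia-smi:", err)
    else:
        hits = serving_processes(parse_compute_apps(result.stdout))
        if hits:
            for pid, name, mem in hits:
                print(f"pid {pid} ({name}) holds {mem} MiB of GPU memory")
        else:
            print("No model server process is holding GPU memory.")
```

If the server is running but does not appear in this list, that is consistent with a CPU-only build, although it does not prove it on its own.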

jorgemf on 7 Mar 2017

👍1

Thanks @jorgemf for the clarification.

sskgit on 15 Mar 2017

qiaohaijun on 2 Nov 2017

@jorgemf Hi, I think I am using the CUDA version, because GPU load rises when I launch TensorFlow. But when I start a run to build a model, GPU load only goes up to 38% or so, not fully loaded. Can you explain why?
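
One way to look into an observation like this is to log compute utilization and memory usage separately, since nvidia-smi reports them as different quantities: memory reserved by a process does not by itself mean the GPU is busy. The sketch below only collects the numbers and does not explain why utilization stays low. It assumes nvidia-smi is on the PATH; parse_gpu_stats and sample_gpu are names invented for this example.

```python
import subprocess
import time

GPU_QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Parse one nvidia-smi snapshot into a list of per-GPU dicts."""
    stats = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 4 or not all(p.isdigit() for p in parts):
            continue  # skip blank lines and unsupported/N-A readings
        index, util, used, total = (int(p) for p in parts)
        stats.append(
            {
                "gpu": index,
                "util_pct": util,  # share of time kernels were running
                "mem_used_mib": used,
                "mem_total_mib": total,
            }
        )
    return stats


def sample_gpu(samples=5, interval_s=1.0):
    """Take several snapshots so short bursts of work are not missed."""
    history = []
    for _ in range(samples):
        result = subprocess.run(
            GPU_QUERY, capture_output=True, text=True, check=True
        )
        history.append(parse_gpu_stats(result.stdout))
        time.sleep(interval_s)
    return history


if __name__ == "__main__":
    try:
        for snapshot in sample_gpu():
            for gpu in snapshot:
                print(
                    f"GPU {gpu['gpu']}: {gpu['util_pct']}% busy, "
                    f"{gpu['mem_used_mib']}/{gpu['mem_total_mib']} MiB used"
                )
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print("Could not run nvidia-smi:", err)
```

Running this while queries are being served shows whether utilization moves with the request load while memory usage stays flat.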