Serving: TF Serving 1.8+ not utilising GPU

Created on 12 Jul 2018 · 10 Comments · Source: tensorflow/serving

From TF Serving 1.8 onward, tensorflow_model_server does not use the GPU when I start it.

I've tried building the Docker image from the Dockerfile in the GitHub repo, and I've also used the image from Docker Hub. Either way, I get the following error when starting the server:

2018-07-12 14:23:18.848070: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUresult(-1)

The server starts, but it runs on the CPU rather than the GPU.

I previously used TF Serving 1.6 and 1.7, and both could use the GPU.

upendra2017

All 10 comments

Please indicate what OS you're running on, how you're fetching or starting the image, what hardware you're using, and anything else relevant to debugging this problem. Thanks!

gautamvasudevan on 12 Jul 2018

OS: Ubuntu 16.04

I tried two ways of running the container:

- From the Dockerfile provided by the TensorFlow team in the repo.
- From the official Serving image on Docker Hub.

I tried to run the Docker image on:

- NVIDIA K80
- NVIDIA P100
- NVIDIA V100

upendra2017 on 12 Jul 2018

Can you provide a set of steps to recreate a scenario that fails for you?

There's also this thread over at TF with some things you can try to verify: https://github.com/tensorflow/tensorflow/issues/7653

gautamvasudevan on 12 Jul 2018

I am having the same problem. The steps I took were:

1. Built a Docker container using tensorflow/serving:1.8.0-devel-gpu as the base image.
2. Ran nvidia-docker run -it -d -p xxxx:xxxx tfserving bash
3. Inside the container, ran tensorflow_model_server --rest_api_port=xxxx --model_name=tacotron --model_base_path=/tensorflow-serving/models/

That gave me the logs below:

2018-07-12 22:11:41.337925: I tensorflow_serving/model_servers/main.cc:153] Building single TensorFlow model file config:  model_name: tacotron model_base_path: /tensorflow-serving/models/
2018-07-12 22:11:41.338161: I tensorflow_serving/model_servers/server_core.cc:459] Adding/updating models.
2018-07-12 22:11:41.338178: I tensorflow_serving/model_servers/server_core.cc:514]  (Re-)adding model: tacotron
2018-07-12 22:11:41.438807: I tensorflow_serving/core/basic_manager.cc:718] Successfully reserved resources to load servable {name: tacotron version: 1}
2018-07-12 22:11:41.438875: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: tacotron version: 1}
2018-07-12 22:11:41.438898: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: tacotron version: 1}
2018-07-12 22:11:41.438938: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:360] Attempting to load native SavedModelBundle in bundle-shim from: /tensorflow-serving/models/1
2018-07-12 22:11:41.438978: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:242] Loading SavedModel with tags: { serve }; from: /tensorflow-serving/models/1
2018-07-12 22:11:41.537738: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-07-12 22:11:41.538254: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUresult(-1)
2018-07-12 22:11:41.538277: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: c9091fb771d7
2018-07-12 22:11:41.538285: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: c9091fb771d7
2018-07-12 22:11:41.538354: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
2018-07-12 22:11:41.538384: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 384.130.0
2018-07-12 22:11:41.744399: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:161] Restoring SavedModel bundle.
2018-07-12 22:11:41.898385: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:196] Running LegacyInitOp on SavedModel bundle.
2018-07-12 22:11:41.934421: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:291] SavedModel load for tags { serve }; Status: success. Took 495018 microseconds.
2018-07-12 22:11:41.935802: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:83] No warmup data file found at /tensorflow-serving/models/1/assets.extra/tf_serving_warmup_requests
2018-07-12 22:11:41.936605: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: tacotron version: 1}
2018-07-12 22:11:41.940737: I tensorflow_serving/model_servers/main.cc:323] Running ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2018-07-12 22:11:41.944768: I tensorflow_serving/model_servers/main.cc:333] Exporting HTTP/REST API at:localhost:xxxx ...
[evhttp_server.cc : 235] RAW: Entering the event loop ...

I suspect the issue lies in log lines 10-14, which start with the error failed call to cuInit: CUresult(-1):

2018-07-12 22:11:41.538254: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUresult(-1)
2018-07-12 22:11:41.538277: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] retrieving CUDA diagnostic information for host: c9091fb771d7
2018-07-12 22:11:41.538285: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:170] hostname: c9091fb771d7
2018-07-12 22:11:41.538354: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:194] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
2018-07-12 22:11:41.538384: I external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_diagnostics.cc:198] kernel reported version is: 384.130.0

Inference works, but it runs on the CPU and not the GPU, as the OP described. This is running on an NVIDIA GTX 1080 Ti.

LearnedVector on 13 Jul 2018
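
For readers who want to check a saved server log for this failure without reading it line by line, here is a minimal sketch. The function name check_gpu_log and the file name server.log are hypothetical; the two strings it searches for are copied from the log output quoted in this thread, and other TensorFlow versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of TensorFlow Serving): reads server log text
# on stdin and reports whether the CUDA driver failed to initialise.
check_gpu_log() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -qF 'failed call to cuInit'; then
        echo "GPU init failed"
        # The diagnostics line states whether libcuda.so was found at all.
        if printf '%s\n' "$log" | grep -qF 'unable to find libcuda.so'; then
            echo "libcuda.so not loaded: check LD_LIBRARY_PATH"
        fi
        return 1
    fi
    echo "no cuInit failure in log"
    return 0
}
```

Assuming the server output was redirected to a file, it could be used as check_gpu_log < server.log. A non-zero return status indicates the server most likely fell back to the CPU.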

How are you launching docker? What's the exact command line you're using?

gautamvasudevan on 13 Jul 2018

@gautamvasudevan

nvidia-docker run -it -d --name tacotron -p xxxx:xxxx tfserving ./scripts/start_tfserving.sh xxxx

start_tfserving.sh is simply this: tensorflow_model_server --rest_api_port=$1 --model_name=tacotron --model_base_path=/tensorflow-serving/models/

Replace the x's with an actual port number.

LearnedVector on 13 Jul 2018

In the container, try:
export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/targets/x86_64-linux/lib

gautamvasudevan on 13 Jul 2018

👍4
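
To see which entries of a library search path actually contain the CUDA driver library, a minimal sketch along these lines may help. find_libcuda_dirs is a hypothetical helper, and matching on libcuda.so* is an assumption based on the "unable to find libcuda.so" message in the logs above.

```shell
#!/bin/sh
# Hypothetical helper: prints each directory in a colon-separated search path
# that contains a libcuda.so* file. Defaults to the current LD_LIBRARY_PATH.
find_libcuda_dirs() {
    search_path=${1-${LD_LIBRARY_PATH-}}
    printf '%s\n' "$search_path" | tr ':' '\n' | while IFS= read -r dir; do
        [ -n "$dir" ] || continue      # skip empty entries, e.g. from "::"
        for lib in "$dir"/libcuda.so*; do
            if [ -e "$lib" ]; then     # an unmatched glob stays literal and fails this test
                printf '%s\n' "$dir"
                break
            fi
        done
    done
}
```

Run inside the container with no arguments, it inspects the current LD_LIBRARY_PATH. If it prints nothing, none of the listed directories holds the library, although the dynamic loader may still find it through the system's default library locations.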

That did it!

LearnedVector on 13 Jul 2018

This works! Thanks a lot, @gautamvasudevan. It seems that the Dockerfile is missing this environment variable.

upendra2017 on 13 Jul 2018

I'll have a fix in soon. Thanks!

gautamvasudevan on 13 Jul 2018
