Following the example in the t2t serving readme allows me to export a model, run a model server and query that server successfully.
However, the model is served on CPU only.
How can I run and query the model server on a GPU?
Issues #648 and #501 on exporting and running servers have been closed, but did serving on GPU work for you?
I think it's because you used the TF Serving binary.
For GPU you need to rebuild Serving from source.
This whole process is a nightmare (you also need to keep TF and Serving on the same version, which is not always possible depending on your requirements at the time).
@vince62s Thanks for the hint. It is as you said.
git clone --recurse-submodules https://github.com/tensorflow/serving
I changed @org_tensorflow//third_party/gpus/crosstool to @local_config_cuda//crosstool:toolchain in tools/bazel.rc.
Then I ran bazel clean --expunge && export TF_NEED_CUDA=1, followed by bazel query 'kind(rule, @local_config_cuda//...)'.
Finally, running bazel build -c opt --config=cuda tensorflow/... from the .../serving/tensorflow directory, as described here, did the trick.
Running bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server now serves the model on GPU.
Thank you!
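For anyone who would rather script the tools/bazel.rc edit than make it by hand, here is a minimal sketch. The function name patch_crosstool is hypothetical, and it assumes the old crosstool label appears literally in the file; it only swaps the two labels mentioned above and touches nothing else.

```shell
# Sketch only: swap the CUDA crosstool label in a bazel.rc-style file.
# patch_crosstool is a hypothetical helper name, not part of TF Serving.
patch_crosstool() {
  rc_file="$1"
  old='@org_tensorflow//third_party/gpus/crosstool'
  new='@local_config_cuda//crosstool:toolchain'
  # '|' is used as the sed delimiter so the slashes in the labels need no escaping.
  # Writing to a temp file and moving it avoids relying on non-portable 'sed -i'.
  sed "s|${old}|${new}|g" "$rc_file" > "${rc_file}.tmp" && mv "${rc_file}.tmp" "$rc_file"
}
```

From the root of the serving checkout this would be called as patch_crosstool tools/bazel.rc, before the bazel clean and bazel build steps. Running it twice is harmless, since the new label does not contain the old one.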
So nothing needs to change in the parameters passed to tensor2tensor.serving.export? If the machine has a GPU, will tensorflow_model_server automatically run on it?