Serving: Failed to deploy a TF-TRT optimized model, Op type not registered 'TRTEngineOp'

Created on 28 Jun 2019 · 6 Comments · Source: tensorflow/serving

Describe the problem the feature is intended to solve

Deploying a SavedModel optimized by TF-TRT, where the model has some TRTEngineOp nodes.

Describe the solution

I want the TF-TRT model to launch successfully, just like an ordinary model.

Describe alternatives you've considered

The model without TRTEngineOp nodes launched successfully but showed no speed improvement (see https://github.com/tensorflow/tensorrt/issues/89).
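
To tell the two cases apart (a SavedModel that contains TRTEngineOp nodes versus one that does not), a crude check is to search the raw bytes of `saved_model.pb` for the op name. This is only a heuristic sketch: it assumes the standard SavedModel layout with a binary `saved_model.pb` in the version directory, it does not parse the protobuf, and the helper name `saved_model_mentions_op` is made up for illustration.

```python
import os
import tempfile


def saved_model_mentions_op(export_dir, op_name="TRTEngineOp"):
    """Crude heuristic: is the op name present as raw bytes in saved_model.pb?

    Op and node names are stored as plain strings in the serialized graph,
    so a byte search is enough for a quick sanity check. It cannot tell an
    op type apart from a node that merely has the same text in its name.
    """
    pb_path = os.path.join(export_dir, "saved_model.pb")
    with open(pb_path, "rb") as f:
        return op_name.encode("utf-8") in f.read()


# Self-contained demo on a fake file (not a real SavedModel).
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "saved_model.pb"), "wb") as f:
        f.write(b"\x0a\x0dTRTEngineOp_0\x12\x0bTRTEngineOp")
    print(saved_model_mentions_op(tmp))            # True
    print(saved_model_mentions_op(tmp, "Conv2D"))  # False
```

For the model in this issue, the directory to check would be the version directory that Serving loads (`/models/trt-frcnn/1` inside the container).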

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux RHEL 7.6
TensorFlow Serving installed from (source or binary): with Docker
TensorFlow Serving version: 1.13

Describe the problem

During deployment, I got this error:
~~~
2019-06-28 04:38:22.403601: E tensorflow_serving/util/retrier.cc:37]
Loading servable: {name: trt-frcnn version: 1} failed:
Not found: Op type not registered 'TRTEngineOp' in binary running on 1e69d63dfee9.
Make sure the Op and Kernel are registered in the binary running in this process.
Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) tf.contrib.resampler should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
~~~

Exact Steps to Reproduce and logs

1) Use TF-TRT to optimize a SavedModel, with tensorflow-gpu version 1.14
~~~
2019-06-28 11:16:03.857404: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:460] There are 2722 ops of 57 different types in the graph that are not converted to TensorRT: Sum, TopKV2, Select, CropAndResize, Fill, Split, Transpose, Where, Size, GatherV2, Greater, Equal, NonMaxSuppressionV3, Reshape, Add, ResizeBilinear, Assert, LoopCond, Merge, Squeeze, Enter, DataFormatVecPermute, ZerosLike, Less, Range, Placeholder, TensorArrayV3, TensorArraySizeV3, TensorArrayScatterV3, Cast, Maximum, StridedSlice, Shape, Minimum, Switch, TensorArrayReadV3, Prod, Identity, ExpandDims, ConcatV2, Unpack, RealDiv, Pad, Slice, LogicalAnd, Mul, Round, TensorArrayWriteV3, GreaterEqual, NoOp, Pack, Exit, NextIteration, TensorArrayGatherV3, Sub, Const, Tile, (For more information see https://docs.nvidia.com/deeplearning/dgx/tf-trt-user-guide/index.html#supported-ops).
2019-06-28 11:16:04.378135: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:733] Number of TensorRT candidate segments: 18
2019-06-28 11:16:04.684423: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.1
2019-06-28 11:16:04.684771: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node ClipToWindow/TRTEngineOp_0 added for segment 0 consisting of 8 nodes succeeded.
2019-06-28 11:16:04.684937: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_1 added for segment 1 consisting of 4 nodes succeeded.
2019-06-28 11:16:04.685106: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_2 added for segment 2 consisting of 18 nodes succeeded.
2019-06-28 11:16:04.685303: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_3 added for segment 3 consisting of 18 nodes succeeded.
2019-06-28 11:16:04.685498: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_4 added for segment 4 consisting of 18 nodes succeeded.
2019-06-28 11:16:04.685696: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_5 added for segment 5 consisting of 18 nodes succeeded.
2019-06-28 11:16:04.705593: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_6 added for segment 6 consisting of 442 nodes succeeded.
2019-06-28 11:16:04.708003: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_7 added for segment 7 consisting of 4 nodes succeeded.
2019-06-28 11:16:04.708203: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_8 added for segment 8 consisting of 3 nodes succeeded.
2019-06-28 11:16:04.708369: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_9 added for segment 9 consisting of 3 nodes succeeded.
2019-06-28 11:16:04.708506: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node GridAnchorGenerator/TRTEngineOp_10 added for segment 10 consisting of 8 nodes succeeded.
2019-06-28 11:16:04.708626: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node GridAnchorGenerator/TRTEngineOp_11 added for segment 11 consisting of 3 nodes succeeded.
2019-06-28 11:16:04.708736: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node GridAnchorGenerator/TRTEngineOp_12 added for segment 12 consisting of 3 nodes succeeded.
2019-06-28 11:16:04.725830: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_13 added for segment 13 consisting of 169 nodes succeeded.
2019-06-28 11:16:04.727548: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_14 added for segment 14 consisting of 7 nodes succeeded.
2019-06-28 11:16:04.728181: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node TRTEngineOp_15 added for segment 15 consisting of 7 nodes succeeded.
2019-06-28 11:16:04.728442: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node SecondStagePostprocessor/TRTEngineOp_16 added for segment 16 consisting of 8 nodes succeeded.
2019-06-28 11:16:04.728586: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:835] TensorRT node SecondStagePostprocessor/TRTEngineOp_17 added for segment 17 consisting of 7 nodes succeeded.
2019-06-28 11:16:04.945385: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:716] Optimization results for grappler item: tf_graph
2019-06-28 11:16:04.945483: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 6456 nodes (-475), 10488 edges (-486), time = 764.6ms.
2019-06-28 11:16:04.945501: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] layout: Graph size after: 6483 nodes (27), 10515 edges (27), time = 245.293ms.
2019-06-28 11:16:04.945517: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] constant folding: Graph size after: 6471 nodes (-12), 10508 edges (-7), time = 489.997ms.
2019-06-28 11:16:04.945540: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:718] TensorRTOptimizer: Graph size after: 5741 nodes (-730), 9719 edges (-789), time = 1155.79297ms.
~~~
2) Serve the converted model with TensorFlow Serving in Docker
~~~
$ docker run -e NVIDIA_VISIBLE_DEVICES=1 -t --rm --name trt-frcnn \
-p 9000:8500 -v "/home/yilrr/tf-serving/trt-frcnn/versions/:/models/trt-frcnn" \
-e MODEL_NAME=trt-frcnn -t 8ed566398ac8

2019-06-28 04:38:22.116399: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config: model_name: trt-frcnn model_base_path: /models/trt-frcnn
2019-06-28 04:38:22.116569: I tensorflow_serving/model_servers/server_core.cc:461] Adding/updating models.
2019-06-28 04:38:22.116583: I tensorflow_serving/model_servers/server_core.cc:558] (Re-)adding model: trt-frcnn
2019-06-28 04:38:22.217017: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: trt-frcnn version: 1}
2019-06-28 04:38:22.217039: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: trt-frcnn version: 1}
2019-06-28 04:38:22.217053: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: trt-frcnn version: 1}
2019-06-28 04:38:22.217076: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/trt-frcnn/1
2019-06-28 04:38:22.217091: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/trt-frcnn/1
2019-06-28 04:38:22.276450: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-06-28 04:38:22.403529: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:285] SavedModel load for tags { serve }; Status: fail. Took 186421 microseconds.
2019-06-28 04:38:22.403601: E tensorflow_serving/util/retrier.cc:37] Loading servable: {name: trt-frcnn version: 1} failed: Not found: Op type not registered 'TRTEngineOp' in binary running on 1e69d63dfee9. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) tf.contrib.resampler should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
~~~
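
The conversion log in step 1 can be summarized with a few lines of Python, which makes it easier to see how much of the graph actually ended up inside TRTEngineOp nodes. This is a minimal sketch: the regular expression is written against the `convert_graph.cc` lines exactly as they appear above, and the function name `summarize_segments` is made up for illustration.

```python
import re

# Matches the "TensorRT node ... added for segment N consisting of M nodes" lines.
SEGMENT_RE = re.compile(
    r"TensorRT node (\S+) added for segment (\d+) consisting of (\d+) nodes succeeded"
)


def summarize_segments(log_text):
    """Return (segment_count, total_nodes, largest_segment) for a conversion log.

    largest_segment is a (node_name, segment_index, size) tuple, or None
    when the log contains no matching lines.
    """
    segments = [
        (name, int(index), int(size))
        for name, index, size in SEGMENT_RE.findall(log_text)
    ]
    total = sum(size for _, _, size in segments)
    largest = max(segments, key=lambda s: s[2]) if segments else None
    return len(segments), total, largest


# Three lines copied from the log in step 1 (timestamps and file prefixes trimmed).
sample_log = """
TensorRT node ClipToWindow/TRTEngineOp_0 added for segment 0 consisting of 8 nodes succeeded.
TensorRT node TRTEngineOp_6 added for segment 6 consisting of 442 nodes succeeded.
TensorRT node TRTEngineOp_13 added for segment 13 consisting of 169 nodes succeeded.
"""

print(summarize_segments(sample_log))  # (3, 619, ('TRTEngineOp_6', 6, 442))
```

Adding up all 18 segments in the full log gives 748 nodes. Replacing each segment with a single TRTEngineOp node removes 748 - 18 = 730 nodes, which is consistent with the "(-730)" reported on the TensorRTOptimizer line.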

EsmeYi

👍1

All 6 comments

You compiled a graph in TensorFlow using an op that isn't available in Serving 1.13. Please refer to the following issue, which describes how to solve problems like this, and let me know if it helps. Thanks!

gowthamkpr on 28 Jun 2019

I don't know Serving, but I know that TRTEngineOp is not loaded by TensorFlow 1.13 by default. You need to use `import tensorflow.contrib.tensorrt as trt` to dynamically load the TRTEngineOp.

In TF 1.14.0 the dynamic load is not enabled either, and I couldn't find how to load it, but it is fixed in the 1.14.1-dev version (master branch).

I hope this helps.

BertrandD on 1 Jul 2019

👎1

@BertrandD
Thanks!
Btw, I found it also works in TF 1.14.0 using `import tensorflow.contrib.tensorrt as trt` to dynamically load the TRTEngineOp.

But I can't find a way to make TF Serving (with Docker) support TRTEngineOp.

EsmeYi on 1 Jul 2019

👍2
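
The Python-side workaround discussed in the two comments above can be sketched as follows. This is an illustration under assumptions, not a fix for Serving: it targets TF 1.x with the contrib TensorRT module available, the helper names `try_register_trt_ops` and `load_trt_saved_model` are hypothetical, and the `serve` tag is taken from the Serving log in the issue. The import is guarded so the snippet still runs where TensorFlow is missing.

```python
import importlib


def try_register_trt_ops():
    """Import the TF 1.x contrib TensorRT module, which registers TRTEngineOp.

    Returns True if the import succeeded and False otherwise, for example
    when TensorFlow is not installed or the build has no contrib module.
    """
    try:
        importlib.import_module("tensorflow.contrib.tensorrt")
    except Exception:
        # Broad on purpose: a missing TensorRT library can surface as an
        # error type other than ImportError.
        return False
    return True


def load_trt_saved_model(export_dir):
    """Load a TF-TRT SavedModel into a TF 1.x session (hypothetical helper)."""
    # The op must be registered before the graph is imported.
    if not try_register_trt_ops():
        raise RuntimeError("tensorflow.contrib.tensorrt is not importable here")
    import tensorflow as tf  # imported lazily so the sketch runs without TF

    sess = tf.Session(graph=tf.Graph())
    tf.saved_model.loader.load(sess, ["serve"], export_dir)
    return sess


if try_register_trt_ops():
    print("tensorflow.contrib.tensorrt imported; TRTEngineOp should be registered")
else:
    print("tensorflow.contrib.tensorrt is not importable in this environment")
```

This only helps a Python process. The TensorFlow Serving binary is a separate program, so the op has to be compiled into that binary for the model to load there.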

I have a similar problem. However, I'm a bit skeptical about @gowtham-kp's proposed solution because:

1) The TensorRT optimisation completed successfully (saved_model_cli doesn't produce an error; the return code is 0).
2) If the same graph were loaded with Python code _and_ with `import tensorflow.contrib.tensorrt as trt`, it should run successfully.
3) All the fixes so far for TRTEngineOp in particular point to including the line above.

But if I understand what you're trying to say, it seems TF-Serving would only work if _every_ OP is supported during conversion. Given that it's possible to run the graph in spite of unsupported OPs (since TensorRT would just ignore those nodes), it seems like this is a defect.

benjamintanweihao on 2 Jul 2019

👍1

TensorRT should be supported by TF serving starting from v1.13 (see release notes at https://github.com/tensorflow/serving/releases/tag/1.13.0).
This post has some usage examples: https://medium.com/tensorflow/optimizing-tensorflow-serving-performance-with-nvidia-tensorrt-6d8a2347869a.
In "Steps to reproduce", could you please share the full "docker run" command with the TF serving image included?