Models: Object Detection: Errors with Export/Import for inference

Created on 4 Aug 2017 · 6 comments · Source: tensorflow/models

System information

  • What is the top-level directory of the model you are using:
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.2.1
  • Bazel version (if compiling from source): N/A
  • CUDA/cuDNN version: 6.0
  • GPU model and memory: Tesla P100 16276 MiB
  • Exact command to reproduce: See description

Describe the problem

Overview

Loading a frozen model created with export_inference_graph.py results in "Truncated message." errors. As a result, I am unable to use the model for inference. The model was generated after 10,820 iterations of training the pet detector example locally.

Details:

The tutorial at https://github.com/tensorflow/models/blob/master/object_detection/g3doc/exporting_models.md has some incorrect input option names; however, a few minor changes to the suggested inputs get export_inference_graph.py to run via:
python object_detection/export_inference_graph.py --input_type image_tensor --pipeline_config_path /home/qzhrlc/models/object_detection/samples/configs/faster_rcnn_resnet101_pets.config --trained_checkpoint_prefix /path/to/checkpoint/model.ckpt-10820 --output_directory .

This produces a .pb file; however, some log messages appear that may indicate an issue. Then, when loading it into Python via

import tensorflow as tf

PATH_TO_CKPT = '/home/qzhrlc/models/saved_model/saved_model.pb'

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

a DecodeError: Truncated message. is raised at od_graph_def.ParseFromString(serialized_graph). The output of export_inference_graph and the error message are included below.
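For context, saved_model.pb is a serialized SavedModel protocol buffer rather than a bare GraphDef, which is consistent with GraphDef.ParseFromString failing on it. If the SavedModel itself is what you want to load, a minimal sketch, assuming TensorFlow 1.x's tf.saved_model.loader API and the saved_model directory written by the export script, would be:

import tensorflow as tf

# Point at the SavedModel *directory* (the one containing saved_model.pb),
# not at the .pb file itself.
SAVED_MODEL_DIR = '/home/qzhrlc/models/saved_model'

with tf.Session(graph=tf.Graph()) as sess:
    # Load the graph and variables tagged for serving into this session.
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], SAVED_MODEL_DIR)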

Source code / logs

export_inference_graph output:

2017-08-04 12:00:13.624623: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-04 12:00:13.624652: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-04 12:00:13.624656: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-04 12:00:13.624678: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-04 12:00:13.624683: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-08-04 12:00:14.054272: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: Tesla P100-SXM2-16GB
major: 6 minor: 0 memoryClockRate (GHz) 1.4805
pciBusID 0000:05:00.0
Total memory: 15.89GiB
Free memory: 514.25MiB
2017-08-04 12:00:14.452981: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x714e4f0 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-08-04 12:00:14.453690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 1 with properties:
name: Tesla P100-SXM2-16GB
major: 6 minor: 0 memoryClockRate (GHz) 1.4805
pciBusID 0000:06:00.0
Total memory: 15.89GiB
Free memory: 514.25MiB
2017-08-04 12:00:14.885142: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x7151e80 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-08-04 12:00:14.885995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 2 with properties:
name: Tesla P100-SXM2-16GB
major: 6 minor: 0 memoryClockRate (GHz) 1.4805
pciBusID 0000:84:00.0
Total memory: 15.89GiB
Free memory: 514.25MiB
2017-08-04 12:00:15.310494: W tensorflow/stream_executor/cuda/cuda_driver.cc:523] A non-primary context 0x7155810 exists before initializing the StreamExecutor. We haven't verified StreamExecutor works with that.
2017-08-04 12:00:15.311252: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 3 with properties:
name: Tesla P100-SXM2-16GB
major: 6 minor: 0 memoryClockRate (GHz) 1.4805
pciBusID 0000:85:00.0
Total memory: 15.89GiB
Free memory: 514.25MiB
2017-08-04 12:00:15.312769: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0 1 2 3
2017-08-04 12:00:15.312779: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y Y Y Y
2017-08-04 12:00:15.312783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 1: Y Y Y Y
2017-08-04 12:00:15.312786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 2: Y Y Y Y
2017-08-04 12:00:15.312790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 3: Y Y Y Y
2017-08-04 12:00:15.312798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P100-SXM2-16GB, pci bus id: 0000:05:00.0)
2017-08-04 12:00:15.312805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla P100-SXM2-16GB, pci bus id: 0000:06:00.0)
2017-08-04 12:00:15.312810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:2) -> (device: 2, name: Tesla P100-SXM2-16GB, pci bus id: 0000:84:00.0)
2017-08-04 12:00:15.312832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:3) -> (device: 3, name: Tesla P100-SXM2-16GB, pci bus id: 0000:85:00.0)
2017-08-04 12:01:02.982605: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P100-SXM2-16GB, pci bus id: 0000:05:00.0)
2017-08-04 12:01:02.982634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla P100-SXM2-16GB, pci bus id: 0000:06:00.0)
2017-08-04 12:01:02.982656: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:2) -> (device: 2, name: Tesla P100-SXM2-16GB, pci bus id: 0000:84:00.0)
2017-08-04 12:01:02.982661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:3) -> (device: 3, name: Tesla P100-SXM2-16GB, pci bus id: 0000:85:00.0)
Converted 530 variables to const ops.
2017-08-04 12:01:12.496242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P100-SXM2-16GB, pci bus id: 0000:05:00.0)
2017-08-04 12:01:12.496283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla P100-SXM2-16GB, pci bus id: 0000:06:00.0)
2017-08-04 12:01:12.496288: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:2) -> (device: 2, name: Tesla P100-SXM2-16GB, pci bus id: 0000:84:00.0)
2017-08-04 12:01:12.496292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:3) -> (device: 3, name: Tesla P100-SXM2-16GB, pci bus id: 0000:85:00.0)

Import error message:

DecodeError Traceback (most recent call last)
<ipython-input> in <module>()
4 with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
5 serialized_graph = fid.read()
----> 6 od_graph_def.ParseFromString(serialized_graph)
7 tf.import_graph_def(od_graph_def, name='')

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/message.pyc in ParseFromString(self, serialized)
183 """
184 self.Clear()
--> 185 self.MergeFromString(serialized)
186
187 def SerializeToString(self):

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/python_message.pyc in MergeFromString(self, serialized)
1060 length = len(serialized)
1061 try:
-> 1062 if self._InternalParse(serialized, 0, length) != length:
1063 # The only reason _InternalParse would return early is if it
1064 # encountered an end-group tag.

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/python_message.pyc in InternalParse(self, buffer, pos, end)
1096 pos = new_pos
1097 else:
-> 1098 pos = field_decoder(buffer, new_pos, end, self, field_dict)
1099 if field_desc:
1100 self._UpdateOneofState(field_desc)

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/decoder.pyc in DecodeField(buffer, pos, end, message, field_dict)
631 raise _DecodeError('Truncated message.')
632 # Read sub-message.
--> 633 if value._InternalParse(buffer, pos, new_pos) != new_pos:
634 # The only reason _InternalParse would return early is if it encountered
635 # an end-group tag.

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/python_message.pyc in InternalParse(self, buffer, pos, end)
1096 pos = new_pos
1097 else:
-> 1098 pos = field_decoder(buffer, new_pos, end, self, field_dict)
1099 if field_desc:
1100 self._UpdateOneofState(field_desc)

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/decoder.pyc in DecodeRepeatedField(buffer, pos, end, message, field_dict)
610 raise _DecodeError('Truncated message.')
611 # Read sub-message.
--> 612 if value.add()._InternalParse(buffer, pos, new_pos) != new_pos:
613 # The only reason _InternalParse would return early is if it
614 # encountered an end-group tag.

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/python_message.pyc in InternalParse(self, buffer, pos, end)
1096 pos = new_pos
1097 else:
-> 1098 pos = field_decoder(buffer, new_pos, end, self, field_dict)
1099 if field_desc:
1100 self._UpdateOneofState(field_desc)

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/decoder.pyc in DecodeMap(buffer, pos, end, message, field_dict)
741 # Read sub-message.
742 submsg.Clear()
--> 743 if submsg._InternalParse(buffer, pos, new_pos) != new_pos:
744 # The only reason _InternalParse would return early is if it
745 # encountered an end-group tag.

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/python_message.pyc in InternalParse(self, buffer, pos, end)
1086 if field_decoder is None:
1087 value_start_pos = new_pos
-> 1088 new_pos = local_SkipField(buffer, new_pos, end, tag_bytes)
1089 if new_pos == -1:
1090 return pos

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/decoder.pyc in SkipField(buffer, pos, end, tag_bytes)
848 # The wire type is always in the first byte since varints are little-endian.
849 wire_type = ord(tag_bytes[0:1]) & wiretype_mask
--> 850 return WIRETYPE_TO_SKIPPER[wire_type](buffer, pos, end)
851
852 return SkipField

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/decoder.pyc in _SkipGroup(buffer, pos, end)
797 while 1:
798 (tag_bytes, pos) = ReadTag(buffer, pos)
--> 799 new_pos = SkipField(buffer, pos, end, tag_bytes)
800 if new_pos == -1:
801 return pos

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/decoder.pyc in SkipField(buffer, pos, end, tag_bytes)
848 # The wire type is always in the first byte since varints are little-endian.
849 wire_type = ord(tag_bytes[0:1]) & wiretype_mask
--> 850 return WIRETYPE_TO_SKIPPER[wire_type](buffer, pos, end)
851
852 return SkipField

/home/qzhrlc/miniconda2/envs/ajlBaseCaffe/lib/python2.7/site-packages/google/protobuf/internal/decoder.pyc in _SkipFixed32(buffer, pos, end)
812 pos += 4
813 if pos > end:
--> 814 raise _DecodeError('Truncated message.')
815 return pos
816

DecodeError: Truncated message.

Labels: bug

All 6 comments

Did you solve the problem? Could you help me with the same problem?

Same problem on the Windows platform.

Same problem on the Windows 10 platform... T-T

Likewise -- does anyone have an update on this?

@matlabninja @yzs15 @alicefqy @classicCoder16
Sorry, I just remembered this issue. I found my problem, and it will solve your problem too.
When export_inference_graph.py generates the frozen model, it creates two .pb files in output_directory:
the first is frozen_inference_graph.pb, in the root of the output directory;
the second is saved_model/saved_model.pb, in the saved_model subdirectory.
We both imported the wrong file.

So, your code:
PATH_TO_CKPT = '/home/qzhrlc/models/saved_model/saved_model.pb'
is not right.
The right way to import the model is:
PATH_TO_CKPT = '/home/qzhrlc/models/frozen_inference_graph.pb'

Then it will run fine. A minimal end-to-end sketch follows.
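Putting that together, here is a minimal sketch of loading the frozen graph and running detection once. The tensor names image_tensor, detection_boxes, detection_scores, detection_classes, and num_detections are the standard inputs/outputs of graphs exported by the Object Detection API; the dummy image is a placeholder for a real one:

import numpy as np
import tensorflow as tf

# Load the frozen GraphDef, not saved_model/saved_model.pb.
PATH_TO_CKPT = '/home/qzhrlc/models/frozen_inference_graph.pb'

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

with tf.Session(graph=detection_graph) as sess:
    # Standard tensor names in Object Detection API exports.
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    fetches = [detection_graph.get_tensor_by_name(name + ':0')
               for name in ('detection_boxes', 'detection_scores',
                            'detection_classes', 'num_detections')]

    # Dummy uint8 RGB image with a batch dimension; replace with a real image.
    image = np.zeros((1, 480, 640, 3), dtype=np.uint8)
    boxes, scores, classes, num = sess.run(
        fetches, feed_dict={image_tensor: image})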

@leacoleaco Thanks for the info! I had stepped away from this for a while, but I've come back to it, and your advice worked!
