I have been working on a custom object detection dataset, and both training and evaluation have been working as expected. However, when I export the latest checkpoint as a frozen inference graph and rerun evaluation using the exact same API from eval_util in the object detection module, the mAP is much lower (0.43 vs 0.5) than the mAP reported on TensorBoard during training/evaluation. What could be the cause of this issue, and how can I resolve it?
This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!
@pkdogcom How did you do the evaluation using the frozen graph? I can't load it
@psuff
import tensorflow as tf

# PATH_TO_CKPT is the path to the exported frozen_inference_graph.pb
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
from here
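That snippet only restores the graph; to actually run detection you fetch the output tensors by name. Here is a minimal sketch, assuming the graph was produced by the stock export_inference_graph.py (which gives it the usual image_tensor / detection_* tensor names) and that image_np is an RGB image array loaded elsewhere:

import numpy as np

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        output_tensors = [detection_graph.get_tensor_by_name(name + ':0')
                          for name in ('detection_boxes', 'detection_scores',
                                       'detection_classes', 'num_detections')]
        # image_np is assumed to be an HxWx3 uint8 RGB array; the exported
        # graph expects a batch dimension, hence the expand_dims.
        boxes, scores, classes, num = sess.run(
            output_tensors,
            feed_dict={image_tensor: np.expand_dims(image_np, axis=0)})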
@JulienSiems That just loads the graph, but I'd like to evaluate it using eval.py to compute the mAP.
How do I run evaluation using checkpoints? Mine is not loading.
It turns out that this issue has nothing to do with the exporter. The cause of the lower performance is that the way TensorFlow decodes JPEG images (using the JDCT_IFAST discrete cosine transform by default) differs from how OpenCV or SciPy/PIL decodes them, which can lead to non-trivial differences in the resulting NumPy image array. Please refer to Stack Overflow for more details.
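You can observe this directly: TF 1.x's tf.image.decode_jpeg takes a dct_method argument, and passing 'INTEGER_ACCURATE' makes its output match conventional libjpeg-based decoders (e.g. PIL) much more closely. A quick sketch ('image.jpg' is a placeholder path):

import tensorflow as tf

with tf.gfile.GFile('image.jpg', 'rb') as f:
    encoded = f.read()

fast = tf.image.decode_jpeg(encoded)  # default uses a fast, approximate DCT
accurate = tf.image.decode_jpeg(encoded, dct_method='INTEGER_ACCURATE')
with tf.Session() as sess:
    fast_np, accurate_np = sess.run([fast, accurate])
# The two arrays can differ by a few intensity levels per pixel.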
Since my model was trained on images decoded by TensorFlow while during my evaluation the images are read by OpenCV, it is to be expected that the performance differs or is lower. Since I have to use OpenCV to read images/videos, I believe I have to either modify the training/evaluation scripts to accept a 4-D decoded image tensor in the TFRecord, or use the RandomDistortColor data augmentation during training to make my model more robust to pixel/color distortion (config sketch below).
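For reference, enabling that augmentation is a pipeline.config entry; a sketch based on the OD API preprocessor proto (your surrounding train_config fields will differ):

train_config {
  # ... other train_config fields (batch_size, optimizer, etc.)
  data_augmentation_options {
    random_distort_color {
    }
  }
}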
@psuff To manually evaluate the exported model or checkpoints, you can use the object_detection_evaluation API directly, as evaluator.py does. For example:

import numpy as np
import tensorflow as tf

from object_detection.core.standard_fields import DetectionResultFields, InputDataFields
from object_detection.utils import object_detection_evaluation

...

def scale_boxes_to_absolute(boxes, width, height):
    # Helper (assumed): the detector returns normalized [ymin, xmin, ymax, xmax]
    # boxes; scale them to absolute pixel coordinates for the evaluator.
    return boxes * np.array([height, width, height, width], dtype=np.float32)

evaluator = object_detection_evaluation.PascalDetectionEvaluator(categories)
for image, data in zip(image_list, annotation_data):
    with tf.gfile.GFile(image, 'rb') as fid:
        image_np = detector.decode_image(fid.read())  # `detector` is my own inference wrapper
    (boxes, scores, classes, num_detections) = detector.detect(image_np, min_score_thresh=0.0)

    # Add ground truth (absolute coordinates, as in the VOC XML) to the evaluator.
    groundtruth_boxes = []
    groundtruth_classes = []
    for obj in data['object']:
        groundtruth_boxes.append([obj['bndbox']['ymin'], obj['bndbox']['xmin'],
                                  obj['bndbox']['ymax'], obj['bndbox']['xmax']])
        groundtruth_classes.append(label_map_dict[obj['name']])
    groundtruth_dict = {
        InputDataFields.groundtruth_boxes: np.array(groundtruth_boxes, dtype=np.float32),
        InputDataFields.groundtruth_classes: np.array(groundtruth_classes),
    }
    evaluator.add_single_ground_truth_image_info(image, groundtruth_dict)

    # Scale detection results to absolute coordinates and add them to the evaluator.
    width = int(data['size']['width'])
    height = int(data['size']['height'])
    scaled_boxes = scale_boxes_to_absolute(boxes, width, height)
    detections_dict = {
        DetectionResultFields.detection_boxes: scaled_boxes,
        DetectionResultFields.detection_scores: scores,
        DetectionResultFields.detection_classes: classes,
    }
    evaluator.add_single_detected_image_info(image, detections_dict)

metrics = evaluator.evaluate()
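In case it helps: the `categories` and `label_map_dict` used above can be built from the training label map with the stock label_map_util helpers. A sketch, where PATH_TO_LABELS and max_num_classes are placeholders for your own label map:

from object_detection.utils import label_map_util

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)  # your label_map.pbtxt
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=90, use_display_name=True)
label_map_dict = label_map_util.get_label_map_dict(PATH_TO_LABELS)  # name -> id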
Well, in fact the way TensorFlow decodes the image has minimal impact (1%~2%) on the performance. The real reason I got much lower performance with OpenCV-decoded input images is that OpenCV uses the BGR color space in imread and VideoCapture, while TensorFlow uses RGB. Changing the color space fixed my issue.
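In other words, if you feed OpenCV frames to a model trained on RGB inputs, convert them first; a one-line fix along these lines ('frame.jpg' is a placeholder path):

import cv2

image_bgr = cv2.imread('frame.jpg')                      # OpenCV decodes to BGR
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)   # the model expects RGB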
@psuff Have you gotten the mAP using the .pb file? Would you please give me some advice? Thanks.
@psuff @ShuaiZ1037 +1 for the proposal; is this a missing feature?