Models: Very slow Postprocessing in Object Detection API

Created on 4 Nov 2017 · 5 comments · Source: tensorflow/models

Inference with the TensorFlow Object Detection API is more than 3x slower than comparable TensorFlow implementations.

The paper "Speed/accuracy trade-offs for modern convolutional object detectors" states:
"Postprocessing can take up the bulk of the running time for the fastest models at ∼40ms and currently caps our maximum framerate to 25 frames per second."

What justifies such heavy postprocessing? Can you please make it optional or faster?

Benchmark, SSD with 300x300 input on an Nvidia GTX 1080, Ubuntu 16.04:

ssd_mobilenet_v1_coco: 15.55 FPS
ssd_inception_v2_coco: 14.07 FPS
https://github.com/balancap/SSD-Tensorflow (VGG): 55.55 FPS

COCO test mAP:

ssd_mobilenet_v1_coco: 21 mAP
ssd_inception_v2_coco: 24 mAP
https://github.com/balancap/SSD-Tensorflow (VGG): 25.1 mAP


All 5 comments

@jch1 Here's some feedback on the performance of the object_detection API.

Hi @speyside42 --- we have made the `tf.image.non_max_suppression` op significantly faster since that paper was written. That said, it still runs on the CPU; future work might be to move some of it to the GPU.
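For context, the postprocessing step under discussion is essentially greedy non-maximum suppression. Below is a minimal NumPy sketch (an illustration only, not the API's implementation) that also shows why a near-zero score threshold like the default 1e-8 is expensive: almost every anchor box survives the score filter and enters the pairwise-IoU loop.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.5):
    """Greedy non-maximum suppression over [y1, x1, y2, x2] boxes."""
    # Cheap pre-filter: with score_threshold ~1e-8 almost nothing is
    # discarded here, so the loop below runs over all anchors.
    keep_mask = scores > score_threshold
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = scores.argsort()[::-1]  # indices, highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # IoU of the current top box against all remaining boxes
        y1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        x1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        y2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        x2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.maximum(0.0, y2 - y1) * np.maximum(0.0, x2 - x1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        # Drop boxes that overlap the kept box too much
        order = order[1:][iou <= iou_threshold]
    return boxes[keep], scores[keep]
```

The built-in op has the same greedy semantics, so raising `score_threshold` shrinks the candidate set before the quadratic-cost suppression loop.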

@jch1, I am also unable to reproduce the numbers listed by the Google team in the model zoo. With the ResNet101 COCO model I am seeing ~3x slower times on a Titan X with TF 1.4. I have created a Stack Overflow question.

What I experienced with the API is very low GPU usage during detection/inference.
Is there any option to optimize that? Is there a way to switch to a GPU mode?
For example, I got the following performance detecting an OpenCV webcam stream with SSD MobileNet:

  • Dell laptop with an Intel i7 CPU and Nvidia GTX 1050: 30-40% GPU usage at 25 FPS
  • Nvidia Jetson TX2: 5-10% GPU usage at 5 FPS

My project repo is: https://github.com/GustavZ/realtime_object_detection

I would appreciate any hints on how to increase performance!
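To compare FPS numbers like the ones above across machines, it helps to time the inference call itself, excluding camera capture and drawing. A minimal sketch; `infer_fn` is a placeholder for whatever wraps your `session.run` call:

```python
import time

def measure_fps(infer_fn, frames, warmup=2):
    """Return end-to-end frames per second for an inference callable."""
    # Warm-up runs so graph initialization / cuDNN autotuning
    # don't count against the measured throughput.
    for frame in frames[:warmup]:
        infer_fn(frame)
    start = time.perf_counter()
    for frame in frames:
        infer_fn(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# Usage (placeholder detector -- substitute your own):
# fps = measure_fps(lambda f: detector.run(f), captured_frames)
```

Low GPU utilization with a high measured FPS usually points at the CPU-side postprocessing, not the convolutional backbone.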

As GustavZ has found out in the meantime, you can simply raise the non-maximum-suppression score threshold in the .config file from 1e-8 to something like 0.5 to drastically improve speed without losing much accuracy. Then just export your model as a frozen graph. You will get the timings they present in the table. (Still slower than balancap.)
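For SSD pipelines the relevant knob sits in the `post_processing` block of the pipeline `.config`. A sketch of the edit, assuming the field layout from the API's `pipeline.proto`; values other than `score_threshold` are shown with common defaults, not recommendations:

```protobuf
post_processing {
  batch_non_max_suppression {
    score_threshold: 0.5        # was 1e-8; prunes most boxes before NMS
    iou_threshold: 0.6
    max_detections_per_class: 100
    max_total_detections: 100
  }
  score_converter: SIGMOID
}
```

After editing, re-export with the API's `export_inference_graph.py` so the frozen graph picks up the new threshold.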

GustavZ introduced more tricks to speed up the inference, I recommend checking out his repo.

I'm disappointed that this is not the default configuration and that issues are mostly ignored even when simple fixes exist.
