Models: lowproposals models are no faster than corresponding models with more proposals

Created on 19 Nov 2017 · 9 comments · Source: tensorflow/models

System information

  • What is the top-level directory of the model you are using:

research/object_detection

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):

Yes, I've written a simple script that allows me to select the model, input video, frames, etc.

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):

Ubuntu 17.04

  • TensorFlow installed from (source or binary):

pip install tensorflow or tensorflow-gpu (in a virtual environment)

  • TensorFlow version (use command below):

1.4

  • Bazel version (if compiling from source):

N/A

  • CUDA/cuDNN version:

Whatever was installed with Python 3.5 and pip install tensorflow-gpu. I get the same behavior, however, regardless of whether I'm running tensorflow without CUDA or tensorflow-gpu with CUDA support.

  • GPU model and memory:

The results are independent of whether I run on a GPU.

  • Exact command to reproduce:

Run the object detection models using any strategy you want, then change just the model name.
My script essentially follows the same steps as:

research/object_detection/object_detection_tutorial.ipynb
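For reference, the measurement my script does amounts to something like the following sketch. `run_inference` is a hypothetical stand-in for the actual `sess.run` call on the detection graph in the tutorial notebook; only the timing/averaging logic is shown:

```python
import time

def time_model(run_inference, frames):
    """Return (average per-frame time, average regions per frame).

    `run_inference` is a placeholder for the real detection call
    (e.g. sess.run on the detection graph); here it only needs to
    return a list of detected regions for one frame.
    """
    total_dt = 0.0
    total_regions = 0
    for frame in frames:
        start = time.time()
        regions = run_inference(frame)
        total_dt += time.time() - start
        total_regions += len(regions)
    n = len(frames)
    return total_dt / n, total_regions / n

# Example with a dummy detector that always finds 5 regions:
dt, regions = time_model(lambda frame: [0] * 5, frames=range(10))
print(regions)  # 5.0
```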

Describe the problem

I'm pretty sure there's a bug in the lowproposals models. I have a script
that iterates over all models in the object detection model zoo, and the
results for the lowproposals models are essentially the same as for the
regular models, in both runtime and number of regions detected. Based on
the table describing the models, they should be several times faster.

The table below summarizes the results obtained for 10 frames of 4k video. In it:

  • model = an abbreviation formed by taking the first letter of each word
    in the model name, using 50 or 101 in the corresponding models to
    resolve abbreviations that would otherwise conflict under my naming
    convention. I'm also fairly sure my table is in the same order as the
    models are presented on the model zoo page.
  • dt = the per-frame execution time in seconds (single-GPU run)
  • regions = the average number of objects detected per frame

| model    | dt (s) | regions |
|----------|--------|---------|
| sm1c     | 0.264  | 5.50    |
| si2c     | 0.342  | 6.20    |
| fri2c    | 0.616  | 10.30   |
| frr50c   | 0.713  | 14.20   |
| frr50lc  | 0.732  | 14.20   |
| rr101c   | 0.719  | 8.50    |
| frr101c  | 0.834  | 13.40   |
| frr101lc | 0.809  | 13.40   |
| frir2ac  | 2.229  | 12.70   |
| frir2alc | 2.220  | 12.70   |
| frnc     | 1.899  | 15.50   |
| frnlc    | 1.907  | 15.50   |

Note that the *lc results are effectively identical to the corresponding *c
results. I get the same results with GPU and non-GPU execution (only the
runtime changes).

For SSD I'm using the Nov 17th models and for the others I'm using the Nov
8th models. The only difference in my code is the selection of a different
model file. I'm running top-of-tree master as of Nov 18th 20:32 PST.

Source code / logs



All 9 comments

Also confirmed.
The models in "lowproposals" have exactly the same configuration pipeline file and model (same md5 sum) compared to the regular models.
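The md5 comparison mentioned above can be reproduced with a few lines of Python; the paths in the commented example are placeholders for the extracted model tarballs:

```python
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """MD5 digest of a file, read in chunks so large checkpoints fit in memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Hypothetical paths; if the two downloads are truly identical, the digests match:
# print(md5sum("faster_rcnn_resnet101_coco/frozen_inference_graph.pb"))
# print(md5sum("faster_rcnn_resnet101_coco_lowproposals/frozen_inference_graph.pb"))
```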

I hit the same problem: the lowproposals siblings have the same config and return the same number of proposals as their "big" counterparts. I wasn't sure whether it was a bug or just my misunderstanding, so I asked here: https://stackoverflow.com/questions/47892848/lowproposals-tagged-model-from-tensorflow-detection-model-zoo-actually-does-not but got no response.

@tombstone not sure what I did; I just wanted to add my comment above, but I see in the log that I "unassigned" you. Sorry for that, I don't know how to take it back...

I think @tombstone isn't in the TF GitHub group any more, so your comment probably caused him to be unassigned. (I don't understand GitHub very well.) I've assigned another owner. Hi @derekjchow!

Should we just reduce "first_stage_max_proposals" in the config from 300 to 100, or 50, to get the expected "lowproposal" behavior?
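If that is the intended fix, it would presumably be a one-line change in the pipeline config. A sketch, where the value 50 is illustrative and `first_stage_max_proposals` is the field name used in the released faster_rcnn configs:

```
model {
  faster_rcnn {
    # ... other fields unchanged ...
    first_stage_max_proposals: 50  # regular configs ship with 300
    # ...
  }
}
```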

@stevenpclark I tried it and did not get better prediction performance. Have you found a solution?

Nope. I'm disappointed that we are getting no support here, since a major reason I chose to go with TF was that I read it had such a great support community.

After reducing first_stage_max_proposals, you have to export the graph again (using https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/exporting_models.md). You should see a small increase in speed (with no other changes, probably 10 to 15%).
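The re-export step from the guide linked above looks roughly like this; all paths are placeholders for your own checkpoint directory:

```
python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path path/to/pipeline.config \
    --trained_checkpoint_prefix path/to/model.ckpt \
    --output_directory path/to/exported_model
```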

Hi There,
We are checking to see if you still need help on this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you no longer need help on this issue, please consider closing it.
