Models: eval.py stops evaluating after 3-5 images

Created on 5 Jul 2017 · 19 comments · Source: tensorflow/models

On both the pet dataset and my own custom dataset, the evaluation script seemingly stops evaluating after a few samples. In both cases there is a warning about no ground truth examples for some of the classes, but I altered the line so the warning no longer occurs, and the evaluation still stops. Using Python 3.5.

WARNING:root:The following classes have no ground truth examples: [11 15 16 17 18 19]
/home/chris/tensorflow/models/object_detection/utils/metrics.py:144: RuntimeWarning: invalid value encountered in true_divide
  num_images_correctly_detected_per_class / num_gt_imgs_per_class)

There is no error, only the warning, but after that nothing new appears in TensorBoard.
On a side note, does the number of examples need to be set in the config file? It seems strange to have to hard-code it...

eval_config: { num_examples: 158 }
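For what it's worth, that true_divide warning is just NumPy flagging a division by a per-class ground-truth count of zero, which produces NaN rather than raising an error. A minimal stand-alone reproduction with made-up counts (only the two array names come from metrics.py; everything else here is illustrative):

```python
import numpy as np

# Made-up counts: the last two classes have no ground truth images.
num_images_correctly_detected_per_class = np.array([10.0, 7.0, 3.0, 0.0, 0.0])
num_gt_imgs_per_class = np.array([12.0, 9.0, 4.0, 0.0, 0.0])

# Same expression as in metrics.py. Without the errstate guard, the 0/0
# entries emit "RuntimeWarning: invalid value encountered in true_divide";
# either way the result is NaN, not an exception.
with np.errstate(invalid="ignore"):
    per_class_score = num_images_correctly_detected_per_class / num_gt_imgs_per_class

print(per_class_score)  # [0.8333... 0.7777... 0.75 nan nan]
```

So the warning by itself does not stop evaluation; the script keeps running after it.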

Most helpful comment

eval.py is still evaluating in the background; it just does not print the results.
It prints the eval results normally after I set the logging level in object_detection/eval.py.
Just add two lines:

import logging
logging.basicConfig(level=logging.INFO)
INFO:tensorflow:Restoring parameters from ./object_detection/train_result\model.ckpt-7378
INFO:tensorflow:Restoring parameters from ./object_detection/train_result\model.ckpt-7378
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-0.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-1.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-2.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-3.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-4.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-5.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-6.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-7.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-8.
INFO:root:Creating detection visualizations.
INFO:root:Detection visualizations written to summary with tag image-9.
INFO:root:Running eval ops batch 100/4952
INFO:root:Running eval ops batch 200/4952
INFO:root:Running eval ops batch 300/4952
INFO:root:Running eval ops batch 400/4952
INFO:root:Running eval ops batch 500/4952
INFO:root:Running eval ops batch 600/4952
INFO:root:Running eval ops batch 700/4952
INFO:root:Running eval ops batch 800/4952
INFO:root:Running eval ops batch 900/4952
INFO:root:Running eval ops batch 1000/4952
INFO:root:Running eval ops batch 1100/4952
INFO:root:Running eval ops batch 1200/4952
INFO:root:Running eval ops batch 1300/4952
INFO:root:Running eval ops batch 1400/4952
INFO:root:Running eval ops batch 1500/4952
INFO:root:Running eval ops batch 1600/4952
INFO:root:Running eval ops batch 1700/4952
INFO:root:Running eval ops batch 1800/4952
INFO:root:Running eval ops batch 1900/4952
INFO:root:Running eval ops batch 2000/4952
INFO:root:Running eval ops batch 2100/4952
INFO:root:Running eval ops batch 2200/4952
INFO:root:Running eval ops batch 2300/4952
INFO:root:Running eval ops batch 2400/4952
INFO:root:Running eval ops batch 2500/4952
INFO:root:Running eval ops batch 2600/4952
INFO:root:Running eval ops batch 2700/4952
INFO:root:Running eval ops batch 2800/4952
INFO:root:Running eval ops batch 2900/4952
INFO:root:Running eval ops batch 3000/4952
INFO:root:Running eval ops batch 3100/4952
INFO:root:Running eval ops batch 3200/4952
INFO:root:Running eval ops batch 3300/4952
INFO:root:Running eval ops batch 3400/4952
INFO:root:Running eval ops batch 3500/4952
INFO:root:Running eval ops batch 3600/4952
INFO:root:Running eval ops batch 3700/4952
INFO:root:Running eval ops batch 3800/4952
INFO:root:Running eval ops batch 3900/4952
INFO:root:Running eval ops batch 4000/4952
INFO:root:Running eval ops batch 4100/4952
INFO:root:Running eval ops batch 4200/4952
INFO:root:Running eval ops batch 4300/4952
INFO:root:Running eval ops batch 4400/4952
INFO:root:Running eval ops batch 4500/4952
INFO:root:Running eval ops batch 4600/4952
INFO:root:Running eval ops batch 4700/4952
INFO:root:Running eval ops batch 4800/4952
INFO:root:Running eval ops batch 4900/4952
INFO:root:Running eval batches done.
INFO:root:Computing Pascal VOC metrics on results.
WARNING:root:The following classes have no ground truth examples: 0
D:\Deep Learning Software\RCNN\object_detection\utils\metrics.py:145: RuntimeWarning: invalid value encountered in true_divide
  num_images_correctly_detected_per_class / num_gt_imgs_per_class)
INFO:root:Writing metrics to tf summary.
INFO:root:PerformanceByCategory/mAP@0.5IOU/aeroplane: 0.817954
INFO:root:PerformanceByCategory/mAP@0.5IOU/bicycle: 0.721141
INFO:root:PerformanceByCategory/mAP@0.5IOU/bird: 0.799558
INFO:root:PerformanceByCategory/mAP@0.5IOU/boat: 0.457477
INFO:root:PerformanceByCategory/mAP@0.5IOU/bottle: 0.616356
INFO:root:PerformanceByCategory/mAP@0.5IOU/bus: 0.729833
INFO:root:PerformanceByCategory/mAP@0.5IOU/car: 0.798665
INFO:root:PerformanceByCategory/mAP@0.5IOU/cat: 0.889659
INFO:root:PerformanceByCategory/mAP@0.5IOU/chair: 0.562433
INFO:root:PerformanceByCategory/mAP@0.5IOU/cow: 0.658600
INFO:root:PerformanceByCategory/mAP@0.5IOU/diningtable: 0.342314
INFO:root:PerformanceByCategory/mAP@0.5IOU/dog: 0.840469
INFO:root:PerformanceByCategory/mAP@0.5IOU/horse: 0.855745
INFO:root:PerformanceByCategory/mAP@0.5IOU/motorbike: 0.779330
INFO:root:PerformanceByCategory/mAP@0.5IOU/person: 0.887980
INFO:root:PerformanceByCategory/mAP@0.5IOU/pottedplant: 0.448292
INFO:root:PerformanceByCategory/mAP@0.5IOU/sheep: 0.599590
INFO:root:PerformanceByCategory/mAP@0.5IOU/sofa: 0.553417
INFO:root:PerformanceByCategory/mAP@0.5IOU/train: 0.762534
INFO:root:PerformanceByCategory/mAP@0.5IOU/tvmonitor: 0.640537
INFO:root:Precision/mAP@0.5IOU: 0.688094
INFO:root:Metrics written to tf summary.
INFO:root:# success: 4952
INFO:root:# skipped: 0
INFO:root:Starting evaluation at 2017-09-07-04:18:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:23:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:28:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:33:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:38:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:43:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:48:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:53:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-04:58:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-05:03:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-05:08:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-05:13:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds
INFO:root:Starting evaluation at 2017-09-07-05:18:21
INFO:root:Found already evaluated checkpoint. Will try again in 300 seconds

If you want to see more visualized detection results in TensorBoard, modify this parameter in the file object_detection\protos\eval.proto:
optional uint32 num_visualizations = 1 [default=10];

All 19 comments

If I understand it correctly, num_examples is the number of images the current model is evaluated on. For instance, if you have 5000 images in your test dataset, setting num_examples: 2000 makes each evaluation run on only 2000 of them.

Regarding the warning, I think it just takes time. You could track the progress by adding
tf.logging.set_verbosity(tf.logging.INFO)
after the imports in eval.py.

Okay, I'll add the extra logging. What is strange, though, is that it processes the first few images really quickly, hits that warning (or not, if I don't call the NumPy function and just return zeros), and then nothing happens.

I have the same issue; tf.logging.set_verbosity(tf.logging.INFO) (after the imports) does not seem to work for eval.py, in contrast to train.py.

Furthermore, num_examples is weird. If your test set has X images and num_examples < X, then you test a subset. But if num_examples is bigger, for example num_examples = 2X, then evaluation runs twice over the same test set.

It seems that right now there is no default value that makes evaluation run exactly once over the full test set without manually specifying its size. It would be nice if this were added later as a feature.

Still happening, unsure why. I've tried looking into the evaluator.py file but am not finding much there. Could it be related to how my data is annotated? For instance, I omitted some tags like pose, difficult, truncated, etc. because they aren't relevant to my data. That seems unlikely, though, because if that were the case, why would it train fine and still evaluate a couple of images?

Hi @ckalas - can you provide a bit more of your eval log? That might be useful for additional context.

@jch1 this is all i get, the verbose option is on afaik.

chris@chris-P775DM3-G: python3 eval.py --pipeline_config='ssd_inception_v2_cylinders_fp.config' --eval_dir='ssd_inception_fp' --checkpoint_dir='ssd_inception_fp'
2017-07-10 10:15:50.610774: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-07-10 10:15:50.611137: I tensorflow/core/common_runtime/gpu/gpu_device.cc:938] Found device 0 with properties: 
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.847
pciBusID 0000:01:00.0
Total memory: 7.92GiB
Free memory: 7.30GiB
2017-07-10 10:15:50.611149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:959] DMA: 0 
2017-07-10 10:15:50.611153: I tensorflow/core/common_runtime/gpu/gpu_device.cc:969] 0:   Y 
2017-07-10 10:15:50.611164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1028] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0)
INFO:tensorflow:Restoring parameters from ssd_inception_fp/model.ckpt-11438
INFO:tensorflow:Restoring parameters from ssd_inception_fp/model.ckpt-11438
WARNING:root:The following classes have no ground truth examples: [11 15 16 17 18 19]
/home/chris/tensorflow/models/object_detection/utils/metrics.py:144: RuntimeWarning: invalid value encountered in true_divide
  num_images_correctly_detected_per_class / num_gt_imgs_per_class)

After that warning I have to Ctrl-C and rerun to evaluate more images; it never reaches the end of evaluation and has never printed any results.

I found an error in the label_map_util file, in the function convert_label_map_to_categories. The docstring says "We only allow class into the list if its id-label_id_offset is between 0 (inclusive)", but that clearly does not include 0 if you look at this line: if not 0 < item.id <= max_num_classes

It still doesn't evaluate and gives the same error. I also changed eval_config.num_visualizations, which shows more images, but they repeat a lot (as my validation set is small), and the program didn't finish running either.

Same problem with me..

WARNING:root:The following classes have no ground truth examples: 0
/home/zha/Documents/models-master/object_detection/utils/metrics.py:144: RuntimeWarning: invalid value encountered in true_divide
  num_images_correctly_detected_per_class / num_gt_imgs_per_class)
When I check via TensorBoard, it shows the results for 5 images, then hangs.

I think this line, if not 0 < item.id <= max_num_classes, from this file can be read as consistent with the comment

We only allow class into the list if its id-label_id_offset is
between 0 (inclusive) and max_num_classes (exclusive).

if we make the assumption that label_id_offset is 1. So for a label map with class ids 1, 2, ..., n and max_num_classes >= n, this should work fine. However, I'm not sure whether the code could be improved for clarity (or whether this is related to your problem).
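To make the check concrete, here is a simplified stand-in for convert_label_map_to_categories (plain tuples instead of the real protobuf items; the entries and names are illustrative): with label_id_offset = 1, the filter 0 < id <= max_num_classes keeps exactly the ids 1..max_num_classes and drops id 0.

```python
max_num_classes = 3

# Hypothetical label-map entries as (id, name) tuples.
items = [(0, "none_of_the_above"), (1, "cat"), (2, "dog"), (3, "bird"), (4, "fish")]

categories = [
    {"id": item_id, "name": name}
    for item_id, name in items
    if 0 < item_id <= max_num_classes  # i.e. id - 1 in [0, max_num_classes)
]

print([c["id"] for c in categories])  # [1, 2, 3]: id 0 and id 4 are skipped
```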

According to the new "using your own dataset" tutorial, "Label maps should always start from id 1." So, PR #1903 has removed class id 0 ("none of the above") from the label maps of the pets dataset.

@ckalas Could you give some more information about running the pets dataset (so others could try reproducing the error)? Also, there was an issue with python 3 that was fixed in PR #1758. Could you try git pull to update?

@korrawat the pet dataset was run using the tutorials in the g3doc folder, mainly following the run-locally and prepare-inputs sections. I also hadn't seen the new tutorial; I'll pull the new version and check it out.

I had the same problem:
WARNING:root:The following classes have no ground truth examples: [ 5 10 11 12 13 14 15 16 17 19]

Found the solution to my problem. In the default config file, eval_config.num_examples = 2000 and eval_input_reader.shuffle = false. So if your validation set has more than 2000 images and is sorted by class, the classes after the first 2000 images won't be validated. The solution is to change eval_config.num_examples to the size of your validation set, or to change eval_input_reader.shuffle to true.

As for the evaluation appearing to stop: it didn't stop; by default it runs periodically. If you want it to run just once, add max_evals: 1 to eval_config in the config file.
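Putting those suggestions together, the eval-related parts of the pipeline config could look like this (the values are placeholders; set num_examples to your own validation-set size, and keep your existing input_path/label_map_path entries):

```
eval_config: {
  num_examples: 5000   # set to the size of your validation set
  max_evals: 1         # evaluate once instead of looping every 300 seconds
}

eval_input_reader: {
  shuffle: true        # mixes classes when num_examples is smaller than the dataset
  # ... your existing input_path, label_map_path, etc.
}
```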

I added in max_evals and it now stops, but it doesn't print anything to the console other than the warning. From what I see in the code, it should print my mAP scores after it runs? In my case I have far fewer than 2000 eval images, so that shouldn't affect me.

It won't print scores after it finishes, but it will write a summary file to your eval output folder. If max_evals == 1, it will stop after one evaluation; otherwise, you can check your eval output folder for a new summary file and use TensorBoard to visualize it.
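One quick way to confirm a summary was actually written is to list the event files TensorBoard reads from the eval directory (a small helper sketch; eval_dir is whatever you passed as --eval_dir):

```python
import glob
import os

def list_eval_event_files(eval_dir):
    """Return the TensorBoard event files found in eval_dir, oldest first."""
    pattern = os.path.join(eval_dir, "events.out.tfevents.*")
    return sorted(glob.glob(pattern))

# After an eval pass, a new events.out.tfevents.* file should appear here;
# then point TensorBoard at the same directory to see the mAP summaries.
```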


@robert780612 Hi, I tried to run the evaluation script locally with TensorFlow 1.3. It's not showing any results the way they show up in your case.
This is what I got:

[screenshot]

Hi @shamanez
Did it show any information after the deprecation warning?
I'm not sure what's going on; maybe it's caused by the TensorFlow version (I use TF 1.2).

Hi @robert780612,
I got the same error. Did you find any solution for it?

Hi There,
We are checking to see if you still need help on this, as this seems to be a considerably old issue. Please update this issue with the latest information, a code snippet to reproduce your issue, and the error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you no longer need help on this issue, please consider closing it.

Hi I am getting this error. Please help.

INFO:tensorflow:No model found in training/model.ckpt-21248. Will try again in 300 seconds
