Models: Error when running eval.py: WARNING:root:The following classes have no ground truth examples: 0

Created on 20 Jun 2017 · 36 Comments · Source: tensorflow/models

When I run the TensorFlow Object Detection API locally, as https://github.com/tensorflow/models/blob/9c17823e147ff2893427b47cb57d171da9350d20/object_detection/g3doc/running_locally.md suggests, everything goes well when I run

$ python object_detection/train.py -logtostderr --pipeline_config_path=object_detection/mymodels/model/faster_rcnn_resnet101_voc07.config --train_dir=object_detection/mymodels/model/train/

and it trains correctly. But when I try to evaluate and run

python object_detection/eval.py --logtostderr --pipeline_config_path=object_detection/mymodels/model/faster_rcnn_resnet101_voc07.config --checkpoint_dir=object_detection/mymodels/model/train/ --eval_dir=object_detection/mymodels/model/eval/

it shows:
WARNING:root:The following classes have no ground truth examples: 0
/home/yanliang/.conda/envs/tensorflow/models/object_detection/utils/metrics.py:144: RuntimeWarning: invalid value encountered in true_divide
  num_images_correctly_detected_per_class / num_gt_imgs_per_class)
^CTraceback (most recent call last):
  File "object_detection/eval.py", line 162, in <module>
    tf.app.run()
  File "/home/yanliang/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/eval.py", line 158, in main
    FLAGS.checkpoint_dir, FLAGS.eval_dir)
  File "/home/yanliang/.conda/envs/tensorflow/models/object_detection/evaluator.py", line 211, in evaluate
    save_graph_dir=(eval_dir if eval_config.save_graph else ''))
  File "/home/yanliang/.conda/envs/tensorflow/models/object_detection/eval_util.py", line 524, in repeated_checkpoint_run
    time.sleep(time_to_next_eval)
KeyboardInterrupt

The dataset I use is pascal_voc_2012, and I followed the tutorial's layout as well:

+data
  -pascal_label_map.pbtxt
  -pascal_train.record
  -pascal_voc.record
+models
  +model
    -faster_rcnn_resnet101_voc07.config
    +train
    +eval

Can anybody give me some suggestions? Thanks!

YanLiang0813

👍9

Most helpful comment

@YanLiang0813 You can ignore the error. The class at index 0 is 'none_of_the_above' for both PASCAL and pet datasets and is a placeholder index. The TFRecords will contain no instances of this placeholder class.

derekjchow on 20 Jun 2017

👍8
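
A rough illustration of why index 0 shows up in the warning, assuming (as the comment above describes) that label ids start at 1 and index 0 is only a placeholder. The function name and the sample ids are hypothetical and are not taken from the Object Detection API:

```python
def classes_without_ground_truth(groundtruth_class_ids, num_classes):
    """Return the class indices that never occur in the ground truth."""
    counts = [0] * num_classes
    for class_id in groundtruth_class_ids:
        counts[class_id] += 1
    return [index for index, count in enumerate(counts) if count == 0]


# Hypothetical evaluation set: three real classes with ids 1..3.
# Index 0 is the placeholder, so nothing is ever labelled with it.
missing = classes_without_ground_truth([1, 2, 2, 3], num_classes=4)
print(missing)  # [0]
```

If only index 0 is listed, that matches the placeholder explanation. Any other index in the list would be worth checking against the label map and the evaluation records.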

All 36 comments

I have the same issue.

ahmetkucuk on 20 Jun 2017

@ahmetkucuk Did your training work well? This is part of my training log:
INFO:tensorflow:Restoring parameters from /home/yanliang/.conda/envs/tensorflow/models/object_detection/mymodels/faster_rcnn_resnet101_coco_11_06_2017/model.ckpt
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path object_detection/mymodels/model/train/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Recording summary at step 0.
INFO:tensorflow:global step 1: loss = 4.3562 (6.369 sec/step)
2017-06-20 10:50:49.153778: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 2383 get requests, put_count=1971 evicted_count=1000 eviction_rate=0.507357 and unsatisfied allocation rate=0.634494
2017-06-20 10:50:49.153983: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 100 to 110
INFO:tensorflow:global step 2: loss = 4.5299 (1.051 sec/step)
INFO:tensorflow:global step 3: loss = 4.3959 (0.363 sec/step)
INFO:tensorflow:global step 4: loss = 5.5421 (0.799 sec/step)
INFO:tensorflow:global step 5: loss = 3.9413 (1.042 sec/step)
INFO:tensorflow:global step 6: loss = 3.6625 (0.354 sec/step)
INFO:tensorflow:global step 7: loss = 3.6821 (0.364 sec/step)
INFO:tensorflow:global step 8: loss = 3.4374 (0.355 sec/step)
INFO:tensorflow:global step 9: loss = 3.3901 (0.359 sec/step)
INFO:tensorflow:global step 10: loss = 3.1503 (1.024 sec/step)
INFO:tensorflow:global step 11: loss = 3.2978 (0.360 sec/step)
INFO:tensorflow:global step 12: loss = 2.8448 (1.055 sec/step)
INFO:tensorflow:global step 13: loss = 3.2599 (0.470 sec/step)
INFO:tensorflow:global step 14: loss = 2.5151 (0.359 sec/step)
INFO:tensorflow:global step 15: loss = 2.2614 (0.358 sec/step)
INFO:tensorflow:global step 16: loss = 2.2486 (0.355 sec/step)
INFO:tensorflow:global step 17: loss = 2.2398 (0.810 sec/step)
2017-06-20 10:50:58.253875: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:247] PoolAllocator: After 2110 get requests, put_count=2065 evicted_count=1000 eviction_rate=0.484262 and unsatisfied allocation rate=0.506161
2017-06-20 10:50:58.253938: I tensorflow/core/common_runtime/gpu/pool_allocator.cc:259] Raising pool_size_limit_ from 256 to 281
INFO:tensorflow:global step 18: loss = 2.1277 (0.360 sec/step)
INFO:tensorflow:global step 19: loss = 2.9921 (0.349 sec/step)
INFO:tensorflow:global step 20: loss = 2.0339 (0.353 sec/step)
INFO:tensorflow:global step 21: loss = 2.6191 (0.347 sec/step)
INFO:tensorflow:global step 22: loss = 3.0585 (0.359 sec/step)
INFO:tensorflow:global step 23: loss = 1.1144 (0.976 sec/step)
INFO:tensorflow:global step 24: loss = 1.7001 (0.382 sec/step)
INFO:tensorflow:global step 25: loss = 1.3169 (0.347 sec/step)
INFO:tensorflow:global step 26: loss = 1.2461 (0.368 sec/step)
INFO:tensorflow:global step 27: loss = 1.9536 (0.370 sec/step)
INFO:tensorflow:global step 28: loss = 1.7631 (0.376 sec/step)
INFO:tensorflow:global step 29: loss = 2.2164 (1.042 sec/step)
INFO:tensorflow:global step 30: loss = 0.9388 (0.353 sec/step)
INFO:tensorflow:global step 31: loss = 2.1595 (0.362 sec/step)
INFO:tensorflow:global step 32: loss = 1.9991 (0.352 sec/step)
INFO:tensorflow:global step 33: loss = 2.1409 (0.365 sec/step)
INFO:tensorflow:global step 34: loss = 3.0498 (0.361 sec/step)
INFO:tensorflow:global step 35: loss = 1.7767 (0.355 sec/step)
INFO:tensorflow:global step 36: loss = 1.3106 (0.354 sec/step)
INFO:tensorflow:global step 37: loss = 1.3067 (0.357 sec/step)
INFO:tensorflow:global step 38: loss = 4.0444 (0.785 sec/step)
INFO:tensorflow:global step 39: loss = 1.9622 (1.082 sec/step)
INFO:tensorflow:global step 40: loss = 2.8836 (1.094 sec/step)
INFO:tensorflow:global step 41: loss = 2.6982 (0.382 sec/step)
INFO:tensorflow:global step 42: loss = 1.6046 (0.359 sec/step)
INFO:tensorflow:global step 43: loss = 1.1759 (1.070 sec/step)
INFO:tensorflow:global step 44: loss = 0.9371 (0.377 sec/step)
INFO:tensorflow:global step 45: loss = 1.4666 (0.377 sec/step)
INFO:tensorflow:global step 46: loss = 2.4793 (1.080 sec/step)
INFO:tensorflow:global step 47: loss = 2.8852 (0.379 sec/step)
INFO:tensorflow:global step 48: loss = 1.8985 (0.380 sec/step)
INFO:tensorflow:global step 49: loss = 1.8162 (0.638 sec/step)
INFO:tensorflow:global step 50: loss = 0.9691 (0.357 sec/step)
INFO:tensorflow:global step 51: loss = 1.2954 (0.437 sec/step)
INFO:tensorflow:global step 52: loss = 2.8442 (0.644 sec/step)

YanLiang0813 on 20 Jun 2017

@YanLiang0813 Yes, the total loss decreases gradually in my case as well.

ahmetkucuk on 20 Jun 2017

Having the same issue as well!

jaydee713 on 20 Jun 2017

@sguada I really need your help. Could you give some suggestions on how to solve this problem? Thanks!

YanLiang0813 on 20 Jun 2017

@derekjchow How do I ignore the error? I commented out these lines in object_detection_evaluation.py (https://github.com/tensorflow/models/blob/a4944a57ad2811e1f6a7a87589a9fc8a776e8d3c/object_detection/utils/object_detection_evaluation.py#L197):

if (self.num_gt_instances_per_class == 0).any():
  logging.warn(
      'The following classes have no ground truth examples: %s',
      np.squeeze(np.argwhere(self.num_gt_instances_per_class == 0)))

but it doesn't work; the error still exists:

/home/yanliang/.conda/envs/tensorflow/models/object_detection/utils/metrics.py:144: RuntimeWarning: invalid value encountered in true_divide
num_images_correctly_detected_per_class / num_gt_imgs_per_class)

Could you give me some suggestions on how to ignore the error? Has anyone solved it?

YanLiang0813 on 21 Jun 2017
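
Commenting out the logging call only removes the first message. The RuntimeWarning comes from NumPy's division in metrics.py. A minimal sketch of hiding both without editing library files, assuming the messages go through the root logger and the standard warnings module (which the `WARNING:root:` and `RuntimeWarning` prefixes suggest). The helper name is hypothetical, and it would have to run inside the eval.py process before evaluation starts:

```python
import logging
import warnings


def silence_eval_noise():
    """Hide the root-logger warning and the NumPy divide RuntimeWarning."""
    # "WARNING:root:..." is emitted by the root logger at WARNING level,
    # so raising the threshold to ERROR drops it.
    logging.getLogger().setLevel(logging.ERROR)
    # NumPy reports 0/0 as a RuntimeWarning whose text starts with
    # "invalid value encountered"; ignore only that message.
    warnings.filterwarnings(
        "ignore",
        message="invalid value encountered",
        category=RuntimeWarning,
    )
```

This only hides the messages. The per-class value for the placeholder index is still undefined, which is harmless if index 0 really is unused.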

@jaydee713 Did you solve this problem?

YanLiang0813 on 21 Jun 2017

@YanLiang0813 I didn't. I decided to just ignore it since it is only a warning :P It doesn't seem to have caused me any problems yet...

jaydee713 on 21 Jun 2017

@jaydee713 Yes, I understand now. We just ignore it and run train.py and eval.py concurrently, so we can watch the precision in TensorBoard.

YanLiang0813 on 21 Jun 2017

@YanLiang0813 But after this warning, eval.py seems to hang. Or does it just take a long time?

KleinYuan on 25 Jun 2017

It just takes a long time. You can open TensorBoard to monitor the results.

YanLiang0813 on 25 Jun 2017
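
The traceback in the original report ends in time.sleep(time_to_next_eval) inside repeated_checkpoint_run, which suggests the evaluator is waiting between evaluation rounds rather than being stuck. A simplified sketch of that kind of polling loop follows. This is not the library's code, and every name and parameter in it is hypothetical:

```python
import time


def repeated_eval(latest_checkpoint, evaluate, interval_secs, max_rounds,
                  sleep=time.sleep):
    """Evaluate each new checkpoint, then sleep until the next round."""
    last_seen = None
    evaluated = []
    for _ in range(max_rounds):
        start = time.time()
        path = latest_checkpoint()
        if path is not None and path != last_seen:
            evaluate(path)
            evaluated.append(path)
            last_seen = path
        # Whether or not a new checkpoint was found, wait out the rest of
        # the interval. From the terminal this looks like a hang.
        remaining = interval_secs - (time.time() - start)
        if remaining > 0:
            sleep(remaining)
    return evaluated
```

Under this reading, a quiet terminal after the warning means the loop is sleeping until the trainer writes another checkpoint, which fits the advice to watch TensorBoard in the meantime.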

Looks like this is resolved. This is just a warning that is safe to ignore. Closing this issue.

ali01 on 30 Jun 2017

I'm getting this same error. I think it crashes it.

alexiskattan on 1 Jul 2017

@ali01 The eval directory is being populated with new tfrecords up until this warning/error comes up. Maybe reopen the issue?

alexiskattan on 1 Jul 2017

@alexalemi It's a warning; just wait for it to complete. It takes a while. I don't think this will crash the app.

KleinYuan on 1 Jul 2017

I am encountering the same issue, but mine does not wait; it exits after printing a traceback. How did you ignore the error (what changes did you make, if any)?

SriramGS on 20 Jul 2017

I did not change anything. I just ran training and evaluation at the same time.

YanLiang0813 on 20 Jul 2017

Oh. My training runs successfully, but when I run eval.py I get the warning and the program quits by itself instead of continuing. Any idea why?

SriramGS on 20 Jul 2017

Can I label objects with the placeholder class 0, and treat these images as true negatives to improve my model?

slandersson on 1 Aug 2017

@SriramGS Did you solve the problem? I have the same issue.

szymonk92 on 4 Aug 2017

It's not a problem. It just takes a long time; you just have to wait for the result.

YanLiang0813 on 4 Aug 2017

I tried again with just a few iterations; it took a minute. I then left my computer for 30 minutes and nothing happened. I will try again. Thanks!

szymonk92 on 4 Aug 2017

@szymonk92 I was not able to solve it. I am still looking for a solution. Let me know if you find anything.

SriramGS on 4 Aug 2017

I have also received this error. I'm waiting to see if it continues after the message

DanMossa on 12 Aug 2017

Some people in this thread solved the issue by running train.py and eval.py at the same time. I tried this suggestion too, but it fails because there is not enough memory, even though I have 8 GB of GPU memory.

Abduoit on 22 Aug 2017

I built TensorFlow from source and I still have the same problem, on both computers. I can see the evaluation results (images) after a few seconds, but the terminal stays frozen for an hour.

Any ideas? Can I force close the terminal?

I would like to run training and evaluation at the same time; however, my computer (12 GB GPU) doesn't have enough memory to run them simultaneously using Faster RCNN with Inception v2.

szymonk92 on 1 Sep 2017

@szymonk92

You need to divide your GPU memory into two parts: 50% for training and 50% for evaluation.

And don't worry about this warning; see this discussion.

Add these lines to train.py, as the first two lines in main:

def main(_):
  # Cap this process at 50% of GPU memory so eval.py can use the rest.
  gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
  sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
  assert FLAGS.train_dir, '`train_dir` is missing.'

Abduoit on 1 Sep 2017

@Abduoit Thanks for the tip. I tried with 6 GB and it seems that I don't have enough memory. I will try again on Monday with 12 GB.

szymonk92 on 2 Sep 2017

@szymonk92

Even if you tried with 6 GB, it should allocate 50% of the GPU memory to train.py and leave the other 50% for eval.py.

Please make sure that you add the following lines correctly in train.py. The two lines should come right after def main(_):

def main(_):
  gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)  
  sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

Abduoit on 2 Sep 2017

I have 2 classes in my label_map.pbtxt, yet I get the warning:

The following classes have no ground truth examples: [ 0 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255]

Also, the precision when I evaluate is always 0 (Precision/mAP@0.5IOU: 0.000000) after 500k training steps. I couldn't find any solutions so far, so any help would be appreciated. Thanks.

sidthekid402 on 20 Sep 2017

@Abduoit My train.py takes 50%, but eval.py takes almost 100% of my GPU memory and runs out of memory. It is possible to limit the memory allocation for train.py, but how do I do it for eval.py? Thanks.

ghost on 9 Nov 2017
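
One option that needs no changes to eval.py, offered as an assumption rather than something confirmed in this thread: hide the GPU from the evaluation process with the CUDA_VISIBLE_DEVICES environment variable, so that evaluation runs on the CPU and leaves GPU memory to train.py. Evaluation will be slower. The helper name below is hypothetical:

```python
import os


def cpu_only_env(base_env=None):
    """Return a copy of the environment with every CUDA device hidden."""
    env = dict(os.environ if base_env is None else base_env)
    # An empty value hides all GPUs from CUDA, so TensorFlow falls back
    # to the CPU in any process started with this environment.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


# Hypothetical launch, reusing the eval command from the original report:
# import subprocess
# subprocess.Popen(
#     ["python", "object_detection/eval.py", "--logtostderr",
#      "--pipeline_config_path=object_detection/mymodels/model/faster_rcnn_resnet101_voc07.config",
#      "--checkpoint_dir=object_detection/mymodels/model/train/",
#      "--eval_dir=object_detection/mymodels/model/eval/"],
#     env=cpu_only_env(),
# )
```

The same effect can be had by setting the variable in the shell before starting eval.py. The training process keeps its own environment and still sees the GPU.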

@YanLiang0813, what GPU do you have? I can't fine-tune faster_rcnn_resnet101_coco on PASCAL 2007 with a 1080.

PythonImageDeveloper on 23 Feb 2018

I used transfer learning to train a detector on my own dataset, starting from the ssd_mobilenet_v1_coco_11_06_2017 model.
I trained my model on Google Cloud using its training job through the Cloud Shell. My training was successful and I exported the model to my local machine. I decided to run the evaluation using eval.py on my local machine, but the eval.py command got stuck after this:

I have only 3 classes:
Here's my object-detection.pbtxt file:

 {
  id: 1
  name: 'tree'

  id: 2
  name: 'water body'

  id: 3
  name: 'building'
}

Please help.

psdas on 14 Jun 2018

Hey, I was able to resolve the error and hence successfully run my model by changing my label pbtxt file (object-detection.pbtxt in my case).
Earlier my file was:

{
  id: 1
  name: 'tree'

  id: 2
  name: 'water body'

  id: 3
  name: 'building'
}

I changed that to:

item {
  id: 1
  name: 'tree'
}

item {
  id: 2
  name: 'water body'
}

item {
  id: 3
  name: 'building'
}

psdas on 15 Jun 2018
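
A quick way to catch the label map mistake described above before running eval.py. This is a rough regex check written for illustration, not the API's own loader (which parses the file as a protobuf text message). The function name is hypothetical:

```python
import re

ITEM_BLOCK = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
ID_FIELD = re.compile(r"\bid\s*:\s*(\d+)")
NAME_FIELD = re.compile(r"\bname\s*:\s*['\"]([^'\"]*)['\"]")


def parse_label_map(text):
    """Return {id: name} for every well-formed `item { ... }` block."""
    mapping = {}
    for body in ITEM_BLOCK.findall(text):
        id_match = ID_FIELD.search(body)
        name_match = NAME_FIELD.search(body)
        if id_match and name_match:
            mapping[int(id_match.group(1))] = name_match.group(1)
    return mapping


broken = "{\n  id: 1\n  name: 'tree'\n\n  id: 2\n  name: 'water body'\n}\n"
fixed = ("item {\n  id: 1\n  name: 'tree'\n}\n\n"
         "item {\n  id: 2\n  name: 'water body'\n}\n")
print(parse_label_map(broken))  # {}
print(parse_label_map(fixed))   # {1: 'tree', 2: 'water body'}
```

An empty result, or fewer entries than expected, points at entries that are not wrapped in their own `item { ... }` block.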

I have the same issue; you need to check your .txt file.

mrainezty on 15 Aug 2018
