Models: AttributeError: 'module' object has no attribute 'LookupTensor'

Created on 11 Mar 2018 · 26 Comments · Source: tensorflow/models

I am trying to run a training job on Google Cloud using TensorFlow. I submitted the job with the following command:

gcloud ml-engine jobs submit training training_1 \
    --job-dir=gs://object-detection-bucket-test/train \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
    --module-name object_detection.train \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    --runtime-version=1.2 \
    -- \
    --train_dir=gs://object-detection-bucket-test/train \
    --pipeline_config_path=gs://object-detection-bucket-test/data/ssd_mobilenet_v1_coco.config

But when I run the job, I get the following error. Any idea why?

The replica master 0 exited with a non-zero status of 1. Termination reason: Error.
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 167, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 163, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 235, in train
    train_config.prefetch_queue_capacity, data_augmentation_options)
  File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 59, in create_input_queue
    tensor_dict = create_tensor_dict_fn()
  File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 120, in get_next
    dataset_builder.build(config)).get_next()
  File "/root/.local/lib/python2.7/site-packages/object_detection/builders/dataset_builder.py", line 138, in build
    label_map_proto_file=label_map_proto_file)
  File "/root/.local/lib/python2.7/site-packages/object_detection/data_decoders/tf_example_decoder.py", line 210, in __init__
    slim_example_decoder.LookupTensor(
AttributeError: 'module' object has no attribute 'LookupTensor'

The replica workers 0, 1, 2, 3, and 4 each exited with a non-zero status of 1 (Termination reason: Error) and the same traceback.
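
The traceback ends in an attribute lookup on a module, so one quick way to reproduce the failure outside a training job is to test for the attribute directly. The following is only a minimal sketch: the helper name module_has_attr is hypothetical, and the TensorFlow module path in the usage note is an assumption about the TF 1.x package layout, not something the thread confirms.

```python
import importlib


def module_has_attr(module_name, attr_name):
    """Report whether an importable module exposes a given attribute.

    Returns True or False when the module imports, and None when the
    module itself cannot be imported in this environment.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return hasattr(module, attr_name)
```

In the environment that runs train.py, a call such as module_has_attr("tensorflow.contrib.slim.python.slim.data.tfexample_decoder", "LookupTensor") (assumed module path) returning False would match the AttributeError above.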

kamalsomu

👍11

All 26 comments

Tensorflow version: 1.3.0

I'm trying to train a ssd_inception_v2_coco model on my desktop and I'm getting the same error:

python ../../../tensorflow_models/research/object_detection/train.py ...

  File "../../../tensorflow_models/research/object_detection/train.py", line 167, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "../../../tensorflow_models/research/object_detection/train.py", line 163, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/media/xxxx/Projects/tensorflow_models/research/object_detection/trainer.py", line 235, in train
    train_config.prefetch_queue_capacity, data_augmentation_options)
  File "/media/xxxx/Projects/tensorflow_models/research/object_detection/trainer.py", line 59, in create_input_queue
    tensor_dict = create_tensor_dict_fn()
  File "../../../tensorflow_models/research/object_detection/train.py", line 120, in get_next
    dataset_builder.build(config)).get_next()
  File "/media/masoud/DATA/Projects/tensorflow_models/research/object_detection/builders/dataset_builder.py", line 138, in build
    label_map_proto_file=label_map_proto_file)
  File "/media/xxxx/Projects/tensorflow_models/research/object_detection/data_decoders/tf_example_decoder.py", line 210, in __init__
    slim_example_decoder.LookupTensor(
AttributeError: 'module' object has no attribute 'LookupTensor'

smasoudn on 11 Mar 2018

Tensorflow version: 1.4.1
I also hit the same problem when trying to train an ssd_mobilenet model.
After upgrading TensorFlow to 1.6.0, it runs normally.

fengyang95 on 12 Mar 2018

I ran into the same error as well. I am running tensorflow-gpu==1.4.1 on Ubuntu 16.04. I am using the latest commit of models/research:

$ git show
commit d9c430b3aa7c1b2515cfde6ae10973a5e6308cc7
Merge: ac6ab36 080795f
Author: Mark Daoust
Date:   Sun Mar 11 20:40:08 2018 -0700

    Merge pull request #3561 from kopankom/fix/assigment-in-loop

    unnecessary variable assignment in loop

I will try upgrading to TensorFlow 1.6.0.

Is there a known commit that works with TensorFlow 1.4?

jrosebr1 on 12 Mar 2018

UPDATE:

For TensorFlow 1.4 compatibility I used the fad6075359b852b9c0a4c6f1b068790d44a6441a commit instead.

$ git clone https://github.com/tensorflow/models/
$ cd models
$ git checkout fad6075359b852b9c0a4c6f1b068790d44a6441a

From there I was able to get past the LookupTensor error.

I then ran into a _second_ error when trying to run train.py:

  File "/home/ubuntu/.virtualenvs/tfod_api/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py", line 1023, in unstack
    (axis, -value_shape.ndims, value_shape.ndims))
ValueError: axis = 0 not in [0, 0)

I encountered this error before and resolved it by adding anchorwise_output: true to the weighted_sigmoid and weighted_smooth_l1 blocks in my pipeline .config file:

    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true  # add this
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true  # add this
        }
      }
    }

From there I was able to train the model.
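
If several pipeline configs need the same change, the edit described above can be scripted. This is a rough sketch, not part of the Object Detection API: the function name add_anchorwise_output is hypothetical, it treats the config as plain text rather than parsing the protobuf, and it assumes each loss block starts on its own line and contains no nested braces. The field name comes from the comment above; whether a given checkout accepts it depends on that commit's proto definitions.

```python
import re

# Matches a weighted_sigmoid or weighted_smooth_l1 block that begins on its
# own line and has no nested braces inside it.
LOSS_BLOCK = re.compile(
    r"^(?P<indent>[ \t]*)(?:weighted_sigmoid|weighted_smooth_l1)[ \t]*\{"
    r"(?P<body>[^{}]*)\}",
    re.MULTILINE,
)


def add_anchorwise_output(config_text):
    """Add `anchorwise_output: true` to matching loss blocks that lack it."""

    def patch(match):
        body = match.group("body")
        if "anchorwise_output" in body:
            return match.group(0)  # already set, leave the block untouched
        indent = match.group("indent")
        header = match.group(0)[: match.start("body") - match.start()]
        lines = [header, indent + "  anchorwise_output: true"]
        existing = body.strip()
        if existing:
            lines.append(indent + "  " + existing)  # keep other fields
        lines.append(indent + "}")
        return "\n".join(lines)

    return LOSS_BLOCK.sub(patch, config_text)
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to apply to configs that were already edited by hand.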

I hope that helps someone!

jrosebr1 on 12 Mar 2018

👍8 ❤2

I'm trying to run a training job on Google Cloud and I'm getting the same AttributeError: 'module' object has no attribute 'LookupTensor'
I'm trying @jrosebr1's fix, but I can't seem to check out the fad6075359b852b9c0a4c6f1b068790d44a6441a commit. I'm on Windows 10, Python 2.7.
It comes back with this:
fatal: Not a git repository (or any of the parent directories): .git

funkysandman on 13 Mar 2018

You need to cd into models first, which is the repo you just cloned. I'll edit my response to include this.

jrosebr1 on 14 Mar 2018

👍1

Thanks for the tip about checkout of the particular commit, @jrosebr1 . Interestingly enough, after I did that, it worked, and I didn't run into the second error that you mentioned above...

Supersak80 on 15 Mar 2018

@Supersak80 Hm, interesting. Did your .prototxt file already include the update mentioned? I pulled mine from the bleeding edge of the repo.

jrosebr1 on 15 Mar 2018

@jrosebr1 Which .proto file has those parameters?

Supersak80 on 15 Mar 2018

@Supersak80 Whoops, I didn't mean to say .prototxt. I mean the pipeline.config. In particular, I needed to update the Faster R-CNN and SSDs for the COCO and Pets examples.

jrosebr1 on 15 Mar 2018

Upgrading to TensorFlow 1.5+ would resolve the missing LookupTensor issue. I'm also preparing a fix to make it TF 1.4 compatible (likely early next week).
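
The 1.5+ threshold above can be checked before submitting a job. The sketch below is illustrative only: the helper names parse_version and meets_minimum are hypothetical, and the version strings are the ones reported in this thread (1.3.0, 1.4.1, 1.6.0, and the --runtime-version=1.2 from the original gcloud command).

```python
def parse_version(version):
    """Turn a string such as '1.4.1' or '1.5.0-rc0' into a tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as '-rc0'
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def meets_minimum(version, minimum=(1, 5)):
    """Return True when `version` is at least `minimum` (default 1.5)."""
    return parse_version(version)[: len(minimum)] >= minimum
```

For a local install, the string to pass in would typically be tf.__version__; for a Cloud job it would be the value given to --runtime-version.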

pkulzc on 16 Mar 2018

😄1

Thanks so much @pkulzc!

jrosebr1 on 16 Mar 2018

@jrosebr1's suggestion worked for me, though I did not encounter the second error.

Mageshpoondi on 17 Mar 2018

@jrosebr1, I followed your method on Ubuntu 14.04 with TensorFlow 1.4.0, Python 3.5, and protobuf 3.5.1. The following error occurred during training:
  File "/home/dl/anaconda3/lib/python3.5/site-packages/google/protobuf/text_format.py", line 703, in _MergeField
    (message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 167:3 : Message type "object_detection.protos.TrainConfig" has no field named "max_number_of_boxes".
Is it a problem with my protobuf version? Thanks for your help.

LXWDL on 21 Mar 2018

@LXWDL Sorry, I'm not a TF developer. It could be an issue with your Protobuf version or something else, but I'm not sure.

jrosebr1 on 21 Mar 2018

@LXWDL I just ran the command 'protoc object_detection/protos/*.proto --python_out=.' again under the research folder, and the error did not occur again.
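
The protoc step above can also be driven from Python, which helps when the shell does not expand the *.proto glob. This is a sketch under assumptions: the function names build_protoc_command and recompile_protos are hypothetical, research_dir is whatever path holds the research folder of your checkout, and protoc must already be on PATH.

```python
import glob
import os
import subprocess


def build_protoc_command(research_dir):
    """Return the protoc argv for every .proto under object_detection/protos."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    # Paths are made relative to research_dir because protoc is run from there.
    proto_files = sorted(
        os.path.relpath(path, research_dir) for path in glob.glob(pattern)
    )
    if not proto_files:
        raise ValueError("no .proto files found under %s" % pattern)
    return ["protoc"] + proto_files + ["--python_out=."]


def recompile_protos(research_dir):
    """Regenerate the *_pb2.py modules by running protoc in research_dir."""
    subprocess.check_call(build_protoc_command(research_dir), cwd=research_dir)
```

Regenerating the modules after switching commits keeps the generated *_pb2.py files in step with the .proto definitions, which is what the "has no field named" parse error points at.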

LeonidasCl on 23 Mar 2018

@pkulzc do you have an updated estimate for the fix, or is it already committed?

relational on 24 Mar 2018

@relational it's already in.

pkulzc on 24 Mar 2018

Hi @pkulzc, it seems this issue is still not fixed on TensorFlow 1.4.

FortiLeiZhang on 25 Mar 2018

@FortiLeiZhang did you sync to HEAD? It's at least working with 1.4.0 in my local test environment.

pkulzc on 25 Mar 2018

Hi @pkulzc, thanks for your reply.
I checked this on 1.4.0/1.4.1/1.5.1/1.6.0, and the issue reported in this thread is gone. I think this bug can be closed.

However, a new error message appears on 1.4.0/1.4.1, but not on 1.5.1/1.6.0:
  File "/home/usr/models/research/object_detection/utils/dataset_util.py", line 128, in read_dataset
    tf.contrib.data.parallel_interleave(
AttributeError: 'module' object has no attribute 'parallel_interleave'

FortiLeiZhang on 26 Mar 2018

@FortiLeiZhang have you solved this problem?

TyrionChou on 28 Mar 2018

@TyrionChou, I did not dig any deeper into 1.4.0. I switched to 1.6.0, and that version works.

FortiLeiZhang on 28 Mar 2018

@LeonidasCl, I encountered the same problem as you, but your method did not work for me. My Protobuf compilation works fine, so this confuses me a lot. Do you have any other ideas?

yujianxiang on 17 Apr 2018

@aNothing, @pkulzc, it seems /proto/ssd.proto has been modified recently. The new version removes the field
"optional bool batch_norm_trainable = 6 [default=true];"

FortiLeiZhang on 17 Apr 2018

Yes, this field has been deprecated. Please see this Stack Overflow question if you have an issue with the missing batch_norm_trainable field.

I'm also closing this issue, as the LookupTensor issue has been fixed, which is also stated in the FAQ.

Feel free to reopen this if the same issue happens after syncing to HEAD.

pkulzc on 17 Apr 2018
