This is NOT a bug report, but a request for more documentation/information about the training process of lstm_object_detection. I tried to find help on Stack Overflow, but now I hope for a response from the authors. It would help a lot if there were documentation on how to train, and in particular on how to prepare the training data. A sample create_record.py file like the ones in object_detection would be great.
Would you please tell me where I can get pretrained models for LSTM object detection?
I am new to TensorFlow, and I am trying to convert ImageNet VID 2015 to TensorFlow records (tfrecord) in order to train the video object detection model here. The missing part is how to convert ImageNet VID 2015 into tfrecord format.
The data set consists of extracted images, where each image represents a video frame. It can be found here under "Object detection from video (VID)". "VID initial release snippets" includes the videos, and "VID initial release" includes the extracted frames as JPEG images.
There is code here to convert the Pascal VOC data set to tfrecord, and that data set has a similar structure to ImageNet VID 2015. However, ImageNet VID 2015 keeps the training and validation data and annotations in separate sub-folders, and the labels in separate text files.
One way could be to adapt the Pascal VOC tfrecord converter to ImageNet VID so that it handles the data scattered across sub-folders and text files.
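To illustrate, the per-frame parsing step of such an adapted converter might look like the sketch below. This assumes the per-frame VID XML files follow the Pascal-VOC-like layout (a `<size>` block and one `<object>` element per box); the synset string under `<name>` and the normalization of coordinates to [0, 1] follow what the object_detection converters do, but treat this as an untested sketch, not the official recipe:

```python
import xml.etree.ElementTree as ET

def parse_frame_annotation(xml_text):
    """Parse one per-frame VID annotation into a plain dict.

    Assumes the Pascal-VOC-like layout: a <size> block plus one
    <object> element per bounding box.
    """
    root = ET.fromstring(xml_text)
    width = int(root.find('size/width').text)
    height = int(root.find('size/height').text)
    boxes = []
    for obj in root.findall('object'):
        bb = obj.find('bndbox')
        boxes.append({
            'label': obj.find('name').text,
            # Normalize to [0, 1], as the object_detection converters do.
            'xmin': int(bb.find('xmin').text) / width,
            'ymin': int(bb.find('ymin').text) / height,
            'xmax': int(bb.find('xmax').text) / width,
            'ymax': int(bb.find('ymax').text) / height,
        })
    return {'width': width, 'height': height, 'boxes': boxes}
```

The remaining work would be walking the VID sub-folders, grouping the frames of each snippet in order, and writing the grouped frames out as records.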
I would appreciate any help. Thank you!
Best regards,
Ashkan
Did anyone end up getting this to train at all?
A side question: does anyone have a problem with equation (4) in the original paper?
I posted a question on stackoverflow as well.
While running train.py from lstm_object_detection, I'm getting the following error:
TypeError: Expected binary or unicode string, got
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "lstm_object_detection/train.py", line 185, in <module>
tf.app.run()
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "lstm_object_detection/train.py", line 181, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/kt-ml1/models-master/models-master/research/lstm_object_detection/trainer.py", line 293, in train
clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
File "/home/kt-ml1/models-master/models-master/research/slim/deployment/model_deploy.py", line 193, in create_clones
outputs = model_fn(*args, **kwargs)
File "/home/kt-ml1/models-master/models-master/research/lstm_object_detection/trainer.py", line 174, in _create_losses
losses_dict = detection_model.loss(prediction_dict, true_image_shapes)
File "/home/kt-ml1/models-master/models-master/research/lstm_object_detection/meta_architectures/lstm_ssd_meta_arch.py", line 165, in loss
match_list = [matcher.Match(match) for match in tf.unstack(batch_match)]
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1149, in unstack
value = ops.convert_to_tensor(value)
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1039, in convert_to_tensor
return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1097, in convert_to_tensor_v2
as_ref=False)
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1175, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 304, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 245, in constant
allow_broadcast=True)
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 283, in _constant_impl
allow_broadcast=allow_broadcast))
File "/home/kt-ml1/.local/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 562, in make_tensor_proto
"supported type." % (type(values), values))
TypeError: Failed to convert object of type to Tensor. Contents: [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ]. Consider casting elements to a supported type.
PS: I have commented out the data augmentation option in the config file because it was giving me a groundtruth_weights error.
The problem seems to come from the tensors in seq_dataset_builder.py, as the tensors are empty.
Any help would be appreciated.
Comment out line 165 in lstm_object_detection/meta_architectures/lstm_ssd_meta_arch.py and use the following code instead:
match_list = batch_match
I found the solution here:
https://github.com/tensorflow/models/issues/6777
About the groundtruth_weights error:
You can comment out only lines 3212-3214 in object_detection/core/preprocessor.py instead of the whole data augmentation option. They are the following lines:
if include_label_weights:
  groundtruth_label_weights = (
      fields.InputDataFields.groundtruth_weights)
Thanks!
Did you successfully train your LSTM model?
Can you help me with the following error?
tensorflow.python.framework.errors_impl.InvalidArgumentError: Name: , Feature list 'image/encoded' is required but could not be found. Did you mean to include it in feature_list_dense_missing_assumed_empty or feature_list_dense_defaults?
[[{{node ParseSingleSequenceExample/ParseSingleSequenceExample}}]]
PS: I am using tfrecords created from the Open Images Dataset (OID).
Any help would be appreciated.
Unfortunately I am having the same error and didn't succeed yet.
if solved, do post the solution.
Thanks.
What exactly should be the input to the lstm model? Is it tfrecords of images or tfrecords of videos?
System information
- What is the top-level directory of the model you are using: lstm_object_detection
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
- TensorFlow installed from (source or binary): source
- TensorFlow version (use command below): 1.12.0
- Bazel version (if compiling from source):
- CUDA/cuDNN version: 9.0
- GPU model and memory:
- Exact command to reproduce:
Describe the problem
This is NOT a bug report, but a request for more documentation/information about the training process of lstm_object_detection. I tried to find help on Stack Overflow, but now I hope for a response from the authors. It would help a lot if there were documentation on how to train, and in particular on how to prepare the training data. A sample create_record.py file like the ones in object_detection would be great.
It should be tfrecords of images. Each video contains a sequence of images and their respective annotations, and all of these details are stacked into one SequenceExample, so each SequenceExample contains one video's details.
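To make that layout concrete, here is a plain-Python schematic of how one video's frames map onto a SequenceExample: per-video values go in the context map, and per-frame values go in feature lists, one entry per frame. The exact key names here are my assumption, based on the 'image/encoded' feature list that the ParseSingleSequenceExample error above says is required; check seq_dataset_builder.py for the authoritative set of keys:

```python
def build_sequence_example_dict(frames):
    """Schematic (plain dicts, not actual protos) of one video's record.

    `frames` is a list of dicts with keys 'jpeg_bytes', 'labels', and
    'bboxes' (one entry per frame; boxes as (xmin, ymin, xmax, ymax)).
    """
    return {
        # Per-video values live in the context map.
        'context': {
            'video/num_frames': len(frames),
        },
        # Per-frame values live in feature lists, one entry per frame.
        'feature_lists': {
            'image/encoded': [f['jpeg_bytes'] for f in frames],
            'image/object/class/label': [f['labels'] for f in frames],
            'image/object/bbox/xmin': [[b[0] for b in f['bboxes']] for f in frames],
            'image/object/bbox/ymin': [[b[1] for b in f['bboxes']] for f in frames],
            'image/object/bbox/xmax': [[b[2] for b in f['bboxes']] for f in frames],
            'image/object/bbox/ymax': [[b[3] for b in f['bboxes']] for f in frames],
        },
    }
```

Frames with no objects still get an (empty) entry in each feature list, so every list stays the same length as the video.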
train.py is not starting to train, and no errors are thrown either. I have included the log for reference. I am not able to understand what is wrong because no error is thrown; the process gets killed automatically after the line "Use standard file utilities to get mtimes". Any help would be appreciated. Thanks.
WARNING:tensorflow:From /scratch/user/shruthi/.conda/envs/env2/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py:737: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2019-06-30 16:01:41.065781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-06-30 16:01:41.065940: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-30 16:01:41.065960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 1
2019-06-30 16:01:41.065971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N Y
2019-06-30 16:01:41.065993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1: Y N
2019-06-30 16:01:41.066174: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9765 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:83:00.0, compute capability: 3.7)
2019-06-30 16:01:41.066385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 9765 MB memory) -> physical GPU (device: 1, name: Tesla K80, pci bus id: 0000:84:00.0, compute capability: 3.7)
WARNING:tensorflow:From /scratch/user/shruthi/.conda/envs/env2/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
WARNING:tensorflow:From /scratch/user/shruthi/.conda/envs/env2/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1070: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
Killed
My num_classes in the config is 2 (I have two non-background classes). My input tfrecord contains class indices 0 (background), 1, and 2. I have enabled add_background_class. I am getting a shape mismatch error between logits and labels: logits are (10, 324, 3) and labels are (10, 324, 2). May I know where I am going wrong? I am not able to understand why my target tensor is being fed into the network with depth 2. Thanks for the help in advance.
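Not an authoritative answer, but the mismatch looks like a one-hot depth issue: with two foreground classes plus a background class, the logits have depth 3, so the labels must be one-hot encoded with depth 3 as well (indices 0..2), not depth 2. A stdlib-only sketch of the expected shapes:

```python
def one_hot(indices, depth):
    """One-hot encode a list of class indices (0 = background here)."""
    return [[1.0 if c == i else 0.0 for c in range(depth)] for i in indices]

# Two foreground classes + a background class => depth 3, so every
# label row must have 3 entries to match logits of shape (10, 324, 3).
labels = one_hot([0, 1, 2, 1], depth=3)
```

If the labels come out with depth 2, the pipeline is likely one-hot encoding with num_classes only and dropping the background slot somewhere before the loss.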
Hi There,
We are checking to see if you still need help on this, as this seems to be an old issue. Please update this issue with the latest information, code snippet to reproduce your issue and error you are seeing.
If we don't hear from you in the next 7 days, this issue will be closed automatically. If you don't need help on this issue any more, please consider closing this.