I am doing object detection in TensorFlow and training the faster_rcnn_resnet101_coco_2018_01_28 model
locally, but I am getting the following error:
InvalidArgumentError (see above for traceback): assertion failed: [maximum box coordinate value is larger than 1.100000: ] [1.15277779]
I also tried training with ssd_mobilenet_v1_pets, and that training was successful. My training images are a mixture of different sizes. I don't believe any of my bounding boxes are outside the image coordinates. Even if they were, why would the ssd_mobilenet_v1_pets model work but not the resnet model?
[[Node: ToAbsoluteCoordinates/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_FLOAT], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch, Loss/ToAbsoluteCoordinates/Assert/AssertGuard/Assert/data_0, ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch_1)]]
The traceback is :
/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py, line 1717, in __init__
self._traceback = tf_stack.extract_stack()
To prepare the TensorFlow record files I used the script https://github.com/datitran/raccoon_dataset/blob/master/generate_tfrecord.py, except that I have two labels.
That happens because a bounding box in your dataset is bigger than the image height or width. You should check your training data.
I checked my training file and none of the box values exceed the image dimensions.
I face the same problem. I use the GTSRB dataset and wrote a script to check the size of the image against the bounding box size.
<annotation>
  <folder>00035</folder>
  <filename>00000_00003.jpeg</filename>
  <path>/content/traffic_signs/train/00035/00000_00003.jpeg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>39</width>
    <height>38</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>straight_only</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>6</xmin>
      <ymin>5</ymin>
      <xmax>34</xmax>
      <ymax>33</ymax>
    </bndbox>
  </object>
</annotation>
The annotation for the images looks like this, and the code I used to generate the tfrecords files is here:
import os
import io
import xml.etree.ElementTree as ET
import tensorflow as tf
from object_detection.utils import dataset_util
from PIL import Image


def create_tf_example(images_dir, example):
    image_path = os.path.join(images_dir, example)
    labels_path = os.path.join(images_dir, os.path.splitext(example)[0] + '.xml')

    # Read the image; width and height come from the file itself.
    img = Image.open(image_path)
    width, height = img.size
    img_bytes = io.BytesIO()
    img.save(img_bytes, format=img.format)
    encoded_image_data = img_bytes.getvalue()
    image_format = img.format.encode('utf-8')
    filename = example.encode('utf-8')

    # Read the label XML
    tree = ET.parse(labels_path)
    root = tree.getroot()

    # classes_csv is not defined in this snippet. It is assumed to be a table
    # whose `label` column lists the class names in label-map order, and the
    # <name> values in the XML are assumed to match those names.
    label_names = list(classes_csv.label.ravel())

    # Separate lists (the chained `= list()` made all four names share one list).
    xmins, xmaxs, ymins, ymaxs = [], [], [], []
    classes_text, classes = [], []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        # Normalize pixel coordinates to [0, 1] and append one entry per box.
        xmins.append(int(box.find('xmin').text) / width)
        xmaxs.append(int(box.find('xmax').text) / width)
        ymins.append(int(box.find('ymin').text) / height)
        ymaxs.append(int(box.find('ymax').text) / height)
        # One class name and one class id per box, not the full label list.
        name = obj.find('name').text
        classes_text.append(name.encode('utf-8'))
        classes.append(label_names.index(name) + 1)  # label ids start at 1

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        # filename and source_id hold the file name, not the image bytes.
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_image_data),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example
def write_record(split_dir, record_path):
    # Write one TFRecord file from every .jpeg in the class subfolders of split_dir.
    writer = tf.python_io.TFRecordWriter(record_path)
    for entry_folder in os.listdir(split_dir):
        folder_path = os.path.join(split_dir, entry_folder)
        for entry_file in os.listdir(folder_path):
            if entry_file.endswith(".jpeg"):
                tf_example = create_tf_example(folder_path, entry_file)
                writer.write(tf_example.SerializeToString())
    writer.close()


write_record("/content/traffic_signs/train/", "/content/train.record")
write_record("/content/traffic_signs/test/", "/content/valid.record")
This is my first time trying to create a tfrecords file. Is this the right way to do it?
Well I solved the issue. The problem was with the image sizes. I used a script to find the errant entries in the train and test files.
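The script mentioned here is not reproduced in the thread. As a rough idea of what such a check can look like, here is a minimal sketch that assumes Pascal VOC style XML annotations like the one posted above. The function name find_bad_boxes is made up for illustration, and the check only compares boxes against the size declared in the XML.

```python
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_text):
    """Return (object name, reason) pairs for boxes that do not fit the declared image size."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    width = int(size.find('width').text)
    height = int(size.find('height').text)

    problems = []
    for obj in root.iter('object'):
        name = obj.find('name').text
        box = obj.find('bndbox')
        xmin, ymin, xmax, ymax = (
            int(box.find(tag).text) for tag in ('xmin', 'ymin', 'xmax', 'ymax')
        )
        if xmin < 0 or ymin < 0:
            problems.append((name, 'negative coordinate'))
        if xmax > width or ymax > height:
            problems.append((name, 'box exceeds image size'))
        if xmin >= xmax or ymin >= ymax:
            problems.append((name, 'empty or inverted box'))
    return problems


# Reduced copy of the annotation posted earlier in this thread.
sample = """<annotation>
  <size><width>39</width><height>38</height></size>
  <object>
    <name>straight_only</name>
    <bndbox><xmin>6</xmin><ymin>5</ymin><xmax>34</xmax><ymax>33</ymax></bndbox>
  </object>
</annotation>"""

print(find_bad_boxes(sample))  # prints []
```

The size declared in the XML can differ from the real size of the image file, so it may also be worth comparing against the width and height reported by PIL's Image.open(path).size, since that is the value the tfrecord script divides by.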
@AjayZinngg The script you linked reported "Checked 353 files and realized 352 errors". My dataset consists of the original images plus augmented ones. I had already trained on the original images alone without any error, but when I check the combination of original and augmented images, check_images.py flags all of them as erroneous.
How did you solve this?
Hi @AjayZinngg, I removed the images that the code you provided flagged as erroneous, but I still get a similar error. How did you go about this?
Solved on #1754
I also struggled with this problem, but that was my mistake.
I didn't normalize xmin, xmax, ymin and ymax.
For example, xmins=[xmin / img_width]
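To spell that out: the tfrecord fields image/object/bbox/xmin and friends are expected to hold fractions of the image size, not pixels, which is why the assertion complains about a value above 1.1. A minimal sketch follows; the helper name normalize_box is made up, and the second set of numbers is a hypothetical illustration, not data from this thread.

```python
def normalize_box(xmin, ymin, xmax, ymax, img_width, img_height):
    """Convert pixel coordinates to fractions of the image size."""
    # Python 3 true division; under Python 2 cast to float first.
    return (xmin / img_width, ymin / img_height,
            xmax / img_width, ymax / img_height)


# Box and image size from the annotation posted earlier in this thread.
box = normalize_box(6, 5, 34, 33, img_width=39, img_height=38)
print(all(0.0 <= v <= 1.0 for v in box))  # prints True

# Hypothetical mismatch: a box annotated up to x=83 on an image that is
# really only 72 pixels wide ends up above the 1.1 limit.
bad = normalize_box(10, 10, 83, 60, img_width=72, img_height=72)
print(bad[2] > 1.1)  # prints True
```

If the values are already normalized and the error persists, a stale or wrong width/height (for example one read from the XML rather than from the image file) would be the next thing to check.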
@AjayZinngg
I am facing the same problem, and I believe my issue is also related to the image sizes. May I ask how you changed the sizes? Or did you get rid of the errant images? Did you make your images smaller? If so, how small?
Hi @serenaraju ,
It's been a while since I've worked on that project so I'm not sure of all the steps I took there. Can you try with images of the same dimensions? Or try what @HwangJohn has commented?
Hey, I am already normalizing xmin, xmax, ymin and ymax that way (xmins=[xmin / img_width]), but I am still getting the error.
Hey, sorry for the late reply. What I did was add a preprocessing step that transforms all the images to the same x and y dimensions, similar to what @AjayZinngg proposed.
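The preprocessing code itself was not posted. A minimal sketch of such a step, assuming Pillow is installed and the boxes are kept in pixel coordinates, might look like this. Both function names are made up for illustration.

```python
def scale_box(box, old_size, new_size):
    """Scale a pixel-space (xmin, ymin, xmax, ymax) box; sizes are (width, height)."""
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


def resize_image(src_path, dst_path, new_size):
    """Resize one image file to new_size (width, height); return its original size."""
    from PIL import Image  # imported here so scale_box works without Pillow
    with Image.open(src_path) as img:
        old_size = img.size
        img.resize(new_size).save(dst_path)
    return old_size


# Box from the annotation posted earlier, image scaled from 39x38 to 78x76.
print(scale_box((6, 5, 34, 33), old_size=(39, 38), new_size=(78, 76)))
# prints (12.0, 10.0, 68.0, 66.0)
```

Only pixel coordinates need rescaling. Boxes that are already normalized to the 0..1 range stay the same when the image is resized, as long as the width and height used for normalization match the image the box was drawn on.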
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md