Models: Assertion failed: [maximum box coordinate value is larger than 1.100000: ] [1.15277779]

Created on 11 Oct 2018 · 13 Comments · Source: tensorflow/models

System information

  • What is the top-level directory of the model you are using: object-detection
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): pip
  • TensorFlow version (use command below): 1.10.1
  • Bazel version (if compiling from source): N/A
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A
  • Exact command to reproduce: python model_main.py --logtostderr --train_dir=training/ --pipeline_config_path=samples/configs/faster_rcnn_resnet101_pets.config

Describe the problem

I am doing object detection in TensorFlow, using the faster_rcnn_resnet101_coco_2018_01_28 model for local training, but I am getting the following error:

    InvalidArgumentError (see above for traceback): assertion failed: [maximum box coordinate value is larger than 1.100000: ] [1.15277779]

I also tried training with ssd_mobilenet_v1_pets, and that training was successful. My training images are a mixture of different sizes. I don't believe any of my bounding boxes are outside the image coordinates. Even if some were, why would the ssd_mobilenet_v1_pets model work but not the ResNet model?

Source code / logs

[[Node: ToAbsoluteCoordinates/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_FLOAT], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch, Loss/ToAbsoluteCoordinates/Assert/AssertGuard/Assert/data_0, ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch_1)]]

The traceback is : /venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py, line 1717, in __init__ self._traceback = tf_stack.extract_stack()
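The failing node is named ToAbsoluteCoordinates, which suggests the assertion fires while normalized boxes are converted to absolute ones. Below is a rough pure-Python imitation of that kind of range check, not the library's actual code. The function name and sample boxes are hypothetical, and the 1.1 threshold is taken from the error message.

```python
def check_normalized_boxes(boxes, maximum_normalized_coordinate=1.1):
    """Raise ValueError if any coordinate exceeds the allowed normalized range.

    Hypothetical helper that imitates the assertion in the error message;
    it is not part of the Object Detection API.
    """
    largest = max(coord for box in boxes for coord in box)
    if largest > maximum_normalized_coordinate:
        raise ValueError(
            "maximum box coordinate value is larger than "
            f"{maximum_normalized_coordinate}: {largest}")
    return largest


# Illustrative values only: one box in normalized form, one left in pixels.
normalized = [(0.1, 0.2, 0.8, 0.9)]
pixel_values = [(5.0, 6.0, 33.0, 34.0)]

check_normalized_boxes(normalized)  # passes, every value is within [0, 1.1]
```

Calling `check_normalized_boxes(pixel_values)` raises, which is the situation a record with un-normalized or mis-scaled coordinates would produce.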

Most helpful comment

I also struggled with this problem, but that was my mistake.
I didn't normalize xmin, xmax, ymin and ymax.
For example, xmins=[xmin / img_width]

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md
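A minimal sketch of the normalization described in this comment, assuming boxes are stored as pixel coordinates. The function name is hypothetical, and the sample numbers are borrowed from the GTSRB annotation posted further down the thread.

```python
def normalize_box(xmin, ymin, xmax, ymax, img_width, img_height):
    """Convert pixel box coordinates to fractions of the image size."""
    return (xmin / img_width, ymin / img_height,
            xmax / img_width, ymax / img_height)


# Box (6, 5, 34, 33) in a 39x38 image.
xmin, ymin, xmax, ymax = normalize_box(6, 5, 34, 33, img_width=39, img_height=38)
```

x values are divided by the width and y values by the height, so a box that lies inside the image ends up within [0, 1].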

All 13 comments

To prepare the TensorFlow record files, I used the script at https://github.com/datitran/raccoon_dataset/blob/master/generate_tfrecord.py, except that I have two labels.

That happens because a bounding box in your dataset is larger than the image height or width. You should check your training data.

I checked my training file and none of the box values exceed the image dimensions.
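For anyone who wants to repeat this kind of check, here is a small sketch. It assumes each annotation row carries the image size and the box in pixels; the column names and the sample rows are hypothetical.

```python
def find_bad_boxes(rows):
    """Return the rows whose box is degenerate or extends past the image."""
    bad = []
    for row in rows:
        inside = (0 <= row["xmin"] < row["xmax"] <= row["width"]
                  and 0 <= row["ymin"] < row["ymax"] <= row["height"])
        if not inside:
            bad.append(row)
    return bad


# Hypothetical rows. In the second one xmax exceeds the width:
# 83 / 72 is about 1.1528, the same magnitude as the value in the error.
rows = [
    {"filename": "a.jpg", "width": 100, "height": 80,
     "xmin": 10, "ymin": 5, "xmax": 90, "ymax": 70},
    {"filename": "b.jpg", "width": 72, "height": 72,
     "xmin": 10, "ymin": 5, "xmax": 83, "ymax": 70},
]
bad = find_bad_boxes(rows)
```

A check like this only helps if the recorded width and height match the real image files, so it is worth verifying those too.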

I face the same problem. I use the GTSRB dataset and wrote a script to check the size of the image against the bounding box size.

<annotation>
    <folder>00035</folder>
    <filename>00000_00003.jpeg</filename>
    <path>/content/traffic_signs/train/00035/00000_00003.jpeg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>39</width>
        <height>38</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>straight_only</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>6</xmin>
            <ymin>5</ymin>
            <xmax>34</xmax>
            <ymax>33</ymax>
        </bndbox>
    </object>
</annotation>

The annotation for the images looks like this, and the code I used to generate the tfrecords files is here:

import os
import xml.etree.ElementTree as ET
import tensorflow as tf

from object_detection.utils import dataset_util
from PIL import Image

# classes_csv is defined elsewhere; it is assumed to have a 'label' column
# whose order matches the label map (ids start at 1).
label_to_id = {label: i + 1 for i, label in enumerate(classes_csv.label.ravel())}

def create_tf_example(images_dir, example):

    image_path = os.path.join(images_dir, example)
    labels_path = os.path.join(images_dir, os.path.splitext(example)[0] + '.xml')

    # Read the image size and format, then the raw encoded bytes
    with Image.open(image_path) as img:
        width, height = img.size
        image_format = img.format.lower().encode('utf-8')
    with open(image_path, 'rb') as f:
        encoded_image_data = f.read()
    filename = example.encode('utf-8')

    # Read the label XML; every list gets one entry per annotated object
    root = ET.parse(labels_path).getroot()
    xmins, xmaxs, ymins, ymaxs = [], [], [], []
    classes_text, classes = [], []

    for obj in root.iter('object'):
        box = obj.find('bndbox')
        # Normalize pixel coordinates to [0, 1] using the real image size
        xmins.append(int(box.find('xmin').text) / width)
        xmaxs.append(int(box.find('xmax').text) / width)
        ymins.append(int(box.find('ymin').text) / height)
        ymaxs.append(int(box.find('ymax').text) / height)
        # The <name> value is assumed to appear in classes_csv.label
        name = obj.find('name').text
        classes_text.append(name.encode('utf-8'))
        classes.append(label_to_id[name])

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_image_data),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example



writer = tf.python_io.TFRecordWriter("/content/train.record")

train_dir = "/content/traffic_signs/train/"

for entry_folder in os.listdir(train_dir):
  for entry_file in os.listdir(os.path.join(train_dir, entry_folder)):
    if os.path.join(train_dir, entry_folder, entry_file).endswith(".jpeg"):
      tf_example = create_tf_example(os.path.join(train_dir, entry_folder), entry_file)
      writer.write(tf_example.SerializeToString())

writer.close()    

writer = tf.python_io.TFRecordWriter("/content/valid.record")

test_dir = "/content/traffic_signs/test/"

for entry_folder in os.listdir(test_dir):
  for entry_file in os.listdir(os.path.join(test_dir, entry_folder)):
    if os.path.join(test_dir, entry_folder, entry_file).endswith(".jpeg"):
      tf_example = create_tf_example(os.path.join(test_dir, entry_folder), entry_file)
      writer.write(tf_example.SerializeToString())

writer.close()  

This is my first time creating a tfrecords file. Is this the right way to do it?

Well I solved the issue. The problem was with the image sizes. I used a script to find the errant entries in the train and test files.
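The script itself is not shown in the thread. One plausible check, assuming Pascal VOC-style XML annotations, is to compare the size recorded in the annotation with the size of the actual image file. The helper names and the sample annotation below are hypothetical.

```python
import xml.etree.ElementTree as ET


def declared_size(annotation_xml):
    """Return (width, height) as recorded in a Pascal VOC-style annotation."""
    size = ET.fromstring(annotation_xml).find("size")
    return int(size.find("width").text), int(size.find("height").text)


def size_mismatch(annotation_xml, actual_size):
    """True when the annotation disagrees with the real (width, height)."""
    return declared_size(annotation_xml) != tuple(actual_size)


# Hypothetical annotation that declares a 39x38 image.
sample = (
    "<annotation><size><width>39</width><height>38</height>"
    "<depth>3</depth></size></annotation>"
)
```

In practice `actual_size` could come from Pillow's `Image.open(path).size`, which returns `(width, height)`. If an image was resized or rotated after labeling, dividing by the stale size can push normalized coordinates above 1.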

@AjayZinngg I ran the script you linked and it reported:

Checked 353 files and realized 352 errors

My dataset consists of the original images plus augmented ones. I had already trained on the original images alone and there was no error. But when I check the combined set of original and augmented images, check_images.py flags nearly all of them (352 of 353) as erroneous.

How did you solve this?

Hi @AjayZinngg, I removed the images that your script flagged as erroneous, but I still get a similar error. How did you go about this?

Solved on #1754

@AjayZinngg
I am facing the same problem and I believe my issue is also related to the size. May I ask how you changed the sizes? Or did you get rid of the errant images? Did you make your images smaller? If so, how small?

Hi @serenaraju ,
It's been a while since I've worked on that project so I'm not sure of all the steps I took there. Can you try with images of the same dimensions? Or try what @HwangJohn has commented?

Hey, I am already normalizing xmin, xmax, ymin and ymax, but I am still getting the error.

Hey, sorry for the late reply. What I did was add a preprocessing step that transforms all the images to the same x and y dimensions, similar to what @AjayZinngg proposed.
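If the images are resized in a preprocessing step and the boxes are stored in pixels, the boxes need the same scaling. This is a minimal sketch of that bookkeeping; the function name and the sizes are hypothetical, and it is not the preprocessing code used above.

```python
def scale_box(box, old_size, new_size):
    """Scale a pixel box (xmin, ymin, xmax, ymax) to match a resized image.

    old_size and new_size are (width, height) tuples.
    """
    xmin, ymin, xmax, ymax = box
    sx = new_size[0] / old_size[0]
    sy = new_size[1] / old_size[1]
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


# A 39x38 image resized to 78x76, so both axes scale by 2.
scaled = scale_box((6, 5, 34, 33), old_size=(39, 38), new_size=(78, 76))
```

Boxes that are already normalized by the image size do not change under a plain resize, so this step matters only when pixel coordinates are kept.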
