Models: Assertion failed: [maximum box coordinate value is larger than 1.100000: ] [1.15277779]

Created on 11 Oct 2018 · 13 Comments · Source: tensorflow/models

System information

What is the top-level directory of the model you are using: object-detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): pip
TensorFlow version (use command below): 1.10.1
Bazel version (if compiling from source): N/A
CUDA/cuDNN version: N/A
GPU model and memory: N/A
Exact command to reproduce: python model_main.py --logtostderr --train_dir=training/ --pipeline_config_path=samples/configs/faster_rcnn_resnet101_pets.config

Describe the problem

I am doing object detection in TensorFlow and training the faster_rcnn_resnet101_coco_2018_01_28 model locally, but I am getting the following error:

    InvalidArgumentError (see above for traceback): assertion failed: [maximum box coordinate value is larger than 1.100000: ] [1.15277779]

When I trained with ssd_mobilenet_v1_pets instead, training succeeded. My training images are a mixture of different sizes. I don't believe any of my bounding boxes are outside the image coordinates. Even if they were, why would the ssd_mobilenet_v1_pets model work but not the resnet model?

Source code / logs

[[Node: ToAbsoluteCoordinates/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_FLOAT], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch, Loss/ToAbsoluteCoordinates/Assert/AssertGuard/Assert/data_0, ToAbsoluteCoordinates/Assert/AssertGuard/Assert/Switch_1)]]

The traceback is: /venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py, line 1717, in __init__ self._traceback = tf_stack.extract_stack()
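
For readers landing here: the failing node name (ToAbsoluteCoordinates) suggests the check runs where normalized boxes are converted to pixel coordinates. Below is a minimal sketch of that condition in plain Python, assuming boxes are stored as (ymin, xmin, ymax, xmax) fractions of the image size and that the 1.1 threshold from the message is the only tolerance. The function name is hypothetical and is not the library's API. Note that a value such as 1.1528 is only slightly above 1, which would fit a box that overshoots the dimension it was divided by (for example 83 / 72), rather than a box left entirely in raw pixels.

```python
def max_box_coordinate_ok(boxes, max_normalized=1.1):
    """Return True when no box coordinate exceeds the allowed normalized maximum.

    boxes: iterable of (ymin, xmin, ymax, xmax) tuples, expected in [0, 1].
    max_normalized: tolerance taken from the error message (1.1).
    """
    largest = max((max(box) for box in boxes), default=0.0)
    return largest <= max_normalized


# Illustrative numbers only: an xmax of 83 px divided by a width of 72 px
# gives roughly 1.1528, the kind of value reported in the assertion.
good = [(0.10, 0.20, 0.90, 0.95)]
bad = [(0.10, 0.20, 0.90, 83 / 72)]
print(max_box_coordinate_ok(good))  # True
print(max_box_coordinate_ok(bad))   # False
```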

Source

AjayZinngg

All 13 comments

To prepare the TFRecord files, I used the script at https://github.com/datitran/raccoon_dataset/blob/master/generate_tfrecord.py, except that I have two labels.

AjayZinngg on 11 Oct 2018

That happens because a bounding box in your dataset is bigger than the image height or width. You should check your training data.
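
A rough sketch of such a check, assuming the annotations have already been loaded into dicts with the keys filename, width, height, xmin, ymin, xmax and ymax (a layout chosen for illustration; the function name is hypothetical):

```python
def find_bad_boxes(rows):
    """Yield (filename, reason) for rows whose pixel box does not fit the image.

    Each row is a dict with keys:
    filename, width, height, xmin, ymin, xmax, ymax (all sizes in pixels).
    """
    for row in rows:
        width, height = row["width"], row["height"]
        if not (0 <= row["xmin"] < row["xmax"] <= width):
            yield row["filename"], "x range outside [0, width]"
        elif not (0 <= row["ymin"] < row["ymax"] <= height):
            yield row["filename"], "y range outside [0, height]"


# Made-up rows for illustration; the second box is wider than its image.
rows = [
    {"filename": "a.jpg", "width": 72, "height": 96,
     "xmin": 10, "ymin": 12, "xmax": 60, "ymax": 90},
    {"filename": "b.jpg", "width": 72, "height": 96,
     "xmin": 10, "ymin": 12, "xmax": 83, "ymax": 90},
]
print(list(find_bad_boxes(rows)))
```

The width and height recorded in an annotation file can differ from the real image file, so it may be worth comparing against the size reported by the image itself as well.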

zongpingdeng on 12 Oct 2018

👍2

I checked my training file and none of the box values exceed the image dimensions.

AjayZinngg on 12 Oct 2018

I'm facing the same problem. I'm using the GTSRB dataset and wrote a script to check each image's size against its bounding box.

<annotation>
    <folder>00035</folder>
    <filename>00000_00003.jpeg</filename>
    <path>/content/traffic_signs/train/00035/00000_00003.jpeg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>39</width>
        <height>38</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>straight_only</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>6</xmin>
            <ymin>5</ymin>
            <xmax>34</xmax>
            <ymax>33</ymax>
        </bndbox>
    </object>
</annotation>

The annotation for the images looks like this, and the code I used to generate the tfrecords files is here:

import os
import io
import xml.etree.ElementTree as ET
import tensorflow as tf

from object_detection.utils import dataset_util
from PIL import Image

def create_tf_example(images_dir, example):

    image_path = images_dir + "/" + example
    labels_path = images_dir + "/" + os.path.splitext(example)[0] + '.xml'

    # Read the image
    img = Image.open(image_path)
    width, height = img.size
    img_bytes = io.BytesIO()
    img.save(img_bytes, format=img.format)

    height = height
    width = width
    encoded_image_data = img_bytes.getvalue()
    image_format = img.format.encode('utf-8')

    # Read the label XML
    tree = ET.parse(labels_path)
    root = tree.getroot()
    xmins = xmaxs = ymins = ymaxs = list()

    for coordinate in root.find('object').iter('bndbox'):
        xmins = [int(coordinate.find('xmin').text)]
        xmaxs = [int(coordinate.find('xmax').text)]
        ymins = [int(coordinate.find('ymin').text)]
        ymaxs = [int(coordinate.find('ymax').text)]

    classes_text = classes_csv.label.ravel()
    classes_text = [label.encode('utf-8') for label in classes_text]
    classes = range(1,43)

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(encoded_image_data),
        'image/source_id': dataset_util.bytes_feature(encoded_image_data),
        'image/encoded': dataset_util.bytes_feature(encoded_image_data),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example



writer = tf.python_io.TFRecordWriter("/content/train.record")

train_dir = "/content/traffic_signs/train/"

for entry_folder in os.listdir(train_dir):
  for entry_file in os.listdir(os.path.join(train_dir, entry_folder)):
    if os.path.join(train_dir, entry_folder, entry_file).endswith(".jpeg"):
      tf_example = create_tf_example(os.path.join(train_dir, entry_folder), entry_file)
      writer.write(tf_example.SerializeToString())

writer.close()

writer = tf.python_io.TFRecordWriter("/content/valid.record")

test_dir = "/content/traffic_signs/test/"

for entry_folder in os.listdir(test_dir):
  for entry_file in os.listdir(os.path.join(test_dir, entry_folder)):
    if os.path.join(test_dir, entry_folder, entry_file).endswith(".jpeg"):
      tf_example = create_tf_example(os.path.join(test_dir, entry_folder), entry_file)
      writer.write(tf_example.SerializeToString())

writer.close()

This is my first time trying to create a tfrecords file, is this the right way to do it?
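
One observation on the script above, offered as a possible issue rather than a confirmed diagnosis: `xmins = xmaxs = ymins = ymaxs = list()` binds all four names to the same list object, and the loop then rebinds each name instead of appending, so only the last box would be kept. The sketch below uses only the standard library and a trimmed copy of the annotation shown earlier; `read_boxes` is a hypothetical helper, and it still returns raw pixel values, not normalized ones.

```python
import xml.etree.ElementTree as ET


def read_boxes(xml_text):
    """Return four separate lists of pixel coordinates, one entry per <object>."""
    root = ET.fromstring(xml_text)
    xmins, xmaxs, ymins, ymaxs = [], [], [], []  # four distinct list objects
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        xmins.append(int(box.find("xmin").text))
        xmaxs.append(int(box.find("xmax").text))
        ymins.append(int(box.find("ymin").text))
        ymaxs.append(int(box.find("ymax").text))
    return xmins, xmaxs, ymins, ymaxs


# Trimmed version of the annotation posted above.
SAMPLE = """
<annotation>
    <size><width>39</width><height>38</height><depth>3</depth></size>
    <object>
        <name>straight_only</name>
        <bndbox><xmin>6</xmin><ymin>5</ymin><xmax>34</xmax><ymax>33</ymax></bndbox>
    </object>
</annotation>
"""

print(read_boxes(SAMPLE))  # ([6], [34], [5], [33])

# Why chained assignment is risky: both names refer to the same list.
a = b = list()
a.append(1)
print(b)  # [1]
```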

Transmitt0r on 19 Oct 2018

Well I solved the issue. The problem was with the image sizes. I used a script to find the errant entries in the train and test files.

AjayZinngg on 19 Oct 2018

👍2 ❤1

@AjayZinngg I ran the script you linked and got:

    Checked 353 files and realized 352 errors

My dataset, however, consists of both original and augmented images. I had already trained on the original images alone without any error. When I check the combined set of original and augmented images with the check_images.py you linked, they are all flagged as erroneous.

How did you solve this?

dscha09 on 11 Dec 2018

Hi @AjayZinngg, I removed the images that the code you shared flagged as erroneous, but I still get a similar error. How did you go about this?

dscha09 on 11 Dec 2018

Solved on #1754

codexponent on 8 Apr 2019

I also struggled with this problem, but that was my mistake.
I didn't normalize xmin, xmax, ymin and ymax.
For example, xmins=[xmin / img_width]

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md
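
A small sketch of that normalization, using the numbers from the 39x38 annotation posted earlier in the thread. `normalize_box` is a hypothetical helper name; x values are divided by the image width and y values by the image height. This assumes Python 3, where `/` is true division (under Python 2, dividing two ints would truncate).

```python
def normalize_box(xmin, ymin, xmax, ymax, img_width, img_height):
    """Convert a pixel box to fractions of the image size.

    x coordinates are divided by the width, y coordinates by the height.
    """
    return (xmin / img_width, ymin / img_height,
            xmax / img_width, ymax / img_height)


# Box (6, 5, 34, 33) in a 39 px wide, 38 px high image.
xmin, ymin, xmax, ymax = normalize_box(6, 5, 34, 33, img_width=39, img_height=38)
xmins, xmaxs = [xmin], [xmax]
ymins, ymaxs = [ymin], [ymax]
print(round(xmins[0], 4), round(xmaxs[0], 4))  # 0.1538 0.8718
```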

HwangJohn on 10 Apr 2019

👍3

Well I solved the issue. The problem was with the image sizes. I used a script to find the errant entries in the train and test files.

@AjayZinngg
I am facing the same problem and I believe my issue is also related to image size. May I ask how you changed the sizes? Or did you get rid of the errant images? Did you make your images smaller? If so, how small?

serenaraju on 11 Jul 2020

👍1

Hi @serenaraju ,
It's been a while since I've worked on that project so I'm not sure of all the steps I took there. Can you try with images of the same dimensions? Or try what @HwangJohn has commented?

AjayZinngg on 11 Jul 2020

I also struggled with this problem, but that was my mistake.
I didn't normalize xmin, xmax, ymin and ymax.
For example, xmins=[xmin / img_width]

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/using_your_own_dataset.md

Hey, I am doing that but I am still getting the error.

sumitsah9263 on 30 Jul 2020

Hey, sorry for the late reply. What I did was add a preprocessing step that transforms all the images to the same x and y dimensions, similar to what @AjayZinngg proposed.
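
If images are resized like this, any boxes stored in pixels have to be rescaled by the same factors. A minimal sketch, assuming a plain resize with no cropping or padding (`scale_box` is a hypothetical helper, not part of the preprocessing step described above):

```python
def scale_box(box, old_size, new_size):
    """Scale a pixel box (xmin, ymin, xmax, ymax) from old (w, h) to new (w, h)."""
    sx = new_size[0] / old_size[0]  # horizontal scale factor
    sy = new_size[1] / old_size[1]  # vertical scale factor
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)


# The 39x38 example image from the thread, resized to 78x76.
scaled = scale_box((6, 5, 34, 33), old_size=(39, 38), new_size=(78, 76))
print(scaled)  # (12.0, 10.0, 68.0, 66.0)
```

Boxes that are already normalized against the true image size keep the same values under such a resize, since both the coordinate and the dimension scale together.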

Transmitt0r on 10 Aug 2020

👍1
