Models: Feature Request: How to create bounding box annotations for Object Detection Examples/Code

Created on 4 Jul 2017 · 20 Comments · Source: tensorflow/models

System information

What is the top-level directory of the model you are using: object_detection
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): working on it... not yet.
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Mint 18
TensorFlow installed from (source or binary): from pip
TensorFlow version (use command below): 1.2

Describe the problem

The new Object Detection model has some great tutorials for training on PASCAL VOC and Pets, and example scripts for how to create TFRecords. The documentation and tutorials reference scripts for creating your own custom datasets... but these scripts focus only on the creation of TFRecords. I would argue that the biggest problem people, including myself, have with creating our own custom datasets for a model like this is that it's apparently assumed that we know how to create the bounding box annotations for our images. This is anything but simple, and there is a huge void around how to accomplish this. The best I've located so far is for something called Yolo with Darknet (which seems remarkably similar to, and predates, the object detection model here). The information about creating bounding box annotations for Yolo involves using an application to create individual text files for each image, and then running the Yolo/Darknet code that turns those individual text files into a PASCAL VOC style annotation file.

My feature request is that a tutorial or an application for creating the annotations be added to the object detection model. One of the most frustrating assumptions made by TensorFlow developers appears to be that everyone is born understanding how to create their labels.

Cheers, and thank you!

aloerch

👍14

All 20 comments

Hi, I share your concern. I am in the process of implementing bounding box prediction for the KITTI dataset. I wrote the whole code from scratch and will share it here once I'm done.

vxy10 on 4 Jul 2017

🎉6 👍1

That would be fantastic! Thank you in advance!

aloerch on 4 Jul 2017

This is exactly where I'm stuck. I can run the examples, but I cannot create my own dataset because it's not clear to me how to do it. Your problem description was perfect. I already posted an issue about it, but nobody answered. Please, when someone has a clear example, post it here; that will help the community.

DarkNavrel on 5 Jul 2017

Will do. By the way, in the meantime, you can look at lines 96-120 here: https://github.com/tensorflow/models/blob/master/object_detection/create_pascal_tf_record.py

vxy10 on 5 Jul 2017

👎5

I used this script and tried to adapt it to my images, but I did not succeed: it fails with an error saying the annotations are missing. I see that the piece you suggested is for generating the annotations, but how do I use it? Could you give a simple example with a folder containing the images a.jpeg, b.jpeg and c.jpeg?
Thanks, man!

_Sent from my Motorola XT1058 using FastHub_

DarkNavrel on 5 Jul 2017

👍1

Thanks vxy, that's very helpful. Dark, if you download the Pascal VOC dataset you can use the images and annotations as examples (and have thousands of examples). The trick would be to create or find an application that lets you draw bounding boxes on your imagery and saves them in the Pascal VOC annotation (xml) format.

I haven't tried this one yet, but it seems promising: https://github.com/tzutalin/labelImg

aloerch on 5 Jul 2017

👍1
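For readers who want to see what such an xml file contains, here is a minimal sketch that writes a PASCAL VOC style annotation with Python's standard library. The element names (annotation, filename, size, object, name, bndbox, xmin, ymin, xmax, ymax) follow the public VOC files; the helper name make_voc_annotation, the example image size, and the "dog" box are made up for illustration. Real VOC files carry additional elements (folder, source, segmented, and so on) that are omitted here.

```python
import xml.etree.ElementTree as ET


def make_voc_annotation(filename, width, height, boxes, depth=3):
    """Build a PASCAL VOC style annotation tree (hypothetical helper).

    boxes is a list of (label, xmin, ymin, xmax, ymax) in pixel coordinates.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    # Image dimensions; the conversion scripts use these to normalize boxes.
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)

    # One <object> element per bounding box.
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "pose").text = "Unspecified"
        ET.SubElement(obj, "truncated").text = "0"
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        ET.SubElement(bndbox, "xmin").text = str(xmin)
        ET.SubElement(bndbox, "ymin").text = str(ymin)
        ET.SubElement(bndbox, "xmax").text = str(xmax)
        ET.SubElement(bndbox, "ymax").text = str(ymax)

    return ET.ElementTree(root)


# Made-up example: one box on a 640x480 image named a.jpeg.
tree = make_voc_annotation("a.jpeg", 640, 480, [("dog", 48, 240, 195, 371)])
xml_text = ET.tostring(tree.getroot(), encoding="unicode")
print(xml_text)
# To save next to the image instead: tree.write("a.xml")
```

An annotation tool such as the one linked above produces files of this shape for you; the sketch is only meant to show that the format is plain xml with pixel coordinates.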

Thanks aloerch, that is it. My question then is: how can I create PASCAL VOC bounding box annotations (xml files) from my folder "/myimages" containing "a.jpeg, b.jpeg and c.jpeg"?

DarkNavrel on 5 Jul 2017

Similar to what @aloerch offers, Sloth is an easy GUI for creating bounding box annotations in a JSON-like format. Once you have annotations in any form, you can write a custom script that reads them however you have them stored and turns them into what TFRecords needs. Converting them to PASCAL VOC format first is an unnecessary middleman.

micahprice on 5 Jul 2017
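As a rough sketch of the "custom script" idea, the snippet below reads rectangle annotations from JSON and converts them to corner coordinates (xmin, ymin, xmax, ymax), which is the form the box fields in a TFRecord are built from. The JSON layout shown (a list of images, each with "filename" and an "annotations" list holding "class", "x", "y", "width", "height") is an assumption about what a Sloth-style export looks like, so check it against your own files. The function name and sample data are hypothetical.

```python
import json

# Assumed, Sloth-style export; verify the keys against your own annotation files.
SAMPLE = """
[
  {"class": "image", "filename": "a.jpeg",
   "annotations": [
     {"class": "dog", "x": 48.0, "y": 240.0, "width": 147.0, "height": 131.0}
   ]}
]
"""


def rects_to_corner_boxes(json_text):
    """Map each filename to a list of (label, xmin, ymin, xmax, ymax) tuples."""
    records = json.loads(json_text)
    result = {}
    for record in records:
        boxes = []
        for ann in record.get("annotations", []):
            xmin = ann["x"]
            ymin = ann["y"]
            # Top-left corner plus width/height gives the bottom-right corner.
            xmax = xmin + ann["width"]
            ymax = ymin + ann["height"]
            boxes.append((ann["class"], xmin, ymin, xmax, ymax))
        result[record["filename"]] = boxes
    return result


boxes_by_file = rects_to_corner_boxes(SAMPLE)
print(boxes_by_file)
```

The coordinates are still in pixels at this point; dividing by the image width and height is a separate step.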

Thanks micahprice, it's awesome! Is the Sloth output a .json file, and can it be converted into a PASCAL VOC compliant format? I think there is a solution here :)

DarkNavrel on 5 Jul 2017

We're working on documenting how you can bring in your own dataset. It should be coming soon!

For those who can't wait, take a look at the create_pascal_tf_record.py and create_pet_tf_record.py files in the object_detection directory (specifically the dict_to_tf_example function). These scripts show how we convert data from the PASCAL VOC format to the TFRecord format used by the object_detection API.

derekjchow on 5 Jul 2017

👍1
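To make the conversion step more concrete, here is a simplified sketch of the kind of work dict_to_tf_example does: it parses a VOC style xml annotation and collects box coordinates normalized to the 0 to 1 range under the feature keys used by the object_detection scripts (image/object/bbox/xmin and so on). This is not the actual function from the repository. It uses only the standard library and stops at a plain dict, so the final step of wrapping the values in a tf.train.Example is left out, as are the encoded image bytes. The function name, the sample xml, and the label map are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the PASCAL VOC layout.
SAMPLE_XML = """<annotation>
  <filename>a.jpeg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>64</xmin><ymin>120</ymin><xmax>320</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""

# Hypothetical label map: class name -> integer id (ids start at 1).
LABEL_MAP = {"dog": 1}


def voc_xml_to_features(xml_text, label_map):
    """Collect normalized boxes and labels from one VOC annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.find("size/width").text)
    height = int(root.find("size/height").text)

    xmins, ymins, xmaxs, ymaxs, names, labels = [], [], [], [], [], []
    for obj in root.findall("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        # Pixel coordinates are divided by the image size to get values in [0, 1].
        xmins.append(float(box.find("xmin").text) / width)
        ymins.append(float(box.find("ymin").text) / height)
        xmaxs.append(float(box.find("xmax").text) / width)
        ymaxs.append(float(box.find("ymax").text) / height)
        names.append(name)
        labels.append(label_map[name])

    return {
        "image/filename": root.find("filename").text,
        "image/height": height,
        "image/width": width,
        "image/object/bbox/xmin": xmins,
        "image/object/bbox/ymin": ymins,
        "image/object/bbox/xmax": xmaxs,
        "image/object/bbox/ymax": ymaxs,
        "image/object/class/text": names,
        "image/object/class/label": labels,
    }


features = voc_xml_to_features(SAMPLE_XML, LABEL_MAP)
print(features)
```

In the real scripts each of these values is wrapped in a TensorFlow feature and written out with a TFRecord writer; see create_pascal_tf_record.py for the authoritative version.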

Thanks for the help, derekjchow, this is very important to us. How the TFRecord script interprets PASCAL VOC data now seems quite clear to me. The question is:
How do I get from images in jpeg format to bounding box data in PASCAL VOC format for the TFRecord script to read?

DarkNavrel on 5 Jul 2017

Dark, like I mentioned, you can use the program I linked to (I'm trying it out now, it's not my program...), which writes annotations natively in PASCAL VOC xml format. You get a jpeg and an xml. At that point, you can use or modify the create-TFRecord-from-PASCAL script to create the records. Using Sloth would work too, but it doesn't actually cut out a middleman... it makes you create your own middleman. If you're struggling to figure out how to create the annotations and TFRecords, then you probably don't want to start by having to write your own custom script.

aloerch on 5 Jul 2017

Has anyone tried to build a convolutional NN on top of a trained object classifier? I am struggling with the same problem as you 😢

dennisushi on 9 Jul 2017

Now the tutorial is available. Thank you @derekjchow!

korrawat on 12 Jul 2017

👎2

Thank you, that tutorial is very helpful. I'm closing this issue 👍

aloerch on 12 Jul 2017

@vxy10, did you ever finish the KITTI project? I'm writing the same converter now.

sshleifer on 18 Sep 2017

Try using labelImg to annotate your images in Pascal format.
Then convert the Pascal format to TFRecord by referring to the SSD-Tensorflow or Tensorflow-models projects.