Py-faster-rcnn: RuntimeWarning: invalid value encountered in log while training on own annotated dataset

Created on 11 Nov 2015 · 2 Comments · Source: rbgirshick/py-faster-rcnn

Hi!

I am trying to train on my own dataset, which consists of 3 classes. I have already been able to train on the VOC2007 dataset with fewer classes, so I am quite sure that the problem isn't caused by the different number of classes.

I am able to finish training and evaluate the result, but for one of these 3 classes I get 0 mAP. Digging further, I found that during training numpy sometimes reports _RuntimeWarning: invalid value encountered in log_. The warning is caused by a negative value being passed to the log function.

In _lib/fast_rcnn/bbox_transform.py_ on line 16, two vectors, _gt_rois[:, 2]_ and _gt_rois[:, 0]_, are subtracted, and the log function is later applied to their difference. In some cases the difference is, surprisingly, negative. The pairs of numbers usually look like _(12.809, 111.236), (161.667, 291.667), (636.667, 788.333)_, but in the problem cases the first number is much larger, e.g. _(98302.5, 591)_. The _gt_rois_ array is passed in from the _forward()_ method in _lib/rpn/anchor_target_layer.py_.
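To make the failure concrete, here is a minimal sketch of the arithmetic using the pairs quoted above. It uses only the standard library, and the helper name `log_width` is made up for illustration; it is not part of py-faster-rcnn. The real code works on numpy arrays, where a non-positive argument gives nan plus the warning instead of an exception.

```python
import math

def log_width(x1, x2):
    """Return log(x2 - x1), or None when the width is not positive."""
    width = x2 - x1
    if width <= 0:
        # math.log would raise ValueError here; numpy's log instead returns
        # nan and emits "RuntimeWarning: invalid value encountered in log".
        return None
    return math.log(width)

# (first, second) pairs copied from the post above.
pairs = [(12.809, 111.236), (161.667, 291.667), (636.667, 788.333), (98302.5, 591)]

# Keep only the pairs whose difference cannot be passed to log.
bad_pairs = [p for p in pairs if log_width(*p) is None]
print(bad_pairs)  # [(98302.5, 591)]
```

A check like this, placed just before the log call, would at least show which boxes trigger the warning.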

At first I thought the problem could be with the data, so I checked it and deleted some images that were not RGB (they belonged to the class that failed to train). I also clamped some of the _xmax_ and _ymax_ values so that they stay within the ranges _[0, (width-1)]_ and _[0, (height-1)]_, respectively. Nonetheless, none of these changes helped, and I still get _RuntimeWarning: invalid value encountered in log_ at some points during training.

Any idea what could be wrong with the data? Or how I could track down these large values? I know that the _forward()_ method mentioned above is triggered from _lib/fast_rcnn/train.py_ by the _self.solver.step(1)_ call, but I still haven't found the place where the _bottom_ parameter containing the data is passed in.

Thank you!

Martin

All 2 comments

As I anticipated, the problem turned out to be with the data. To annotate the images I used this tool, https://github.com/tzutalin/labelImg, which allows a value of 0 in the _xmin_ and _xmax_ tags. Somewhere in the _py-faster-rcnn_ code, 1 is subtracted from these bounding-box coordinates, which causes an integer underflow: 0 becomes 65535 (the largest unsigned 16-bit value), and when that is scaled by a factor of 1.5 the result is 98302.5 (the same number I mentioned in the post above).
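The arithmetic can be reproduced in a few lines. This is only a sketch: it assumes the coordinates end up in an unsigned 16-bit integer (which the value 65535 suggests), and `subtract_one_uint16` is a made-up helper, not a function from py-faster-rcnn.

```python
import ctypes

def subtract_one_uint16(value):
    """Subtract 1 as an unsigned 16-bit integer would, wrapping on underflow."""
    # ctypes integer types do no overflow checking, so -1 wraps to 65535.
    return ctypes.c_uint16(value - 1).value

wrapped = subtract_one_uint16(0)   # annotation coordinate 0, minus 1
scaled = wrapped * 1.5             # scale factor mentioned above
print(wrapped, scaled)  # 65535 98302.5
```

A coordinate of 1 or more does not wrap, which is why only the 0-valued annotations produce the huge numbers.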

The only thing that still isn't clear to me is that the 0-valued coordinates were not confined to the class that received 0 mAP.

I hope this helps somebody who is tackling the same problem.
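For anyone who wants to scan their annotations for this, something like the following might work. It is a rough sketch that assumes Pascal VOC style XML (_object_ / _bndbox_ / _xmin_ ... tags); the function name `find_bad_boxes` and the sample annotation are invented for illustration. It flags every coordinate below 1, since subtracting 1 from such a value would go negative.

```python
import xml.etree.ElementTree as ET

def find_bad_boxes(xml_text):
    """Return (class name, tag, value) for every box coordinate below 1."""
    root = ET.fromstring(xml_text)
    problems = []
    for obj in root.iter("object"):
        name = obj.findtext("name", default="?")
        box = obj.find("bndbox")
        if box is None:
            continue  # object without a bounding box, nothing to check
        for tag in ("xmin", "ymin", "xmax", "ymax"):
            text = box.findtext(tag)
            if text is None:
                continue  # tag missing, skip rather than guess
            value = float(text)
            if value < 1:
                problems.append((name, tag, value))
    return problems

# Invented sample: the first box touches the left image border with xmin = 0.
sample = """
<annotation>
  <object>
    <name>cat</name>
    <bndbox><xmin>0</xmin><ymin>15</ymin><xmax>120</xmax><ymax>200</ymax></bndbox>
  </object>
  <object>
    <name>dog</name>
    <bndbox><xmin>30</xmin><ymin>40</ymin><xmax>90</xmax><ymax>110</ymax></bndbox>
  </object>
</annotation>
"""

problems = find_bad_boxes(sample)
print(problems)  # [('cat', 'xmin', 0.0)]
```

Running it over each annotation file and fixing (or shifting) the flagged values before training should avoid the wrap-around.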

and don't forget to delete the annotations_cache...
