Darkflow: Training results loss: nan avg loss: nan

Created on 4 Apr 2018  ·  4 Comments  ·  Source: thtrieu/darkflow

Hi all,
Apologies, this is my first time using object detection of any sort, and in fact my first time using Python. While trying to train on my data set, the loss comes out as nan.
The error I receive is:
RuntimeWarning: invalid value encountered in sqrt
obj[4] = np.sqrt(obj[4])

Sample annotation xml file

<annotation>
  <folder>images</folder>
  <filename>000006.png</filename>
  <segmented>0</segmented>
  <size>
    <width>225</width>
    <height>225</height>
    <depth>3</depth>
  </size>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>35</xmin>
      <ymin>93</ymin>
      <xmax>45</xmax>
      <ymax>45</ymax>
    </bndbox>
  </object>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>69</xmin>
      <ymin>94</ymin>
      <xmax>77</xmax>
      <ymax>77</ymax>
    </bndbox>
  </object>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>129</xmin>
      <ymin>85</ymin>
      <xmax>136</xmax>
      <ymax>136</ymax>
    </bndbox>
  </object>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>145</xmin>
      <ymin>98</ymin>
      <xmax>153</xmax>
      <ymax>153</ymax>
    </bndbox>
  </object>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>81</xmin>
      <ymin>143</ymin>
      <xmax>92</xmax>
      <ymax>92</ymax>
    </bndbox>
  </object>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>185</xmin>
      <ymin>188</ymin>
      <xmax>195</xmax>
      <ymax>195</ymax>
    </bndbox>
  </object>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>77</xmin>
      <ymin>189</ymin>
      <xmax>93</xmax>
      <ymax>93</ymax>
    </bndbox>
  </object>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>75</xmin>
      <ymin>204</ymin>
      <xmax>89</xmax>
      <ymax>89</ymax>
    </bndbox>
  </object>
</annotation>

cfg file

[net]
batch=64
subdivisions=8
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
max_batches = 40100
policy=steps
steps=-1,100,20000,30000
scales=.1,10,.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=30
activation=linear

[region]
anchors = 1.08,1.19,  3.42,4.41,  6.63,11.38,  9.42,5.11,  16.62,10.52
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1

object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1

absolute=1
thresh = .5
random=1

I hope someone can point me in the right direction. It's the first time I've tried training my own model.
Thanks,
Lee

All 4 comments

If you are training on your own custom dataset and this is the initial training run, you should use 'burn_in', because the gradient and loss are very unstable early in training.
Check the code and search for 'seen' in https://github.com/marvis/pytorch-yolo2/blob/master/train.py
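
For readers unfamiliar with the term, here is a minimal sketch of what a burn-in (warmup) schedule does, assuming the darknet convention of ramping the learning rate polynomially over the first burn_in batches. The function name and the default values (burn_in=1000, power=4.0) are illustrative assumptions. The cfg posted in the question has no burn_in line, and nothing in this thread confirms that darkflow reads one.

```python
def burn_in_rate(batch_num, base_rate=0.001, burn_in=1000, power=4.0):
    """Learning rate at a given batch under a darknet-style warmup (sketch).

    base_rate matches learning_rate=0.001 from the cfg in the question;
    burn_in and power are assumed values, not taken from that cfg.
    """
    if batch_num < burn_in:
        # Ramp from 0 up to base_rate over the first `burn_in` batches.
        return base_rate * (batch_num / burn_in) ** power
    return base_rate


if __name__ == "__main__":
    for batch in (0, 250, 500, 1000, 5000):
        print(batch, burn_in_rate(batch))
```

A warmup like this only damps early instability. It would not fix a NaN that comes from invalid training labels.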

I have the same question... Do you have any idea about this?

This script apparently helps you identify when the NaNs start to occur:

https://gist.github.com/yuq-1s/ce63a306f1d39d1c0c80d33f7855f3b5

But I am not sure how to use it with darkflow. If you work it out, please let me know.

You are getting nan because ymin is bigger than ymax:

<bndbox>
      <xmin>69</xmin>
      <ymin>94</ymin>
      <xmax>77</xmax>
      <ymax>77</ymax>
</bndbox>

Make sure that min is smaller than max for both x and y.
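
A quick way to catch this before training is to scan the annotation files for inverted boxes. The sketch below uses only the Python standard library. The function name find_bad_boxes and the inline SAMPLE string are made up for illustration, and the "box outside image" check is an extra sanity test that was not mentioned in the thread. The presumed link to the warning in the question is that a negative box height (ymax - ymin) ends up inside np.sqrt, which returns nan.

```python
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_text):
    """Return (index, name, box, reasons) for every invalid <bndbox>."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = int(size.findtext("width"))
    img_h = int(size.findtext("height"))
    bad = []
    for i, obj in enumerate(root.iter("object")):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        reasons = []
        if xmin >= xmax:
            reasons.append("xmin >= xmax")
        if ymin >= ymax:
            reasons.append("ymin >= ymax")
        if xmin < 0 or ymin < 0 or xmax > img_w or ymax > img_h:
            reasons.append("box outside image")
        if reasons:
            bad.append((i, obj.findtext("name"), (xmin, ymin, xmax, ymax), reasons))
    return bad


# Two boxes copied from the annotation in the question: one inverted, one valid.
SAMPLE = """<annotation>
  <size><width>225</width><height>225</height><depth>3</depth></size>
  <object>
    <name>person</name>
    <bndbox><xmin>69</xmin><ymin>94</ymin><xmax>77</xmax><ymax>77</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>185</xmin><ymin>188</ymin><xmax>195</xmax><ymax>195</ymax></bndbox>
  </object>
</annotation>"""

if __name__ == "__main__":
    for index, name, box, reasons in find_bad_boxes(SAMPLE):
        print(index, name, box, ", ".join(reasons))
    # prints: 0 person (69, 94, 77, 77) ymin >= ymax
```

To check a whole dataset, read each .xml file in the annotation folder and pass its contents to the function. In the sample annotation from the question, every ymax is identical to the xmax of the same box, which suggests that whatever wrote the file stored the wrong value in ymax, so the fix probably belongs in the annotation export step.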
