So I copied the tiny yolo cfg and put it in a new cfg file. I changed the classes and filters on the new cfg to match the classes on my data set. I also changed the index.txt to match my classes. I then ran:
python flow --train --model cfg/new.cfg --load bin/tiny-yolo.weights --dataset
The training kicks off but for every step I get
step 1 - loss nan - moving ave loss nan
So I feel like I'm doing something wrong. Any help on this issue would be greatly appreciated.
Dataset and annotation are blank in your command. You have to specify the path to the .xml annotations dir and dataset image dir.
Oh yeah, my actual command is:
python flow --train --model cfg/adas.cfg --load bin/tiny-yolo.weights --dataset "signDatabasePublicFramesOnly/annotations/" --annotation "signDatabasePublicFramesOnly/bb_annotations/"
I only wrote the command that way because I figured no one cared what my paths were.
Just making sure.
Can you post your cfg file? Are your annotations correct?
So here's the new cfg file.
[net]
batch=64
subdivisions=8
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
learning_rate=0.001
max_batches = 120000
policy=steps
steps=-1,100,80000,100000
scales=.1,10,.1,.1
[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[maxpool]
size=2
stride=1
[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=80
activation=linear
[region]
anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741
bias_match=1
classes=11
coords=4
num=5
softmax=1
jitter=.2
rescore=1
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
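For anyone checking their own cfg: in YOLOv2-style configs, the filters value of the last [convolutional] layer (the one right before [region]) is normally num * (classes + coords + 1). The small sketch below (the helper name region_filters is made up for illustration) suggests that filters=80 is consistent with classes=11 and num=5 in the cfg above, so the filter count is probably not what causes the nan here.

```python
def region_filters(classes, num=5, coords=4):
    """Filters expected in the conv layer that feeds a YOLOv2 [region] layer."""
    # Each of the `num` anchor boxes predicts `coords` box values,
    # one objectness score, and one score per class.
    return num * (coords + 1 + classes)


# Values from the cfg posted above: classes=11, num=5, coords=4.
print(region_filters(classes=11))  # 80
```

If the printed number does not match the filters line in your own cfg, fix that first before looking at the data.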
Also, I had to write a script to make the XML files from the dataset that I had. Here's an example of one of my annotation XMLs:
<annotation>
<source>
<image>UCSD</image>
<annotation>UCSD LISA</annotation>
<flickrid>0000000</flickrid>
<database>The VOC2007 Database</database>
</source>
<object>
<bndbox>
<xmin>474</xmin>
<ymin>206</ymin>
<ymax>144</ymax>
<xmax>526</xmax></bndbox>
<pose>Unspecified</pose>
<truncated>0</truncated>
<name>speedLimit25</name>
<difficult>0</difficult>
</object>
<filename>speedLimit25_1333392931.avi_image0.png</filename>
<segmented>0</segmented>
<owner>
<name>LISA</name>
<flickrid>UCSD</flickrid>
</owner>
<folder>annotations</folder>
<size>
<width>1024</width>
<depth>3</depth>
<height>522</height>
</size>
</annotation>
Also thanks for all the help.
I also just realized an error I was getting.
C:\Users\Jorda\Documents\darkflow\darkflow\net\yolov2\data.py:41: RuntimeWarning: invalid value encountered in sqrt
obj[4] = np.sqrt(obj[4])
Training statistics:
Learning rate : 1e-05
Batch size : 16
Epoch number : 1000
Backup every : 2000
I'm debugging it now, but any insight would be appreciated.
Figured out it was a problem with the script that made the xml files.
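The original poster did not say what exactly was wrong in the script, so the following is only a guess at one plausible failure mode: a converter that writes the box corners in whatever order the source dataset gives them. A minimal sketch, assuming the converter has two opposite corners of each box (make_bndbox is a hypothetical helper, not part of darkflow), sorts each axis before writing so that min is always below max:

```python
import xml.etree.ElementTree as ET


def make_bndbox(x1, y1, x2, y2):
    """Build a VOC-style <bndbox> element from two opposite corners."""
    # Sort each axis so that min <= max regardless of the corner order.
    xmin, xmax = sorted((int(x1), int(x2)))
    ymin, ymax = sorted((int(y1), int(y2)))
    if xmin == xmax or ymin == ymax:
        raise ValueError("degenerate box with zero width or height")
    box = ET.Element("bndbox")
    for tag, value in (("xmin", xmin), ("ymin", ymin),
                       ("xmax", xmax), ("ymax", ymax)):
        ET.SubElement(box, tag).text = str(value)
    return box


# Corner values taken from the annotation posted above.
box = make_bndbox(474, 206, 526, 144)
print(ET.tostring(box, encoding="unicode"))
```

With the values from the posted annotation, the element comes out with ymin=144 and ymax=206.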
That's what I figured, hence the sqrt thing.
Hi @dddevo26 ,
Can you tell me which script it was and what was wrong with it? I ran into the same problem.
At step 884375 it printed: loss nan - moving ave loss nan
Thank you.
@dddevo26 Yeah, what exactly was the problem with the XML writing script? I'm facing the same problem, so knowing that would help me check which part of my script could be going wrong. Thanks.
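C:\Users\Jorda\Documents\darkflow\darkflow\net\yolov2\data.py:41: RuntimeWarning: invalid value encountered in sqrt
obj[4] = np.sqrt(obj[4])
You are getting this warning because some of your annotations have xmin > xmax or ymin > ymax. The box width or height then comes out negative, and np.sqrt of a negative number is nan. The annotation posted above is an example: ymin is 206 but ymax is 144.
Hope this helps!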
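To see why inverted coordinates end in nan, here is the arithmetic for the annotation posted earlier in this thread. This is a sketch that assumes the loader takes the square root of the box height divided by the image height, which is what the obj[4] = np.sqrt(obj[4]) line in the warning suggests; it uses only the standard library, so the nan is produced by hand where NumPy would return it directly.

```python
import math

# Values copied from the annotation posted earlier in this thread.
ymin, ymax = 206, 144
image_height = 522

# Assumed normalisation: box height as a fraction of the image height.
rel_height = (ymax - ymin) / image_height
print(rel_height)  # negative, because ymin > ymax

# np.sqrt returns nan (with a RuntimeWarning) for a negative input;
# math.sqrt raises ValueError instead, so nan is substituted here.
try:
    root = math.sqrt(rel_height)
except ValueError:
    root = float("nan")
print(math.isnan(root))  # True
```

Once one nan enters a training batch, the loss for that step is nan as well, which matches the "loss nan - moving ave loss nan" output.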
Hello, I got the same error and I don't understand why. My XML file looks like a classic one
and my xmin
loss nan - moving ave loss nan
I got the same issue as well. How do I debug it? This issue should be reopened.
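One way to debug this is to scan every annotation file for boxes that would break the loader before starting training. The sketch below is not part of darkflow; check_annotation and find_bad_boxes are hypothetical helpers, and it assumes VOC-style XML laid out like the example earlier in the thread (a size element plus one bndbox per object).

```python
import glob
import os
import xml.etree.ElementTree as ET


def check_annotation(root):
    """Return a list of problems found in one parsed VOC-style annotation."""
    size = root.find("size")
    if size is None:
        return ["missing <size>"]
    problems = []
    width = int(size.findtext("width", "0"))
    height = int(size.findtext("height", "0"))
    if width <= 0 or height <= 0:
        problems.append("image size %dx%d is not positive" % (width, height))
    for obj in root.iter("object"):
        name = obj.findtext("name", "?")
        box = obj.find("bndbox")
        if box is None:
            problems.append("%s: missing <bndbox>" % name)
            continue
        # A missing coordinate becomes nan, which fails the checks below.
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag, "nan"))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if not xmin < xmax:
            problems.append("%s: xmin %g is not less than xmax %g" % (name, xmin, xmax))
        if not ymin < ymax:
            problems.append("%s: ymin %g is not less than ymax %g" % (name, ymin, ymax))
        if xmin < 0 or ymin < 0 or xmax > width or ymax > height:
            problems.append("%s: box lies outside the image" % name)
    return problems


def find_bad_boxes(annotation_dir):
    """Yield (path, problem) for every problem in a directory of .xml files."""
    for path in sorted(glob.glob(os.path.join(annotation_dir, "*.xml"))):
        for problem in check_annotation(ET.parse(path).getroot()):
            yield path, problem
```

Calling find_bad_boxes on the directory you pass to --annotation and printing each (path, problem) pair should list every file that needs fixing. An empty result means the nan most likely comes from somewhere else, for example the learning rate or the cfg.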