Darknet: free(): invalid next size (fast) when using custom trained weights

Created on 16 May 2018 · 8 Comments · Source: pjreddie/darknet

Hello, I keep receiving this output when I used my custom weights:

Loading weights from backup/yolov3-logo_50000.weights...Done!
/home/ryan/Projects/data_synthesization/output/dataset/01a0ad79-94d3-4923-b210-d87b3cfd4f82.mov-0001.jpg: Predicted in 9.369853 seconds.
hyundai: 100%
free(): invalid next size (fast)

The command is:
./darknet detector test cfg/voc.data cfg/yolov3-logo.cfg backup/yolov3-logo_50000.weights /home/ryan/Projects/data_synthesization/output/dataset/01a0ad79-94d3-4923-b210-d87b3cfd4f82.mov-0001.jpg

voc.data

classes= 1
train  = /home/pjreddie/data/voc/train.txt
valid  = /home/pjreddie/data/voc/2007_test.txt
names = data/voc.names
backup = backup

yolov3-logo.cfg

[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
...
[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=1
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

Weights file: https://drive.google.com/open?id=1X69M6Qb_pSvoM0Dr2lN2Q04eUzVek1V4

Most of the time this error occurs when darknet detects something; if darknet fails to detect anything, the error does not occur.


All 8 comments

There are 3 places where the filters/classes pair appears (in your case filters=18 and classes=1). Did you replace all of them?

3 total? I thought there was only one, like in YOLOv2; I'll check it again. It's kinda weird though, since I had no problem with training.

@ryanaleksander there is no problem during training, but the error occurs when you run testing.

I get the same error. On my laptop using CPU only everything seems fine. On a GPU machine I get

*** Error in `./darknet': free(): invalid next size (fast): 0x0000000001f13b40 ***

The command I use is
./darknet detector test -thresh 0.1 voc.data cfg/yolov3.cfg backups/yolov3-weights.backup image.jpg

In the debugger I see that it happens in ./src/network.c

#0  0x00007fdcf2e12c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007fdcf2e16028 in __GI_abort () at abort.c:89
#2  0x00007fdcf2e4f2a4 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7fdcf2f61350 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007fdcf2e5b82e in malloc_printerr (ptr=<optimized out>, str=0x7fdcf2f614f0 "free(): invalid next size (fast)", action=1) at malloc.c:4998
#4  _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3842
#5  0x0000000000461fc9 in free_detections (dets=0x7fdca2ee73a0, n=3) at ./src/network.c:573
#6  0x000000000042238b in test_detector (datacfg=0x7fff7d11ab52 "/volume/share/public/SUN2012pascalformat/sun2012-voc.data", cfgfile=0x7fff7d11ab8c "/volume/share/public/SUN2012pascalformat/yolov3-sun10-learning-rate.cfg", 
    weightfile=0x7fff7d11abd4 "/volume/share/public/SUN2012pascalformat/weight_backups/yolov3-sun10-learning-rate.backup", filename=0x0, thresh=0.100000001, hier_thresh=0.5, outfile=0x0, fullscreen=0) at ./examples/detector.c:605
#7  0x0000000000422957 in run_detector (argc=8, argv=0x7fff7d119f48) at ./examples/detector.c:845
#8  0x0000000000426a0b in main (argc=8, argv=0x7fff7d119f48) at ./examples/darknet.c:434
(gdb) frame 5
#5  0x0000000000461fc9 in free_detections (dets=0x7fdca2ee73a0, n=3) at ./src/network.c:573
573             free(dets[i].prob);
(gdb) list
568
569     void free_detections(detection *dets, int n)
570     {
571         int i;
572         for(i = 0; i < n; ++i){
573             free(dets[i].prob);
574             if(dets[i].mask) free(dets[i].mask);
575         }
576         free(dets);
577     }

I set a watchpoint on dets[0].prob and it is modified here:
Hardware watchpoint 3: *0xfd88318

Old value = 23029904
New value = 262246768
__memcpy_sse2 () at ../sysdeps/x86_64/multiarch/../memcpy.S:206
206     ../sysdeps/x86_64/multiarch/../memcpy.S: No such file or directory.
(gdb) bt
#0  __memcpy_sse2 () at ../sysdeps/x86_64/multiarch/../memcpy.S:206
#1  0x00007ffff03075ce in __GI_qsort_r (b=0xfd88300, n=3, s=48, cmp=<optimized out>, arg=<optimized out>) at msort.c:271
#2  0x00000000004703b0 in do_nms_sort (dets=0xfd88300, total=3, classes=10, thresh=0.449999988) at ./src/box.c:77
#3  0x0000000000422313 in test_detector (datacfg=0x7fffffffeb36 "/volume/share/public/SUN2012pascalformat/sun2012-voc.data", cfgfile=0x7fffffffeb70 "/volume/share/public/SUN2012pascalformat/yolov3-sun10-learning-rate.cfg", 
    weightfile=0x7fffffffebb8 "/volume/share/public/SUN2012pascalformat/weight_backups/yolov3-sun10-learning-rate.backup", filename=0x0, thresh=0.100000001, hier_thresh=0.5, outfile=0x0, fullscreen=0) at ./examples/detector.c:603
#4  0x0000000000422957 in run_detector (argc=8, argv=0x7fffffffe828) at ./examples/detector.c:845
#5  0x0000000000426a0b in main (argc=8, argv=0x7fffffffe828) at ./examples/darknet.c:434

Unfortunately I have no idea whether do_nms_sort is supposed to change it or not.

I have found the actual source of the error: in my .cfg file, one of the three yolo layers had a different number of classes than the other two. 🤦
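A quick way to catch that kind of mismatch before running darknet is to extract every classes= line from the cfg and check that they all agree. A minimal sketch follows; the heredoc writes a deliberately broken stand-in cfg for illustration, so point sed at your own file instead:

```shell
# Stand-in cfg reproducing the bug described above: one [yolo] layer
# disagrees with the other two.
cat > /tmp/example.cfg <<'EOF'
[yolo]
classes=6
[yolo]
classes=6
[yolo]
classes=7
EOF

# Count the distinct classes= values; more than one means a mismatch.
distinct=$(sed -n 's/^classes=//p' /tmp/example.cfg | sort -u | wc -l | tr -d ' ')
if [ "$distinct" -ne 1 ]; then
  echo "MISMATCH: $distinct distinct classes values in cfg"
fi
```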

I have the same problem with my custom trained weights.
Will I need to retrain my model, or does fixing the .cfg work with the existing weights? Sorry for my poor English.

Fixed!
In YOLOv3 the .cfg file has 3 occurrences of classes and 3 matching occurrences of filters (technically filters=xxx appears more often, but you only want to change the ones that default to 75); in YOLOv2 there is only one.
Changing those numbers to the correct value (filters = (num/3) * (5+classes), where num = 9 by default) fixed both darknet: ./src/parser.c:312: parse_yolo: Assertion `l.outputs == params.inputs' failed. and free(): invalid next size (fast)
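Evaluated for the two configurations in this thread, the formula gives exactly the filters values the cfgs are supposed to contain. Plain arithmetic, shown with shell just for illustration:

```shell
# filters = (num / 3) * (5 + classes), with num = 9 as in both cfgs above.
num=9
for classes in 1 6; do
  filters=$(( (num / 3) * (5 + classes) ))
  echo "classes=$classes -> filters=$filters"
done
# classes=1 -> filters=18 (first cfg), classes=6 -> filters=33 (second cfg)
```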

I am having the same problem: *** Error in `python3': free(): invalid next size (fast): darknet...
I thought my .cfg file had the same error @felixendres described:

I have found the actual source of the error: in my .cfg file, one of the three yolo layers had a different number of classes than the other two. 🤦

But I can't see where the error is:

Relevant parts of yolov3.cfg

[net]
# Testing
batch=1
subdivisions=1
# Training
#batch=32
#subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

...

[convolutional]
size=1
stride=1
pad=1
filters=33
activation=linear


[yolo]
mask = 6,7,8
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=6
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1


[route]
layers = -4

...

[convolutional]
size=1
stride=1
pad=1
filters=33
activation=linear


[yolo]
mask = 3,4,5
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=6
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1



[route]
layers = -4

...

[convolutional]
size=1
stride=1
pad=1
filters=33
activation=linear


[yolo]
mask = 0,1,2
anchors = 10,13,  16,30,  33,23,  30,61,  62,45,  59,119,  116,90,  156,198,  373,326
classes=6
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

Edit:
The error was in the .data file. I trained 6 classes but forgot to update the number of classes in the .data file:

classes= 7
train  = train.txt  
valid  = test.txt  
names = 
backup = backup
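The same kind of pre-flight check works for the .data file: its classes= value should equal the number of lines in the names file. A sketch with stand-in files created inline (substitute your own paths):

```shell
# Stand-in names and .data files; replace with your real paths.
cat > /tmp/example.names <<'EOF'
hyundai
kia
EOF
cat > /tmp/example.data <<'EOF'
classes= 2
names = /tmp/example.names
EOF

data_classes=$(sed -n 's/^classes *= *//p' /tmp/example.data)
name_count=$(grep -c . /tmp/example.names)
if [ "$data_classes" -eq "$name_count" ]; then
  echo "OK: $data_classes classes"
else
  echo "MISMATCH: .data says $data_classes, names file lists $name_count"
fi
```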