Darknet: Memory leak in darknet.py detect function load_image

Created on 21 Mar 2018 · 5 Comments · Source: pjreddie/darknet

I have installed darknet on Ubuntu 16.04 with an Nvidia 1050Ti GPU. I have custom trained some weights and they detect images (not video) accurately and rapidly (<0.5 second).

I then wrote a small Python script to run detection on images arriving in a folder from 2 IP cameras. Each image is either deleted if nothing is detected, or moved to a different folder if something is detected. My Python program crashes after a (variable) period.
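
For readers who want to reproduce the setup, a folder loop of this kind might look roughly like the sketch below. It is not the original script: `process_folder` and its `detect` argument are hypothetical names, and `detect` is assumed to be any callable that takes a file path and returns a (possibly empty) list of detections.

```python
import shutil
from pathlib import Path

def process_folder(in_dir, out_dir, detect):
    """Run detect(path) on every file in in_dir.

    Files with at least one detection are moved to out_dir;
    files with none are deleted. Returns (moved, deleted) name lists.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    moved, deleted = [], []
    for path in sorted(Path(in_dir).iterdir()):
        if not path.is_file():
            continue
        detections = detect(str(path))  # hypothetical detector callable
        if detections:
            shutil.move(str(path), str(out / path.name))
            moved.append(path.name)
        else:
            path.unlink()
            deleted.append(path.name)
    return moved, deleted
```

Because the detector is passed in as a callable, the loop can be exercised with a stub first, which helps separate leaks in the Python loop from leaks in the C library.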

If I run htop while feeding multiple images to the input folder, htop shows the memory used by the Python script increasing by ~3% on every pass through the detect function. So after ~30 images, the system runs out of memory. I have narrowed this memory growth to the "im = load_image(image, 0, 0)" line in the detect function in darknet.py, i.e. the memory jump occurs each time a new image is loaded by this line.
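
The same per-call growth can also be measured from inside the script instead of watching htop. The sketch below is one possible way to do it, assuming Linux (it reads `VmRSS` from `/proc/self/status`); `rss_kib` and `rss_growth` are hypothetical helper names, not part of darknet.py.

```python
def rss_kib():
    """Return this process's current resident set size in KiB (Linux only)."""
    with open("/proc/self/status") as fh:
        for line in fh:
            if line.startswith("VmRSS:"):
                # Line looks like: "VmRSS:     12345 kB"
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found in /proc/self/status")

def rss_growth(fn, iterations):
    """Call fn() repeatedly and return the RSS change in KiB after each call."""
    deltas = []
    before = rss_kib()
    for _ in range(iterations):
        fn()
        after = rss_kib()
        deltas.append(after - before)
        before = after
    return deltas

# Hypothetical usage with the darknet.py bindings:
#   print(rss_growth(lambda: detect(net, meta, b"image.jpg"), 10))
```

A leak in the C library tends to show up as a roughly constant positive delta on every call, whereas normal allocator warm-up usually levels off after the first few iterations.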

I have been through all the issues in this GitHub repository and of course Googled as much as possible. I have followed the memory-leak-related suggestions made by others here (thank you), which free images and deallocate memory (in the C libraries), and I've upgraded stb_image.h (to v2.19) and stb_image_write.h (to v1.09). But none of this has brought down the large ~3% memory increase per image load. The result is that I'm unable to run my script, because it dies (and often also kills the machine) within ~30 image detections.

My guess is that this memory leak is quite deep within the C code or associated libraries, where my skills are rather limited. So if anyone has any suggestions, I would appreciate any input or pointers. I'm happy to climb into the C source, but I'm not sure where to look next.

Thanks
Cliff

cliffco

Most helpful comment

I recently faced the same issue and, after digging into it for a few hours, found that 'free_image' has to be called after each detection. I used the repo from https://github.com/AlexeyAB/darknet
In darknet.py, in the 'detect_image' method, call free_image(im) before returning the result.
At least in my case, doing this fixed the memory leak.
If anyone is interested, I have made a few modifications to darknet.py to use darknet as a class.
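
The free-after-every-detection idea can also be written so that the image is released even if detection raises. This is only a sketch: `loaded_image` is a hypothetical helper, and `load_image` / `free_image` are assumed to be the ctypes bindings from darknet.py, passed in as plain callables.

```python
from contextlib import contextmanager

@contextmanager
def loaded_image(load_image, free_image, path):
    """Load an image and guarantee it is freed exactly once on exit.

    load_image(path, 0, 0) and free_image(im) are assumed to behave like
    the darknet.py bindings of the same names.
    """
    im = load_image(path, 0, 0)
    try:
        yield im
    finally:
        # Runs on normal exit and on exceptions alike.
        free_image(im)

# Hypothetical usage:
#   with loaded_image(load_image, free_image, b"image.jpg") as im:
#       result = run_detection(net, meta, im)
```

Whichever function owns the image should free it exactly once. If detect_image already calls free_image(im) itself, wrapping the call like this as well would release the same buffer twice.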

Below is the full code of darknet.py as I modified it for use with OpenCV's VideoCapture:

#!python3

from ctypes import *
import math
import random
import os
import re
import numpy as np

def sample(probs):
    s = sum(probs)
    probs = [a/s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs)-1

def c_array(ctype, values):
    arr = (ctype*len(values))()
    arr[:] = values
    return arr

class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]

class DETECTION(Structure):
    _fields_ = [("bbox", BOX),
                ("classes", c_int),
                ("prob", POINTER(c_float)),
                ("mask", POINTER(c_float)),
                ("objectness", c_float),
                ("sort_class", c_int)]


class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]

class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]

hasGPU = True
altNames = None

class Darknet(object):

    def __init__(self):
        self.lib = CDLL("./libdarknet.so", RTLD_GLOBAL)
        self.lib.network_width.argtypes = [c_void_p]
        self.lib.network_width.restype = c_int
        self.lib.network_height.argtypes = [c_void_p]
        self.lib.network_height.restype = c_int

        self.copy_image_from_bytes = self.lib.copy_image_from_bytes
        self.copy_image_from_bytes.argtypes = [IMAGE,c_char_p]

        # Expose the network dimension getters on the instance.
        self.network_width = self.lib.network_width
        self.network_height = self.lib.network_height

        self.predict = self.lib.network_predict_ptr
        self.predict.argtypes = [c_void_p, POINTER(c_float)]
        self.predict.restype = POINTER(c_float)

        if hasGPU:
            self.set_gpu = self.lib.cuda_set_device
            self.set_gpu.argtypes = [c_int]

        self.make_image = self.lib.make_image
        self.make_image.argtypes = [c_int, c_int, c_int]
        self.make_image.restype = IMAGE

        self.get_network_boxes = self.lib.get_network_boxes
        self.get_network_boxes.argtypes = [c_void_p, c_int, c_int, c_float, c_float, POINTER(c_int), c_int, POINTER(c_int), c_int]
        self.get_network_boxes.restype = POINTER(DETECTION)

        self.make_network_boxes = self.lib.make_network_boxes
        self.make_network_boxes.argtypes = [c_void_p]
        self.make_network_boxes.restype = POINTER(DETECTION)

        self.free_detections = self.lib.free_detections
        self.free_detections.argtypes = [POINTER(DETECTION), c_int]

        self.free_ptrs = self.lib.free_ptrs
        self.free_ptrs.argtypes = [POINTER(c_void_p), c_int]

        self.network_predict = self.lib.network_predict_ptr
        self.network_predict.argtypes = [c_void_p, POINTER(c_float)]

        self.reset_rnn = self.lib.reset_rnn
        self.reset_rnn.argtypes = [c_void_p]

        self.load_net = self.lib.load_network
        self.load_net.argtypes = [c_char_p, c_char_p, c_int]
        self.load_net.restype = c_void_p

        self.load_net_custom = self.lib.load_network_custom
        self.load_net_custom.argtypes = [c_char_p, c_char_p, c_int, c_int]
        self.load_net_custom.restype = c_void_p

        self.do_nms_obj = self.lib.do_nms_obj
        self.do_nms_obj.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

        self.do_nms_sort = self.lib.do_nms_sort
        self.do_nms_sort.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

        self.free_image = self.lib.free_image
        self.free_image.argtypes = [IMAGE]

        self.letterbox_image = self.lib.letterbox_image
        self.letterbox_image.argtypes = [IMAGE, c_int, c_int]
        self.letterbox_image.restype = IMAGE

        self.load_meta = self.lib.get_metadata
        self.lib.get_metadata.argtypes = [c_char_p]
        self.lib.get_metadata.restype = METADATA

        self.load_image = self.lib.load_image_color
        self.load_image.argtypes = [c_char_p, c_int, c_int]
        self.load_image.restype = IMAGE

        self.rgbgr_image = self.lib.rgbgr_image
        self.rgbgr_image.argtypes = [IMAGE]

        self.predict_image = self.lib.network_predict_image
        self.predict_image.argtypes = [c_void_p, IMAGE]
        self.predict_image.restype = POINTER(c_float)

    def array_to_image(self, arr):
        # need to return old values to avoid python freeing memory
        arr = arr.transpose(2,0,1)
        c = arr.shape[0]
        h = arr.shape[1]
        w = arr.shape[2]
        arr = np.ascontiguousarray(arr.flat, dtype=np.float32) / 255.0
        data = arr.ctypes.data_as(POINTER(c_float))
        im = IMAGE(w,h,c,data)
        return im, arr

    def classify(self, net, meta, im):
        out = self.predict_image(net, im)
        res = []
        for i in range(meta.classes):
            if altNames is None:
                nameTag = meta.names[i]
            else:
                nameTag = altNames[i]
            res.append((nameTag, out[i]))
        res = sorted(res, key=lambda x: -x[1])
        return res

    def detect(self, net, meta, image, thresh=.5, hier_thresh=.5, nms=.45, debug= False):
        """
        Performs the meat of the detection
        """
        #pylint: disable= C0321
        im = self.load_image(image, 0, 0)
        if debug: print("Loaded image")
        # detect_image() frees im before returning, so it must not be freed again here.
        ret = self.detect_image(net, meta, im, thresh, hier_thresh, nms, debug)
        return ret

    def detect_image(self, net, meta, im, thresh=.5, hier_thresh=.5, nms=.45, debug= False):
        #import cv2
        #custom_image_bgr = cv2.imread(image) # use: detect(,,imagePath,)
        #custom_image = cv2.cvtColor(custom_image_bgr, cv2.COLOR_BGR2RGB)
        #custom_image = cv2.resize(custom_image,(lib.network_width(net), lib.network_height(net)), interpolation = cv2.INTER_LINEAR)
        #import scipy.misc
        #custom_image = scipy.misc.imread(image)
        #im, arr = array_to_image(custom_image)     # you should comment line below: free_image(im)
        num = c_int(0)
        if debug: print("Assigned num")
        pnum = pointer(num)
        if debug: print("Assigned pnum")
        self.predict_image(net, im)
        if debug: print("did prediction")
        #dets = get_network_boxes(net, custom_image_bgr.shape[1], custom_image_bgr.shape[0], thresh, hier_thresh, None, 0, pnum, 0) # OpenCV
        dets = self.get_network_boxes(net, im.w, im.h, thresh, hier_thresh, None, 0, pnum, 0)
        if debug: print("Got dets")
        num = pnum[0]
        if debug: print("got zeroth index of pnum")
        if nms:
            self.do_nms_sort(dets, num, meta.classes, nms)
        if debug: print("did sort")
        res = []
        if debug: print("about to range")
        for j in range(num):
            if debug: print("Ranging on "+str(j)+" of "+str(num))
            if debug: print("Classes: "+str(meta), meta.classes, meta.names)
            for i in range(meta.classes):
                if debug: print("Class-ranging on "+str(i)+" of "+str(meta.classes)+"= "+str(dets[j].prob[i]))
                if dets[j].prob[i] > 0:
                    b = dets[j].bbox
                    if altNames is None:
                        nameTag = meta.names[i]
                    else:
                        nameTag = altNames[i]
                    if debug:
                        print("Got bbox", b)
                        print(nameTag)
                        print(dets[j].prob[i])
                        print((b.x, b.y, b.w, b.h))
                    res.append((nameTag, dets[j].prob[i], (b.x, b.y, b.w, b.h)))
        if debug: print("did range")
        res = sorted(res, key=lambda x: -x[1])
        if debug: print("did sort")
        self.free_detections(dets, num)
        if debug: print("freed detections")
        self.free_image(im)
        return res

darknet = Darknet()

netMain = darknet.load_net_custom("./cfg/yolo-obj.cfg".encode("ascii"), "./backup/yolo-obj_last.weights".encode("ascii"), 0, 1)  # batch size = 1

metaMain = darknet.load_meta("./cfg/obj.data".encode("ascii"))

try:
    with open("./cfg/obj.data") as metaFH:  # same .data file that load_meta() reads above
        metaContents = metaFH.read()

        match = re.search("names *= *(.*)$", metaContents,
                        re.IGNORECASE | re.MULTILINE)
        if match:
            result = match.group(1)
        else:
            result = None
        try:
            if os.path.exists(result):
                with open(result) as namesFH:
                    namesList = namesFH.read().strip().split("\n")
                    altNames = [x.strip() for x in namesList]
        except TypeError:
            pass
except Exception:
    pass

import sys
from time import time
import cv2

stream = cv2.VideoCapture("video.mp4")
ret, frame = stream.read()
count = 0
t1 = time()
while ret:
    img = darknet.make_image(frame.shape[1], frame.shape[0], 3)
    img_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    darknet.copy_image_from_bytes(img, img_rgb.tobytes())
    detected = darknet.detect_image(netMain, metaMain, img, thresh=0.5)
    # print (detected)
    count += 1
    ret, frame = stream.read()
    if count%30 == 0:
        sys.stdout.write("\rframe_rate: {} frames/second".format(count/(time()-t1)))
        sys.stdout.flush()

Hope this helps

sanjaykhanal on 1 May 2019

👍5 ❤3 😄1

All 5 comments

I forgot to mention that darknet is built with
GPU=1
CUDNN=0
OPENCV=0
OPENMP=0
DEBUG=1

cliffco on 21 Mar 2018

I didn't have any issues with webcam input and darknet, but I'm also using a modified version, which might help with the memory leaks. I guess you can adapt your code using this example https://github.com/pjreddie/darknet/issues/289#issuecomment-342448358 and see if it fixes the memory leak issue.

It doesn't use load_image; it loads the image with OpenCV instead. But I guess you need to compile with OpenCV to get my example to work.

TheMikeyR on 23 Mar 2018

Many thanks MikeyR. I'll give it a go

cliffco on 25 Mar 2018

free_image(im) fixed it for me.