I can't run the code at the moment (issue #241), but it should be possible to take the result (the bounding boxes) and draw them with OpenCV's rectangle function.

Since I can't run the code, I can't confirm that detect() returns the bounding boxes. I'm assuming it does because of line 110 in darknet.py: res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))

So the solution I'd go for is to take these coordinates, feed them into that function, and draw the rectangles yourself. I'd provide some code, but since I can't run it, that doesn't make much sense. Once you've implemented a function that saves the image from the result of detect(), you could open a pull request so that other users can benefit from your code as well.

I hope this helps. When #241 gets solved I can help you more.

njoye on 11 Oct 2017

Thanks @njoye. I'll look into it and open a pull request if it works fine.

ishansan on 11 Oct 2017

I was able to predict the boxes, but it looks like the Python wrapper does not return the correct coordinates for the image. I have attached two samples, one from the original shell implementation and the other from the Python wrapper, drawn using the coordinates from line 110 in python/darknet.py:

res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))

predictions
screenshot from 2017-10-11 19-50-50

ishansan on 11 Oct 2017

That is weird. Could you provide the code for drawing the boxes? It's possible that you've switched the coordinates by accident. It has also happened to me very (like seriously, extremely) often that the coordinates were given in a different convention than the drawing function expected! For example, the coordinates would be given as x1, y1, x2, y2 measured from the upper-left corner, but the drawing function would count from the lower-left corner.

I think this is the case here as well, since the number of predictions (3 boxes) and the size of the boxes seem to be right. It just looks as if the boxes are a little off. My suggestion would be to look at the coordinates again and check whether they are passed to the drawing function correctly.
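
To illustrate the kind of mismatch described here, a small sketch, assuming a box given as pixel distances from the upper-left corner and a drawing routine that measures y from the lower-left corner. The function name and the example numbers are made up for illustration and are not taken from darknet.py.

```python
def flip_box_origin(x1, y1, x2, y2, image_height):
    """Convert a box measured from the upper-left corner of the image
    into the same box measured from the lower-left corner."""
    # x is unaffected; y is mirrored, so the old bottom edge (y2)
    # becomes the new lower y value.
    return x1, image_height - y2, x2, image_height - y1


# Hypothetical 100-pixel-tall image with a box near the top.
print(flip_box_origin(10, 20, 50, 40, image_height=100))
```

If the boxes in a plot look mirrored vertically rather than shifted, a conversion like this is the first thing to check.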

njoye on 11 Oct 2017

Yes sure @njoye .
Here is the code and the output:

res.append((meta.names[i], probs[j][i], boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h))

Name: bicycle Predict %: 0.853098213673 X: 341.839660645 Y: 285.840026855 W: 492.894165039 Z: 323.559906006

Name: dog Predict %: 0.823984861374 X: 226.710205078 Y: 376.563171387 W: 189.13192749 Z: 289.121612549

Name: truck Predict %: 0.635909080505 X: 574.128173828 Y: 126.135978699 W: 212.539764404 Z: 83.7097015381

screenshot from 2017-10-11 20-48-59

ishansan on 11 Oct 2017

@ishansan I also meant the code where you call opencv (or what you used for creating the rectangles) to create the rectangles. That information is crucial to my thought :)

njoye on 11 Oct 2017

@njoye Sure.

Here is the complete code:

def detect(net, meta, image, thresh=.5, hier_thresh=.5, nms=.45):
    im = load_image(image, 0, 0)
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0.25:
                res.append((meta.names[i], probs[j][i], boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h))
                # print meta.names[i],probs[j][i],boxes[j].x,boxes[j].y,boxes[j].w,boxes[j].h
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res

if __name__ == "__main__":
    #net = load_net("cfg/densenet201.cfg", "/home/pjreddie/trained/densenet201.weights", 0)
    #im = load_image("data/wolf.jpg", 0, 0)
    #meta = load_meta("cfg/imagenet1k.data")
    #r = classify(net, meta, im)
    #print r[:10]
    img = "../data/dog.jpg"
    net = load_net("../cfg/yolo.cfg", "../../yolo-weights/yolo.weights", 0)
    meta = load_meta("../cfg/coco.data")
    r = detect(net, meta, img)

    for k in range(len(r)):
        print "Name: ",r[k][0],"Predict %: ",r[k][1],"X: ",r[k][2],"Y: ",r[k][3],"W: ",r[k][4],"Z: ",r[k][5],'\n'

    #get detected image
    im2 = np.array(Image.open(img), dtype=np.uint8)
    fig,ax = plt.subplots(1)
    fig.set_size_inches(imgw,imgh)
    ax.imshow(im2)
    for k in range(len(r)):
        rect = patches.Rectangle((r[k][2],r[k][5]),r[k][4],r[k][5],linewidth=1,edgecolor='r',facecolor='none')
        ax.add_patch(rect)
    plt.show()
    plt.savefig('image.jpg')

ishansan on 11 Oct 2017

I don't have a good enough internet connection to post the code at the moment, but your problem is solved by calculating y = y - (height/2) and x = x - (width/2), because the function seems to return the center of the bounding box rather than its corner (no idea why). I'll post the code tomorrow if you need it.
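
For reference, a minimal sketch of that correction, assuming the flat (name, probability, x, y, w, h) tuples from the modified detect() above, with x and y being the box center in pixels. The helper name center_to_top_left is hypothetical and not part of darknet.py; the sample values are the bicycle detection printed earlier in the thread.

```python
def center_to_top_left(x, y, w, h):
    """Turn a center-based box into the corner-anchored form used by
    matplotlib.patches.Rectangle((x, y), width, height). On an imshow
    axis, where y grows downward, that anchor is the top-left corner."""
    return x - w / 2.0, y - h / 2.0, w, h


# Bicycle detection from the output above: center x, center y, width, height.
left, top, w, h = center_to_top_left(341.839660645, 285.840026855,
                                     492.894165039, 323.559906006)
print(left, top, w, h)
# With matplotlib, the patch would then be created as:
#   patches.Rectangle((left, top), w, h, linewidth=1, edgecolor='r', facecolor='none')
```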

njoye on 11 Oct 2017

Yes. It worked perfectly!

ishansan on 12 Oct 2017

@njoye @ishansan can anyone post the whole code? I didn't get where to change y=y-(height/2) and x=x-(width/2)
Thank you

Ankit09 on 7 Dec 2017

@Ankit09 That needs to be done as post-processing after you call r = detect(net, meta, img).
The output you get back is center x, center y, width and height.

OpenCV rectangle function wants top left and bottom right corner as an input https://docs.opencv.org/3.0-beta/modules/imgproc/doc/drawing_functions.html#rectangle

So you need to convert the output from darknet to "opencv-format"
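
The conversion can be sketched as follows, assuming r is the list returned by detect() with each entry shaped (name, probability, (x, y, w, h)), as in the original darknet.py line quoted earlier in the thread. The helper name to_corner_points and the sample detection are hypothetical.

```python
def to_corner_points(box):
    """Convert a darknet (center_x, center_y, width, height) box into the
    integer top-left and bottom-right points used by cv2.rectangle."""
    x, y, w, h = box
    top_left = (int(round(x - w / 2.0)), int(round(y - h / 2.0)))
    bottom_right = (int(round(x + w / 2.0)), int(round(y + h / 2.0)))
    return top_left, bottom_right


# Made-up detection in the (name, probability, box) shape.
detection = ("dog", 0.82, (200.0, 300.0, 100.0, 50.0))
pt1, pt2 = to_corner_points(detection[2])
print(pt1, pt2)
```

With OpenCV installed and the image loaded through cv2.imread, the two points would then be passed as cv2.rectangle(image, pt1, pt2, (0, 0, 255), 2), and the result written with cv2.imwrite.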

TheMikeyR on 7 Dec 2017

@TheMikeyR thanks for your reply, but I am not getting any X and Y values from detect().
How can I get the perfect square on my detected object?

Ankit09 on 7 Dec 2017

@Ankit09 I can't help you without knowing what you are trying to execute, your code, etc. If this is not related to the issue above ("saving an image using python wrapper"), feel free to open a new ticket with a description of your system, what you did, and your issue.

TheMikeyR on 7 Dec 2017

@TheMikeyR ,
Thanks for your time brother,
I am getting the prediction and rectangle shown on the left side of the image. The expected output is on the right side.
How can I get the correct x, y coordinates and draw my expected output?
screen shot 2017-12-07 at 7 09 02 pm

Ankit09 on 7 Dec 2017

@Ankit09 you need to post your code where you draw the rectangles and how you process the output from the detection step.

TheMikeyR on 7 Dec 2017

@TheMikeyR
this is my whole code:

from ctypes import *
import math
import random
import cv2
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.patches as patches

def sample(probs):
    s = sum(probs)
    probs = [a/s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs)-1

def c_array(ctype, values):
    return (ctype * len(values))(*values)

class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]

class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]

class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]

lib = CDLL("/media/psf/Home/Desktop/Car-Project/5/ankit/libdarknet.so", RTLD_GLOBAL)
lib.network_width.argtypes = [c_void_p]
lib.network_width.restype = c_int
lib.network_height.argtypes = [c_void_p]
lib.network_height.restype = c_int

predict = lib.network_predict
predict.argtypes = [c_void_p, POINTER(c_float)]
predict.restype = POINTER(c_float)

set_gpu = lib.cuda_set_device
set_gpu.argtypes = [c_int]

make_image = lib.make_image
make_image.argtypes = [c_int, c_int, c_int]
make_image.restype = IMAGE

make_boxes = lib.make_boxes
make_boxes.argtypes = [c_void_p]
make_boxes.restype = POINTER(BOX)

free_ptrs = lib.free_ptrs
free_ptrs.argtypes = [POINTER(c_void_p), c_int]

num_boxes = lib.num_boxes
num_boxes.argtypes = [c_void_p]
num_boxes.restype = c_int

make_probs = lib.make_probs
make_probs.argtypes = [c_void_p]
make_probs.restype = POINTER(POINTER(c_float))

detect = lib.network_predict
detect.argtypes = [c_void_p, IMAGE, c_float, c_float, c_float, POINTER(BOX), POINTER(POINTER(c_float))]

reset_rnn = lib.reset_rnn
reset_rnn.argtypes = [c_void_p]

load_net = lib.load_network
load_net.argtypes = [c_char_p, c_char_p, c_int]
load_net.restype = c_void_p

free_image = lib.free_image
free_image.argtypes = [IMAGE]

letterbox_image = lib.letterbox_image
letterbox_image.argtypes = [IMAGE, c_int, c_int]
letterbox_image.restype = IMAGE

load_meta = lib.get_metadata
lib.get_metadata.argtypes = [c_char_p]
lib.get_metadata.restype = METADATA

load_image = lib.load_image_color
load_image.argtypes = [c_char_p, c_int, c_int]
load_image.restype = IMAGE

rgbgr_image = lib.rgbgr_image
rgbgr_image.argtypes = [IMAGE]

predict_image = lib.network_predict_image
predict_image.argtypes = [c_void_p, IMAGE]
predict_image.restype = POINTER(c_float)

network_detect = lib.network_detect
network_detect.argtypes = [c_void_p, IMAGE, c_float, c_float, c_float, POINTER(BOX), POINTER(POINTER(c_float))]

def classify(net, meta, im):
    out = predict_image(net, im)
    res = []
    for i in range(meta.classes):
        res.append((meta.names[i], out[i]))
    res = sorted(res, key=lambda x: -x[1])
    return res

def detect(net, meta, image, thresh=.5, hier_thresh=.5, nms=.45):
    im = load_image(image, 0, 0)
    #cv2.imshow('frame',im)
    boxes = make_boxes(net)
    probs = make_probs(net)
    num = num_boxes(net)
    network_detect(net, im, thresh, hier_thresh, nms, boxes, probs)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if probs[j][i] > 0:
                res.append((meta.names[i], probs[j][i], (boxes[j].x, boxes[j].y, boxes[j].w, boxes[j].h)))
                print (meta.names[i])
                print ("X1 is: %f and Y1 is: %f" % (boxes[j].x, boxes[j].y))
                print ("X2 is: %f and Y2 is: %f" % (boxes[j].w, boxes[j].h))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_ptrs(cast(probs, POINTER(c_void_p)), num)
    return res

if __name__ == "__main__":

    #net = load_net(b"/media/psf/Home/Desktop/Car-Project/5/ankit/cfg/yolo.cfg", b"/media/psf/Home/Desktop/Car-Project/5/ankit/yolo.weights", 0)
    #meta = load_meta(b"/media/psf/Home/Desktop/Car-Project/5/ankit/cfg/coco.data")
    #r = detect(net, meta, b"/media/psf/Home/Desktop/Car-Project/5/ankit/data/i.jpg")
    img = "/media/psf/Home/Desktop/Car-Project/5/ankit/data/dog.jpg"
    net = load_net("/media/psf/Home/Desktop/Car-Project/5/ankit/cfg/yolo.cfg", "/media/psf/Home/Desktop/Car-Project/5/ankit/hello.weights", 0)
    meta = load_meta("/media/psf/Home/Desktop/Car-Project/5/ankit/cfg/coco.data")
    r = detect(net, meta, img)
    #print len(r)
    #print r
    #print r[0][0]
    for k in range(len(r)):
        #print k
        #print "a"
        print "Name: ",r[k][0],"Predict %: ",r[k][1],"X: ",r[k][2][0],"Y: ",r[k][2][1],"W: ",r[k][2][2],"Z: ",r[k][2][3],'\n'

    #get detected image
    im2 = np.array(Image.open(img), dtype=np.uint8)
    fig,ax = plt.subplots(1)
    #fig.set_size_inches(imgw,imgh)
    ax.imshow(im2)
    for k in range(len(r)):
        print r[k][2][0]
        print r[k][2][3]
        rect = patches.Rectangle((r[k][2][0],r[k][2][3]),r[k][2][2],r[k][2][3],linewidth=1,edgecolor='r',facecolor='none')
        # rect = patches.Rectangle((r[k][2],r[k][5]),r[k][4],r[k][5],linewidth=1,edgecolor='r',facecolor='none')
        ax.add_patch(rect)
    plt.show()
    plt.savefig('image.jpg')
    #print (r[:10])
    #print (r[0])

Ankit09 on 7 Dec 2017

Can you format it all using ``` at the top and bottom of the code to make one snippet?

TheMikeyR on 7 Dec 2017

@TheMikeyR sure brother,

'''
from ctypes import *
import math
import random
import cv2
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.patches as patches