Hello guys, I started following this project a long time ago and successfully trained a model on my own dataset. Right now, I am trying to run YOLO on the Snapdragon Neural Processing Engine (SNPE). SNPE doesn't support the "detection" layer, so I have to implement it myself. I read the YOLOv2 paper and understand the concept of how the "detection" layer works, but when I dug into the code in the yolo src folder, I could not find where the trained model's output is turned into bounding boxes (from the anchor boxes and their offsets) and a confidence score for each detected object.

Take a look at this example: at layer 14 the output is 20x20x30. So my question is, what exactly is done with this 20x20x30 tensor to make predictions? Please help me clarify this. Thanks!
@phongnhhn92 Hi,
Region_layer consists of:
flatten - in the OpenCV version it is implemented as a separate Permute layer: https://github.com/AlexeyAB/darknet/blob/master/src/region_layer.c#L150
region_layer - here is the following:
For example, if:
Then output tensor for yolo-voc.cfg = 3 x 3 x 40
Where each cell of these 40 values is calculated as:
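As a rough sketch of what such a per-cell computation can look like, here is the box decoding from the YOLOv2 paper written in plain Python. It is not copied from region_layer.c. The layout `[tx, ty, tw, th, t0, class scores...]` per anchor, the anchor sizes being in grid-cell units, and the names `decode_anchor`, `sigmoid`, and `softmax` are all assumptions made for illustration. With 5 anchors and 3 classes this gives 5 x (5 + 3) = 40 values per cell, which would be consistent with the size mentioned above.

```python
import math

def sigmoid(x):
    # Logistic activation: squashes any real value into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def softmax(values):
    # Numerically stable softmax over the raw class scores.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def decode_anchor(raw, col, row, anchor_w, anchor_h, grid_w, grid_h):
    """Decode one anchor of one grid cell (hypothetical helper).

    raw is assumed to be [tx, ty, tw, th, t0, class scores...].
    Returns the box centre and size normalised to [0, 1] of the image,
    the objectness, and the class probabilities.
    """
    tx, ty, tw, th, t0 = raw[:5]
    x = (col + sigmoid(tx)) / grid_w       # centre x, offset inside the cell
    y = (row + sigmoid(ty)) / grid_h       # centre y, offset inside the cell
    w = anchor_w * math.exp(tw) / grid_w   # width scaled from the anchor prior
    h = anchor_h * math.exp(th) / grid_h   # height scaled from the anchor prior
    objectness = sigmoid(t0)
    class_probs = softmax(raw[5:])
    return x, y, w, h, objectness, class_probs

# Example: centre cell of a 3 x 3 grid, all raw values zero, 3 classes.
box = decode_anchor([0.0] * 8, col=1, row=1,
                    anchor_w=1.0, anchor_h=2.0, grid_w=3, grid_h=3)
```

With all raw values at zero, the decoded centre sits in the middle of the cell and the box has exactly the anchor's size, which makes the role of each term easy to check by hand.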


Thanks for your reply!
Using the latest version of OpenCV, I ran yolo_object_detection.cpp as you suggested, and I can see how OpenCV returns the detected bounding boxes. In my case, I am using tiny-yolo trained on 1 class; the input is 416x416 and the output is 13x13x30. Using the code above, the returned detectionMat has 6 columns and 845 rows, which matches 13x13x30 (845 x 6 = 5070 = 13 x 13 x 30).
My question is about the loop that finds the rows whose prob_obj is higher than the threshold (0.5, for example): what is "in" in "logistic_activate(in)", and how does the logistic activation relate to this?

@phongnhhn92 Hi,
I updated the image in my previous post. Yes, I made a typo; the correct value is SINGLE CELL SIZE = 40.
So if your input is 416x416 and the output is 13x13x40, then the output in OpenCV-Darknet will be 845x8 (where 845 = 13x13x5 = output_width x output_height x anchors).
Then we iterate over each row and find the maximum probability for each ANCHOR in each CELL, where final_probability_obj_X = t0 * prob_obj_X. If this final_probability_obj_MAX > threshold, then there is an object.

If threshold < MAX(t0*prob_obj_1, t0*prob_obj_2, t0*prob_obj_3), then there is an object.
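The rule above can be sketched in a few lines of Python. This is only an illustration, assuming each row is already activated and laid out as `[x, y, w, h, t0, prob_obj_1, ..., prob_obj_N]` (8 columns for 3 classes). The function name `find_detections` and the sample values are hypothetical and not taken from OpenCV or darknet.

```python
def find_detections(rows, threshold=0.5):
    """Keep rows whose best class score, scaled by t0, exceeds the threshold.

    rows: list of [x, y, w, h, t0, prob_obj_1, ..., prob_obj_N],
          one row per (cell, anchor) pair, e.g. 13 * 13 * 5 = 845 rows.
    """
    detections = []
    for row in rows:
        x, y, w, h, t0 = row[:5]
        class_probs = row[5:]
        # Index of the class with the highest probability in this row.
        best_class = max(range(len(class_probs)), key=lambda i: class_probs[i])
        # final_probability_obj_MAX = t0 * prob_obj_MAX
        final_prob = t0 * class_probs[best_class]
        if final_prob > threshold:
            detections.append((x, y, w, h, best_class, final_prob))
    return detections

# Two made-up rows with 3 classes each (8 columns per row).
sample_rows = [
    [0.5, 0.5, 0.2, 0.2, 0.9, 0.1, 0.8, 0.1],    # 0.9 * 0.8 = 0.72, kept
    [0.1, 0.1, 0.3, 0.3, 0.3, 0.9, 0.05, 0.05],  # 0.3 * 0.9 = 0.27, dropped
]
found = find_detections(sample_rows, threshold=0.5)
```

A real pipeline would usually follow this with non-maximum suppression to merge overlapping boxes, which is left out here.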
Hi @AlexeyAB,
I need to modify the region layer, too.
In the get_region_detections function of the original darknet (pjreddie/darknet), I cannot find where the logistic activation is executed.
Could you please tell me where it is executed?