Javacv: sample for yolo3(darknet)

Created on 15 Jul 2019  ·  42 Comments  ·  Source: bytedeco/javacv

Is there an example for YOLOv3 (Darknet)?

enhancement

All 42 comments

Should be pretty much the same as this:
https://docs.opencv.org/4.1.0/da/d9d/tutorial_dnn_yolo.html
Let me know if you need any help with that.
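
For readers following the linked tutorial, here is a minimal plain-Java sketch of the per-row post-processing that its C++ sample performs. It assumes each output row is laid out as [centerX, centerY, width, height, objectness, class scores...] with coordinates normalized to [0, 1] and at least one class column. The names YoloRowDecoder, argMaxClass, and toTopLeftBox are hypothetical and not part of JavaCV or OpenCV.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of JavaCV or OpenCV. */
final class YoloRowDecoder {
    // Columns 0..4 hold centerX, centerY, width, height, objectness.
    static final int PREFIX = 5;

    /** Returns the 0-based class id with the highest score in the row. */
    static int argMaxClass(float[] row) {
        int best = 0;
        for (int c = 1; c < row.length - PREFIX; c++) {
            if (row[PREFIX + c] > row[PREFIX + best]) {
                best = c;
            }
        }
        return best;
    }

    /** Converts normalized center/size to a pixel box {left, top, width, height}. */
    static double[] toTopLeftBox(float[] row, int imageWidth, int imageHeight) {
        double centerX = row[0] * imageWidth;
        double centerY = row[1] * imageHeight;
        double width = row[2] * imageWidth;
        double height = row[3] * imageHeight;
        // Rectangles are addressed by their top-left corner, so shift by half the size.
        return new double[] {centerX - width / 2, centerY - height / 2, width, height};
    }

    public static void main(String[] args) {
        float[] row = {0.5f, 0.5f, 0.25f, 0.5f, 0.9f, 0.1f, 0.7f, 0.2f};
        System.out.println("class id: " + argMaxClass(row));
        System.out.println("box: " + Arrays.toString(toTopLeftBox(row, 400, 200)));
    }
}
```

The shift by half the width and height mirrors what the C++ sample does before building its rectangles, because the network reports box centers while OpenCV rectangles start at the top-left corner.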

Now I have found a strange problem with "opencv_dnn.NMSBoxes(boxes, con, threshold, nmsThreshold, indices);".
If boxes.size() > 1, e.g. 4, only the first two values of "indices" are replaced; the remaining two values are left unchanged!

Do you have the rest of the code?

import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.opencv.global.opencv_core;
import org.bytedeco.opencv.global.opencv_dnn;
import org.bytedeco.tesseract.*;
import org.bytedeco.tesseract.global.tesseract;


import org.bytedeco.opencv.opencv_dnn.*;
import org.bytedeco.opencv.opencv_core.*;
import org.bytedeco.opencv.opencv_imgproc.*;
import org.bytedeco.javacpp.indexer.*;

import static org.bytedeco.leptonica.global.lept.*;
import static org.bytedeco.tesseract.global.tesseract.*;
import static org.bytedeco.opencv.global.opencv_core.*;
import static org.bytedeco.opencv.global.opencv_dnn.*;
import static org.bytedeco.opencv.global.opencv_imgproc.*;
import static org.bytedeco.opencv.global.opencv_imgcodecs.*;

import org.apache.commons.lang3.time.StopWatch;
import java.util.*;
import java.util.concurrent.*;
import java.io.*;
/**
 * Created by Administrator on 2019/7/9.
 */
public class BasicExample {

    public void Yolo3(){
        Mat img = imread("E:\\CS_OpenCvSharpYolo3\\5troadnd.jpg");

        //setting blob, size can be:320/416/608
        //opencv blob setting can check here https://github.com/opencv/opencv/tree/master/samples/dnn#object-detection
        Mat blob = opencv_dnn.blobFromImage(img, 1.0 / 255, new Size(416, 416), new Scalar(), true, false,CV_32F);

        //load model and config, if you got error: "separator_index < line.size()", check your cfg file, must be something wrong.
        String cfg = "E:\\CS_OpenCvSharpYolo3\\models\\yolov3.cfg";
        String model = "E:\\CS_OpenCvSharpYolo3\\models\\yolov3.weights";
        Net net = opencv_dnn.readNetFromDarknet(cfg, model);
        //set preferable
        net.setPreferableBackend(0);
            /*
            0:DNN_BACKEND_DEFAULT
            1:DNN_BACKEND_HALIDE
            2:DNN_BACKEND_INFERENCE_ENGINE
            3:DNN_BACKEND_OPENCV
             */
        net.setPreferableTarget(0);
            /*
            0:DNN_TARGET_CPU
            1:DNN_TARGET_OPENCL
            2:DNN_TARGET_OPENCL_FP16
            3:DNN_TARGET_MYRIAD
            4:DNN_TARGET_FPGA
             */

        //input data
        net.setInput(blob);

        //get output layer name
        StringVector outNames = net.getUnconnectedOutLayersNames();
        //create mats for output layer
        //MatVector outs = outNames.Select(_ => new Mat()).ToArray();

        MatVector outs = new MatVector();
        for(int i=0;i<outNames.size();i++){
            outs.put(new Mat());
        }

        //forward model
        StopWatch  sw = StopWatch.createStarted();
        net.forward(outs, outNames);
        sw.stop();
        System.out.println("over:" + sw.getTime(TimeUnit.MILLISECONDS) + "ms");

        //get result from all output
        float threshold = 0.5f;       //for confidence
        float nmsThreshold = 0.3f;    //threshold for nms
        GetResult(outs, img, threshold, nmsThreshold,true);
    }

    private void GetResult(MatVector output, Mat image, float threshold, float nmsThreshold, boolean nms)
    {
        nms = true;
        //for nms
        ArrayList<Integer> classIds = new ArrayList<>();
        ArrayList<Float> confidences = new ArrayList<>();
        ArrayList<Float> probabilities = new ArrayList<>();
        Rect2dVector boxes = new Rect2dVector();
        try{
            int w = image.cols();
            int h = image.rows();
            /*
             YOLO3 COCO trainval output
             0 1 : center                    2 3 : w/h
             4 : confidence                  5 ~ 84 : class probability
            */
            int prefix = 5;   //skip 0~4
/**/
            int indiceNum = 0;
            long boxNum = 0;
            for(int k=0;k<output.size();k++)
            {
                Mat prob = output.get(k);
                final FloatRawIndexer probIdx = prob.createIndexer();
                for (int i = 0; i < probIdx.rows(); i++)
                {
                    float confidence = probIdx.get(i, 4);
                    if (confidence > threshold)
                    {
                        //get classes probability
                        indiceNum++;
                        DoublePointer minVal= new DoublePointer();
                        DoublePointer maxVal= new DoublePointer();
                        Point min = new Point();
                        Point max = new Point();
                        minMaxLoc(prob.rows(i).colRange(prefix, prob.cols()), minVal, maxVal, min, max, null);
                        int classes = max.x();
                        float probability = probIdx.get(i, classes + prefix);

                        if (probability > threshold) //more accuracy, you can cancel it
                        {
                            //get center and width/height
                            float centerX = probIdx.get(i, 0) * w;
                            float centerY = probIdx.get(i, 1) * h;
                            float width = probIdx.get(i, 2) * w;
                            float height = probIdx.get(i, 3) * h;

                            if (!nms)
                            {
                                // draw result (if don't use NMSBoxes)
                                //Draw(image, classes, confidence, probability, centerX, centerY, width, height);
                                continue;
                            }

                            //put data to list for NMSBoxes
                            boxNum++;
                            classIds.add(classes);
                            confidences.add(confidence);
                            probabilities.add(probability);
                            boxes.put(new Rect2d(centerX, centerY, width, height));
                            boxes.resize(boxNum);
                        }
                    }
                }
            }

            if (!nms) return;

            //using non-maximum suppression to reduce overlapping low confidence box
            int[] indices = new int[]{8,8,8,8,8,8,8,8};
            //int[] indices = new int[]{confidences.size()};
            float[] con = new float[confidences.size()];
            for(int i=0;i<confidences.size();i++){
                con[i] = confidences.get(i);
            }
            opencv_dnn.NMSBoxes(boxes, con, threshold, nmsThreshold, indices); //strange problem here

            List<String> list = new ArrayList<String>();
            FileInputStream fis = new FileInputStream("E:\\CS_OpenCvSharpYolo3\\models\\coco.names");

            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;
            while ((line = br.readLine()) != null) {
                list.add(line);
            }
            String[] Labels = list.toArray(new String[list.size()]);
            br.close();
            isr.close();
            fis.close();
            //Console.WriteLine($"NMSBoxes drop {confidences.Count - indices.Length} overlapping result.");

            //for (int i : indices)
            for (int i=0;i<boxNum;i++)
            {
                Rect2d box = boxes.get(i);
                //]]Draw(image, classIds[i], confidences[i], probabilities[i], box.x(), box.y(), box.width(), box.height());
                String res = "name="+Labels[classIds.get(i)]+" classIds="+classIds.get(i)+" confidences="+confidences.get(i)+" probabilities="+probabilities.get(i);
                res += " box.x="+box.x() + " box.y="+box.y() + " box.width="+box.width() + " box.height="+box.height();
                System.out.println(res);
            }
        }catch(Exception e){
            System.out.println("GetResult error:" + e.getMessage());
        }


    }

    public static void main(String[] args) {
        BasicExample be =  new BasicExample();
        be.Yolo3();
    }
}

indices looks like an output vector to me:
https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.cpp#L400
We probably shouldn't be initializing it with anything else than new IntPointer().
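
To illustrate why indices behaves as an output whose length is only known after the call, here is a plain-Java sketch of greedy non-maximum suppression, the general technique behind NMSBoxes. It is an illustration under stated assumptions, not OpenCV's implementation: boxes are {left, top, width, height} arrays, and the names GreedyNms, iou, and nms are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of greedy NMS; not OpenCV's actual code. */
final class GreedyNms {
    /** Intersection-over-union of two boxes given as {left, top, width, height}. */
    static double iou(double[] a, double[] b) {
        double x1 = Math.max(a[0], b[0]);
        double y1 = Math.max(a[1], b[1]);
        double x2 = Math.min(a[0] + a[2], b[0] + b[2]);
        double y2 = Math.min(a[1] + a[3], b[1] + b[3]);
        double inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
        double union = a[2] * a[3] + b[2] * b[3] - inter;
        return union > 0 ? inter / union : 0;
    }

    /** Returns the indices of the boxes that survive suppression, best score first. */
    static List<Integer> nms(List<double[]> boxes, List<Float> scores,
                             float scoreThreshold, float nmsThreshold) {
        // Keep only candidates above the score threshold, sorted by descending score.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < boxes.size(); i++) {
            if (scores.get(i) > scoreThreshold) {
                order.add(i);
            }
        }
        order.sort((p, q) -> Float.compare(scores.get(q), scores.get(p)));

        // Accept a candidate only if it does not overlap too much with an accepted box.
        List<Integer> kept = new ArrayList<>();
        for (int candidate : order) {
            boolean keep = true;
            for (int k : kept) {
                if (iou(boxes.get(candidate), boxes.get(k)) > nmsThreshold) {
                    keep = false;
                    break;
                }
            }
            if (keep) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

The returned list can be shorter than the input, which is why any pre-filled values beyond the number of kept indices are never overwritten by the call.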

I replaced int[] with IntPointer(). Unfortunately, the problem is still there, and some values in the IntPointer look like 32768 or -12876.

Can you update the sample code above?

Ah, here's the issue:

boxes.put(new Rect2d(centerX, centerY, width, height));
boxes.resize(boxNum);

That won't work, you're only setting the first box. Just use push_back() like in C++ for simplicity.

import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.opencv.global.opencv_core;
import org.bytedeco.opencv.global.opencv_dnn;
import org.bytedeco.tesseract.*;
import org.bytedeco.tesseract.global.tesseract;


import org.bytedeco.opencv.opencv_dnn.*;
import org.bytedeco.opencv.opencv_core.*;
import org.bytedeco.opencv.opencv_imgproc.*;
import org.bytedeco.javacpp.indexer.*;

import static org.bytedeco.leptonica.global.lept.*;
import static org.bytedeco.tesseract.global.tesseract.*;
import static org.bytedeco.opencv.global.opencv_core.*;
import static org.bytedeco.opencv.global.opencv_dnn.*;
import static org.bytedeco.opencv.global.opencv_imgproc.*;
import static org.bytedeco.opencv.global.opencv_imgcodecs.*;

import org.apache.commons.lang3.time.StopWatch;
import java.util.*;
import java.util.concurrent.*;
import java.io.*;
/**
 * Created by Administrator on 2019/7/9.
 */
public class BasicExample {



    public void Yolo3(){
        Mat img = imread("E:\\CS_OpenCvSharpYolo3\\5troadnd.jpg");

        //setting blob, size can be:320/416/608
        //opencv blob setting can check here https://github.com/opencv/opencv/tree/master/samples/dnn#object-detection
        Mat blob = opencv_dnn.blobFromImage(img, 1.0 / 255, new Size(416, 416), new Scalar(), true, false,CV_32F);
        //Mat blob = opencv_dnn.blobFromImage(img, 1.0, new Size(608, 608), new Scalar(), true, false,CV_8U);

        //load model and config, if you got error: "separator_index < line.size()", check your cfg file, must be something wrong.
        String cfg = "E:\\CS_OpenCvSharpYolo3\\models\\yolov3.cfg";
        String model = "E:\\CS_OpenCvSharpYolo3\\models\\yolov3.weights";
        Net net = opencv_dnn.readNetFromDarknet(cfg, model);
        //set preferable
        net.setPreferableBackend(3);
            /*
            0:DNN_BACKEND_DEFAULT
            1:DNN_BACKEND_HALIDE
            2:DNN_BACKEND_INFERENCE_ENGINE
            3:DNN_BACKEND_OPENCV
             */
        net.setPreferableTarget(0);
            /*
            0:DNN_TARGET_CPU
            1:DNN_TARGET_OPENCL
            2:DNN_TARGET_OPENCL_FP16
            3:DNN_TARGET_MYRIAD
            4:DNN_TARGET_FPGA
             */

        //input data
        net.setInput(blob);

        //get output layer name
        StringVector outNames = net.getUnconnectedOutLayersNames();
        //create mats for output layer
        //MatVector outs = outNames.Select(_ => new Mat()).ToArray();

        MatVector outs = new MatVector();
        for(int i=0;i<outNames.size();i++){
            outs.put(new Mat());
        }

        //forward model
        StopWatch  sw = StopWatch.createStarted();
        net.forward(outs, outNames);
        sw.stop();

        //get result from all output
        float threshold = 0.5f;       //for confidence
        float nmsThreshold = 0.3f;    //threshold for nms
        GetResult(outs, img, threshold, nmsThreshold,true);
    }

    private void GetResult(MatVector output, Mat image, float threshold, float nmsThreshold, boolean nms)
    {
        nms = true;
        //for nms
        ArrayList<Integer> classIds = new ArrayList<>();
        ArrayList<Float> confidences = new ArrayList<>();
        ArrayList<Float> probabilities = new ArrayList<>();
        ArrayList<Rect2d> rect2ds = new ArrayList<>();
        //Rect2dVector boxes = new Rect2dVector();
        try{
            int w = image.cols();
            int h = image.rows();
            /*
             YOLO3 COCO trainval output
             0 1 : center                    2 3 : w/h
             4 : confidence                  5 ~ 84 : class probability
            */
            int prefix = 5;   //skip 0~4
/**/
            int indiceNum = 0;
            long boxNum = 0;
            for(int k=0;k<output.size();k++)
            {
                Mat prob = output.get(k);
                final FloatRawIndexer probIdx = prob.createIndexer();
                for (int i = 0; i < probIdx.rows(); i++)
                {
                    float confidence = probIdx.get(i, 4);
                    if (confidence > threshold)
                    {
                        //get classes probability
                        //Cv2.MinMaxLoc(prob.Row[i].ColRange(prefix, prob.Cols), out _, out Point max);
                        //Point min;
                        //Point max;
                        //minMaxLoc(prob.rows(i).colRange(prefix, prob.cols()), null,null, min, max,null);
                        indiceNum++;
                        DoublePointer minVal= new DoublePointer();
                        DoublePointer maxVal= new DoublePointer();
                        Point min = new Point();
                        Point max = new Point();
                        minMaxLoc(prob.rows(i).colRange(prefix, prob.cols()), minVal, maxVal, min, max, null);
                        int classes = max.x();
                        float probability = probIdx.get(i, classes + prefix);

                        if (probability > threshold) //more accuracy, you can cancel it
                        {
                            //get center and width/height
                            float centerX = probIdx.get(i, 0) * w;
                            float centerY = probIdx.get(i, 1) * h;
                            float width = probIdx.get(i, 2) * w;
                            float height = probIdx.get(i, 3) * h;

                            if (!nms)
                            {
                                // draw result (if don't use NMSBoxes)
                                //Draw(image, classes, confidence, probability, centerX, centerY, width, height);
                                continue;
                            }

                            //put data to list for NMSBoxes
                            boxNum++;
                            classIds.add(classes);
                            confidences.add(confidence);
                            probabilities.add(probability);
                            rect2ds.add(new Rect2d(centerX, centerY, width, height));
                            //boxes.put(new Rect2d(centerX, centerY, width, height));
                            //boxes.resize(boxNum);
                        }
                    }
                }
            }

            if (!nms) return;

            //using non-maximum suppression to reduce overlapping low confidence box
            //CvDnn.NMSBoxes(boxes, confidences, threshold, nmsThreshold, out int[] indices);
            //int[] indices = new int[]{8,8,8,8,8,8,8,8};
            IntPointer indices = new IntPointer(confidences.size());

            Rect2dVector boxes = new Rect2dVector();
            for(int i=0;i<rect2ds.size();i++){
                boxes.push_back(rect2ds.get(i));
            }

            FloatPointer con = new FloatPointer(confidences.size());
            for(int i=0;i<confidences.size();i++){
                con.put(confidences.get(i));
            }
            opencv_dnn.NMSBoxes(boxes, con, threshold, nmsThreshold, indices); 

            List<String> list = new ArrayList<String>();
            FileInputStream fis = new FileInputStream("E:\\CS_OpenCvSharpYolo3\\models\\coco.names");

            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;
            while ((line = br.readLine()) != null) {
                list.add(line);
            }
            String[] Labels = list.toArray(new String[list.size()]);
            br.close();
            isr.close();
            fis.close();
            //Console.WriteLine($"NMSBoxes drop {confidences.Count - indices.Length} overlapping result.");

            for (int m=0;m<indices.sizeof();m++)
            {
                int i = indices.get(m);
                System.out.println(i);
                Rect2d box = boxes.get(i);
                //]]Draw(image, classIds[i], confidences[i], probabilities[i], box.x(), box.y(), box.width(), box.height());
                String res = "name="+Labels[classIds.get(i)]+" classIds="+classIds.get(i)+" confidences="+confidences.get(i)+" probabilities="+probabilities.get(i);
                res += " box.x="+box.x() + " box.y="+box.y() + " box.width="+box.width() + " box.height="+box.height();
                System.out.println(res);
            }
        }catch(Exception e){
            System.out.println("GetResult error:" + e.getMessage());
        }


    }

    public static void main(String[] args) {
        BasicExample be =  new BasicExample();
        be.Yolo3();
    }
}

Once you're done with the code, please consider adding it to the samples directory and send a pull request. Thanks a lot for your time on this!

con.put(confidences.get(i));

This won't work either. You're setting only the first element.

So, can you tell me how to add values to a FloatPointer?

We can use the array or buffer constructor, update the position() before calling that put() method, or use one of the other put() methods:
http://bytedeco.org/javacpp/apidocs/org/bytedeco/javacpp/FloatPointer.html
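
As a sketch of the array route mentioned above: the plain-Java part is copying the List<Float> of confidences into a float[], which can then be handed to the array constructor or an array-taking put() from the linked FloatPointer documentation. The names Confidences and toFloatArray are hypothetical, and the JavaCPP calls appear only in comments so that the snippet runs without native libraries.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration only. */
final class Confidences {
    /** Copies boxed Float values into a primitive float[] in the same order. */
    static float[] toFloatArray(List<Float> values) {
        float[] out = new float[values.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = values.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        float[] cons = toFloatArray(Arrays.asList(0.9f, 0.8f, 0.7f));
        // With JavaCPP on the classpath, either of these fills every element:
        //   FloatPointer con = new FloatPointer(cons);
        //   FloatPointer con = new FloatPointer(cons.length); con.put(cons);
        System.out.println(Arrays.toString(cons));
    }
}
```

Filling the pointer from a whole array avoids the single-element put(float) call, which writes only at the pointer's current position.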

At last, it works. Thank you!

Awesome! Please create a pull request with it :)

Sorry, I don't know how to create a pull request. I have pasted the correct code below; please create a pull request to add it to the samples. Thanks a lot!

import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.opencv.global.opencv_core;
import org.bytedeco.opencv.global.opencv_dnn;

import org.bytedeco.opencv.opencv_dnn.*;
import org.bytedeco.opencv.opencv_core.*;
import org.bytedeco.opencv.opencv_imgproc.*;
import org.bytedeco.javacpp.indexer.*;

import static org.bytedeco.leptonica.global.lept.*;
import static org.bytedeco.opencv.global.opencv_core.*;
import static org.bytedeco.opencv.global.opencv_dnn.*;
import static org.bytedeco.opencv.global.opencv_imgproc.*;
import static org.bytedeco.opencv.global.opencv_imgcodecs.*;

import org.apache.commons.lang3.time.StopWatch;
import java.util.*;
import java.util.concurrent.*;
import java.io.*;
/**
 * Created by Jacky on 2019/7/9.
 */
public class Yolo3Example {

    public void Yolo3(){
        Mat img = imread("E:\\CS_OpenCvSharpYolo3\\5troadnd.jpg");

        //setting blob, size can be:320/416/608
        //opencv blob setting can check here https://github.com/opencv/opencv/tree/master/samples/dnn#object-detection
        Mat blob = opencv_dnn.blobFromImage(img, 1.0 / 255, new Size(416, 416), new Scalar(), true, false,CV_32F);
        //Mat blob = opencv_dnn.blobFromImage(img, 1.0, new Size(608, 608), new Scalar(), true, false,CV_8U);

        //load model and config, if you got error: "separator_index < line.size()", check your cfg file, must be something wrong.
        String cfg = "E:\\CS_OpenCvSharpYolo3\\models\\yolov3.cfg";
        String model = "E:\\CS_OpenCvSharpYolo3\\models\\yolov3.weights";
        Net net = opencv_dnn.readNetFromDarknet(cfg, model);
        //set preferable
        net.setPreferableBackend(3);
            /*
            0:DNN_BACKEND_DEFAULT
            1:DNN_BACKEND_HALIDE
            2:DNN_BACKEND_INFERENCE_ENGINE
            3:DNN_BACKEND_OPENCV
             */
        net.setPreferableTarget(0);
            /*
            0:DNN_TARGET_CPU
            1:DNN_TARGET_OPENCL
            2:DNN_TARGET_OPENCL_FP16
            3:DNN_TARGET_MYRIAD
            4:DNN_TARGET_FPGA
             */

        //input data
        net.setInput(blob);

        //get output layer name
        StringVector outNames = net.getUnconnectedOutLayersNames();
        //create mats for output layer
        //MatVector outs = outNames.Select(_ => new Mat()).ToArray();

        MatVector outs = new MatVector();
        for(int i=0;i<outNames.size();i++){
            outs.put(new Mat());
        }

        //forward model
        StopWatch  sw = StopWatch.createStarted();
        net.forward(outs, outNames);
        sw.stop();
        System.out.println("over,takes:" + sw.getTime(TimeUnit.MILLISECONDS) + "ms");

        //get result from all output
        float threshold = 0.5f;       //for confidence
        float nmsThreshold = 0.3f;    //threshold for nms
        GetResult(outs, img, threshold, nmsThreshold,true);
    }

    private void GetResult(MatVector output, Mat image, float threshold, float nmsThreshold, boolean nms)
    {
        nms = true;
        //for nms
        ArrayList<Integer> classIds = new ArrayList<>();
        ArrayList<Float> confidences = new ArrayList<>();
        ArrayList<Float> probabilities = new ArrayList<>();
        ArrayList<Rect2d> rect2ds = new ArrayList<>();
        //Rect2dVector boxes = new Rect2dVector();
        try{
            int w = image.cols();
            int h = image.rows();
            /*
             YOLO3 COCO trainval output
             0 1 : center                    2 3 : w/h
             4 : confidence                  5 ~ 84 : class probability
            */
            int prefix = 5;   //skip 0~4
/**/
            for(int k=0;k<output.size();k++)
            {
                Mat prob = output.get(k);
                final FloatRawIndexer probIdx = prob.createIndexer();
                for (int i = 0; i < probIdx.rows(); i++)
                {
                    float confidence = probIdx.get(i, 4);
                    if (confidence > threshold)
                    {
                        //get classes probability
                        DoublePointer minVal= new DoublePointer();
                        DoublePointer maxVal= new DoublePointer();
                        Point min = new Point();
                        Point max = new Point();
                        minMaxLoc(prob.rows(i).colRange(prefix, prob.cols()), minVal, maxVal, min, max, null);
                        int classes = max.x();
                        float probability = probIdx.get(i, classes + prefix);

                        if (probability > threshold) //more accuracy, you can cancel it
                        {
                            //get center and width/height
                            float centerX = probIdx.get(i, 0) * w;
                            float centerY = probIdx.get(i, 1) * h;
                            float width = probIdx.get(i, 2) * w;
                            float height = probIdx.get(i, 3) * h;

                            if (!nms)
                            {
                                // draw result (if don't use NMSBoxes)
                                continue;
                            }

                            //put data to list for NMSBoxes
                            classIds.add(classes);
                            confidences.add(confidence);
                            probabilities.add(probability);
                            rect2ds.add(new Rect2d(centerX, centerY, width, height));
                        }
                    }
                }
            }

            if (!nms) return;

            //using non-maximum suppression to reduce overlapping low confidence box
            IntPointer indices = new IntPointer(confidences.size());
            Rect2dVector boxes = new Rect2dVector();
            for(int i=0;i<rect2ds.size();i++){
                boxes.push_back(rect2ds.get(i));
            }

            FloatPointer con = new FloatPointer(confidences.size());
            float[] cons = new float[confidences.size()];
            for(int i=0;i<confidences.size();i++){
                cons[i] = confidences.get(i);
            }
            con.put(cons);

            opencv_dnn.NMSBoxes(boxes, con, threshold, nmsThreshold, indices); //only modifies the first 2 values and leaves the rest untouched?

            List<String> list = new ArrayList<String>();
            FileInputStream fis = new FileInputStream("E:\\CS_OpenCvSharpYolo3\\models\\coco.names");
            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;
            while ((line = br.readLine()) != null) {
                list.add(line);
            }
            String[] Labels = list.toArray(new String[list.size()]);
            br.close();
            isr.close();
            fis.close();
            //Console.WriteLine($"NMSBoxes drop {confidences.Count - indices.Length} overlapping result.");

            for (int m=0;m<indices.sizeof();m++)
            {
                int i = indices.get(m);
                //System.out.println(i);
                Rect2d box = boxes.get(i);
                String res = "name="+Labels[classIds.get(i)]+" classIds="+classIds.get(i)+" confidences="+confidences.get(i)+" probabilities="+probabilities.get(i);
                res += " box.x="+box.x() + " box.y="+box.y() + " box.width="+box.width() + " box.height="+box.height();
                System.out.println(res);
            }
        }catch(Exception e){
            System.out.println("GetResult error:" + e.getMessage());
        }


    }

    public static void main(String[] args) {
        Yolo3Example be =  new Yolo3Example();
        be.Yolo3();
    }
}

Hi Saudet,
I am trying to use readNetFromDarknet with JavaCV too, and I noticed that the quality of the results differs depending on whether I use opencv_dnn.readNetFromDarknet or Dnn.readNetFromDarknet. Can you help me understand why this happens?
I ran tests with the class found at https://github.com/suddh123/YOLO-object-detection-in-java and with the one from this issue; the former recognizes many more objects even though it has the same configuration.

@decioschmitt If you have any code that doesn't work, please open another issue about that. Thanks!

Saudet, I'm sorry if I expressed myself badly. The code is the same as the one in this issue. I'm trying to use it, but I noticed that it does not give the same results as the code from the link I mentioned. I would like to use this code, so I would like to understand the reason for this difference.
Thank you for sharing your work.

For one thing, that example resizes images differently, so that has a large impact:

Size sz = new Size(288,288);

The values for threshold and nmsThresh are also different. You'll need to set everything the same to get the same results.

Yes, but I changed all the settings to be the same as in this example, so I found it strange that the results are so different.

I replaced int[] with IntPointer(). Unfortunately, the problem is still there, and some values in the IntPointer look like 32768 or -12876.

This problem still occurs even with the latest version of the code posted here. I used the demo image dog.jpg from darknet, and this error happened.

over,takes:627ms
11
name=bicycle classIds=1 confidences=0.9998864 probabilities=0.0 box.x=221.33023071289062 box.y=383.4708251953125 box.width=197.1485137939453 box.height=320.8028564453125
0
name=person classIds=0 confidences=0.99879277 probabilities=0.0 box.x=582.5922241210938 box.y=126.7623291015625 box.width=220.1399688720703 box.height=79.4445571899414
6
name=truck classIds=7 confidences=0.9907458 probabilities=0.0 box.x=343.3991394042969 box.y=278.7822570800781 box.width=452.3638000488281 box.height=307.33184814453125
1667589221
java.lang.RuntimeException: vector
at org.bytedeco.opencv.opencv_core.Rect2dVector.get(Native Method)
at schmitt.javacvtools.manager.Yolo3Example.GetResult(Yolo3Example.java:180)
at schmitt.javacvtools.manager.Yolo3Example.Yolo3(Yolo3Example.java:80)
at schmitt.javacvtools.manager.Yolo3Example.main(Yolo3Example.java:195)
GetResult error:vector

@decioschmitt Please post the exact versions of the code you're talking about.

The version I am talking about:

package schmitt.javacvtools.manager;
import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.opencv.global.opencv_core;
import org.bytedeco.opencv.global.opencv_dnn;
import org.bytedeco.opencv.opencv_dnn.*;
import org.bytedeco.opencv.opencv_core.*;
import org.bytedeco.opencv.opencv_imgproc.*;
import org.bytedeco.javacpp.indexer.*;
import static org.bytedeco.leptonica.global.lept.*;
import static org.bytedeco.opencv.global.opencv_core.*;
import static org.bytedeco.opencv.global.opencv_dnn.*;
import static org.bytedeco.opencv.global.opencv_imgproc.*;
import static org.bytedeco.opencv.global.opencv_imgcodecs.*;
import org.apache.commons.lang3.time.StopWatch;
import java.util.*;
import java.util.concurrent.*;
import java.io.*;
/**
 * Created by Jacky on 2019/7/9.
 */
public class Yolo3Example {

    public void Yolo3(){
        Mat img = imread("dog.jpg");

        //setting blob, size can be:320/416/608
        //opencv blob setting can check here https://github.com/opencv/opencv/tree/master/samples/dnn#object-detection
        Mat blob = opencv_dnn.blobFromImage(img, 1.0 / 255, new Size(416, 416), new Scalar(), true, false,CV_32F);
        //Mat blob = opencv_dnn.blobFromImage(img, 1.0, new Size(608, 608), new Scalar(), true, false,CV_8U);

        //load model and config, if you got error: "separator_index < line.size()", check your cfg file, must be something wrong.
        String model = "/Users/ds/projetos/core_surveillance/master/javacvtools/yolov3.weights"; // YOLO weights file, downloaded from the official YOLO site
        String cfg = "/Users/ds/projetos/core_surveillance/master/javacvtools/yolov3.cfg"; // YOLO cfg file, also available from the official site

        Net net = opencv_dnn.readNetFromDarknet(cfg, model);
        //set preferable
        net.setPreferableBackend(3);
            /*
            0:DNN_BACKEND_DEFAULT
            1:DNN_BACKEND_HALIDE
            2:DNN_BACKEND_INFERENCE_ENGINE
            3:DNN_BACKEND_OPENCV
             */
        net.setPreferableTarget(0);
            /*
            0:DNN_TARGET_CPU
            1:DNN_TARGET_OPENCL
            2:DNN_TARGET_OPENCL_FP16
            3:DNN_TARGET_MYRIAD
            4:DNN_TARGET_FPGA
             */

        //input data
        net.setInput(blob);

        //get output layer name
        StringVector outNames = net.getUnconnectedOutLayersNames();
        //create mats for output layer
        //MatVector outs = outNames.Select(_ => new Mat()).ToArray();

        MatVector outs = new MatVector();
        for(int i=0;i<outNames.size();i++){
            outs.put(new Mat());
        }

        //forward model
        StopWatch  sw = StopWatch.createStarted();
        net.forward(outs, outNames);
        sw.stop();
        System.out.println("over,takes:" + sw.getTime(TimeUnit.MILLISECONDS) + "ms");

        //get result from all output
        float threshold = 0.5f;       //for confidence
        float nmsThreshold = 0.3f;    //threshold for nms
        GetResult(outs, img, threshold, nmsThreshold,true);
    }

    private void GetResult(MatVector output, Mat image, float threshold, float nmsThreshold, boolean nms)
    {
        nms = true;
        //for nms
        ArrayList<Integer> classIds = new ArrayList<>();
        ArrayList<Float> confidences = new ArrayList<>();
        ArrayList<Float> probabilities = new ArrayList<>();
        ArrayList<Rect2d> rect2ds = new ArrayList<>();
        //Rect2dVector boxes = new Rect2dVector();
        try{
            int w = image.cols();
            int h = image.rows();
            /*
             YOLO3 COCO trainval output
             0 1 : center                    2 3 : w/h
             4 : confidence                  5 ~ 84 : class probability
            */
            int prefix = 5;   //skip 0~4
/**/
            for(int k=0;k<output.size();k++)
            {
                Mat prob = output.get(k);
                final FloatRawIndexer probIdx = prob.createIndexer();
                for (int i = 0; i < probIdx.rows(); i++)
                {
                    float confidence = probIdx.get(i, 4);
                    if (confidence > threshold)
                    {
                        //get classes probability
                        DoublePointer minVal= new DoublePointer();
                        DoublePointer maxVal= new DoublePointer();
                        Point min = new Point();
                        Point max = new Point();
                        minMaxLoc(prob.rows(i).colRange(prefix, prob.cols()), minVal, maxVal, min, max, null);
                        int classes = max.x();
                        float probability = probIdx.get(i, classes + prefix);

//                        if (probability > threshold) //more accuracy, you can cancel it
//                        {
                            //get center and width/height
                            float centerX = probIdx.get(i, 0) * w;
                            float centerY = probIdx.get(i, 1) * h;
                            float width = probIdx.get(i, 2) * w;
                            float height = probIdx.get(i, 3) * h;

                            if (!nms)
                            {
                                // draw result (if don't use NMSBoxes)
                                continue;
                            }

                            //put data to list for NMSBoxes
                            classIds.add(classes);
                            confidences.add(confidence);
                            probabilities.add(probability);
                            rect2ds.add(new Rect2d(centerX, centerY, width, height));
//                        }
                    }
                }
            }

            if (!nms) return;

            //using non-maximum suppression to reduce overlapping low confidence box
            IntPointer indices = new IntPointer(confidences.size());
            Rect2dVector boxes = new Rect2dVector();
            for(int i=0;i<rect2ds.size();i++){
                boxes.push_back(rect2ds.get(i));
            }

            FloatPointer con = new FloatPointer(confidences.size());
            float[] cons = new float[confidences.size()];
            for(int i=0;i<confidences.size();i++){
                cons[i] = confidences.get(i);
            }
            con.put(cons);

            opencv_dnn.NMSBoxes(boxes, con, threshold, nmsThreshold, indices); //only the first 2 values get modified, the rest stay unchanged?

            List<String> list = new ArrayList<String>();
            FileInputStream fis = new FileInputStream("coco.names");
            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;
            while ((line = br.readLine()) != null) {
                list.add(line);
            }
            String[] Labels = list.toArray(new String[list.size()]);
            br.close();
            isr.close();
            fis.close();
            //Console.WriteLine($"NMSBoxes drop {confidences.Count - indices.Length} overlapping result.");

            for (int m=0;m<indices.sizeof();m++)
            {
                int i = indices.get(m);
                System.out.println(i);
                Rect2d box = boxes.get(i);
                String res = "name="+Labels[classIds.get(i)]+" classIds="+classIds.get(i)+" confidences="+confidences.get(i)+" probabilities="+probabilities.get(i);
                res += " box.x="+box.x() + " box.y="+box.y() + " box.width="+box.width() + " box.height="+box.height();
                System.out.println(res);
            }
        }catch(Exception e){
            e.printStackTrace();
            System.out.println("GetResult error:" + e.getMessage());
        }


    }

    public static void main(String[] args) {
        Yolo3Example be =  new Yolo3Example();
        be.Yolo3();
    }
}

out:

over,takes:714ms
11
name=bicycle classIds=1 confidences=0.9998864 probabilities=0.0 box.x=221.33023071289062 box.y=383.4708251953125 box.width=197.1485137939453 box.height=320.8028564453125
0
name=person classIds=0 confidences=0.99879277 probabilities=0.0 box.x=582.5922241210938 box.y=126.7623291015625 box.width=220.1399688720703 box.height=79.4445571899414
6
name=truck classIds=7 confidences=0.9907458 probabilities=0.0 box.x=343.3991394042969 box.y=278.7822570800781 box.width=452.3638000488281 box.height=307.33184814453125
1667589221
java.lang.RuntimeException: vector
at org.bytedeco.opencv.opencv_core.Rect2dVector.get(Native Method)
at schmitt.javacvtools.manager.Yolo3Example.GetResult(Yolo3Example.java:180)
at schmitt.javacvtools.manager.Yolo3Example.Yolo3(Yolo3Example.java:80)
at schmitt.javacvtools.manager.Yolo3Example.main(Yolo3Example.java:195)
GetResult error:vector

Also note that the classId returned in the example above is wrong.
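
The listing above passes centerX and centerY straight into Rect2d as the box origin, while version two below first subtracts half the width and height to get the top-left corner. A minimal sketch of that center-to-corner conversion in plain Java, with no OpenCV types; the names `BoxConversion` and `toTopLeft` are made up for illustration:

```java
import java.util.Arrays;

class BoxConversion {
    /**
     * Converts a normalized YOLO box (center x/y, width, height in [0, 1])
     * into pixel values {left, top, width, height}.
     */
    static double[] toTopLeft(double cx, double cy, double w, double h,
                              int imageWidth, int imageHeight) {
        double width = w * imageWidth;
        double height = h * imageHeight;
        double left = cx * imageWidth - width / 2.0;
        double top = cy * imageHeight - height / 2.0;
        return new double[] {left, top, width, height};
    }

    public static void main(String[] args) {
        // Hypothetical detection centered in an 800x600 image.
        double[] box = toTopLeft(0.5, 0.5, 0.25, 0.5, 800, 600);
        System.out.println(Arrays.toString(box)); // [300.0, 150.0, 200.0, 300.0]
    }
}
```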

Version two:

```java
package schmitt.javacvtools.manager;

import org.opencv.core.*;
import org.opencv.dnn.*;
import org.opencv.utils.*;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;
import org.opencv.videoio.VideoCapture;

import java.util.ArrayList;
import java.util.List;

import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.imageio.ImageIO;
import javax.swing.ImageIcon;
import javax.swing.JFrame;
import javax.swing.JLabel;
import static org.opencv.imgcodecs.Imgcodecs.imread;

public class yolo {

private static List<String> getOutputNames(Net net) {
    List<String> names = new ArrayList<>();

    List<Integer> outLayers = net.getUnconnectedOutLayers().toList();
    List<String> layersNames = net.getLayerNames();

    outLayers.forEach((item) -> names.add(layersNames.get(item - 1))); // map each unconnected output layer index (1-based) to its layer name
    return names;
}

public static void main(String[] args) throws InterruptedException {
    System.out.println("NATIVE_LIBRARY_NAME:" + Core.NATIVE_LIBRARY_NAME);
    System.load("/usr/local/Cellar/opencv/4.1.0_2/share/java/opencv4/libopencv_java410.dylib"); // load the OpenCV 4.1.0 native library
    String modelWeights = "/Users/ds/projetos/core_surveillance/master/javacvtools/yolov3.weights"; // download the YOLO weights from the official YOLO site
    String modelConfiguration = "/Users/ds/projetos/core_surveillance/master/javacvtools/yolov3.cfg"; // download the YOLO cfg file from the official YOLO site


    Net net = Dnn.readNetFromDarknet(modelConfiguration, modelWeights); //OpenCV DNN supports models trained from various frameworks like Caffe and TensorFlow. It also supports various networks architectures based on YOLO//
    //Thread.sleep(5000);

    //Mat image = Imgcodecs.imread("D:\\yolo-object-detection\\yolo-object-detection\\images\\soccer.jpg");
    Size sz = new Size(416, 416);

    List<Mat> result = new ArrayList<>();
    List<String> outBlobNames = getOutputNames(net);

    //     while (true) {
    //        if (cap.read(frame)) {
    Mat src = imread("dog.jpg");

    Mat blob = Dnn.blobFromImage(src, 0.00392, sz, new Scalar(0), true, false); // We feed one frame of video into the network at a time, we have to convert the image to a blob. A blob is a pre-processed image that serves as the input.//
    net.setInput(blob);
    net.forward(result, outBlobNames); //Feed forward the model to get output //

    // outBlobNames.forEach(System.out::println);
    // result.forEach(System.out::println);
    float confThreshold = 0.5f; //Insert thresholding beyond which the model will detect objects//
    List<Integer> clsIds = new ArrayList<>();
    List<Float> confs = new ArrayList<>();
    List<Rect> rects = new ArrayList<>();
    for (int i = 0; i < result.size(); ++i) {
        // each row is a candidate detection: the first 4 numbers are
        // [center_x, center_y, width, height], then the objectness score, followed by (N-5) class probabilities
        Mat level = result.get(i);
        for (int j = 0; j < level.rows(); ++j) {
            Mat row = level.row(j);
            Mat scores = row.colRange(5, level.cols());
            Core.MinMaxLocResult mm = Core.minMaxLoc(scores);
            float confidence = (float) mm.maxVal;
            Point classIdPoint = mm.maxLoc;
            if (confidence > confThreshold) {
                int centerX = (int) (row.get(0, 0)[0] * src.cols()); //scaling for drawing the bounding boxes//
                int centerY = (int) (row.get(0, 1)[0] * src.rows());
                int width = (int) (row.get(0, 2)[0] * src.cols());
                int height = (int) (row.get(0, 3)[0] * src.rows());
                int left = centerX - width / 2;
                int top = centerY - height / 2;

                clsIds.add((int) classIdPoint.x);
                confs.add((float) confidence);
                rects.add(new Rect(left, top, width, height));
            }
        }
    }
    float nmsThresh = 0.5f;
    MatOfFloat confidences = new MatOfFloat(Converters.vector_float_to_Mat(confs));
    Rect[] boxesArray = rects.toArray(new Rect[0]);
    MatOfRect boxes = new MatOfRect(boxesArray);
    MatOfInt indices = new MatOfInt();
    Dnn.NMSBoxes(boxes, confidences, confThreshold, nmsThresh, indices); // suppress overlapping boxes; the kept boxes are drawn below

    int[] ind = indices.toArray();
    int j = 0;
    for (int i = 0; i < ind.length; ++i) {
        int idx = ind[i];
        Rect box = boxesArray[idx];
        if (clsIds.get(i) == 0) {
            Imgproc.rectangle(src, box.tl(), box.br(), new Scalar(0, 0, 255), 2);
        } else if (clsIds.get(i) == 27) {
            Imgproc.rectangle(src, box.tl(), box.br(), new Scalar(0, 255, 0), 2);
        } else {
            Imgproc.rectangle(src, box.tl(), box.br(), new Scalar(255, 0, 0), 2);
        }
        //i=j;

// System.out.println(idx);
System.out.println(clsIds.get(i) + " | " + confs.get(i) + "%" + " | " + idx);

    }
    Imgcodecs.imwrite("out.png", src);


}

}

```
This is the output of the last code:

NATIVE_LIBRARY_NAME:opencv_java410
7 | 0.9397407% | 2
1 | 0.989901% | 1
16 | 0.997903% | 0

image used: https://github.com/AlexeyAB/darknet/blob/master/data/dog.jpg
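
NMSBoxes reports the boxes it keeps as indices into the candidate lists, so the class ID and confidence of a kept box are normally read with that index (`idx` in version two's final loop) rather than with the loop counter. The sketch below illustrates the lookup with a simplified greedy NMS written in plain Java. It is not OpenCV's implementation, the candidate data is made up, and `NmsLookup`, `iou`, and `nms` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NmsLookup {
    /** Intersection over union of two boxes given as {left, top, width, height}. */
    static double iou(double[] a, double[] b) {
        double x1 = Math.max(a[0], b[0]);
        double y1 = Math.max(a[1], b[1]);
        double x2 = Math.min(a[0] + a[2], b[0] + b[2]);
        double y2 = Math.min(a[1] + a[3], b[1] + b[3]);
        double inter = Math.max(0, x2 - x1) * Math.max(0, y2 - y1);
        double union = a[2] * a[3] + b[2] * b[3] - inter;
        return union <= 0 ? 0 : inter / union;
    }

    /** Simplified greedy NMS: returns the kept candidate indices, highest score first. */
    static List<Integer> nms(List<double[]> boxes, List<Float> scores,
                             float scoreThreshold, float nmsThreshold) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < scores.size(); i++) {
            if (scores.get(i) > scoreThreshold) order.add(i);
        }
        order.sort((p, q) -> Float.compare(scores.get(q), scores.get(p)));

        List<Integer> kept = new ArrayList<>();
        for (int candidate : order) {
            boolean overlaps = false;
            for (int k : kept) {
                if (iou(boxes.get(candidate), boxes.get(k)) > nmsThreshold) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up candidates: boxes 0 and 1 overlap heavily, box 2 is separate.
        List<double[]> boxes = Arrays.asList(
                new double[] {10, 10, 100, 100},
                new double[] {12, 12, 100, 100},
                new double[] {300, 300, 50, 50});
        List<Float> scores = Arrays.asList(0.6f, 0.9f, 0.8f);
        List<Integer> classIds = Arrays.asList(7, 7, 16);

        for (int idx : nms(boxes, scores, 0.5f, 0.4f)) {
            // Look everything up with idx, the position in the candidate lists.
            System.out.println(classIds.get(idx) + " | " + scores.get(idx) + " | " + idx);
        }
    }
}
```

With these inputs candidate 0 is suppressed by the higher-scoring candidate 1, so the kept indices are 1 and 2, and each printed class ID belongs to the box at that index.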

@decioschmitt You're using different parameters...

You must be referring to nmsThresh = 0.5f. OK, I forgot to change this value, but I already tested with 0.3 and it does not change the result.

What I could prove is that the classId is returned differently. And the problem of

some value in IntPointer likes 32768,-12876

keeps happening with some settings, like the one above with javacv.
Why could this be happening, Saudet?
Thanks a lot.

for (int m=0;m<indices.sizeof();m++)

That's most likely not what you want to do. Try to use limit() instead.
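
In JavaCPP, sizeof() is the size in bytes of one element (4 for an IntPointer), not the number of valid entries, which is why the loop in the listing above printed four indices and the fourth one was garbage. The same distinction between allocated size and valid length exists in java.nio, so the sketch below uses an IntBuffer as a stand-in for the IntPointer. It assumes nothing about JavaCPP internals, and the values and the names `LimitVsCapacity` and `fill` are made up.

```java
import java.nio.IntBuffer;

class LimitVsCapacity {
    /** Writes the kept indices into a preallocated buffer and marks how many are valid. */
    static IntBuffer fill(int capacity, int... kept) {
        IntBuffer indices = IntBuffer.allocate(capacity);
        indices.put(kept);
        indices.flip(); // limit = number of values written, position = 0
        return indices;
    }

    public static void main(String[] args) {
        // Room for 12 candidates, but only 3 survive.
        IntBuffer indices = fill(12, 11, 0, 6);

        System.out.println("capacity=" + indices.capacity() + " limit=" + indices.limit());
        // Iterate up to limit(), not the capacity and not an element size.
        for (int m = 0; m < indices.limit(); m++) {
            System.out.println(indices.get(m));
        }
    }
}
```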

Thanks Saudet, this solved the problem with the pointer.
But the classId keeps coming back wrong:
it returns person (classid=0) instead of dog (classid=16)

name=bicycle classIds=1 confidences=0.99944705 probabilities=0.0 box.x=222.72230529785156 box.y=374.7823181152344 box.width=208.48818969726562 box.height=292.552978515625
name=person classIds=0 confidences=0.9946068 probabilities=0.0 box.x=583.8402709960938 box.y=122.71275329589844 box.width=200.05613708496094 box.height=86.29911804199219
name=truck classIds=7 confidences=0.99454296 probabilities=0.0 box.x=340.8481140136719 box.y=281.95098876953125 box.width=456.3893737792969 box.height=298.6319580078125

Could you help me with this too?
Thanks a lot.

minMaxLoc(prob.rows(i).colRange(prefix, prob.cols()), minVal, maxVal, min, max, null);

You're passing all nulls here. It might not return anything in that case. Try to pass non-null pointers.

This may be a bit of an old problem, but for anyone still struggling with it, here is a working implementation that uses the javacv namespaces instead of the opencv ones:

YoloV3 Implementation JavaCV

Cool! Please send a pull request to put that as part of the samples.

@saudet Yep, as soon as I have time. Maybe I'll add some more examples I am working on. For example human pose estimation with javacv.

@saudet I am currently creating some DNN samples for JavaCV and noticed that my YOLO implementation lags from time to time. I guess it is the garbage collector, which has to clean up a lot of stuff. Most of the post-processing (reading the results) cannot be done with cv methods, so I have to use a float data pointer as well as native structures to store confidences and bounding boxes. I am talking about the following part:

This part also runs about 2500 times per prediction, which is quite a lot. In the following graph you can see the GC spikes and also the rise of the heap size. I guess there is a memory leak, but I cannot see exactly where.

image

So my question to you: does it make sense to replace all the lists that store the data with Java structures, to avoid switching back and forth between native and Java:

IntVector classIds = new IntVector();
FloatVector confidences = new FloatVector();
RectVector boxes = new RectVector();

And do you have another suggestion how to convert the C++ code? I think creating a FloatPointer to read the results as well as the minMaxLoc call takes a lot of time (not yet profiled).

OK, I tried to fix as much as possible by releasing everything and removing all nested memory leaks. You can see the changes here: https://github.com/cansik/deep-vision-processing/commit/790d39aee906eac943545b31ca1b720a520bfd5c

But it seems I cannot remove the row and column access. Is there another solution to this problem, or a suggestion for which direction I could look in?

// ...
for (int j = 0; j < result.rows(); j++) {
    Mat row = result.row(j);
    BytePointer dataPointer = row.data();
    FloatPointer data = new FloatPointer(dataPointer);

    Mat scores = row.colRange(5, result.cols());
// ...

And is it even necessary to call releaseReference(), since it just decrements the reference counter (as far as I know)? In the end we will still have a lot of dead references that the garbage collector has to remove.

image

It seems I could remove the column access by converting the BytePointer into a ByteBuffer. I also replaced minMaxLoc with a basic Java version, because the data is only 1D:

for (int j = 0; j < result.rows(); j++) {
    Mat row = result.row(j);

    BytePointer dataPointer = row.data();
    dataPointer.capacity(row.cols() * 4);
    FloatBuffer data = dataPointer.asByteBuffer().asFloatBuffer();

    // minMaxLoc implemented in java
    int maxIndex = -1;
    float maxScore = Float.NEGATIVE_INFINITY; // Float.MIN_VALUE is the smallest positive float, not the most negative one
    for(int k = 5; k < data.limit(); k++) {
        float score = data.get(k);
        if(score > maxScore) {
            maxScore = score;
            maxIndex = k - 5;
        }
    }

image

@cansik It shouldn't matter what kind of data structure we use, but make sure any native objects that are not needed anymore get deallocated. We can more easily do that with PointerScope: http://bytedeco.org/news/2018/07/17/bytedeco-as-distribution/

Also, please open a pull request to have your sample added here: https://github.com/bytedeco/javacv/tree/master/samples

I finally managed to remove the memory leak completely by using the ByteBuffer of the result Mat and making the col / row calculation on my own: YOLONetwork.java#L113-L116.
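
The row and column arithmetic mentioned here can be sketched in plain Java. The example below is not the code from YOLONetwork.java. It assumes a row-major float buffer with `cols` values per detection (box in columns 0 to 3, objectness in column 4, class scores from column 5), and `FlatRowScan` and `bestClass` are hypothetical names.

```java
import java.nio.FloatBuffer;

class FlatRowScan {
    /** Returns the index of the best class in the given row, or -1 if the row has no class columns. */
    static int bestClass(FloatBuffer data, int row, int cols) {
        int offset = row * cols;          // start of this row in the flat buffer
        int maxIndex = -1;
        float maxScore = Float.NEGATIVE_INFINITY;
        for (int k = 5; k < cols; k++) {  // class scores start at column 5
            float score = data.get(offset + k);
            if (score > maxScore) {
                maxScore = score;
                maxIndex = k - 5;
            }
        }
        return maxIndex;
    }

    public static void main(String[] args) {
        int cols = 8; // 4 box values + objectness + 3 made-up classes
        FloatBuffer data = FloatBuffer.wrap(new float[] {
                0.5f, 0.5f, 0.2f, 0.2f, 0.9f, 0.1f, 0.7f, 0.2f,
                0.3f, 0.3f, 0.1f, 0.1f, 0.8f, 0.6f, 0.1f, 0.3f});
        int rows = data.capacity() / cols;
        for (int j = 0; j < rows; j++) {
            System.out.println("row " + j + " -> class " + bestClass(data, j, cols));
        }
    }
}
```

Reading through one buffer with a computed offset avoids creating a Mat header per row, which is the allocation pattern the earlier snippets were trying to get rid of.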

Yes will do that tomorrow, thank you!

BTW, it's easier and more efficient to use FloatIndexer instead of FloatBuffer: http://bytedeco.org/news/2014/12/23/third-release/

Thank you @cansik! We now have a great sample here:
https://github.com/bytedeco/javacv/blob/master/samples/YOLONet.java
