Darknet: Some problems with tracking

Created on 2 Apr 2019 · 11 Comments · Source: AlexeyAB/darknet

I am trying to track vehicles with a recent version of yolo_console_dll, but I am facing some problems:

The detection results from darknet.exe detector demo are much better than those from tracking. I realize that in tracking, detection is not performed on every frame. Is it possible to run detection on more frames?
Sometimes the tracking box stays even after the object has left the scene.

These are the detection results:
[image: ezgif com-video-to-gif(1)]

But the tracking looks like this:
[image: ezgif com-video-to-gif]

How can I reduce the size of the colored boxes containing the labels of the detected objects? Is it possible to make them transparent?
The fps of the resulting video is the same as that of the input video, but the resulting video plays a bit slower.

Thanks

hadi-ghnd

All 11 comments

In my case it was really fast: the tracking video was faster than the normal input video. What frame story and max distance are you using?

buzdarbalooch on 2 Apr 2019

How long is your video?

buzdarbalooch on 2 Apr 2019

@buzdarbalooch I changed std::max(35, video_fps) to std::max(1, video_fps) here:
https://github.com/AlexeyAB/darknet/blob/0543278a5bd7064fae6538afd1761b06b10f73ee/src/yolo_console_dll.cpp#L296

The video that you see while the program is running differs from the one saved as result.avi. The one you see is faster.
My video is 15 fps and about 15 minutes long.

hadi-ghnd on 2 Apr 2019

@hadi-ghnd

The optical-flow tracker should be used for a video stream from a camera rather than for a video file, if your GPU can't process every frame from the camera. Optical flow achieves a higher FPS than the neural network; it tracks objects between detections.

Comment out this line to run detection on every frame: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L294

"Sometimes the tracking box stays even after the object has left the scene." To fix it, set true instead of false here: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L509

Instead of TRACK_OPTFLOW you can use the Kalman filter if all your objects have linear trajectories. Just set true instead of false: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L290
And use uselib instead of uselib_track (and comment out #define TRACK_OPTFLOW).

To reduce the size of the colored rectangles and text, change these lines: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L200-L203

Use std::max(1, video_fps) instead of std::max(35, video_fps) for video files: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/src/yolo_console_dll.cpp#L369

AlexeyAB on 3 Apr 2019

@AlexeyAB thank you for your helpful response.
I tried Kalman filter as you suggested and the results got better:
[image: ezgif com-video-to-gif(2)]

But I still have some questions:

As you can see, there are double detections for some objects. Can this be solved by tracking or by increasing nms?
Sometimes the tracking box stays behind the object. I think this may be caused by the Kalman filter. Is there an option to fix this, like the one you mentioned for TRACK_OPTFLOW?
My last question: is there an option to count the objects of each class, or should I add it myself?

hadi-ghnd on 3 Apr 2019

"As you can see, there are double detections for some objects. Can this be solved by tracking or by increasing nms?"

"Sometimes the tracking box stays behind the object. I think this may be caused by the Kalman filter. Is there an option to fix this, like the one you mentioned for TRACK_OPTFLOW?"

No. You should train your model better. Collect many more images from such videos, with the same point of view and the same relative object sizes.
And read: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
Or use a better model, for example yolov3-spp.cfg / weights.

To count the objects of each class, you should implement it yourself. Some hints:

Use std::vector<int> track_id_vec; https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/include/yolo_v2_class.hpp#L654

Initialize it as track_id_vec(classes_number), passing the number of classes here: https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/include/yolo_v2_class.hpp#L809
Instead of https://github.com/AlexeyAB/darknet/blob/6231b748c44e2007b5c3cbf765a50b122782c5a2/include/yolo_v2_class.hpp#L958-L959
use:

                    track_id_state_id_time[i].track_id = ++track_id_vec[result_vec_pred[i].obj_id];
                    result_vec_pred[i].track_id = track_id_vec[result_vec_pred[i].obj_id];

..... etc

AlexeyAB on 3 Apr 2019

Dear @AlexeyAB and @hadi-ghnd, I would be grateful if you could help me debug the tracking issues I am facing.

I have experimented on two football (soccer) datasets; the AP of both datasets is shown below. The first dataset has four classes, the second has seven.
[image: debug1]
[image: debug2]

Below are the results when I try to run tracking:
LD_LIBRARY_PATH=./:$LD_LIBRARY_PATH ./uselib data/obj2.names cfg/yolov2.cfg Backup1/yolov2_3700.weights cornerkick1.mov

frame_id = 184
track_id = 441, obj_id = 3, x = 57, y = 235, w = 6, h = 143, prob = 0.316

track_id = 446, obj_id = 3, x = 244, y = 103, w = 3, h = 268, prob = 0.273

track_id = 276, obj_id = 3, x = 51, y = 116, w = 0, h = 156, prob = 0.254

track_id = 297, obj_id = 3, x = 55, y = 333, w = 0, h = 167, prob = 0.254

track_id = 443, obj_id = 3, x = 251, y = 174, w = 67, h = 149, prob = 0.253

track_id = 540, obj_id = 3, x = 244, y = 9, w = 0, h = 245, prob = 0.252

track_id = 263, obj_id = 1, x = 251, y = 301, w = 0, h = 120, prob = 0.262

track_id = 409, obj_id = 3, x = 154, y = 356, w = 0, h = 119, prob = 0.247

track_id = 270, obj_id = 1, x = 152, y = 172, w = 0, h = 250, prob = 0.285

track_id = 277, obj_id = 1, x = 156, y = 272, w = 0, h = 178, prob = 0.292

track_id = 535, obj_id = 3, x = 152, y = 69, w = 0, h = 250, prob = 0.227

track_id = 1503, obj_id = 0, x = 1168, y = 77, w = 167, h = 0, prob = 0.271

track_id = 542, obj_id = 3, x = 933, y = 32, w = 0, h = 208, prob = 0.221

track_id = 572, obj_id = 3, x = 464, y = 186, w = 175, h = 0, prob = 0.221

track_id = 95, obj_id = 3, x = 251, y = 57, w = 0, h = 282, prob = 0.217

track_id = 1516, obj_id = 0, x = 867, y = 131, w = 168, h = 0, prob = 0.314

track_id = 1509, obj_id = 0, x = 767, y = 131, w = 164, h = 0, prob = 0.295

track_id = 444, obj_id = 3, x = 48, y = 81, w = 0, h = 118, prob = 0.212

track_id = 1671, obj_id = 0, x = 760, y = 408, w = 73, h = 79, prob = 0.23

track_id = 543, obj_id = 3, x = 146, y = 38, w = 0, h = 212, prob = 0.211

track_id = 573, obj_id = 3, x = 245, y = 429, w = 0, h = 83, prob = 0.208

track_id = 1795, obj_id = 0, x = 48, y = 188, w = 224, h = 0, prob = 0.214

track_id = 1052, obj_id = 0, x = 1077, y = 635, w = 130, h = 35, prob = 0.415

track_id = 1733, obj_id = 0, x = 1042, y = 186, w = 187, h = 0, prob = 0.218

track_id = 266, obj_id = 2, x = 1039, y = 11, w = 0, h = 145, prob = 0.201

track_id = 278, obj_id = 1, x = 648, y = 233, w = 0, h = 135, prob = 0.278

As you can see, the system produces wrong bounding boxes for objects, even with very low probability.

buzdarbalooch on 4 Apr 2019

Another error I caught is shown below, when I try to run tracking on an image. Notice that the weights file in the command I execute differs from the one that actually gets loaded: the weights are loaded from a previous model I trained. See the line: Loading weights from backup/yolo_10100.weights...

akhan@tensorflow-System-Product-Name:~/darknet-master$ LD_LIBRARY_PATH=./:$LD_LIBRARY_PATH ./uselib data/obj2.names cfg/yolov2.cfg Backup1/yolov2_3700.weights experi.jpg
Used GPU 0
layer filters size input output
0 conv 32 3 x 3 / 1 416 x 416 x 3 -> 416 x 416 x 32 0.299 BF
1 max 2 x 2 / 2 416 x 416 x 32 -> 208 x 208 x 32 0.006 BF
2 conv 64 3 x 3 / 1 208 x 208 x 32 -> 208 x 208 x 64 1.595 BF
3 max 2 x 2 / 2 208 x 208 x 64 -> 104 x 104 x 64 0.003 BF
4 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
5 conv 64 1 x 1 / 1 104 x 104 x 128 -> 104 x 104 x 64 0.177 BF
6 conv 128 3 x 3 / 1 104 x 104 x 64 -> 104 x 104 x 128 1.595 BF
7 max 2 x 2 / 2 104 x 104 x 128 -> 52 x 52 x 128 0.001 BF
8 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
9 conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BF
10 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BF
11 max 2 x 2 / 2 52 x 52 x 256 -> 26 x 26 x 256 0.001 BF
12 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
13 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
14 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
15 conv 256 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 256 0.177 BF
16 conv 512 3 x 3 / 1 26 x 26 x 256 -> 26 x 26 x 512 1.595 BF
17 max 2 x 2 / 2 26 x 26 x 512 -> 13 x 13 x 512 0.000 BF
18 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
19 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
20 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
21 conv 512 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 512 0.177 BF
22 conv 1024 3 x 3 / 1 13 x 13 x 512 -> 13 x 13 x1024 1.595 BF
23 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024 3.190 BF
24 conv 1024 3 x 3 / 1 13 x 13 x1024 -> 13 x 13 x1024 3.190 BF
25 route 16
26 conv 64 1 x 1 / 1 26 x 26 x 512 -> 26 x 26 x 64 0.044 BF
27 reorg / 2 26 x 26 x 64 -> 13 x 13 x 256
28 route 27 24
29 conv 1024 3 x 3 / 1 13 x 13 x1280 -> 13 x 13 x1024 3.987 BF
30 conv 45 1 x 1 / 1 13 x 13 x1024 -> 13 x 13 x 45 0.016 BF
31 detection
mask_scale: Using default '1.000000'
Total BFLOPS 29.343
Loading weights from backup/yolo_10100.weights...
seen 64
Done!
input image or video filename: Time: 0.200338 sec
: cannot connect to X server

buzdarbalooch on 4 Apr 2019

@AlexeyAB Thank you for your suggestions. The modifications you mentioned were enough to count instances of different classes.

hadi-ghnd on 4 Apr 2019

@AlexeyAB I have a dataset with a bunch of videos with linear movements (the camera hovers above, facing the ground, and moves in a linear pattern), so I gave the Kalman filter a try and it seems to work pretty well.

False Positives are greatly reduced and box sizes are more consistent and stable. Currently I'm using your C++ library through yolo_console_dll.cpp and I'm wondering if Kalman Filtering is available through the Python API or if I must implement it myself using OpenCV?

Last question: are the changes I made to yolo_console_dll.cpp to enable Kalman filtering passed on to yolo_cpp_dll.dll (through this API: yolo_v2_class.hpp)? That is, when using the Detector class, are track_ids tracked by the Kalman filter?

laclouis5 on 9 Dec 2019

@laclouis5

"False Positives are greatly reduced and box sizes are more consistent and stable. Currently I'm using your C++ library through yolo_console_dll.cpp and I'm wondering if Kalman Filtering is available through the Python API or if I must implement it myself using OpenCV?"

You should implement it yourself using OpenCV.

"Last question: are the changes I made to yolo_console_dll.cpp to enable Kalman filtering passed on to yolo_cpp_dll.dll (through this API: yolo_v2_class.hpp)? That is, when using the Detector class, are track_ids tracked by the Kalman filter?"

No. The file yolo_console_dll.cpp is used only for ./uselib. It isn't used in yolo_cpp_dll.dll.

AlexeyAB on 9 Dec 2019
