Darknet: A Simple and Efficient Network for Small Target Detection

Created on 3 Nov 2019  ·  28 Comments  ·  Source: AlexeyAB/darknet

Hi,

This paper proposes a new network configuration for small target detection and claims accuracy close to YoloV3 at a speed close to YoloV3-Tiny. The main idea is to use dilated and 1x1 convolutions.
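As a sanity check on the dilation idea (my own sketch, not from the paper): a dilated 3x3 kernel enlarges the receptive field without adding parameters or reducing resolution, which is what makes it attractive for small targets.

```python
# Sketch (not from the paper): effective kernel size of a dilated convolution.
# A k x k kernel with dilation d covers k + (k-1)*(d-1) input pixels per axis,
# so receptive field grows while the parameter count stays at k*k per channel.
def effective_kernel(size, dilation):
    return size + (size - 1) * (dilation - 1)

for d in (1, 2, 4):
    k = effective_kernel(3, d)
    print(f"3x3 conv, dilation={d}: covers {k}x{k} input pixels")
```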

[images: network architecture diagrams from the paper]

I tried to implement the network using this repo, but during training I always get NaN for the loss and avg loss.

Here is the configuration that I used for single class detection:

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=8
width=512
height=512
channels=1
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 2.0
hue=0

learning_rate=0.001
burn_in=1000
max_batches = 8000
policy=steps
steps=6400,7200
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
dilation=2

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
dilation=4

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=9, 13

[convolutional]
batch_normalize=1
filters=32
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3

[convolutional]
batch_normalize=1
filters=128
size=3
stride=2
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[route]
layers=8

[convolutional]
batch_normalize=1
filters=16
size=1
stride=1
pad=1
activation=leaky

[reorg3d]
stride=2

[route]
layers=25, 28

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=leaky

and this is the proposed network in the paper:
[image: the proposed network from the paper]

Any advice for solving the problem?


All 28 comments

I don't see a [yolo] layer in your cfg-file.
Can you rename it to a txt-file and attach your whole cfg-file?

The cfg-file: network.cfg.txt

I don't see a [yolo] layer in your cfg-file.

The proposed network in the paper does not have any [yolo] or [cost] layers.

Based on the yolov3-tiny.cfg file, I changed the activation of the last layer to linear and added a [yolo] layer after it (network_with_yolo.cfg.txt). Now it can be trained, but the performance is weaker than YoloV3-Tiny. There is no NaN for the loss and avg loss, but these values oscillate over a much larger range than with YoloV3-Tiny.
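For reference, a hypothetical sketch of such a head, modeled on the tail of yolov3-tiny.cfg (the anchors, mask, and thresholds below are yolov3-tiny defaults adapted to classes=1, not values from the paper; filters=18 = (1+5)*3 for the three masked anchors):

```
[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes=1
num=6
jitter=.3
ignore_thresh=.7
truth_thresh=1
random=1
```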

The proposed network in the paper does not have any [yolo] or [cost] layers.

The proposed network in the paper has a [yolo] layer:

[image: architecture table from the paper]

I thought that in the table, the left column is the architecture of the authors' proposed network and the right column is the architecture of Tiny YoloV3, and that each column presents a separate, independent architecture. Therefore, the [yolo] layer you mentioned is in Tiny YoloV3, not the proposed network.

Yes, sure, you are right ) But still no detection network can work without a detection head: [yolo], SSD, Faster RCNN, ...

Thanks. So there may be a mistake in the table. As I mentioned before, adding a [yolo] layer after the last convolutional layer did not give any interesting results:

Based on the yolov3-tiny.cfg file, I changed the activation function of last layer to linear and added a [yolo] layer after it (network_with_yolo.cfg.txt). Now it can be trained but the performance is weaker than YoloV3-Tiny. No NaN for loss and avg loss values and these values oscillate in a much larger range compared to the YoloV3-Tiny.

Apart from the [yolo] layer, does the configuration in network_with_yolo.cfg.txt conform to the proposed network in the paper? I used a [route] layer for the concatenation layers and a [reorg3d] layer for the passthrough layer.
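For clarity, here is a minimal pure-Python sketch of what a stride-2 passthrough/reorg does (my own illustration; the channel ordering of darknet's actual [reorg3d] implementation may differ):

```python
# Stride-2 passthrough ("reorg"): each 2x2 spatial block of a feature map is
# stacked into the channel dimension, halving H and W and quadrupling C.
def reorg_stride2(fmap):
    """fmap: nested list [H][W][C] -> [H//2][W//2][4*C]."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            # concatenate the four neighbouring pixels' channel vectors
            row.append(fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1])
        out.append(row)
    return out

x = [[[r * 10 + c] for c in range(4)] for r in range(4)]  # 4x4x1 feature map
y = reorg_stride2(x)                                      # 2x2x4 feature map
print(len(y), len(y[0]), len(y[0][0]))                    # 2 2 4
```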

Now it can be trained but the performance is weaker than YoloV3-Tiny.

  • What dataset do you use?
  • How many training images?
  • What is the average size of objects after resizing images to the network size 512x512?
  • What mAP did you get in both cases?
  • Can you show chart.png with Loss & mAP for both network_with_yolo.cfg.txt and yolov3-tiny.cfg ?


Yes, it seems network_with_yolo.cfg.txt conforms to the proposed network in the paper.

I used [route] layer for Concatenation layers and [reorg3d] layer for the Passthrough layer.

That's right.

Try to use in the [yolo] layer

filters=36
...

mask = 0,1,2,3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319

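The suggested filters value follows the usual darknet rule for the convolutional layer in front of a [yolo] layer (my own sketch of the arithmetic):

```python
# The conv layer feeding a [yolo] layer needs
#   filters = (classes + 5) * number_of_masks
# where 5 = x, y, w, h, objectness predicted per anchor.
def yolo_filters(classes, num_masks):
    return (classes + 5) * num_masks

print(yolo_filters(1, 3))  # 18  (single class, 3 masks, as in the cfg above)
print(yolo_filters(1, 6))  # 36  (single class, all 6 anchors)
```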
  • What dataset do you use?

A custom dataset. I have not tested the datasets used in the paper.

  • How many training images?

1734 images.

  • What is the average size of objects after resizing images to the network size 512x512?

About 30x30.

  • What mAP did you get in both cases?
  • Can you show chart.png with Loss & mAP for both network_with_yolo.cfg.txt and yolov3-tiny.cfg ?

chart.png for yolov3-tiny.cfg.txt:

[chart: yolov3-tiny]

chart.png for network_with_yolo.cfg.txt:

[chart: network_with_yolo]

Note that:

  • The validation set used for mAP calculation is different from the training set.
  • Anchors are calculated for the dataset using darknet detector calc_anchors.
  • Network image size for Tiny YoloV3 is 416x416.
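For reference, the anchors can be recomputed with darknet's built-in k-means tool; a plausible invocation (the obj.data path is hypothetical, adjust to your own data file) is:

```shell
./darknet detector calc_anchors data/obj.data -num_of_clusters 6 -width 512 -height 512
```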

Try to use in the [yolo] layer

filters=36
...

mask = 0,1,2,3,4,5

Without these changes, the mAP was lower and the avg loss swung over a larger range.


We have a separate test set. Here are the results of darknet detector map:

With best weights using yolov3-tiny.cfg.txt:

 calculation mAP (mean average precision)...
380
 detections_count = 573, unique_truth_count = 108  
class_id = 0, name = cls, ap = 74.57%        (TP = 83, FP = 44) 

 for conf_thresh = 0.25, precision = 0.65, recall = 0.77, F1-score = 0.71 
 for conf_thresh = 0.25, TP = 83, FP = 44, FN = 25, average IoU = 45.68 % 

 IoU threshold = 40 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.40) = 0.745731, or 74.57 % 
Total Detection Time: 0.000000 Seconds
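As a quick sanity check (my own, assuming the standard definitions), the precision/recall/F1 printed above follow directly from TP=83, FP=44, FN=25:

```python
# Verify the yolov3-tiny map output: TP=83, FP=44, FN=25.
tp, fp, fn = 83, 44, 25
precision = tp / (tp + fp)              # fraction of detections that are correct
recall = tp / (tp + fn)                 # fraction of ground truths found
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 2), round(recall, 2), round(f1, 2))  # 0.65 0.77 0.71
```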

With best weights using network_with_yolo.cfg.txt:

 calculation mAP (mean average precision)...
380
 detections_count = 576, unique_truth_count = 108  
class_id = 0, name = cls, ap = 67.20%        (TP = 82, FP = 67) 

 for conf_thresh = 0.25, precision = 0.55, recall = 0.76, F1-score = 0.64 
 for conf_thresh = 0.25, TP = 82, FP = 67, FN = 26, average IoU = 40.94 % 

 IoU threshold = 40 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.40) = 0.671979, or 67.20 % 
Total Detection Time: 2.000000 Seconds

Are mAPs on the charts for Training or Validation dataset?

Are mAPs on the charts for Training or Validation dataset?

Validation

Why do you get 99.9% on the chart, but 67.20% with ./darknet detector map ... for network_with_yolo.cfg.txt?

Why do you get 99.9% on the chart, but 67.20% with ./darknet detector map ... for network_with_yolo.cfg.txt?

I used a separate test set for darknet detector map, which is different from the validation set used in training.

Did you get the Training/Valid/Test datasets by randomly and uniformly dividing a single dataset 80%/10%/10%?

Did you get the Training/Valid/Test datasets by randomly and uniformly dividing a single dataset 80%/10%/10%?

The train and valid sets are selected randomly from a single dataset, with 1734 images for training and 530 images for validation. But the test set is an independent set.

So maybe this is the reason: you train on one set of objects but test on others.

So maybe this is the reason: you train on one set of objects but test on others.

Yes, you are right

@mrhosseini Hi,

When I use network_with_yolo.cfg I face this error:

cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16()
cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

I have 18 classes and I just changed:
filters=138 and classes to 18.

When I use network_with_yolo.cfg I face this error:

cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16()
cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

I have 18 classes and I just changed:
filters=138 and classes to 18.

@zpmmehrdad
Hi,
Unfortunately I’m not familiar with cuDNN. Maybe @AlexeyAB can help you.

@mrhosseini Hi,

Thanks. What cuDNN and CUDA versions are you using?

@zpmmehrdad

  • What GPU do you use?
  • What command do you use?
  • Can you show output of commands:
nvcc --version
nvidia-smi

@AlexeyAB Hi,

I'm using
OS: Win10,
command: darknet.exe detector train a.obj network_with_yolo.cfg -map

output:

compute_capability = 610, cudnn_half = 0
layer filters size/strd(dil) input output
0 conv 16 3 x 3/ 1 512 x 512 x 1 -> 512 x 512 x 16 0.075 BF
1 conv 32 3 x 3/ 2 512 x 512 x 16 -> 256 x 256 x 32 0.604 BF
2 conv 16 1 x 1/ 1 256 x 256 x 32 -> 256 x 256 x 16 0.067 BF
3 conv 32 3 x 3/ 1 256 x 256 x 16 -> 256 x 256 x 32 0.604 BF
4 conv 64 3 x 3/ 2 256 x 256 x 32 -> 128 x 128 x 64 0.604 BF
5 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
6 conv 64 3 x 3/ 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BF
7 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
8 conv 64 3 x 3/ 1 128 x 128 x 32 -> 128 x 128 x 64 0.604 BF
9 conv 32 1 x 1/ 1 128 x 128 x 64 -> 128 x 128 x 32 0.067 BF
10
cuDNN status Error in: file: ....\src\convolutional_layer.c : get_workspace_size16() : line: 157 : build time: Oct 22 2019 - 09:30:52
cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED

cuDNN Error: CUDNN_STATUS_NOT_SUPPORTED: No error
Assertion failed: 0, file ....\src\utils.c, line 293

@zpmmehrdad What GPU do you use?

@zpmmehrdad What GPU do you use?

@AlexeyAB Hi,
My GPU is GTX 1080 ti

@AlexeyAB Hi,
I found the problem. I updated CUDA from version 9.1 to 10.0 and now it works.

Hello @mrhosseini, I have also been studying this field recently. Are you running on Windows? If so, can you send me a copy of your compiled Darknet? I encountered a lot of errors while compiling. My email is [email protected]. I look forward to your reply.

Hi @leiyaohui, unfortunately I use Ubuntu. Try one of the methods here. You may open a new issue if you encounter errors.

Did you write the dilated convolution yourself, or does it come with Darknet itself?

Did you write the dilated convolution yourself, or does it come with Darknet itself?

The dilated convolution is implemented in this repository. You can use this configuration file for the proposed network of the paper, as mentioned above.

