Darknet: About [sam] layer.

Created on 5 Aug 2019  ·  51 Comments  ·  Source: AlexeyAB/darknet

I noticed that you added [sam] layer in darknet. How can we use it?

cfg file with [sam]: yolov3-tiny-sam.cfg.txt

COCO test-dev

| Model | Size | BFLOPS | Inference time, ms | AP@0.5:0.95 | AP@0.5 | AP@0.75 |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 8.4 | 6.4 | 18.8% | 36.8% | 17.5% |
| yolov3-tiny-prn.cfg.txt | 416x416 | 3.5 | 3.8 | - | 33.1% | - |
| enet-coco.cfg.txt | 416x416 | 3.7 | 22.7 | - | 45.5% | - |

ToDo enhancement

Most helpful comment

@AlexeyAB For best inference speed, I may share this model after discussing it with my team.
It reduces the number of parameters by 45%, computation by 38%, CPU computation time by 37%, GPU computation time by 19%, and TX2 computation time by 25%, while maintaining the same AP@0.5 as yolo-v3-tiny.
This model achieves 485 fps on a GTX 1080 Ti (batch size = 1).

All 51 comments


Notice that the number of filters should be equal to that of the `from` layer.
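A minimal sketch of how that constraint appears in a cfg file (the layer offsets and channel counts below are hypothetical, not taken from the attached cfg): a 1x1 convolution with logistic activation produces the attention map, and the [sam] layer multiplies it elementwise with the output of the layer referenced by from=, so the convolution's filters= must equal that layer's channel count.

```
# ... assume the layer at offset -2 (relative to [sam]) has 256 channels ...

# spatial attention map: filters must match the from-layer (256 here)
[convolutional]
filters=256
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2
```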

@WongKinYiu could you please share the cfg file?

@WongKinYiu Thanks for sharing another novel architecture! Would you be kind enough to explain a little about the design? I notice it contains only a single Yolo layer. What about rough COCO AP / inference time on an RTX?

https://github.com/AlexeyAB/darknet/issues/3380#issuecomment-503780307

You can compare it with efficientnet-b0:
https://github.com/AlexeyAB/darknet/issues/3380#issuecomment-517274542

By the way, ThunderNet is a 2-stage detector.
You may need to do some modifications to make it suitable for YOLO.

Oh, I see, this is the CEM + SAM + Yolov3 with 42.0% AP@0.5 and 2.90 BFLOPs? Sounds great, I'll see how it goes and report back. Have you done any other experimental architectures that you would be happy to share? Do you think it might be improved by using a PAN-like head?

@LukeAI If you have time, try to train this model (CEM + SAM + Yolov3 with 42.0% AP@0.5, 2.90 BFLOPs) on this dataset: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

Then I can add the result (Loss & mAP chart, BFLOPS) to this table.

@AlexeyAB Is there any cfg file of CEM + SAM + Yolov3 ?
I will have a try.

enetb0-cemsam.cfg.txt

Because there is no parameter that lets the up-sampling layer restore the feature maps to the size they had before the global average pooling layer, I use a max-pooling layer instead of the global average pooling layer in CEM.
(https://github.com/AlexeyAB/darknet/issues/3380#issuecomment-503780307 uses SPP instead of the global average pooling layer.)

If you get an error while training the model, try setting random=0 in the yolo layer.

yolov3-tiny-sam.cfg.txt
Here you are.

I try to train with:
./darknet detector train my_stuff/bdd100k.data my_stuff/yolov3-tiny-sam.cfg my_stuff/yolov3-tiny.conv.15 -dont_show -mjpeg_port 8090 -map -i 1

But it immediately aborts with:

```
...
[yolo] params: iou loss: mse, iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 4.887
 Allocate additional workspace_size = 1245.71 MB
Loading weights from my_stuff/yolov3-tiny.conv.15...
 seen 64
Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing
608 x 608
Resizing type 15
Cannot resize this type of layer: File exists
darknet: ./src/utils.c:293: error: Assertion `0' failed.
...
```

UPDATE: it works if I set random=0

training now, looking good so far.
what am I missing with random=0?
How could I add scales_x_y to this model?

@LukeAI
For using scale_x_y, please see https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

@WongKinYiu I mean, I know that the scale models have "scale_x_y = 1.05" or something like that in the Yolo layers, I just don't really understand what an appropriate value would be. I could try with 1.05 and just see how that works? or 1.1?

@LukeAI To set an appropriate value, please see https://github.com/AlexeyAB/darknet/issues/3293#issuecomment-497895809
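For intuition, here is a small Python sketch (my own illustration, not code from this repo) of the commonly described effect of scale_x_y on the predicted center offset within a grid cell: the sigmoid output is stretched by the factor and re-centered, so the offset can reach the cell borders without requiring extreme logits.

```python
import math

def box_offset(tx, scale_x_y=1.0):
    """Center offset of a predicted box within its grid cell.

    With scale_x_y = 1.0 this is a plain sigmoid; with e.g. 1.05 the
    output range stretches to roughly [-0.025, 1.025], so the center
    can reach the cell borders without huge logits.
    """
    s = 1.0 / (1.0 + math.exp(-tx))
    return s * scale_x_y - 0.5 * (scale_x_y - 1.0)

print(box_offset(0.0, 1.05))   # ~0.5: a zero logit still maps to the cell center
print(box_offset(10.0, 1.05))  # slightly above 1.0: the cell border is reachable
```

The values discussed in this thread (1.05 or 1.1) stretch the range only slightly; larger values stretch it further.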

Hi all,
Here are some experiments I ran a while back using the Berkeley DeepDrive dataset (with a slightly reduced number of classes).
Baseline:
CEM1.cfg.txt
(chart: CEM1)

With anchors generated from the dataset:
CEM_with_anchors.cfg.txt
(chart: CEM_with_anchors)

Using scale_x_y=1.05
CEM_with_scale.cfg.txt
(chart: CEM_with_scale)

Using swish activations:
CEM_with_swish.cfg.txt
(chart: CEM_with_swish)

For comparison, the same dataset trained with tiny_3l:
(chart: tiny_3l)

and with tiny_pan2:
(chart: tiny_pan2_swish_3)

@LukeAI
So CEM, scale, and swish don't give significant improvements?

Is tiny_pan2 the most accurate network?

Yeah, tiny_pan2 is a good one; here's hoping for a full-sized pan2 network. I didn't measure the inference time. I guess the point of the CEM network is that it is very fast whilst still being reasonably accurate?

@LukeAI Just add a comparison table with final accuracy, FLOPS, and inference time.

I think the main improvement is from more anchors/yolo layers.
In my experiments, yolo-v3-tiny-3l gets 5.7% higher AP@0.5 than yolo-v3-tiny (2l) on a pedestrian detection task.

Here are some results of my backbone (evaluated on the COCO test-dev set):

  1. model A with 2l (6 anchors): 45.0% AP@0.5, 4.04 BFLOPs.
  2. model A with 3l (9 anchors): 46.3% AP@0.5, 5.03 BFLOPs.
  3. model B with 2l (6 anchors): 46.8% AP@0.5, 4.76 BFLOPs.
  4. model B with cem (6 anchors): 45.2% AP@0.5, 4.81 BFLOPs.
  5. model B with cem sam (6 anchors): 46.1% AP@0.5, 4.90 BFLOPs.
  6. model B with modified cem sam (9 anchors): 48.0% AP@0.5, 4.95 BFLOPs.

@WongKinYiu

  6. model B with modified cem sam (9 anchors): 48.0% AP@0.5, 4.95 BFLOPs.

Thanks!
What modifications did you make in model 6?

@AlexeyAB Hello, I'm on a business trip; I'll share the modified cem sam tonight.

@AlexeyAB modified-cem-sam-head.txt

  1. I use SPP instead of global average pooling, because currently this repo cannot support multi-scale training when global average pooling is used as an intermediate layer.
  2. Since YOLO is a one-stage object detector, I add a sam layer for each feature pyramid level.

@WongKinYiu Thanks!
Did you compare inference time for 2. model A with 3l (9 anchors): 46.3% AP@0.5, 5.03 BFLOPs and 6. model B with modified cem sam (9 anchors): 48.0% AP@0.5, 4.95 BFLOPs?

@AlexeyAB Hello,
The sam layer is similar to the scale_channels layer: although it adds only <1% computation, it increases inference time on GPU by 20%~30%. On CPU, they take similar inference time.
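A rough numpy sketch (my own illustration with made-up shapes) of the difference being discussed: scale_channels gates each channel with a single scalar, while a sam layer multiplies elementwise with a full C x H x W attention map. Both are cheap in FLOPs, which is consistent with the extra GPU time coming from memory traffic rather than arithmetic.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
feat = rng.standard_normal((64, 13, 13))          # C x H x W feature map
attn_logits = rng.standard_normal((64, 13, 13))   # e.g. from a 1x1 conv
chan_logits = rng.standard_normal((64, 1, 1))     # e.g. from avgpool + conv

# sam: elementwise product with a same-shaped sigmoid attention map
sam_out = feat * sigmoid(attn_logits)

# scale_channels (SE-style): one scalar gate per channel, broadcast over H x W
se_out = feat * sigmoid(chan_logits)

assert sam_out.shape == se_out.shape == (64, 13, 13)
```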

@WongKinYiu When you find the best cfg-file, please share it and I will add it to this repository.

@AlexeyAB For best inference speed, I may share this model after discussing it with my team.
It reduces the number of parameters by 45%, computation by 38%, CPU computation time by 37%, GPU computation time by 19%, and TX2 computation time by 25%, while maintaining the same AP@0.5 as yolo-v3-tiny.
This model achieves 485 fps on a GTX 1080 Ti (batch size = 1).

@WongKinYiu Thanks!

  • So as I see you didn't use CEM, SAM or Squeeze-and-Excitation blocks in the YOLO-v3-tiny-PRN

  • Also you didn't use Swish, SPP, PAN2, Assisted Excitation, Anti Aliasing, Mixup, scales_x_y, GIoU https://github.com/AlexeyAB/darknet/projects/1

What do you think about these features?

Do you plan to create a model that includes all or most of these features?

SPP and scale_x_y are very useful.
GIoU improves AP@0.5:0.95 but drops AP@0.5. For some cases, AP@0.5 is more important.
PAN2 reduces computation by 13% compared with PAN and reduces AP@0.5 by 0.5% in my experiment.
Mixup cannot benefit lightweight models in my experiments.
SAM and squeeze-and-excitation take too much inference time on GPU.

I think Anti-Aliasing will be useful too.
And I will take a look at Assisted Excitation.

However, I found a strange phenomenon.
When I use repos committed before early March to train a detector, I always get better performance than when using repos committed after May.
If I train the model using the old repo and then validate it using the new repos, the results are worse, too.
So I do not use new features in my model.
I will try to add the features I need to the old repo and test them over the following month.

@WongKinYiu

If I train the model using the old repo and then validate it using the new repos, the results are worse, too.

Maybe only the new accuracy-checking function is different, and the training is just as good?
I fixed a small issue in the mAP function.

GIoU improves AP@0.5:0.95 but drops AP@0.5. For some cases, AP@0.5 is more important.
PAN2 reduces computation by 13% compared with PAN and reduces AP@0.5 by 0.5% in my experiment.
Mixup cannot benefit lightweight models in my experiments.

Did you test it on the MS COCO dataset?

I will add a PAN3 block and a new tiny model today there: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

@AlexeyAB
I upload predicted bounding boxes to CodaLab.
And when I train the same model several times, the old repo always gets better results.

Yes, all of my experiment results are tested on the MS COCO test-dev set.

@WongKinYiu
I added one more model: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-528532293

cfg: https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt

It seems it is the best cfg-file for this small dataset: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

You can try to train it on MS COCO and check the mAP if you have time.

@WongKinYiu Also, can you attach the entire best version of your SAM_CEM model (not only the head)?
I will attach it here and close the issue: https://github.com/AlexeyAB/darknet/issues/3702

@AlexeyAB modified-cem-sam-head.txt

I use SPP instead of global average pooling, because currently this repo cannot support multi-scale training when global average pooling is used as an intermediate layer.
Since YOLO is a one-stage object detector, I add a sam layer for each feature pyramid level.

@AlexeyAB

Thank you for sharing a good model (yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou).

After discussing with my team, I cannot share the backbone of https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-526991664 currently.
I will add the modified-cem-sam-head to yolo-v3-tiny and share the cfg later.

@AlexeyAB
Now training yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr) on the COCO dataset.
I'll report the result after training finishes.

@WongKinYiu Try to increase assisted_excitation=4000 to assisted_excitation=20000 or 50000

COCO test-dev

| Model | Size | AP@0.5:0.95 | AP@0.5 | AP@0.75 |
| :-- | :-: | :-: | :-: | :-: |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 18.8% | 36.8% | 17.5% |

COCO test-dev

| Model | Size | BFLOPS | Inference time, ms | AP@0.5:0.95 | AP@0.5 | AP@0.75 |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 8.4 | 6.4 | 18.8% | 36.8% | 17.5% |
| yolov3-tiny-prn.cfg.txt | 416x416 | 3.5 | 3.8 | - | 33.1% | - |
| enet-coco.cfg.txt | 416x416 | 3.7 | 22.7 | - | 45.5% | - |

@AlexeyAB
I downloaded the latest repo and set the following in the Makefile:

```
GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=0
OPENMP=0
LIBSO=0
ZED_CAMERA=0
```

Then I used yolov3-tiny-sam.cfg.txt from https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-519329865 to train with my own dataset,
but I met the following error:

```
Total BFLOPS 4.883
Allocate additional workspace_size = 52.43 MB
Loading weights from /home/gc/4-images/9.18/darknet/yolov3-tiny.conv.15...
seen 64
Done! Loaded 23 layers from weights-file
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
If error occurs - run training with flag: -dont_show
Resizing
608 x 608
Resizing type 16
Cannot resize this type of layer:
darknet: ./src/utils.c:297: error:
```

@nyj-ocean

Yes, you should modify the resize function of the sam layer; otherwise you can only train it with random=0.

@WongKinYiu

  • sam layers cannot be trained with multi-scale (random=1), is that right?

  • How can I modify the resize function of the sam layer to train with random=1?

@nyj-ocean

Yes.

Just add a case for the sam layer to the resize function in network.c.
It is already defined in sam_layer.c,
so you can simply include it with just a little modification.

@WongKinYiu
Thanks a lot

@AlexeyAB
I notice another module: CBAM (Convolutional Block Attention Module).

Is there a need to add the CBAM module to this repo?

Paper: CBAM: Convolutional Block Attention Module.pdf
Code: https://github.com/Jongchan/attention-module

The kernel functions of the CAM module and the SAM module are SE (squeeze-and-excitation) and SAM, respectively, which are already supported by this repo.
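To make that mapping concrete, here is a hedged numpy sketch of the CBAM data flow (simplified: the shared MLP and the 7x7 convolution of the real module are replaced by plain pooling plus sigmoid gates, so this only illustrates the tensor shapes and the channel-then-spatial order).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_like(feat):
    """Toy CBAM-style pass over a C x H x W feature map.

    Real CBAM feeds the pooled descriptors through a small shared MLP
    (channel attention) and a 7x7 convolution (spatial attention);
    those learned parts are omitted here.
    """
    c, h, w = feat.shape
    # Channel attention (CAM, SE-like): pool over H x W -> one gate per channel
    chan_desc = feat.mean(axis=(1, 2)) + feat.max(axis=(1, 2))
    feat = feat * sigmoid(chan_desc).reshape(c, 1, 1)
    # Spatial attention (SAM-like): pool over channels -> one gate per pixel
    spat_desc = feat.mean(axis=0) + feat.max(axis=0)
    return feat * sigmoid(spat_desc).reshape(1, h, w)

out = cbam_like(np.random.default_rng(1).standard_normal((32, 8, 8)))
assert out.shape == (32, 8, 8)
```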

I added resizing (random=1) for [sam] layers.

@nyj-ocean
Hello, have you tried the CBAM module in YOLOv4?

@WongKinYiu
You mentioned that this repo already supports the SE module, but I can't find the relevant code or how to use it. Can you help me with this? Thank you.

Squeeze-and-Excitation blocks (layers: [avgpool]->[conv]->[conv]->[scale_channels])

@924175302 Example of SE
https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/enet-coco.cfg

```
#squeeze-n-excitation
[avgpool]

# squeeze ratio r=16 (recommended r=16)
[convolutional]
filters=24
size=1
stride=1
activation=swish

# excitation
[convolutional]
filters=384
size=1
stride=1
activation=logistic

# multiply channels
[scale_channels]
from=-4
```