Darknet: About [sam] layer.

Created on 5 Aug 2019  ·  51 Comments  ·  Source: AlexeyAB/darknet

I noticed that you added [sam] layer in darknet. How can we use it?

cfg file with [sam]: yolov3-tiny-sam.cfg.txt

COCO test-dev

| Model | Size | BFLOPS | Inference time, ms | AP@0.5:0.95 | AP@0.5 | AP@0.75 |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 8.4 | 6.4 | 18.8% | 36.8% | 17.5% |
| yolov3-tiny-prn.cfg.txt | 416x416 | 3.5 | 3.8 | - | 33.1% | - |
| enet-coco.cfg.txt | 416x416 | 3.7 | 22.7 | - | 45.5% | - |

ToDo enhancement

Most helpful comment

@AlexeyAB For best inference speed, I may share this model after discussing it with my team.
It reduces the number of parameters by 45%, computation by 38%, CPU computation time by 37%, GPU computation time by 19%, and TX2 computation time by 25%, while maintaining the same AP@0.5 as yolo-v3-tiny.
This model achieves 485 fps on a GTX 1080 Ti (batch size = 1).

All 51 comments


Notice that the number of filters should be equal to that of the `from` layer.
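A minimal sketch of how that constraint appears in a cfg file (the layer offsets and channel counts below are hypothetical, not taken from the attached cfg): a 1x1 convolution with logistic activation produces the attention map, and the [sam] layer multiplies it elementwise with the output of the layer referenced by from=, so the convolution's filters= must equal that layer's channel count.

```
# ... assume the layer at offset -2 (relative to [sam]) has 256 channels ...

# spatial attention map: filters must match the from-layer (256 here)
[convolutional]
filters=256
size=1
stride=1
pad=1
activation=logistic

[sam]
from=-2
```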

@WongKinYiu could you please share the cfg file?

@WongKinYiu Thanks for sharing another novel architecture! Would you be kind enough to explain a little about the design? I notice it contains only a single Yolo layer. What about rough COCO AP / inference time on an RTX?

https://github.com/AlexeyAB/darknet/issues/3380#issuecomment-503780307

You can compare it with efficientnet-b0:
https://github.com/AlexeyAB/darknet/issues/3380#issuecomment-517274542

By the way, ThunderNet is a 2-stage detector.
You may need to do some modifications to make it suitable for YOLO.

Oh, I see, this is the CEM + SAM + Yolov3 with 42.0% AP@0.5 and 2.90 BFLOPs? Sounds great, I'll see how it goes and report back. Have you done any other experimental architectures that you would be happy to share? Do you think it might be improved by using a PAN-like head?

@LukeAI If you have time, try to train this model (CEM + SAM + Yolov3 with 42.0% AP@0.5, 2.90 BFLOPs) on this dataset: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

Then I can add the result (Loss & mAP chart, BFLOPS) to this table.

@AlexeyAB Is there any cfg file of CEM + SAM + Yolov3 ?
I will have a try.

enetb0-cemsam.cfg.txt

Because there is no parameter that lets the up-sampling layer restore the feature maps to the size they had before the global average pooling layer, I use a max-pooling layer instead of the global average pooling layer in CEM.
(https://github.com/AlexeyAB/darknet/issues/3380#issuecomment-503780307 uses SPP instead of the global average pooling layer.)

If you get an error while training the model, try setting random=0 in the yolo layer.

yolov3-tiny-sam.cfg.txt
Here you are.

I try to train with:
./darknet detector train my_stuff/bdd100k.data my_stuff/yolov3-tiny-sam.cfg my_stuff/yolov3-tiny.conv.15 -dont_show -mjpeg_port 8090 -map -i 1

But it immediately aborts with:

```
...
[yolo] params: iou loss: mse, iou_norm: 0.75, cls_norm: 1.00, scale_x_y: 1.00
Total BFLOPS 4.887
 Allocate additional workspace_size = 1245.71 MB
Loading weights from my_stuff/yolov3-tiny.conv.15...
 seen 64
Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing
608 x 608
Resizing type 15
Cannot resize this type of layer: File exists
darknet: ./src/utils.c:293: error: Assertion `0' failed.
...
```

UPDATE: it works if I set random=0

training now, looking good so far.
what am I missing with random=0?
How could I add scales_x_y to this model?

@LukeAI
For using scale_x_y, please see https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

@WongKinYiu I mean, I know that the scale models have "scale_x_y = 1.05" or something like that in the Yolo layers, I just don't really understand what an appropriate value would be. I could try with 1.05 and just see how that works? or 1.1?

@LukeAI To set an appropriate value, please see https://github.com/AlexeyAB/darknet/issues/3293#issuecomment-497895809
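For intuition, here is a small Python sketch (my own illustration, not code from this repo) of the commonly described effect of scale_x_y on the predicted center offset within a grid cell: the sigmoid output is stretched by the factor and re-centered, so the offset can reach the cell borders without requiring extreme logits.

```python
import math

def box_offset(tx, scale_x_y=1.0):
    """Center offset of a predicted box within its grid cell.

    With scale_x_y = 1.0 this is a plain sigmoid; with e.g. 1.05 the
    output range stretches to roughly [-0.025, 1.025], so the center
    can reach the cell borders without huge logits.
    """
    s = 1.0 / (1.0 + math.exp(-tx))
    return s * scale_x_y - 0.5 * (scale_x_y - 1.0)

print(box_offset(0.0, 1.05))   # ~0.5: a zero logit still maps to the cell center
print(box_offset(10.0, 1.05))  # slightly above 1.0: the cell border is reachable
```

The values discussed in this thread (1.05 or 1.1) stretch the range only slightly; larger values stretch it further.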

Hi all,
Here are some experiments I ran a while back using the Berkeley DeepDrive dataset (with a slightly reduced number of classes).
Baseline:
CEM1.cfg.txt
(chart: CEM1)

With anchors generated from the dataset:
CEM_with_anchors.cfg.txt
(chart: CEM_with_anchors)

Using scale_x_y=1.05
CEM_with_scale.cfg.txt
(chart: CEM_with_scale)

Using swish activations:
CEM_with_swish.cfg.txt
(chart: CEM_with_swish)

For comparison, the same dataset trained with tiny_3l:
(chart: tiny_3l)

and with tiny_pan2:
(chart: tiny_pan2_swish_3)

@LukeAI
So CEM, scale, and swish don't give significant improvements?

Is tiny_pan2 the most accurate network?

Yeah, tiny_pan2 is a good one; here's hoping for a full-sized pan2 network. I didn't measure the inference time. I guess the point of the CEM network is that it is very fast whilst still being reasonably accurate?

@LukeAI Just add a comparison table with final accuracy, FLOPS, and inference time.

I think the main improvement is from more anchors/yolo layers.
In my experiments, yolo-v3-tiny-3l gets 5.7% higher AP@0.5 than yolo-v3-tiny (2l) on a pedestrian detection task.

Here are some results of my backbone (evaluated on the COCO test-dev set):

  1. model A with 2l (6 anchors): 45.0% AP@0.5, 4.04 BFLOPs.
  2. model A with 3l (9 anchors): 46.3% AP@0.5, 5.03 BFLOPs.
  3. model B with 2l (6 anchors): 46.8% AP@0.5, 4.76 BFLOPs.
  4. model B with cem (6 anchors): 45.2% AP@0.5, 4.81 BFLOPs.
  5. model B with cem sam (6 anchors): 46.1% AP@0.5, 4.90 BFLOPs.
  6. model B with modified cem sam (9 anchors): 48.0% AP@0.5, 4.95 BFLOPs.

@WongKinYiu

  6. model B with modified cem sam (9 anchors): 48.0% AP@0.5, 4.95 BFLOPs.

Thanks!
What modifications did you make in model 6?

@AlexeyAB Hello, I'm on a business trip; I'll share the modified cem sam tonight.

@AlexeyAB modified-cem-sam-head.txt

  1. I use SPP instead of global average pooling, because currently this repo cannot support multi-scale training when global average pooling is used as an intermediate layer.
  2. Since YOLO is a one-stage object detector, I add a sam layer for each feature pyramid level.

@WongKinYiu Thanks!
Did you compare inference time for 2. model A with 3l (9 anchors): 46.3% AP@0.5, 5.03 BFLOPs and 6. model B with modified cem sam (9 anchors): 48.0% AP@0.5, 4.95 BFLOPs?

@AlexeyAB Hello,
The sam layer is similar to the scale_channels layer: although it adds only <1% computation, it increases inference time on GPU by 20%~30%. On CPU, they take similar inference time.
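A rough numpy sketch (my own illustration with made-up shapes) of the difference being discussed: scale_channels gates each channel with a single scalar, while a sam layer multiplies elementwise with a full C x H x W attention map. Both are cheap in FLOPs, which is consistent with the extra GPU time coming from memory traffic rather than arithmetic.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
feat = rng.standard_normal((64, 13, 13))          # C x H x W feature map
attn_logits = rng.standard_normal((64, 13, 13))   # e.g. from a 1x1 conv
chan_logits = rng.standard_normal((64, 1, 1))     # e.g. from avgpool + conv

# sam: elementwise product with a same-shaped sigmoid attention map
sam_out = feat * sigmoid(attn_logits)

# scale_channels (SE-style): one scalar gate per channel, broadcast over H x W
se_out = feat * sigmoid(chan_logits)

assert sam_out.shape == se_out.shape == (64, 13, 13)
```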

@WongKinYiu When you find the best cfg-file, please share it and I will add it to this repository.

@AlexeyAB For best inference speed, I may share this model after discussing it with my team.
It reduces the number of parameters by 45%, computation by 38%, CPU computation time by 37%, GPU computation time by 19%, and TX2 computation time by 25%, while maintaining the same AP@0.5 as yolo-v3-tiny.
This model achieves 485 fps on a GTX 1080 Ti (batch size = 1).

@WongKinYiu Thanks!

  • So as I see you didn't use CEM, SAM or Squeeze-and-Excitation blocks in the YOLO-v3-tiny-PRN

  • Also you didn't use Swish, SPP, PAN2, Assisted Excitation, Anti Aliasing, Mixup, scales_x_y, GIoU https://github.com/AlexeyAB/darknet/projects/1

What do you think about these features?

Do you plan to create a model that includes all or most of these features?

SPP and scale_x_y are very useful.
GIoU improves AP@0.5:0.95 but drops AP@0.5. For some cases, AP@0.5 is more important.
PAN2 reduces computation by 13% compared with PAN and reduces AP@0.5 by 0.5% in my experiment.
Mixup cannot benefit lightweight models in my experiments.
SAM and squeeze-and-excitation take too much inference time on GPU.

I think Anti-Aliasing will be useful too.
And I will take a look at Assisted Excitation.

However, I found a strange phenomenon.
When I use repos committed before early March to train a detector, I always get better performance than when using repos committed after May.
If I train the model using the old repo and then validate it using the new repos, the results are worse, too.
So I do not use new features in my model.
I will try to add the features I need to the old repo and test them over the following month.

@WongKinYiu

If I train the model using the old repo and then validate it using the new repos, the results are worse, too.

Maybe only the new accuracy-checking function is different, and the training is just as good?
I fixed a small issue in the mAP function.

GIoU improves AP@0.5:0.95 but drops AP@0.5. For some cases, AP@0.5 is more important.
PAN2 reduces computation by 13% compared with PAN and reduces AP@0.5 by 0.5% in my experiment.
Mixup cannot benefit lightweight models in my experiments.

Did you test it on the MS COCO dataset?

I will add a PAN3 block and a new tiny model today there: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

@AlexeyAB
I upload predicted bounding boxes to CodaLab.
And when I train the same model several times, the old repo always gets better results.

Yes, all of my experiment results are tested on the MS COCO test-dev set.

@WongKinYiu
I added one more model: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-528532293

cfg: https://github.com/AlexeyAB/darknet/files/3580764/yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou.cfg.txt

It seems it is the best cfg-file for this small dataset: https://github.com/AlexeyAB/darknet/issues/3114#issuecomment-494148968

You can try to train it on MS COCO and check the mAP if you have time.

@WongKinYiu Also, can you attach the entire best version of your SAM_CEM model (not only the head)?
I will attach it here and close the issue: https://github.com/AlexeyAB/darknet/issues/3702

@AlexeyAB modified-cem-sam-head.txt

I use SPP instead of global average pooling, because currently this repo cannot support multi-scale training when global average pooling is used as an intermediate layer.
Since YOLO is a one-stage object detector, I add a sam layer for each feature pyramid level.

@AlexeyAB

Thank you for sharing a good model (yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou).

After discussing with my team, I cannot share the backbone of https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-526991664 currently.
I will add the modified-cem-sam-head to yolo-v3-tiny and share the cfg later.

@AlexeyAB
Now training yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr) on the COCO dataset.
I'll report the result after training finishes.

@WongKinYiu Try to increase assisted_excitation=4000 to assisted_excitation=20000 or 50000

COCO test-dev

| Model | Size | AP@0.5:0.95 | AP@0.5 | AP@0.75 |
| :-- | :-: | :-: | :-: | :-: |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 18.8% | 36.8% | 17.5% |

COCO test-dev

| Model | Size | BFLOPS | Inference time, ms | AP@0.5:0.95 | AP@0.5 | AP@0.75 |
| :-- | :-: | :-: | :-: | :-: | :-: | :-: |
| yolo_v3_tiny_pan3_aa_ae_mixup_scale_giou (no sgdr).txt | 416x416 | 8.4 | 6.4 | 18.8% | 36.8% | 17.5% |
| yolov3-tiny-prn.cfg.txt | 416x416 | 3.5 | 3.8 | - | 33.1% | - |
| enet-coco.cfg.txt | 416x416 | 3.7 | 22.7 | - | 45.5% | - |

@AlexeyAB
I downloaded the latest repo and set the following in the Makefile:

```
GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=0
OPENMP=0
LIBSO=0
ZED_CAMERA=0
```

Then I used yolov3-tiny-sam.cfg.txt from https://github.com/AlexeyAB/darknet/issues/3708#issuecomment-519329865 to train with my own dataset,
but I met the following error:

```
Total BFLOPS 4.883
Allocate additional workspace_size = 52.43 MB
Loading weights from /home/gc/4-images/9.18/darknet/yolov3-tiny.conv.15...
seen 64
Done! Loaded 23 layers from weights-file
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
If error occurs - run training with flag: -dont_show
Resizing
608 x 608
Resizing type 16
Cannot resize this type of layer:
darknet: ./src/utils.c:297: error:
```

@nyj-ocean

Yes, you should modify the resize function of the sam layer; otherwise you can only train it with random=0.

@WongKinYiu

  • sam layers cannot be trained with multi-scale (random=1), is that right?

  • How can I modify the resize function of the sam layer to train with random=1?

@nyj-ocean

Yes.

Just add a case for the sam layer to the resize function in network.c.
It is already defined in sam_layer.c,
so you can simply include it with just a little modification.

@WongKinYiu
Thanks a lot

@AlexeyAB
I notice another module: CBAM (Convolutional Block Attention Module).

Is there a need to add the CBAM module to this repo?

Paper: CBAM: Convolutional Block Attention Module.pdf
Code: https://github.com/Jongchan/attention-module

The kernel functions of the CAM module and the SAM module are SE (squeeze-and-excitation) and SAM, respectively, which are already supported by this repo.
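To make that mapping concrete, here is a hedged numpy sketch of the CBAM data flow (simplified: the shared MLP and the 7x7 convolution of the real module are replaced by plain pooling plus sigmoid gates, so this only illustrates the tensor shapes and the channel-then-spatial order).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_like(feat):
    """Toy CBAM-style pass over a C x H x W feature map.

    Real CBAM feeds the pooled descriptors through a small shared MLP
    (channel attention) and a 7x7 convolution (spatial attention);
    those learned parts are omitted here.
    """
    c, h, w = feat.shape
    # Channel attention (CAM, SE-like): pool over H x W -> one gate per channel
    chan_desc = feat.mean(axis=(1, 2)) + feat.max(axis=(1, 2))
    feat = feat * sigmoid(chan_desc).reshape(c, 1, 1)
    # Spatial attention (SAM-like): pool over channels -> one gate per pixel
    spat_desc = feat.mean(axis=0) + feat.max(axis=0)
    return feat * sigmoid(spat_desc).reshape(1, h, w)

out = cbam_like(np.random.default_rng(1).standard_normal((32, 8, 8)))
assert out.shape == (32, 8, 8)
```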

I added resizing (random=1) for [sam] layers.

@nyj-ocean
Hello, have you tried the CBAM module in YOLOv4?

@WongKinYiu
You mentioned that this repo already supports the SE module, but I can't find the relevant code or how to use it. Can you help me with this? Thank you.

Squeeze-and-Excitation blocks (layers: [avgpool]->[conv]->[conv]->[scale_channels])

@924175302 Example of SE
https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/enet-coco.cfg

```
#squeeze-n-excitation
[avgpool]

# squeeze ratio r=16 (recommended r=16)
[convolutional]
filters=24
size=1
stride=1
activation=swish

# excitation
[convolutional]
filters=384
size=1
stride=1
activation=logistic

# multiply channels
[scale_channels]
from=-4
```