Darknet: [feature request] anti-aliasing within the network ~+1-2% Top1

Created on 28 Jul 2019 · 35 comments · Source: AlexeyAB/darknet

This technique is reported to give a small, "free" boost to accuracy, mitigating aliasing effects within the network:
https://github.com/adobe/antialiased-cnns

Labels: Does not improve · enhancement

All 35 comments

  • Instead of [convolutional] size=3 stride=2 filters=256 activation=leaky antialiasing=1
    the following will be used:
    [convolutional] size=3 stride=1 filters=256 activation=leaky
    [convolutional] size=3 stride=2 filters=256 groups=256 activation=linear with hard-coded weights:

| | | |
| :-: | :-: | :-: |
| 1/16 | 2/16 | 1/16 |
| 2/16 | 4/16 | 2/16 |
| 1/16 | 2/16 | 1/16 |


  • Instead of [maxpool] size=2 stride=2 antialiasing=1
    the following will be used:
    [maxpool] size=2 stride=1
    [convolutional] size=3 stride=2 filters=N_channels groups=N_channels activation=linear with hard-coded weights:

| | | |
| :-: | :-: | :-: |
| 1/16 | 2/16 | 1/16 |
| 2/16 | 4/16 | 2/16 |
| 1/16 | 2/16 | 1/16 |
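The replacement above can be sketched in a few lines of numpy. This is only an illustration of the idea, not Darknet's actual C code; `maxpool` and `blur_downsample` are hypothetical helper names, and zero padding is used for simplicity (the paper uses reflection padding):

```python
import numpy as np

# Fixed tri-3 blur kernel from the table above (sums to 1).
BLUR = np.array([[1, 2, 1],
                 [2, 4, 2],
                 [1, 2, 1]], dtype=np.float32) / 16.0

def maxpool(x, size, stride):
    """Naive max pooling over a single-channel 2D map ('valid' extent)."""
    h = (x.shape[0] - size) // stride + 1
    w = (x.shape[1] - size) // stride + 1
    out = np.empty((h, w), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

def blur_downsample(x, stride=2):
    """Stride-2 correlation with the fixed blur kernel (zero padding of 1)."""
    xp = np.pad(x, 1)
    h = (x.shape[0] + 1) // stride
    w = (x.shape[1] + 1) // stride
    out = np.empty((h, w), dtype=np.float32)
    for i in range(h):
        for j in range(w):
            out[i, j] = (xp[i*stride:i*stride+3, j*stride:j*stride+3] * BLUR).sum()
    return out
```

With a 4x4 input, `blur_downsample(maxpool(x, 2, 1))` produces the same 2x2 output shape as a plain `maxpool(x, 2, 2)`, but the downsampling step is now shift-robust.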


  • project page: https://richzhang.github.io/antialiased-cnns/

  • paper: https://arxiv.org/abs/1904.11486

  • video: https://www.youtube.com/watch?time_continue=74&v=HjewNBZz00w

Maybe it is better to use tri-3: Triangle-3 blurring with coefficients [1, 2, 1], i.e. bilinear down-sampling:

| | | |
| :-: | :-: | :-: |
| 1 | 2 | 1 |
| 2 | 4 | 2 |
| 1 | 2 | 1 |



bin-5 - Binomial-5 - just take a 5x5 window, multiply the elements by [1., 4., 6., 4., 1.] along x and y, and divide the result by 256: https://github.com/adobe/antialiased-cnns/blob/430d54870a2c1c5b258fd38f5f796df44aefee79/models_lpf/__init__.py#L39

See page 1, first table, index N = 4: http://web.archive.org/web/20100621232359/http://www-personal.engin.umd.umich.edu/~jwvm/ece581/21_GBlur.pdf

kernel_size = 5
stride = 1
coefficient weights =

| | | | | |
| :-: | :-: | :-: | :-: | :-: |
| 1 | 4 | 6 | 4 | 1 |
| 4 | 16 | 24 | 16 | 4 |
| 6 | 24 | 36 | 24 | 6 |
| 4 | 16 | 24 | 16 | 4 |
| 1 | 4 | 6 | 4 | 1 |
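A quick numpy sketch (illustrative, not from the repo; `binomial_kernel` and `blur_kernel_2d` are hypothetical names) showing how both the tri-3 and bin-5 kernels arise as outer products of Pascal's-triangle rows:

```python
import numpy as np

def binomial_kernel(n):
    """n-tap 1D binomial row, e.g. n=3 -> [1, 2, 1], n=5 -> [1, 4, 6, 4, 1]."""
    row = np.array([1.0])
    for _ in range(n - 1):
        row = np.convolve(row, [1.0, 1.0])  # repeated [1, 1] convolution builds Pascal's triangle
    return row

def blur_kernel_2d(n):
    """Separable 2D blur kernel, normalized so it sums to 1."""
    row = binomial_kernel(n)
    return np.outer(row, row) / row.sum() ** 2

# blur_kernel_2d(3) * 16  reproduces the tri-3 table [[1,2,1],[2,4,2],[1,2,1]]
# blur_kernel_2d(5) * 256 reproduces the bin-5 table above
```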


https://richzhang.github.io/antialiased-cnns/resources/antialias_mod.jpg


Different blurs are used:

  • rect-2: Rectangle-2
  • tri-3: Triangle-3
  • bin-5: Binomial-5 - binomial blur works by repeatedly applying a 3x3 or 5x5 kernel based on Pascal's triangle to blur the image: http://web.archive.org/web/20100621232359/http://www-personal.engin.umd.umich.edu/~jwvm/ece581/21_GBlur.pdf





https://richzhang.github.io/antialiased-cnns/resources/imagenet_ind2_noalex.jpg

You added antialiasing=1 to convolutional layers? Awesome! So I can test it by adding that parameter to every [convolutional] layer throughout the .cfg?

@LukeAI
Yes.

  • you can add antialiasing=1 to every [convolutional] layer that has stride>1, stride_x>1, or stride_y>1, since antialiasing only has meaning for stride>1

  • after that you should re-train / fine-tune your model

  • today I will try to implement it on CPU too

  • today I will try to add antialiasing=1 for [maxpool] with stride>1 or stride_x>1 or stride_y>1
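As a concrete example, the cfg edit described above might look like this (a sketch only; the filter count and activation are placeholders taken from the snippets earlier in the thread):

```ini
# a stride-2 downsampling layer before the change
[convolutional]
size=3
stride=2
filters=256
activation=leaky

# the same layer with the fixed-blur anti-aliasing enabled
[convolutional]
size=3
stride=2
filters=256
activation=leaky
antialiasing=1
```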


I think some of these features should solve the re-identification problem (the blinking issue).

ok, I'll wait until you have added antialiasing to maxpool before I retrain.

@LukeAI Did you understand that we should use antialiasing=1 for every stride=2 layer except the 1st stride=2 layer?

@LukeAI I added antialiasing=1 for [maxpool] with stride>1 or stride_x>1 or stride_y>1

ok, well I have done as you suggested (see attached) and am trying it out now. I hadn't realised that almost all of yolov3-spp is stride=1, so I guess this won't make too much difference, but I'll let you know.
yolo_v3_spp_antialias.cfg.txt

ImageNet
BFLOPs: 0.858
Top-1: 56.3 (expected value is ~60)
Top-5: 79.5

andarknet-imagenet_final.zip

ImageNet
BFLOPs: 0.970
Top-1: 54.5 (expected value is ~60)
Top-5: 77.9

andarknet.zip

I also trained another two DenseNet-based models.
All of these models get worse results after adding antialiasing=1.

@AlexeyAB @LukeAI Hi, did you try to set random=1 when you added antialiasing=1? There seems to be a bug when both random=1 and antialiasing=1: even with subdivisions=64 I get 'out of CUDA memory', but with the max image size (e.g. 608 in my case) and random=0, training works normally.

random=1 does indeed increase the memory requirements, so this probably isn't a bug. If you want to use random=1, try decreasing the training resolution; you can always increase it again at inference time.

I just tried training with antialiasing=1 in convolutional layers with stride=2, except for the very first one. I found that it made no real difference.
with antialiasing:
chart_antialias
without:
chart

@LukeAI What dataset did you use? And what model did you use?

It was a private urban roads dataset.
yolo_v3_spp_scale_swish_aa.cfg.txt

@LukeAI Also try to get cfg/weights without antialiasing=1

  1. check mAP
  2. add antialiasing=1 and check mAP again without retraining, will be mAP higher?

Have tried doing so, adding antialiasing=1 led to broadly worse results - mostly weaker recall.
model trained and evaluated without aa:

class_id = 0, name = Car, ap = 52.54%, Precision = 0.61, Recall = 0.51, avg IOU = 0.49, TP = 233, FP = 149
class_id = 1, name = Person, ap = 77.06%, Precision = 0.93, Recall = 0.76, avg IOU = 0.69, TP = 358, FP = 29
class_id = 2, name = Truck, ap = 63.67%, Precision = 0.75, Recall = 0.61, avg IOU = 0.58, TP = 476, FP = 161
class_id = 3, name = Traffic_light, ap = 56.18%, Precision = 0.56, Recall = 0.68, avg IOU = 0.38, TP = 116, FP = 92
class_id = 4, name = Trailer, ap = 71.56%, Precision = 0.83, Recall = 0.67, avg IOU = 0.66, TP = 268, FP = 56

 for conf_thresh = 0.10, precision = 0.75, recall = 0.64, F1-score = 0.69 
 for conf_thresh = 0.10, TP = 1451, FP = 487, FN = 826, average IoU = 57.76 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.642031, or 64.20 %

same model, weights, with aa added to cfg

class_id = 0, name = Car, ap = 44.06%, Precision = 0.92, Recall = 0.41, avg IOU = 0.74, TP = 188, FP = 17
class_id = 1, name = Person, ap = 62.32%, Precision = 0.92, Recall = 0.54, avg IOU = 0.66, TP = 254, FP = 21
class_id = 2, name = Truck, ap = 39.23%, Precision = 0.76, Recall = 0.33, avg IOU = 0.58, TP = 260, FP = 83
class_id = 3, name = Traffic_light, ap = 34.33%, Precision = 0.48, Recall = 0.48, avg IOU = 0.33, TP = 81, FP = 88
class_id = 4, name = Trailer, ap = 54.51%, Precision = 0.83, Recall = 0.48, avg IOU = 0.65, TP = 190, FP = 39

 for conf_thresh = 0.10, precision = 0.80, recall = 0.43, F1-score = 0.56 
 for conf_thresh = 0.10, TP = 973, FP = 248, FN = 1304, average IoU = 60.17 % 

 IoU threshold = 50 %, used Area-Under-Curve for each unique Recall 
 mean average precision (mAP@0.50) = 0.468911, or 46.89 %

So maybe it doesn't give any advantage for this dataset.

Did you check the mAP on a separate validation dataset?

@LukeAI
I added antialiasing=2 so you can try to use it. It uses 2x2 filters instead of 3x3 filters.
There are also several changes:

  • Use ignore_thresh only if class_id matched.
  • Temporarily changed Assisted_Excitation (it now reduces background activations rather than enhancing object activations).
  • Added antialiasing=2 for 2x2 filters.

Hey, I'll give this another go when I get GPU time - so should I add antialiasing=2 to all conv layers with stride=2 except the first one?

@LukeAI Yes. But I don't know whether it will bring any improvement in mAP.

I think it is better to try the iou_thresh=0.3 param in the [yolo] layers.

@WongKinYiu
Did you try AntiAliasing, and did you get any boost?
Either I didn't understand it correctly (description of my understanding: https://github.com/AlexeyAB/darknet/issues/3672#issuecomment-515779175), or the +1-2% Top1 from AntiAliasing is just fake?

@AlexeyAB

No, I did not get any boost in my experiments. https://github.com/AlexeyAB/darknet/issues/3672#issuecomment-533883993

I think it is because we use shift-based data augmentation (random crop).

@WongKinYiu Yes, random crop solves the shift issue.
Random crop lets the network memorize all shifts. But I thought antialiasing=1 might remove the need to memorize shifts, so accuracy would stay the same while requiring fewer filters.
But it seems antialiasing=1 even decreases accuracy: https://github.com/AlexeyAB/darknet/issues/3672#issuecomment-533883993

@AlexeyAB
Yes, it seems to decrease accuracy in this implementation.
I think we need to do the corresponding back-propagation for the anti-aliasing.

@AlexeyAB Oh!

My models were trained before 22 Sep; maybe I should retrain them to get accurate results.

And do you think we need the corresponding back-propagation for anti-aliasing pooling?
For example, global avgpool does

state.delta[in_index] += l.delta[out_index] / (l.h*l.w)

then global anti-aliasing needs to do

state.delta[in_index] += l.delta[out_index] * (blur_mask[i] / sum_of_blur_mask);

For normal anti-aliasing we also need to do the corresponding back-propagation.
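The suggested rule can be checked with a small numpy sketch (illustrative only; `blur_forward`/`blur_backward` are hypothetical names, and the fixed mask is the tri-3 kernel discussed above). Since the blur is a fixed-weight convolution, its backward pass scatters each output delta back over its input window, scaled by the normalized blur weight:

```python
import numpy as np

BLUR = np.array([[1., 2., 1.],
                 [2., 4., 2.],
                 [1., 2., 1.]])

def blur_forward(x, stride=2):
    """Stride-2 'valid' correlation with the normalized blur mask."""
    k = BLUR / BLUR.sum()
    h = (x.shape[0] - 3) // stride + 1
    w = (x.shape[1] - 3) // stride + 1
    y = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            y[i, j] = (x[i*stride:i*stride+3, j*stride:j*stride+3] * k).sum()
    return y

def blur_backward(delta_out, in_shape, stride=2):
    """delta_in += delta_out * (blur_mask / sum_of_blur_mask) over each window."""
    k = BLUR / BLUR.sum()
    delta_in = np.zeros(in_shape)
    for i in range(delta_out.shape[0]):
        for j in range(delta_out.shape[1]):
            delta_in[i*stride:i*stride+3, j*stride:j*stride+3] += delta_out[i, j] * k
    return delta_in
```

Because the forward pass is linear, `blur_backward` agrees exactly with a finite-difference gradient check of `blur_forward`.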

@WongKinYiu

Do you mean this?

[avgpool]
antialiasing=1

state.delta[in_index] += l.delta[out_index] * (blur_mask[i] / sum_of_blur_mask);

How do we get blur_mask[]?

for normal anti-aliasing we also need do corresponding back-propagation.

What is "normal" anti-aliasing?
I implemented anti-aliasing just as a common depth-wise [convolutional] layer with fixed weights.

@AlexeyAB

I mean

[maxpool]
antialiasing=1

for example,
currently the blur_mask for blur_size=2 is:

| | |
| :-: | :-: |
| 1 | 1 |
| 1 | 1 |

It is equivalent to doing maxpool(size=2, stride=1) and then avgpool(size=2, stride=2) in the forward pass.
But the backward pass seems to only consider the maxpool part.
We need to do the backward pass of avgpool(size=2, stride=2) and then the backward pass of maxpool(size=2, stride=1).

The blur_mask for blur_size != 2 is:

| | | |
| :-: | :-: | :-: |
| 1 | 2 | 1 |
| 2 | 4 | 2 |
| 1 | 2 | 1 |

Blur down-sampling can be seen as a convolutional layer with constant weights, so we need to do the corresponding backward pass for the blur down-sampling.
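The maxpool half of that chain can be sketched in numpy as follows (hypothetical helper names, not the Darknet implementation; the avgpool/blur half back-propagates through its constant weights in the same way as any fixed convolution):

```python
import numpy as np

def maxpool_s1_forward(x):
    """2x2, stride-1 max pooling; also records the argmax position per window."""
    h, w = x.shape[0] - 1, x.shape[1] - 1
    y = np.empty((h, w))
    arg = np.empty((h, w, 2), dtype=int)
    for i in range(h):
        for j in range(w):
            win = x[i:i+2, j:j+2]
            r, c = np.unravel_index(win.argmax(), (2, 2))
            y[i, j] = win[r, c]
            arg[i, j] = (i + r, j + c)  # absolute input coordinates of the max
    return y, arg

def maxpool_s1_backward(delta_out, arg, in_shape):
    """Each delta flows only to the input element that won the max."""
    delta_in = np.zeros(in_shape)
    for i in range(delta_out.shape[0]):
        for j in range(delta_out.shape[1]):
            r, c = arg[i, j]
            delta_in[r, c] += delta_out[i, j]
    return delta_in
```

With antialiasing enabled, the full backward pass would first spread the layer's output deltas through the constant blur weights and then feed the result into `maxpool_s1_backward`, mirroring the forward order in reverse.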

@WongKinYiu There is back-propagation for antialiasing in the [maxpool] layer for training on GPU: https://github.com/AlexeyAB/darknet/blob/649abac372446e6c0114e8fbc9bbbb8b226318b9/src/maxpool_layer_kernels.cu#L195-L209

I added it for GPU, but I didn't add it for CPU, because no one trains on the CPU anyway. And if it does not work, then I will remove anti-aliasing altogether.

Are you currently trying AntiAliasing for the Classifier or for the Detector?

@AlexeyAB Hello,

I will retrain the models next week.

@AlexeyAB

| model | top-1 | top-5 |
| :-: | :-: | :-: |
| original Model A | 70.9 | 90.2 |
| old aa Model A | 69.8 | 89.5 |
| new aa Model A | 69.9 | 89.4 |
| | | |
| original Model B | 70.2 | 89.7 |
| old aa Model B | 68.9 | 88.9 |
| new aa Model B | 68.9 | 88.8 |

@WongKinYiu Thanks! So I think it should be removed.

@AlexeyAB can we still use "antialiasing=1" in our cfg?

@israfila3 It is deprecated, so I will remove it in about 2 months, since it doesn't give any advantage.

@AlexeyAB thanks for your reply.
Actually I am writing a report and wanted to include the "antialiasing" results in it.
Is there any chance to use it?
I updated this Darknet repository on 20 December.

@israfila3 Yes. It works, but only if random=0 in the cfg-file.

