Darknet: the route of yolov4-tiny.cfg

Created on 29 Jun 2020 · 21 comments · Source: AlexeyAB/darknet

[route]
layers=-1
groups=2
group_id=1

What does "groups=2 group_id=1" mean?

Most helpful comment

If width * height * channel is w * h * c, the feature map will be fm[0:w, 0:h, 0:c].
groups = 2, group_id = 0 gets fm[0:w, 0:h, 0:c/2].
groups = 2, group_id = 1 gets fm[0:w, 0:h, c/2:c].

All 21 comments

Split the previous layer (layers=-1) into two parts along the channel dimension (groups=2) and route the second part (group_id=1; ids start from 0).


Hey, can I ask what "channel" means here?

The size of a feature map is width * height * channel.


Oh I see, thanks. But how can we separate a layer into two parts? Can you please explain?

If width * height * channel is w * h * c, the feature map will be fm[0:w, 0:h, 0:c].
groups = 2, group_id = 0 gets fm[0:w, 0:h, 0:c/2].
groups = 2, group_id = 1 gets fm[0:w, 0:h, c/2:c].
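
In PyTorch terms, that slicing could be sketched like this (a minimal illustration; `route_group` is a hypothetical helper name, not darknet's actual code):

```python
import torch

def route_group(fm, groups=2, group_id=1):
    # Split the (N, C, H, W) feature map into `groups` equal chunks along
    # the channel axis and return the chunk selected by `group_id`.
    return fm.chunk(groups, dim=1)[group_id]

x = torch.randn(1, 64, 13, 13)       # C = 64
print(route_group(x, 2, 0).shape)    # torch.Size([1, 32, 13, 13]) -> channels 0..31
print(route_group(x, 2, 1).shape)    # torch.Size([1, 32, 13, 13]) -> channels 32..63
```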

@WongKinYiu I don't understand why (viewing it in Netron) the number of channels c is 64.

(screenshot of the Netron graph)

Shouldn't it be 32 because of c/2?

Maybe it is because darknet uses group_id but netron uses groups_id.
https://github.com/AlexeyAB/darknet/blob/master/src/parser.c#L1036
https://github.com/lutzroeder/netron/blob/master/src/darknet-metadata.json#L281

You can try modifying the code of netron and generating the graph again.

@WongKinYiu I just tried changing group_id to groups_id, but both produce the same result. I should open an issue in netron, I guess.

@WongKinYiu Hi

Does yolov4-tiny only use group_id=1? Are that many channels just thrown away?

No, the cross stage connection routes all of the channels of the base layer in yolov4-tiny. (See the cfg sketch below.)
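
For reference, the CSP block in yolov4-tiny.cfg looks roughly like this (a paraphrase of the first block with comments added; check the actual cfg for exact filter counts). The final `layers = -6,-1` route concatenates the untouched base layer back in, so no channels are lost:

```
# base layer: 3x3 convolution, 64 filters
[convolutional]
filters=64

# take the second half (32 of 64 channels)
[route]
layers=-1
groups=2
group_id=1

# ... two 3x3 convolutions with 32 filters each ...

# concatenate the two conv outputs (-> 64 channels), followed by a 1x1 conv
[route]
layers=-1,-2

# concatenate with the untouched 64-channel base layer -> 128 channels
[route]
layers=-6,-1
```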

@WongKinYiu I think I understand now.
You mean that only this route layer uses half of the channels, but the previous layer, which keeps all of the channels, is concatenated back in later?

(diagram of the proposed block structure)

Is that right?

```python
class ResConv2dBatchLeaky(nn.Module):
    def __init__(self, in_channels, inter_channels, kernel_size, stride=1, leaky_slope=0.1, return_extra=False):
        super(ResConv2dBatchLeaky, self).__init__()

        self.return_extra = return_extra
        self.in_channels = in_channels
        self.inter_channels = inter_channels
        self.kernel_size = kernel_size
        self.stride = stride
        if isinstance(kernel_size, (list, tuple)):
            self.padding = [int(ii / 2) for ii in kernel_size]
        else:
            self.padding = int(kernel_size / 2)
        self.leaky_slope = leaky_slope

        self.layers0 = Conv2dBatchLeaky(self.in_channels // 2, self.inter_channels, self.kernel_size, self.stride,
                                        self.padding)
        self.layers1 = Conv2dBatchLeaky(self.inter_channels, self.inter_channels, self.kernel_size, self.stride,
                                        self.padding)
        self.layers2 = Conv2dBatchLeaky(self.in_channels, self.in_channels, 1, 1, 0)

    def forward(self, x):
        y0 = x
        channel = x.shape[1]
        # route groups=2, group_id=1: take the second half of the channels
        x0 = x[:, channel // 2:, ...]
        x1 = self.layers0(x0)
        x2 = self.layers1(x1)
        # route layers=-1,-2: concatenate the two conv outputs
        x3 = torch.cat((x2, x1), dim=1)
        x4 = self.layers2(x3)
        # cross stage connection: concatenate with the untouched input
        x = torch.cat((y0, x4), dim=1)
        if self.return_extra:
            return x, x4
        else:
            return x
```
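
A quick shape check of this block (my own example; it assumes Conv2dBatchLeaky preserves spatial size, as in the code above): the output has twice the input channels, since the untouched input is concatenated with x4:

```python
block = ResConv2dBatchLeaky(in_channels=64, inter_channels=32, kernel_size=3)
y = block(torch.randn(1, 64, 104, 104))
print(y.shape)  # torch.Size([1, 128, 104, 104]): 64 (input) + 64 (x4)
```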

@WongKinYiu

Is the whole model structure as follows? Is this right?

```python
class TinyYolov4(nn.Module):
    def __init__(self, pretrained=False):
        super(TinyYolov4, self).__init__()

        # Network
        backbone = OrderedDict([
            ('0_convbatch', vn_layer.Conv2dBatchLeaky(3, 32, 3, 2)),
            ('1_convbatch', vn_layer.Conv2dBatchLeaky(32, 64, 3, 2)),
            ('2_convbatch', vn_layer.Conv2dBatchLeaky(64, 64, 3, 1)),
            ('3_resconvbatch', vn_layer.ResConv2dBatchLeaky(64, 32, 3, 1)),
            ('4_max', nn.MaxPool2d(2, 2)),
            ('5_convbatch', vn_layer.Conv2dBatchLeaky(128, 128, 3, 1)),
            ('6_resconvbatch', vn_layer.ResConv2dBatchLeaky(128, 64, 3, 1)),
            ('7_max', nn.MaxPool2d(2, 2)),
            ('8_convbatch', vn_layer.Conv2dBatchLeaky(256, 256, 3, 1)),
            ('9_resconvbatch', vn_layer.ResConv2dBatchLeaky(256, 128, 3, 1, return_extra=True)),
        ])

        head = [
            OrderedDict([
                ('10_max', nn.MaxPool2d(2, 2)),
                ('11_conv', vn_layer.Conv2dBatchLeaky(512, 512, 3, 1)),
                ('12_conv', vn_layer.Conv2dBatchLeaky(512, 256, 1, 1)),
            ]),

            OrderedDict([
                # 3 * (5 + 80): 3 anchors x (4 box coords + objectness + 80 classes)
                ('13_conv', vn_layer.Conv2dBatchLeaky(256, 512, 3, 1)),
                ('14_conv', nn.Conv2d(512, 3 * (5 + 80), 1)),
            ]),

            OrderedDict([
                ('15_convbatch', vn_layer.Conv2dBatchLeaky(256, 128, 1, 1)),
                ('16_upsample', nn.Upsample(scale_factor=2)),
            ]),

            OrderedDict([
                ('17_convbatch', vn_layer.Conv2dBatchLeaky(384, 256, 3, 1)),
                ('18_conv', nn.Conv2d(256, 3 * (5 + 80), 1)),
            ]),
        ]

        self.backbone = nn.Sequential(backbone)
        self.head = nn.ModuleList([nn.Sequential(layer_dict) for layer_dict in head])
        self.init_weights(pretrained)

    def forward(self, x):
        # '9_resconvbatch' has return_extra=True, so the backbone returns a tuple
        stem, extra_x = self.backbone(x)
        stage0 = self.head[0](stem)
        head0 = self.head[1](stage0)  # coarser scale

        stage1 = self.head[2](stage0)
        stage2 = torch.cat((stage1, extra_x), dim=1)
        head1 = self.head[3](stage2)  # finer scale
        head = [head1, head0]
        return head
```
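
As a sanity check of the two output scales (my own example; it assumes a 416x416 input, 'same'-padded convolutions, and that vn_layer and init_weights from the poster's codebase are available):

```python
model = TinyYolov4()
head1, head0 = model(torch.randn(1, 3, 416, 416))
print(head1.shape)  # torch.Size([1, 255, 26, 26]) -- finer scale
print(head0.shape)  # torch.Size([1, 255, 13, 13]) -- coarser scale
```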

I think yes.


How come v4 tiny uses the route groups in its cfg file, but the full v4 cfg does not? It looks like they're both CSP-based, but I'm not sure why only v4 tiny uses the route groups. Is it just to increase speed by cutting the feature size in half?

groups and group_id were implemented after CSPNet was designed, so newer CSP models such as yolov4-tiny implement CSP using route groups.

Interesting. So if YOLOv4 had not been published until later, it would have used the route groups? And I assume the functionality would have been the same as it is currently, just a different implementation? Thanks.

Yes, the two implementations are equivalent.

yolov4 is based on ResNet: it splits the channels in the base layer and removes the bottleneck of the res layers, so the two paths end up as {2x, 2x}.
yolov4-tiny is based on VoVNet: if we split the channels in the base layer too, the two paths would be {1x, 3x}, so we split inside the computational block instead to make them {2x, 2x}. This modification is used for optimizing memory bandwidth.

If width * height * channel is w * h * c, the feature map will be fm[0:w-1, 0:h-1, 0:c-1] (inclusive indices).
groups = 2, group_id = 0 gets fm[0:w-1, 0:h-1, 0:c/2-1].
groups = 2, group_id = 1 gets fm[0:w-1, 0:h-1, c/2:c-1].
