Darknet: What is the meaning of the route layer and reorg layer? What do they do?

Created on 6 Jul 2017 · 2 Comments · Source: AlexeyAB/darknet

Hello guys, I have a question.
In yolo-voc.2.0.cfg, the definitions of the route and reorg layers are confusing to me:
[route]
layers=-9

[reorg]
stride=2

[route]
layers=-1,-3

From what I see in the console window when I load the YOLO weights:

At layer 25, the filter size is shown as 16, but nothing in the cfg relates to 16, only layers=-9. After that, the output of layer 24 (13x13x1024) seems to be transformed into 26x26x512 at layer 26. So my question is: what happens in layer 25?
For layer 26, I would guess that the feature maps get reorganized to shrink them, but could anyone explain further what happens at the pixel level?
I am sorry if my question is stupid!


phongnhhn92

Most helpful comment

Hi, @phongnhhn92

The route layer is not a convolutional layer, so the values -1, -3 are not filter sizes; they are relative indexes of the layers whose outputs are used as input.
Yes, for example, if we use [reorg] stride=2 and the input is 2x2x8, then the output is 1x1x32. The values themselves are not changed, only rearranged.
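To make the reorg arithmetic concrete, here is a minimal Python sketch. The function names `reorg_shape` and `space_to_depth` are made up for illustration and are not part of darknet. The channel ordering used below is just one plausible choice; darknet's own reorg may order the output elements differently, so treat this only as a demonstration of the shape change and of the fact that no values are modified.

```python
def reorg_shape(width, height, channels, stride):
    """Return (width, height, channels) after a reorg with the given stride."""
    if width % stride or height % stride:
        raise ValueError("width and height must be divisible by stride")
    return width // stride, height // stride, channels * stride * stride


def space_to_depth(fmap, stride):
    """Fold every stride x stride block of fmap[y][x][c] into a single pixel.

    Assumption: channels of the block are appended row by row; darknet's
    actual memory layout is not reproduced here.
    """
    height, width = len(fmap), len(fmap[0])
    out = []
    for y in range(0, height, stride):
        row = []
        for x in range(0, width, stride):
            pixel = []
            for dy in range(stride):
                for dx in range(stride):
                    pixel.extend(fmap[y + dy][x + dx])
            row.append(pixel)
        out.append(row)
    return out


# A 2x2 feature map with 8 channels, filled with the values 0..31.
fmap = [[[(y * 2 + x) * 8 + c for c in range(8)] for x in range(2)]
        for y in range(2)]
out = space_to_depth(fmap, 2)

print(reorg_shape(2, 2, 8, 2))                # (1, 1, 32)
print(len(out), len(out[0]), len(out[0][0]))  # 1 1 32
print(sorted(out[0][0]) == list(range(32)))   # True: same values, new layout
```

With the sizes from the question, `reorg_shape(26, 26, 512, 2)` gives `(13, 13, 2048)`, which matches the 13x13x2048 used for layer 26 in the explanation that follows.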

Normally, each layer takes the output of the preceding layer as its input and then processes it (convolutional, max-pool, ...). However:

If we use [route] layers=-1, we simply take as input the output of the preceding layer (current_layer_number-1), without any processing.
If we use [route] layers=-2, we take as input the output of the layer with index = (current_layer_number-2), without any processing.
If we use [route] layers=-1,-3, we take as input the outputs of the layers with indexes = (current_layer_number-1) and (current_layer_number-3), and merge them into one layer.
If layer 27 is [route] layers=-1,-3, it takes the outputs of two layers, 26 = (27-1) and 24 = (27-3), and merges them along depth: 13x13x1024 + 13x13x2048 = 13x13x3072, which is the output of layer 27.
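The index arithmetic above can be sketched in a few lines of Python. The helper names `resolve_route` and `route_shape` are hypothetical, and the `shapes` dictionary only holds the two layer sizes quoted in this thread. Treating non-negative entries as absolute indexes is an assumption of this sketch.

```python
def resolve_route(current_index, layers):
    """Convert relative (negative) route entries into absolute layer indexes.

    Assumption: non-negative entries are already absolute indexes.
    """
    return [current_index + l if l < 0 else l for l in layers]


def route_shape(current_index, layers, shapes):
    """Return (w, h, c) after concatenating the routed outputs along depth."""
    sources = [shapes[i] for i in resolve_route(current_index, layers)]
    w, h, _ = sources[0]
    if any((sw, sh) != (w, h) for sw, sh, _ in sources):
        raise ValueError("routed layers must share width and height")
    return w, h, sum(c for _, _, c in sources)


# Output sizes of layers 24 and 26 as given in the explanation above.
shapes = {24: (13, 13, 1024), 26: (13, 13, 2048)}

print(resolve_route(27, [-1, -3]))        # [26, 24]
print(route_shape(27, [-1, -3], shapes))  # (13, 13, 3072)
print(resolve_route(25, [-9]))            # [16]
```

The last line suggests where the 16 in the question comes from: at layer 25, layers=-9 points to layer 16 = (25-9), so the number in the console would be a layer index rather than a filter size.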


AlexeyAB on 6 Jul 2017
