Incubator-mxnet: gluon feature request: proper registration/initialization of layers inside a list (container) for custom (Hybrid)Blocks

Created on 14 Mar 2018 · 8 comments · Source: apache/incubator-mxnet

Dear all, it would be very useful if one could add NN layers of a gluon custom model inside a list, similar to torch.nn.ModuleList, something like:

```Python
class CustomNet(HybridBlock):

    def __init__(self, **kwargs):
        HybridBlock.__init__(self, **kwargs)
        with self.name_scope():
            self.layers_list = []
            for i in range(5):
                self.layers_list += [gluon.nn.Conv2D( SomeArguments )]

    def hybrid_forward(self, F, _x):
        # Some manipulation of layers_list elements
        out = ...
        return out
```

I can think of many use cases, but one important one is indexing for neuroevolution problems, i.e. using a variable architecture of a specified set of layers.

Thank you very much for the great work you put into gluon/mxnet.
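For context on why a plain list is not enough: gluon, like other define-by-run frameworks, registers sub-blocks by intercepting attribute assignment on the parent block, so blocks stored only inside a list are never registered and their parameters go uncollected. A toy illustration of the mechanism (not gluon's actual implementation; `TinyBlock` is invented for this sketch):

```python
class TinyBlock:
    """Toy stand-in for a framework block; NOT gluon's real code."""
    def __init__(self):
        self._children = {}

    def __setattr__(self, name, value):
        # Gluon-style frameworks hook attribute assignment to
        # auto-register sub-blocks. A plain list bypasses this hook.
        if isinstance(value, TinyBlock):
            self.__dict__.setdefault('_children', {})[name] = value
        super().__setattr__(name, value)


class Net(TinyBlock):
    def __init__(self):
        super().__init__()
        self.conv = TinyBlock()       # registered via __setattr__
        self.hidden = [TinyBlock()]   # NOT registered: hidden in a list


net = Net()
print(sorted(net._children))  # ['conv'] -- the block inside the list is invisible
```

This is exactly the gap a ModuleList-style container closes: it registers its items with the parent while still exposing list-like indexing.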


All 8 comments

How do you intend to use `layers_list` in your example? It is possible to use Sequential/HybridSequential just as containers, without using their forward functionality.

Hi @szha, thank you for your reply. I've done so in simpler architectures as you describe but now I want to try something more advanced.

The basic idea is that one can have a set of layers that live in a list, ```layers_list```. One can then form a sparse connectivity matrix ```Sij```, where entry (i, j) indicates a connection from ```layer_i``` to ```layer_j```. The connectivity matrix is an individual inside an evolutionary algorithm, and the architecture of the network is defined by ```Sij```. For example, a simple ```Sequential``` module that stacks four layers

```Python
net = Sequential()
for i in range(4):
    net.add(Dense(5))
```

can be represented with the following connectivity matrix:

```
   | 1   2   3   4
-------------------
1  | 0   1   0   0
2  | 0   0   1   0
3  | 0   0   0   1
4  | 0   0   0   0
```
Reading row by row: ```layer_1``` connects to ```layer_2```, ```layer_2``` to ```layer_3```, and so on. Layer 4 (the last layer) has no outgoing connections. But if we want a more advanced topology (like ```layer_1``` connecting to both ```layer_2``` and ```layer_3```)
```
   | 1   2   3   4
-------------------
1  | 0   1   1   0
2  | 0   0   1   0
3  | 0   0   0   1
4  | 0   0   0   0
```
the ```Sequential``` approach breaks down. It is still possible to formulate this with Sequential, but it lacks the flexibility of indexing.
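For concreteness, the edge list of the branched matrix above can be extracted with scipy (a sketch assuming `scipy.sparse`; the nodes are 0-indexed here, unlike the 1-indexed table):

```python
import numpy as np
from scipy import sparse

# Branched topology: layer_1 -> layer_2, layer_1 -> layer_3,
# layer_2 -> layer_3, layer_3 -> layer_4 (written 0-indexed).
S = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])

# COO format exposes the non-zero entries directly as (row, col) pairs.
Sij = sparse.coo_matrix(S)
edges = [(int(i), int(j)) for i, j in zip(Sij.row, Sij.col)]
print(edges)  # [(0, 1), (0, 2), (1, 2), (2, 3)]
```

Each pair is one directed connection, which is exactly what a forward pass over a variable architecture needs to iterate.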

Now assuming one has the layers in a container (a list in this example; I can think of dictionary usage as well), ```layers_list```, and ```Sij``` is a sparse matrix, one can formulate a ```forward``` function (design prototype, not the true solution; [here](https://stackoverflow.com/questions/4319014/iterating-through-a-scipy-sparse-vector-or-matrix) is an example of iterating over a sparse matrix):

```Python
def hybrid_forward(self, F, input):
    out = self.first_layer(input)
    cx = Sij.tocoo()
    # This for loop iterates over the non-zero elements (the edges).
    for i, j in zip(cx.row, cx.col):
        out = self.layer_list[j](self.layer_list[i](out))
        out = F.relu(out)
    return out
```
The basic idea is to create a DAG on the fly. Using lists and a connectivity matrix is the first way that comes to mind of implementing this (I may be wrong; there are probably better ways of doing so, but I don't know any). I think this functionality, in combination with the flexibility of gluon's imperative style, can help a lot of people experiment with variable architectures.
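A minimal, framework-free sketch of the DAG idea (toy callables stand in for gluon layers; the fan-in rule of summing parent outputs and the assumption that node index order is a valid topological order are mine, not part of the proposal):

```python
import numpy as np
from scipy import sparse

# Toy "layers": plain callables standing in for gluon blocks.
# layers[k] just adds k + 1 so the result is easy to check by hand.
layers = [lambda x, k=k: x + k + 1 for k in range(4)]

# Branched topology from the second connectivity matrix (0-indexed):
# node 0 feeds nodes 1 and 2, node 1 feeds node 2, node 2 feeds node 3.
S = sparse.coo_matrix(np.array([[0, 1, 1, 0],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1],
                                [0, 0, 0, 0]]))
edges = list(zip(S.row, S.col))


def dag_forward(x):
    # Each node's input is the sum of its parents' outputs
    # (an assumed fan-in rule; concatenation would also work).
    # Upper-triangular S guarantees index order is topological order.
    inputs = {0: x}
    outputs = {}
    for node in range(S.shape[0]):
        outputs[node] = layers[node](inputs[node])
        for i, j in edges:  # propagate along outgoing edges
            if i == node:
                inputs[j] = inputs.get(j, 0) + outputs[node]
    return outputs[S.shape[0] - 1]


print(dag_forward(0))  # 11
```

Swapping in a different connectivity matrix changes the architecture without touching the forward code, which is the property that matters for neuroevolution.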

```Python
In [1]: import mxnet as mx

In [2]: net = mx.gluon.model_zoo.vision.alexnet()

In [3]: net
Out[3]:
AlexNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
    (2): Conv2D(None -> 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
    (4): Conv2D(None -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (5): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
    (8): Flatten
    (9): Dense(None -> 4096, Activation(relu))
    (10): Dropout(p = 0.5, axes=())
    (11): Dense(None -> 4096, Activation(relu))
    (12): Dropout(p = 0.5, axes=())
  )
  (output): Dense(None -> 1000, linear)
)

In [4]: net.features
Out[4]:
HybridSequential(
  (0): Conv2D(None -> 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
  (1): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
  (2): Conv2D(None -> 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
  (4): Conv2D(None -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (5): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (6): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
  (8): Flatten
  (9): Dense(None -> 4096, Activation(relu))
  (10): Dropout(p = 0.5, axes=())
  (11): Dense(None -> 4096, Activation(relu))
  (12): Dropout(p = 0.5, axes=())
)

In [5]: net.features[3]
Out[5]: MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
```

@feevos Currently, HybridSequential and Sequential have the same functionality as ModuleList. Thus we previously decided not to add a separate ModuleList. We can bring it to the table again.

Hi @szha and @sxjscience, thank you very much for your replies. So if I understand correctly, (Hybrid)Sequential can also be used as a container for the various layers and indexed just like a list, so I can use the contained layers in any _order_ I want (without the stacked sequential forward functionality). That is, I can use something like:

```Python
class CustomNet(HybridBlock):

    def __init__(self, **kwargs):
        HybridBlock.__init__(self, **kwargs)

        with self.name_scope():
            self.net = HybridSequential()
            # Add some convolution operators
            for i in range(3):
                self.net.add(Conv2D(....))

    # Change the order of the layers in self.net.
    # This is not equivalent to self.net(input)
    def hybrid_forward(self, F, input):
        out = self.net[2](input)
        out = self.net[0](out)
        out = self.net[1](out)
        return out
```

If my understanding is correct, then yes, there is no need for something similar to ModuleList. I haven't seen anything like what you just described in the documentation (it would be nice to add it to the gluon book and the API docs).

Thank you very much!

Good point on making the feature known. cc'd @zackchase, @astonzhang, @mli

This solution is kind of weird. Sequential feels like it ought to be composed of things that can feed into one another. But if you are just using it as a list, the shapes might not even be right for that.

I admit it isn't a high priority, but just for sugar it might be nice to implement a separate blocklist class.
