Keras: Can we build two models together, with one nested inside another?

Created on 22 Aug 2015 · 29 comments · Source: keras-team/keras

Hello all,
My code is supposed to be:

model = Sequential()
model.add(Conv2D(...))

Layer2Model = Graph()
for i in range(20):
    Layer2Model.add(Conv2D(...))
output = Layer2Model.layers[-1].get_output()

model.add(Flatten())
model.layers[1].set_previous(Layer2Model.layers[-1])  # layers[1] is the Flatten layer

model.add(Dense(...))

Layer2Model.compile(...)
model.compile(...)

Can we do it like this? I tried it before, but it seems model didn't connect with Layer2Model..

All 29 comments

Did you try adding a model as a node? Something like this always worked for me:

model1 = Sequential()
model1.add ...
...

model2 = Sequential()
model2.add(model1)
model2.add ...
...

You just have to respect the inputs and outputs that each model expects.
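
As a concrete (if minimal) sketch against the 2015-era Keras 0.x API, where Dense takes (input_dim, output_dim) and a Sequential can be added like an ordinary layer (the layer sizes here are illustrative):

from keras.models import Sequential
from keras.layers.core import Dense, Activation

# Inner model
model1 = Sequential()
model1.add(Dense(100, 64))        # 0.x signature: Dense(input_dim, output_dim)
model1.add(Activation('relu'))

# Outer model: the inner model behaves like a single layer
model2 = Sequential()
model2.add(model1)
model2.add(Dense(64, 10))
model2.add(Activation('softmax'))

# Only the outermost model needs compiling
model2.compile(loss='categorical_crossentropy', optimizer='sgd')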

Is the whole process as follows?

model1 = Sequential()
model1.add ...
...

model2 = Sequential()
model2.add(model1)
model2.add ...

model2.compile (model2's inputs, model2's outputs)
model1.compile (model1's inputs, model1's outputs)

you only need to compile model2

Thank you, Eder. Are you familiar with CNNs?
As we know, current filters have the same rows and the same columns. In my second layer I would define, for example, 3 filters of size 1×6 each. Every filter is separated into two parts: the first 1×3 is located at the beginning, and the last 1×3 is located 6 steps away from the first part. The illustrative figure is as follows:
[Figure: layer2 filter structure]

That is to say, each filter has the same input but a different output. Can I build my model like this:
model1 = Sequential()
model1.add(Conv(...))

model2 = Graph()
model2.add(model1)
model2.add(ConvDifFilter(filter1))
model2.add(ConvDifFilter(filter2))
model2.add(ConvDifFilter(filter3))

model3 = Sequential()
model3.add(model2)
model3.add(Flatten())
model3.add(Dense(...))

model3.compile()

I'd say this would be easier to implement with Graphs. Check the docs.
But basically, each of your filters will be a node getting a different input, which would be a version of the original input. Did you get the idea?

Do you mean I should make model2 a Graph model, while model1 and model3 stay Sequential?

You can do that!!!
Then you pass each model, as a node, a different input, and combine them later.

Thank you.
How do you give them different inputs? I added a layer named "SetInput" in layers/core.py.
For example:

model = Sequential()
model.add(Conv1)
output = model.layers[0].get_output()
model.add(SetInput(output))
model.add(Conv2)

However, Conv2 still takes its output from Conv1. Could you help me fix this?

See the Graph examples here: http://keras.io/models/
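
For concreteness, here is a small multi-input Graph along the lines of the keras.io example of that era (a sketch assuming the 0.x API, where add_input takes an ndim and Dense takes (input_dim, output_dim); the data shapes are illustrative):

import numpy as np
from keras.models import Graph
from keras.layers.core import Dense

X_train = np.random.random((100, 32))
X2_train = np.random.random((100, 32))
y_train = np.random.random((100, 16))

graph = Graph()
graph.add_input(name='input1', ndim=2)
graph.add_input(name='input2', ndim=2)
graph.add_node(Dense(32, 16), name='dense1', input='input1')
graph.add_node(Dense(32, 16), name='dense2', input='input2')
graph.add_output(name='output', inputs=['dense1', 'dense2'], merge_mode='sum')
graph.compile('rmsprop', {'output': 'mse'})
history = graph.fit({'input1': X_train, 'input2': X2_train, 'output': y_train}, nb_epoch=10)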

I saw in the docs that we can set the different inputs in the graph.fit function.
But what if I put the Graph model between two Sequential models, like in my example before?
Does this mean I don't need to care about the exact inputs, but can set them in model2.train_on_batch() (or model2.fit, etc.)?
So, the whole process should be:

model1 = Sequential()
model1.add(Conv(...))

model2 = Graph()
model2.add(model1)
model2.add(...)

model3 = Sequential()
model3.add(model2)
model3.add(...)

model3.compile()

model2.train_on_batch(define inputs here)
model3.train_on_batch()

It says in there to look for graph.add_input.

Thank you, Eder.
graph.add_input can only define the input name here; the actual input is fetched in the graph.fit function.
However, in my situation the Graph model sits between two Sequential models. That means I am not able to get the target output of this model, so I cannot use graph.fit.
But how can I define my inputs without using the graph.fit function?

I'm saying that you should make the outer model a Graph, since it is more flexible. This way you can define what the input of each node is. If you use Sequential as the outermost model, you won't be able to do that.

My DNN structure is as follows:
[Figure: DNN layer structure]

So the current problem is:
For a Graph model I would use, for example,

history = graph.fit({'input1':X_train, 'input2':X2_train, 'output':y_train}, nb_epoch=10)  # from keras.io

to define each input and output.

For my structure, I would have to write:

model.fit({'input1':X_train, 'input2':changeformat1(a), 'input3':changeformat2(a), 'input4':changeformat3(a), 'output1':how to describe?, 'output2':y_train}, nb_epoch=10)

However, input2 through input4 and output1 are intermediate values computed from input1.
How do I describe them? I don't know how to express such inputs and outputs.

You can have 3 actual inputs (each one a delayed version of the "original") and pass all of them through the first conv1. You get 3 outputs using the same conv1 (yes, you can reuse a layer), then you pass each one to its second layer. Like:

conv1 = model ....
conv1_copy1 = deepcopy(conv1)
conv1_copy1.params = []
conv1_copy2 = deepcopy(conv1)
conv1_copy2.params = []

graph.add_input ...  # input
graph.add_input ...  # input delayed
graph.add_input ...  # input delayed again
graph.add_node(conv1, name='conv1_1', input='input1')
graph.add_node(conv1_copy1, name='conv1_2', input='input2')
graph.add_node(conv1_copy2, name='conv1_3', input='input3')
graph.add_node ...  # I believe you got the rest

The reason for making the params of the copies empty is to avoid Theano trying to calculate a gradient twice for the same set of weights.
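
One caveat, flagged as an assumption about the 0.x internals: deepcopy clones the layer's shared weight variables, so the copies start out equal to conv1 but are not actually tied to it afterwards. If genuine weight sharing is wanted, the copies' weights can be re-pointed at the originals, roughly like this (a sketch, assuming the layer keeps its weights in self.W and self.b):

from copy import deepcopy

conv1_copy1 = deepcopy(conv1)
conv1_copy1.W = conv1.W      # re-point at the original shared weights
conv1_copy1.b = conv1.b
conv1_copy1.params = []      # keep the gradient from being computed twice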

Thank you, Eder. I think I didn't clarify my problem.
For the output of the first conv layer (or its copies), I need to change its format.
That means the input of my second layer is a transformed version of the output of my first layer. So I need to change the format of this intermediate value and then use the transformed output as the input of my second layer.

Can I do that using Keras?

Try writing a layer between the first and second convolutions to do the transformation you need. Check the Reshape layer inside keras/layers/core.py.
Try writing your changeformat as a layer instead.
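
For illustration, a minimal layer skeleton in the style of keras.layers.core (the class name, the constructor argument, and the particular slice are placeholders; the point is that get_output must stay symbolic):

from keras.layers.core import Layer

class ChangeFormat(Layer):
    def __init__(self, shift):
        super(ChangeFormat, self).__init__()
        self.shift = shift  # a plain Python int

    def get_output(self, train):
        X = self.get_input(train)
        # Symbolic slicing is fine; element-wise assignment is not.
        # This drops the last `shift` columns.
        return X[:, :, :, :-self.shift]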

Thank you so much, Eder! The same idea just came into my mind!!
I have written a layer in core.py.
I'd like to mention a small incident I ran into before: at the very beginning I wrote the layer in convolutional.py, but a TypeError pointed at the super(MyClass, self) call. Then I copied the MyClass code into core.py: no error, and it compiled successfully.

I don't know why I can't add a layer in convolutional.py... Weird..

Anyway, now my code compiles successfully. However, an "Output dimension is not valid" error comes up during training.

My code is:

def get_output(self, train):
        nb_col = 6
        aaQua=[57.0519,71.0788,87.0782,97.1167,99.1326,101.1051,103.1388,113.1594,113.1594,114.1038,115.0886,128.1307,128.1741,129.1155,131.1926,137.1411,147.1766,156.1875,163.176,186.2132]
        aaQuaInt = np.zeros(20)
        for i in range(20):
           aaQuaInt[i]=int(aaQua[i]*10)
                                   #               50, 4, 1, 995
        X = self.get_input(train)  # 4D tensor: nb_samples, feature_map, 1, nb_col
        layer20aaLen=X.shape[3]-nb_col-aaQuaInt[0]
        border_mode = self.border_mode
        X3 = np.zeros((X.shape[0], X.shape[1], 2, X.shape[3]),dtype=theano.config.floatX)
        conv_out = np.zeros((X.shape[0], 20, X.shape[2],layer20aaLen),dtype=theano.config.floatX)
        for i in range(20):
            length1=int(X.shape[3]-nb_col-aaQuaInt[i])
            length2=int(nb_col+aaQuaInt[i])
            X1= X[:,:,0,0:int(length1)]
            X2= X[:,:,0,length2:]
            for inum in range(X.shape[0]):
                for jnum in range(X.shape[1]):
                    for knum in range(length1):
                        X3[inum,jnum,0,knum]=X1[inum,jnum,0,knum]
                        X3[inum,jnum,1,knum]=X2[inum,jnum,0,knum]
            print ("XXXXXXX3Shape2",X3.shape[2])
            print ("XXXXXXX3NDIM",X3.ndim)
          # the above is only re-format the input of layer2
          # the output is each filter's
            current_conv_out= theano.tensor.nnet.conv.conv2d(X3, self.W[i,:,:,:],border_mode=border_mode, subsample=self.subsample)

            for ii in range(X.shape[0]):  # X.shape[0] is the nb_samples
                for jj in range(current_conv_out.shape[3]):
                    conv_out[ii,i,0,jj]=current_conv_out[ii,0,0,jj]


        return self.activation(conv_out + self.b.dimshuffle('x', 0, 'x', 'x'))

I double-checked my dimensions. I don't know why ConvOp raises such an error.

Please just focus on one line:

for i in range(X.shape[0])

The error shows that X.shape[0] is a tensor variable, not an integer.
Then I changed it to

for i in range(int(X.shape[0]))

Still the same error..

How do I change the format of a tensor variable?

You can't use Theano tensors as regular numpy values; they are just symbolic values.
You'll have to change the shape using reshape, slicing and concatenation. If you have to create new tensors, you cannot do assignment either: stuff like X[0,0] = 1 doesn't work with Theano tensors. You have to build your solution around these constraints.
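
For instance, here is a sketch of how the loop-and-assign pattern from your layer could be expressed symbolically (length1 and length2 must be plain Python ints; the values here are illustrative):

import theano.tensor as T

X = T.tensor4('X')
length1, length2 = 989, 63  # illustrative window sizes

# Slice the two windows (0:1 keeps the singleton axis) and stack them
# along axis 2 instead of assigning element by element:
X1 = X[:, :, 0:1, 0:length1]
X2 = X[:, :, 0:1, length2:length2 + length1]
X3 = T.concatenate([X1, X2], axis=2)  # shape: (n, feature_maps, 2, length1)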

Thank you Eder.
So, if I have three 4D tensors X1, X2 and X3, I cannot do:

X3[:,:,0,:]=X1
X3[:,:,1,:]=X2

either, can I?

I think I know how to do it now.
Actually, I don't need such an assignment at all: I can let X1 and X2 go through their convolutions separately and then sum their values with a Merge layer to achieve the same goal.
I will now try whether it works or not.
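
Something like this, in the 0.x style (a sketch; I'm assuming Merge lives in keras.layers.core, and the filter shapes are illustrative):

from keras.models import Sequential
from keras.layers.core import Merge
from keras.layers.convolutional import Convolution2D

# One branch per window; 0.x signature: Convolution2D(nb_filter, stack_size, nb_row, nb_col)
left = Sequential()
left.add(Convolution2D(8, 4, 1, 3))

right = Sequential()
right.add(Convolution2D(8, 4, 1, 3))

# Element-wise sum of the two branch outputs
combined = Sequential()
combined.add(Merge([left, right], mode='sum'))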

Hello Eder,
In Keras's convolutional.py there is this piece of code:

     if self.border_mode == 'same':
            shift_x = (self.nb_row - 1) // 2
            shift_y = (self.nb_col - 1) // 2
            conv_out = conv_out[:, :, shift_x:X.shape[2] + shift_x, shift_y:X.shape[3] + shift_y]

I created a Changeformat layer in core.py, which says:

if self.sign == 0:
    X1 = X[:, :, :, 0:X.shape[3] - shift]
    return X1

The error is:

raise TypeError("Expected an integer")
TypeError: Expected an integer

WHY can't I write the same code? How can I choose part of the output? :"(

Then I followed the example of class ZeroPadding2D and wrote this code:

shift = self.nb_col + self.index
in_shape = X.shape
out_shape = (in_shape[0], in_shape[1], in_shape[2], in_shape[3] - shift)
out = T.zeros(out_shape)
if self.sign == 0:
    indices = (slice(None), slice(None), slice(None), slice(0, in_shape[3] - shift))
    return T.set_subtensor(out[indices], X)

Again, it says it needs an integer..
The error is:

TypeError: Shape arguments to Alloc must be integers, but argument 3 is not for apply node: Elemwise{sub,no_inplace}.0

So WEIRD!!! How can I solve such a strange problem? :"(

@ghost, I found something that could be useful: http://bfy.tw/JCF

@fchollet,
First of all, please accept my apology for intruding on your in-depth discussion here. Apparently I misunderstood the usage of issues: I treated the issue tracker as a platform for asking questions.
I came into the deep-learning research field about one and a half months ago. As I need to implement a DNN for our application alone in a short time, my strategy was to post a question online as soon as I met it while thinking about a solution at the same time (that's why you sometimes see me reply to my own issues, once I find a possible solution). I thought somebody might answer my question if he already knew how to do it. My original purpose was to save development time.
After reading your serious reminder, I realized this is not a good strategy and that it obviously bothers you a lot. I am deeply sorry for that.
Secondly, a "simple" question still remains for me. As I already mentioned, I need to change the format of my input. Because I don't know Theano tensors very well, I wrote my code by referring to your class ZeroPadding2D(Layer) code.
Your code is as follows:

def get_output(self, train):
        X = self.get_input(train)
        width = self.width
        in_shape = X.shape
        out_shape = (in_shape[0], in_shape[1], in_shape[2] + 2 * width, in_shape[3] + 2 * width)
        out = T.zeros(out_shape)
        indices = (slice(None), slice(None), slice(width, in_shape[2] + width), slice(width, in_shape[3] + width))
        return T.set_subtensor(out[indices], X)

Mine is like this:

def get_output(self, train):
    X = self.get_input(train)
    nb_col = self.nb_col
    index = self.index
    shift = nb_col + index
    in_shape = X.shape
    out_shape = (in_shape[0], in_shape[1], in_shape[2], in_shape[3] - shift)
    out = T.zeros(out_shape)
    if self.sign == 0:
        indices = (slice(None), slice(None), slice(None), slice(0, in_shape[3] - shift))
        return T.set_subtensor(out[indices], X)
    elif self.sign == 1:
        X2 = X[:, :, :, shift:]
        return X2
    else:
        print("Order Error: the order of params should be: sign, index, nb_col.")

When I run my code, it says:

TypeError: Shape arguments to Alloc must be integers, but argument 3 is not for apply node: Elemwise{sub,no_inplace}.0

NOTE: this error points to the line out = T.zeros(out_shape).

I re-read my code many times, but I still cannot find the problem.

I know it is very simple for you, and I don't mean to fetch an obvious answer here. If you think I shouldn't be given a solution, could you show me a way to find the problem?

I am not a smart person, but I would like to conquer every difficulty I meet through hard work.

Thank you again for your kind understanding.

Elemwise{sub,no_inplace}.0 means you are trying to use a Theano tensor as an int. You shouldn't use symbolic elements as if they were numpy values.

So yeah, I've seen people take a backwards route here sometimes. Keras is written with Theano, and lots of the questions here are about problems with the latter. You won't be able to extend Keras without knowing Theano. So don't hurry: give yourself some time to learn Theano using their awesome tutorials. If you don't learn it now, you will have the same problems again later, and we may not be able to help. I don't mean to tell you what to do, but I really believe this will help.

Keep up the good work, and good luck!

Thank you for your suggestion, Eder. And thank you for your encouragement, really!!
Actually, I know the problem is probably a type mismatch.

My doubt is this:
in Keras's convolutional.py, the ZeroPadding layer is defined with the following code:

X = self.get_input(train)
in_shape = X.shape
out_shape = (in_shape[0], in_shape[1], in_shape[2] + 2 * width, in_shape[3] + 2 * width)
out = T.zeros(out_shape)

There, in_shape is the symbolic shape of a tensor4 too, and width is an integer.

I can use Keras ZeroPadding successfully.

My code is:

X = self.get_input(train)
in_shape = X.shape
out_shape = (in_shape[0], in_shape[1], in_shape[2], in_shape[3] - shift)
out = T.zeros(out_shape)

I compared my code with the code above many times. I first thought '-' could not be used between a tensor variable and an integer, so I changed it to '+' to check the error, and then added '*' to match the formula in ZeroPadding as well. The error still comes.

But why does ZeroPadding work while my layer doesn't? The usage is the same. What is the essential difference between them? This problem still bothers me..

I finally fixed the error:
If you want an "int"-typed array, you should specify dtype=int when you initialize it; otherwise all the values are treated as floats, even after using a cast function on the elements.
For example:

aaQuaInt = np.zeros(20, dtype=int)

makes all the elements in aaQuaInt integers.

However, even after using

aaQuaInt = np.zeros(20)
for i in range(20):
    aaQuaInt[i] = int(aaQua[i]*10)

the elements in aaQuaInt still have a float dtype, because the array itself was created as float64. That is the essential difference from ZeroPadding: its width is a plain Python int, while my shift was derived from this float array, so in_shape[3] - shift became a float shape argument.

That's why we get the error:
TypeError: Shape arguments to Alloc must be integers, but argument xx is not for apply node: Elemwise{sub,no_inplace}.0
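
A short self-contained illustration of that dtype behaviour (plain numpy, nothing Keras-specific):

import numpy as np

a = np.zeros(20)             # dtype defaults to float64
a[0] = int(3.7)              # the int is cast back to float on assignment
print(a.dtype, a[0])         # float64 3.0

b = np.zeros(20, dtype=int)  # the array itself is integer-typed
b[0] = 3.7                   # truncated to an integer on assignment
print(b.dtype, b[0])         # int64 3 (int size is platform-dependent)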
