Keras: Can we build two models together, with one nested inside another?

Created on 22 Aug 2015 · 29 comments · Source: keras-team/keras

Hello all,
My code is supposed to be:

model = Sequential()
model.add(Conv2D(...))

Layer2Model = Graph()
for i in range(20):
    Layer2Model.add(Conv2D(...))
output = Layer2Model.layers[-1].get_output()

model.add(Flatten())
model.layers[1].set_previous(Layer2Model.layers[-1])  # layers[1] is the Flatten layer

model.add(Dense(...))

Layer2Model.compile(...)
model.compile(...)

Can we do it like this? I tried it before, but it seems model didn't connect with Layer2Model..

All 29 comments

Did you try adding a model as a node? Something like this always worked for me:

model1 = Sequential()
model1.add ...
...

model2 = Sequential()
model2.add(model1)
model2.add ...
...

You just have to respect the inputs and outputs that each model expects.
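
As a concrete (if minimal) sketch against the 2015-era Keras 0.x API, where Dense takes (input_dim, output_dim) and a Sequential can be added like an ordinary layer (the layer sizes here are illustrative):

from keras.models import Sequential
from keras.layers.core import Dense, Activation

# Inner model
model1 = Sequential()
model1.add(Dense(100, 64))        # 0.x signature: Dense(input_dim, output_dim)
model1.add(Activation('relu'))

# Outer model: the inner model behaves like a single layer
model2 = Sequential()
model2.add(model1)
model2.add(Dense(64, 10))
model2.add(Activation('softmax'))

# Only the outermost model needs compiling
model2.compile(loss='categorical_crossentropy', optimizer='sgd')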

Is the whole process as follows?

model1 = Sequential()
model1.add ...
...

model2 = Sequential()
model2.add(model1)
model2.add ...

model2.compile (model2's inputs, model2's outputs)
model1.compile (model1's inputs, model1's outputs)

you only need to compile model2

Thank you, Eder. Are you familiar with CNNs?
As we know, current filters have the same rows and the same columns. In my second layer I would define, for example, 3 filters of size 1×6 each. Every filter is separated into two parts: the first 1×3 is located at the beginning, and the last 1×3 is located 6 steps away from the first part. The illustrative figure is as follows:
[Figure: layer2 filter structure]

That is to say, each filter has the same input but a different output. Can I build my model like this:
model1 = Sequential()
model1.add(Conv(...))

model2 = Graph()
model2.add(model1)
model2.add(ConvDifFilter(filter1))
model2.add(ConvDifFilter(filter2))
model2.add(ConvDifFilter(filter3))

model3 = Sequential()
model3.add(model2)
model3.add(Flatten())
model3.add(Dense(...))

model3.compile()

I'd say this would be easier to implement with Graphs. Check the docs.
But basically, each of your filters will be a node getting a different input, which would be a version of the original input. Did you get the idea?

Do you mean I should make model2 a Graph model, while model1 and model3 stay Sequential?

You can do that!!!
Then you pass each model, as a node, a different input, and combine them later.

Thank you.
How do you give them different inputs? I added a layer named "SetInput" in layers/core.py.
For example:

model = Sequential()
model.add(Conv1)
output = model.layers[0].get_output()
model.add(SetInput(output))
model.add(Conv2)

However, Conv2 still takes its output from Conv1. Could you help me fix this?

See the Graph examples here: http://keras.io/models/
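
For concreteness, here is a small multi-input Graph along the lines of the keras.io example of that era (a sketch assuming the 0.x API, where add_input takes an ndim and Dense takes (input_dim, output_dim); the data shapes are illustrative):

import numpy as np
from keras.models import Graph
from keras.layers.core import Dense

X_train = np.random.random((100, 32))
X2_train = np.random.random((100, 32))
y_train = np.random.random((100, 16))

graph = Graph()
graph.add_input(name='input1', ndim=2)
graph.add_input(name='input2', ndim=2)
graph.add_node(Dense(32, 16), name='dense1', input='input1')
graph.add_node(Dense(32, 16), name='dense2', input='input2')
graph.add_output(name='output', inputs=['dense1', 'dense2'], merge_mode='sum')
graph.compile('rmsprop', {'output': 'mse'})
history = graph.fit({'input1': X_train, 'input2': X2_train, 'output': y_train}, nb_epoch=10)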

I saw in the docs that we can set the different inputs in the graph.fit function.
But what if I put the Graph model between two Sequential models, like in my example before?
Does this mean I don't need to care about the exact inputs, but can set them in model2.train_on_batch() (or model2.fit, etc.)?
So, the whole process should be:

model1 = Sequential()
model1.add(Conv(...))

model2 = Graph()
model2.add(model1)
model2.add(...)

model3 = Sequential()
model3.add(model2)
model3.add(...)

model3.compile()

model2.train_on_batch(define inputs here)
model3.train_on_batch()

It says in there to look for graph.add_input.

Thank you, Eder.
graph.add_input can only define the input name here; the actual input is fetched in the graph.fit function.
However, in my situation the Graph model sits between two Sequential models. That means I am not able to get the target output of this model, so I cannot use graph.fit.
But how can I define my inputs without using the graph.fit function?

I'm saying that you should make the outer model a Graph, since it is more flexible. This way you can define what the input of each node is. If you use Sequential as the outermost model, you won't be able to do that.

My DNN structure is as follows:
[Figure: DNN layer structure]

So the current problem is:
For a Graph model I would use, for example,

history = graph.fit({'input1':X_train, 'input2':X2_train, 'output':y_train}, nb_epoch=10)  # from keras.io

to define each input and output.

For my structure, I would have to write:

model.fit({'input1':X_train, 'input2':changeformat1(a), 'input3':changeformat2(a), 'input4':changeformat3(a), 'output1':how to describe?, 'output2':y_train}, nb_epoch=10)

However, input2 through input4 and output1 are intermediate values computed from input1.
How do I describe them? I don't know how to express such inputs and outputs.

You can have 3 actual inputs (each one a delayed version of the "original") and pass all of them through the first conv1. You get 3 outputs using the same conv1 (yes, you can reuse a layer), then you pass each one to its second layer. Like:

conv1 = model ....
conv1_copy1 = deepcopy(conv1)
conv1_copy1.params = []
conv1_copy2 = deepcopy(conv1)
conv1_copy2.params = []

graph.add_input ...  # input
graph.add_input ...  # input delayed
graph.add_input ...  # input delayed again
graph.add_node(conv1, name='conv1_1', input='input1')
graph.add_node(conv1_copy1, name='conv1_2', input='input2')
graph.add_node(conv1_copy2, name='conv1_3', input='input3')
graph.add_node ...  # I believe you got the rest

The reason for making the params of the copies empty is to avoid Theano trying to calculate a gradient twice for the same set of weights.
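
One caveat, flagged as an assumption about the 0.x internals: deepcopy clones the layer's shared weight variables, so the copies start out equal to conv1 but are not actually tied to it afterwards. If genuine weight sharing is wanted, the copies' weights can be re-pointed at the originals, roughly like this (a sketch, assuming the layer keeps its weights in self.W and self.b):

from copy import deepcopy

conv1_copy1 = deepcopy(conv1)
conv1_copy1.W = conv1.W      # re-point at the original shared weights
conv1_copy1.b = conv1.b
conv1_copy1.params = []      # keep the gradient from being computed twice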

Thank you, Eder. I think I didn't clarify my problem.
For the output of the first conv layer (or its copies), I need to change its format.
That means the input of my second layer is a transformed version of the output of my first layer. So I need to change the format of this intermediate value and then use the transformed output as the input of my second layer.

Can I do that using Keras?

Try writing a layer between the first and second convolutions to do the transformation you need. Check the Reshape layer inside keras/layers/core.py.
Try writing your changeformat as a layer instead.
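
For illustration, a minimal layer skeleton in the style of keras.layers.core (the class name, the constructor argument, and the particular slice are placeholders; the point is that get_output must stay symbolic):

from keras.layers.core import Layer

class ChangeFormat(Layer):
    def __init__(self, shift):
        super(ChangeFormat, self).__init__()
        self.shift = shift  # a plain Python int

    def get_output(self, train):
        X = self.get_input(train)
        # Symbolic slicing is fine; element-wise assignment is not.
        # This drops the last `shift` columns.
        return X[:, :, :, :-self.shift]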

Thank you so much, Eder! The same idea just came into my mind!!
I have written a layer in core.py.
I'd like to mention a small incident I ran into before: at the very beginning I wrote the layer in convolutional.py, but a TypeError pointed at the super(MyClass, self) call. Then I copied the MyClass code into core.py: no error, and it compiled successfully.

I don't know why I can't add a layer in convolutional.py... Weird..

Anyway, now my code compiles successfully. However, an "Output dimension is not valid" error comes up during training.

My code is:

def get_output(self, train):
        nb_col = 6
        aaQua=[57.0519,71.0788,87.0782,97.1167,99.1326,101.1051,103.1388,113.1594,113.1594,114.1038,115.0886,128.1307,128.1741,129.1155,131.1926,137.1411,147.1766,156.1875,163.176,186.2132]
        aaQuaInt = np.zeros(20)
        for i in range(20):
           aaQuaInt[i]=int(aaQua[i]*10)
                                   #               50, 4, 1, 995
        X = self.get_input(train)  # 4D tensor: nb_samples, feature_map, 1, nb_col
        layer20aaLen=X.shape[3]-nb_col-aaQuaInt[0]
        border_mode = self.border_mode
        X3 = np.zeros((X.shape[0], X.shape[1], 2, X.shape[3]),dtype=theano.config.floatX)
        conv_out = np.zeros((X.shape[0], 20, X.shape[2],layer20aaLen),dtype=theano.config.floatX)
        for i in range(20):
            length1=int(X.shape[3]-nb_col-aaQuaInt[i])
            length2=int(nb_col+aaQuaInt[i])
            X1= X[:,:,0,0:int(length1)]
            X2= X[:,:,0,length2:]
            for inum in range(X.shape[0]):
                for jnum in range(X.shape[1]):
                    for knum in range(length1):
                        X3[inum,jnum,0,knum]=X1[inum,jnum,0,knum]
                        X3[inum,jnum,1,knum]=X2[inum,jnum,0,knum]
            print ("XXXXXXX3Shape2",X3.shape[2])
            print ("XXXXXXX3NDIM",X3.ndim)
          # the above is only re-format the input of layer2
          # the output is each filter's
            current_conv_out= theano.tensor.nnet.conv.conv2d(X3, self.W[i,:,:,:],border_mode=border_mode, subsample=self.subsample)

            for ii in range(X.shape[0]):  # X.shape[0] is the nb_samples
                for jj in range(current_conv_out.shape[3]):
                    conv_out[ii,i,0,jj]=current_conv_out[ii,0,0,jj]


        return self.activation(conv_out + self.b.dimshuffle('x', 0, 'x', 'x'))

I double-checked my dimensions. I don't know why ConvOp raises such an error.

Please just focus on one line:

for i in range(X.shape[0])

The error shows that X.shape[0] is a tensor variable, not an integer.
Then I changed it to

for i in range(int(X.shape[0]))

Still the same error..

How do I change the format of a tensor variable?

You can't use Theano tensors as regular numpy values; they are just symbolic values.
You'll have to change the shape using reshape, slicing and concatenation. If you have to create new tensors, you cannot do assignment either: stuff like X[0,0] = 1 doesn't work with Theano tensors. You have to build your solution around these constraints.
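
For instance, here is a sketch of how the loop-and-assign pattern from your layer could be expressed symbolically (length1 and length2 must be plain Python ints; the values here are illustrative):

import theano.tensor as T

X = T.tensor4('X')
length1, length2 = 989, 63  # illustrative window sizes

# Slice the two windows (0:1 keeps the singleton axis) and stack them
# along axis 2 instead of assigning element by element:
X1 = X[:, :, 0:1, 0:length1]
X2 = X[:, :, 0:1, length2:length2 + length1]
X3 = T.concatenate([X1, X2], axis=2)  # shape: (n, feature_maps, 2, length1)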

Thank you Eder.
So, if I have three 4D tensors X1, X2 and X3, I cannot do:

X3[:,:,0,:]=X1
X3[:,:,1,:]=X2

either, can I?

I think I know how to do it now.
Actually, I don't need such an assignment at all: I can let X1 and X2 go through their convolutions separately and then sum their values with a Merge layer to achieve the same goal.
I will now try whether it works or not.
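
Something like this, in the 0.x style (a sketch; I'm assuming Merge lives in keras.layers.core, and the filter shapes are illustrative):

from keras.models import Sequential
from keras.layers.core import Merge
from keras.layers.convolutional import Convolution2D

# One branch per window; 0.x signature: Convolution2D(nb_filter, stack_size, nb_row, nb_col)
left = Sequential()
left.add(Convolution2D(8, 4, 1, 3))

right = Sequential()
right.add(Convolution2D(8, 4, 1, 3))

# Element-wise sum of the two branch outputs
combined = Sequential()
combined.add(Merge([left, right], mode='sum'))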

Hello Eder,
In Keras's convolutional.py there is this piece of code:

     if self.border_mode == 'same':
            shift_x = (self.nb_row - 1) // 2
            shift_y = (self.nb_col - 1) // 2
            conv_out = conv_out[:, :, shift_x:X.shape[2] + shift_x, shift_y:X.shape[3] + shift_y]

I created a Changeformat layer in core.py, which says:

if self.sign == 0:
    X1 = X[:, :, :, 0:X.shape[3] - shift]
    return X1

The error is:

raise TypeError("Expected an integer")
TypeError: Expected an integer

WHY can't I write the same code? How can I choose part of the output? :"(

Then I followed the example of class ZeroPadding2D and wrote this code:

shift = self.nb_col + self.index
in_shape = X.shape
out_shape = (in_shape[0], in_shape[1], in_shape[2], in_shape[3] - shift)
out = T.zeros(out_shape)
if self.sign == 0:
    indices = (slice(None), slice(None), slice(None), slice(0, in_shape[3] - shift))
    return T.set_subtensor(out[indices], X)

Again, it says it needs an integer..
The error is:

TypeError: Shape arguments to Alloc must be integers, but argument 3 is not for apply node: Elemwise{sub,no_inplace}.0

So WEIRD!!! How can I solve such a strange problem? :"(

@ghost, I found something that could be useful: http://bfy.tw/JCF

@fchollet,
First of all, please accept my apology for intruding on your in-depth discussion here. Apparently I misunderstood the usage of issues: I treated the issue tracker as a platform for asking questions.
I came into the deep-learning research field about one and a half months ago. As I need to implement a DNN for our application alone in a short time, my strategy was to post a question online as soon as I met it while thinking about a solution at the same time (that's why you sometimes see me reply to my own issues, once I find a possible solution). I thought somebody might answer my question if he already knew how to do it. My original purpose was to save development time.
After reading your serious reminder, I realized this is not a good strategy and that it obviously bothers you a lot. I am deeply sorry for that.
Secondly, a "simple" question still remains for me. As I already mentioned, I need to change the format of my input. Because I don't know Theano tensors very well, I wrote my code by referring to your class ZeroPadding2D(Layer) code.
Your code is as follows:

def get_output(self, train):
        X = self.get_input(train)
        width = self.width
        in_shape = X.shape
        out_shape = (in_shape[0], in_shape[1], in_shape[2] + 2 * width, in_shape[3] + 2 * width)
        out = T.zeros(out_shape)
        indices = (slice(None), slice(None), slice(width, in_shape[2] + width), slice(width, in_shape[3] + width))
        return T.set_subtensor(out[indices], X)

Mine is like this:

def get_output(self, train):
    X = self.get_input(train)
    nb_col = self.nb_col
    index = self.index
    shift = nb_col + index
    in_shape = X.shape
    out_shape = (in_shape[0], in_shape[1], in_shape[2], in_shape[3] - shift)
    out = T.zeros(out_shape)
    if self.sign == 0:
        indices = (slice(None), slice(None), slice(None), slice(0, in_shape[3] - shift))
        return T.set_subtensor(out[indices], X)
    elif self.sign == 1:
        X2 = X[:, :, :, shift:]
        return X2
    else:
        print("Order Error: the order of params should be: sign, index, nb_col.")

When I run my code, it says:

TypeError: Shape arguments to Alloc must be integers, but argument 3 is not for apply node: Elemwise{sub,no_inplace}.0

NOTE: this error points to the line out = T.zeros(out_shape).

I re-read my code many times, but I still cannot find the problem.

I know it is very simple for you, and I don't mean to fetch an obvious answer here. If you think I shouldn't be given a solution, could you show me a way to find the problem?

I am not a smart person, but I would like to conquer every difficulty I meet through hard work.

Thank you again for your kind understanding.

Elemwise{sub,no_inplace}.0 means you are trying to use a Theano tensor as an int. You shouldn't use symbolic elements as if they were numpy values.

So yeah, I've seen people take a backwards route here sometimes. Keras is written with Theano, and lots of the questions here are about problems with the latter. You won't be able to extend Keras without knowing Theano. So don't hurry: give yourself some time to learn Theano using their awesome tutorials. If you don't learn it now, you will have the same problems again later, and we may not be able to help. I don't mean to tell you what to do, but I really believe this will help.

Keep up the good work, and good luck!

Thank you for your suggestion, Eder. And thank you for your encouragement, really!!
Actually, I know the problem is probably a type mismatch.

My doubt is this:
in Keras's convolutional.py, the ZeroPadding layer is defined with the following code:

X = self.get_input(train)
in_shape = X.shape
out_shape = (in_shape[0], in_shape[1], in_shape[2] + 2 * width, in_shape[3] + 2 * width)
out = T.zeros(out_shape)

There, in_shape is the symbolic shape of a tensor4 too, and width is an integer.

I can use Keras ZeroPadding successfully.

My code is:

X = self.get_input(train)
in_shape = X.shape
out_shape = (in_shape[0], in_shape[1], in_shape[2], in_shape[3] - shift)
out = T.zeros(out_shape)

I compared my code with the code above many times. I first thought '-' could not be used between a tensor variable and an integer, so I changed it to '+' to check the error, and then added '*' to match the formula in ZeroPadding as well. The error still comes.

But why does ZeroPadding work while my layer doesn't? The usage is the same. What is the essential difference between them? This problem still bothers me..

I finally fixed the error:
If you want an "int"-typed array, you should specify dtype=int when you initialize it; otherwise all the values are treated as floats, even after using a cast function on the elements.
For example:

aaQuaInt = np.zeros(20, dtype=int)

makes all the elements in aaQuaInt integers.

However, even after using

aaQuaInt = np.zeros(20)
for i in range(20):
    aaQuaInt[i] = int(aaQua[i]*10)

the elements in aaQuaInt still have a float dtype, because the array itself was created as float64. That is the essential difference from ZeroPadding: its width is a plain Python int, while my shift was derived from this float array, so in_shape[3] - shift became a float shape argument.

That's why we get the error:
TypeError: Shape arguments to Alloc must be integers, but argument xx is not for apply node: Elemwise{sub,no_inplace}.0
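
A short self-contained illustration of that dtype behaviour (plain numpy, nothing Keras-specific):

import numpy as np

a = np.zeros(20)             # dtype defaults to float64
a[0] = int(3.7)              # the int is cast back to float on assignment
print(a.dtype, a[0])         # float64 3.0

b = np.zeros(20, dtype=int)  # the array itself is integer-typed
b[0] = 3.7                   # truncated to an integer on assignment
print(b.dtype, b[0])         # int64 3 (int size is platform-dependent)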
