Hi,
I'm new to Keras, and I'm trying to define a BiDNN-like model (https://github.com/v-v/BiDNN) in Keras (with TensorFlow as the backend).
However, it's a bit confusing to me how to do what Lasagne (https://github.com/Lasagne/Lasagne) does, i.e. share weights between two layers with transposed input and output dimensions, as in the following Lasagne code:
l1 = DenseLayer(4, num_units=8, W=GlorotUniform()) # with input dim (4) and output dim (8)
l2 = DenseLayer(8, num_units=4, W=l1.W.T) # with input dim (8) and output dim (4)
Thus l2 shares l1's weights by transposing this tensor.
Is this possible in a Keras implementation? Thank you for any advice.
I don't know of anyone who has built this in Keras already, but I could be wrong.
Basically, use a Dense layer for one direction. Write a custom layer for the other direction.
The layer would roughly look like this:
import keras.backend as K
from keras.engine.topology import Layer

class DenseTranspose(Layer):
    def __init__(self, other_layer, **kwargs):
        super().__init__(**kwargs)
        self.other_layer = other_layer
    def call(self, x):
        # undo the tied layer's bias, then apply its transposed weights
        return K.dot(x - self.other_layer.b, K.transpose(self.other_layer.W))
Cheers
@bstriner Where do you get these K.dot and K.transpose operations from?
import keras.backend as K
@mattdornfeld Hi, did you find a way to do this? Could you share your method?
Read through the code in files like core.py to see how the core layers work. K is the Keras backend and is used for pretty much all computations (transpose, dot, sin, etc.). K is a proxy for either Theano or TensorFlow, depending on which you are using. If you want to know more about what those functions do and how they work, refer to the Theano or TensorFlow docs as appropriate.
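To make that concrete, here's a tiny illustrative example of backend ops (the shapes here are arbitrary):

import numpy as np
import keras.backend as K

# The same code runs on either backend (Theano or TensorFlow):
W = K.variable(np.random.rand(4, 8))   # a 4x8 weight matrix
x = K.variable(np.random.rand(2, 8))   # a batch of 2 vectors of size 8
y = K.dot(x, K.transpose(W))           # multiply by the transposed weights
print(K.eval(y).shape)                 # (2, 4)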
@buaaliyi note that with keras-2 that code would be self.other_layer.bias and self.other_layer.kernel. Any progress? If you have a simple layer and it seems like a common need you should push it to keras-contrib.
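For reference, a minimal Keras 2 style sketch of the same idea (untested; it assumes the tied Dense layer is already built so its kernel exists, and it omits bias handling for brevity):

import keras.backend as K
from keras.engine.topology import Layer

class DenseTranspose(Layer):
    def __init__(self, other_layer, **kwargs):
        super(DenseTranspose, self).__init__(**kwargs)
        self.other_layer = other_layer   # an already-built Dense layer

    def call(self, inputs):
        # multiply by the transpose of the tied layer's kernel (Keras 2 naming)
        return K.dot(inputs, K.transpose(self.other_layer.kernel))

    def compute_output_shape(self, input_shape):
        # the transposed kernel maps back to the tied layer's input dimension
        return input_shape[:-1] + (K.int_shape(self.other_layer.kernel)[0],)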
@Lzc6996 what is your question exactly?
@Lzc6996
Here's the code. I did run into problems using Keras's load_model function with this type of layer, since load_model doesn't expect the __init__ function to have other_layer as an arg, but it works otherwise.
import tensorflow as tf
import keras.backend as K
from keras.engine import InputSpec
from keras.layers import Dense

class DenseTranspose(Dense):
    """
    A Keras dense layer that has its weights set to be the transpose of
    another layer's. Used for implementing BiDNNs.
    """
    def __init__(self, other_layer, **kwargs):
        super().__init__(other_layer.input_dim, **kwargs)
        self.other_layer = other_layer

    def build(self, input_shape):
        assert len(input_shape) >= 2
        input_dim = input_shape[-1]
        self.input_dim = input_dim
        self.input_spec = [InputSpec(dtype=K.floatx(), ndim='2+')]

        # tie the weights to the transpose of the other layer's kernel
        self.W = tf.transpose(self.other_layer.W)

        if self.bias:
            self.b = self.add_weight((self.output_dim,),
                                     initializer='zero',
                                     name='{}_b'.format(self.name),
                                     regularizer=self.b_regularizer,
                                     constraint=self.b_constraint)
        else:
            self.b = None

        if self.initial_weights is not None:
            self.set_weights(self.initial_weights)
            del self.initial_weights
        self.built = True
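For what it's worth, here is a rough sketch of how a layer like this might be wired up (Keras 1 style functional API; the sizes and names are just for illustration, not tested):

from keras.layers import Input, Dense
from keras.models import Model

x = Input(shape=(128,))
encoder = Dense(32, input_dim=128, activation='relu')  # forward direction
h = encoder(x)                                         # builds encoder, so encoder.W exists
decoder = DenseTranspose(encoder, activation='relu')   # tied, transposed weights
y = decoder(h)                                         # maps back to 128 dims
model = Model(x, y)
model.compile(optimizer='adam', loss='mse')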
@mattdornfeld BTW, load_weights is way more reliable than load_model as soon as you start doing anything interesting. There are plenty of open issues regarding load_model, but I've never had a problem with load_weights.
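Something along these lines (build_bidnn_model here is a hypothetical helper that rebuilds the same architecture in code):

model = build_bidnn_model()                # hypothetical helper: wires up the tied layers
model.save_weights('bidnn_weights.h5')

restored = build_bidnn_model()             # same architecture, fresh instance
restored.load_weights('bidnn_weights.h5')  # only the weights come from disk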
This is a very interesting thread. I am thinking of implementing a ladder network in Keras.
The problem is that a ladder network uses batch normalization, but the batch statistics are only updated on the clean path. So I need to create a frozen batch normalization layer that shares its weights (moving_mean and moving_variance) with the standard batch normalization layer of the clean path.
Your trick could be a reasonably clean solution.
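If it helps, here is a rough (untested) sketch of that idea, assuming Keras 2 attribute names on the BatchNormalization layer and its default center/scale settings:

import keras.backend as K
from keras.engine.topology import Layer

class FrozenBatchNorm(Layer):
    """Normalize with another BatchNormalization layer's statistics, never updating them."""
    def __init__(self, other_bn, **kwargs):
        super(FrozenBatchNorm, self).__init__(**kwargs)
        self.other_bn = other_bn   # an already-built BatchNormalization layer

    def call(self, inputs):
        # always use the shared moving statistics; no update ops are added here
        return K.batch_normalization(inputs,
                                     self.other_bn.moving_mean,
                                     self.other_bn.moving_variance,
                                     self.other_bn.beta,
                                     self.other_bn.gamma,
                                     epsilon=self.other_bn.epsilon)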
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.