Transformers: How to output a specific layer of TFBertForSequenceClassification, or add a layer?

Created on 25 Nov 2019 · 7 Comments · Source: huggingface/transformers

How can I output the last layer of TFBertForSequenceClassification?

I want to output the layer before the classifier (Dense).

Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bert (TFBertMainLayer)       multiple                  102267648 
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  3845      
=================================================================
Total params: 102,271,493
Trainable params: 102,271,493
Non-trainable params: 0

I tried the tf.keras functional API:

dense1_layer_model = Model(inputs=model.input, outputs=model.get_layer('bert').output)

It didn't work.

roccqqck

Most helpful comment

To create a Sequential model with the TensorFlow Keras framework, you have to specify the input shape through the input_shape parameter on the first layer; otherwise TensorFlow Keras doesn't know the input shape of the model you're creating.

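Adding a layer:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Model
from transformers import TFBertModel

input_layer = Input(shape=(512,), dtype='int64')  # token ids, padded to length 512
bert = TFBertModel.from_pretrained('bert-base-chinese')(input_layer)
bert = bert[0]  # first element of the output tuple: sequence output, (None, 512, 768)
flat = Flatten()(bert)
classifier = Dense(units=5)(flat)  # raw logits for the 5 labels
model = Model(inputs=input_layer, outputs=classifier)
model.summary()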

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_4 (InputLayer)         [(None, 512)]             0         
_________________________________________________________________
tf_bert_model_3 (TFBertModel ((None, 512, 768), (None, 102267648 
_________________________________________________________________
flatten_2 (Flatten)          (None, 393216)            0         
_________________________________________________________________
dense_1 (Dense)              (None, 5)                 1966085   
=================================================================
Total params: 104,233,733
Trainable params: 104,233,733
Non-trainable params: 0
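
As a quick sanity check on the numbers in this summary, the parameter counts can be reproduced by hand. This is only a sketch in plain Python: the sequence length (512), hidden size (768), and label count (5) are read off the summaries in this thread, the BERT parameter count is copied rather than recomputed, and `dense_params` is a hypothetical helper, not a Keras function.

```python
# Assumed from the summaries in this thread, not queried from a real model.
hidden_size = 768          # width of each BERT token vector
seq_len = 512              # padded sequence length
num_labels = 5             # output units of the Dense head
bert_params = 102_267_648  # copied from model.summary()


def dense_params(in_features, out_features):
    """Number of weights plus biases in a fully connected layer."""
    return in_features * out_features + out_features


# Flatten turns (512, 768) into one vector per example.
flat_features = seq_len * hidden_size

# Dense head on the flattened sequence output (the "model_1" summary).
head_on_flatten = dense_params(flat_features, num_labels)

# Dense head on a single 768-wide vector (matches the "classifier" row
# of the tf_bert_for_sequence_classification summary).
head_on_vector = dense_params(hidden_size, num_labels)

print(flat_features)                  # 393216
print(head_on_flatten)                # 1966085
print(bert_params + head_on_flatten)  # 104233733
print(head_on_vector)                 # 3845
```

The flattened head is roughly 500 times larger than the original classifier, which is worth keeping in mind if the flattened features are only needed for extraction rather than for training.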



optimizer = tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0)
# The Dense head has no activation, so the loss is computed on raw logits.
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')
model.compile(optimizer=optimizer, loss=loss, metrics=[metric])

model_fit = model.fit(train_input_ids, train_label,
                      batch_size=4, epochs=4,
                      validation_data=(validation_input_ids, validation_label))



# Sub-model that stops at the Flatten layer ('flatten_2' is the name shown in the summary).
flatten_layer_model = Model(inputs=model.input, outputs=model.get_layer('flatten_2').output)
predictions = flatten_layer_model.predict(validation_input_ids)
print(type(predictions))
print(predictions.shape)



<class 'numpy.ndarray'>
(8359, 393216)

roccqqck on 26 Nov 2019

All 7 comments

Please, copy and paste the source code in order to reproduce your problem.

TheEdoardo93 on 25 Nov 2019

Please, copy and paste the source code in order to reproduce your problem.

This is my original code:

model = TFBertForSequenceClassification.from_pretrained('bert-base-chinese', num_labels=5)
model.summary()

Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bert (TFBertMainLayer)       multiple                  102267648 
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  3845      
=================================================================
Total params: 102,271,493
Trainable params: 102,271,493
Non-trainable params: 0



model = TFBertModel.from_pretrained('bert-base-chinese')
model.summary()

But I need the (N, 512, 768) output after fine-tuning.

roccqqck on 26 Nov 2019

I tried this too:

model = Sequential()
model.add( TFBertModel.from_pretrained('bert-base-chinese') )
model.add( Dropout(0.5))
model.add( Dense(5,activation="softmax") )
model.summary()

ValueError: This model has not yet been built. Build the model first by calling `build()` or calling `fit()` with some data, or specify an `input_shape` argument in the first layer(s) for automatic build.

roccqqck on 26 Nov 2019

To create a Sequential model with the TensorFlow Keras framework, you have to specify the input shape through the input_shape parameter on the first layer; otherwise TensorFlow Keras doesn't know the input shape of the model you're creating.

TheEdoardo93 on 26 Nov 2019

Hi @roccqqck, I am also doing something similar. Your comment answered most of my questions. I have just one more question. The documentation states that the model input should look like [input_ids, attention_mask]. So, are you providing an attention mask as input?

Have you uploaded the full code mentioned above, with data, to your GitHub? If so, could you please share the link?
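
For readers with the same question about attention_mask: the thread does not show how the inputs were padded, so the following is only a minimal sketch of how a mask is typically derived from padded token ids. It assumes the padding id is 0 (which I believe is the [PAD] id in the BERT vocabularies, but check your tokenizer), and `build_attention_mask` is a hypothetical helper, not part of transformers.

```python
def build_attention_mask(input_ids, pad_token_id=0):
    """Return 1 for real tokens and 0 for padding, one row per sequence.

    pad_token_id=0 is an assumption; read the real value from your tokenizer.
    """
    return [
        [0 if token_id == pad_token_id else 1 for token_id in sequence]
        for sequence in input_ids
    ]


# Two toy sequences, already padded to length 6 with the assumed pad id 0.
input_ids = [
    [101, 2769, 4263, 872, 102, 0],
    [101, 872, 102, 0, 0, 0],
]
attention_mask = build_attention_mask(input_ids)
print(attention_mask)
```

The mask has the same shape as input_ids, so in a Keras functional model it would need its own Input layer next to the one for the token ids.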