Transformers: Albert pretrained weights change across runs.

Created on 5 Jun 2020 · 6 comments · Source: huggingface/transformers

🐛 Bug

Information

Model I am using (Bert, XLNet ...): TFAlbertModel

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

  • [ ] the official example scripts: (give details below)
  • [X] my own modified scripts: (give details below)
import tensorflow as tf
from transformers import AlbertTokenizer, TFAlbertModel

tokenizer = AlbertTokenizer.from_pretrained('albert-base-v2')
model = TFAlbertModel.from_pretrained('albert-base-v2')
model.summary()

# Inspect one of the loaded weight tensors; if the pretrained
# checkpoint were actually loaded, its values would be identical
# on every run.
print(len(model.trainable_weights))
print(model.trainable_weights[23])

# Run a forward pass on a toy sentence.
input_ids = tf.constant(tokenizer.encode("Hello, my dog is cute"))[None, :]
outputs = model(input_ids)

print(outputs[0].shape, outputs[1].shape, len(outputs))
last_hidden_states = outputs[0]
print(last_hidden_states)

The task I am working on is:

  • [ ] an official GLUE/SQUaD task: (give the name)
  • [X] my own task or dataset: (give details below)
    Trying to load pre-trained weights.

To reproduce

Run the code above twice and you will see that the model's weights are not the same across the two runs.
Steps to reproduce the behavior:

  1. Run the code a first time and log the output.
  2. Run the code a second time and log the output.
  3. Check that the two logs differ (a deterministic weight-fingerprint sketch follows this list).
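For a deterministic comparison, the following sketch hashes all trainable weights into a single fingerprint that can be logged and diffed across runs. This is an editorial addition, not part of the original report; the helper name weights_fingerprint and the choice of MD5 are illustrative assumptions.

import hashlib
from transformers import TFAlbertModel

model = TFAlbertModel.from_pretrained('albert-base-v2')

def weights_fingerprint(model):
    # Hash the raw bytes of every trainable weight tensor; the digest
    # is stable across runs if and only if the loaded values are
    # identical.
    digest = hashlib.md5()
    for w in model.trainable_weights:
        digest.update(w.numpy().tobytes())
    return digest.hexdigest()

# With correctly pinned pretrained weights this prints the same digest
# on every run; under the bug reported here it changes between runs.
print(weights_fingerprint(model))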

Expected behavior

Since the model loads pre-trained weights, the results should be the same across runs.

Environment info

  • transformers version: 2.11.0
  • Platform: Linux-4.4.0-179-generic-x86_64-with-debian-stretch-sid
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.4.0 (True)
  • Tensorflow version (GPU?): 2.0.1 (True)
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

I apologize if the issue is due to me misusing your library; this is my first time using ALBERT.

All 6 comments

I just did the same experiment with Roberta weights and did not have the same issue.
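For reference, the control experiment presumably mirrors the ALBERT script above; a minimal sketch, assuming the roberta-base checkpoint and the TFRobertaModel/RobertaTokenizer classes (the exact checkpoint used by the commenter is not stated):

from transformers import RobertaTokenizer, TFRobertaModel

tokenizer = RobertaTokenizer.from_pretrained('roberta-base')
model = TFRobertaModel.from_pretrained('roberta-base')

# Printing the same weight tensor in two separate runs yields identical
# values for RoBERTa, unlike the ALBERT case reported above.
print(model.trainable_weights[23])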

Hi, I can reproduce this. It is due to the archive maps no longer being available, so the wrong ALBERT models are linked.

Thanks for raising the issue, this is quite a bug.

cc @julien-c

My bad! It's my fault. I added a warning to the release notes about this: https://github.com/huggingface/transformers/releases/tag/v2.11.0

Is there a plan to fix this? Looks like the issue is that the "real" model we want is named with-prefix-tf_model.h5, which needs to be renamed to tf_model.h5. https://huggingface.co/albert-base-v2#list-files
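Until the hosted files are fixed, one possible manual workaround is sketched below. It assumes you download with-prefix-tf_model.h5 and config.json from https://huggingface.co/albert-base-v2#list-files into a local directory and rename the weights file to tf_model.h5; the local path is hypothetical:

from transformers import TFAlbertModel

# './albert-base-v2-local' must contain config.json and the manually
# downloaded with-prefix-tf_model.h5 renamed to tf_model.h5.
model = TFAlbertModel.from_pretrained('./albert-base-v2-local')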

This should work now; the weights have been changed to use the with-prefix weights.

Thanks!!
