Transformers: Accuracy on classification task is lower than the official tensorflow version

Created on 30 Nov 2018  ·  2 Comments  ·  Source: huggingface/transformers

Hi, I am running the same task with the same hyperparameters as the official Google TensorFlow implementation of BERT; however, I am getting around 1.5% lower accuracy. Can you give any hint about the possible cause?

Thanks!

All 2 comments

Hi!
Could it be due to different random seeds?
See e.g. https://github.com/huggingface/pytorch-pretrained-BERT/issues/53#issuecomment-441565229

Hi @ejld, yes, BERT has a large variance on many fine-tuning tasks (see also the discussion in #64).
You should try a number of different seeds (e.g. 10) and compare the mean and standard deviation of the results.
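As a rough illustration, here is a minimal sketch of such a multi-seed comparison. It assumes a PyTorch fine-tuning setup; `train_and_evaluate` is a hypothetical placeholder for your own fine-tuning and evaluation loop, not part of this repository.

```python
import random

import numpy as np
import torch


def set_seed(seed):
    # Seed all RNGs that affect fine-tuning (Python, NumPy, PyTorch, CUDA).
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


accuracies = []
for seed in range(10):  # try e.g. 10 different seeds
    set_seed(seed)
    # train_and_evaluate() is a placeholder for your fine-tuning run;
    # it should return the dev-set accuracy for this seed.
    acc = train_and_evaluate(seed=seed)
    accuracies.append(acc)

print(f"mean accuracy: {np.mean(accuracies):.4f}")
print(f"std deviation: {np.std(accuracies):.4f}")
```

If the official TensorFlow result falls within roughly one standard deviation of the mean you get here, the gap is likely just seed variance rather than an implementation difference.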
