Bert: BERT has non-deterministic behaviour

Created on 17 Apr 2019 · 5 Comments · Source: google-research/bert

I am using the BERT implementation at https://github.com/google-research/bert for feature extraction, and I have noticed behaviour I was not expecting: if I run the program twice on the same text, I get different results. I need to know whether this is normal and why it happens so that I can decide how to handle it. What is the reason for this? Aren't neural networks deterministic algorithms?

RodSernaPerez

👍10

All 5 comments

I am also facing the same issue :(

Vibha111094 on 17 Apr 2019

👍3

I have just finished a Kaggle competition where I made extensive use of BERT. It certainly is a little unstable during fine-tuning and sensitive to random seed variations, especially with small datasets.

However, extracting features from a pre-trained model without changing the weights should yield the same output every time. There must be something wrong, but I am not entirely familiar with the feature extraction code. Are you sure dropout is disabled in the code you are using for feature extraction? If dropout was mistakenly left at the default training value of 0.9 (keep probability), it would cause some variation in the output.
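
To illustrate the dropout point, here is a minimal pure-Python sketch. It is not the BERT code: the `dropout` helper and the `features` vector are hypothetical stand-ins, and only the 0.9 keep probability comes from the comment above. It shows how dropout left active at inference makes two runs over the same input disagree, while a keep probability of 1.0 leaves the input untouched.

```python
import random


def dropout(values, keep_prob, rng):
    """Inverted dropout: keep each value with probability keep_prob and
    rescale the survivors by 1 / keep_prob; dropped values become 0.0."""
    if keep_prob >= 1.0:
        # Dropout disabled: the output is an unchanged copy of the input.
        return list(values)
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


# Hypothetical "features" standing in for a layer's activations.
features = [float(i) for i in range(1, 1001)]

rng = random.Random()  # unseeded, like a run with no fixed random seed

# Dropout left on (keep probability 0.9): each call draws a new random mask.
run_a = dropout(features, keep_prob=0.9, rng=rng)
run_b = dropout(features, keep_prob=0.9, rng=rng)
print("dropout on, runs identical: ", run_a == run_b)

# Dropout off (keep probability 1.0): the output no longer depends on the RNG.
eval_a = dropout(features, keep_prob=1.0, rng=rng)
eval_b = dropout(features, keep_prob=1.0, rng=rng)
print("dropout off, runs identical:", eval_a == eval_b)
```

If the outputs in your own setup differ only slightly between runs, an active dropout layer is one plausible cause worth ruling out first.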

kenkrige on 26 Apr 2019

👍1

@RodSernaPerez How did you extract the features? Did you use extract_features.py? Can you share your code/command line?

eikevons on 25 Jul 2019

@RodSernaPerez, @Vibha111094 I had the same issue too. My mistake was in init_checkpoint. The path should ALWAYS be YOUR_PATH/bert_model.ckpt, even though you don't have a file named bert_model.ckpt. For example, I only have the bert_model.ckpt.data* and bert_model.ckpt.index files.
Earlier I used the *.index file in init_checkpoint (simply because no error was raised).
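
One way to guard against this mistake is to normalize the path before passing it as init_checkpoint. The sketch below is only an illustration: `checkpoint_prefix` is a hypothetical helper, not part of the BERT repository, and it assumes the usual TensorFlow on-disk suffixes (`.index`, `.meta`, `.data-NNNNN-of-NNNNN`).

```python
import re

# Suffixes that TensorFlow appends to a checkpoint prefix on disk
# (assumed naming: .index, .meta, .data-00000-of-00001, ...).
_CKPT_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Return the checkpoint prefix for a path that may point at one of the
    files belonging to a checkpoint; paths without such a suffix are
    returned unchanged."""
    return _CKPT_SUFFIX.sub("", path)


print(checkpoint_prefix("YOUR_PATH/bert_model.ckpt.index"))
print(checkpoint_prefix("YOUR_PATH/bert_model.ckpt.data-00000-of-00001"))
print(checkpoint_prefix("YOUR_PATH/bert_model.ckpt"))
```

All three calls should produce the same prefix, YOUR_PATH/bert_model.ckpt, which is the form described above.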

7rick03ligh7 on 3 Oct 2019

I have the same issue. I've removed dropout from the model and set my random seed with tf.random.set_random_seed() (just in case), and I haven't made the same mistake as @DiggiDon, so I'm at a loss. Anything else to try?

EDIT: Turns out that the script I was using to run BERT (run_squad.py) was loading the .ckpt file from --output_dir, rather than from the more intuitive --init_checkpoint. Because I set --do_train=False, the script never ran fine-tuning, so no .ckpt was ever written to --output_dir, and the parameters were randomly initialized each time I ran inference.
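
A cheap safeguard against this kind of silent random initialization is to check that a checkpoint actually exists before running inference. The sketch below is an assumption-laden illustration, not code from run_squad.py: `require_checkpoint` is a hypothetical helper, and it assumes the checkpoint is stored with a `<prefix>.index` file next to its data shards.

```python
import os


def require_checkpoint(prefix):
    """Return the prefix if a checkpoint index file exists for it; otherwise
    raise instead of letting inference proceed with untrained weights."""
    index_file = prefix + ".index"  # assumed on-disk layout: <prefix>.index
    if not os.path.isfile(index_file):
        raise FileNotFoundError(
            "No checkpoint found at %r; the model weights would be "
            "randomly initialized." % prefix
        )
    return prefix
```

Calling something like this on the directory the script really restores from (here, --output_dir rather than --init_checkpoint) would turn the silent failure into an immediate error.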