I am using the BERT implementation in https://github.com/google-research/bert for feature extraction, and I have noticed a weird behaviour that I was not expecting: if I run the program twice on the same text, I get different results. I need to know whether this is normal and why it happens so that I can decide how to handle it. What is the reason for this? Aren't neural networks deterministic algorithms?
I am also facing the same issue:(
I have just finished a Kaggle competition where I made extensive use of BERT. It certainly is a little unstable in fine-tuning and susceptible to random seed variations, especially with small data sets.
However, extracting features from a pre-trained model without changing the weights should yield the same output every time. There must be something wrong, but I am not entirely familiar with the code for feature extraction. Are you sure dropout is disabled in the code you are using to do feature extraction? If dropout were mistakenly left at the default training value of 0.9 (keep probability), it would cause some variation in the output.
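To make the dropout point concrete, here is a minimal sketch in plain Python (not the BERT code itself). The `dropout` helper, the toy `features` vector, and the fixed seed are all assumptions made up for illustration. It applies inverted dropout to the same input twice with a keep probability of 0.9, then twice with 1.0.

```python
import random


def dropout(values, keep_prob, rng):
    """Inverted dropout: keep each value with probability keep_prob, rescale survivors."""
    if keep_prob >= 1.0:
        return list(values)  # dropout disabled: output equals input
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


# Seeded only so the sketch is reproducible; each call still draws a fresh mask.
rng = random.Random(0)
features = [0.01 * i for i in range(1, 101)]  # hypothetical activations

train_a = dropout(features, 0.9, rng)  # dropout left on
train_b = dropout(features, 0.9, rng)
eval_a = dropout(features, 1.0, rng)   # dropout off
eval_b = dropout(features, 1.0, rng)

print("keep_prob=0.9, identical outputs:", train_a == train_b)
print("keep_prob=1.0, identical outputs:", eval_a == eval_b)
```

With dropout off, the two passes necessarily match. With it on, each pass zeroes a different random subset of values, so the extracted features would drift from run to run.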
@RodSernaPerez How did you extract the features? Did you use extract_features.py? Can you share your code/command line?
@RodSernaPerez, @Vibha111094 I had the same issue too. My mistake was in init_checkpoint. The path should ALWAYS be YOUR_PATH/bert_model.ckpt, even though there is no file named bert_model.ckpt. For example, I have only the bert_model.ckpt.data* and bert_model.ckpt.index files.
Earlier I passed the *.index file to init_checkpoint (simply because no error was raised).
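As a rough sketch of the naming convention described above: a small helper can strip the usual TensorFlow checkpoint file suffixes so that the value passed to init_checkpoint is always the prefix. `checkpoint_prefix` is a hypothetical helper, not part of the BERT repository, and it assumes the standard `.index`, `.meta`, and `.data-XXXXX-of-XXXXX` suffixes.

```python
import re

# Suffixes of the files that together make up one TensorFlow checkpoint.
_CKPT_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Return the checkpoint prefix for a path to any one of its files."""
    return _CKPT_SUFFIX.sub("", path)


print(checkpoint_prefix("YOUR_PATH/bert_model.ckpt.index"))
```

A path that is already a bare prefix passes through unchanged, so the helper is safe to apply to whatever the user supplies.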
I have the same issue. I've removed dropout from the model and set my random seed with tf.random.set_random_seed() (just in case), and I haven't made the same mistake as @DiggiDon, so I'm at a loss. Anything else to try?
EDIT: Turns out that the script I was using to run BERT (run_squad.py) was loading the .ckpt file from --output_dir, rather than the more intuitive --init_checkpoint. Because I had set --do_train=False, the script never ran fine-tuning, so no .ckpt was ever written to --output_dir, meaning the parameters got randomly initialized each time I tried to run inference.
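One way to catch this kind of silent fallback early is to check that checkpoint files actually exist before starting inference. The sketch below uses only the standard library. `require_checkpoint` is a hypothetical guard, not something the BERT scripts provide, and it assumes the usual `.index` plus `.data-*` file layout.

```python
import glob
import os


def require_checkpoint(prefix):
    """Return prefix if an .index file and at least one .data-* shard exist for it."""
    has_index = os.path.isfile(prefix + ".index")
    has_data = bool(glob.glob(glob.escape(prefix) + ".data-*"))
    if not (has_index and has_data):
        raise FileNotFoundError("No checkpoint files found for prefix: " + prefix)
    return prefix
```

Calling it on the prefix you expect to be loaded (for example, one inside the output directory) turns a quietly random model into an explicit error.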